World influence of infectious diseases from Wikipedia network analysis (2018)
[1] : Institut UTINAM (UMR 6213) (Université de Franche-Comté)
[2] : Observatoire des Sciences de l'Univers - Terre, Homme, Environnement, Temps, Astronomie (UAR 3245) (Université de Franche-Comté)
[3] : Laboratoire de Physique Théorique (UMR 5152)
Description :
We consider the network of 5416537 articles of the English Wikipedia edition of 2017. Using the recent reduced Google matrix (REGOMAX) method, we construct the reduced network of 230 articles (nodes) on infectious diseases and 195 articles on world countries. This method generates the reduced directed network among all 425 nodes, taking into account all direct and indirect links, including pathways through the huge global network. The PageRank and CheiRank algorithms are used to determine the most influential diseases, with the top PageRank diseases being Tuberculosis, HIV/AIDS and Malaria. From the reduced Google matrix we determine the sensitivity of world countries to specific diseases, integrating their influence over all of their history, including the times of ancient Egyptian mummies. The results are compared with World Health Organization (WHO) data, demonstrating that the Wikipedia network analysis provides reliable results, with up to about 80 percent overlap between the WHO and REGOMAX analyses.
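To illustrate the reduction described above, here is a minimal sketch that assumes the standard block form of the reduced Google matrix, G_R = G_rr + G_rs (1 - G_ss)^(-1) G_sr, where r labels the selected nodes and s labels the rest of the network. The function name `reduced_google_matrix`, the random 6-node matrix and the chosen node indices are illustrative assumptions and are not part of the dataset. The dense linear solve is only feasible at toy sizes and is not the numerical procedure used for the full Wikipedia network.

```python
import numpy as np

def reduced_google_matrix(G, reduced):
    """Reduce a column-stochastic Google matrix G to the nodes in `reduced`.

    Hypothetical helper: returns G_rr + G_rs (1 - G_ss)^(-1) G_sr, where the
    second term sums all indirect pathways through the non-selected nodes.
    """
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    r = np.asarray(reduced)
    s = np.setdiff1d(np.arange(n), r)   # nodes outside the reduced set
    G_rr = G[np.ix_(r, r)]              # direct links among selected nodes
    G_rs = G[np.ix_(r, s)]              # links from the rest into the selection
    G_sr = G[np.ix_(s, r)]              # links from the selection into the rest
    G_ss = G[np.ix_(s, s)]              # links inside the rest of the network
    # Solve (1 - G_ss) X = G_sr instead of forming the inverse explicitly.
    X = np.linalg.solve(np.eye(len(s)) - G_ss, G_sr)
    return G_rr + G_rs @ X

# Toy input (assumption): a random 6-node column-stochastic matrix.
rng = np.random.default_rng(0)
A = rng.random((6, 6))
G = A / A.sum(axis=0)

G_R = reduced_google_matrix(G, [0, 2, 5])
```

In this sketch the columns of `G_R` again sum to one, so the reduced matrix can be read as a Google matrix of the selected nodes alone.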
Disciplines :
computer science, information systems (engineering science), microbiology (fundamental biology), infectious diseases (medical research), physics, mathematical (physics), multidisciplinary sciences
General metadata
Data acquisition date :
from 1 May 2017 to 31 May 2017
Data acquisition methods :
- Derived or compiled data : Web crawling of Wikipedia editions (May 2017) to retrieve information.
- Simulation or computational data : The PageRank, CheiRank and 2DRank algorithms were used to rank the articles of the English Wikipedia edition (May 2017). The reduced Google matrix method was used to infer interactions between articles.
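The ranking step listed above can be sketched as follows, under stated assumptions: PageRank is taken as the stationary vector of the Google matrix of the link graph, and CheiRank as the same quantity computed on the graph with all links reversed. The function names, the 4-node toy graph and the damping factor of 0.85 (a conventional value that this record does not state) are illustrative choices. 2DRank is not covered here.

```python
import numpy as np

def google_matrix(adj, alpha=0.85):
    """Column-stochastic Google matrix; adj[i, j] = 1 if node j links to node i."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    # Columns of dangling nodes (no outgoing links) are replaced by 1/n.
    S = np.where(col_sums > 0, adj / safe, 1.0 / n)
    return alpha * S + (1.0 - alpha) / n

def stationary_vector(G, tol=1e-12, max_iter=1000):
    """Power iteration for the leading eigenvector of G, normalized to sum 1."""
    p = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(max_iter):
        p_next = G @ p
        p_next /= p_next.sum()
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def pagerank(adj, alpha=0.85):
    return stationary_vector(google_matrix(adj, alpha))

def cheirank(adj, alpha=0.85):
    # CheiRank is PageRank of the network with inverted link directions.
    return stationary_vector(google_matrix(np.asarray(adj).T, alpha))

# Toy graph (assumption): edges given as (source, target).
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
adj = np.zeros((4, 4))
for src, dst in edges:
    adj[dst, src] = 1.0

pr = pagerank(adj)
ch = cheirank(adj)
```

In this toy graph node 2, which receives the most links, comes first in PageRank, while node 3, which receives none, keeps only the uniform teleportation share.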
Language :
English (eng)
Formats :
application/pdf, image/png, image/x-eps, text/csv, text/html
Audience :
General, Research, Stakeholder, Policy maker
Publications :
- World Influence of Infectious Diseases from Wikipedia Network Analysis (doi:10.1109/ACCESS.2019.2899339)
Collection :
Publisher :
Institut UTINAM (UMR 6213)
Projects and funders :
- APEX - Analyse Physique des résEaux compleXes
  - Research project, 2017 funding (Région Bourgogne Franche-Comté)
- GNETWORKS - Google matrix analysis of real complex networks
  - I-SITE UBFC (COMUE UBFC)
DOI and links
10.25666/DATAOSU-2019-01-10-02
https://dx.doi.org/doi:10.25666/DATAOSU-2019-01-10-02
https://search-data.ubfc.fr/FR-18008901306731-2019-01-10-02
Citation
José Lages, Dima Shepelyansky, Guillaume Rollin (2018): World influence of infectious diseases from Wikipedia network analysis. UTINAM. doi:10.25666/DATAOSU-2019-01-10-02
Record created 10 Jan 2019 by José Lages.
Last modification : 18 Mar 2019.
Local identifier: FR-18008901306731-2019-01-10-02.