
Finisterra - Revista Portuguesa de Geografia

 ISSN 0430-5027

GIOIA, Thamy Barbara; BARROS, Juliana Ramalho and SILVA, Renato Rodrigues da. Socioeconomic factors and machine learning algorithms applied to neglected diseases risk prediction. Case study in the municipalities of the Goiás State and Federal District, Brazil. Finisterra, 121, pp. 109-123. 31--2022. ISSN 0430-5027. https://doi.org/10.18055/finis28635.

Analyzing the relationship between socioeconomic variables and neglected tropical diseases can help managers design public policies to reduce cases. The objective of this study was to evaluate, using machine learning algorithms, which socioeconomic variables are most important for the risk classification of three neglected diseases: leprosy, cutaneous leishmaniasis, and dengue. Three decision-tree-based algorithms were evaluated: Random Forest (RF), XGBoost, and C5.0. The study area comprised the municipalities of the state of Goiás and the Federal District, Brazil. For the dengue risk classes, both RF and XGBoost achieved accuracy above 0.6, and both identified low-income conditions, literacy, and race as the most important predictive variables. For the leprosy risk classes, all three algorithms achieved accuracy above 0.6 and indicated water supply, literacy, race, and housing as important variables. For the cutaneous leishmaniasis risk classes, the algorithms achieved accuracy below 0.4, which made it unfeasible to evaluate possible predictive variables. The three algorithms showed similar predictive performance, with RF performing slightly better. The most important socioeconomic variables for predicting dengue and leprosy risk classes were similar.

Keywords: Neglected tropical diseases; social determinants; XGBoost; Random Forest; C5.0.
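
The idea of ranking socioeconomic variables by their contribution to a risk classifier can be illustrated with a minimal sketch. It uses permutation importance, a model-agnostic measure that may differ from the importance scores the study obtained from RF, XGBoost, and C5.0. Everything in it is an assumption made for illustration: the data are synthetic, the feature names `low_income`, `literacy`, and `noise` are hypothetical, and `predict` is a hand-written stand-in for a fitted model rather than any of the algorithms evaluated in the article.

```python
import random


def accuracy(y_true, y_pred):
    """Share of rows whose risk class was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Return baseline accuracy and, per column, the mean accuracy drop
    observed when that column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (hypothetical): each row is [low_income, literacy, noise] in [0, 1].
FEATURES = ["low_income", "literacy", "noise"]
data_rng = random.Random(42)
X = [[data_rng.random() for _ in FEATURES] for _ in range(200)]


def predict(row):
    """Hypothetical stand-in for a fitted classifier: 1 = high risk, 0 = low risk."""
    low_income, literacy, _noise = row
    return int(0.7 * low_income + 0.3 * (1.0 - literacy) > 0.5)


# Labels follow the same invented rule, so baseline accuracy is 1.0 by construction.
y = [predict(row) for row in X]

baseline, importances = permutation_importance(predict, X, y)
for name, score in sorted(zip(FEATURES, importances), key=lambda p: -p[1]):
    print(f"{name}: mean accuracy drop {score:.3f}")
```

A larger accuracy drop means the classifier relies more on that variable. A variable the classifier ignores, such as `noise` here, shows no drop at all. With a real model, the importances would normally be computed on held-out municipalities rather than on the training data.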
