Introduction: old scars, new codes
Leprosy, an affliction that once shaped empires and exiles, continues to cast a long shadow across the twenty-first century in several areas of the world. Despite the remarkable decline in prevalence following the introduction of multidrug therapy (MDT) in the 1980s, approximately 200,000 new cases are reported annually worldwide, disproportionately affecting the most marginalized communities1,2. Diagnosis still rests largely on the trained human eye: recognition of hypopigmented or erythematous patches with sensory loss, thickened nerves, or the presence of acid-fast bacilli on slit-skin smear1. Yet these signs often arrive late, when neural damage and stigma have already taken root.
The World Health Organization (WHO) has repeatedly emphasized the urgency of accelerating diagnosis and breaking transmission1. However, dwindling clinical expertise, particularly in regions where leprosy incidence is now sporadic, has deepened the diagnostic gap3. Here lies the quiet promise of artificial intelligence (AI): an ensemble of algorithms capable of learning patterns beyond human perception, translating the language of skin, nerve, and data into signals that could shorten the diagnostic odyssey.
Recent years have witnessed a burgeoning interest in digital health interventions for neglected tropical diseases (NTDs), with AI at their frontier4,5. While melanoma, psoriasis, and atopic dermatitis have already been subjects of robust machine learning (ML) pipelines, leprosy has only begun to find its place in this digital landscape6. What follows in this review is both a cartography and a critique: a mapping of where AI has already walked alongside leprosy, and where its footprints remain faint.
AI primer: teaching machines to see
AI, at its core, is the science of teaching machines to see patterns where humans falter. It thrives on three intertwined streams: ML, which allows computers to learn from data without explicit programming; deep learning (DL), which uses multilayered neural networks to extract hierarchies of meaning; and convolutional neural networks (CNNs), architectures inspired by the visual cortex that excel in recognizing features within images4.
In dermatology, these architectures have become particularly powerful. A CNN can parse the subtle borders of a melanoma, distinguish the erythema of eczema from psoriasis, or map acne severity with accuracy rivalling expert clinicians4. This success provides a fertile precedent for NTDs, where diagnostic expertise is scarce and stigma delays recognition.
The learning process of AI can be explained through an analogy drawn from animal behavior. Consider a deer named Shakuntala, whose survival depends on learning from her surroundings. Early in life, Shakuntala learns by repetition. Certain sounds or movements are repeatedly followed by danger, while others are harmless. Over time, she learns to respond correctly, even though she does not know the reason behind each signal. This is similar to ML, where systems learn by repeatedly seeing labeled examples.
With experience, Shakuntala’s learning becomes more advanced. She no longer depends on a single sign. Instead, she combines several cues at once, such as changes in sound, movement, and wind direction. Looking at these cues together allows her to make better decisions. This reflects DL, where models learn complex patterns by analyzing large amounts of data rather than following fixed rules.
When observing her environment, Shakuntala does not take in everything at once. She notices small details first, such as movement at the edge of vision or contrast among leaves, and then combines them to understand the larger scene. CNNs work in a similar way by analyzing small parts of an image and gradually combining them to reach an accurate conclusion.
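The "small parts first" idea can be made concrete with a minimal sketch of the core CNN operation, written here in plain Python. The toy image, the hand-written edge kernel, and the function name `conv2d` are illustrative assumptions; a real CNN learns many such kernels from data and stacks them in layers.

```python
# Minimal 2D convolution (strictly, cross-correlation, as used in most CNN libraries).
# The image and kernel values below are made up for illustration.

def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends on only one small patch of the image.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Toy 4x5 grayscale "image": darker skin (0) on the left, a paler patch (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-written vertical-edge kernel; a trained CNN would learn such weights itself.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, edge_kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only where the kernel straddles the border of the paler patch, which is the sense in which early CNN layers respond to local contrast before later layers combine those responses into larger shapes.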
In dermatology, disease recognition is fundamentally visual; the defining “identity” of a condition lies in the morphology of the lesion, including its shape, border characteristics, color variations, and surface texture. AI systems are trained to interpret these same visual cues. Through exposure to large, annotated image datasets, AI algorithms learn to differentiate the characteristic features of leprosy from those of other dermatoses, much like how a child learns to recognize familiar faces within a family through repeated observation and pattern recognition4.
For leprosy, the biological canvas is both complex and fragile: hypopigmented or erythematous patches, nodules, infiltrated skin, and the silent thickening of nerves. These lesions are visual, yet their patterns are subtle, overlapping with other dermatoses such as vitiligo, pityriasis alba, tinea, and even eczema6. The promise of AI is that it can distil thousands of such images into clusters of recognition, seeing regularities invisible to the human gaze.
But AI is not merely an “eye.” It can be trained to integrate multimodal data: clinical metadata (age, sex, geography), sensory test results, slit-skin smears, histopathological slides, or even molecular signatures7. This layered approach mirrors the Ridley-Jopling spectrum2, where disease expression exists along a continuum rather than in binaries. Algorithms can learn the “grey zones” that human classification often struggles with.
Thus, AI in leprosy begins not as a replacement for the clinician but as a new interpreter of signals: signals etched into skin, whispered through nerves, and translated into code.
Diagnosis: spotting the lesions we miss
Diagnosis has always been leprosy’s first betrayal: the faint hypopigmented patch mistaken for pityriasis alba, the tingling nerve dismissed as fatigue, the nodule that waits too long before being named. Even expert clinicians may falter when early signs mimic more common dermatoses. AI offers a new lens, not a perfect one, but one that does not tire, forget, or look away.
Two recent systematic reviews chart the landscape. Fernandes et al. synthesized 21 studies and found that most models, built on CNNs, achieved high diagnostic accuracy but suffered from limited datasets and poor external validation6. De Andrade et al. expanded the view to 30 studies, highlighting CNNs, support vector machines (SVMs), and neural networks trained on diverse image sets, again promising, but hindered by heterogeneity8.
Dermoscopy, long considered an extension of the dermatologist’s eye, is now becoming the bridge between human perception and machine vision. In a multicentric study, Ankad et al. showed that distinct dermatoscopic patterns, including distorted pigment networks (90.6%), focal white areas (75.5%), and reduced follicular openings (81.1%), could reliably indicate leprosy and its spectral variations, providing non-invasive diagnostic cues9. Parallel advances in AI have already transformed dermoscopic analysis in other skin diseases. Olayah et al. trained hybrid CNNs (AlexNet-GoogLeNet-VGG16) to interpret dermoscopy images with near-human precision (accuracy 96.1%)10. Together, these findings suggest that integrating dermatoscopic imaging into AI pipelines could enable early, non-invasive identification of leprosy lesions, where pigment, texture, and vascular clues become quantifiable patterns for machines to learn.
Among primary studies, Barbieri et al. demonstrated the AI4Leprosy system, which combines clinical photographs with metadata to achieve over 90% accuracy, with an area under the curve (AUC) of 96.46%11. Beesetty et al. used few-shot learning on only 368 images, suggesting that even with scarce data, algorithms can reach ~73% accuracy12. Baweja et al. went further, introducing “LeprosyNet,” an explainable CNN architecture. By visualizing which parts of the lesion influenced classification, their model not only reached 98% accuracy but also opened the “black box” of AI to clinicians13.
Together, these studies (Table 1) suggest that AI can already “see” the lesions we miss, though for now, only in curated datasets. The real challenge is not accuracy in silico, but translation into crowded clinics and rural outposts where stigma and silence still prevail.
Table 1 Diagnostic applications of AI in leprosy: evidence from recent studies
| Study | Dataset | AI method | Performance | Contribution |
|---|---|---|---|---|
| Fernandes et al. (JCM SLR)6 | 21 studies | Mostly CNNs | Accuracies generally high | First pooled evidence, highlights small datasets |
| De Andrade et al. (PLOS SLR)8 | 30 studies | CNNs, SVM, NN | Promising, heterogeneous | Most comprehensive systematic review |
| Barbieri et al. (Lancet Reg Health)11 | 1,229 images + 585 metadata | CNN + logistic regression | 90% accuracy, AUC 0.96 | Proof-of-concept, strong real-world dataset |
| Beesetty et al. (IJL)12 | 368 images | Few-shot learning CNN | ~73% accuracy | Shows feasibility with minimal data |
| Baweja et al. (IEEE)13 | Custom dataset | LeprosyNet (explainable CNN) | 98% accuracy | Introduces explainability via Grad-CAM |
AI: artificial intelligence; SLR: systematic literature review; CNN: convolutional neural network; SVM: support vector machine; NN: neural network; AUC: area under curve; Grad-CAM: gradient-weighted class activation mapping.
Across the five key sources6,8,11-13, AI models for leprosy diagnosis were primarily trained on curated datasets of visible skin lesions, including patches, plaques, macules, papules, and nodules, captured under standardized imaging conditions or sourced from public repositories. Only Barbieri et al.11 integrated clinical and sensory metadata (e.g., loss of thermal sensation, paraesthesia, scaling), whereas others, such as Beesetty et al.12, focused purely on image-based pattern recognition. No study included leprosy reactions (Type 1 or Type 2) or neural leprosy in its training dataset, and none analyzed AI’s ability to distinguish reactional states. Thus, while AI systems can now differentiate “leprosy-like lesions” with high accuracy in silico, their clinical input variables remain largely confined to morphological and sensory features of primary lesions, not inflammatory or reactional phases.
Classification: from Ridley-Jopling to algorithms
The Ridley-Jopling spectrum, with its polar forms (tuberculoid at one end, lepromatous at the other) and the borderline states in between, has long been the compass of leprosy classification2. Yet, in practice, the spectrum is often blurred. Clinical assessment can be subjective; slit-skin smears and histopathology, while useful, remain inconsistently available. This diagnostic ambiguity between paucibacillary (PB) and multibacillary (MB) disease carries therapeutic consequences, since treatment regimens depend on classification.
AI is now being trained to navigate these grey zones. De Souza et al. used registry data from Brazil’s SINAN system (National Notifiable Diseases Information System) to develop a mobile health app that applies ML algorithms (random forest, decision trees, logistic regression) to classify patients as PB or MB. Their best-performing model achieved 94% sensitivity and 87% specificity, suggesting AI could assist even non-specialist health workers in the field14.
Simões et al. developed the ML for Leprosy Suspicion Questionnaire Screening, applying SVMs and other models to patient-reported data. Their SVM classifier reached 85.7% sensitivity and 69.2% specificity, a performance notable for using only a short, 14-item questionnaire15.
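Because both studies report sensitivity and specificity rather than plain accuracy, a short worked example may help show how the two figures are derived from a confusion matrix. The counts below are hypothetical and are not taken from either study; the function name is likewise only illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: true positives   fn: false negatives
    tn: true negatives   fp: false positives
    """
    sensitivity = tp / (tp + fn)  # share of true cases that the model flags
    specificity = tn / (tn + fp)  # share of non-cases that the model clears
    return sensitivity, specificity

# Hypothetical screening run: 50 people with the condition, 100 without.
tp, fn = 45, 5    # 45 of the 50 cases are detected, 5 are missed
tn, fp = 80, 20   # 80 of the 100 non-cases are cleared, 20 are falsely flagged

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In a screening setting, high sensitivity means few missed cases, while lower specificity means more people referred unnecessarily; which trade-off is acceptable depends on the cost of each kind of error.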
Each of these algorithms brings distinct strengths. Decision trees split data into simple yes/no questions, like a flowchart, making them easy to interpret. Random forests combine many such trees to reduce errors and improve stability. Logistic regression uses probability to separate categories, a more statistical approach. SVMs work differently: they draw a “boundary” in the data that best separates PB from MB cases, even when the differences are subtle4.
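The first two ideas in the paragraph above can be sketched in plain Python as follows. The two features (lesion count and number of thickened nerves), the ten synthetic records, and all function names are assumptions made for illustration and do not come from the cited studies. Each "tree" here is a single yes/no question (a decision stump); real random forests grow deeper trees and also sample features at random.

```python
import random

def fit_stump(rows, labels):
    """Find the single question 'feature > threshold?' with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Also try a threshold below the minimum, so "always yes" is possible.
        for t in [values[0] - 1] + values:
            preds = [1 if r[f] > t else 0 for r in rows]
            errors = sum(p != y for p, y in zip(preds, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, feature, threshold = best
    return feature, threshold

def predict_stump(stump, row):
    feature, threshold = stump
    return 1 if row[feature] > threshold else 0

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = sum(predict_stump(s, row) for s in forest)
    return 1 if votes * 2 > len(forest) else 0

# Synthetic records: [lesion_count, thickened_nerves]; label 1 = MB, 0 = PB.
rows = [[1, 0], [2, 0], [3, 1], [4, 0], [5, 1],
        [6, 1], [8, 2], [10, 2], [12, 3], [15, 2]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

stump = fit_stump(rows, labels)
forest = fit_forest(rows, labels)
print("single question: feature", stump[0], "> threshold", stump[1])
print("one lesion, no nerves  ->", predict_forest(forest, [1, 0]))
print("20 lesions, 3 nerves   ->", predict_forest(forest, [20, 3]))
```

The single stump is easy to read aloud as a rule, which is the interpretability advantage of decision trees; the forest trades some of that transparency for stability, since no single resample decides the outcome.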
These examples (Table 2) show that AI is not limited to images; it can also classify through metadata, questionnaires, and registry inputs. Just as Ridley and Jopling once gave clinicians a framework to interpret leprosy’s variability, algorithms may provide new maps: not to replace judgment, but to refine it, especially where expert dermatologists are scarce.
Table 2 Classification attempts of leprosy using AI methods
| Study | Dataset/Input | AI method | Performance | Contribution |
|---|---|---|---|---|
| De Souza et al. (JMIR)14 | Brazil SINAN registry data | Random forest, decision tree, logistic regression | Sensitivity 94%, specificity 87% | PB versus MB classification in mHealth app |
| Mendonça Ramos Simões et al. (Sci Rep)15 | 14-item questionnaire (LSQ) | SVM, decision trees, ensemble models | SVM: 85.7% sensitivity, 69.2% specificity | Demonstrated AI screening via questionnaires |
PB: paucibacillary; MB: multibacillary; SVM: support vector machine; LSQ: leprosy suspicion questionnaire; mHealth: mobile health; SINAN: Sistema de Informação de Agravos de Notificação (National Notifiable Diseases Information System).
Monitoring: pixels that remember healing
Treatment in leprosy is measured not only in the swallowing of MDT pills but in the slow fading of patches, the softening of nodules, and the return or absence of sensation. Yet clinicians know this trajectory is often uncertain: lesions can persist, relapse may masquerade as reaction, and bacilli can linger long after apparent cure. Here, AI has begun to step in, offering a way to quantify what the human eye cannot reliably measure.
In a recent study, Novack et al.16 developed and validated a diagnostic method based on Fourier-transform mid-infrared spectrophotometry (MIR-FTIR) combined with chemometric modeling for the early detection and therapeutic monitoring of leprosy. Using plasma samples from untreated patients, post-treatment patients, and healthy controls, the authors applied principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) to classify spectra, with sensitivities and specificities in the range of 97-100%. Although the study demonstrated excellent diagnostic accuracy, the authors did not specify which spectral bands or biochemical constituents contributed most to group separation. Thus, while the method shows promise as a low-cost, minimally invasive biochemical fingerprinting approach that, although laboratory-based, could scale to therapy monitoring, further work is needed to identify the specific molecular alterations underlying the spectral differentiation between active, treated, and healthy states of leprosy.
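To illustrate the first chemometric step named above, the following is a minimal sketch of principal component analysis on invented "spectra" with three absorbance bins, written in plain Python with power iteration. It does not reproduce the authors' pipeline or data, and it omits the PLS-DA step entirely; the group labels, values, and function name are assumptions.

```python
def first_principal_component(samples, iterations=200):
    """Return (loadings, scores) for the first principal component via power iteration."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - means[j] for j in range(d)] for s in samples]
    # Sample covariance matrix of the mean-centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # starting direction; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the component to get its score.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Invented absorbance values in three spectral bins (not real measurements).
untreated = [[1.0, 2.0, 3.0], [1.1, 2.1, 3.2], [0.9, 1.9, 2.9]]
healthy = [[1.0, 2.0, 1.0], [1.1, 2.0, 1.1], [0.9, 2.1, 0.9]]

loadings, scores = first_principal_component(untreated + healthy)
print("loadings:", [round(x, 3) for x in loadings])
print("scores:  ", [round(x, 3) for x in scores])
```

In this toy example the two groups fall on opposite sides of zero along the first component, and the loadings are dominated by the third bin. Inspecting loadings in this way is one route to the question the study leaves open, namely which spectral bands drive the separation.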
At the cellular level, AI could also integrate molecular and immunogenetic signatures. Li et al. have described how the host immune response and genetic regulation shape outcomes in leprosy, including vaccine responsiveness and susceptibility7. Linking such biomarkers with ML pipelines may enable predictive models that forecast relapse or resistance before they manifest clinically.
In practice, these tools (Fig. 1) can translate into a vision of longitudinal AI monitoring: a patient’s clinical lesions and dermoscopic pattern photographed monthly, their plasma profiled periodically, their data compared not only to their own baseline but to global patterns learned across thousands of patients. Pixels remember what the eye forgets, and in doing so, they may prevent the cycle of relapse that has haunted leprosy control for decades.
Public health: epidemiology in the age of algorithms
If diagnosis is the intimate encounter between patient and clinician, epidemiology is the wide lens, mapping how disease moves through space and time. In leprosy, surveillance is hampered by stigma, underreporting, and limited resources. AI has begun to extend its reach into this domain, helping to chart “hidden geographies” of transmission.
Rodrigues da Motta and colleagues, writing in Leprosy Review, highlighted how AI-enhanced data systems can refine Brazil’s surveillance, detecting clusters and predicting incidence in regions where official numbers understate the true burden17. Their letter underscored the urgency of integrating AI-driven predictive analytics with routine health information systems.
In Senegal, Deutsche Lepra- und Tuberkulosehilfe (DAHW) and Belle.ai partnered with the Ministry of Health to deploy smartphone-based AI tools that assist frontline workers in identifying leprosy and other NTDs in remote areas. The system includes a geographic information system platform to track infections and coordinate treatment. By enabling early detection and rapid referral to specialists, the project demonstrates the potential of AI to strengthen surveillance and patient care in low-resource settings18.
At the global level, the WHO Skin NTD app, although not exclusively AI-driven, has introduced structured digital tools for frontline workers. When coupled with AI modules in development, it could enable syndromic surveillance across multiple NTDs, with leprosy as a key beneficiary19.
Together, these examples show that AI’s promise in public health lies not only in precision but also in scale: algorithms that can sift through scattered reports, images, and registries to generate maps, “cartographies of contagion,” where human vision alone cannot reach.
Challenges: bias, bugs, and barriers
AI does not arrive in leprosy care as a neutral tool. It inherits the flaws of the datasets that train it, the inequities of the systems that deploy it, and the silences of the patients who remain unseen. The promise of precision must be weighed against the dangers of distortion.
A policy perspective from the Malaysian Health Technology Assessment Section concluded that evidence on AI applications for leprosy is very limited. Existing models are based on small pilot studies, use different input datasets, and lack sufficient validation for routine use. The report emphasized that larger studies with more representative data are needed before these tools can be reliably deployed in public health settings20. In leprosy, this gap is sharper: models trained on carefully curated images may misclassify in field conditions, where lighting, camera quality, and skin tone vary widely.
Deps et al. underscored another critical issue: the scarcity and uneven quality of data. Stigma and underdiagnosis already limit case reporting in many endemic regions, creating a large pool of undetected patients. Training AI models on narrow, geographically restricted datasets risks encoding local biases into tools intended for global use21. Moreover, explainability remains uneven. While some models (such as LeprosyNet) highlight the parts of a lesion that influenced the decision13, others still provide results without showing their reasoning, making it difficult for clinicians to judge reliability.
Finally, there are ethical and legal questions: who owns lesion photographs captured in rural clinics? How are patients informed about AI use in their diagnosis? What safeguards exist if a model misclassifies a case? These questions remain unresolved but must accompany any technical advance.
Bias, bugs, and barriers are not abstract defects: they are reminders that AI in leprosy can amplify inequities if not designed, validated, and deployed with humility.
Future towards zero: neural nets and beyond
The aspiration of leprosy control has always been elimination, yet the road remains long. AI, if wisely shaped, could become part of that journey, not as a miracle cure, but as an amplifier of human effort.
Deps et al. remind us that AI’s future in leprosy depends on equity of access, since models trained and validated only in high-resource centers will fail the peripheries where they are needed most21. Still, there are emerging strategies to bridge these divides. Federated learning, for instance, allows algorithms to be trained across multiple regions without pooling sensitive data, ensuring both privacy and diversity of input22.
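One common federated scheme is federated averaging, sketched below in plain Python: each site trains on its own records, and only model weights travel to a coordinating server. The three sites, their synthetic data (generated from y = 2x + 1), the simple linear model, and all names are assumptions for illustration; this is not necessarily the approach described in the cited work.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Combine site models, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

# Hypothetical sites; the raw (x, y) records never leave their site.
site_data = {
    "site_a": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    "site_b": [(0.5, 2.0), (1.5, 4.0)],
    "site_c": [(-1.0, -1.0), (3.0, 7.0), (1.0, 3.0), (2.5, 6.0)],
}

global_weights = (0.0, 0.0)
for round_number in range(100):
    # Each site starts from the current global model and trains locally.
    updates = [local_update(global_weights, d) for d in site_data.values()]
    sizes = [len(d) for d in site_data.values()]
    # The server sees only the weights, not the underlying records.
    global_weights = federated_average(updates, sizes)

print("learned slope and intercept:", global_weights)
```

Sharing weights instead of records reduces exposure of patient data, although it does not by itself guarantee privacy; production systems typically add safeguards such as secure aggregation.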
Multimodal approaches also loom on the horizon. Instead of relying solely on photographs, future models may combine clinical images, nerve function tests, genomic data, and immunological markers into unified predictions. Han and Solanki note that such integrations already show promise in other infectious dermatoses, suggesting leprosy may follow4,5.
Another frontier lies in stigma reduction. By embedding AI within mobile tools that guide primary health workers, diagnosis may occur earlier and closer to home, reducing the delay that fuels disability and discrimination. In this way, the algorithm is not merely a classifier, but a silent advocate for patients who often remain unseen.
The path to zero will not be paved by neural nets alone, but by their careful alignment with public health systems, ethical safeguards, and communities themselves. AI will not end leprosy, but it may help us imagine an end.
Sociocultural context: the human face of the machine
Beyond accuracy curves and performance metrics, AI in leprosy will always encounter the most enduring variable: people. Leprosy has never been just a bacterium in a nerve and the skin; it is stigma, silence, and centuries of exclusion. Any algorithm that ignores this context risks becoming irrelevant, or worse, harmful.
The Leprosy Exists blog captures this tension, documenting voices of those affected who express both hope and unease toward digital tools23. Some patients view AI-driven mobile apps as symbols of modernity, offering earlier recognition and dignity in care. Others worry that a camera capturing their lesion is yet another intrusion, another layer of labeling that reduces them to data points.
Clinicians, too, occupy an ambivalent space. While some see AI as a relief from diagnostic uncertainty, others worry it may erode the intimacy of clinical touch, replacing the careful palpation of a thickened nerve with the sterile scan of a smartphone.
Ultimately, the social fabric in which AI is introduced will shape its acceptance more than any technical detail. Trust must be built, not assumed. For AI in leprosy to succeed, it must carry not only precision but also compassion, remembering that behind every dataset is a face, a voice, a life.
Conclusion: algorithms as allies, not replacements
The history of leprosy is a history of persistence: of a bacterium that evades eradication, of a stigma that outlives cure, of a disease that still hides at the margins. AI enters this story not as a cure, but as a companion technology, a set of tools that may sharpen our gaze, extend our reach, and accelerate our responses.
The evidence we have reviewed demonstrates real progress. Algorithms can already distinguish lesions with accuracy rivalling specialists6,8,11-13, classify patients into PB or MB with high sensitivity14,15, and even monitor biochemical signatures of treatment response7,16. They can scan maps for clusters17, guide mobile screening in rural campaigns18, and sit inside WHO’s global digital health frameworks19. These advances are neither trivial nor complete.
Yet AI must remain an ally, not an authority. Its limitations (data scarcity, uneven generalizability, and ethical uncertainties) demand humility21. As the future unfolds, federated learning, multimodal integration, and patient-centered apps may bring us closer to the goal of leprosy eradication. Still, no network, however deep, can replace the clinician’s judgment, the health worker’s trust, or the patient’s story.
The most powerful vision, then, is not of algorithms replacing human care, but of algorithms walking beside it: steadying, amplifying, illuminating. In that alliance, the centuries-long shadow of leprosy may finally begin to fade.














