Machine learning to identify persons at high-risk of HIV acquisition in rural Kenya and Uganda


Background: In generalized epidemic settings, strategies are needed to prioritize individuals at higher risk of human immunodeficiency virus (HIV) acquisition for prevention services. We used population-level HIV testing data from rural Kenya and Uganda to construct HIV risk scores and assessed their ability to identify seroconversions.

Methods: During 2013-2017, >75% of residents in 16 communities in the SEARCH study were tested annually for HIV. In this population, we evaluated 3 strategies for using demographic factors to predict the 1-year risk of HIV seroconversion: membership in ≥1 known "risk group" (eg, having a spouse living with HIV), a "model-based" risk score constructed with logistic regression, and a "machine learning" risk score constructed with the Super Learner algorithm. We hypothesized that machine learning would identify high-risk individuals more efficiently (fewer persons targeted for a fixed sensitivity) and with higher sensitivity (for a fixed number targeted) than either of the other approaches.
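
The two evaluation criteria named above (persons targeted for a fixed sensitivity, and sensitivity for a fixed number targeted) can be illustrated with a minimal sketch. It assumes each person has a numeric risk score and a binary seroconversion outcome, and that persons are targeted in descending order of score. The function names and the toy data are hypothetical, made up for illustration only; they are not SEARCH data, and the sketch does not reproduce the study's estimation procedure.

```python
from math import ceil


def _rank_by_score(scores, outcomes):
    """Pair scores with outcomes and sort from highest to lowest score.

    Assumption: ties are broken by input order (Python's sort is stable).
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    return sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)


def fraction_targeted_for_sensitivity(scores, outcomes, target_sensitivity):
    """Smallest fraction of the population, targeted in descending score
    order, that captures at least `target_sensitivity` of observed events."""
    total_events = sum(outcomes)
    if total_events == 0:
        raise ValueError("at least one event is required")
    needed = ceil(target_sensitivity * total_events)
    if needed <= 0:
        return 0.0
    ranked = _rank_by_score(scores, outcomes)
    captured = 0
    for n_targeted, (_, outcome) in enumerate(ranked, start=1):
        captured += outcome
        if captured >= needed:
            return n_targeted / len(ranked)
    return 1.0


def sensitivity_for_fraction_targeted(scores, outcomes, max_fraction):
    """Share of observed events captured when at most `max_fraction` of the
    population (highest scores first) is targeted."""
    total_events = sum(outcomes)
    if total_events == 0:
        raise ValueError("at least one event is required")
    ranked = _rank_by_score(scores, outcomes)
    # Small epsilon guards against floating-point error before flooring.
    n_targeted = int(max_fraction * len(ranked) + 1e-9)
    captured = sum(outcome for _, outcome in ranked[:n_targeted])
    return captured / total_events


# Hypothetical toy data: 10 persons, 4 of whom seroconvert (outcome = 1).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

efficiency = fraction_targeted_for_sensitivity(toy_scores, toy_outcomes, 0.50)
sensitivity = sensitivity_for_fraction_targeted(toy_scores, toy_outcomes, 0.50)
print(f"fraction targeted for 50% sensitivity: {efficiency:.2f}")
print(f"sensitivity when targeting 50%: {sensitivity:.2f}")
```

Under these assumptions, a better risk score yields a smaller value from the first function and a larger value from the second, which is the sense in which the strategies are compared.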

Results: A total of 75 558 persons contributed 166 723 person-years of follow-up; 519 seroconverted. Machine learning improved efficiency. To achieve a fixed sensitivity of 50%, the risk-group strategy targeted 42% of the population, the model-based strategy targeted 27%, and machine learning targeted 18%. Machine learning also improved sensitivity. With an upper limit of 45% targeted, the risk-group strategy correctly classified 58% of seroconversions, the model-based strategy 68%, and machine learning 78%.

Conclusions: Machine learning improved classification of individuals at risk of HIV acquisition compared with a model-based approach or reliance on known risk groups and could inform targeting of prevention strategies in generalized epidemic settings.

Clinical trials registration: NCT01864603.

Keywords: HIV prevention; HIV risk score; PrEP; SEARCH Study; clinical prediction rule.

Balzer LB, Havlir DV, Kamya MR, Chamie G, Charlebois ED, Clark TD, Koss CA, Kwarisiima D, Ayieko J, Sang N, Kabami J, Atukunda M, Jain V, Camlin CS, Cohen CR, Bukusi EA, van der Laan M, Petersen ML.
Publication date: December 3, 2020
Publication type: Journal Article