Background: Cigarette smoking is widespread among HIV-infected patients, who confront increased risk of smoking-related co-morbidities. The effects of HIV infection and HIV-related variables on smoking and smoking cessation are incompletely understood. We investigated the correlates of smoking and quitting in an HIV-infected cohort using a validated natural language processor to determine smoking status. Methods: We developed and validated an algorithm using natural language processing (NLP) to ascertain smoking status from electronic health record data. The algorithm was applied to records for a cohort of 3487 HIV-infected patients from a large health care system in Boston, USA, and 9446 uninfected control patients matched 3:1 on age, gender, race, and clinical encounters. NLP was used to identify and classify smoking-related portions of free-text notes. These classifications were combined into patient-year smoking status and used to classify patients as ever versus never smokers and as current smokers versus non-smokers. Generalized linear models were used to assess associations of HIV with three outcomes: ever smoking, current smoking, and current smoking in analyses limited to ever smokers (persistent smoking), while adjusting for demographics, cardiovascular risk factors, and psychiatric illness. Analyses were repeated within the HIV cohort, with the addition of CD4 cell count and HIV viral load, to assess associations of these HIV-related factors with the smoking outcomes. Results: Using the natural language processing algorithm to assign annual smoking status yielded a sensitivity of 92.4%, a specificity of 86.2%, and an AUC of 0.89 (95% confidence interval [CI] 0.88–0.91). Ever and current smoking were more common in HIV-infected patients than in controls (54% vs. 44% and 42% vs. 30%, respectively; both P<0.001).
In multivariate models, HIV was independently associated with ever smoking (adjusted rate ratio [ARR] 1.18, 95% CI 1.13–1.24, P<0.001), current smoking (ARR 1.33, 95% CI 1.25–1.40, P<0.001), and persistent smoking (ARR 1.11, 95% CI 1.07–1.15, P<0.001). Within the HIV cohort, having detectable HIV RNA was significantly associated with all three smoking outcomes. Conclusions: Using a novel algorithm to ascertain smoking status from electronic health record data and accounting for multiple confounding clinical factors, we found that HIV was independently associated with both smoking and not quitting smoking. Further research is needed to identify HIV-related barriers to smoking cessation and to develop aggressive interventions specific to HIV-infected patients.
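
The step in which note-level classifications are combined into patient-year smoking status, and then into the ever, current, and persistent smoking outcomes, could be sketched roughly as follows. This is an illustrative sketch only, not the study's algorithm: the label set (`current`, `former`, `never`), the majority-vote rule, the tie-breaking order, the use of the most recent year for current status, and the function names are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Assumed tie-break order when two labels are equally frequent in a year.
PRIORITY = {"current": 2, "former": 1, "never": 0}


def patient_year_status(notes):
    """Collapse note-level labels into one label per (patient, year).

    `notes` is an iterable of (patient_id, year, label) tuples, where label
    is one of "current", "former", "never" (hypothetical label set).
    Uses a majority vote per patient-year (an assumed rule).
    """
    counts = defaultdict(Counter)
    for pid, year, label in notes:
        counts[(pid, year)][label] += 1
    return {
        key: max(c, key=lambda lab: (c[lab], PRIORITY[lab]))
        for key, c in counts.items()
    }


def smoking_outcomes(status_by_year):
    """Derive ever, current, and persistent smoking flags per patient.

    "current" is taken from the most recent observed year (an assumption);
    "persistent" is current smoking among ever smokers, and is None for
    never smokers, who are excluded from that analysis.
    """
    by_patient = defaultdict(dict)
    for (pid, year), status in status_by_year.items():
        by_patient[pid][year] = status

    outcomes = {}
    for pid, years in by_patient.items():
        ever = any(s in ("current", "former") for s in years.values())
        current = years[max(years)] == "current"
        outcomes[pid] = {
            "ever": ever,
            "current": current,
            "persistent": current if ever else None,
        }
    return outcomes


# Toy, made-up input purely to show the data flow.
example_notes = [
    ("p1", 2010, "current"),
    ("p1", 2010, "current"),
    ("p1", 2010, "former"),
    ("p1", 2011, "former"),
    ("p2", 2010, "never"),
]
example_status = patient_year_status(example_notes)
example_outcomes = smoking_outcomes(example_status)
```

In this toy input, patient `p1` is an ever smoker who is not a current smoker in the latest year, so the persistent-smoking flag is false; patient `p2` is a never smoker and falls outside the persistent-smoking analysis.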