
7 research outputs found

Cost-sensitive ordinal classification methods to predict SARS-CoV-2 pneumonia severity

Author: España Yandiola P.P.
García F.
Hayet-Otero M.
Lee D.J.
Martínez-Minaya J.
Menéndez R.
Nieves Ermecheo M.
Quintana J.M.
Torres A.
Urrutia Landa I.
Zalacain Jorge R.
Publication date: 08/02/2024

Objective: To study the suitability of cost-sensitive ordinal artificial intelligence-machine learning (AI-ML) strategies in the prognosis of SARS-CoV-2 pneumonia severity. Materials & methods: Observational, retrospective, longitudinal, cohort study in 4 hospitals in Spain. Information regarding demographic and clinical status was supplemented by socioeconomic data and air pollution exposures. We proposed AI-ML algorithms for ordinal classification via ordinal decomposition and for cost-sensitive learning via resampling techniques. For performance-based model selection, we defined a custom score including per-class sensitivities and asymmetric misprognosis costs. 260 distinct AI-ML models were evaluated via 10 repetitions of 5×5 nested cross-validation with hyperparameter tuning. Model selection was followed by the calibration of predicted probabilities. Final overall performance was compared against five well-established clinical severity scores and against a ‘standard’ (non-cost sensitive, non-ordinal) AI-ML baseline. In our best model, we also evaluated its explainability with respect to each of the input variables. Results: The study enrolled n = 1548 patients: 712 experienced low, 238 medium, and 598 high clinical severity. d = 131 variables were collected, becoming d′ = 148 features after categorical encoding. Model selection resulted in our best-performing AI-ML pipeline having: a) no imputation of missing data, b) no feature selection (i.e. using the full set of d′ features), c) ‘Ordered Partitions’ ordinal decomposition, d) cost-based reimbalance, and e) a Histogram-based Gradient Boosting classifier. This best model (calibrated) obtained a median accuracy of 68.1% [67.3%, 68.8%] (95% confidence interval), a balanced accuracy of 57.0% [55.6%, 57.9%], and an overall area under the curve (AUC) 0.802 [0.795, 0.808]. In our dataset, it outperformed all five clinical severity scores and the ‘standard’ AI-ML baseline. Discussion & conclusion: We conducted an exhaustive exploration of AI-ML methods designed for both ordinal and cost-sensitive classification, motivated by a real-world application domain (clinical severity prognosis) in which these topics arise naturally. Our model with the best classification performance exploited successfully the ordering information of ground truth classes, coping with imbalance and asymmetric costs. However, these ordinal and cost-sensitive aspects are seldom explored in the literature.
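The abstract names an ‘Ordered Partitions’ ordinal decomposition but does not spell it out. The sketch below is not the authors' pipeline: it assumes the term refers to the common scheme in which a K-class ordinal problem is split into K-1 binary tasks of the form "is severity greater than k?", whose outputs are then recombined into per-class probabilities. The function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def ordered_partition_targets(y: Sequence[int], n_classes: int) -> List[List[int]]:
    """Build one binary target per threshold k = 0..K-2, encoding 1[y > k]."""
    return [[1 if label > k else 0 for label in y] for k in range(n_classes - 1)]


def class_probabilities(p_greater: Sequence[float]) -> List[float]:
    """Convert P(y > k) for k = 0..K-2 into P(y = k) for k = 0..K-1."""
    # Pad with the trivial bounds P(y > -1) = 1 and P(y > K-1) = 0.
    cumulative = [1.0, *p_greater, 0.0]
    probs = [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]
    # Independently trained binary models may be inconsistent with each other,
    # so negative differences are clipped and the result is renormalised.
    probs = [max(p, 0.0) for p in probs]
    total = sum(probs)
    return [p / total for p in probs]


# Toy labels: 0 = low, 1 = medium, 2 = high severity (illustrative only).
severity = [0, 2, 1, 0, 2]
targets = ordered_partition_targets(severity, n_classes=3)

# Hypothetical outputs of the two binary models for a single patient:
# P(severity > 0) = 0.7 and P(severity > 1) = 0.4.
probs = class_probabilities([0.7, 0.4])
```

Each binary target list could be fed to any binary classifier; the recombination step is what lets the final prediction respect the ordering of the severity levels.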

BCAM's Institutional Repository Data

Extracting relevant predictive variables for COVID-19 severity prognosis: An exhaustive comparison of feature selection techniques

Author: Arostegui I.
España Yandiola P.P.
García F.
Hayet-Otero M.
Lee D.-J.
Martínez-Minaya J.
Menéndez R.
Nieves Ermecheo M.
Quintana J.M.
Torres A.
Urrutia Landa I.
Zalacain Jorge R.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2023

With the COVID-19 pandemic having caused unprecedented numbers of infections and deaths, large research efforts have been undertaken to increase our understanding of the disease and the factors which determine diverse clinical evolutions. Here we focused on a fully data-driven exploration of which factors (clinical or otherwise) were most informative for SARS-CoV-2 pneumonia severity prediction via machine learning (ML). In particular, feature selection (FS) techniques, designed to reduce the dimensionality of data, allowed us to characterize which of our variables were the most useful for ML prognosis. We conducted a multi-centre clinical study, enrolling n=1548 patients hospitalized due to SARS-CoV-2 pneumonia, of whom 712, 238, and 598 experienced low, medium and high-severity evolutions, respectively. Up to 106 patient-specific clinical variables were collected at admission, although 14 of them had to be discarded for containing ⩾60% missing values. Alongside 7 socioeconomic attributes and 32 exposures to air pollution (chronic and acute), these became d=148 features after variable encoding. We addressed this ordinal classification problem both as an ML classification task and as a regression task. Two imputation techniques for missing data were explored, along with a total of 166 unique FS algorithm configurations: 46 filters, 100 wrappers and 20 embeddeds. Of these, 21 setups achieved satisfactory bootstrap stability (⩾0.70) with reasonable computation times: 16 filters, 2 wrappers, and 3 embeddeds. The subsets of features selected by each technique showed modest Jaccard similarities across them. However, they consistently pointed out the importance of certain explanatory variables, namely: the patient’s C-reactive protein (CRP), pneumonia severity index (PSI), respiratory rate (RR) and oxygen levels (saturation SpO2, quotients SpO2/RR and arterial SatO2/FiO2), the neutrophil-to-lymphocyte ratio (NLR) and, to a certain extent, neutrophil and lymphocyte counts separately, lactate dehydrogenase (LDH), and procalcitonin (PCT) levels in blood. A remarkable agreement was found a posteriori between our strategy and independent clinical research works investigating risk factors for COVID-19 severity. Hence, these findings stress the suitability of this type of fully data-driven approach for knowledge extraction, as a complement to clinical perspectives.
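The abstract compares the feature subsets chosen by different FS techniques through their Jaccard similarity. As a minimal sketch of that comparison: the three subsets below are invented for illustration (they reuse variable names from the abstract but are not the paper's results), and the dictionary keys are hypothetical labels.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical subsets returned by three FS setups (illustrative only).
selected: Dict[str, Set[str]] = {
    "filter": {"CRP", "PSI", "RR", "SpO2", "NLR"},
    "wrapper": {"CRP", "PSI", "LDH", "PCT"},
    "embedded": {"CRP", "SpO2", "NLR", "LDH"},
}

# Similarity for every pair of techniques.
pairwise: Dict[Tuple[str, str], float] = {
    (m1, m2): jaccard(selected[m1], selected[m2])
    for m1, m2 in combinations(selected, 2)
}

# Variables chosen by every technique, despite modest pairwise overlap.
consensus = set.intersection(*selected.values())
```

Modest pairwise values together with a non-empty consensus set would mirror the pattern the abstract describes: techniques disagree on the full subset yet agree on a core of variables.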

BCAM's Institutional Repository Data

Impact of outdoor air pollution on severity and mortality in COVID-19 pneumonia

Author: Arostegui I.
Bronte Moreno O.
Cillóniz C.
España Yandiola P.P.
García F.
Lee D.-J.
Martínez-Minaya J.
Menéndez Villanueva R.
Méndez Ocaña R.
Nieves Ermecheo M.
Quintana J.M.
Ruiz Iturriaga L.A.
Serrano Fernández L.
Torres Marti A.
Uranga Echeverria A.
Urrutia Landa I.
Zalacain Jorge R.
Publication date: 01/06/2023

The relationship between exposure to air pollution and the severity of coronavirus disease 2019 (COVID-19) pneumonia and other outcomes is poorly understood. Beyond age and comorbidity, risk factors for adverse outcomes including death have been poorly studied. The main objective of our study was to examine the relationship between exposure to outdoor air pollution and the risk of death in patients with COVID-19 pneumonia using individual-level data. The secondary objective was to investigate the impact of air pollutants on gas exchange and systemic inflammation in this disease. This cohort study included 1548 patients hospitalised for COVID-19 pneumonia between February and May 2020 in one of four hospitals. Local agencies supplied daily data on environmental air pollutants (PM10, PM2.5, O3, NO2, NO and NOX) and meteorological conditions (temperature and humidity) in the year before hospital admission (from January 2019 to December 2019). Daily exposure to pollution and meteorological conditions by individual postcode of residence was estimated using geospatial Bayesian generalised additive models. The influence of air pollution on pneumonia severity was studied using generalised additive models which included: age, sex, Charlson comorbidity index, hospital, average income, air temperature and humidity, and exposure to each pollutant. Additionally, generalised additive models were generated for exploring the effect of air pollution on C-reactive protein (CRP) level and SpO2/FiO2 at admission. According to our results, both risk of COVID-19 death and CRP level increased significantly with median exposure to PM10, NO2, NO and NOX, while higher exposure to NO2, NO and NOX was associated with lower SpO2/FiO2 ratios. In conclusion, after controlling for socioeconomic, demographic and health-related variables, we found evidence of a significant positive relationship between air pollution and mortality in patients hospitalised for COVID-19 pneumonia. Additionally, inflammation (CRP) and gas exchange (SpO2/FiO2) in these patients were significantly related to exposure to air pollution.

BCAM's Institutional Repository Data

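Quantitative impact of air pollution on the probability of death from SARS-CoV-2 pneumonia (original title: Impacto cuantitativo de la contaminación en la probabilidad de muerte por neumonía por SARS-CoV-2)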

Author: Arostegui I.
Artaraz Ereño A.
Bronte Moreno O.
Cillóniz C.
España Yandiola P.P.
García Hontoria P.
García F.
Jódar Samper A.
Lee D.-J.
Martínez-Minaya J.
Menéndez Villanueva R.
Méndez Ocaña R.
Ruiz Iturriaga L.A.
Serrano Fernández L.
Torres Marti A.
Urrutia Landa I.
Zalacain Jorge R.
Publication date: 01/11/2021

Introduction: The available scientific evidence indicates that outdoor air pollution could aggravate the severity of COVID-19 and, consequently, increase the probability of death. Materials and methods: Observational, longitudinal, retrospective, multicentre cohort study in 4 hospitals: 2 in Bizkaia (1 urban, 1 urban-rural), Valencia and Barcelona (urban). Admissions for SARS-CoV-2 pneumonia during the first epidemic peak of COVID-19 (February-May 2020) were included. To determine exposure to PM10 and NO2 pollution, data published by the regional air quality agencies were obtained for 2019 and the first half of 2020. A Generalised Additive Model (GAM) was used to estimate the daily pollutant level in each postcode as a function of the geographic coordinates and altitude of the measuring stations [Figure 1]. To characterise chronic exposure, the mean and maximum over 2019 were calculated; acute exposure was characterised by the mean and maximum over the 7 days before admission. The odds ratio (OR) of death versus survival within our cohort was studied. It was modelled with a logistic-regression GAM, including sex, age and pollutant as fixed effects, hospital as a random effect, and the Charlson comorbidity index as a smooth function via penalised splines. Results: Of the 1548 patients recruited, 243 (15.7%) died during hospitalisation and/or within 30 days of admission. According to the models [Table 1], there is statistically significant evidence that chronic exposure to PM10 and NO2 increases the probability of death from SARS-CoV-2 pneumonia. Adjusting for sex, age and Charlson index (all factors positively related to the OR of death) as well as for hospital, each 10 μg/m³ increase in the PM10 level (annual maximum) raises the OR by 10.5%, linearly proportional to the increase in pollution. Meanwhile, each additional 10 μg/m³ of NO2 (annual mean) raises the OR by 35.7%; each additional 10 μg/m³ of acute exposure to NO2 (mean over the week before admission) raises it by 62.9%; and NO2 (weekly maximum) by 34.4%. Conclusions: The effects of sex, age, Charlson index and hospital were quantified and adjusted for. All else being equal, increases in chronic and acute exposure to PM10 and NO2 linearly and statistically significantly increase the probability of death from SARS-CoV-2 pneumonia.
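As a worked reading of the figures in this abstract, the sketch below checks the reported mortality share and converts a "percent increase in OR per 10 μg/m³" into the implied logistic coefficient. It assumes the pollutant enters as a linear fixed effect on the log-odds scale, as in an ordinary logistic regression; under that assumption the effect is linear in the log-odds, so odds ratios compound multiplicatively over larger increments. The function names are hypothetical and this is not the authors' code.

```python
import math


def log_odds_slope(pct_increase_per_step: float, step: float = 10.0) -> float:
    """Logistic coefficient per 1 ug/m^3 implied by an OR increase per `step`."""
    return math.log(1.0 + pct_increase_per_step / 100.0) / step


def odds_ratio(delta: float, pct_increase_per_step: float, step: float = 10.0) -> float:
    """Odds ratio for an exposure increase of `delta` ug/m^3 (assumed log-linear)."""
    return math.exp(log_odds_slope(pct_increase_per_step, step) * delta)


# Share of deaths reported in the abstract: 243 of 1548 patients.
mortality_pct = 100.0 * 243 / 1548

# PM10 (annual maximum): +10.5% in OR per 10 ug/m^3, as reported.
or_pm10_10 = odds_ratio(10.0, 10.5)
# Extrapolation to 20 ug/m^3 under the log-linear assumption: 1.105 squared.
or_pm10_20 = odds_ratio(20.0, 10.5)
```

The 20 μg/m³ value is an extrapolation under the stated assumption, not a figure reported in the abstract.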

BCAM's Institutional Repository Data

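Predicting the severity of SARS-CoV-2 pneumonia from clinical information and air pollution using artificial intelligence (original title: Predicción de la gravedad de neumonías por SARS-CoV-2 a partir de información clínica y contaminación, mediante inteligencia artificial)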

Author: Arostegui I.
Bronte Moreno O.
Cillóniz C.
España Yandiola P.P.
García Hontoria P.
García F.
Jódar Samper A.
Lee D.-J.
Martínez-Minaya J.
Menéndez Villanueva R.
Méndez Ocaña R.
Ruiz Aldaiturriaga L.A.
Serrano Fernández L.
Torres Marti A.
Uranga Echeverria A.
Urrutia Landa I.
Zalacain Jorge R.
Publication date: 01/11/2021

Introduction: Outdoor air pollution has been linked to greater severity of respiratory infections. Its inclusion in predictive algorithms could therefore add information for predicting the severity of SARS-CoV-2 pneumonia. Materials and methods: Observational, longitudinal, retrospective, multicentre cohort study in 4 hospitals. Admissions for SARS-CoV-2 pneumonia during the first epidemic peak of COVID-19 (February-May 2020) were included. Up to 93 clinical, laboratory and radiological variables were collected per patient (sex, age, weight, comorbidities, symptoms, physiological variables in the emergency department, blood tests, blood gas analysis, etc.). In addition, levels of exposure to PM10, PM2.5, O3, NO2, NO, NOX, SO2 and CO pollution in each patient's postcode were calculated. Three severity levels were defined according to the clinical course of the pneumonia [Table 1]. To predict this severity, an artificial intelligence (AI) algorithm of the ‘Random Forest’ type was developed, with class balancing and automatic tuning of its internal parameters. The algorithm was trained and evaluated with 20 repetitions of 10-fold cross-validation (90% training, 10% validation), with random stratification by hospital and severity. Results: On the validation sets, the algorithm achieved an average predictive capacity (area under the ROC curve) of AUC=0.834 for severity level 0, AUC=0.724 for level 1 and AUC=0.850 for level 2 [Figure 1]. Without the pollutant information, its predictive capacity degraded slightly (AUCs = 0.829, 0.722 and 0.844, respectively). Conclusions: Our AI algorithm can satisfactorily predict the evolution of pneumonia severity, particularly for the mildest and most severe cases. The AI algorithm extracts the most relevant rules mainly from each individual's clinical, laboratory and radiological information; nevertheless, incorporating exposure to pollutants slightly improves predictive capacity. The impact of pollution may already be reflected in blood test results, through its effect on the patient's inflammation levels (PCT, CRP, LDH, etc.).
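The abstract reports one AUC per severity level (0, 1 and 2). A minimal sketch of how such per-class values can be obtained is shown below, assuming a one-vs-rest reading in which each level is scored against the other two, and using the rank-based (Mann-Whitney) definition of AUC. The labels and predicted probabilities are toy values, not study data, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true: Sequence[int], proba: Sequence[Sequence[float]]) -> Dict[int, float]:
    """One AUC per class, scoring class c against all remaining classes."""
    n_classes = len(proba[0])
    return {
        c: binary_auc([row[c] for row in proba], [y == c for y in y_true])
        for c in range(n_classes)
    }


# Toy validation fold: true severity levels and predicted class probabilities.
y_true: List[int] = [0, 0, 1, 2, 2, 1]
proba: List[List[float]] = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]
aucs = one_vs_rest_auc(y_true, proba)
```

In a repeated cross-validation setting such as the one described, these per-class values would be computed on each validation fold and then averaged.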

BCAM's Institutional Repository Data

Tuberculosis in the immigrant population of Bilbao (original title: Tuberculosis en la población inmigrante de Bilbao)

Author: A. Capelastegui Saiz
Altube
Blum
Bwire
C. Salinas Solano
De March Ayuela
Duran
Esteban
Hardie
J.M. Quintana López
Kochi
L. Altube Urrengoetxea
MacIntyre
McKenna
P.P. España Yandiola
Rieder
Rivas-Clemente
Rossman
Verver
Publication venue: 'Elsevier BV'

Crossref

Tuberculosis and HIV infection. Analysis of 36 cases (original title: Tuberculosis e infección por VIH. Análisis de 36 casos)

Author: A. Capelastegui Saiz
Algueró
Ausina
Bouza
C. Salinas Solano
Casabona
Chaisson
Chopewcll
Coleburnders
Cosin
Cosín
Goodman
Handweger
Hopewell
J. Mayo Suárez
J.I. Aguirregomoscorta Urquijo
M. Oribe Ibañez
Mallolas
March
Ocaña
P.P. España Yandiola
Perronne
Pitchenik
Rieder
Selwyn
Soriano
Sunderam
Valencia
Vidal Pía
Publication venue: 'Elsevier BV'

Crossref