5 research outputs found
Multiomics and eXplainable artificial intelligence for decision support in insulin resistance early diagnosis: A pediatric population-based longitudinal study
Supplementary Material: https://sci2s.ugr.es/MultiOmics_IR_Pred
Pediatric obesity can drastically heighten the risk of cardiometabolic alterations later in life, with insulin
resistance standing as the cornerstone linking adiposity to the increased cardiovascular risk. Puberty has been
pointed out as a critical stage after which obesity-associated insulin resistance is more difficult to revert. Timely
prediction of insulin resistance in pediatric obesity is therefore vital for mitigating the risk of its associated
comorbidities. The construction of effective and robust predictive systems for a complex health outcome like
insulin resistance during the early stages of life demands the adoption of longitudinal designs that support stronger causal
inference, and the integration of the diverse factors involved in its onset. In this work, we propose
an eXplainable Artificial Intelligence-based decision support pipeline for early diagnosis of insulin resistance
in a longitudinal cohort of 90 children. For that, we leverage multi-omics (genomics and epigenomics) and
clinical data from the pre-pubertal stage. Different combinations of data layers, pre-processing techniques (missing values, feature selection, class imbalance, etc.), algorithms, and training procedures were considered following good practices for Machine Learning. SHapley Additive exPlanations were provided for specialists to understand both the decision-making mechanisms of the system and the impact of the features on each automatic decision, an essential issue in high-risk areas such as this one, where system decisions may affect people’s lives. The system showed notable predictive ability (AUC and G-mean of 0.92). A deep exploration, both at the global and the local level, revealed promising biomarkers of insulin resistance in our population, highlighting classical markers, such as Body Mass Index z-score or leptin/adiponectin ratio, and novel ones, such as methylation patterns of relevant genes including HDAC4, PTPRN2, MATN2, RASGRF1 and EBF1. Our findings highlight the importance of integrating multi-omics data and following eXplainable Artificial Intelligence trends when building decision support systems.
Department of Biochemistry and Molecular Biology II, School of Pharmacy, “José Mataix Verdú” Institute of Nutrition and Food Technology (INYTA) and Center of Biomedical Research, University of Granada, Granada, 18071, Spain
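As an illustration of the two metrics reported in the abstract above, the following minimal sketch computes AUC and G-mean on made-up labels and scores. It assumes the common definition of G-mean as the geometric mean of sensitivity and specificity; the abstract does not state which definition was used. The function names and toy values are hypothetical and are not data from the study.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 2 positives, 4 negatives.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]            # hard class predictions
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.8]  # predicted probabilities

toy_g_mean = g_mean(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

Unlike plain accuracy, G-mean drops toward zero when either class is poorly recognized, which is why it is often reported alongside AUC for imbalanced outcomes.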
Shared gene expression signatures between visceral adipose and skeletal muscle tissues are associated with cardiometabolic traits in children with obesity
Obesity in children is related to the development of cardiometabolic complications later in life, where
molecular changes of visceral adipose tissue (VAT) and skeletal muscle tissue (SMT) have been proven to
be fundamental. The aim of this study is to unveil the gene expression architecture of both tissues in a cohort
of Spanish boys with obesity, using a clustering method known as weighted gene co-expression network
analysis. For this purpose, we have followed a multi-objective analytic pipeline consisting of three main
approaches: identification of gene co-expression clusters associated with childhood obesity, individually in
VAT and SMT (intra-tissue, approach I); identification of gene co-expression clusters associated with obesity-metabolic
alterations, individually in VAT and SMT (intra-tissue, approach II); and identification of gene
co-expression clusters associated with obesity-metabolic alterations simultaneously in VAT and SMT (inter-tissue,
approach III). In both tissues, we identified independent and inter-tissue gene co-expression signatures
associated with obesity and cardiovascular risk, some of which passed multiple-testing correction. In these
signatures, we identified central hub genes (e.g., NDUFB8, GUCY1B1, KCNMA1, NPR2, PPP3CC)
participating in relevant metabolic pathways that also passed multiple-testing correction. We identified the
central hub genes PIK3R2, PPP3C and PTPN5 associated with MAPK signaling and insulin resistance terms. This
is the first time that these genes have been associated with childhood obesity in both tissues. Therefore, they
could be potential novel molecular targets for drugs and health interventions, opening new lines of research on
personalized care in this pathology. This work generates interesting hypotheses about the transcriptomic
alterations underlying metabolic health alterations in obesity in the pediatric population.
ERDF/Health Institute Carlos III (grant numbers PI20/00711 and PI20/00563)
ERDF/Regional Government of Andalusia/Ministry of Economic Transformation, Industry, Knowledge and Universities (grant numbers P18-RT-2248 and B-CTS-536-UGR20)
Omics Approaches in Adipose Tissue and Skeletal Muscle Addressing the Role of Extracellular Matrix in Obesity and Metabolic Dysfunction
Extracellular matrix (ECM) remodeling plays important roles in both white adipose tissue
(WAT) and the skeletal muscle (SM) metabolism. Excessive adipocyte hypertrophy causes fibrosis,
inflammation, and metabolic dysfunction in adipose tissue, as well as impaired adipogenesis.
Similarly, disturbed ECM remodeling in SM has metabolic consequences such as decreased insulin
sensitivity. Most of the described ECM molecular alterations have been associated with DNA sequence
variation, alterations in gene expression patterns, and epigenetic modifications. Among epigenetic
mechanisms, DNA methylation is the most important means by which cells modulate their gene
expression. Epigenome-Wide Association Studies (EWAS) have become a powerful approach
to identify DNA methylation variation associated with biological traits in humans. Likewise,
Genome-Wide Association Studies (GWAS) and gene expression microarrays have allowed the study
of whole-genome genetics and transcriptomics patterns in obesity and metabolic diseases. The aim
of this review is to explore the molecular basis of ECM remodeling in WAT and SM in obesity and
the consequences of metabolic complications. For that purpose, we reviewed scientific literature
including all omics approaches reporting genetic, epigenetic, and transcriptomic (GWAS, EWAS, and
RNA-seq or cDNA arrays) ECM-related alterations in WAT and SM as associated with metabolic
dysfunction and obesity.
Doctorate fellowship Formacion del Profesorado Universitario FPU 16/03653
Doctorate contract i-PFIS: Doctorados IIS-empresa en ciencias y tecnologias de la salud IFI17/00048
"Ramon-Areces Foundation", Spain
Junta de Andalucía
Leveraging Machine Learning and Genetic Risk Scores for the Prediction of Metabolic Syndrome in Children with Obesity
Background and objectives: Obesity is a growing global epidemic, associated with increased
cardiometabolic disorders. Metabolic syndrome (MS) is defined by altered insulin, blood pressure,
glucose, and lipid levels. Pubertal children with obesity are highly susceptible to developing MS,
necessitating its early identification. This study aims to compute phenotype-specific genetic risk
scores for MS-related biochemical markers and evaluate their clinical utility using machine
learning-based models. Methods: Longitudinal data from the PUBMEP Spanish cohort were analyzed,
including 138 children (71 girls and 67 boys) at two time points, spanning from prepuberty to
puberty. Clinical, endogenous, environmental, and omics variables were measured. Genetic risk
scores were generated using GWAS data and PRSice-2 software. These scores, alongside non-genetic
prepubertal data (e.g., biochemical, anthropometric, and physical activity data), were integrated into
predictive models using machine learning techniques to forecast the MS status during puberty. Linear
models explored interactions between environmental factors, genetic risk scores, and disease risk.
Results: Strong associations were observed between each genetic risk score and its corresponding
phenotypic biomarker. Notably, certain scores related to obesity and high-density lipoprotein levels
exhibited significant interactions with environmental factors, such as sedentary lifestyle, modulating
disease effects. The predictive machine learning models incorporating prepubertal genetics,
high-density lipoprotein, and sedentary lifestyle achieved reasonable performance in predicting pubertal
obesity (AUC, accuracy, and sensitivity of 0.89). These models strike a favorable balance between
risk scores derived from genetic factors and clinical variables. However, when individual risk
scores were considered in isolation, limited predictive results were observed for MS and associated
altered components. Discussion: This study demonstrates the importance of the early identification
of at-risk children for MS. The integration of genetic risk scores, clinical variables, and machine
learning techniques offers promising avenues for predicting pubertal MS. While individual risk scores
have limitations in isolation, polygenic risk scores serve as valuable tools for investigating
gene–environment interactions. According to our results, polygenic risk scores lacked sufficient predictive ability for most clinical traits, limiting their clinical application. Nevertheless, they remain valuable
analytical tools for exploring the association with the environment, by consolidating the effects of
multiple single nucleotide polymorphisms into a single variable.
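To make concrete how the effects of multiple single nucleotide polymorphisms can be consolidated into a single variable, the sketch below computes a genetic risk score as a weighted sum of risk-allele counts. This is only the basic idea. It does not reproduce the clumping, thresholding, or scaling steps that PRSice-2 applies, and the SNP identifiers, effect sizes, and function name are hypothetical.

```python
def genetic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP).

    dosages:      {snp_id: allele count} for one individual
    effect_sizes: {snp_id: per-allele effect size from GWAS summary statistics}
    SNPs missing from the individual's genotype are skipped.
    """
    return sum(
        beta * dosages[snp]
        for snp, beta in effect_sizes.items()
        if snp in dosages
    )


# Hypothetical effect sizes and genotype for one child.
effect_sizes = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
child = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = genetic_risk_score(child, effect_sizes)  # 0.5*2 - 0.25*1 + 0.125*0
```

Because the score is a single number per individual, it can enter a linear model as one term, for example alongside an environmental variable and their product, when exploring gene–environment interactions.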
Omics Data Preprocessing for Machine Learning: A Case Study in Childhood Obesity
The use of machine learning techniques for the construction of predictive models of disease
outcomes (based on omics and other types of molecular data) has gained enormous relevance in
the last few years in the biomedical field. Nonetheless, the value of omics studies and machine
learning tools depends on the proper application of algorithms as well as the appropriate preprocessing
and management of input omics and molecular data. Currently, many of the available
approaches that use machine learning on omics data for predictive purposes make mistakes in
several of the following key steps: experimental design, feature selection, data pre-processing,
and algorithm selection. For this reason, we propose the current work as a guideline on how to
confront the main challenges inherent to multi-omics human data. As such, a series of best practices
and recommendations are also presented for each of the steps defined. In particular, the main
particularities of each omics data layer, the most suitable preprocessing approaches for each source,
and a compilation of best practices and tips for the study of disease development prediction using
machine learning are described. Using examples of real data, we show how to address the key
problems mentioned in multi-omics research (e.g., biological heterogeneity, technical noise, high
dimensionality, presence of missing values, and class imbalance). Finally, we define the proposals for
model improvement based on the results found, which serve as the bases for future work.
ERDF/Regional Government of Andalusia/Ministry of Economic Transformation, Industry, Knowledge, and Universities P18-RT-2248, B-CTS-536-UGR20
ERDF/Health Institute Carlos III/Spanish Ministry of Science, Innovation PI20/0071
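The abstract above lists missing values and class imbalance among the key problems in multi-omics data. As a minimal sketch of one common way to handle each, the code below assumes median imputation and random oversampling of the minority class; neither is confirmed here as the authors' choice. Both steps are fitted on training data only so that no information leaks from held-out samples. All function names and values are hypothetical.

```python
import random
import statistics


def fit_median_imputer(rows):
    """Learn one median per column from training rows, ignoring missing values (None)."""
    n_cols = len(rows[0])
    return [
        statistics.median(r[j] for r in rows if r[j] is not None)
        for j in range(n_cols)
    ]


def impute(rows, medians):
    """Replace each missing value with the median learned for its column."""
    return [[medians[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for r, y in zip(rows, labels):
        by_class.setdefault(y, []).append(r)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for r in members + extra:
            out_rows.append(r)
            out_labels.append(y)
    return out_rows, out_labels


# Toy training set (hypothetical): two features, three controls and one case.
train = [[1.0, None], [3.0, 10.0], [None, 20.0], [5.0, 30.0]]
labels = [0, 0, 0, 1]

medians = fit_median_imputer(train)      # learned from training data only
train_filled = impute(train, medians)
balanced_rows, balanced_labels = oversample_minority(train_filled, labels)
```

The same `medians` would be reused to impute validation or test samples, and oversampling would be applied inside each training fold rather than before the data are split.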