497 research outputs found

    Enhancing clinical potential of liquid biopsy through a multi-omic approach: A systematic review

    In recent years, liquid biopsy has gained increasing clinical relevance for detecting and monitoring several cancer types, being minimally invasive, highly informative, and repeatable over time. This approach can complement and may, in the future, replace tissue biopsy, which is still considered the gold standard for cancer diagnosis. "Classical" tissue biopsy is invasive, often cannot provide sufficient bioptic material for advanced screening, and yields only isolated information about disease evolution and heterogeneity. Recent literature highlights how liquid biopsy is informative of proteomic, genomic, epigenetic, and metabolic alterations. These biomarkers can be detected and investigated using single-omic and, more recently, multi-omic approaches. This review provides an overview of the most suitable techniques to thoroughly characterize tumor biomarkers and their potential clinical applications, highlighting the importance of an integrated multi-omic, multi-analyte approach. Personalized medical investigations may soon allow patients to receive reliable prognostic evaluations, early disease diagnosis, and subsequent ad hoc treatments.

    Data- and expert-driven variable selection for predictive models in healthcare: towards increased interpretability in underdetermined machine learning problems

    Modern data acquisition techniques in healthcare generate large collections of data from multiple sources, such as novel diagnosis and treatment methodologies. Concrete examples are electronic healthcare record systems, genomics, and medical images. This leads to unstructured, high-dimensional, heterogeneous patient cohort data where classical statistical methods may not be sufficient for optimal utilization of the data and informed decision-making. Instead, investigating such data structures with modern machine learning techniques promises to improve the understanding of patient health issues and may provide a better platform for informed decision-making by clinicians. Key requirements for this purpose include (a) sufficiently accurate predictions and (b) model interpretability. Achieving both aspects in parallel is difficult, particularly for datasets with few patients, which are common in the healthcare domain. In such cases, machine learning models encounter mathematically underdetermined systems and may easily overfit the training data. An important approach to overcome this issue is feature selection, i.e., determining a subset of features that are informative with respect to the target variable. Besides potentially raising predictive performance, feature selection fosters model interpretability by identifying a low number of relevant model parameters, giving a better understanding of the underlying biological processes that lead to health issues. Interpretability requires that feature selection is stable, i.e., that small changes in the dataset do not lead to changes in the selected feature set. A concept to address instability is ensemble feature selection, i.e., repeating the feature selection multiple times on subsets of samples of the original dataset and aggregating the results in a meta-model. This thesis presents two approaches for ensemble feature selection tailored towards high-dimensional data in healthcare: the Repeated Elastic Net Technique for feature selection (RENT) and the User-Guided Bayesian Framework for feature selection (UBayFS). While RENT is purely data-driven and builds upon elastic net regularized models, UBayFS is a general framework for ensembles with the capability to include expert knowledge in the feature selection process via prior weights and side constraints. A case study modeling the overall survival of cancer patients compares these novel feature selectors and demonstrates their potential in clinical practice. Beyond the selection of single features, UBayFS also allows for selecting whole feature groups (feature blocks) acquired from multiple data sources, such as those mentioned above. Importance quantification of such feature blocks plays a key role in tracing information about the target variable back to the acquisition modalities. Such information on feature block importance can improve the use of human, technical, and financial resources if systematically integrated into the planning of patient treatment, for instance by excluding the acquisition of non-informative feature blocks. Since generalizing feature importance measures to block importance is not trivial, this thesis also investigates and compares approaches for feature block importance rankings. This thesis demonstrates that high-dimensional datasets from multiple data sources in the medical domain can be successfully tackled by the presented feature selection approaches. Experimental evaluations demonstrate favorable predictive performance, stability, and interpretability of results, which carries high potential for better data-driven decision support in clinical practice.
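    The ensemble idea above can be sketched compactly. Below is a minimal, illustrative Python version of ensemble feature selection in the spirit of RENT: elastic-net models are fitted repeatedly on subsamples, and features whose selection frequency exceeds a cutoff are kept. The model settings and thresholds are assumptions chosen for illustration, not the published RENT defaults.

    # Minimal sketch of ensemble feature selection in the spirit of RENT:
    # repeatedly fit elastic-net models on subsamples and keep features with
    # a non-zero coefficient in a sufficient share of repetitions.
    # Settings below are illustrative, not the published RENT defaults.
    import numpy as np
    from sklearn.linear_model import ElasticNet
    from sklearn.utils import resample

    def ensemble_feature_selection(X, y, n_models=50, subsample=0.8, freq_cutoff=0.6):
        n, p = X.shape
        counts = np.zeros(p)
        for seed in range(n_models):
            Xs, ys = resample(X, y, n_samples=int(subsample * n), random_state=seed)
            model = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=5000).fit(Xs, ys)
            counts += (model.coef_ != 0)            # which features survived?
        freq = counts / n_models                    # selection frequency per feature
        return np.where(freq >= freq_cutoff)[0]     # stable feature subset

    # Example: few samples, many features (n << p), as common in healthcare
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 500))
    y = X[:, :5] @ np.ones(5) + rng.normal(scale=0.5, size=60)
    print(ensemble_feature_selection(X, y))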

    Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5

    This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open access. The collected contributions of this volume were either published or presented in international conferences, seminars, workshops, and journals after the fourth volume was disseminated in 2015, or they are new. The contributions in each part of this volume are chronologically ordered. The first part of this book presents theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignments in the fusion of sources of evidence, with their Matlab codes. Because more applications of DSmT have emerged since the fourth volume appeared in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision-making, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and networks for ship classification. Finally, the third part presents contributions related to belief functions in general, published or presented over the years since 2015. These contributions concern decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes' theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, the negator of a belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
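    As background to the PCR rules that recur throughout the volume, the following minimal Python sketch shows the classical PCR5 combination of two basic belief assignments over a small frame of discernment, with focal elements encoded as frozensets. It only illustrates the proportional redistribution of partial conflicts; it is not one of the fast or improved variants presented in the book.

    # Minimal sketch of the PCR5 rule for two sources: combine masses
    # conjunctively, then redistribute each partial conflict a*b back to the
    # two conflicting focal elements proportionally to their masses.
    from itertools import product

    def pcr5(m1, m2):
        fused = {}
        for (A, a), (B, b) in product(m1.items(), m2.items()):
            inter = A & B
            if inter:                  # non-conflicting conjunctive part
                fused[inter] = fused.get(inter, 0.0) + a * b
            else:                      # proportional conflict redistribution
                fused[A] = fused.get(A, 0.0) + a * a * b / (a + b)
                fused[B] = fused.get(B, 0.0) + b * b * a / (a + b)
        return fused

    # Two basic belief assignments over the frame {t1, t2}
    m1 = {frozenset({"t1"}): 0.6, frozenset({"t1", "t2"}): 0.4}
    m2 = {frozenset({"t2"}): 0.7, frozenset({"t1", "t2"}): 0.3}
    fused = pcr5(m1, m2)
    print({tuple(sorted(k)): round(v, 3) for k, v in fused.items()})
    print("total mass:", round(sum(fused.values()), 3))   # stays 1.0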

    Use of omic tools for environmental risk assessment of emerging contaminants in marine species of commercial interest

    This doctoral thesis aims to evaluate the potential toxicological effects of 'emerging' contaminants (ECs) on the marine organisms exposed to them. Specifically, two UV filters, an insect repellent, and a biocide were evaluated, all of them present in personal care products (PCPs). These substances reach marine systems through wastewater discharges, as well as direct inputs from recreational activities such as swimming and bathing. Exposure to these contaminants occurs chronically and at very low concentrations; evaluating their potential effects at the molecular level is therefore a key step in determining their toxicity. "Omic" technologies allow the global study of the different biological levels (transcriptome, proteome, metabolome, etc.) from a holistic and integrated point of view. The integration of omic data, together with bioconcentration and toxicokinetic data, is an extremely useful method for elucidating the mode of action (MoA) of contaminants. Accordingly, this doctoral thesis implemented the use and integration of omic tools (metabolomics and transcriptomics) to evaluate the toxicological effects of two UV filters (4-methylbenzylidene camphor (4-MBC) and sulisobenzone (BP-4)), an insect repellent (N,N-diethyl-meta-toluamide (DEET)), and a biocide (triclosan (TCS)) in two marine species of commercial interest, the gilthead seabream (Sparus aurata) and the Manila clam (Ruditapes philippinarum). Bioconcentration and toxicokinetic studies were also carried out. First, seven exposure experiments with the aforementioned contaminants and organisms were performed using a flow-through system to reproduce environmental exposure scenarios under controlled laboratory conditions. Application of high-throughput analytical techniques showed that bioaccumulation of the contaminants was higher in the clam than in the fish, underlining the importance of considering organisms from different trophic levels when evaluating the bioaccumulation potential of environmental contaminants. Moreover, the UV filter 4-MBC showed the highest bioconcentration factor (BCF, 368,565 L kg-1) and the lowest elimination rate (61.65%), which, together with the persistence of this compound in the environment, likely indicates a potential risk to the marine environment that may be magnified through the food web. This compound was also observed to undergo several biotransformations (reduction, oxidation, hydroxylation, etc.) in order to be excreted, affecting drug, xenobiotic, and glutathione metabolism and ultimately producing oxidative stress. On the other hand, although TCS and BP-4 had high BCFs in the clam (1,309 and 850 L kg-1, respectively), their elimination rates were also high (97.12% and 99.99%). Nevertheless, exposure to these contaminants produced, among other effects, alterations in lipid metabolism, possibly due to their capacity as endocrine disruptors. DEET showed the lowest BCF in the clam (9.9 L kg-1) and a high elimination rate (98.85%). The main transcriptomic impact of exposure to this compound was observed in xenobiotic biodegradation metabolism and carbohydrate metabolism, suggesting a large energy expenditure to carry out excretion of the compound. In seabream muscle, both DEET and BP-4 showed low BCFs (2.6 and 0.7 L kg-1, respectively), whereas TCS showed a BCF of 113 L kg-1. However, transcriptomic analyses of seabream liver revealed that 250 and 371 genes were differentially expressed after exposure to DEET and BP-4, respectively, whereas no differentially expressed genes were found after exposure to TCS. Integration with metabolomic data from the same samples showed that DEET caused energy depletion in the seabream through alteration of carbohydrate and amino acid metabolism, as well as oxidative stress leading to DNA damage, lipid peroxidation, cell membrane damage, and apoptosis. Activation of xenobiotic metabolism and an immune-inflammatory reaction were also observed. Finally, multi-omic data integration revealed that BP-4 impacted the seabream's energy metabolism and lipid oxidation, as well as steroid and thyroid hormone biosynthesis and nucleotide metabolism. In conclusion, this thesis shows the utility of integrating data at different biological levels and demonstrates that the multi-omic approach developed and applied here offers a great advantage for elucidating modes of action of the studied compounds that might have been overlooked with other approaches, such as the use of a single omic tool.
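    For readers unfamiliar with the quantities reported above, bioconcentration factors and elimination percentages are commonly derived from a first-order toxicokinetic model, with BCF = k_u / k_e. The Python sketch below uses hypothetical rate constants purely to illustrate the calculation; they are not the values fitted in the thesis.

    # Standard first-order toxicokinetic model (illustrative, hypothetical rates):
    # uptake at rate k_u from water, elimination at rate k_e from tissue.
    import numpy as np

    def tissue_concentration(t, c_water, k_u, k_e):
        """Uptake-phase tissue concentration under first-order kinetics."""
        return (k_u / k_e) * c_water * (1.0 - np.exp(-k_e * t))

    k_u, k_e = 500.0, 0.05     # hypothetical uptake (L kg^-1 d^-1) and elimination (d^-1) rates
    print(f"kinetic BCF = {k_u / k_e:.0f} L kg^-1")

    c_w = 1e-3                 # hypothetical water concentration (mg L^-1)
    for t in (1.0, 7.0, 28.0):
        print(f"day {t:4.0f}: tissue concentration = {tissue_concentration(t, c_w, k_u, k_e):.3f} mg kg^-1")

    # Elimination percentage after a depuration period in clean water:
    t_d = 14.0                 # days of depuration
    print(f"eliminated after {t_d:.0f} d: {100 * (1.0 - np.exp(-k_e * t_d)):.1f}%")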

    A primer on correlation-based dimension reduction methods for multi-omics analysis

    Continuing advances in omic technologies mean that it is now feasible to measure the numerous features that collectively reflect the molecular properties of a sample. When multiple omic methods are used, statistical and computational approaches can exploit these large, connected profiles. Multi-omics is the integration of different omic data sources from the same biological sample. In this review, we focus on correlation-based dimension reduction approaches for single omic datasets, followed by methods for pairs of omic datasets, before detailing further techniques for three or more omic datasets. We also briefly describe network-based methods, which apply when three or more omic datasets are available and complement correlation-oriented tools. To aid readers new to this area, all methods are linked to relevant R packages that implement these procedures. Finally, we discuss scenarios of experimental design and present road maps that simplify the selection of appropriate analysis methods. This review will help researchers navigate the emerging methods for multi-omics, integrate diverse omic datasets appropriately, and embrace the opportunities of population multi-omics.
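    As a concrete instance of the pair-of-omics case discussed in the review, here is a minimal Python sketch of canonical correlation analysis (CCA) on two simulated omic matrices that share a latent signal. The review points to R packages for such analyses; this scikit-learn version is only an illustrative stand-in, not a recommended pipeline.

    # CCA finds paired projections of two data blocks with maximally
    # correlated scores; simulated data stand in for two omic platforms.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(1)
    n = 100                                  # samples profiled on both platforms
    shared = rng.normal(size=(n, 2))         # latent signal common to both omics
    X = shared @ rng.normal(size=(2, 50)) + rng.normal(scale=0.5, size=(n, 50))
    Y = shared @ rng.normal(size=(2, 30)) + rng.normal(scale=0.5, size=(n, 30))

    cca = CCA(n_components=2)
    Xc, Yc = cca.fit_transform(X, Y)         # paired low-dimensional scores

    for k in range(2):                       # canonical correlation per component
        r = np.corrcoef(Xc[:, k], Yc[:, k])[0, 1]
        print(f"component {k + 1}: canonical correlation = {r:.2f}")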

    Polygenic Risk Score for Cardiovascular Diseases in Artificial Intelligence Paradigm

    Cardiovascular disease (CVD)-related mortality and morbidity heavily strain society. The relationship between external risk factors and our genetics has not been well established. It is widely acknowledged that environmental influences and individual behaviours play a significant role in CVD vulnerability, which has led to the development of polygenic risk scores (PRS). We employed the PRISMA search method to locate pertinent research and literature for an extensive review of artificial intelligence (AI)-based PRS models for CVD risk prediction. Furthermore, we analyzed and compared conventional vs. AI-based solutions for PRS. We summarized the recent advances in our understanding of the use of AI-based PRS for risk prediction of CVD. Our study proposes three hypotheses: i) multiple genetic variations and risk factors can be incorporated into AI-based PRS to improve the accuracy of CVD risk prediction; ii) AI-based PRS for CVD circumvents the drawbacks of conventional PRS calculators by incorporating a larger variety of genetic and non-genetic components, allowing for more precise and individualized risk estimation; and iii) AI approaches can significantly reduce the dimensionality of huge genomic datasets, resulting in more accurate and effective disease risk prediction models. Our study highlighted that AI-based PRS models outperformed traditional PRS calculators in predicting CVD risk. Furthermore, using AI-based methods to calculate PRS may increase the precision of risk predictions for CVD and has significant ramifications for individualized prevention and treatment plans.
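    For context, the conventional PRS that AI-based models aim to improve upon is a weighted sum of risk-allele dosages, PRS_i = sum_j beta_j * dosage_ij, using effect sizes from genome-wide association studies. The Python sketch below uses simulated effect sizes and genotypes purely for illustration.

    # Conventional weighted-sum PRS on simulated data (illustrative only).
    import numpy as np

    rng = np.random.default_rng(7)
    n_people, n_snps = 5, 1000
    dosage = rng.integers(0, 3, size=(n_people, n_snps)).astype(float)  # 0/1/2 risk alleles
    beta = rng.normal(scale=0.05, size=n_snps)    # per-variant effect sizes (e.g., log-odds)

    prs = dosage @ beta                           # one raw score per individual
    z = (prs - prs.mean()) / prs.std()            # standardize within the cohort
    print(np.round(z, 2))                         # relative genetic risk ranking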

    Diagnostic classification of childhood cancer using multiscale transcriptomics

    The causes of pediatric cancers' distinctiveness compared to adult-onset tumors of the same type are not completely clear and are not fully explained by their genomes. In this study, we used an optimized multilevel RNA clustering approach to derive molecular definitions for most childhood cancers. Applying this method to 13,313 transcriptomes, we constructed a pediatric cancer atlas to explore age-associated changes. Tumor entities were sometimes unexpectedly grouped due to common lineages, drivers, or stemness profiles. Some established entities were divided into subgroups that predicted outcome better than current diagnostic approaches. These definitions account for inter-tumoral and intra-tumoral heterogeneity and have the potential to enable reproducible, quantifiable diagnostics. As a whole, childhood tumors had more transcriptional diversity than adult tumors, maintaining greater expression flexibility. To apply these insights, we designed an ensemble convolutional neural network classifier. We show that this tool was able to match or clarify the diagnosis for 85% of childhood tumors in a prospective cohort. If further validated, this framework could be extended to derive molecular definitions for all cancer types.
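    To make the classifier design concrete, below is a minimal Python (PyTorch) sketch of an ensemble of small 1D convolutional networks over gene-expression vectors, averaging softmax outputs across ensemble members. The architecture and dimensions are illustrative assumptions, not the classifier published with the atlas.

    # Ensemble of small 1D CNNs over expression vectors; class probabilities
    # are averaged across members (sizes are illustrative assumptions).
    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        def __init__(self, n_classes=30):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(1, 8, kernel_size=9, stride=4), nn.ReLU(),
                nn.AdaptiveMaxPool1d(32), nn.Flatten(),
                nn.Linear(8 * 32, n_classes),
            )

        def forward(self, x):                  # x: (batch, n_genes)
            return self.net(x.unsqueeze(1))    # add a channel dimension

    def ensemble_predict(models, x):
        """Average class probabilities over ensemble members."""
        with torch.no_grad():
            probs = torch.stack([m(x).softmax(dim=-1) for m in models])
        return probs.mean(dim=0).argmax(dim=-1)

    models = [SmallCNN() for _ in range(5)]    # each would be trained separately
    x = torch.randn(4, 5000)                   # 4 samples, 5000 genes
    print(ensemble_predict(models, x))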

    Novel Analytical Methods in Food Analysis

    This reprint provides information on novel analytical methods used to address challenges arising at the academic, regulatory, and commercial levels. Each topic covered includes information on the basic principles, procedures, advantages, limitations, and applications. The integration of biological reagents, (nano)materials, technologies, and physical principles (spectroscopy and spectrometry) is discussed. This reprint is ideal for professionals in the food industry and regulatory bodies, as well as researchers.