13 research outputs found

    Prediction of Peptide Vascularization Inhibitory Activity in Tumor Tissue as a Possible Target for Cancer Treatment

    [Abstract] In silico prediction of metabolic activities is crucial for exploring every line of research without exceeding experimental budgets. In cancer research in particular, predicting certain activities can greatly help in the discovery of new treatments. This work proposes using Machine Learning (ML) to predict the anti-angiogenic activity of peptides, which are currently being used in cancer treatment with promising results. From a list of peptide sequences, three types of molecular descriptors were obtained (AAC, DC and TC) and used to train different ML algorithms. After a Feature Selection process, different models were obtained with a predictive value that surpassed the current state of the art. These results show that ML is useful for classifying and predicting the activity of new peptides, making experimental screening cheaper and faster.
    Funding: Instituto de Salud Carlos III (PI17/01826); Xunta de Galicia (Ref. ED431G/01); Xunta de Galicia (ED431D 2017/16); Red Gallega de Investigación sobre Cáncer Colorrectal (Ref. ED431D 2017/23); Ministerio de Economía y Competitividad (UNLC08-1E-002); Ministerio de Economía y Competitividad (UNLC13-13-3503); Ministerio de Economía y Competitividad (FJCI- 2015-2607)
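The descriptor step in this abstract can be illustrated with a minimal sketch. It assumes that AAC, DC and TC stand for amino acid, dipeptide and tripeptide composition, that is, the relative frequency of every 1-, 2- and 3-residue fragment in a sequence. The function names and the example peptide are hypothetical and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence: str, k: int) -> dict[str, float]:
    """Relative frequency of every possible k-residue fragment in a peptide.

    Windows containing a non-standard residue are counted in the total
    but not assigned to any fragment, so frequencies may then sum to < 1.
    """
    sequence = sequence.upper()
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=k)}
    total = max(len(sequence) - k + 1, 0)  # number of sliding windows
    for i in range(total):
        fragment = sequence[i:i + k]
        if fragment in counts:
            counts[fragment] += 1
    if total == 0:
        return {fragment: 0.0 for fragment in counts}
    return {fragment: n / total for fragment, n in counts.items()}


def peptide_descriptors(sequence: str) -> dict[str, dict[str, float]]:
    """Assumed AAC (k=1), DC (k=2) and TC (k=3) descriptors for one peptide."""
    return {
        "AAC": kmer_composition(sequence, 1),
        "DC": kmer_composition(sequence, 2),
        "TC": kmer_composition(sequence, 3),
    }


# Hypothetical four-residue peptide, used only to show the output shape.
descriptors = peptide_descriptors("ACAC")
print(len(descriptors["AAC"]), len(descriptors["DC"]), len(descriptors["TC"]))
```

Under these assumptions each peptide yields 20 + 400 + 8,000 features, most of them zero for short sequences, which is consistent with the abstract's need for a Feature Selection step before training.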

    Machine learning techniques applied to precision oncological diagnosis and treatment through the analysis of omics data

    Official Doctoral Programme in Information and Communications Technologies. 5032V01. Thesis by compendium of publications.
    [Abstract] As sequencing costs have fallen dramatically, an increasing amount of omics data has been generated to characterise cancer molecularly. Large consortia are generating large amounts of these data and making them publicly available. In addition, Machine Learning (ML) models offer a significant advantage in extracting complex patterns from biomedical data. A study of their application in this field is necessary in order to obtain more robust and generalisable results. This thesis studies the application of ML models to omics data analysis. A review of previous work identified certain limitations in the reproducibility and validation of the methodologies. From this study, a set of guidelines for robust and reproducible ML analysis of omics data was established. Altered biomarkers and pathways were identified in colon cancer patients, clinical conditions relevant to tumour development were predicted, and an automatic anti-tumour drug screening model was developed. These results are presented as a compendium of three scientific publications. In conclusion, this thesis provides a variety of computational approaches that support diagnosis and precision oncological treatment.

    Gene Signatures Research Involved in Cancer Using Machine Learning

    [Abstract] With the falling cost of mass sequencing techniques and the rise of computer technologies capable of analyzing huge amounts of data, the two fields now need to benefit from each other. Transcriptomics is a branch of biology focused on the study of mRNA molecules, among others. Quantifying these molecules tells us how strongly a gene is being expressed at a given moment. Information on the expression of the approximately 20,000 genes harbored by human beings is a very useful resource for the study of certain conditions and/or pathologies. In this work, patient expression omic data have been used to offer a new analysis methodology through Machine Learning. The results of this methodology were compared with those of a conventional methodology to observe how they differed and how they resembled each other. These techniques therefore offer a new mechanism for the search for genetic signatures involved, in this case, in cancer.
    Funding: Instituto de Salud Carlos III (PI17/01826); Xunta de Galicia (ED431D 2017/16); Red Gallega de Investigación sobre Cáncer Colorrectal (ED431D 2017/23); Ministerio de Economía y Competitividad (UNLC08-1E-002); Ministerio de Economía y Competitividad (UNLC13-13-3503); Ministerio de Economía y Competitividad (FJCI- 2015-26071); Xunta de Galicia (Ref ED431G/0

    Critical review on bone grafting during immediate implant placement

    In the last 20 years, immediate implant placement has been proposed as a predictable protocol to replace failing teeth. Research conducted in preclinical and clinical studies has focused on soft and hard tissue changes following tooth extraction and immediate implant placement. Different approaches to hard and soft tissue grafting, together with provisional restorations, have been proposed to compensate for tissue alterations. This review analyzes relevant clinical and preclinical literature on the impact of bone grafting procedures on immediate implant placement in terms of hard and soft tissue changes, aesthetic results, and patient-related outcomes.

    Machine Learning Analysis of TCGA Cancer Data

    [Abstract] In recent years, machine learning (ML) researchers have shifted their focus towards biological problems that are difficult to analyse with standard approaches. Large initiatives such as The Cancer Genome Atlas (TCGA) have allowed the use of omic data for the training of these algorithms. To study the state of the art, this review covers the main works that have used ML with TCGA data. Firstly, the principal discoveries made by the TCGA consortium are presented. Once these bases have been established, we turn to the main objective of this study: the identification and discussion of those works that have used TCGA data to train different ML approaches. After a review of more than 100 different papers, it has been possible to classify them according to the following three pillars: the type of tumour, the type of algorithm and the predicted biological problem. One of the conclusions drawn in this work is the high density of studies based on two major algorithms: Random Forest and Support Vector Machines. We also observe a rise in the use of deep artificial neural networks. It is worth emphasizing the increase in integrative models for multi-omic data analysis. The different biological conditions are a consequence of molecular homeostasis, driven by protein-coding regions, regulatory elements and the surrounding environment. Notably, a large number of works make use of gene expression data, which has been found to be the data type preferred by researchers when training the different models. The biological problems addressed have been classified into five types: prognosis prediction, tumour subtypes, microsatellite instability (MSI), immunological aspects and certain pathways of interest. A clear trend was detected in the prediction of these conditions according to the type of tumour. This is why a greater number of works have focused on the BRCA cohort, while works on survival, for example, centred on the GBM cohort because of its large number of events. This review makes it possible to explore in depth the works and methodologies used to study TCGA cancer data. Finally, it is intended that this work will serve as a basis for future research in this field of study.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. ED431D 2017/16), the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23) and Competitive Reference Groups (Ref. ED431C 2018/49). CITIC, as a Research Center accredited by the Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia”, 80% supported through ERDF Funds, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% by “Secretaría Xeral de Universidades” (Grant ED431G 2019/01). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
    Funding: Xunta de Galicia (ED431D 2017/16); Xunta de Galicia (ED431D 2017/23); Xunta de Galicia (ED431C 2018/49); Xunta de Galicia (ED431G 2019/0

    Machine Learning Analysis of the Human Infant Gut Microbiome Identifies Influential Species in Type 1 Diabetes

    Funded for open access publication: Universidade da Coruña/CISUG
    [Abstract] Diabetes is a disease that is closely linked to genetics and epigenetics, yet the mechanisms behind its onset and/or progression are not always fully understood. Thanks to the large number of recent studies, it is now known that changes in the balance of the microbiota can cause a wide range of diseases, including diabetes. Machine Learning (ML) techniques are able to identify complex, non-linear patterns of expression and relationships within a data set, extracting intrinsic knowledge without any biological assumptions about the data. At the same time, mass sequencing techniques allow us to obtain the metagenomic profile of an individual, whether of a body part, organ or tissue, and thus identify the composition of a given microbiota. The rapid development of both technologies in their respective fields leads naturally to combining them to try to identify the bases of a complex disease such as diabetes. To this end, a Random Forest model has been developed at different taxonomic levels, obtaining results above 0.80 in AUC for families and above 0.98 at species level, following a strict experimental design to ensure that results are compared under equal conditions. In infants, the species Bacteroides uniformis, Bacteroides dorei and Bacteroides thetaiotaomicron are found to be reduced in the microbiota of those with T1D, while the populations of Prevotella copri increase slightly and that of Bacteroides vulgatus is much higher. Finally, thanks to the more specific metagenomic signature at species level, a model has been generated to predict those seroconverted patients not previously diagnosed with diabetes but who have expressed at least two of the autoantibodies analysed.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia, Spain (Ref. ED431D 2017/16), the “Galician Network for Colorectal Cancer Research, Spain” (Ref. ED431D 2017/23) and Competitive Reference Groups, Spain (Ref. ED431C 2018/49). CITIC, as a Research Center accredited by the Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia, Spain”, 80% supported through ERDF Funds, Spain, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% by “Secretaría Xeral de Universidades, Spain” (Grant ED431G 2019/01). The funding body did not have a role in the experimental design; data collection, analysis and interpretation; and writing of this manuscript. The calculations were performed on resources provided by the Spanish Ministry of Economy and Competitiveness via funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13-3503) and the European Regional Development Funds (FEDER). Funding for open access charge: Universidade da Coruña/CISUG.
    Funding: Xunta de Galicia (ED431D 2017/16); Xunta de Galicia (ED431D 2017/23); Xunta de Galicia (ED431C 2018/49); Xunta de Galicia (ED431G 2019/0
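The AUC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes the metric directly from that definition. It is a generic illustration, not the authors' pipeline, and the labels and scores are hypothetical.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for cases, 0 for controls.
    scores: model outputs, where higher means "more likely a case".
    Ties between a case and a control count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for two controls (0) and two cases (1).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.80 and 0.98 should be read.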

    Identification of Prevotella, Anaerotruncus and Eubacterium Genera by Machine Learning Analysis of Metagenomic Profiles for Stratification of Patients Affected by Type I Diabetes

    [Abstract] Previous works have reported different bacterial strains and genera as the cause of different clinical pathological conditions. In our approach, the fecal metagenomic profiles of newborns were used to build a machine learning-based model capable of discerning between patients affected by type I diabetes and controls. A random forest algorithm achieved an AUROC of 0.915. Automating processes and supporting clinical decision making with metagenomic variables of interest may lower experimental costs in the diagnosis of complex diseases of high prevalence worldwide.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. ED431G/01, ED431D 2017/16), the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23) and Competitive Reference Groups (Ref. ED431C 2018/49). The funding body did not have a role in the experimental design; data collection, analysis and interpretation; and writing of this manuscript.
    Funding: Xunta de Galicia (ED431G/01); Xunta de Galicia (ED431D 2017/16); Xunta de Galicia (ED431D 2017/23); Xunta de Galicia (ED431C 2018/4

    Molecular Docking and Machine Learning Analysis of Abemaciclib in Colon Cancer

    [Abstract] Background - The main challenge in cancer research is the identification of different omic variables that offer prognostic value and a personalised diagnosis for each tumour. Personalised diagnosis opens the door to the design and discovery of new specific treatments for each patient. In this context, this work offers new ways to reuse existing databases and published work to create added value in research. Three published signatures with significant prognostic value in Colon Adenocarcinoma (COAD) were identified. These signatures were combined into a new meta-signature and validated with the main Machine Learning (ML) and conventional statistical techniques. In addition, a drug repurposing experiment was carried out using Molecular Docking (MD) methodology in order to identify new potential treatments for COAD.
    Results - The prognostic potential of the signature was validated by means of ML algorithms and differential gene expression analysis. The results supported the possibility that this meta-signature could harbor genes of interest for the prognosis and treatment of COAD. We studied drug repurposing through an MD analysis, in which the different Protein Data Bank (PDB) structures of the genes of the meta-signature (155 in total) were docked against 81 anti-cancer drugs approved by the FDA. We observed four interactions of interest: GLTP - Nilotinib, PTPRN - Venetoclax, VEGFA - Venetoclax and FABP6 - Abemaciclib. The FABP6 gene and its role within different metabolic pathways were studied in tumour and normal tissue, and we observed that the FABP6 gene could serve as a therapeutic target. Our in silico results showed a significant binding specificity for the protein products of the FABP6 gene, in addition to the known action of Abemaciclib as an inhibitor of the CDK4/6 protein and, therefore, of the cell cycle.
    Conclusions - The results of our ML and differential expression experiments first showed the FABP6 gene to be a possible new cancer biomarker, owing to its specificity in colonic tumour tissue and its lack of expression in healthy adjacent tissue. Next, the MD analysis showed that the drug Abemaciclib has a characteristic affinity for the different protein structures of the FABP6 gene. The in silico experiments have therefore revealed a new opportunity that should be validated experimentally, helping to reduce the cost and increase the speed of drug screening. For these reasons, we propose the validation of the drug Abemaciclib for the treatment of colon cancer.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. ED431G/01, ED431D 2017/16), the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23) and Competitive Reference Groups (Ref. ED431C 2018/49). The calculations were performed on resources provided by the Spanish Ministry of Economy and Competitiveness via funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13-3503) and the European Regional Development Funds (FEDER). The funding body did not have a role in the experimental design; data collection, analysis and interpretation; and writing of this manuscript.
    Funding: Xunta de Galicia (ED431G/01); Xunta de Galicia (ED431D 2017/16); Xunta de Galicia (ED431D 2017/23); Xunta de Galicia (ED431C 2018/4

    Integrative Multi-Omics Data-Driven Approach for Metastasis Prediction in Cancer

    [Abstract] Biomedical research is now generating huge amounts of omic data, covering all levels of genetic information from nucleotide sequencing to protein metabolism. Initially, each data type was analyzed independently, so the models lost a great deal of essential information. Even so, complex metabolic routes and genetic diseases could be determined. In the last decade, an ever-increasing number of research projects have followed a systems biology approach, integrating multiple omic datasets to obtain more complex, powerful and informative models that provide deeper knowledge of genotype-phenotype interactions. These models have greatly contributed to the study of complex multi-factorial diseases such as cancer. The onset and development of any type of cancer can be influenced by multiple variables. Integrating as many omic datasets as possible is therefore the best approach to extract all the underlying knowledge. A significant factor in the mortality of this disease is the metastatic process. Identifying the factors involved in this cell behavior may be helpful in diagnosis and, hopefully, in disease prevention. The development of novel integrative multi-omics approaches is an opportunity to close the gap between our ability to generate data and the difficulty of understanding the biology behind them. In this work we propose a methodology pipeline for analyzing multi-omics data using machine learning.
    Funding: Instituto de Salud Carlos III (PI17/01826); Xunta de Galicia (ED431G/01); Xunta de Galicia (ED431D 2017/1); Xunta de Galicia (ED431D 2017/2); Ministerio de Economía y Competitividad (UNLC08-1E-002); Ministerio de Economía y Competitividad (UNLC13-13-350

    Higher COVID-19 pneumonia risk associated with anti-IFN-α than with anti-IFN-ω auto-Abs in children

    We found that 19 (10.4%) of 183 unvaccinated children hospitalized for COVID-19 pneumonia had autoantibodies (auto-Abs) neutralizing type I IFNs (IFN-alpha 2 in 10 patients: IFN-alpha 2 only in three, IFN-alpha 2 plus IFN-omega in five, and IFN-alpha 2, IFN-omega plus IFN-beta in two; IFN-omega only in nine patients). Seven children (3.8%) had Abs neutralizing at least 10 ng/ml of one IFN, whereas the other 12 (6.6%) had Abs neutralizing only 100 pg/ml. The auto-Abs neutralized both unglycosylated and glycosylated IFNs. We also detected auto-Abs neutralizing 100 pg/ml IFN-alpha 2 in 4 of 2,267 uninfected children (0.2%) and auto-Abs neutralizing IFN-omega in 45 children (2%). The odds ratios (ORs) for life-threatening COVID-19 pneumonia were, therefore, higher for auto-Abs neutralizing IFN-alpha 2 only (OR [95% CI] = 67.6 [5.7-9,196.6]) than for auto-Abs neutralizing IFN-omega only (OR [95% CI] = 2.6 [1.2-5.3]). ORs were also higher for auto-Abs neutralizing high concentrations (OR [95% CI] = 12.9 [4.6-35.9]) than for those neutralizing low concentrations (OR [95% CI] = 5.5 [3.1-9.6]) of IFN-omega and/or IFN-alpha 2.
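The counts and percentages in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above. The variable names are illustrative, and the odds ratios are not reproduced because the abstract gives neither the full contingency tables nor the estimation method.

```python
def percent(count: int, total: int, digits: int = 1) -> float:
    """Share of `count` in `total`, as a percentage rounded to `digits`."""
    return round(100 * count / total, digits)


hospitalized = 183   # unvaccinated children hospitalized for COVID-19 pneumonia
uninfected = 2267    # uninfected children

# Breakdown of the children carrying neutralizing auto-Abs.
alpha2_only, alpha2_omega, alpha2_omega_beta, omega_only = 3, 5, 2, 9
alpha2_any = alpha2_only + alpha2_omega + alpha2_omega_beta  # 10 patients
carriers = alpha2_any + omega_only                           # 19 patients

print(carriers, percent(carriers, hospitalized))  # 19 10.4
print(percent(7, hospitalized))                   # 3.8 (neutralizing >= 10 ng/ml)
print(percent(12, hospitalized))                  # 6.6 (neutralizing 100 pg/ml only)
print(percent(4, uninfected))                     # 0.2 (IFN-alpha 2, uninfected)
print(percent(45, uninfected))                    # 2.0 (IFN-omega, uninfected)
```

The subgroup counts add up to the 19 carriers reported, and the high and low concentration groups (7 and 12) together account for all of them.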