1,545 research outputs found

    Les pors republicanes de Rosalí Rovira

    Prediction of Peptide Vascularization Inhibitory Activity in Tumor Tissue as a Possible Target for Cancer Treatment

    [Abstract] Predicting metabolic activities in silico is crucial for exploring research possibilities without exceeding experimental budgets. In cancer research in particular, predicting certain activities can greatly help in the discovery of new treatments. This work proposes using Machine Learning (ML) to predict the anti-angiogenic activity of peptides, an activity that is currently being used in cancer treatment with hopeful results. From a list of peptide sequences, three types of molecular descriptors were obtained (AAC, DC and TC) and used to train different ML algorithms. After a Feature Selection process, different models were obtained whose predictive performance surpassed the current state of the art. These results show that ML is useful for classifying and predicting the activity of new peptides, making experimental screening cheaper and faster.
    Funding: Instituto Carlos III; PI17/01826. Xunta de Galicia; Ref. ED431G/01. Xunta de Galicia; ED431D 2017/16. Red Gallega de Investigación sobre Cáncer Colorrectal; Ref. ED431D 2017/23. Ministerio de Economía y Competitividad; UNLC08-1E-002. Ministerio de Economía y Competitividad; UNLC13-13-3503. Ministerio de Economía y Competitividad; FJCI- 2015-2607
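
The abstract above does not expand its descriptor acronyms or give implementation details. As a minimal sketch, assuming AAC, DC and TC stand for amino acid, dipeptide and tripeptide composition (the usual reading in the peptide literature, but an assumption here), each descriptor can be computed as the relative frequency of every possible 1-, 2- or 3-residue fragment in a sequence. The function name `kmer_composition` and the example peptide are hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k):
    """Relative frequency of every possible k-residue fragment in a peptide.

    k=1 gives 20 features, k=2 gives 400 and k=3 gives 8000.
    Fragments containing non-standard letters are counted in the
    denominator but have no feature of their own.
    """
    n_windows = max(len(sequence) - k + 1, 1)
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return {kmer: counts.get(kmer, 0) / n_windows for kmer in kmers}


# Hypothetical peptide, for illustration only (not taken from the paper).
peptide = "ACDKKA"
aac = kmer_composition(peptide, 1)  # assumed meaning of AAC
dc = kmer_composition(peptide, 2)   # assumed meaning of DC
tc = kmer_composition(peptide, 3)   # assumed meaning of TC
```

Concatenating the three dictionaries yields a fixed-length feature vector per peptide, which is why a feature selection step such as the one the abstract mentions becomes useful: the tripeptide block alone contributes 8000 mostly zero columns.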

    Gene Signatures Research Involved in Cancer Using Machine Learning

    [Abstract] With mass sequencing techniques becoming cheaper and computer technologies now capable of analysing huge amounts of data, it is necessary for both fields to benefit from each other. Transcriptomics is a branch of biology focused on the study of mRNA molecules, among others. Quantifying these molecules provides information about the expression of a gene at a given moment. Having information on the expression of the approximately 20,000 genes harboured by human beings is a very useful resource for the study of certain conditions and pathologies. In this work, patient expression omic data have been used to offer a new analysis methodology based on Machine Learning. The results of this methodology were compared with those of a conventional methodology to observe how they differed and how they resembled each other. These techniques therefore offer a new mechanism for the search for genetic signatures involved, in this case, in cancer.
    Funding: Instituto de Salud Carlos III; PI17/01826. Xunta de Galicia; ED431D 2017/16. Red Gallega de Investigación sobre Cáncer Colorrectal; ED431D 2017/23. Ministerio de Economía y Competitividad; UNLC08-1E-002. Ministerio de Economía y Competitividad; UNLC13-13-3503. Ministerio de Economía y Competitividad; FJCI- 2015-26071. Xunta de Galicia; Ref ED431G/0

    Machine Learning Analysis of the Human Infant Gut Microbiome Identifies Influential Species in Type 1 Diabetes

    Funded for open-access publication: Universidade da Coruña/CISUG.
    [Abstract] Diabetes is a disease closely linked to genetics and epigenetics, yet the mechanisms behind its onset and progression have not always been fully clarified. A large number of recent studies have shown that changes in the balance of the microbiota can cause a wide range of diseases, including diabetes. Machine Learning (ML) techniques are able to identify complex, non-linear patterns of expression and relationships within a data set, extracting intrinsic knowledge without any biological assumptions about the data. At the same time, mass sequencing techniques make it possible to obtain the metagenomic profile of an individual, whether from a body part, organ or tissue, and thus identify the composition of a given microbiota. The rapid development of both technologies in their respective fields leads logically to combining them to try to identify the bases of a complex disease such as diabetes. To this end, a Random Forest model was developed at different taxonomic levels, obtaining AUC values above 0.80 for families and above 0.98 at species level, following a strict experimental design to ensure that results are compared under equal conditions. In infants, the species Bacteroides uniformis, Bacteroides dorei and Bacteroides thetaiotaomicron are reduced in the microbiota of those with T1D, while the populations of Prevotella copri increase slightly and that of Bacteroides vulgatus is much higher. Finally, thanks to the more specific metagenomic signature at species level, a model was generated to predict seroconverted patients who have not previously been diagnosed with diabetes but have expressed at least two of the autoantibodies analysed.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826, funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia, Spain (Ref. ED431D 2017/16), the “Galician Network for Colorectal Cancer Research, Spain” (Ref. ED431D 2017/23) and Competitive Reference Groups, Spain (Ref. ED431C 2018/49). CITIC, as a Research Center accredited by the Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia, Spain”, with 80% provided through ERDF Funds, Spain, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% by “Secretaría Xeral de Universidades, Spain” (Grant ED431G 2019/01). The funding body did not have a role in the experimental design; data collection, analysis and interpretation; or writing of this manuscript. The calculations were performed on resources provided by the Spanish Ministry of Economy and Competitiveness via funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13-3503) and the European Regional Development Funds (FEDER).
    Funding: Xunta de Galicia; ED431D 2017/16. Xunta de Galicia; ED431D 2017/23. Xunta de Galicia; ED431C 2018/49. Xunta de Galicia; ED431G 2019/0

    Identification of Prevotella, Anaerotruncus and Eubacterium Genera by Machine Learning Analysis of Metagenomic Profiles for Stratification of Patients Affected by Type I Diabetes

    [Abstract] Previous works have reported different bacterial strains and genera as the cause of different clinical pathological conditions. In our approach, the fecal metagenomic profiles of newborns were used to generate a machine learning-based model capable of discerning between patients affected by type I diabetes and controls. A random forest algorithm achieved an AUROC of 0.915. Automating these processes and supporting clinical decision making with metagenomic variables of interest may lower the experimental costs of diagnosing complex diseases of high prevalence worldwide.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826, funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. ED431G/01, ED431D 2017/16), the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23) and Competitive Reference Groups (Ref. ED431C 2018/49). The funding body did not have a role in the experimental design; data collection, analysis and interpretation; or writing of this manuscript.
    Funding: Xunta de Galicia; ED431G/01. Xunta de Galicia; ED431D 2017/16. Xunta de Galicia; ED431D 2017/23. Xunta de Galicia; ED431C 2018/4
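
For readers unfamiliar with the AUROC figure quoted above, the following is a minimal sketch of how the metric can be computed from predicted scores. It uses the standard pairwise definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half). It is not the authors' evaluation code, and the labels and scores are made-up illustration values.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = positive, e.g. patient).
    scores: predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs, for illustration only.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
example_auc = auroc(example_labels, example_scores)  # 8 of 9 pairs ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of patients and controls. The pairwise form is quadratic in the number of samples, so a rank-based implementation would be preferable for large cohorts.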

    Population Subset Selection for the Use of a Validation Dataset for Overfitting Control in Genetic Programming

    [Abstract] Genetic Programming (GP) is a technique that can solve different problems through the evolution of mathematical expressions. However, its tendency to overfit the data is one of the main obstacles to its application. Using a validation dataset is a common way to prevent overfitting in many Machine Learning (ML) techniques, including GP. There is, however, one key point that differentiates GP from other ML techniques: instead of training a single model, GP evolves a population of models. The validation dataset can therefore be used in several ways, because any of the evolved models could be evaluated on it. This work explores the possibility of using the validation dataset not only on the training-best individual but also on a subset containing the training-best individuals of the population. The study was conducted with 5 well-known databases covering regression and classification tasks. In most cases, the results point to an improvement when the validation dataset is used on a subset of the population instead of only on the training-best individual, which also reduces the number of nodes and, consequently, the complexity of the expressions.
    Funding: Xunta de Galicia; ED431G/01. Xunta de Galicia; ED431D 2017/16. Xunta de Galicia; ED431C 2018/49. Xunta de Galicia; ED431D 2017/23. Instituto de Salud Carlos III; PI17/0182
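
The selection idea described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: individuals are plain Python callables rather than evolved expression trees, fitness is mean squared error, and the names `select_with_validation` and `subset_size` and the toy data are hypothetical.

```python
def mse(model, data):
    """Mean squared error of a callable model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def select_with_validation(population, train, valid, subset_size):
    """Pick the individual with the best validation error among the
    `subset_size` individuals with the best training error.

    subset_size=1 reduces to the usual approach of checking only the
    training-best individual.
    """
    ranked = sorted(population, key=lambda m: mse(m, train))
    candidates = ranked[:subset_size]
    return min(candidates, key=lambda m: mse(m, valid))


# Toy data roughly following y = 2x (illustration only).
train = [(0, 0.1), (1, 2.1), (2, 3.9)]
valid = [(3, 6.0), (4, 8.0)]


def memorizer(x):
    # Perfect on the training points, useless elsewhere (overfit).
    return {0: 0.1, 1: 2.1, 2: 3.9}.get(x, 0.0)


def linear(x):
    # Slightly worse on training data, but generalizes.
    return 2 * x


def constant(x):
    return 2.0


population = [constant, memorizer, linear]
best_single = select_with_validation(population, train, valid, subset_size=1)
best_subset = select_with_validation(population, train, valid, subset_size=2)
```

With `subset_size=1` only the overfit `memorizer` is ever checked against the validation data, whereas with `subset_size=2` the validation step can prefer `linear`, which is the effect the abstract reports in most of its experiments.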

    La literatura franciscana en las imprentas de la Cataluña moderna

    Like other religious orders, such as the Jesuits and Dominicans, the Franciscans soon saw in the printing press and its products a powerful tool for circulating their discourses during the early modern period. This article analyses the literature generated by the followers of Saint Francis in the Principality of Catalonia, traces how it evolved and identifies the most significant features of its editions.

    Machine Learning Analysis of TCGA Cancer Data

    [Abstract] In recent years, machine learning (ML) researchers have shifted their focus towards biological problems that are difficult to analyse with standard approaches. Large initiatives such as The Cancer Genome Atlas (TCGA) have made omic data available for training these algorithms. To study the state of the art, this review covers the main works that have used ML with TCGA data. First, the principal discoveries made by the TCGA consortium are presented. With these bases established, the review turns to its main objective: identifying and discussing the works that have used TCGA data to train different ML approaches. After reviewing more than 100 papers, it was possible to classify them according to three pillars: the type of tumour, the type of algorithm and the predicted biological problem. One conclusion is that a high density of studies is based on two major algorithms, Random Forest and Support Vector Machines. The use of deep artificial neural networks is also on the rise, as is the number of integrative models for multi-omic data analysis. The different biological conditions are a consequence of molecular homeostasis, driven by protein-coding regions, regulatory elements and the surrounding environment. Notably, a large number of works make use of gene expression data, which has been the data type preferred by researchers when training the different models. The biological problems addressed have been classified into five types: prognosis prediction, tumour subtypes, microsatellite instability (MSI), immunological aspects and certain pathways of interest. A clear trend was detected in the prediction of these conditions according to the type of tumour. This is why a greater number of works have focused on the BRCA cohort, while specific works on survival, for example, centred on the GBM cohort because of its large number of events. The review examines in depth the works and methodologies used to study TCGA cancer data and is intended to serve as a basis for future research in this field.
    This work was supported by the “Collaborative Project in Genomic Data Integration (CICLOGEN)” PI17/01826, funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”, and by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. ED431D 2017/16), the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23) and Competitive Reference Groups (Ref. ED431C 2018/49). CITIC, as a Research Center accredited by the Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia”, with 80% provided through ERDF Funds, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% by “Secretaría Xeral de Universidades” (Grant ED431G 2019/01). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
    Funding: Xunta de Galicia; ED431D 2017/16. Xunta de Galicia; ED431D 2017/23. Xunta de Galicia; ED431C 2018/49. Xunta de Galicia; ED431G 2019/0