119 research outputs found

    Machine learning methods for the analysis of liquid chromatography-mass spectrometry datasets in metabolomics

    Liquid Chromatography-Mass Spectrometry (LC/MS) instruments are widely used in metabolomics. To analyse their output, computational tools and algorithms are needed to extract meaningful biological information. The main goal of this thesis is to provide new computational methods and tools to process and analyse LC/MS datasets in a metabolomic context. A total of four tools and methods were developed in this thesis. First, a new method was developed to correct possible non-linear drift effects in the retention time of LC/MS metabolomic data, and it was coded as an R package called HCor. This method takes advantage of the retention time drift correlation found in typical LC/MS data, in which there are chromatographic regions whose retention time drift is consistently different from that of other regions. Our method hypothesises that this correlation structure is monotonic in the retention time and fits a non-linear model to remove the unwanted drift from the dataset. The method was found to perform especially well on datasets suffering from large drift effects when compared with other state-of-the-art algorithms. Second, a new method was developed to address known issues of peak intensity drift in metabolomics datasets. It follows a two-step approach: possible intensity drift effects are first corrected by modelling the drift, and the data are then normalised using the median of the resulting dataset. The drift was modelled with a Common Principal Components Analysis decomposition on the Quality Control classes, taking one, two or three Common Principal Components to span the drift space. This method was compared with four other drift correction and normalisation methods and was shown to remove intensity drift better than all of them. All the tested methods, including the two-step method, were coded as a publicly available R package called intCor. Third, a new processing step in the LC/MS data analysis workflow was proposed. When LC/MS instruments are used in a metabolomic context, a single metabolite may give rise to a set of peaks. The general approach, however, is to consider each peak as a variable in machine learning algorithms and statistical tests, despite the strong correlation structure among peaks coming from the same source metabolite. A strategy called peak aggregation techniques was developed, which extracts a single measure for each metabolite from the intensity values of its peaks across the samples under study. When peak aggregation is applied to each metabolite, the result is a transformed dataset in which the variables are no longer the peaks but the metabolites. Four different peak aggregation techniques were defined and, through a repeated random sub-sampling cross-validation stage, it was shown that the predictive power of the data improved when peak aggregation was used, regardless of the technique chosen. Fourth, a computational tool to perform end-to-end analysis, called MAIT, was developed and coded in the R environment. The MAIT package is highly modular and programmable, which makes it easy to replace existing modules with user-created ones and allows users to build their own personalised LC/MS data analysis workflows.
By default, MAIT takes the raw output files from an LC/MS instrument as input and, by applying a set of functions, produces a metabolite identification table as a result. It also generates a set of figures and tables that allow a detailed analysis of the metabolomic data. MAIT also accepts external peak data as input, so the user can provide a peak table obtained with any other available tool and MAIT can still apply all its other capabilities to this dataset, such as classification or mining the Human Metabolome Database, which is included in the package.
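As an illustration of the peak-aggregation idea, the sketch below (in R, the environment the thesis tools are written in) collapses the peaks annotated to the same metabolite into a single variable per metabolite using the scores of the first principal component. The matrix and grouping are made up for the example, and PC1 scores are just one plausible aggregation measure, not necessarily one of the four techniques defined in the thesis.

```r
## Minimal sketch: aggregate peaks belonging to the same metabolite into one
## variable per metabolite, here via the first principal component (PC1).
## `peaks` is a samples x peaks intensity matrix; `metabolite` assigns each
## peak (column) to a source metabolite. Both are illustrative assumptions.
set.seed(1)
peaks <- matrix(rlnorm(20 * 6), nrow = 20,
                dimnames = list(NULL, paste0("peak", 1:6)))
metabolite <- c("M1", "M1", "M1", "M2", "M2", "M3")

aggregate_peaks <- function(X, groups) {
  sapply(unique(groups), function(g) {
    sub <- X[, groups == g, drop = FALSE]
    if (ncol(sub) == 1) return(sub[, 1])
    prcomp(sub, scale. = TRUE)$x[, 1]   # PC1 scores as the metabolite measure
  })
}

metab <- aggregate_peaks(peaks, metabolite)
dim(metab)   # 20 samples x 3 metabolites instead of 6 peaks
```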

    Controlled composite processing based on off-stoichiometric thiol-epoxy dual-curing systems with sequential heat release (SHR)

    Control of curing rate and exothermicity during the processing of thermosetting composite materials is essential to minimize the formation of internal stresses that lead to mechanical and dimensional defects, especially in thick composite samples. It was recently proposed that sequential heat release, an approach based on the kinetic control of the curing sequence of dual-curing thermosets, would enable a step-wise release of the reaction heat and therefore better control of conversion and temperature profiles during the crosslinking stage. In this article, experimental proof of this concept is presented, obtained by means of an instrumented mold that can be used for the processing of small samples with and without carbon fiber reinforcement. Safe processing scenarios have been defined by numerical simulation using a simplified two-dimensional heat transfer model and validated experimentally.
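The sketch below gives a feel for the kind of transient heat balance behind such simulations: an explicit finite-difference model through the thickness of a curing part, with a simple first-order cure kinetics term as the heat source. All parameter values and the kinetic law are illustrative assumptions, and the model is reduced to one dimension; the article itself uses a two-dimensional model with measured kinetics.

```r
## Toy 1D sketch: through-thickness temperature and conversion of a curing part
## between isothermal mold walls. Parameters below are assumed, not measured.
nx <- 51; thick <- 0.01                  # 51 nodes across a 10 mm thick part
dx <- thick / (nx - 1); dt <- 0.01       # spatial step (m), time step (s)
alpha <- 1e-7                            # thermal diffusivity, m^2/s
dTad <- 200                              # adiabatic temperature rise, K
k0 <- 1e6; Ea <- 60e3; Rgas <- 8.314     # Arrhenius parameters (assumed)
Tmold <- 353                             # mold walls held at 80 C (in K)
Temp <- rep(Tmold, nx); conv <- rep(0, nx)

for (step in 1:60000) {                  # 10 min of simulated time
  rate <- k0 * exp(-Ea / (Rgas * Temp)) * (1 - conv)      # d(conversion)/dt
  conv <- pmin(conv + rate * dt, 1)
  lap  <- c(0, diff(Temp, differences = 2), 0) / dx^2     # interior Laplacian
  Temp <- Temp + dt * (alpha * lap + dTad * rate)
  Temp[c(1, nx)] <- Tmold                                 # isothermal mold walls
}
max(Temp) - Tmold                        # exotherm overshoot at the core, K
```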

    An R package to analyse LC/MS metabolomic data: MAIT (Metabolite Automatic Identification Toolkit)

    Current tools for the analysis of metabolomic data from liquid chromatography and mass spectrometry cover only a limited number of processing steps, while online tools are hard to use in a programmable fashion. This article introduces the Metabolite Automatic Identification Toolkit (MAIT) package, which makes it possible for users to perform end-to-end analysis of liquid chromatography and mass spectrometry metabolomic data. MAIT focuses on improving the peak annotation stage and provides essential tools to validate statistical analysis results. MAIT generates output files with the statistical results, peak annotation and metabolite identification. AVAILABILITY AND IMPLEMENTATION: http://b2slab.upc.edu/software-and-downloads/metabolite-automatic-identification-toolkit/
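A typical programmatic run, from raw files to an annotated table of significant features, might look like the sketch below. Function names follow the MAIT vignette, but arguments and defaults vary between versions, and the `Dataproject` folder (one sub-folder of raw files per sample class) is a hypothetical stand-in; the documentation of the installed package should be checked before use.

```r
## Indicative end-to-end MAIT workflow (check the package vignette for the
## exact arguments in the installed version). "Dataproject" is a hypothetical
## folder with one sub-folder of raw LC/MS files per sample class.
library(MAIT)
maitObj <- sampleProcessing(dataDir = "Dataproject", project = "Demo")
maitObj <- peakAnnotation(MAIT.object = maitObj)        # CAMERA-based annotation
maitObj <- spectralSigFeatures(MAIT.object = maitObj,   # statistical testing
                               pvalue = 0.05)
maitObj <- identifyMetabolites(MAIT.object = maitObj)   # match against database
head(metaboliteTable(maitObj))                          # annotated results table
```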

    Metabolic pathway enrichment of liquid chromatography-mass spectrometry data through spectral graph analysis

    One of the most widespread experimental techniques in biological research and analytical chemistry is Liquid Chromatography-Mass Spectrometry (LC/MS), whose output reports on the compounds present in the samples by coupling a physical separation technique with a separation based on the mass-to-charge ratio. Metabolic pathway enrichment techniques are valuable when dealing with large datasets, since they translate this compound-level information into metabolic pathways while reducing statistical noise. Metabolic pathways are a source of knowledge because of their close relationship to biological mechanisms. This work proposes a new enrichment technique for LC/MS data based on a two-block strategy. The first block maps the Kyoto Encyclopedia of Genes and Genomes database onto interpretable graphs. The second applies heat diffusion and PageRank algorithms on those graphs in order to carry out the enrichment. These procedures have been applied to a real case and their results agree with those of the functional validation.
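A minimal sketch of the second block in R: a personalized PageRank is run with igraph on a toy pathway-compound graph, using the affected compounds as the restart set, so that pathways connected to them rank highly. The graph, node names and damping value are made up for illustration; the actual work builds the graph from KEGG and also uses heat diffusion scores.

```r
## Minimal sketch: personalized PageRank on a toy pathway/compound graph.
## Node names and edges are illustrative; the real graph is derived from KEGG.
library(igraph)

edges <- c("pathway1", "cpdA",  "pathway1", "cpdB",
           "pathway2", "cpdB",  "pathway2", "cpdC",
           "pathway3", "cpdD")
g <- make_graph(edges, directed = FALSE)

affected <- c("cpdA", "cpdB")                 # compounds flagged by the LC/MS study
restart <- as.numeric(V(g)$name %in% affected)
restart <- restart / sum(restart)             # personalization (restart) vector

pr <- page_rank(g, personalized = restart, damping = 0.85)$vector
sort(pr, decreasing = TRUE)                   # pathway1 ranks above the other pathways
```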

    FELLA: an R package to enrich metabolomics data

    Background: Pathway enrichment techniques are useful for understanding experimental metabolomics data. Their purpose is to give context to the affected metabolites in terms of the prior knowledge contained in metabolic pathways. However, the interpretation of a prioritized pathway list is still challenging, as pathways show overlap and cross talk effects. Results: We introduce FELLA, an R package to perform a network-based enrichment of a list of affected metabolites. FELLA builds a hierarchical representation of an organism's biochemistry from the Kyoto Encyclopedia of Genes and Genomes (KEGG), containing pathways, modules, enzymes, reactions and metabolites. In addition to providing a list of pathways, FELLA reports the intermediate entities (modules, enzymes, reactions) that link the input metabolites to them. This sheds light on pathway cross talk and on potential enzymes or metabolites as targets for the condition under study. FELLA has been applied to six public datasets (three from Homo sapiens, two from Danio rerio and one from Mus musculus) and has reproduced findings from the original studies and from independent literature. Conclusions: The R package FELLA offers an innovative enrichment concept starting from a list of metabolites, based on a knowledge graph representation of the KEGG database that focuses on interpretability. Besides reporting a list of pathways, FELLA suggests intermediate entities that are of interest per se. Its usefulness has been shown at several molecular levels on six public datasets, including human and animal models. The user can run the enrichment analysis through a simple interactive graphical interface or programmatically. FELLA is publicly available in Bioconductor under the GPL-3 license.
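A programmatic run, sketched below, goes from a vector of affected KEGG compound identifiers to a table of suggested pathways, modules, enzymes, reactions and compounds. Function names follow the FELLA vignette; the KEGG data object is built once per organism, and the compound identifiers and database directory used here are placeholders.

```r
## Sketch of a programmatic FELLA run (identifiers and paths are placeholders;
## building the KEGG data object downloads and processes KEGG and is done once).
library(FELLA)

graph <- buildGraphFromKEGGREST(organism = "hsa")            # KEGG knowledge graph
buildDataFromGraph(graph, databaseDir = "hsaDB", internalDir = FALSE,
                   matrices = "diffusion", normality = "diffusion")
fella.data <- loadKEGGdata(databaseDir = "hsaDB", internalDir = FALSE)

cpds <- c("C00022", "C00031")                                # affected metabolites
analysis <- enrich(compounds = cpds, data = fella.data,
                   method = "diffusion", approx = "normality")
generateResultsTable(object = analysis, data = fella.data,
                     method = "diffusion")
```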

    Lactic acid production plant

    This project deals with the design and feasibility study for the construction and operation of a lactic acid production plant, in compliance with the relevant urban-planning, sector-specific and environmental regulations.

    Soft clustering using real-world data for the identification of multimorbidity patterns in an elderly population: Cross-sectional study in a Mediterranean population

    The aim of this study was to identify, with soft clustering methods, multimorbidity patterns in the electronic health records of a population aged ≥65 years, and to analyse such patterns in accordance with the different prevalence cut-off points applied. Fuzzy cluster analysis allows individuals to be linked simultaneously to multiple clusters and is more consistent with clinical experience than other approaches frequently found in the literature.
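The sketch below illustrates the kind of soft assignment involved, using fuzzy c-means from the e1071 R package on simulated binary disease indicators; each row of the resulting membership matrix sums to one, so an individual can belong partially to several multimorbidity clusters. The data and the number of clusters are illustrative assumptions, not those of the study.

```r
## Illustrative soft clustering with fuzzy c-means (e1071); the data are
## simulated binary disease indicators, not the study's health records.
library(e1071)
set.seed(42)

n <- 200; p <- 10                                   # 200 individuals, 10 diseases
X <- matrix(rbinom(n * p, 1, 0.25), nrow = n,
            dimnames = list(NULL, paste0("disease", 1:p)))

fit <- cmeans(X, centers = 3, m = 2)                # 3 clusters, fuzzifier m = 2
round(head(fit$membership), 2)                      # soft memberships per person
rowSums(head(fit$membership))                       # each row sums to 1
```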

    Emotion recognition in video using cognitive services

    The use of video tutorials has become common practice in teaching, even in face-to-face courses. Designing quality video tutorials is a challenge for any teacher who intends to hold the student's attention throughout the viewing. Used inappropriately, however, this resource can produce the opposite effect to the one pursued: watching videos with obsolete or anachronistic content, or with an excessive proportion of text, can cause disinterest and demotivation in students. This problem motivated a student of the Degree in Computer Science and Services taught at our center to present a proposal for a final degree project (Trabajo de fin de grado, TFG) with a clear objective: to detect the different emotional states of a student while watching a video tutorial, with the intention of using this information to better design the video tutorials used in class.
This paper presents the design and implementation of a Windows desktop application, written in the C# programming language, that helps the teacher to evaluate asynchronously the suitability of the audiovisual material used as support, observing what impact the video tutorials have on the students and analyzing how they respond to the content. The application extracts quantitative data about the emotions detected in locally stored videos. The methodology consisted of recording several students while they watched a video tutorial and later analyzing those recordings with the tools available in Microsoft cognitive services, which can recognize emotions such as surprise, disgust, fear and happiness, among others. Using these services raised some technical problems that, once overcome, made it possible to associate the emotions shown by each student during the viewing with the instants of the video tutorial that caused them. The results obtained so far have been satisfactory, although it has not yet been possible to run a large enough number of analyses to draw firm conclusions; the time frame of the TFG limited the time available to test the implemented tool. Nevertheless, a line of work has been opened that will be followed by other students in the future.

    Neurodegenerative disorder risk in idiopathic REM sleep behavior disorder: study in 174 patients.

    Objective: To estimate the risk of developing a defined neurodegenerative syndrome in a large cohort of idiopathic REM sleep behavior disorder (IRBD) patients with long follow-up. Methods: Using the Kaplan-Meier method, we estimated the disease-free survival rate from defined neurodegenerative syndromes in all consecutive IRBD patients diagnosed and followed up in our tertiary referral sleep center between November 1991 and July 2013. Results: The cohort comprises 174 patients with a median age at IRBD diagnosis of 69 years and a median follow-up of four years. The risk of a defined neurodegenerative syndrome from the time of IRBD diagnosis was 33.1% at five years, 75.7% at ten years, and 90.9% at 14 years. The median conversion time was 7.5 years. Emerging diagnoses (37.4%) were dementia with Lewy bodies (DLB) in 29 subjects, Parkinson disease (PD) in 22, multiple system atrophy (MSA) in two, and mild cognitive impairment (MCI) in 12. In the six cases in which a postmortem examination was performed, neuropathology disclosed neuronal loss and widespread Lewy-type pathology in the brain. Conclusions: In a large IRBD cohort diagnosed in a tertiary referral sleep center, prolonged follow-up indicated that the majority of patients are eventually diagnosed with the synucleinopathies PD, DLB and, less frequently, MSA. IRBD represented the prodromal period of these conditions. Our findings in IRBD have important implications for clinical practice, for the investigation of the early pathological events occurring in the synucleinopathies, and for the design of interventions with potential disease-modifying agents.
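As a sketch of the estimation step, the R survival package can produce the disease-free survival curve from which risks at 5, 10 and 14 years are read off as one minus survival; the small data frame below is invented for illustration and does not reproduce the cohort.

```r
## Sketch of a Kaplan-Meier disease-free survival estimate with the survival
## package; `time_years` and `converted` are invented example data, not the
## study cohort (event = diagnosis of a defined neurodegenerative syndrome).
library(survival)

df <- data.frame(
  time_years = c(2.1, 4.0, 5.5, 7.5, 8.2, 10.0, 11.3, 14.0),
  converted  = c(0,   1,   1,   1,   0,   1,    0,    1)   # 1 = converted, 0 = censored
)

fit <- survfit(Surv(time_years, converted) ~ 1, data = df)
summary(fit, times = c(5, 10, 14))   # disease-free survival at 5, 10 and 14 years
## the risk of conversion at time t is 1 minus the survival estimate at t
```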