662 research outputs found
Multivariate time series classification with temporal abstractions
The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time-series classification is a challenging problem, mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time-series features suitable for classification tasks. We propose the STF-Mine algorithm, which automatically mines discriminative temporal abstraction patterns from the time-series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real-world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets. Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
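To illustrate the general idea of temporal abstraction described in this abstract, the following is a minimal sketch, not the STF-Mine algorithm itself. It assumes a simple value abstraction (low / normal / high thresholds) and a single "state A occurs before state B" pattern used as a binary feature; the function names, thresholds, and readings are all hypothetical.

```python
from itertools import groupby

def value_abstraction(series, low, high):
    """Map each numeric sample to a qualitative state: 'L', 'N', or 'H'."""
    def state(x):
        if x < low:
            return "L"
        if x > high:
            return "H"
        return "N"
    return [state(x) for x in series]

def to_intervals(states):
    """Merge runs of equal states into (state, start, end) intervals (inclusive indices)."""
    intervals = []
    pos = 0
    for s, run in groupby(states):
        n = len(list(run))
        intervals.append((s, pos, pos + n - 1))
        pos += n
    return intervals

def has_pattern(intervals, first, second):
    """True if some interval in state `first` ends before an interval in state `second` starts."""
    return any(a[0] == first and b[0] == second and a[2] < b[1]
               for a in intervals for b in intervals)

# Hypothetical readings for one variable of one patient (assumed thresholds 70 and 140).
readings = [80, 85, 150, 160, 155, 90, 60, 55]
states = value_abstraction(readings, low=70, high=140)
intervals = to_intervals(states)

# Binary feature "a high episode precedes a low episode", usable by any classifier.
feature = int(has_pattern(intervals, "H", "L"))
```

In a multivariate setting, one such feature would be computed per candidate pattern and per variable, which is why the space of possible temporal features grows so quickly.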
Clinical Bioinformatics: challenges and opportunities
Background: Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are now available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics.
Methods: In this editorial, the viewpoints and opinions of three world CBI leaders, who were invited to participate in a panel discussion at the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools for efficient search computing.
Results: Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, and integrated search and retrieval of -omics information.
Conclusions: Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput “-omics” technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research
Combining Unsupervised and Supervised Learning for Discovering Disease Subclasses
Diseases are often umbrella terms for many subcategories of disease. The identification of these subcategories is vital if we are to develop personalised treatments that are better focussed on individual patients. In this short paper, we explore the use of a combination of unsupervised learning to identify potential subclasses, and supervised learning to build models for better predicting a number of different health outcomes for patients that suffer from systemic sclerosis, a rare chronic connective tissue disorder - but one that shares many characteristics with other diseases. We explore a number of different algorithms for constructing models that simultaneously predict health outcomes and identify subcategories
Nearest Consensus Clustering Classification to Identify Subclasses and Predict Disease
Disease subtyping, which helps to develop personalized treatments, remains a challenge in data analysis because of the many different ways to group patients based upon their data. However, if we can identify subclasses of disease, then it will help to develop better models that are more specific to individuals and should therefore improve prediction and understanding of the underlying characteristics of the disease in question. This paper proposes a new algorithm that integrates consensus clustering methods with classification in order to overcome issues with sample bias. The new algorithm combines K-means with consensus clustering in order to build cohort-specific decision trees that improve classification as well as aid the understanding of the underlying differences between the discovered groups. The methods are tested on a real-world, freely available breast cancer dataset and on data from a London hospital on systemic sclerosis, a rare, potentially fatal condition. Results show that “nearest consensus clustering classification” significantly improves accuracy and prediction compared with similar competing methods.
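The combination described in this abstract can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the authors' implementation: K-means is run on repeated subsamples, a co-association (consensus) matrix links points that usually fall in the same cluster, and a new patient is routed to the nearest consensus cluster. A majority-vote rule per cluster stands in for the cohort-specific decision trees, and the data, outcome labels, and parameter values are hypothetical.

```python
import random
from collections import Counter

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return assign

def consensus_clusters(points, k, runs=20, frac=0.75, seed=0):
    """Cluster repeated subsamples, then link points co-clustered in most shared runs."""
    n = len(points)
    rng = random.Random(seed)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = rng.sample(range(n), int(frac * n))
        assign = kmeans([points[i] for i in idx], k, rng)
        for a, i in zip(assign, idx):
            for b, j in zip(assign, idx):
                sampled[i][j] += 1
                together[i][j] += (a == b)
    # Connected components of the graph "consensus index > 0.5".
    labels, next_label = [None] * n, 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start], stack = next_label, [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if (labels[j] is None and sampled[i][j]
                        and together[i][j] / sampled[i][j] > 0.5):
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

def fit(points, outcomes, k):
    """One (centroid, majority outcome) pair per consensus cluster (stand-in for a tree)."""
    labels = consensus_clusters(points, k)
    model = {}
    for c in set(labels):
        members = [i for i, l in enumerate(labels) if l == c]
        majority = Counter(outcomes[i] for i in members).most_common(1)[0][0]
        model[c] = (mean([points[i] for i in members]), majority)
    return labels, model

def predict(model, x):
    """Route x to the nearest consensus cluster and apply that cluster's rule."""
    nearest = min(model, key=lambda c: dist2(x, model[c][0]))
    return model[nearest][1]

# Hypothetical two-feature patient data with hypothetical outcome labels.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
outcomes = ["mild", "mild", "mild", "severe", "severe", "severe", "severe", "mild"]
labels, model = fit(points, outcomes, k=2)
```

Replacing the majority vote with a decision tree trained only on each cluster's members would bring the sketch closer to the cohort-specific models the abstract describes.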
Extracción de reglas temporales complejas para la detección de fallos del tratamiento antirretroviral (Extraction of complex temporal rules for detecting antiretroviral treatment failure)
Clinical databases currently hold a large volume of temporal information that is not being sufficiently exploited and that may prove essential for optimal patient care. This work describes a new algorithm that temporally associates the behaviour of the variables describing patient evolution and then derives rules of clinical interest. That interest is evaluated using several metrics of proven usefulness for knowledge extraction from clinical databases. The results of applying this algorithm to clinical data from patients with HIV/AIDS are also presented, with the aim of detecting patterns of variable behaviour that lead to antiretroviral treatment failure.
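As a rough illustration of how a temporal rule of this kind might be scored, the sketch below computes support and confidence for a rule of the form "antecedent is followed by consequent within a time window". The abstract does not name its metrics or rule format, so support, confidence, the windowing scheme, the event names, and the patient records are all assumptions made for this example.

```python
def rule_metrics(patients, antecedent, consequent, window):
    """Support and confidence of 'antecedent followed by consequent within `window` time units'."""
    n_ante = n_both = 0
    for events in patients:
        ante_times = [t for t, e in events if e == antecedent]
        if not ante_times:
            continue
        n_ante += 1
        cons_times = [t for t, e in events if e == consequent]
        # The consequent must occur strictly after the antecedent and inside the window.
        if any(0 < tc - ta <= window for ta in ante_times for tc in cons_times):
            n_both += 1
    support = n_both / len(patients)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# Hypothetical patient histories as (time in months, abstracted event) pairs.
patients = [
    [(0, "CD4_decreasing"), (3, "viral_load_increasing"), (5, "treatment_failure")],
    [(0, "CD4_decreasing"), (10, "treatment_failure")],
    [(0, "CD4_stable"), (4, "viral_load_stable")],
    [(1, "CD4_decreasing"), (4, "treatment_failure")],
]
support, confidence = rule_metrics(
    patients, "CD4_decreasing", "treatment_failure", window=6
)
```

Here the rule holds for two of the four patients, and for two of the three patients in whom the antecedent occurs at all.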
Advancing Critical Care in the ICU: A Human-Centered Biomedical Data Visualization Systems
The purpose of this research is to provide medical clinicians with a new technology for interpreting large and diverse datasets to expedite critical care decision-making in the ICU. We refer to this technology as the medical information visualization assistant (MIVA). MIVA delivers multivariate biometric (bedside) data via a visualization display by transforming and organizing it into temporal resolutions that can provide contextual knowledge to clinicians. The result is a spatial organization of multiple datasets that allows rapid analysis and interpretation of trends. Findings from the usability study of the MIVA static prototype and heuristic inspection of the dynamic prototype suggest that using MIVA can yield faster and more accurate results. Furthermore, comments from the majority of the experimental group and the heuristic inspectors indicate that MIVA can facilitate clinical task flow in context-dependent health care settings
Clusters of individuals recovering from an exacerbation of chronic obstructive pulmonary disease and response to in-hospital pulmonary rehabilitation
Introduction and objectives: Because pulmonary rehabilitation (PR) is currently in short supply for individuals recovering from a COPD exacerbation (ECOPD), admission priority criteria are needed. We tested the hypothesis that these individuals might be clustered according to baseline characteristics to identify subpopulations with different responses to PR.
Methods: Multicentric retrospective analysis of individuals who had undergone in-hospital PR. Baseline characteristics and outcome measures (six-minute walking test, 6MWT; Medical Research Council scale for dyspnoea, MRC; COPD assessment test, CAT) were used for clustering analysis.
Results: Data analysis of 1159 individuals showed that, after the program, the proportion of individuals reaching the minimal clinically important difference (MCID) was 85.0%, 86.3%, and 65.6% for CAT, MRC, and 6MWT respectively. Three clusters were found (C1-severe: 10.9%; C2-intermediate: 74.4%; C3-mild: 14.7% of cases respectively). Cluster C1-severe showed the worst baseline conditions and the largest post-PR improvements in outcome measures; C3-mild showed the least severe baseline conditions but the smallest improvements. The proportion of participants reaching the MCID in all three outcome measures differed significantly among clusters, with C1-severe having the highest proportion of full success (69.0%) as compared to C2-intermediate (48.3%) and C3-mild (37.4%). Participants in C2-intermediate and C1-severe had 1.7- and 4.6-fold higher odds of reaching the MCID in all three outcomes than those in C3-mild (OR = 1.72, 95% confidence interval [95% CI] = 1.2 - 2.49, p = 0.0035 and OR = 4.57, 95% CI = 2.68 - 7.91, p < 0.0001 respectively).
Conclusions: Clustering analysis can identify subpopulations of individuals recovering from ECOPD associated with different responses to PR. Our results may help in defining priority criteria based on the probability of success of PR.
Predicting Comorbidities Using Resampling and Dynamic Bayesian Networks with Latent Variables
Gatekeeper of pluripotency: a common Oct4 transcriptional network operates in mouse eggs and embryonic stem cells
BACKGROUND: Oct4 is a key factor of an expanded transcriptional network (Oct4-TN) that governs pluripotency and self-renewal in embryonic stem cells (ESCs) and in the inner cell mass from which ESCs are derived. An open question is whether the establishment of the Oct4-TN begins during oogenesis or after fertilisation. In this regard, recent evidence has shown that Oct4 controls a poorly known Oct4-TN central to the acquisition of developmental competence by the mouse egg. The aim of this study was to investigate the identity and extent of this maternal Oct4-TN, as well as whether its presence is confined to the egg or maintained beyond fertilisation.
RESULTS: By comparing the genome-wide transcriptional profile of developmentally competent eggs that express the OCT4 protein to that of developmentally incompetent eggs in which OCT4 is down-regulated, we unveiled a maternal Oct4-TN of 182 genes. Eighty of these transcripts escape post-fertilisation degradation and represent the maternal Oct4-TN inheritance that is passed on to the 2-cell embryo. Most of these 80 genes are expressed in cancer cells and 37 are notable companions of the Oct4 transcriptome in ESCs.
CONCLUSIONS: These results provide, for the first time, a developmental link between eggs, early preimplantation embryos and ESCs, indicating that the molecular signature that characterises ESC identity is rooted in oogenesis. They also contribute a useful resource for further study of the mechanisms of Oct4 function and regulation during the maternal-to-embryo transition and for exploring the link between the regulation of pluripotency and the acquisition of de-differentiation in cancer cells.