Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
Digging into acceptor splice site prediction: an iterative feature selection approach
Feature selection techniques are often used to reduce data dimensionality, increase classification performance, and gain insight into the processes that generated the data. In this paper, we describe an iterative procedure of feature selection and feature construction steps, improving the classification of acceptor splice sites, an important subtask of gene prediction.
We show that acceptor prediction can benefit from feature selection, and describe how feature selection techniques can be used to gain new insights in the classification of acceptor sites. This is illustrated by the identification of a new, biologically motivated feature: the AG-scanning feature.
The results described in this paper contribute both to the domain of gene prediction and to research in feature selection techniques, describing a new wrapper-based feature weighting method that aids in knowledge discovery when dealing with complex datasets.
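As an illustration of the general wrapper idea mentioned in this abstract, the sketch below runs a greedy backward elimination in which a classifier's leave-one-out accuracy decides which feature to drop next. It is not the paper's feature weighting method: the nearest-centroid classifier, the function names, and the synthetic data are all assumptions chosen to keep the example self-contained.

```python
import random


def nearest_centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Build class centroids from every sample except the held-out one.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label):
            return sum(
                (X[i][f] - sums[label][k] / counts[label]) ** 2
                for k, f in enumerate(features)
            )

        predicted = min(sums, key=sq_dist)
        correct += predicted == y[i]
    return correct / len(X)


def backward_elimination(X, y, n_keep):
    """Greedy wrapper: repeatedly drop the feature whose removal hurts least."""
    selected = list(range(len(X[0])))
    while len(selected) > n_keep:
        best_score, best_drop = -1.0, None
        for f in selected:
            trial = [g for g in selected if g != f]
            score = nearest_centroid_accuracy(X, y, trial)
            if score > best_score:
                best_score, best_drop = score, f
        selected.remove(best_drop)
    return selected


# Hypothetical toy data: feature 0 carries the class signal, features 1-4 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [
    [label * 3.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(4)]
    for label in y
]
selected = backward_elimination(X, y, n_keep=2)
print("features kept:", selected)
```

Because the classifier itself scores each candidate subset, the features that survive are those the model actually relies on, which is what makes wrapper methods useful for gaining insight into a dataset.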
Personality cannot be predicted from the power of resting state EEG
In the present study we asked whether it is possible to decode personality traits from resting state EEG data. EEG was recorded from a large sample of subjects (N = 309) who had answered questionnaires measuring personality trait scores of the 5 dimensions as well as the 10 subordinate aspects of the Big Five. Machine learning algorithms were used to build a classifier to predict each personality trait from power spectra of the resting state EEG data. The results indicate that the five dimensions as well as their subordinate aspects could not be predicted from the resting state EEG data. Finally, to demonstrate that this result is not due to systematic algorithmic or implementation mistakes, the same methods were used to successfully classify whether the subject had eyes open or eyes closed and whether the subject was male or female. These results indicate that the extraction of personality traits from the power spectra of resting state EEG is extremely noisy, if possible at all.
Comment: 14 pages, 4 figures
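The positive-control logic described in this abstract can be sketched as follows: the same pipeline is run once on a label known to be decodable and once on the label of interest, so that a null result on the latter cannot be blamed on a broken implementation. Everything here is an assumption for illustration only: the data are synthetic rather than EEG, the single-feature threshold classifier stands in for the study's unspecified algorithms, and the names `control_label` and `trait_label` are hypothetical.

```python
import random


def holdout_accuracy(feature, labels):
    """Fit a one-feature threshold on the first half, score on the second half."""
    half = len(feature) // 2
    train = list(zip(feature[:half], labels[:half]))
    ones = [x for x, label in train if label == 1]
    zeros = [x for x, label in train if label == 0]
    mean1 = sum(ones) / len(ones)
    mean0 = sum(zeros) / len(zeros)
    threshold = (mean0 + mean1) / 2
    high_class = 1 if mean1 > mean0 else 0  # class predicted above the threshold
    hits = 0
    for x, label in zip(feature[half:], labels[half:]):
        predicted = high_class if x > threshold else 1 - high_class
        hits += predicted == label
    return hits / (len(feature) - half)


rng = random.Random(42)
n = 200
# Hypothetical control label that genuinely shifts the feature.
control_label = [rng.randint(0, 1) for _ in range(n)]
# Hypothetical target label that is independent of the feature.
trait_label = [rng.randint(0, 1) for _ in range(n)]
feature = [3.0 * c + rng.gauss(0, 1) for c in control_label]

control_acc = holdout_accuracy(feature, control_label)
trait_acc = holdout_accuracy(feature, trait_label)
print(f"positive control accuracy: {control_acc:.2f}")
print(f"target label accuracy:     {trait_acc:.2f}")
```

If the control label is recovered well while the target label stays near chance, the pipeline is shown to work and the null result is more likely to reflect the data than a coding error.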
Novel translational approaches to the search for precision therapies for acute respiratory distress syndrome.
In the 50 years since acute respiratory distress syndrome (ARDS) was first described, substantial progress has been made in identifying the risk factors for and the pathogenic contributors to the syndrome and in characterising the protein expression patterns in plasma and bronchoalveolar lavage fluid from patients with ARDS. Despite this effort, however, pharmacological options for ARDS remain scarce. Frequently cited reasons for this absence of specific drug therapies include the heterogeneity of patients with ARDS, the potential for a differential response to drugs, and the possibility that the wrong targets have been studied. Advances in applied biomolecular technology and bioinformatics have enabled breakthroughs for other complex traits, such as cardiovascular disease or asthma, particularly when a precision medicine paradigm, wherein a biomarker or gene expression pattern indicates a patient's likelihood of responding to a treatment, has been pursued. In this Review, we consider the biological and analytical techniques that could facilitate a precision medicine approach for ARDS.
Programmed cell death 6 interacting protein (PDCD6IP) and Rabenosyn-5 (ZFYVE20) are potential urinary biomarkers for upper gastrointestinal cancer
PURPOSE:
Cancer of the upper digestive tract (uGI) is a major contributor to cancer-related death worldwide. Due to a rise in occurrence, together with poor survival rates and a lack of diagnostic or prognostic clinical assays, there is a clear need to establish molecular biomarkers.
EXPERIMENTAL DESIGN:
Initial assessment was performed on urine samples from 60 control and 60 uGI cancer patients using MS to establish a peak pattern or fingerprint model, which was validated by a further set of 59 samples.
RESULTS:
We detected 86 cluster peaks by MS above frequency and detection thresholds. Statistical testing and model building resulted in a peak profiling model of five relevant peaks with 88% overall sensitivity and 91% specificity, and overall correctness of 90%. High-resolution MS of 40 samples in the 2-10 kDa range resulted in 646 identified proteins, and pattern matching identified four of the five model peaks within significant parameters, namely programmed cell death 6 interacting protein (PDCD6IP/Alix/AIP1), Rabenosyn-5 (ZFYVE20), protein S100A8, and protein S100A9, of which the first two were validated by Western blotting.
CONCLUSIONS AND CLINICAL RELEVANCE:
We demonstrate that MS analysis of human urine can identify lead biomarker candidates in uGI cancers, which makes this technique potentially useful in defining and consolidating biomarker patterns for uGI cancer screening.
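For readers unfamiliar with the performance measures quoted in the RESULTS paragraph, the sketch below shows how sensitivity, specificity, and overall correctness (accuracy) follow from confusion-matrix counts. The counts are hypothetical and chosen only for easy arithmetic; they are not the study's data, which the abstract does not report at this level of detail.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cases that are detected
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall correctness
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 patients (45 detected) and 50 controls (40 cleared).
sensitivity, specificity, accuracy = diagnostic_metrics(tp=45, fn=5, tn=40, fp=10)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} accuracy={accuracy:.2f}")
```

Note that accuracy is a prevalence-weighted blend of the other two measures, so it always lies between sensitivity and specificity.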