Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Informative sample generation using class aware generative adversarial networks for classification of chest Xrays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to limited images covering different disease types
and severity. The problem is especially acute where there is a severe class
imbalance. We propose an active learning (AL) framework to select the most
informative samples for training our model using a Bayesian neural network.
Informative samples are then used within a novel class aware generative
adversarial network (CAGAN) to generate realistic chest X-ray images for data
augmentation by transferring characteristics from one class label to another.
Experiments show our proposed AL framework is able to achieve state-of-the-art
performance by using only a fraction of the full dataset, thus saving significant
time and effort over conventional methods.
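The sample-selection step described above can be illustrated with a minimal sketch. The abstract does not state which informativeness measure is used, so this example assumes one common choice: the entropy of class probabilities averaged over several stochastic forward passes of a Bayesian network. The function names, the pool layout and the toy probabilities are all hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Entropy of the mean class distribution for one image.

    prob_samples: list of T probability vectors, one per stochastic
    forward pass (ASSUMED output format of the Bayesian network).
    """
    t = len(prob_samples)
    n_classes = len(prob_samples[0])
    mean = [sum(p[c] for p in prob_samples) / t for c in range(n_classes)]
    return -sum(m * math.log(m) for m in mean if m > 0)


def select_most_informative(pool, k):
    """Return the k image ids whose predictions are most uncertain.

    pool: dict mapping image id -> list of T probability vectors.
    """
    ranked = sorted(pool, key=lambda i: predictive_entropy(pool[i]), reverse=True)
    return ranked[:k]


# Toy pool with made-up probabilities for three unlabeled images.
pool = {
    "img_a": [[0.99, 0.01], [0.98, 0.02]],  # confident prediction
    "img_b": [[0.50, 0.50], [0.60, 0.40]],  # very uncertain
    "img_c": [[0.90, 0.10], [0.70, 0.30]],  # moderately uncertain
}
selected = select_most_informative(pool, k=2)
```

In an active learning loop, the selected images would be labeled and added to the training set before the next round; other uncertainty scores could be swapped in for the entropy without changing the ranking logic.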
Generative Modelling for Unsupervised Score Calibration
Score calibration enables automatic speaker recognizers to make
cost-effective accept / reject decisions. Traditional calibration requires
supervised data, which is an expensive resource. We propose a 2-component GMM
for unsupervised calibration and demonstrate good performance relative to a
supervised baseline on NIST SRE'10 and SRE'12. A Bayesian analysis demonstrates
that the uncertainty associated with the unsupervised calibration parameter
estimates is surprisingly small. Comment: Accepted for ICASSP 201
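As a rough illustration of the idea, a 2-component Gaussian mixture can be fitted to unlabeled scores with EM and then used to turn a raw score into a log-likelihood ratio. This is only a sketch under stated assumptions, not the authors' model: it assumes one-dimensional scores, independent variances per component, and that the component with the higher mean corresponds to target trials. The function names and the synthetic scores are made up for illustration.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fit_two_component_gmm(scores, iters=50):
    """Fit a 1-D, 2-component Gaussian mixture to unlabeled scores with EM."""
    n = len(scores)
    grand_mean = sum(scores) / n
    grand_var = sum((x - grand_mean) ** 2 for x in scores) / n
    # Illustrative initialisation: one component at each extreme of the scores.
    means = [min(scores), max(scores)]
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each score.
        resp = []
        for x in scores:
            p = [weights[k] * math.exp(log_normal_pdf(x, means[k], variances[k]))
                 for k in range(2)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk,
                1e-6,
            )
    return weights, means, variances


def calibrated_llr(x, means, variances):
    """Log-likelihood ratio, ASSUMING the higher-mean component models targets."""
    tar = 0 if means[0] > means[1] else 1
    non = 1 - tar
    return (log_normal_pdf(x, means[tar], variances[tar])
            - log_normal_pdf(x, means[non], variances[non]))


# Synthetic, unlabeled scores (made up for illustration, not NIST SRE data).
rng = random.Random(0)
scores = ([rng.gauss(-2.0, 1.0) for _ in range(2000)]
          + [rng.gauss(3.0, 1.0) for _ in range(200)])
weights, means, variances = fit_two_component_gmm(scores)
```

A positive value of `calibrated_llr` favors the target hypothesis and a negative value the non-target hypothesis; an accept / reject decision would compare it against a threshold set by the application's costs and priors.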
FOUR YEARS OF UNMANNED AERIAL SYSTEM IMAGERY REVEALS VEGETATION CHANGE IN A SUB-ARCTIC MIRE DUE TO PERMAFROST THAW
Warming trends in sub-arctic regions have resulted in thawing of permafrost, which in turn induces change in vegetation across peatlands both in areal extent and composition. Collapse of palsas (i.e. permafrost plateaus) has also been correlated with increases in methane (CH4) emission to the atmosphere. Vegetation change provides new microenvironments that promote CH4 production and emission, specifically through plant interactions and structure. By quantifying the changes in vegetation at the landscape scale, we will be able to scale the impact of thaw on CH4 emissions in these complex climate-sensitive northern ecosystems. We combine field-based measurements of vegetation composition and Unmanned Aerial Systems (UAS) high-resolution (3 cm) imagery to characterize vegetation change in a sub-arctic mire. The objective of this study is to analyze how vegetation from Stordalen Mire, Abisko, Sweden, has changed over time in response to permafrost thaw. At Stordalen Mire, we flew a fixed-wing UAS in July of each of four years, 2014 through 2017, over a 1 km x 0.5 km area. High-precision GPS ground control points were used to georeference the imagery. Randomized square-meter plots were measured for vegetation composition and individually classified into one of five vegetation cover types, each representing a different stage of permafrost degradation. Using these training data, each year of imagery was classified by cover type in Google Earth Engine using a Random Forest Classifier. Textural information was extracted from the imagery, which provided additional spatial context information and improved classification accuracy. Twenty-five percent of the training data were held back from the classification and used for validation, while the remaining seventy-five percent of the training data were used to classify the imagery. The overall classification accuracy for 2014 through 2017 was 80.6%, 79.1%, 82.0%, and 82.9%, respectively.
Percent cover across the landscape was calculated from each classification map and compared between years. Hummock sites, representing intact permafrost, decreased in coverage by 9% from 2014 to 2017, while semi-wet sites increased in coverage by 18%. This four-year comparison of vegetation cover indicated a rapid response to permafrost thaw. The use of a UAS allowed us to effectively capture the spatial heterogeneity of a northern peatland ecosystem. Estimation of vegetation cover types is vital to our understanding of the evolution of northern peatlands and their future role in the global carbon cycle.
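The percent-cover comparison can be sketched as follows. This is an illustrative example only: it assumes each classification map is a grid of per-pixel cover-type labels, and it reports change in percentage points (the abstract does not say whether its figures are percentage points or relative change). The label names and the tiny toy maps are made up and are not the study's data.

```python
from collections import Counter


def percent_cover(class_map):
    """Percent of pixels assigned to each cover type in a classified map.

    class_map: list of rows, each row a list of cover-type labels.
    """
    counts = Counter(label for row in class_map for label in row)
    total = sum(counts.values())
    return {label: 100.0 * c / total for label, c in counts.items()}


def cover_change(map_early, map_late):
    """Change in percent cover (percentage points) between two maps."""
    early, late = percent_cover(map_early), percent_cover(map_late)
    labels = set(early) | set(late)
    return {label: late.get(label, 0.0) - early.get(label, 0.0) for label in labels}


# Toy 2 x 5 classification maps with hypothetical labels.
map_2014 = [
    ["hummock", "hummock", "hummock", "semi-wet", "wet"],
    ["hummock", "hummock", "semi-wet", "wet", "wet"],
]
map_2017 = [
    ["hummock", "hummock", "semi-wet", "semi-wet", "wet"],
    ["hummock", "hummock", "semi-wet", "semi-wet", "wet"],
]
change = cover_change(map_2014, map_2017)
```

With real imagery, the maps would be the per-year classifier outputs over the same georeferenced extent, so that pixel counts are directly comparable between years.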