37,365 research outputs found
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues
Structured penalized regression for drug sensitivity prediction
Large-scale {\it in vitro} drug sensitivity screens are an important tool in
personalized oncology to predict the effectiveness of potential cancer drugs.
The prediction of the sensitivity of cancer cell lines to a panel of drugs is a
multivariate regression problem with high-dimensional heterogeneous multi-omics
data as input data and with potentially strong correlations between the outcome
variables which represent the sensitivity to the different drugs. We propose a
joint penalized regression approach with structured penalty terms which allow
us to utilize the correlation structure between drugs with group-lasso-type
penalties and at the same time address the heterogeneity between omics data
sources by introducing data-source-specific penalty factors to penalize
different data sources differently. By combining integrative penalty factors
(IPF) with tree-guided group lasso, we create the IPF-tree-lasso method. We
present a unified framework to transform more general IPF-type methods to the
original penalized method. Because the structured penalty terms have multiple
parameters, we demonstrate how the interval-search Efficient Parameter
Selection via Global Optimization (EPSGO) algorithm can be used to optimize
multiple penalty parameters efficiently. Simulation studies show that
IPF-tree-lasso can improve the prediction performance compared to other
lasso-type methods, in particular for heterogenous data sources. Finally, we
employ the new methods to analyse data from the Genomics of Drug Sensitivity in
Cancer project.Comment: Zhao Z, Zucknick M (2020). Structured penalized regression for drug
sensitivity prediction. Journal of the Royal Statistical Society, Series C.
19 pages, 6 figures and 2 table
An integrative clustering approach combining particle swarm optimization and formal concept analysis
A Selective Review of Group Selection in High-Dimensional Models
Grouping structures arise naturally in many statistical modeling problems.
Several methods have been proposed for variable selection that respect grouping
structure in variables. Examples include the group LASSO and several concave
group selection methods. In this article, we give a selective review of group
selection concerning methodological developments, theoretical properties and
computational algorithms. We pay particular attention to group selection
methods involving concave penalties. We address both group selection and
bi-level selection methods. We describe several applications of these methods
in nonparametric additive models, semiparametric regression, seemingly
unrelated regressions, genomic data analysis and genome wide association
studies. We also highlight some issues that require further study.Comment: Published in at http://dx.doi.org/10.1214/12-STS392 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Aging concrete structures: a review of mechanics and concepts
The safe and cost-efficient management of our built infrastructure is a challenging task considering the expected service life of at least 50 years. In spite of time-dependent changes in material properties, deterioration processes and changing demand by society, the structures need to satisfy many technical requirements related to serviceability, durability, sustainability and bearing capacity. This review paper summarizes the challenges associated with the safe design and maintenance of aging concrete structures and gives an overview of some concepts and approaches that are being developed to address these challenges
An update on statistical boosting in biomedicine
Statistical boosting algorithms have triggered a lot of research during the
last decade. They combine a powerful machine-learning approach with classical
statistical modelling, offering various practical advantages like automated
variable selection and implicit regularization of effect estimates. They are
extremely flexible, as the underlying base-learners (regression functions
defining the type of effect for the explanatory variables) can be combined with
any kind of loss function (target function to be optimized, defining the type
of regression setting). In this review article, we highlight the most recent
methodological developments on statistical boosting regarding variable
selection, functional regression and advanced time-to-event modelling.
Additionally, we provide a short overview on relevant applications of
statistical boosting in biomedicine
- …