An update on statistical boosting in biomedicine
Statistical boosting algorithms have triggered a lot of research during the
last decade. They combine a powerful machine-learning approach with classical
statistical modelling, offering various practical advantages like automated
variable selection and implicit regularization of effect estimates. They are
extremely flexible, as the underlying base-learners (regression functions
defining the type of effect for the explanatory variables) can be combined with
any kind of loss function (target function to be optimized, defining the type
of regression setting). In this review article, we highlight the most recent
methodological developments on statistical boosting regarding variable
selection, functional regression and advanced time-to-event modelling.
Additionally, we provide a short overview of relevant applications of
statistical boosting in biomedicine.
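To illustrate how boosting yields automated variable selection and implicit regularization, here is a minimal sketch of component-wise L2 boosting with one simple linear base-learner per explanatory variable. It is not taken from the review: the function name `componentwise_l2_boost`, the step length `nu`, the number of steps `n_steps`, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise L2 boosting with linear base-learners (illustrative sketch).

    Assumes the columns of X are centered. Names and defaults are hypothetical.
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    resid = y - intercept              # negative gradient of the squared-error loss
    col_ss = (X ** 2).sum(axis=0)      # sum of squares of each column
    for _ in range(n_steps):
        # Least-squares fit of every single-variable base-learner to the residuals
        betas = X.T @ resid / col_ss
        # Reduction in residual sum of squares achieved by each candidate
        gain = betas ** 2 * col_ss
        j = int(np.argmax(gain))       # only the best-fitting base-learner is updated
        coef[j] += nu * betas[j]       # small step: implicit shrinkage of the effect
        resid -= nu * betas[j] * X[:, j]
    return intercept, coef

# Simulated example: only variables 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X -= X.mean(axis=0)
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.1, size=200)
intercept, coef = componentwise_l2_boost(X, y)
top_two = sorted(int(i) for i in np.argsort(-np.abs(coef))[:2])
```

Stopping the loop early (a smaller `n_steps`) leaves more coefficients at exactly zero and shrinks the rest, which is the sense in which the number of boosting iterations acts as the main tuning parameter in this sketch.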
Stable Feature Selection from Brain sMRI
Neuroimage analysis usually involves learning thousands or even millions of
variables using only a limited number of samples. In this regard, sparse
models, e.g. the lasso, are applied to select the optimal features and achieve
high diagnostic accuracy. The lasso, however, usually yields independent,
unstable features. Stability, a manifestation of the reproducibility of statistical
results subject to reasonable perturbations to data and the model, is an
important focus in statistics, especially in the analysis of high dimensional
data. In this paper, we explore a nonnegative generalized fused lasso model for
stable feature selection in the diagnosis of Alzheimer's disease. In addition
to sparsity, our model incorporates two important pathological priors: the
spatial cohesion of lesion voxels and the positive correlation between the
features and the disease labels. To optimize the model, we propose an efficient
algorithm by proving a novel link between total variation and fast network flow
algorithms via conic duality. Experiments show that the proposed nonnegative
model performs much better than other state-of-the-art methods at exploring the
intrinsic structure of the data by selecting stable features.
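For orientation, a nonnegative generalized fused lasso is commonly written in the following form. This is a sketch under assumptions, since the abstract does not state the objective: the loss \ell, the neighborhood edge set E of spatially adjacent voxels, and the tuning parameters \lambda_1, \lambda_2 are notation introduced here.

```latex
\min_{\beta \ge 0} \;
  \ell(\beta; X, y)
  \;+\; \lambda_1 \lVert \beta \rVert_1
  \;+\; \lambda_2 \sum_{(i,j) \in E} \lvert \beta_i - \beta_j \rvert
```

The \lambda_1 term encourages sparsity, the \lambda_2 term (a total-variation penalty over neighboring voxels) encourages spatial cohesion, and the constraint \beta \ge 0 encodes the assumed positive correlation between features and disease labels.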
Factorial graphical lasso for dynamic networks
Dynamic network models describe a growing number of important scientific
processes, from cell biology and epidemiology to sociology and finance. There
are many aspects of dynamical networks that require statistical considerations.
In this paper we focus on determining network structure. Estimating dynamic
networks is a difficult task since the number of components involved in the
system is very large. As a result, the number of parameters to be estimated is
bigger than the number of observations. However, a characteristic of many
networks is that they are sparse. For example, the molecular structure of genes
makes interactions with other components a highly structured and therefore
sparse process.
Penalized Gaussian graphical models have been used to estimate sparse
networks. However, the literature has focussed on static networks, which lack
specific temporal constraints. We propose a structured Gaussian dynamical
graphical model, where structures can consist of specific time dynamics, known
presence or absence of links and block equality constraints on the parameters.
Thus, the number of parameters to be estimated is reduced and the accuracy of the
estimates, including the identification of the network, can be improved. Here,
we show that the constrained optimization problem can be solved by taking
advantage of an efficient solver, logdetPPA, developed in convex optimization.
Moreover, model selection methods for checking the sensitivity of the inferred
networks are described. Finally, synthetic and real data illustrate the
proposed methodologies.
Comment: 30 pp, 5 figures
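As a point of reference, the standard penalized Gaussian graphical model (the static graphical lasso) estimates a sparse precision matrix as shown below. This is the generic textbook form, not necessarily the exact objective of the paper: S denotes the sample covariance matrix, \Theta the precision matrix, \lambda the penalty weight, and \mathcal{C} is hypothetical notation for the set of matrices satisfying the structural constraints described above (time dynamics, known links, block equalities).

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0,\; \Theta \in \mathcal{C}}
  \; \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \lambda \lVert \Theta \rVert_1
```

A zero entry \hat{\Theta}_{ij} = 0 corresponds to a missing edge between nodes i and j, so the pattern of nonzeros gives the estimated network.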