2,707 research outputs found
Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression
In high dimensional settings, sparse structures are crucial for efficiency,
in terms of memory, computation and performance. It is customary to
use a penalty to enforce sparsity in such scenarios. Sparsity
enforcing methods, the Lasso being a canonical example, are popular candidates
to address high dimension. For efficiency, they rely on tuning a parameter
trading data fitting versus sparsity. For the Lasso theory to hold this tuning
parameter should be proportional to the noise level, yet the latter is often
unknown in practice. A possible remedy is to jointly optimize over the
regression parameter as well as over the noise level. This has been considered
under several names in the literature: Scaled-Lasso, Square-root Lasso,
Concomitant Lasso estimation for instance, and could be of interest for
confidence sets or uncertainty quantification. In this work, after illustrating
numerical difficulties for the Concomitant Lasso formulation, we
propose a modification we coined Smoothed Concomitant Lasso, aimed at
increasing numerical stability. We propose an efficient and accurate solver
leading to a computational cost no more expensive than the one for the Lasso.
We leverage standard ingredients behind the success of fast Lasso solvers: a
coordinate descent algorithm, combined with safe screening rules to achieve
speed efficiency, by eliminating irrelevant features early.
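The joint optimization over the regression parameter and the noise level can be sketched as below. This is a minimal illustration, not the authors' solver. It assumes the objective ||y - X b||^2 / (2 n sigma) + sigma / 2 + lam * ||b||_1 with the constraint sigma >= sigma_0, which is our reading of the usual smoothed concomitant formulation and is not stated in the abstract. The function names and the parameters lam, sigma_0 and n_iter are hypothetical, and the safe screening rules are omitted.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding: the proximal operator of t * |.|."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def smoothed_concomitant_lasso(X, y, lam, sigma_0, n_iter=200):
    """Coordinate descent on the assumed objective
    ||y - X b||^2 / (2 n sigma) + sigma / 2 + lam * ||b||_1, sigma >= sigma_0.
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.astype(float).copy()  # residual = y - X @ beta
    col_sq_norms = (X ** 2).sum(axis=0)
    # For fixed beta, the optimal noise level is a clipped residual norm.
    sigma = max(sigma_0, np.linalg.norm(residual) / np.sqrt(n))
    for _ in range(n_iter):
        for j in range(p):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero column never enters the model
            old = beta[j]
            # Exact minimization over coordinate j with sigma held fixed.
            z = old + X[:, j] @ residual / col_sq_norms[j]
            beta[j] = soft_threshold(z, n * sigma * lam / col_sq_norms[j])
            if beta[j] != old:
                residual -= X[:, j] * (beta[j] - old)
            # Closed-form update of the noise level, floored at sigma_0.
            sigma = max(sigma_0, np.linalg.norm(residual) / np.sqrt(n))
    return beta, sigma
```

The lower bound sigma_0 keeps the 1 / sigma factor bounded when the residual gets close to zero, which is one plausible way to read the "numerical stability" motivation in the abstract.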
Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data
This paper addresses the learning task of estimating driver drowsiness from
the signals of car acceleration sensors. Since even drivers themselves cannot
perceive their own drowsiness in a timely manner unless they use burdensome
invasive sensors, obtaining labeled training data for each timestamp is not a
realistic goal. To deal with this difficulty, we formulate the task as a weakly
supervised learning problem. We only need to add labels for each complete trip, not for
every timestamp independently. By assuming that some aspects of driver
drowsiness increase over time due to tiredness, we formulate an algorithm that
can learn from such weakly labeled data. We derive a scalable stochastic
optimization method as a way of implementing the algorithm. Numerical
experiments on real driving datasets demonstrate the advantages of our
algorithm over baseline methods. Comment: Accepted by ICASSP202
Sparse Regression with Multi-type Regularized Feature Modeling
Within the statistical and machine learning literature, regularization
techniques are often used to construct sparse (predictive) models. Most
regularization strategies only work for data where all predictors are treated
identically, such as Lasso regression for (continuous) predictors treated as
linear effects. However, many predictive problems involve different types of
predictors and require a tailored regularization term. We propose a multi-type
Lasso penalty that acts on the objective function as a sum of subpenalties, one
for each type of predictor. As such, we allow for predictor selection and level
fusion within a predictor in a data-driven way, simultaneous with the parameter
estimation process. We develop a new estimation strategy for convex predictive
models with this multi-type penalty. Using the theory of proximal operators,
our estimation procedure is computationally efficient, partitioning the overall
optimization problem into easier to solve subproblems, specific for each
predictor type and its associated penalty. Earlier research applies
approximations to non-differentiable penalties to solve the optimization
problem. The proposed SMuRF algorithm removes the need for approximations and
achieves a higher accuracy and computational efficiency. This is demonstrated
with an extensive simulation study and a case study on
insurance pricing analytics.
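The idea of splitting the optimization into one proximal subproblem per predictor type can be sketched as follows. This is a hedged illustration, not the SMuRF algorithm itself. It assumes a least-squares loss, a single shared penalty weight lam, and only two penalty types (Lasso and Group Lasso). The names prox_lasso, prox_group_lasso, proximal_gradient and the blocks argument are hypothetical, and blocks is assumed to partition the coefficient indices.

```python
import numpy as np


def prox_lasso(v, t):
    """Proximal operator of t * ||v||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def prox_group_lasso(v, t):
    """Proximal operator of t * ||v||_2 (shrinks the whole block at once)."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v


def proximal_gradient(X, y, blocks, lam, n_iter=500):
    """Minimize (1 / 2n) * ||y - X b||^2 + lam * sum of block penalties.

    `blocks` is a list of (index_array, prox_function) pairs, one pair per
    predictor type, assumed to cover every coefficient exactly once.
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Step size 1 / L, where L = ||X||_2^2 / n bounds the gradient's
    # Lipschitz constant for the least-squares loss.
    step = n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        v = beta - step * grad
        # The penalty is separable across blocks, so its proximal operator
        # is applied block by block, each with its own rule.
        for idx, prox in blocks:
            beta[idx] = prox(v[idx], step * lam)
    return beta
```

Because each block only needs its own proximal operator, adding another predictor type amounts to supplying one more (index_array, prox_function) pair.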
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Subsequence clustering of multivariate time series is a useful tool for
discovering repeated patterns in temporal data. Once these patterns have been
discovered, seemingly complicated datasets can be interpreted as a temporal
sequence of only a small number of states, or clusters. For example, raw sensor
data from a fitness-tracking application can be expressed as a timeline of a
select few actions (e.g., walking, sitting, running). However, discovering
these patterns is challenging because it requires simultaneous segmentation and
clustering of the time series. Furthermore, interpreting the resulting clusters
is difficult, especially when the data is high-dimensional. Here we propose a
new method of model-based clustering, which we call Toeplitz Inverse
Covariance-based Clustering (TICC). Each cluster in the TICC method is defined
by a correlation network, or Markov random field (MRF), characterizing the
interdependencies between different observations in a typical subsequence of
that cluster. Based on this graphical representation, TICC simultaneously
segments and clusters the time series data. We solve the TICC problem through
alternating minimization, using a variation of the expectation maximization
(EM) algorithm. We derive closed-form solutions to efficiently solve the two
resulting subproblems in a scalable way, through dynamic programming and the
alternating direction method of multipliers (ADMM), respectively. We validate
our approach by comparing TICC to several state-of-the-art baselines in a
series of synthetic experiments, and we then demonstrate on an automobile
sensor dataset how TICC can be used to learn interpretable clusters in
real-world scenarios. Comment: This revised version fixes two small typos in the published versio
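The dynamic-programming half of the alternating minimization can be illustrated with a small sketch. It is not the TICC implementation. It assumes that a per-timestamp cost for each cluster (for example a negative log-likelihood) has already been computed, and that changing cluster between consecutive timestamps incurs a fixed non-negative penalty. The names assign_clusters, cost and switch_penalty are hypothetical.

```python
import numpy as np


def assign_clusters(cost, switch_penalty):
    """Find the cluster sequence minimizing
    sum_t cost[t, k_t] + switch_penalty * (number of cluster changes).

    cost: array of shape (T, K); switch_penalty: non-negative float.
    """
    T, K = cost.shape
    total = np.empty((T, K))          # best cost of a path ending in (t, k)
    back = np.zeros((T, K), dtype=int)  # predecessor cluster on that path
    total[0] = cost[0]
    for t in range(1, T):
        best_prev = int(np.argmin(total[t - 1]))
        for k in range(K):
            stay = total[t - 1, k]
            switch = total[t - 1, best_prev] + switch_penalty
            if stay <= switch:
                total[t, k] = stay + cost[t, k]
                back[t, k] = k
            else:
                total[t, k] = switch + cost[t, k]
                back[t, k] = best_prev
    # Trace the optimal path backwards from the cheapest final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmin(total[-1]))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With a penalty of zero each timestamp is assigned independently, while a large penalty favours long contiguous segments, which is how segmentation and clustering can be handled in a single pass.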