14 research outputs found
Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows
We consider supervised learning problems where the features are embedded in a
graph, such as gene expressions in a gene network. In this context, it is of
much interest to automatically select a subgraph with few connected components;
by exploiting prior knowledge, one can indeed improve the prediction
performance or obtain results that are easier to interpret. Regularization or
penalty functions for selecting features in graphs have recently been proposed,
but they raise new algorithmic challenges. For example, they typically require
solving a combinatorially hard selection problem among all connected subgraphs.
In this paper, we propose computationally feasible strategies to select a
sparse and well-connected subset of features sitting on a directed acyclic
graph (DAG). We introduce structured sparsity penalties over paths on a DAG
called "path coding" penalties. Unlike existing regularization functions that
model long-range interactions between features in a graph, path coding
penalties are tractable. The penalties and their proximal operators involve
path selection problems, which we efficiently solve by leveraging network flow
optimization. We experimentally show on synthetic, image, and genomic data that
our approach is scalable and leads to more connected subgraphs than other
regularization functions for graphs.
Comment: 37 pages; to appear in the Journal of Machine Learning Research (JMLR).
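The key point of the abstract above is that restricting attention to paths on a DAG turns an otherwise combinatorial selection problem into one solvable by efficient graph algorithms. The following toy sketch (not the paper's actual penalty or proximal operator; the graph, costs, and function names are illustrative assumptions) shows why DAG structure helps: the cheapest source-to-sink path under nonnegative node costs can be found by a single dynamic-programming sweep over a topological order.

```python
# Toy illustration (not the paper's algorithm): cheapest source-to-sink path
# in a DAG via dynamic programming over a topological order. Path coding
# penalties reduce related path-selection subproblems to shortest-path /
# network-flow computations, which is what makes them tractable.

def cheapest_path(nodes, edges, cost):
    """nodes: list in topological order (first = source, last = sink);
    edges: dict node -> list of successors;
    cost: dict node -> nonnegative node cost.
    Returns (total cost, path) from source to sink."""
    best = {n: (float("inf"), None) for n in nodes}  # (cost so far, predecessor)
    best[nodes[0]] = (cost[nodes[0]], None)
    for u in nodes:
        cu, _ = best[u]
        for v in edges.get(u, []):
            cand = cu + cost[v]
            if cand < best[v][0]:
                best[v] = (cand, u)
    # Reconstruct the optimal path by walking predecessors back from the sink.
    path, n = [], nodes[-1]
    while n is not None:
        path.append(n)
        n = best[n][1]
    return best[nodes[-1]][0], path[::-1]

nodes = ["s", "a", "b", "t"]
edges = {"s": ["a", "b"], "a": ["t"], "b": ["t"]}
cost = {"s": 0, "a": 2, "b": 1, "t": 0}
print(cheapest_path(nodes, edges, cost))  # → (1, ['s', 'b', 't'])
```

The same one-pass structure underlies proximal steps for path-based penalties: optimizing over exponentially many paths costs only linear time in the number of edges.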
A Sparse Optimization Method for Distributed Hydrology Information Monitoring System
Sparse optimization is an effective technique for information detection and spectrum sensing in hydrological monitoring sensor networks, and it lies at the frontier of water-data collection and transmission research worldwide. Combining sparse optimization with distributed hydrology information monitoring, and building on a theoretical framework for sparse optimization of distributed water-level monitoring, this paper proposes a distributed sparse-optimization method for monitoring water-environment information. On one hand, this extends the application of sparse optimization to distributed networks; on the other, it introduces a reconstruction method for jointly sparse signals based on block coordinate descent. Simulation experiments show that the proposed method converges quickly to an approximately optimal solution and is robust to computation errors caused by inaccurate averaging and other uncertain factors in the network.
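The abstract above relies on coordinate-descent-style updates for sparse signal reconstruction. As a simplified, single-signal stand-in for the paper's block coordinate descent method (the data, regularization weight, and function names below are illustrative assumptions, not the authors' setup), here is a standard coordinate-descent solver for the l1-regularized least-squares problem:

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(A, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - A x||^2 + lam * ||x||_1.
    Each sweep updates one coordinate at a time to its exact minimizer."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)   # per-column squared norms
    r = y - A @ x                   # running residual
    for _ in range(n_iter):
        for j in range(n):
            r += A[:, j] * x[j]                      # remove coordinate j
            x[j] = soft(A[:, j] @ r, lam) / col_sq[j]  # 1-D sparse update
            r -= A[:, j] * x[j]                      # add it back
    return x

# Small demo: recover a 2-sparse vector from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[3], x_true[7] = 1.5, -2.0
x_hat = lasso_cd(A, y=A @ x_true, lam=0.1)
```

The block variant in the paper updates whole groups of coordinates jointly, which is what couples the reconstructions of signals shared across sensors.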
Efficient RNA Isoform Identification and Quantification from RNA-Seq Data with Network Flows
Several state-of-the-art methods for isoform identification and quantification are based on l1-regularized regression, such as the Lasso. However, explicitly listing the (possibly exponentially) large set of candidate transcripts is intractable for genes with many exons. For this reason, existing approaches using the l1 penalty are either restricted to genes with few exons, or only run the regression algorithm on a small set of pre-selected isoforms. We introduce a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization. Our technique removes the need for a preselection step, leading to better isoform identification while keeping a low computational cost. Experiments with synthetic and real RNA-Seq data confirm that our approach is more accurate than alternative methods and among the fastest available. Source code is freely available as an R package from the Bioconductor web site (http://www.bioconductor.org/), and more information is available at http://cbio.ensmp.fr/flipflop
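The intractability claim in the abstract above comes from candidate transcripts corresponding to paths in a splicing graph, and the number of paths can grow exponentially with the number of alternative exons. A tiny sketch (the graph and function name are illustrative assumptions, not the FlipFlop representation) makes the blow-up concrete: each layer of two alternative nodes doubles the path count.

```python
# Toy illustration of the candidate-enumeration blow-up that FlipFlop avoids:
# every source-to-sink path in a splicing-like DAG is one candidate transcript.

def all_paths(graph, s, t):
    """Enumerate all paths from s to t in a DAG given as node -> successors."""
    if s == t:
        return [[t]]
    return [[s] + p for nxt in graph.get(s, []) for p in all_paths(graph, nxt, t)]

# Two layers of two alternative "exons" each: 2 * 2 = 4 candidate paths.
graph = {"s": ["a1", "a2"],
         "a1": ["b1", "b2"], "a2": ["b1", "b2"],
         "b1": ["t"], "b2": ["t"]}
print(len(all_paths(graph, "s", "t")))  # → 4
```

With k such layers the count is 2**k, which is why reformulating the Lasso over paths as a network flow problem, rather than enumerating candidates, is the key idea.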
Convex Relaxation for Combinatorial Penalties
In this paper, we propose a unifying view of several recently proposed structured sparsity-inducing norms. We consider the situation of a model simultaneously (a) penalized by a set-function defined on the support of the unknown parameter vector, which represents prior knowledge on supports, and (b) regularized in Lp-norm. We show that the natural combinatorial optimization problems obtained may be relaxed into convex optimization problems, and we introduce a notion, the lower combinatorial envelope of a set-function, that characterizes the tightness of our relaxations. We moreover establish links with norms based on latent representations, including the latent group Lasso and block-coding, and with norms obtained from submodular functions.
Comment: 35 pages