
    Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows

    We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph with few connected components; by exploiting prior knowledge, one can indeed improve the prediction performance or obtain results that are easier to interpret. Regularization or penalty functions for selecting features in graphs have recently been proposed, but they raise new algorithmic challenges. For example, they typically require solving a combinatorially hard selection problem among all connected subgraphs. In this paper, we propose computationally feasible strategies to select a sparse and well-connected subset of features sitting on a directed acyclic graph (DAG). We introduce structured sparsity penalties over paths on a DAG, called "path coding" penalties. Unlike existing regularization functions that model long-range interactions between features in a graph, path coding penalties are computationally tractable. The penalties and their proximal operators involve path selection problems, which we solve efficiently by leveraging network flow optimization. We experimentally show on synthetic, image, and genomic data that our approach is scalable and leads to more connected subgraphs than other regularization functions for graphs. Comment: 37 pages; to appear in the Journal of Machine Learning Research (JMLR).
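
    The penalties above are optimized with proximal methods: each iteration takes a gradient step on the loss followed by a proximal step on the penalty. The toy sketch below is an illustration under simplifying assumptions (least-squares loss, plain l1 soft-thresholding as the prox), not the paper's method; in the paper, the proximal operator of the path coding penalty is itself a path selection problem solved with network flows.

        import numpy as np

        def soft_threshold(z, t):
            # Elementwise soft-thresholding: proximal operator of the l1 norm.
            return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

        def proximal_gradient(X, y, lam, n_iter=200):
            # ISTA for 0.5 * ||y - X w||^2 + lam * penalty(w).
            # The prox here is l1 soft-thresholding; the paper's path coding
            # penalty would replace this single step with a network-flow
            # computation over paths of the DAG.
            n, p = X.shape
            w = np.zeros(p)
            step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
            for _ in range(n_iter):
                grad = X.T @ (X @ w - y)
                w = soft_threshold(w - step * grad, step * lam)
            return w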

    A Sparse Optimization Method for Distributed Hydrology Information Monitoring System

    Sparse optimization is an effective approach to information detection and spectrum sensing in hydrology-monitoring sensor networks, and it lies at the research frontier of water-data collection and transmission both at home and abroad. Combining sparse optimization with distributed hydrology information monitoring, and building on a theoretical framework of sparse optimization for distributed water-level monitoring, this paper puts forward a distributed sparse-optimization monitoring method for water-environment information. On one hand, this extends the application of sparse optimization to distributed networks; on the other hand, it presents a reconstruction method for joint sparse signals based on the block coordinate descent method. Simulation experiments show that the proposed method converges quickly to an approximately optimal solution and is robust to computational errors caused by inaccurate averaging and other uncertain factors in the network.
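
    As a rough illustration of the joint sparse reconstruction idea, the sketch below runs block coordinate descent on a multiple-measurement-vector model in which the monitoring nodes share a common row-sparse coefficient matrix; the model, penalty, and variable names are assumptions made for this example rather than details taken from the paper.

        import numpy as np

        def joint_sparse_bcd(A, Y, lam, n_iter=100):
            # Block coordinate descent for a joint-sparsity model Y ~ A X,
            # minimizing 0.5 * ||Y - A X||_F^2 + lam * sum_i ||X[i, :]||_2,
            # where each column of Y holds one node's measurements and X is
            # row-sparse. Each block is one row of X, updated in closed form
            # by a group soft-threshold (columns of A assumed nonzero).
            m, p = A.shape
            X = np.zeros((p, Y.shape[1]))
            R = Y - A @ X                        # current residual
            col_norms = np.sum(A ** 2, axis=0)
            for _ in range(n_iter):
                for i in range(p):
                    a = A[:, i]
                    R += np.outer(a, X[i])       # remove row i's contribution
                    z = a @ R
                    zn = np.linalg.norm(z)
                    X[i] = z * max(0.0, 1.0 - lam / zn) / col_norms[i] if zn > 0 else 0.0
                    R -= np.outer(a, X[i])       # put the updated row back
            return X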

    Efficient RNA Isoform Identification and Quantification from RNA-Seq Data with Network Flows

    Several state-of-the-art methods for isoform identification and quantification are based on l1-regularized regression, such as the Lasso. However, explicitly listing the possibly exponentially large set of candidate transcripts is intractable for genes with many exons. For this reason, existing approaches using the l1 penalty are either restricted to genes with few exons or only run the regression algorithm on a small set of pre-selected isoforms. We introduce a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization. Our technique removes the need for a pre-selection step, leading to better isoform identification while keeping a low computational cost. Experiments with synthetic and real RNA-Seq data confirm that our approach is more accurate than alternative methods and one of the fastest available. Source code is freely available as an R package from the Bioconductor web site (http://www.bioconductor.org/), and more information is available at http://cbio.ensmp.fr/flipflop.
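
    For context, the sketch below shows the baseline that FlipFlop improves on: l1-regularized (Lasso) regression over an explicitly enumerated set of candidate isoforms. The tiny exon-by-isoform design matrix and read counts are invented for illustration; FlipFlop itself avoids this enumeration through network flows.

        import numpy as np
        from sklearn.linear_model import Lasso

        # Rows are exons (or exon bins), columns are candidate isoforms;
        # an entry is 1 if the isoform contains the exon. The observed
        # vector y holds read coverage per exon. All numbers are made up.
        design = np.array([
            [1, 1, 0],
            [1, 0, 1],
            [0, 1, 1],
            [1, 1, 1],
        ], dtype=float)
        y = np.array([30.0, 10.0, 25.0, 40.0])

        # Sparse, nonnegative abundance estimates via the Lasso.
        model = Lasso(alpha=0.5, positive=True, fit_intercept=False)
        model.fit(design, y)
        print("estimated abundance per candidate isoform:", model.coef_)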

    Convex Relaxation for Combinatorial Penalties

    In this paper, we propose a unifying view of several recently proposed structured sparsity-inducing norms. We consider the situation of a model simultaneously (a) penalized by a set-function defined on the support of the unknown parameter vector, which represents prior knowledge on supports, and (b) regularized by an Lp-norm. We show that the natural combinatorial optimization problems obtained may be relaxed into convex optimization problems, and we introduce a notion, the lower combinatorial envelope of a set-function, that characterizes the tightness of our relaxations. We moreover establish links with norms based on latent representations, including the latent group Lasso and block-coding, and with norms obtained from submodular functions. Comment: 35 pages.
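
    As one concrete instance of the submodular case mentioned above, a nondecreasing submodular set-function F with F(∅) = 0 induces a norm whose value at w can be evaluated as the Lovász extension of F at |w| via the standard greedy algorithm. The sketch below is illustrative only: the coverage-style set-function and the vector w are made up, not taken from the paper.

        import numpy as np

        def lovasz_extension(F, x):
            # Greedy evaluation of the Lovász extension of a set-function F
            # (with F(empty set) = 0) at a nonnegative vector x: sort the
            # coordinates in decreasing order and accumulate marginal gains
            # of F weighted by the coordinate values.
            order = np.argsort(-x)
            value, prev, chosen = 0.0, 0.0, []
            for j in order:
                chosen.append(j)
                curr = F(frozenset(chosen))
                value += x[j] * (curr - prev)
                prev = curr
            return value

        # Example: a coverage-style nondecreasing submodular function that
        # counts how many predefined groups a support intersects.
        groups = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})]
        F = lambda S: sum(1.0 for g in groups if g & S)

        w = np.array([0.7, 0.0, -1.2, 0.3])
        print("Omega(w) = f(|w|) =", lovasz_extension(F, np.abs(w)))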