Tight Continuous Relaxation of the Balanced k-Cut Problem
Spectral Clustering, as a relaxation of the normalized/ratio cut, has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced k-cut of the graph, are based either on greedy techniques or on heuristics that have only a weak connection to the original motivation of minimizing the normalized cut. In this paper we propose a new tight continuous relaxation for any balanced k-cut problem and show that a related, recently proposed relaxation is in most cases loose, leading to poor performance in practice. For the optimization of our tight continuous relaxation we propose a new algorithm for the difficult sum-of-ratios minimization problem which achieves monotonic descent. Extensive comparisons show that our method outperforms all existing approaches for ratio cut and other balanced k-cut criteria.
Comment: Long version of paper accepted at NIPS 2014
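For reference, the objective being relaxed is the standard balanced k-cut in its sum-of-ratios form (a sketch of the usual definition; the paper's specific balancing functions may differ):

\min_{(C_1,\ldots,C_k)} \; \sum_{i=1}^{k} \frac{\mathrm{cut}(C_i,\overline{C_i})}{\hat{S}(C_i)}

where \hat{S}(C_i) = |C_i| yields the ratio cut and \hat{S}(C_i) = \mathrm{vol}(C_i) the normalized cut; this sum of ratios is the quantity the proposed monotonic-descent algorithm optimizes.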
Oqtans: a Galaxy-integrated workflow for quantitative transcriptome analysis from NGS data. From: Seventh International Society for Computational Biology (ISCB) Student Council Symposium 2011, Vienna, Austria, 15 July 2011.
First published by BioMed Central:
Schultheiss, Sebastian J.; Jean, Géraldine; Behr, Jonas; Bohnert, Regina; Drewe, Philipp; Görnitz, Nico; Kahles, André; Mudrakarta, Pramod; Sreedharan, Vipin T.; Zeller, Georg; Rätsch, Gunnar: Oqtans: a Galaxy-integrated workflow for quantitative transcriptome analysis from NGS Data - In: BMC Bioinformatics. - ISSN 1471-2105 (online). - 12 (2011), suppl. 11, art. A7. - doi:10.1186/1471-2105-12-S11-A7
Challenges in Modern Machine Learning: Multiresolution Structure, Model Understanding and Transfer Learning
Recent advances in Artificial Intelligence (AI) are characterized by ever-increasing dataset sizes and the reemergence of neural-network methods. The modern AI pipeline begins with building datasets, continues with designing and training machine-learning models, and concludes with deploying trained models in the real world. We tackle three important challenges of this era, one from each part of the pipeline: 1) efficiently manipulating large matrices arising in real-world datasets (e.g., graph Laplacians from social-network datasets), 2) interpreting deep-neural-network models, and 3) efficiently deploying hundreds of deep-neural-network models on embedded devices.
Matrices arising in large, real-world datasets often have high rank, rendering common matrix-manipulation approaches based on the low-rank assumption (e.g., SVD) ineffective. In the first part of this thesis, we build upon Multiresolution Matrix Factorization (MMF), a method originally proposed to perform multiresolution analysis on discrete spaces, which can consequently model hierarchical structure in symmetric matrices as a matrix factorization. We describe a parallel algorithm for computing the factorization that scales to matrices with a million rows and columns. We then showcase an application of MMF: a preconditioner that accelerates iterative algorithms for solving systems of linear equations. Among wavelet-based preconditioners, the MMF preconditioner consistently yields faster convergence and is highly scalable. Finally, we propose approaches to extend MMF to asymmetric matrices and evaluate them in the context of matrix compression.
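As a sketch of the factorization's shape (following the original MMF formulation; the number of levels L and the exact form of the rotations are assumptions of this sketch), a symmetric matrix A is approximated as

A \approx Q_1^{\top} Q_2^{\top} \cdots Q_L^{\top} \, H \, Q_L \cdots Q_2 Q_1

where each Q_l is a sparse orthogonal rotation touching only a few rows and columns, and H is close to diagonal. Because every factor is sparse and orthogonal, the factorization can be applied or inverted in time proportional to its number of nonzeros, which is what makes an MMF-based preconditioner cheap to use inside iterative solvers.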
In the second part of the thesis, we address the black-box nature of deep-neural-network models. The quality of a deep-neural-network model is typically measured by its test accuracy. We argue that this is an incomplete measure and show that state-of-the-art question-answering models often ignore important question terms. We perform a case study of a question-answering model and expose various ways in which the network gets the right answer for the wrong reasons. We propose a human-in-the-loop workflow based on the notion of "attribution" (word importance) to understand the input-output behavior of neural-network models, extract rules, identify weaknesses, and construct adversarial attacks that exploit those weaknesses. Our strongest attacks drop the accuracy of a visual question-answering model from 61.1% to 19%, and that of a tabular question-answering model from 33.5% to 3.3%. We propose a measure of overstability: the tendency of a model to rely on trigger logic and ignore semantics. We use a path-sensitive attribution method to extract contextual synonyms (rules) learned by a model. We discuss how attributions can augment standard measures of accuracy and empower investigation of model performance. We finish by identifying opportunities for research: abstraction tools that aid the debugging process, concepts and semantics of path-sensitive dataflow analysis, and formalizing the process of verifying natural-language specifications.
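To make "attribution" concrete, here is a minimal NumPy sketch of integrated-gradients-style importance scores on a toy model. The model, its weights, the zero baseline, and the choice of integrated gradients as the attribution method are all illustrative assumptions, not the thesis's exact setup.

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=8)  # weights of a toy model (hypothetical)

def score(x):
    # Toy differentiable "model": sigmoid(w . x), a stand-in for a QA network.
    return 1.0 / (1.0 + np.exp(-w @ x))

def grad_score(x):
    # Analytic gradient of sigmoid(w . x) with respect to x.
    s = score(x)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, steps=100):
    # Riemann-sum approximation: (x - baseline) times the average gradient
    # along the straight-line path from baseline to x.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [grad_score(baseline + a * (x - baseline)) for a in alphas]
    return (x - baseline) * np.mean(grads, axis=0)

x = rng.normal(size=8)        # "input embedding" (e.g., one word's features)
baseline = np.zeros_like(x)   # all-zero baseline, a common choice
attr = integrated_gradients(x, baseline)

# Sanity check (completeness): attributions sum to the score difference.
print(attr.sum(), score(x) - score(baseline))

Per-word scores of this kind are what let an analyst see which question terms the network actually relied on, and which it ignored.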
The third challenge pertains to real-world deployment of deep-neural-network models. With the proliferation of personal devices such as phones and smart assistants, much of human-AI interaction has shifted away from the cloud. While this has critical advantages, such as user privacy and faster response times, the limited memory available on these devices makes deploying hundreds of models impractical as the space of deep-learning-based applications expands. We tackle the problem of re-purposing trained deep-neural-network models for new tasks while keeping most of the learned weights intact. Our method introduces the concept of a "model patch", a set of small, trainable layers that can be applied to an existing trained model to adapt it to a new task. While keeping more than 98% of the weights intact, we show significantly higher transfer-learning performance from an object-detection task to an image-classification task compared to traditional last-layer fine-tuning, among other results. We show how the model-patch idea can be used in multitask learning, where, despite using significantly fewer parameters, we incur zero accuracy loss compared to single-task performance for all the involved tasks.
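A minimal PyTorch sketch of the freezing pattern this describes. The backbone (torchvision's MobileNetV2), the choice of normalization scale/bias parameters plus a fresh head as the "patch", and the 10-class target task are illustrative assumptions, not the thesis's exact configuration.

import torch
import torch.nn as nn
from torchvision import models

# Load a trained backbone (illustrative choice; any trained model works).
model = models.mobilenet_v2(weights="IMAGENET1K_V1")

# Freeze every weight in the trained model.
for p in model.parameters():
    p.requires_grad = False

# "Model patch": re-enable training only for the small normalization
# (scale/bias) layers, and attach a fresh head for the new task.
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.weight.requires_grad = True
        m.bias.requires_grad = True

num_new_classes = 10  # hypothetical target task
model.classifier[1] = nn.Linear(model.last_channel, num_new_classes)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.2%}")  # a small fraction

Because only the patch and the new head are trainable, one backbone's weights can be shared across many tasks on-device, with each additional task costing only the patch's parameters.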