
74 research outputs found

A multi-species functional embedding integrating sequence and network structure

Author: Cannistra Anthony
Crovella Mark
Fan Jason
Fried Inbar
Hescott Benjamin
Leiserson Mark D. M.
Lim Tim
Schaffner Thomas
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 04/12/2017

A key challenge to transferring knowledge between species is that different species have fundamentally different genetic architectures. Initial computational approaches to transfer knowledge across species have relied on measures of heredity such as genetic homology, but these approaches suffer from limitations. First, only a small subset of genes have homologs, limiting the amount of knowledge that can be transferred, and second, genes change or repurpose functions, complicating the transfer of knowledge. Many approaches address this problem by expanding the notion of homology by leveraging high-throughput genomic and proteomic measurements, such as through network alignment. In this work, we take a new approach to transferring knowledge across species by expanding the notion of homology through explicit measures of functional similarity between proteins in different species. Specifically, our kernel-based method, HANDL (Homology Assessment across Networks using Diffusion and Landmarks), integrates sequence and network structure to create a functional embedding in which proteins from different species are embedded in the same vector space. We show that inner products in this space and the vectors themselves capture functional similarity across species, and are useful for a variety of functional tasks. We present the first whole-genome method for predicting phenologs, generating many that were previously identified, but also predicting new phenologs supported by the biological literature. We also demonstrate that the HANDL embedding captures pairwise gene function, in that gene pairs with synthetic lethal interactions are significantly separated in HANDL space, and the direction of separation is conserved across species. Software for the HANDL algorithm is available at http://bit.ly/lrgr-handl.

Crossref

Boston University Institutional Repository (OpenBU)

Functional protein representations from biological networks enable diverse cross-species inference

Author: Cannistra Anthony
Crovella Mark
Fan Jason
Fried Inbar
Hescott Benjamin
Leiserson Mark D.M.
Lim Tim
Schaffner Thomas
Publication venue: Oxford
Publication date: 01/01/2019

Partial funding for Open Access provided by the UMD Libraries' Open Access Publishing Fund.

Transferring knowledge between species is key for many biological applications, but is complicated by divergent and convergent evolution. Many current approaches for this problem leverage sequence and interaction network data to transfer knowledge across species, exemplified by network alignment methods. While these techniques do well, they are limited in scope, creating metrics to address one specific problem or task. We take a different approach by creating an environment where multiple knowledge transfer tasks can be performed using the same protein representations. Specifically, our kernel-based method, MUNK, integrates sequence and network structure to create functional protein representations, embedding proteins from different species in the same vector space. First, we show that proteins in different species that are close in MUNK space are functionally similar. Next, we use these representations to share knowledge of synthetic lethal interactions between species. Importantly, we find that the results using MUNK representations are at least as accurate as existing algorithms for these tasks. Finally, we generalize the notion of a phenolog (‘orthologous phenotype’) to use functionally similar proteins (i.e. those with similar representations). We demonstrate the utility of this broadened notion by using it to identify known phenologs and novel non-obvious ones supported by current research.

Boston University Institutional Repository (OpenBU)

Digital Repository at the University of Maryland

Inducing and exploiting activation sparsity for fast neural network inference

Author: Alistarh Dan-Adrian
Carr John
Gelashvili Rati
Goin Michael
Kopinsky Justin
Kurtz Mark
Leiserson William
Matveev Alexander
Moore Sage
Nell Bill
Shavit Nir
Publication date: 01/01/2020

Optimizing convolutional neural networks for fast inference has recently become an extremely active area of research. One of the go-to solutions in this context is weight pruning, which aims to reduce computational and memory footprint by removing large subsets of the connections in a neural network. Surprisingly, much less attention has been given to exploiting sparsity in the activation maps, which tend to be naturally sparse in many settings thanks to the structure of rectified linear (ReLU) activation functions. In this paper, we present an in-depth analysis of methods for maximizing the sparsity of the activations in a trained neural network, and show that, when coupled with an efficient sparse-input convolution algorithm, we can leverage this sparsity for significant performance gains. To induce highly sparse activation maps without accuracy loss, we introduce a new regularization technique, coupled with a new threshold-based sparsification method based on a parameterized activation function called Forced-Activation-Threshold Rectified Linear Unit (FATReLU). We examine the impact of our methods on popular image classification models, showing that most architectures can adapt to significantly sparser activation maps without any accuracy loss. Our second contribution is showing that these compression gains can be translated into inference speedups: we provide a new algorithm to enable fast convolution operations over networks with sparse activations, and show that it can enable significant speedups for end-to-end inference on a range of popular models on the large-scale ImageNet image classification task on modern Intel CPUs, with little or no retraining cost.
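
The threshold-based sparsification mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the exact form of FATReLU, so the code below assumes it passes an input through unchanged when it is at or above a threshold T and outputs zero otherwise, with T = 0 behaving like ordinary ReLU. The function names `fatrelu` and `sparsity`, the sample activations, and the threshold value 0.2 are all hypothetical and chosen only for illustration.

```python
def fatrelu(x, threshold):
    """Assumed FATReLU form: keep x if it reaches the threshold, else output 0."""
    return x if x >= threshold else 0.0


def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


# Hypothetical pre-activation values for one small activation map.
activations = [-1.2, 0.05, 0.3, 0.0, 2.0, 0.15, -0.4, 0.8]

# Threshold 0 reproduces plain ReLU on these inputs.
relu_out = [fatrelu(v, 0.0) for v in activations]

# A positive threshold also zeroes the small positive values.
fat_out = [fatrelu(v, 0.2) for v in activations]

print("ReLU sparsity:   ", sparsity(relu_out))
print("FATReLU sparsity:", sparsity(fat_out))
```

Raising the threshold trades small activations for a larger share of exact zeros, which is the property a sparse-input convolution routine could then exploit. Whether accuracy survives a given threshold is an empirical question that this sketch does not address.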

IST Austria: PubRep (Institute of Science and Technology)

Genecentric: a package to uncover graph-theoretic structure in high-throughput epistasis data

Author: Andrew Gallant
Benjamin J Hescott
Lenore J Cowen
Mark DM Leiserson
Maxim Kachalov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Springer - Publisher Connector

Integrated analysis of germline and somatic variants in ovarian cancer

Author: Ding Li
Druley Todd E.
Fulton Robert S.
Goodfellow Paul J.
Graubert Timothy A.
Johnson Kimberly J.
Kanchi Krishna L.
Kandoth Cyriac
Koboldt Daniel C.
Larson David E.
Leiserson Mark D.M.
Lu Charles
Mardis Elaine R.
McLellan Michael D.
McMichael Joshua F.
Miller Christopher A.
Raphael Benjamin J.
Schmidt Heather K.
Spellman Paul T.
Wendl Michael C.
Wilson Richard K.
Wyczalkowski Matthew A.
Xie Mingchao
Zhang Qunyuan
Publication venue: Digital Commons@Becker
Publication date: 01/01/2014

We report the first large-scale exome-wide analysis of the combined germline-somatic landscape in ovarian cancer. Here we analyze germline and somatic alterations in 429 ovarian carcinoma cases and 557 controls. We identify 3,635 high confidence, rare truncation and 22,953 missense variants with predicted functional impact. We find germline truncation variants and large deletions across Fanconi pathway genes in 20% of cases. Enrichment of rare truncations is shown in BRCA1, BRCA2, and PALB2. Additionally, we observe germline truncation variants in genes not previously associated with ovarian cancer susceptibility (NF1, MAP3K4, CDKN2B, and MLL3). Evidence for loss of heterozygosity was found in 100% and 76% of cases with germline BRCA1 and BRCA2 truncations, respectively. Germline-somatic interaction analysis combined with extensive bioinformatics annotation identifies 237 candidate functional germline truncation and missense variants, including 2 pathogenic BRCA1 and 1 TP53 deleterious variants. Finally, integrated analyses of germline and somatic variants identify significantly altered pathways, including the Fanconi, MAPK, and MLL pathways.

Crossref

Digital Commons@Becker

PubMed Central

The Integrated Genomic Landscape of Thymic Epithelial Tumors

Author: Akbani Rehan
Anur Pavana
Bowlby Reanne
Cherniack Andrew D.
Danilova Ludmila
Felau Ina
Ha Gavin
Hayes D. Neil
Hegde Apurva M.
Hoadley Katherine A.
Jo Heejoon
Kim Jaegil
Leiserson Mark D. M.
Leraas Kristen
Lichtenberg Tara
Lu Charles
Matthew Thomas
McLellan Mike
Pickering Curtis R.
Radovich Milan
Raphael Ben
Robertson Gordon
Ryan Michael
Sethi Geetika
Spellman Paul
Su Xiaoping
Weinstein John N.
Zenklusen Jean Claude
Zhang Hailei
Zhang Jiexin
Publication venue: 'Elsevier BV'
Publication date: 01/02/2018

Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most predominant, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we identify four subtypes of these tumors defined by genomic hallmarks and an association with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects on multi-platform analysis. We further observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens, and increased aneuploidy.

IUPUIScholarWorks