402 research outputs found

Scalable learning of interpretable rules for the dynamic microbiome domain [preprint]

Author: Bucci Vanni
Gerber Georg K.
Maringanti Venkata Suhas
Publication venue: eScholarship@UMassChan
Publication date: 28/06/2020

The microbiome, which is inherently dynamic, plays essential roles in human physiology and its disruption has been implicated in numerous human diseases. Linking dynamic changes in the microbiome to the status of the human host is an important problem, which is complicated by limitations and complexities of the data. Model interpretability is key in the microbiome field, as practitioners seek to derive testable biological hypotheses from data or develop diagnostic tests that can be understood by clinicians. Interpretable structure must take into account domain-specific information key to biologists and clinicians, including evolutionary relationships (phylogeny) and dynamic behavior of the microbiome. A Bayesian model was previously developed in the field, which uses Markov Chain Monte Carlo inference to learn human-interpretable rules for classifying the status of the human host based on microbiome time-series data, but that approach is not scalable to the increasingly large microbiome datasets being produced. We present a new fully-differentiable model that also learns human-interpretable rules for the same classification task, but in an end-to-end gradient-descent-based framework. We validate the performance of our model on human microbiome data sets and demonstrate that our approach has similar predictive performance to the fully Bayesian method, while running orders of magnitude faster and moreover learning a larger set of rules, thus providing additional biological insight into the effects of diet and environment on the microbiome.

eScholarship@UMMS

Hierarchical Dirichlet Process-Based Models For Discovery of Cross-species Mammalian Gene Expression

Author: Dowell Robin D.
Gerber Georg K.
Gifford David K.
Jaakkola Tommi S.
Publication date: 06/07/2007

An important research problem in computational biology is the identification of expression programs, sets of co-activated genes orchestrating physiological processes, and the characterization of the functional breadth of these programs. The use of mammalian expression data compendia for discovery of such programs presents several challenges, including: 1) cellular inhomogeneity within samples, 2) genetic and environmental variation across samples, and 3) uncertainty in the numbers of programs and sample populations. We developed GeneProgram, a new unsupervised computational framework that uses expression data to simultaneously organize genes into overlapping programs and tissues into groups to produce maps of inter-species expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Our method addresses each of the above challenges by using a probabilistic model that: 1) allocates mRNA to different expression programs that may be shared across tissues, 2) is hierarchical, treating each tissue as a sample from a population of related tissues, and 3) uses Dirichlet Processes, a non-parametric Bayesian method that provides prior distributions over numbers of sets while penalizing model complexity. Using real gene expression data, we show that GeneProgram outperforms several popular expression analysis methods in recovering biologically interpretable gene sets. From a large compendium of mouse and human expression data, GeneProgram discovers 19 tissue groups and 100 expression programs active in mammalian tissues. Our method automatically constructs a comprehensive, body-wide map of expression programs and characterizes their functional generality. This map can be used for guiding future biological experiments, such as discovery of genes for new drug targets that exhibit minimal "cross-talk" with unintended organs, or genes that maintain general physiological responses that go awry in disease states.
Further, our method is general, and can be applied readily to novel compendia of biological data

DSpace@MIT

Automated Discovery of Functional Generality of Human Gene Expression Programs

Author: Arend Sidow
David K Gifford
Georg K Gerber
Robin D Dowell
Tommi S Jaakkola
Publication venue: Public Library of Science
Publication date: 01/01/2007

An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-κB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. 
We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal “cross-talk,” and genes from high generality programs may maintain common physiological responses that go awry in disease states. Further, our method is multipurpose, and can be applied readily to novel compendia of biological data

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Dynamics of the Microbiota in Response to Host Infection

Author: Belavusava Vera
Belzer Clara
Bry Lynn
Cavanaugh Colleen
Delaney Mary
DuBois Andrea
Gerber Georg K.
Houseman Andres
Liu Qing
Onderdonk Andrew
Roeselers Guus
Yeliseyev Vladimir
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2014

Longitudinal studies of the microbiota are important for discovering changes in microbial communities that affect the host. The complexity of these ecosystems requires rigorous integrated experimental and computational methods to identify temporal signatures that promote physiologic or pathophysiologic responses in vivo. Employing a murine model of infectious colitis with the pathogen Citrobacter rodentium, we generated a 2-month time-series of 16S rDNA gene profiles, and quantitatively cultured commensals, from multiple intestinal sites in infected and uninfected mice. We developed a computational framework to discover time-varying signatures for individual taxa, and to automatically group signatures to identify microbial sub-communities within the larger gut ecosystem that demonstrate common behaviors. Application of this model to the 16S rDNA dataset revealed dynamic alterations in the microbiota at multiple levels of resolution, from effects on systems-level metrics to changes across anatomic sites for individual taxa and species. These analyses revealed unique, time-dependent microbial signatures associated with host responses at different stages of colitis. Signatures included a Mucispirillum OTU associated with early disruption of the colonic surface mucus layer, prior to the onset of symptomatic colitis, and members of the Clostridiales and Lactobacillales that increased with successful resolution of inflammation, after clearance of the pathogen. Quantitative culture data validated findings for predominant species, further refining and strengthening model predictions. These findings provide new insights into the complex behaviors found within host ecosystems, and define several time-dependent microbial signatures that may be leveraged in studies of other infectious or inflammatory conditions

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

The Francis Crick Institute

Towards a Visual-Language Foundation Model for Computational Pathology

Author: Chen Bowen
Chen Richard J.
Ding Tong
Gerber Georg
Jaume Guillaume
Le Long Phi
Liang Ivy
Lu Ming Y.
Mahmood Faisal
Odintsov Igor
Parwani Anil V
Williamson Drew F. K.
Zhang Andrew
Publication date: 25/07/2023

The accelerated adoption of digital pathology and advances in deep learning have enabled the development of powerful models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain and the model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text, and notably over 1.17 million image-caption pairs via task-agnostic pretraining. Evaluated on a suite of 13 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving either or both histopathology images and text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning

arXiv.org e-Print Archive

AI-driven Discovery of Morphomolecular Signatures in Toxicology.

Author: Chen Richard J
De Brot Simone
Gerber Georg
Jaume Guillaume
Le Long Phi
Mahmood Faisal
Oldenburg Lukas
Peeters Thomas
Pettit Rowland
Song Andrew H
Thiran Jean-Philippe
Vaidya Anurag
Williamson Drew F K
Publication venue: Cold Spring Harbor Laboratory
Publication date: 23/07/2024

Early identification of drug toxicity is essential yet challenging in drug development. At the preclinical stage, toxicity is assessed with histopathological examination of tissue sections from animal models to detect morphological lesions. To complement this analysis, toxicogenomics is increasingly employed to understand the mechanism of action of the compound and ultimately identify lesion-specific safety biomarkers for which in vitro assays can be designed. However, existing works that aim to identify morphological correlates of expression changes rely on qualitative or semi-quantitative morphological characterization and remain limited in scale or morphological diversity. Artificial intelligence (AI) offers a promising approach for quantitatively modeling this relationship at an unprecedented scale. Here, we introduce GEESE, an AI model designed to impute morphomolecular signatures in toxicology data. Our model was trained to predict 1,536 gene targets on a cohort of 8,231 hematoxylin and eosin-stained liver sections from Rattus norvegicus across 127 preclinical toxicity studies. The model, evaluated on 2,002 tissue sections from 29 held-out studies, can yield pseudo-spatially resolved gene expression maps, which we correlate with six key drug-induced liver injuries (DILI). From the resulting 25 million lesion-expression pairs, we established quantitative relations between up and downregulated genes and lesions. Validation of these signatures against toxicogenomic databases, pathway enrichment analyses, and human hepatocyte cell lines asserted their relevance. Overall, our study introduces new methods for characterizing toxicity at an unprecedented scale and granularity, paving the way for AI-driven discovery of toxicity biomarkers

Bern Open Repository and Information System (BORIS)

MDSINE: Microbial Dynamical Systems INference Engine for microbiome time-series analyses

Author: Bogart Elijah
Bry Lynn
Bucci Vanni
Delaney Mary L.
Deng Luxue
Gerber Georg K.
Honda Kenya
Li Ning
Liu Qing
Olle Bernat
Simmons Matt
Stein Richard R.
Tanoue Takeshi
Tzen Belinda
Yeliseyev Vladimir
Publication venue: Springer Science and Business Media LLC
Publication date: 14/07/2016

Predicting dynamics of host-microbial ecosystems is crucial for the rational design of bacteriotherapies. We present MDSINE, a suite of algorithms for inferring dynamical systems models from microbiome time-series data and predicting temporal behaviors. Using simulated data, we demonstrate that MDSINE significantly outperforms the existing inference method. We then show MDSINE’s utility on two new gnotobiotic mice datasets, investigating infection with Clostridium difficile and an immune-modulatory probiotic. Using these datasets, we demonstrate new capabilities, including accurate forecasting of microbial dynamics, prediction of stable sub-communities that inhibit pathogen growth, and identification of bacteria most crucial to community integrity in response to perturbations. Electronic supplementary material: the online version of this article (doi:10.1186/s13059-016-0980-6) contains supplementary material, which is available to authorized users

Harvard University - DASH

Springer - Publisher Connector

A microbiota signature associated with experimental food allergy promotes allergic sensitization and anaphylaxis

Author: Bry Lynn
Burton Oliver T.
Chatila Talal A.
Chehoud Christel
DeSantis Todd
Gerber Georg K.
Hobson Suejy A.
Hyde Embriette R.
Kuczynski Justin
Lloret Maria Garcia
Mazmanian Sarkis K.
Oettgen Hans C.
Petrosino Joseph F.
Rivas Magali Noval
Warrington Janet
Wise Petra
Zhang Yu-gian
Publication venue: Elsevier BV
Publication date: 01/01/2013

Background: Commensal microbiota play a critical role in maintaining oral tolerance. The effect of food allergy on the gut microbial ecology remains unknown. Methods: Food allergy–prone mice with a gain-of-function mutation in the IL-4 receptor α chain (Il4raF709) and wild-type (WT) control animals were subjected to oral sensitization with chicken egg ovalbumin (OVA). Enforced tolerance was achieved by using allergen-specific regulatory T (Treg) cells. Community structure analysis of gut microbiota was performed by using high-density 16S rDNA oligonucleotide microarrays (PhyloChip) and massively parallel pyrosequencing of 16S rDNA amplicons. Results: OVA-sensitized Il4raF709 mice exhibited a specific microbiota signature characterized by coordinate changes in the abundance of taxa of several bacterial families, including the Lachnospiraceae, Lactobacillaceae, Rikenellaceae, and Porphyromonadaceae. This signature was not shared by similarly sensitized WT mice, which did not exhibit an OVA-induced allergic response. Treatment of OVA-sensitized Il4raF709 mice with OVA-specific Treg cells led to a distinct tolerance-associated signature coincident with the suppression of the allergic response. The microbiota of allergen-sensitized Il4raF709 mice differentially promoted OVA-specific IgE responses and anaphylaxis when reconstituted in WT germ-free mice. Conclusion: Mice with food allergy exhibit a specific gut microbiota signature capable of transmitting disease susceptibility and subject to reprogramming by enforced tolerance. Disease-associated microbiota may thus play a pathogenic role in food allergy

Caltech Authors

Macrophage dysfunction initiates colitis during weaning of infant mice lacking the interleukin-10 receptor

Author: Amlan Biswas
Amy Tsou
Andre Bleich
Bruce H Horwitz
Chuanwu Wang
Dror S Shouval
Evan A Conaway
Georg K Gerber
James G Fox
Jeremy A Goettel
Lynn Bry
Michael Field
Naresh S Redhu
Ning Li
Scott B Snapper
Vasudevan Bakthavatchalu
Werner Muller
Publication venue: eLife Sciences Publications, Ltd
Publication date: 01/01/2017

Crossref