Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases.

Inflammatory bowel diseases, which include Crohn's disease and ulcerative colitis, affect several million individuals worldwide. Crohn's disease and ulcerative colitis are complex diseases that are heterogeneous at the clinical, immunological, molecular, genetic, and microbial levels. Individual contributing factors have been the focus of extensive research. As part of the Integrative Human Microbiome Project (HMP2 or iHMP), we followed 132 subjects for one year each to generate integrated longitudinal molecular profiles of host and microbial activity during disease (up to 24 time points each; in total 2,965 stool, biopsy, and blood specimens). Here we present the results, which provide a comprehensive view of functional dysbiosis in the gut microbiome during inflammatory bowel disease activity. We demonstrate a characteristic increase in facultative anaerobes at the expense of obligate anaerobes, as well as molecular disruptions in microbial transcription (for example, among clostridia), metabolite pools (acylcarnitines, bile acids, and short-chain fatty acids), and levels of antibodies in host serum. Periods of disease activity were also marked by increases in temporal variability, with characteristic taxonomic, functional, and biochemical shifts. Finally, integrative analysis identified microbial, biochemical, and host factors central to this dysregulation. The study's infrastructure resources, results, and data, which are available through the Inflammatory Bowel Disease Multi'omics Database (http://ibdmdb.org), provide the most comprehensive description to date of host and microbial activities in inflammatory bowel diseases.

eScholarship - University of California

Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information

Authors: Principe Jose C.; Sledge Isaac J.
Publication date: 27/10/2017

In this paper, we provide an approach to clustering relational matrices whose entries correspond to either similarities or dissimilarities between objects. Our approach is based on the value of information, a parameterized, information-theoretic criterion that measures the change in costs associated with changes in information. Optimizing the value of information yields a deterministic-annealing style of clustering with many benefits. For instance, investigators need not specify the number of clusters a priori: the partitions naturally undergo phase changes during the annealing process, whereby the number of clusters changes in a data-driven fashion. The global-best partition can also often be identified. Comment: Submitted to the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP

arXiv.org e-Print Archive

Crossref

Combining dissimilarities in a hyper reproducing kernel Hilbert space for complex human cancer prediction

Authors: Blanco Ángela; De Las Rivas Javier; Martín-Merino Manuel
Publication venue: Hindawi Limited
Publication date: 08/11/2012

9 pages, 3 tables. This is an open access article distributed under the Creative Commons Attribution License. DNA microarrays provide rich profiles that are used in cancer prediction, considering the gene expression levels across a collection of related samples. Support Vector Machines (SVM) have been applied to the classification of cancer samples with encouraging results. However, they rely on Euclidean distances that fail to reflect accurately the proximities among sample profiles. Non-Euclidean dissimilarities therefore provide additional information that should be considered to reduce the misclassification errors. In this paper, we incorporate a linear combination of non-Euclidean dissimilarities into the ν-SVM algorithm. The weights of the combination are learnt in a Hyper Reproducing Kernel Hilbert Space (HRKHS) using a Semidefinite Programming algorithm. This approach allows us to incorporate a smoothing term that penalizes the complexity of the family of distances and avoids overfitting. The experimental results suggest that the proposed method helps to reduce the misclassification errors in several human cancer problems. © 2009 Manuel Martín-Merino et al. Financial support from Grant S02EIA-07L01. Peer Reviewed

Digital.CSIC

Perturbation Detection Through Modeling of Gene Expression on a Latent Biological Pathway Network: A Bayesian hierarchical approach

Authors: Carvalho Luis; Kolaczyk Eric D.; Pham Lisa M.; Schaus Scott
Publication date: 01/09/2014

Cellular response to a perturbation is the result of a dynamic system of biological variables linked in a complex network. A major challenge in drug and disease studies is identifying the key factors of a biological network that are essential in determining the cell's fate. Here our goal is the identification of perturbed pathways from high-throughput gene expression data. We develop a three-level hierarchical model, where (i) the first level captures the relationship between gene expression and biological pathways using confirmatory factor analysis, (ii) the second level models the behavior within an underlying network of pathways induced by an unknown perturbation using a conditional autoregressive model, and (iii) the third level is a spike-and-slab prior on the perturbations. We then identify perturbations through posterior-based variable selection. We illustrate our approach using gene transcription drug perturbation profiles from the DREAM7 drug sensitivity prediction challenge data set. Our proposed method identified regulatory pathways that are known to play a causative role and that were not readily resolved using gene set enrichment analysis or exploratory factor models. Simulation results are presented assessing the performance of this model relative to a network-free variant and its robustness to inaccuracies in biological databases.

arXiv.org e-Print Archive

CiteSeerX

Soil domestication by rice cultivation results in plant-soil feedback through shifts in soil microbiota.

Authors: Edwards Joseph; Kilmer John; Liechty Zachary; Nguyen Bao; Ni Jiadong; Phillips Gregory; Santos-Medellín Christian; Sundaresan Venkatesan; Veliz Esteban
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

Background: Soils are a key component of agricultural productivity, and soil microbiota determine the availability of many essential plant nutrients. Agricultural domestication of soils, that is, the conversion of previously uncultivated soils to a cultivated state, is frequently accompanied by intensive monoculture, especially in the developing world. However, there is limited understanding of how continuous cultivation alters the structure of prokaryotic soil microbiota after soil domestication, including to what extent crop plants impact soil microbiota composition, and how changes in microbiota composition arising from cultivation affect crop performance. Results: We show here that continuous monoculture (>8 growing seasons) of the major food crop rice under flooded conditions is associated with a pronounced shift in soil bacterial and archaeal microbiota structure towards a more consistent composition, thereby domesticating microbiota of previously uncultivated sites. Aside from the potential effects of agricultural cultivation practices, we provide evidence that rice plants themselves are important drivers of the domestication process, acting through selective enrichment of specific taxa, including methanogenic archaea, in their rhizosphere that differ from those of native plants growing in the same environment. Furthermore, we find that microbiota from soils domesticated by rice cultivation contribute to plant-soil feedback, by imparting a negative effect on rice seedling vigor. Conclusions: Soil domestication through continuous monoculture cultivation of rice results in compositional changes in the soil microbiota, which are in part driven by the rice plants. The consequences include a negative impact on plant performance and increases in greenhouse gas emitting microbes.

eScholarship - University of California

Combining Dissimilarities in a Hyper Reproducing Kernel Hilbert Space for Complex Human Cancer Prediction

Authors: Blanco Ángela; De Las Rivas Javier; Martín-Merino Manuel
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2009

DNA microarrays provide rich profiles that are used in cancer prediction, considering the gene expression levels across a collection of related samples. Support Vector Machines (SVM) have been applied to the classification of cancer samples with encouraging results. However, they rely on Euclidean distances that fail to reflect accurately the proximities among sample profiles. Non-Euclidean dissimilarities therefore provide additional information that should be considered to reduce the misclassification errors. In this paper, we incorporate a linear combination of non-Euclidean dissimilarities into the ν-SVM algorithm. The weights of the combination are learnt in a Hyper Reproducing Kernel Hilbert Space (HRKHS) using a Semidefinite Programming algorithm. This approach allows us to incorporate a smoothing term that penalizes the complexity of the family of distances and avoids overfitting. The experimental results suggest that the proposed method helps to reduce the misclassification errors in several human cancer problems.

Crossref

Directory of Open Access Journals

PubMed Central
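
The idea in the abstract above, a linear combination of non-Euclidean dissimilarities used inside a kernel classifier, can be sketched roughly as follows. This is only an illustration and not the authors' method: the weights are fixed by hand rather than learnt in an HRKHS by semidefinite programming, and the conversion from dissimilarities to kernels (double centering followed by eigenvalue clipping) is an assumption made for the example. All function names are hypothetical.

```python
import numpy as np


def dissimilarity_to_kernel(d):
    """Turn a symmetric dissimilarity matrix into a similarity matrix by
    double-centering the squared dissimilarities (as in classical MDS)."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * centering @ (d ** 2) @ centering


def clip_to_psd(k):
    """Zero out negative eigenvalues so the matrix is positive semidefinite.
    Non-Euclidean dissimilarities generally produce some negative eigenvalues."""
    k = (k + k.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(k)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def combine_dissimilarities(dissimilarities, weights):
    """Weighted sum of the kernels derived from each dissimilarity.
    Non-negative weights keep the combined matrix positive semidefinite."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    kernels = [clip_to_psd(dissimilarity_to_kernel(d)) for d in dissimilarities]
    return sum(w * k for w, k in zip(weights, kernels))


# Toy data standing in for expression profiles (6 samples, 4 features).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
diff = x[:, None, :] - x[None, :, :]
euclidean = np.sqrt((diff ** 2).sum(axis=2))
manhattan = np.abs(diff).sum(axis=2)  # a non-Euclidean dissimilarity

# Hand-picked weights (an assumption; the paper learns them).
kernel = combine_dissimilarities([euclidean, manhattan], weights=[0.7, 0.3])
```

A matrix such as `kernel` could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `NuSVC(kernel="precomputed")`.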

A Clustering Algorithm Based on an Ensemble of Dissimilarities: An Application in the Bioinformatics Domain

Authors: Alonso Vidal; Ferreras Antonio; López Rivero Alfonso José; Martín Merino Manuel; Vallejo Marcelo
Publication venue: Universidad Internacional de La Rioja
Publication date: 13/12/2022

Clustering algorithms such as k-means depend heavily on choosing an appropriate distance metric that accurately reflects the object proximities. A wide range of dissimilarities may be defined, and these often lead to different clustering results. Choosing the best dissimilarity is an ill-posed problem, and learning a general distance from the data is a complex task, particularly for high-dimensional problems. Therefore, an appealing approach is to learn an ensemble of dissimilarities. In this paper, we have developed a semi-supervised clustering algorithm that learns a linear combination of dissimilarities considering incomplete knowledge in the form of pairwise constraints. The minimization of the loss function is based on a robust and efficient quadratic optimization algorithm. In addition, a regularization term controls the complexity of the learned distance metric, avoiding overfitting. The algorithm has been applied to the identification of tumor samples using gene expression profiles, where domain experts often provide incomplete knowledge in the form of pairwise constraints. We report that the proposed algorithm outperforms a standard semi-supervised clustering technique available in the literature, as well as clustering results based on a single dissimilarity. The improvement is particularly relevant for applications with a high level of noise.

Re-UNIR
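
To make the idea of learning a linear combination of dissimilarities from pairwise constraints more concrete, here is a minimal sketch. It is not the algorithm of the paper above: as a stand-in it uses a ridge-regularized least-squares fit that pushes must-link pairs toward dissimilarity 0 and cannot-link pairs toward 1, then clips negative weights to zero (a heuristic, not an exact constrained solve). The function names, the targets 0 and 1, and the toy data are all assumptions.

```python
import numpy as np


def learn_weights(dissimilarities, must_link, cannot_link, reg=0.1):
    """Fit one weight per dissimilarity matrix from pairwise constraints.

    Each constrained pair (i, j) contributes one row holding its value under
    every dissimilarity; the target is 0 for must-link and 1 for cannot-link.
    `reg` is the ridge penalty that limits the size of the weights.
    """
    pairs = list(must_link) + list(cannot_link)
    targets = np.array([0.0] * len(must_link) + [1.0] * len(cannot_link))
    a = np.array([[d[i, j] for d in dissimilarities] for i, j in pairs])
    m = a.shape[1]
    w = np.linalg.solve(a.T @ a + reg * np.eye(m), a.T @ targets)
    # Heuristic: clip to keep the combination a valid dissimilarity.
    return np.clip(w, 0.0, None)


def combine(dissimilarities, weights):
    """Weighted sum of the dissimilarity matrices."""
    return sum(w * d for w, d in zip(weights, dissimilarities))


# Toy data: feature 0 separates two groups, feature 1 is uninformative.
f0 = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
f1 = np.array([3.0, 0.0, 4.0, 1.0, 3.5, 0.5])
d_informative = np.abs(f0[:, None] - f0[None, :])
d_noise = np.abs(f1[:, None] - f1[None, :])

must_link = [(0, 1), (3, 4)]      # pairs known to share a cluster
cannot_link = [(0, 3), (1, 4)]    # pairs known to be in different clusters

weights = learn_weights([d_informative, d_noise], must_link, cannot_link)
combined = combine([d_informative, d_noise], weights)
```

The matrix `combined` could then be given to any clustering method that works on a precomputed dissimilarity matrix; on this toy data the informative dissimilarity receives the larger weight.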