
13,075 research outputs found

Recommended from our members

The Computational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health.

Author: Eetemadi Ameen
Kim Minseung
Pereira Beatriz Merchel Piovesan
Rai Navneet
Schmitz Harold
Tagkopoulos Ilias
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Food and human health are inextricably linked. As such, revolutionary impacts on health have been derived from advances in the production and distribution of food relating to food safety and fortification with micronutrients. During the past two decades, it has become apparent that the human microbiome has the potential to modulate health, including in ways that may be related to diet and the composition of specific foods. Despite the excitement and potential surrounding this area, fully understanding the complexity of the gut microbiome, the chemical composition of food, and their interplay in situ remains a daunting task. However, recent advances in high-throughput sequencing, metabolomics profiling, compositional analysis of food, and the emergence of electronic health records provide new sources of data that can contribute to addressing this challenge. Computational science will play an essential role in this effort as it will provide the foundation to integrate these data layers and derive insights capable of revealing and understanding the complex interactions between diet, gut microbiome, and health. Here, we review the current knowledge on diet-health-gut microbiota, relevant data sources, bioinformatics tools, machine learning capabilities, as well as the intellectual property and legislative regulatory landscape. We provide guidance on employing machine learning and data analytics, identify gaps in current methods, and describe new scenarios to be unlocked in the next few years in the context of current knowledge.

eScholarship - University of California

A Distance-Based Test of Association Between Paired Heterogeneous Genomic Data

Author: Curry Edward
Minas Christopher
Montana Giovanni
Publication date: 27/03/2013

Due to rapid technological advances, a wide range of different measurements can be obtained from a given biological sample including single nucleotide polymorphisms, copy number variation, gene expression levels, DNA methylation and proteomic profiles. Each of these distinct measurements provides the means to characterize a certain aspect of biological diversity, and a fundamental problem of broad interest concerns the discovery of shared patterns of variation across different data types. Such data types are heterogeneous in the sense that they represent measurements taken at very different scales or described by very different data structures. We propose a distance-based statistical test, the generalized RV (GRV) test, to assess whether there is a common and non-random pattern of variability between paired biological measurements obtained from the same random sample. The measurements enter the test through distance measures which can be chosen to capture particular aspects of the data. An approximate null distribution is proposed to compute p-values in closed form and without the need to perform costly Monte Carlo permutation procedures. Compared to the classical Mantel test for association between distance matrices, the GRV test has been found to be more powerful in a number of simulation settings. We also report on an application of the GRV test to detect biological pathways in which genetic variability is associated with variation in gene expression levels in ovarian cancer samples, and present results obtained from two independent cohorts.
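For orientation, the classical Mantel test that this abstract uses as its baseline can be sketched in a few lines of Python. This is not the GRV test and not the authors' code: it is a minimal permutation version of the Mantel procedure, and the function names (`upper_triangle`, `pearson`, `mantel_test`) and the defaults `n_perm=999` and `seed=0` are illustrative assumptions.

```python
import random
from math import sqrt


def upper_triangle(d):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(d)
    return [d[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(d1, d2, n_perm=999, seed=0):
    """One-sided Mantel permutation test (hypothetical helper).

    d1 and d2 are square, symmetric distance matrices over the same
    samples. Sample labels of d2 are permuted to build the null
    distribution of the matrix correlation.
    """
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    idx = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        # Permute rows and columns of d2 together.
        permuted = [[d2[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Toy example: six samples on a line, measured in two different units.
points = [0, 1, 2, 3, 4, 5]
d_a = [[abs(a - b) for b in points] for a in points]
d_b = [[2.0 * abs(a - b) for b in points] for a in points]
r, p = mantel_test(d_a, d_b)
```

The permutation loop is the costly Monte Carlo step that, according to the abstract, the GRV test's approximate closed-form null distribution is designed to avoid.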

arXiv.org e-Print Archive

Crossref

King's Research Portal

Comparative Analysis Association and Prediction of Various Phenotypic Traits of Oryza Sativa

Author: B. Kiranmai et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 02/11/2023

Understanding the genotype-phenotype relationship and accurately predicting breeding values are crucial aspects of crop improvement programs. This paper investigates the genetic basis and association of the phenotypic traits height and yield, and predicts the phenotypic traits of Oryza Sativa (rice) through a comprehensive approach encompassing genome-wide association studies (GWAS), phylogenetic analysis, machine learning algorithms, and the development of a graphical user interface (GUI) application. Genotypic and phenotypic data were collected from the RiceVarMap database. The genotypic information consisted of gene variation IDs, while the phenotype data included plant height. Data preprocessing involved the creation of a sequence.fasta file and multiple sequence alignment using the ClustalW tool. A phylogenetic tree was then constructed to analyse the subpopulations of Oryza Sativa. Clustering techniques were applied to further explore the genetic relationships among the samples. A GWAS file was generated to identify associations between genotype and phenotype. Subsequently, machine learning algorithms were employed for the classification and prediction of genomic estimated breeding values (GEBV) for height and yield traits. Random Forest emerged as the most accurate algorithm with 85% accuracy. To facilitate user interaction and data exploration, a GUI application was developed using Flask, allowing users to access the phylogenetic tree, height and yield information, and GWAS results, and to make predictions. We found a strong positive association between the phenotypic traits height and yield.

International Journal on Recent and Innovation Trends in Computing and Communication

A Strategy analysis for genetic association studies with known inbreeding

Author: Bertolino Francesco
Biino Ginevra
Cabras Stefano
Castellanos Maria Eugenia
Casula Laura
Del Giacco Stefano
Persico Ivana
Pirastu Mario
Pirastu Nicola
Sassu Alessandro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Background: Association studies consist of identifying the genetic variants that are related to a specific disease through statistical multiple hypothesis testing or segregation analysis in pedigrees. This type of study has been very successful for Mendelian monogenic disorders, but less successful in identifying genetic variants related to complex diseases, where onset depends on interactions between different genes and the environment. Current technology allows more than a million markers to be genotyped, and this number has been increasing rapidly in recent years with imputation based on template sets and whole-genome sequencing. This type of data introduces a great amount of noise into the statistical analysis and usually requires a great number of samples. Current methods seldom take into account gene-gene and gene-environment interactions, which are fundamental, especially in complex diseases. In this paper we propose to use a non-parametric additive model, which accounts for interactions of unknown order, to detect the genetic variants related to diseases. Although this is not new to the current literature, we show that in an isolated population, where the most closely related subjects also share most of their genetic code, the use of additive models may be improved if the available genealogical tree is taken into account. Specifically, we form a sample of cases and controls with the highest inbreeding by means of the Hungarian method, and estimate the set of genes and environmental variables associated with the disease by means of Random Forest. Results: We have evidence, from statistical theory, simulations and two applications, that we have built a suitable procedure to eliminate stratification between cases and controls and that it also has enough precision in identifying genetic variants responsible for a disease. This procedure has been successfully applied to beta-thalassemia, which is a well-known Mendelian disease, and also to common asthma, where we have identified candidate genes that underlie susceptibility to asthma. Some of these candidate genes have also been found to be related to common asthma in the current literature. Conclusions: The data analysis approach, based on selecting the most related cases and controls along with the Random Forest model, is a powerful tool for detecting genetic variants associated with a disease in isolated populations. Moreover, this method also provides a prediction model that is accurate in estimating the unknown disease status and that can be used more generally to build test kits for a wide class of Mendelian diseases.

Archivio istituzionale della ricerca - Università di Trieste

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Cagliari

UnissResearch

Motif-informed analysis of phenotype heterogeneity in cancer

Author: Xu Qi, Ph. D.
Publication date: 08/05/2024

The landscape of cancer genomics harbors a wealth of DNA motifs, whose thorough analysis and integration provide a pivotal method to decipher the complex molecular interactions underlying cancer. This dissertation delineates novel computational methodologies for robust DNA motif analysis and data integration, aiming to elucidate the implications of DNA motifs on cancer heterogeneity and clinical outcomes. Chapter 1 lays the groundwork by showing the significance of DNA motifs in the genomic framework and delineating the current biomarkers in cancer. It highlights the opportunity that DNA motif analysis presents in unveiling a nuanced understanding of genomic interactions. It also indicates the motivations and specific aims of the study of both DNA motif quantification and co-localization analysis. In Chapter 2, a foundational marker for quantifying the prevalence of DNA repetitive motifs, termed as “Non-B DNA Burden”, is introduced. A user-centric platform is also developed to facilitate the efficient computation and visualization of this metric across various genomic scales. Together, they are offering a novel perspective for analyzing DNA motif heterogeneity. Transitioning to Chapter 3, the focus evolves toward an integrated marker approach. By integrating the prevalence analysis of DNA motifs in conjunction with the frequency of co-localized mutations, novel markers mlTNB (mutation-localized total non-B burden) and nbTMB (non-B informed tumor mutation burden) are proposed. Their potential in predicting cancer prognosis and treatment responses is specifically explored. Chapter 4 broadens the analytical foundation by defining MoCoLo (Motif Co-Localization), a robust statistical framework for testing multi-modal DNA motif co-localization. Through this framework, we are able to explore the complex interplay of genomic features and provide a methodical approach to investigate their co-localization in a multi-modal data integration context. 
Case studies are employed to showcase the utility of MoCoLo in examining the co-localization of genomic features, thus facilitating the understanding of genomic interactions that are pivotal to cancer biology. Chapter 5 synthesizes the findings from the preceding explorations, outlining the contributions of the developed methodologies to the field of cancer genomics and bioinformatics. It demonstrates the potential impact of DNA motif analysis and data integration on understanding phenotype heterogeneity in cancer and shows the prospective avenues it provides for impactful future research. Overall, this work is structured to contribute to the bioinformatics community by weaving together innovative tools and analyses focused on DNA motif analysis and data integration. It strives to pave a beneficial way forward to a deeper understanding of the cancer genome, thereby enhancing potential diagnostic and therapeutic strategies.

Cellular and Molecular Biolog

Texas ScholarWorks

The Population Genetic Signature of Polygenic Local Adaptation

Author: Berg Jeremy J.
Coop Graham
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Adaptation in response to selection on polygenic phenotypes may occur via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that may have been influenced by local adaptation. We exploit the fact that GWAS provide an estimate of the additive effect size of many loci to estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We first describe a general model of neutral genetic value drift for an arbitrary number of populations with an arbitrary relatedness structure. Based on this model we develop methods for detecting unusually strong correlations between genetic values and specific environmental variables, as well as a generalization of $Q_{ST}/F_{ST}$ comparisons to test for over-dispersion of genetic values among populations. Finally we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single locus equivalents due to the fact that they look for positive covariance between like effect alleles, and also significantly outperform methods that do not account for population structure. We apply our tests to the Human Genome Diversity Panel (HGDP) dataset using GWAS data for height, skin pigmentation, type 2 diabetes, body mass index, and two inflammatory bowel disease datasets. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results.

Comment: 42 pages including 8 figures and 3 tables; supplementary figures and tables not included on this upload, but are mostly unchanged from v
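The abstract's estimate of mean additive genetic values as weighted sums of allele frequencies can be illustrated with a short Python sketch. It is a minimal reading of that one sentence, not the authors' implementation: the function name `genetic_values`, the dictionary layout, the toy numbers, and the factor `ploidy=2` (a diploid scaling) are all assumptions made for illustration.

```python
def genetic_values(effect_sizes, allele_freqs, ploidy=2):
    """Estimate a mean additive genetic value per population.

    effect_sizes: GWAS additive effect size for each locus.
    allele_freqs: mapping of population name to per-locus frequencies of
                  the effect allele, in the same locus order.
    ploidy:       assumed scaling factor (2 for diploids); an assumption
                  of this sketch, not taken from the abstract.
    """
    values = {}
    for population, freqs in allele_freqs.items():
        if len(freqs) != len(effect_sizes):
            raise ValueError(f"locus count mismatch for {population}")
        # Weighted sum of allele frequencies, weights = effect sizes.
        values[population] = ploidy * sum(
            a * p for a, p in zip(effect_sizes, freqs)
        )
    return values


# Toy data: two loci, two hypothetical populations.
effects = [0.5, -0.2]
freqs = {"pop_A": [0.1, 0.5], "pop_B": [0.3, 0.5]}
z = genetic_values(effects, freqs)
```

In this toy case the two populations differ only at the first locus, so the difference in their estimated genetic values comes entirely from that locus's effect size and frequency shift.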

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare