Patterns of genomic and phenomic diversity in wine and table grapes.

Author: Aradhya Mallikarjuna K
Brown Patrick J
Buckler Edward S
Bustamante Carlos D
Gardner Kyle M
Migicovsky Zoë
Myles Sean
Prins Bernard H
Sawler Jason
Schwaninger Heidi R
Zhong Gan-Yuan
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Grapes are one of the most economically and culturally important crops worldwide, and they have been bred for both winemaking and fresh consumption. Here we evaluate patterns of diversity across 33 phenotypes collected over a 17-year period from 580 table and wine grape accessions that belong to one of the world's largest grape gene banks, the grape germplasm collection of the United States Department of Agriculture. We find that phenological events throughout the growing season are correlated, and quantify the marked difference in size between table and wine grapes. By pairing publicly available historical phenotype data with genome-wide polymorphism data, we identify large-effect loci controlling traits that have been targeted during domestication and breeding, including hermaphroditism, lighter skin pigmentation and muscat aroma. Breeding for larger berries in table grapes was traditionally concentrated in geographic regions where Islam predominates and alcohol was prohibited, whereas wine grapes retained the ancestral smaller size that is more desirable for winemaking in predominantly Christian regions. We uncover a novel locus with a suggestive association with berry size that harbors a signature of positive selection for larger berries. Our results suggest that religious rules concerning alcohol consumption have had a marked impact on patterns of phenomic and genomic diversity in grapes.

eScholarship - University of California

Prediction with Dimension Reduction of Multiple Molecular Data Sources for Patient Survival

Author: Kaplan Adam
Lock Eric F.
Publication venue: 'SAGE Publications'
Publication date: 01/07/2017

Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal components analysis (PCA). However, the application of PCA is not straightforward for multi-source data, wherein multiple sources of 'omics data measure different but related biological components. In this article we utilize recent advances in the dimension reduction of multi-source data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multi-source data, for prediction of differing response types. We conduct simulations to illustrate the practical advantages and interpretability of our approach. As an application example we consider predicting survival for Glioblastoma Multiforme (GBM) patients from three data sources measuring mRNA expression, miRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction, and study its theoretical properties; this method is implemented in the R package R.JIVE on CRAN, in the function 'jive.predict'.

Comment: 11 pages, 9 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

Unconventional machine learning of genome-wide human cancer data

Author: Bajaj Sweta R.
Chittenden Thomas W.
Cilfone Nicholas
Gamel Omar E.
Gujja Sharvari
Gulcher Jeffrey R.
Li Richard Y.
Lidar Daniel A.
Publication date: 13/05/2020

Recent advances in high-throughput genomic technologies coupled with exponential increases in computer processing and memory have allowed us to interrogate the complex aberrant molecular underpinnings of human disease from a genome-wide perspective. While the deluge of genomic information is expected to increase, a bottleneck in conventional high-performance computing is rapidly approaching. Inspired in part by recent advances in physical quantum processors, we evaluated several unconventional machine learning (ML) strategies on actual human tumor data. Here we show for the first time the efficacy of multiple annealing-based ML algorithms for classification of high-dimensional, multi-omics human cancer data from The Cancer Genome Atlas. To assess algorithm performance, we compared these classifiers to a variety of standard ML methods. Our results indicate the feasibility of using annealing-based ML to provide competitive classification of human cancer types and associated molecular subtypes and superior performance with smaller training datasets, thus providing compelling empirical evidence for the potential future application of unconventional computing architectures in the biomedical sciences.

arXiv.org e-Print Archive

Directory of Open Access Journals

Posterior Contraction Rates of the Phylogenetic Indian Buffet Processes

Author: Chen Mengjie
Gao Chao
Zhao Hongyu
Publication date: 19/05/2015

By expressing prior distributions as general stochastic processes, nonparametric Bayesian methods provide a flexible way to incorporate prior knowledge and constrain the latent structure in statistical inference. The Indian buffet process (IBP) is one such example: it can be used to define a prior distribution on infinite binary features, where exchangeability among subjects is assumed. The phylogenetic Indian buffet process (pIBP), a derivative of the IBP, enables the modeling of non-exchangeability among subjects through a stochastic process on a rooted tree, similar to that used in phylogenetics, to describe relationships among the subjects. In this paper, we study the theoretical properties of IBP and pIBP under a binary factor model. We establish the posterior contraction rates for both IBP and pIBP and substantiate the theoretical results through simulation studies. This is the first work addressing the frequentist properties of the posterior behavior of IBP and pIBP. We also demonstrate its practical usefulness by applying the pIBP prior to a real data example arising in the field of cancer genomics, where the exchangeability among subjects is violated.
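
As a rough illustration of the kind of prior discussed in this abstract, the sketch below draws a binary feature matrix from the standard, exchangeable IBP using its usual sequential ("restaurant") construction. It does not implement the phylogenetic variant or the paper's binary factor model. The function names `sample_ibp` and `poisson`, the concentration parameter name `alpha`, and the seed handling are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_ibp(n_subjects, alpha, seed=0):
    """Draw a binary feature matrix from the standard IBP (rows = subjects).

    Hypothetical helper for illustration only; alpha is the IBP
    concentration parameter.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of earlier subjects holding feature k
    rows = []
    for i in range(1, n_subjects + 1):
        # Subject i takes existing feature k with probability counts[k] / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        counts = [m + z for m, z in zip(counts, row)]
        # Subject i then introduces Poisson(alpha / i) brand-new features.
        n_new = poisson(rng, alpha / i)
        row.extend([1] * n_new)
        counts.extend([1] * n_new)
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    width = len(counts)
    return [r + [0] * (width - len(r)) for r in rows]

Z = sample_ibp(n_subjects=10, alpha=2.0)
```

The number of columns is random and grows slowly with the number of subjects, which is why the IBP is described as a prior over an unbounded set of binary features.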

arXiv.org e-Print Archive

PubMed Central

Carolina Digital Repository

Applicability of in vivo staging of regional amyloid burden in a cognitively normal cohort with subjective memory complaints: the INSIGHT-preAD study.

Author: Alzheimer Precision Medicine Initiative (APMI)
Bertin Hugo
Cavedo Enrica
Dubois Bruno
Dyrba Martin
Gonzalez-Escamilla Gabriel
Grothe Michel J
Habert Marie-Odile
Hampel Harald
INSIGHT-preAD study group
Jelistratova Irina
Lehericy Stephane
Locatelli Maxime
Sakr Fatemah A
Teipel Stefan
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

BACKGROUND: Current methods of amyloid PET interpretation based on the binary classification of global amyloid signal fail to identify early phases of amyloid deposition. A recent analysis of 18F-florbetapir PET data from the Alzheimer's Disease Neuroimaging Initiative cohort suggested a hierarchical four-stage model of regional amyloid deposition that resembles neuropathologic estimates and can be used to stage an individual's amyloid burden in vivo. Here, we evaluated the validity of this in vivo amyloid staging model in an independent cohort of older people with subjective memory complaints (SMC). We further examined its potential association with subtle cognitive impairments in this population at elevated risk for Alzheimer's disease (AD).
METHODS: The monocentric INSIGHT-preAD cohort includes 318 cognitively intact older individuals with SMC. All individuals underwent 18F-florbetapir PET scanning and extensive neuropsychological testing. We projected the regional amyloid uptake signal into the previously proposed hierarchical staging model of in vivo amyloid progression. We determined the adherence to this model across all cases and tested the association between increasing in vivo amyloid stage and cognitive performance using ANCOVA models.
RESULTS: In total, 156 participants (49%) showed evidence of regional amyloid deposition, and all but 2 of these (99%) adhered to the hierarchical regional pattern implied by the in vivo amyloid progression model. According to a conventional binary classification based on global signal (SUVRCereb = 1.10), individuals in stages III and IV were classified as amyloid-positive (except one in stage III), but 99% of individuals in stage I and even 28% of individuals in stage II were classified as amyloid-negative. Neither in vivo amyloid stage nor conventional binary amyloid status was significantly associated with cognitive performance in this preclinical cohort.
CONCLUSIONS: The proposed hierarchical staging scheme of PET-evidenced amyloid deposition generalizes well to data from an independent cohort of older people at elevated risk for AD. Future studies will determine the prognostic value of the staging approach for predicting longitudinal cognitive decline in older individuals at increased risk for AD.
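
The contrast drawn in this abstract between a single global cutoff and a hierarchical regional staging can be sketched in a few lines. This is a simplified illustration, not the study's pipeline. It assumes each of the four stages is summarized by one boolean regional-positivity flag, that a scan "adheres" to the hierarchy when no later region is positive while an earlier one is negative, and that a global value at or above the cutoff counts as positive. Only the 1.10 cutoff comes from the abstract; the function names and the handling of non-adherent scans are hypothetical.

```python
SUVR_CUTOFF = 1.10  # global cutoff quoted in the abstract

def global_status(global_suvr, cutoff=SUVR_CUTOFF):
    """Conventional binary call from one global value.

    Assumption: values at or above the cutoff are called positive.
    """
    return "positive" if global_suvr >= cutoff else "negative"

def regional_stage(flags):
    """Return (stage, adheres) for four ordered regional flags.

    flags lists regional positivity from the earliest- to the
    latest-affected region group (an assumed simplification). The stage
    is the number of leading positive flags; the scan adheres to the
    hierarchy if nothing after the first negative flag is positive.
    """
    stage = 0
    for is_positive in flags:
        if not is_positive:
            break
        stage += 1
    adheres = not any(flags[stage:])
    return stage, adheres

# A scan with early regional deposition but a sub-threshold global signal.
early = regional_stage([True, False, False, False])
status = global_status(1.02)
```

In this toy case the regional scheme reports stage 1 while the global rule reports "negative", which is the kind of disagreement the abstract describes for most stage I individuals.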

INRIA a CCSD electronic archive server

Directory of Open Access Journals

HAL Descartes

eScholarship - University of California

HAL-CEA

FigShare

Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum.

Author: Bunnik Evelien M
Le Roch Karine G
Lonardi Stefano
Lu Xueqing Maggie
Nasseri Sara
Pokhriyal Neeti
Publication venue: eScholarship, University of California
Publication date: 01/11/2015

Background: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7%) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expression in the parasite, a more complete genome annotation as well as analysis tools adapted for AT-rich genomes are needed. Recent studies on genome-wide nucleosome positioning in eukaryotes have shown that nucleosome landscapes exhibit regular characteristic patterns at the 5'- and 3'-end of protein and non-protein coding genes. In addition, nucleosome depleted regions can be found near transcription start sites. These unique nucleosome landscape patterns may be exploited for the identification of novel genes. In this paper, we propose a computational approach to discover novel putative genes based exclusively on nucleosome positioning data in the AT-rich genome of P. falciparum.
Results: Using binary classifiers trained on nucleosome landscapes at the gene boundaries from two independent nucleosome positioning data sets, we were able to detect a total of 231 regions containing putative genes in the genome of Plasmodium falciparum, of which 67 highly confident genes were found in both data sets. Eighty-eight of these 231 newly predicted genes exhibited transcription signal in RNA-Seq data, indicative of active transcription. In addition, 20 out of 21 selected gene candidates were further validated by RT-PCR, and 28 out of the 231 genes showed significant matches using BLASTN against an expressed sequence tag (EST) database. Furthermore, 108 (47%) out of the 231 putative novel genes overlapped with previously identified but unannotated long non-coding RNAs. Collectively, these results provide experimental validation for 163 predicted genes (70.6%). Finally, 73 out of 231 genes were found to be potentially translated based on their signal in polysome-associated RNA-Seq representing transcripts that are actively being translated.
Conclusion: Our results clearly indicate that nucleosome positioning data contains sufficient information for novel gene discovery. As distinct nucleosome landscapes around genes are found in many other eukaryotic organisms, this methodology could be used to characterize the transcriptome of any organism, especially when coupled with other DNA-based gene finding and experimental methods (e.g., RNA-Seq).

Springer - Publisher Connector

PubMed Central

eScholarship - University of California