Large-scale associations between the leukocyte transcriptome and BOLD responses to speech differ in autism early language outcome subtypes.

Author: A Battle
A Krishnan
A Perry
A Schroeder
A Sugathan
A Venter
AJ Willsey
AM Wetherby
AR McIntosh
AR Pfenning
AT Hilliard
BS Abrahams
C Demopoulos
C Lord
CR Genovese
Cynthia Carter Barnes
DA Rossignol
DH Geschwind
DH Geschwind
DL Vargas
E Redcay
E Redcay
EA Boyle
EJ Greenblatt
Eric Courchesne
F Happé
G Dehaene-Lambertz
G Hickok
G Konopka
G Konopka
H Tager-Flusberg
HR Cremers
I Voineagu
J Cotney
JA Kosmicki
JA Miller
JC Darnell
JD Storey
JI Berman
JT Morgan
K Gotham
K Pierce
K Pierce
K Pierce
Karen Pierce
L Hippolyte
Linda Lopez
Lisa Eyler
LT Eyler
M Chikina
M Lek
MC Lai
MC Oldham
MCN Marchetto
ME Ritchie
Michael V. Lombardo
MM Kjelgaard
MV Lombardo
Nathan E. Lewis
NN Parikshak
NN Parikshak
NR Wray
P Du
P Howlin
P Langfelder
P Langfelder
P Szatmari
R Bernier
RC Gentleman
Richard A. I. Bethlehem
RJ Kelleher III
S Kapur
S Rubeis De
S Sandin
SJ Sanders
T Pramparo
T Pramparo
T Yarkoni
Tiziano Pramparo
Vahid Gazestani
Varun Warrier
X Liu
X Nuttle
Publication venue: eScholarship, University of California
Publication date: 01/12/2018

Heterogeneity in early language development in autism spectrum disorder (ASD) is clinically important and may reflect neurobiologically distinct subtypes. Here, we identified a large-scale association between multiple coordinated blood leukocyte gene coexpression modules and the multivariate functional neuroimaging (fMRI) response to speech. Gene coexpression modules associated with the multivariate fMRI response to speech were different for all pairwise comparisons between typically developing toddlers and toddlers with ASD and poor versus good early language outcome. Associated coexpression modules were enriched in genes that are broadly expressed in the brain and many other tissues. These coexpression modules were also enriched in ASD-associated, prenatal, human-specific, and language-relevant genes. This work highlights distinctive neurobiology in ASD subtypes with different early language outcomes that is present well before such outcomes are known. Associations between neuroimaging measures and gene expression levels in blood leukocytes may offer a unique in vivo window into identifying brain-relevant molecular mechanisms in ASD.

Crossref

eScholarship - University of California

Mining the Gene Wiki for functional genomic knowledge

Author: A Subramanian
AI Su
Andrew I Su
AR Aronson
AR Pico
B Mons
Benjamin M Good
C Jonquet
D Weekes
Douglas G Howe
DW Huang
E Callaway
E Camon
EB Camon
ES Lander
H Stehr
I Rivals
J Osborne
JC Venter
JW Huss
JW Huss
L Hirschman
LA Flórez
M Ashburner
M Waldrop
N Daraselia
NH Shah
R Hoffmann
R Tirrell
R Winnenburg
Simon M Lin
W Baumgartner
Warren A Kibbe
Z Lu
Publication venue: BioMed Central
Publication date: 01/12/2011

Background: Ontology-based gene annotations are important tools for organizing and analyzing genome-scale biological data. Collecting these annotations is a valuable but costly endeavor. The Gene Wiki makes use of Wikipedia as a low-cost, mass-collaborative platform for assembling text-based gene annotations. The Gene Wiki comprises more than 10,000 review articles, each describing one human gene. The goal of this study is to define and assess a computational strategy for translating the text of Gene Wiki articles into ontology-based gene annotations. We specifically explore the generation of structured annotations using the Gene Ontology and the Human Disease Ontology. Results: Our system produced 2,983 candidate gene annotations using the Disease Ontology and 11,022 candidate annotations using the Gene Ontology from the text of the Gene Wiki. Based on manual evaluations and comparisons to reference annotation sets, we estimate a precision of 90-93% for the Disease Ontology annotations and 48-64% for the Gene Ontology annotations. We further demonstrate that this data set can systematically improve the results from gene set enrichment analyses. Conclusions: The Gene Wiki is a rapidly growing corpus of text focused on human gene function. Here, we demonstrate that the Gene Wiki can be a powerful resource for generating ontology-based gene annotations. These annotations can be used immediately to improve workflows for building curated gene annotation databases and knowledge-based statistical analyses.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Distinguishing between cancer driver and passenger gene alteration candidates via cross-species comparison: a pilot study

Crossref

Springer - Publisher Connector

PubMed Central

Analysis and comparison of very large metagenomes with fast clustering and functional annotation

Author: AC McHardy
AR Quinlan
B Rodriguez-Brito
D Sheskin
DB Rusch
DC Richter
DH Huson
E Portugaly
EA Dinsdale
EF DeLong
FE Angly
GW Tyson
H Noguchi
H Noguchi
H Teeling
H Teeling
J Shendure
JC Venter
K Mavromatis
KJ Hoff
L Krause
PD Schloss
R Seshadri
RK Aziz
S Yooseph
S Yooseph
SF Altschul
SG Tringe
SR Eddy
SR Gill
W Li
W Li
W Li
W Li
Weizhong Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Bluetongue: a historical and epidemiological perspective with the emphasis on South Africa

Author: A Pini
A Theiler
A Theiler
A Wilson
AA van Dijk
AR Gould
B Dungu
B Hemati
BI Osburn
BJ Barnard
BJ Erasmus
BJ Erasmus
BJ Erasmus
BJH Barnard
BV Purse
C He
C Saegerman
CA Batten
CR Mahrt
D Dercksen
D Hutcheon
D Hutcheon
DW Verwoerd
DW Verwoerd
DW Verwoerd
DW Verwoerd
E Breard
E Veronesi
E Veronesi
EC Borden
EM Nevill
EM Nevill
EP Gibbs
Estelle H Venter
F Dal Pozzo
FD Menzies
FR Holbrook
G Carpi
G Savini
G Vercauteren
GH Gerdes
GJ Venter
GJ Venter
GJ Venter
GY Akita
H Huismans
H Huismans
J Manso-Ribeiro
J Spreull
JL Stott
JT Paweska
KA Alexander
M Belhouchet
M Van Niekerk
Maria Stokstad
Mette Myrmel
Moritz Van Vuuren
N Lacetera
NJ Maclachlan
NJ Maclachlan
NJ Maclachlan
NJ Maclachlan
NJ Maclachlan
P Kirkland
PD Kirkland
Peter Coetzee
PG Howell
PG Howell
PPC Mertens
PS Mellor
R Meiswinkel
RA Alexander
RA Vosdingh
RM Du Toit
RM Gambles
S Maan
S Maan
S Zientara
SK Samal
SK Samal
SM Barratt-Boyes
SM Barratt-Boyes
T Tollersrud
WJ Tabachnick
WO Neitz
WO Neitz
WT Hardy
Publication venue: Springer Science and Business Media LLC

Crossref

Recruitment of rare 3-grams at functional sites: Is this a mechanism for increasing enzyme specificity?

Background: A wealth of unannotated and functionally unknown protein sequences has accumulated in recent years with rapid progress in sequence genomics, giving rise to ever-increasing demands for methods to efficiently assess functional sites. Sequence and structure conservation have traditionally been the major criteria adopted in various algorithms to identify functional sites. Here, we focus on the distributions of the 20^3 different types of 3-grams (triplets of sequentially contiguous amino acids) in the entire space of sequences accumulated to date in the UniProt database, and in particular on the rare 3-grams distinguished by their high entropy-based information content. Results: Comparison of the UniProt distributions with those observed near or at the active sites in a non-redundant dataset of 59 enzyme/ligand complexes shows that the active sites preferentially recruit 3-grams distinguished by their low frequency in UniProt. Three cases, Src kinase, hemoglobin, and tyrosyl-tRNA synthetase, are discussed in detail to illustrate the biological significance of the results. Conclusion: The results suggest that recruitment of rare 3-grams may be an efficient mechanism for increasing specificity at functional sites. Rareness/scarcity emerges as a feature that may assist in identifying key sites for protein function, providing information complementary to that derived from sequence alignments. In addition, it provides us (for the first time) with a means of identifying potentially functional sites from sequence information alone, when sequence conservation properties are not available.
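
As a rough illustration of the idea in this abstract, the sketch below counts overlapping 3-grams in a set of sequences and scores each one by its self-information, -log2(frequency), so that rarer 3-grams score higher. With 20 standard amino acids there are 20^3 = 8000 possible 3-grams. The function name `three_gram_information` and the use of self-information as the score are assumptions made for illustration; the abstract does not specify the authors' exact entropy-based measure.

```python
from collections import Counter
import math


def three_gram_information(sequences):
    """Return {3-gram: -log2(relative frequency)} over all input sequences.

    Hypothetical helper: self-information is used here as a stand-in for
    the paper's entropy-based information content, which is not specified
    in the abstract.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a window of width 3 over the sequence (overlapping 3-grams).
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1
    total = sum(counts.values())
    # Rare 3-grams have a low frequency and therefore a high score.
    return {gram: -math.log2(n / total) for gram, n in counts.items()}


# Toy input: "AAA" occurs three times, "AAC" once.
info = three_gram_information(["AAAA", "AAAC"])
rarest = max(info, key=info.get)
```

On real data the reference frequencies would come from a large sequence database, and the scores of residues near an active site would be compared against that background.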

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Inferring viral quasispecies spectra from 454 pyrosequencing reads

Author: A Sundquist
Alex Zelikovsky
AR Quinlan
B Gaschen
Bassam Tork
D Brinza
DC Douek
E Domingo
E Martinez-Salas
EA Duarte
G Myers
H Fakhrai-Rad
Ion Măndoiu
Irina Astrovskaya
JC de la Torre
JC Venter
JI Esteban
JJ Holland
JW Drake
K Westbrooks
Kelly Westbrooks
M Eigen
M Margulies
MC Prosperi
MJ Chaisson
N Beerenwinkel
N Eriksson
NM Laird
O Zagordi
O Zagordi
Peter Balfe
R Lippert
S Balser
S Hoffmann
S-Y Rhee
Serghei Mangul
SL Fishman
ST O’Neil
T von Hahn
V Bansal
W Brockman
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: RNA viruses infecting a host usually exist as a set of closely related sequences, referred to as quasispecies. The genomic diversity of viral quasispecies is a subject of great interest, particularly for chronic infections, since it can lead to resistance to existing therapies. High-throughput sequencing is a promising approach to characterizing viral diversity, but unfortunately standard assembly software was originally designed for single genome assembly and cannot be used to simultaneously assemble and estimate the abundance of multiple closely related quasispecies sequences. Results: In this paper, we introduce a new Viral Spectrum Assembler (ViSpA) method for quasispecies spectrum reconstruction and compare it with the state-of-the-art ShoRAH tool on both simulated and real 454 pyrosequencing shotgun reads from HCV and HIV quasispecies. Experimental results show that ViSpA outperforms ShoRAH on simulated error-free reads, correctly assembling 10 out of 10 quasispecies and 29 sequences out of 40 quasispecies. While ShoRAH has a significant advantage over ViSpA on reads simulated with sequencing errors due to its advanced error correction algorithm, ViSpA is better at assembling the simulated reads after they have been corrected by ShoRAH. ViSpA also outperforms ShoRAH on real 454 reads. Indeed, the 7 most frequent sequences reconstructed by ViSpA from a real HCV dataset are viable (do not contain internal stop codons), and the most frequent sequence was within 1% of the actual open reading frame obtained by cloning and Sanger sequencing. In contrast, only one of the sequences reconstructed by ShoRAH is viable. On a real HIV dataset, ShoRAH correctly inferred only 2 quasispecies sequences with at most 4 mismatches, whereas ViSpA correctly reconstructed 5 quasispecies with at most 2 mismatches, and 2 out of 5 sequences were inferred without any mismatches. ViSpA source code is available at http://alla.cs.gsu.edu/~software/VISPA/vispa.html. Conclusions: ViSpA enables accurate viral quasispecies spectrum reconstruction from 454 pyrosequencing reads. We are currently exploring extensions applicable to the analysis of high-throughput sequencing data from bacterial metagenomic samples and ecological samples of eukaryote populations.

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evidence that the Human Pathogenic Fungus Cryptococcus neoformans var. grubii May Have Evolved in Africa

Most of the species of fungi that cause disease in mammals, including Cryptococcus neoformans var. grubii (serotype A), are exogenous and non-contagious. Cryptococcus neoformans var. grubii is associated worldwide with avian and arboreal habitats. This airborne, opportunistic pathogen is profoundly neurotropic and the leading cause of fungal meningitis. Patients with HIV/AIDS have been ravaged by cryptococcosis: an estimated one million new cases occur each year, and mortality approaches 50%. Using phylogenetic and population genetic analyses, we present evidence that C. neoformans var. grubii may have evolved from a diverse population in southern Africa. Our ecological studies support the hypothesis that a few of these strains acquired a new environmental reservoir, the excreta of feral pigeons (Columba livia), and were globally dispersed by the migration of birds and humans. This investigation also discovered a novel arboreal reservoir for highly diverse strains of C. neoformans var. grubii that are restricted to southern Africa, the mopane tree (Colophospermum mopane). This finding may have significant public health implications because these primal strains have optimal potential for evolution and because mopane trees contribute to the local economy as a source of timber, folkloric remedies and the edible mopane worm.

Public Library of Science (PLOS)

Crossref

PubMed Central

DukeSpace

Occlusion of Regulatory Sequences by Promoter Nucleosomes In Vivo

Author: A Almer
A Dhasarathy
A Nourani
AR Terrell
B Li
B Pina
C Mao
CC Adams
CD Carvin
CD Kaplan
Changhui Mao
Christopher R. Brown
D Rhodes
DJ Steger
E Segal
G Li
H Boeger
H Boeger
HD Kim
Hinrich Boeger
J Feser
J Griesenbeck
J Svaren
JL Workman
Joachim Griesenbeck
JP Magbanua
K Eisfeld
K Luger
K Merz
KD Fascher
KD Fascher
KJ Polach
Laszlo Tora
M Han
M Ransom
M Schmid
MW Adkins
MW Adkins
N Kaplan
P Korber
PC McAndrew
S Barbaric
SR Zabaronick
T Shimizu
TN Mavrich
U Venter
VG Norton
WH Wu
WR Bauer
Y Lorch
Y Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Nucleosomes are believed to inhibit DNA binding by transcription factors. Theoretical attempts to understand the significance of nucleosomes in gene expression and regulation are based upon this assumption. However, nucleosomal inhibition of transcription factor binding to DNA is not complete. Rather, access to nucleosomal DNA depends on a number of factors, including the stereochemistry of transcription factor-DNA interaction, the in vivo kinetics of thermal fluctuations in nucleosome structure, and the intracellular concentration of the transcription factor. In vitro binding studies must therefore be complemented with in vivo measurements. The inducible PHO5 promoter of yeast has played a prominent role in this discussion. It bears two binding sites for the transcriptional activator Pho4, which at the repressed promoter are positioned within a nucleosome and in the linker region between two nucleosomes, respectively. Earlier studies suggested that the nucleosomal binding site is inaccessible to Pho4 binding in the absence of chromatin remodeling. However, this notion has been challenged by several recent reports. We have therefore reanalyzed transcription factor binding to the PHO5 promoter in vivo, using ‘chromatin endogenous cleavage’ (ChEC). Our results unambiguously demonstrate that nucleosomes effectively interfere with the binding of Pho4 and other critical transcription factors to regulatory sequences of the PHO5 promoter. Our data furthermore suggest that Pho4 recruits the TATA box binding protein to the PHO5 promoter.

CiteSeerX

Public Library of Science (PLOS)

University of Regensburg Publication Server

Crossref

Directory of Open Access Journals

PubMed Central

Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm

Author: A Abyzov
A Martínez-Fundichely
AR Quinlan
BS Weir
C Stewart
CA Buerkle
Cristina Aguado
CW Whelan
David Vicente-Salvador
E Gazave
E Karakoc
ES Lander
F Hormozdiari
G Bhatia
GR Abecasis
H Li
H Li
H Li
H Shao
HYK Lam
J Berglund
J Wang
JC Venter
JJ Michaelson
JM Kidd
José Ignacio Lucas-Lledó
K Chen
KJ McKernan
M Cáceres
M Muñoz Amatriaín
M Nei
Mario Cáceres
PD Keightley
PH Sudmant
R Li
R Nielsen
R Xi
RB Corbett-Detig
RE Handsaker
RE Mills
S Girirajan
S Levy
SM Ahn
SS Sindi
SY Kim
T Zichner
V Guryev
W Huang
X Li
Y Wang
Z Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low-coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied unless a painstaking and potentially biased genotyping is performed first. Results: We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project. Conclusions: svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation.
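
The general approach described in this abstract can be sketched with a textbook expectation-maximization loop. The code below is not svgem and makes several simplifying assumptions that are not in the source: genotype priors follow Hardy-Weinberg proportions, each observation is independent, `bias` is the (hypothetical) probability that a heterozygote yields an observation of allele A, and `eps` is a small error rate for homozygotes. The function name and all parameters are illustrative only.

```python
def em_allele_frequency(counts, bias=0.5, eps=0.01, iters=200, tol=1e-9):
    """Estimate the frequency of allele A from per-individual allele counts.

    counts: list of (a, b) pairs, the number of times alleles A and B were
            observed in each individual.
    Returns (p, posteriors), where posteriors holds one dict of genotype
    posterior probabilities per individual.
    Assumption: Hardy-Weinberg genotype priors; not the svgem implementation.
    """
    # Probability that a single observation shows allele A, per genotype.
    obs_a = {"AA": 1.0 - eps, "AB": bias, "BB": eps}
    p = 0.5  # initial guess for the frequency of allele A
    posteriors = []
    for _ in range(iters):
        prior = {"AA": p * p, "AB": 2 * p * (1 - p), "BB": (1 - p) ** 2}
        # E-step: genotype posteriors (the binomial coefficient cancels).
        posteriors = []
        for a, b in counts:
            w = {g: prior[g] * obs_a[g] ** a * (1 - obs_a[g]) ** b
                 for g in prior}
            total = sum(w.values())
            posteriors.append({g: w[g] / total for g in w})
        # M-step: expected number of A alleles over all 2n chromosomes.
        new_p = sum(2 * q["AA"] + q["AB"] for q in posteriors) / (2 * len(counts))
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    return p, posteriors


# Toy data: two likely AA individuals, one likely AB, one likely BB.
p_hat, post = em_allele_frequency([(10, 0), (5, 5), (0, 10), (10, 0)])
```

Setting `bias` away from 0.5 models unequal sampling of the two alleles in heterozygotes, which is the situation the abstract highlights for structural variants. For very large counts the products would underflow, so a practical version would work in log space.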

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

PubMed Central

Diposit Digital de Documents de la UAB