
129 research outputs found

Patterns of sequence conservation in presynaptic neural genes

Author: Bućan Maja
Hadley Dexter
Hannenhalli Sridhar
Kim Junhyong
Murphy Tara
Ungar Lyle
Valladares Otto
Publication venue: BioMed Central
Publication date: 10/11/2006

BACKGROUND: The neuronal synapse is a fundamental functional unit in the central nervous system of animals. Because synaptic function is evolutionarily conserved, we reasoned that functional sequences of genes and related genomic elements known to play important roles in neurotransmitter release would also be conserved. RESULTS: Evolutionary rate analysis revealed that presynaptic proteins evolve slowly, although some members of large gene families exhibit accelerated evolutionary rates relative to other family members. Comparative sequence analysis of 46 megabases spanning 150 presynaptic genes identified more than 26,000 elements that are highly conserved in eight vertebrate species, as well as a small subset of sequences (6%) that are shared among unrelated presynaptic genes. Analysis of large gene families revealed that upstream and intronic regions of closely related family members are extremely divergent. We also identified 504 exceptionally long conserved elements (≥360 base pairs, ≥80% pair-wise identity between human and other mammals) in intergenic and intronic regions of presynaptic genes. Many of these elements form a highly stable stem-loop RNA structure and consequently are candidates for novel regulatory elements, whereas some conserved noncoding elements are shown to correlate with specific gene expression profiles. The SynapseDB online database integrates these findings and other functional genomic resources for synaptic genes. CONCLUSION: Highly conserved elements in nonprotein coding regions of 150 presynaptic genes represent sequences that may be involved in the transcriptional or post-transcriptional regulation of these genes. Furthermore, comparative sequence analysis will facilitate selection of genes and noncoding sequences for future functional studies and analysis of variation studies in neurodevelopmental and psychiatric disorders
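
The length and identity thresholds quoted in the abstract above (≥360 base pairs, ≥80% pair-wise identity) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, and skipping gapped alignment columns when computing identity is an assumption made here for simplicity.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of ungapped alignment columns where both sequences agree.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def is_long_conserved(aligned_human: str, aligned_other: str,
                      min_length: int = 360,
                      min_identity: float = 0.80) -> bool:
    """Apply the two thresholds from the abstract to one aligned pair."""
    # Element length is counted in human bases, ignoring gap characters.
    human_length = sum(1 for base in aligned_human if base != "-")
    identity = pairwise_identity(aligned_human, aligned_other)
    return human_length >= min_length and identity >= min_identity
```

For example, a 400-base element whose aligned partner differs at 70 positions has 82.5% identity and would pass both thresholds under these assumptions.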

Springer - Publisher Connector

PubMed Central

ScholarlyCommons@Penn

SAVoR: A Server for Sequencing Annotation and Visualization of RNA Structures

Author: Childress Daniel M
Gregory Brian D
Li Fan
Ryvkin Paul
Valladares Otto
Wang Li-San
Publication venue: ScholarlyCommons
Publication date: 06/04/2012

RNA secondary structure is required for the proper regulation of the cellular transcriptome. This is because the functionality, processing, localization and stability of RNAs are all dependent on the folding of these molecules into intricate structures through specific base pairing interactions encoded in their primary nucleotide sequences. Thus, as the number of RNA sequencing (RNA-seq) data sets and the variety of protocols for this technology grow rapidly, it is becoming increasingly pertinent to develop tools that can analyze and visualize this sequence data in the context of RNA secondary structure. Here, we present Sequencing Annotation and Visualization of RNA structures (SAVoR), a web server, which seamlessly links RNA structure predictions with sequencing data and genomic annotations to produce highly informative and annotated models of RNA secondary structure. SAVoR accepts read alignment data from RNA-seq experiments and computes a series of per-base values such as read abundance and sequence variant frequency. These values can then be visualized on a customizable secondary structure model. SAVoR is freely available at http://tesla.pcbi.upenn.edu/savor
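
The per-base values mentioned in the abstract above (read abundance and sequence variant frequency) can be sketched as follows. This is a simplified illustration, not SAVoR's implementation: the function name is hypothetical, and reads are assumed to be ungapped alignments given as (0-based start, sequence) pairs against a single reference.

```python
def per_base_values(reference: str, reads: list[tuple[int, str]]):
    """Return (depth, variant_frequency) lists, one entry per reference base.

    Assumption: each read is an ungapped alignment described by its
    0-based start position and its base sequence.
    """
    depth = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if 0 <= position < len(reference):
                depth[position] += 1
                if base != reference[position]:
                    mismatches[position] += 1
    # Variant frequency is the mismatching fraction of covering reads;
    # uncovered positions are reported as 0.0.
    variant_frequency = [m / d if d else 0.0
                         for m, d in zip(mismatches, depth)]
    return depth, variant_frequency
```

Values such as these could then be mapped onto the nucleotides of a secondary structure drawing, for instance as a color scale.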

PubMed Central

ScholarlyCommons@Penn

Genome-wide expression profiling and bioinformatics analysis of diurnally regulated genes in the mouse prefrontal cortex

Author: Kai Wang
Maja Bucan
Otto Valladares
Shuzhang Yang
Sridhar Hannenhalli
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

BACKGROUND: The prefrontal cortex is important in regulating sleep and mood. Diurnally regulated genes in the prefrontal cortex may be controlled by the circadian system, by sleep:wake states, or by cellular metabolism or environmental responses. Bioinformatics analysis of these genes will provide insights into a wide range of pathways that are involved in the pathophysiology of sleep disorders and psychiatric disorders with sleep disturbances. RESULTS: We examined gene expression in the mouse prefrontal cortex at four time points during a 24 hour (12 hour light:12 hour dark) cycle using microarrays, and identified 3,890 transcripts corresponding to 2,927 genes with diurnally regulated expression patterns. We show that 16% of the genes identified in our study are orthologs of identified clock, clock controlled or sleep/wakefulness induced genes in the mouse liver and suprachiasmatic nucleus, rat cortex and cerebellum, or Drosophila head. The diurnal expression patterns were confirmed for 16 out of 18 genes in an independent set of RNA samples. The diurnal genes fall into eight temporal categories with distinct functional attributes, as assessed by Gene Ontology classification and analysis of enriched transcription factor binding sites. CONCLUSION: Our analysis demonstrates that approximately 10% of transcripts have diurnally regulated expression patterns in the mouse prefrontal cortex. Functional annotation of these genes will be important for the selection of candidate genes for behavioral mutants in the mouse and for genetic studies of disorders associated with anomalies in the sleep:wake cycle and circadian rhythm.

Crossref

Springer - Publisher Connector

PubMed Central

HAMR: High-Throughput Annotation of Modified Ribonucleotides

Author: Childress Micah
Dragomir Isabelle
Gregory Brian D
Leung Yuk Y
Ryvkin Paul
Silverman Ian M
Valladares Otto
Wang Li-San
Publication venue: ScholarlyCommons
Publication date: 01/12/2013

RNA is often altered post-transcriptionally by the covalent modification of particular nucleotides; these modifications are known to modulate the structure and activity of their host RNAs. The recent discovery that an RNA methyl-6 adenosine demethylase (FTO) is a risk gene in obesity has brought to light the significance of RNA modifications to human biology. These noncanonical nucleotides, when converted to cDNA in the course of RNA sequencing, can produce sequence patterns that are distinguishable from simple base-calling errors. To determine whether these modifications can be detected in RNA sequencing data, we developed a method that can not only locate these modifications transcriptome-wide with single nucleotide resolution, but can also differentiate between different classes of modifications. Using small RNA-seq data we were able to detect 92% of all known human tRNA modification sites that are predicted to affect RT activity. We also found that different modifications produce distinct patterns of cDNA sequence, allowing us to differentiate between two classes of adenosine and two classes of guanine modifications with 98% and 79% accuracy, respectively. To show the robustness of this method to sample preparation and sequencing methods, as well as to organismal diversity, we applied it to a publicly available yeast data set and achieved similar levels of accuracy. We also experimentally validated two novel and one known 3-methylcytosine (3mC) sites predicted by HAMR in human tRNAs. Researchers can now use our method to identify and characterize RNA modifications using only RNA-seq data, both retrospectively and when asking questions specifically about modified RNA

PubMed Central

ScholarlyCommons@Penn

Genome-Wide Double-Stranded RNA Sequencing Reveals the Functional Significance of Base-Paired RNAs in Arabidopsis

Author: Cao Kajia
Dragomir Isabelle
Gregory Brian D
Li Fan
Ryvkin Paul
Valladares Otto
Wang Li-San
Yang Jamie
Zheng Qi
Publication venue: ScholarlyCommons
Publication date: 01/09/2010

The functional structure of all biologically active molecules is dependent on intra- and inter-molecular interactions. This is especially evident for RNA molecules whose functionality, maturation, and regulation require formation of correct secondary structure through encoded base-pairing interactions. Unfortunately, intra- and inter-molecular base-pairing information is lacking for most RNAs. Here, we marry classical nuclease-based structure mapping techniques with high-throughput sequencing technology to interrogate all base-paired RNA in Arabidopsis thaliana and identify ∼200 new small (sm)RNA–producing substrates of RNA–DEPENDENT RNA POLYMERASE6. Our comprehensive analysis of paired RNAs reveals conserved functionality within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs, as well as a novel population of functional RNAs, many of which are the precursors of smRNAs. Finally, we identify intra-molecular base-pairing interactions to produce a genome-wide collection of RNA secondary structure models. Although our methodology reveals the pairing status of RNA molecules in the absence of cellular proteins, previous studies have demonstrated that structural information obtained for RNAs in solution accurately reflects their structure in ribonucleoprotein complexes. Furthermore, our identification of RNA–DEPENDENT RNA POLYMERASE6 substrates and conserved functional RNA domains within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs using this approach strongly suggests that RNA molecules are correctly folded into their secondary structure in solution. Overall, our findings highlight the importance of base-paired RNAs in eukaryotes and present an approach that should be widely applicable for the analysis of this key structural feature of RNA

Directory of Open Access Journals

PubMed Central

ScholarlyCommons@Penn

DASHR: Database of Small Human Noncoding RNAs

Author: Amlie-Wolf Alexandre
Gregory Brian D
Kannan Sampath
Kuksa Pavel P
Leung Yuk Y
Ungar Lyle H
Valladares Otto
Wang Li-San
Publication venue: ScholarlyCommons
Publication date: 08/11/2015

Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nucleotides long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single database provides searchable, unified annotation and expression information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. Here, we present the Database of small human noncoding RNAs (DASHR). DASHR contains the most comprehensive information to date on human sncRNA genes and mature sncRNA products. DASHR provides a simple user interface for researchers to view sequence and secondary structure, compare expression levels, and examine evidence of specific processing across all sncRNA genes and mature sncRNA products in various human tissues. DASHR annotation and expression data cover all major classes of sncRNAs including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear, nucleolar, cytoplasmic (sn-, sno-, scRNAs, respectively), transfer (tRNAs), and ribosomal RNAs (rRNAs). Currently, DASHR (v1.0) integrates 187 smRNA high-throughput sequencing (smRNA-seq) datasets with over 2.5 billion reads and annotation data from multiple public sources. DASHR contains annotations for ~48,000 human sncRNA genes and mature sncRNA products, 82% of which are expressed in one or more of the curated tissues. DASHR is available at http://lisanwanglab.org/DASHR

PubMed Central

ScholarlyCommons@Penn

Reassessment of Risk Genotypes (GRN, TMEM106B, and ABCC9 Variants) Associated with Hippocampal Sclerosis of Aging Pathology

Author: Ellingson Sally R.
Fardo David W.
Kukull Walter A.
Monsell Sarah E.
Naj Adam C.
Nelson Peter T.
Partch Amanda B.
Valladares Otto
Wang Li-San
Wang Wang-Xia
Wilfred Bernard R.
Publication venue: UKnowledge
Publication date: 01/01/2015

Hippocampal sclerosis of aging (HS-Aging) is a common high-morbidity neurodegenerative condition in elderly persons. To understand the risk factors for HS-Aging, we analyzed data from the Alzheimer’s Disease Genetics Consortium and correlated the data with clinical and pathologic information from the National Alzheimer’s Coordinating Center database. Overall, 268 research volunteers with HS-Aging and 2,957 controls were included; detailed neuropathologic data were available for all. The study focused on single-nucleotide polymorphisms previously associated with HS-Aging risk: rs5848 (GRN), rs1990622 (TMEM106B), and rs704180 (ABCC9). Analyses of a subsample that was not previously evaluated (51 HS-Aging cases and 561 controls) replicated the associations of previously identified HS-Aging risk alleles. To test for evidence of gene-gene interactions and genotype-phenotype relationships, pooled data were analyzed. The risk for HS-Aging diagnosis associated with these genetic polymorphisms was not secondary to an association with either Alzheimer disease or dementia with Lewy body neuropathologic changes. The presence of multiple risk genotypes was associated with a trend for additive risk for HS-Aging pathology. We conclude that multiple genes play important roles in HS-Aging, which is a distinctive neurodegenerative disease of aging.

Crossref

PubMed Central

University of Kentucky

Global Analysis of RNA Secondary Structure in Two Metazoans

Author: Aiyer Subhadra
Bambina Shelley
Cherry Sara
Desai Yaanik
Dragomir Isabelle
Gregory Brian D
Lamitina Todd
Li Fan
Murray John I
Rai Arjun
Ryvkin Paul
Sabin Leah R
Valladares Otto
Wang Li-San
Yang Jamie
Zheng Qi
Publication venue: ScholarlyCommons
Publication date: 01/01/2012

The secondary structure of RNA is necessary for its maturation, regulation, processing, and function. However, the global influence of RNA folding in eukaryotes is still unclear. Here, we use a high-throughput, sequencing-based, structure-mapping approach to identify the paired (double-stranded RNA [dsRNA]) and unpaired (single-stranded RNA [ssRNA]) components of the Drosophila melanogaster and Caenorhabditis elegans transcriptomes, which allows us to identify conserved features of RNA secondary structure in metazoans. From this analysis, we find that ssRNAs and dsRNAs are significantly correlated with specific epigenetic modifications. Additionally, we find key structural patterns across protein-coding transcripts that indicate that RNA folding demarcates regions of protein translation and likely affects microRNA-mediated regulation of mRNAs in animals. Finally, we identify and characterize 546 mRNAs whose folding pattern is significantly correlated between these metazoans, suggesting that their structure has some function. Overall, our findings provide a global assessment of RNA folding in animals

Elsevier - Publisher Connector

Directory of Open Access Journals

ScholarlyCommons@Penn

Whole exome sequencing study identifies novel rare and common Alzheimer's-Associated variants involved in immune response and transcriptional regulation

Author: Alzheimers Dis Sequencing Project
Bis Joshua C.
Cantwell Laura
Daly Mark
Havulinna Aki S.
Helisalmi Seppo
Hiltunen Mikko
Jian Xueqiu
Kaprio Jaakko
Kunkle Brian W.
Kurki Mitja
Lehtimäki Tero
Mattila Kari
Neale Benjamin
Palotie Aarno
Perola Markus
Remes Anne M.
Salomaa Veikko
Soininen Hilkka
Valladares Otto
Wang Li-San
Publication date: 01/08/2020

Correction: Volume: 25 Issue: 8 Pages: 1901-1903 DOI: 10.1038/s41380-019-0529-7. The Alzheimer's Disease Sequencing Project (ADSP) undertook whole exome sequencing in 5,740 late-onset Alzheimer disease (AD) cases and 5,096 cognitively normal controls primarily of European ancestry (EA), among whom 218 cases and 177 controls were Caribbean Hispanic (CH). An age-, sex- and APOE-based risk score and family history were used to select cases most likely to harbor novel AD risk variants and controls least likely to develop AD by age 85 years. We tested ~1.5 million single nucleotide variants (SNVs) and 50,000 insertion-deletion polymorphisms (indels) for association with AD, using multiple models considering individual variants as well as gene-based tests aggregating rare, predicted functional, and loss of function variants. Sixteen single variants and 19 genes that met criteria for significant or suggestive associations after multiple-testing correction were evaluated for replication in four independent samples; three with whole exome sequencing (2,778 cases, 7,262 controls) and one with genome-wide genotyping imputed to the Haplotype Reference Consortium panel (9,343 cases, 11,527 controls). The top findings in the discovery sample were also followed up in the ADSP whole-genome sequenced family-based dataset (197 members of 42 EA families and 501 members of 157 CH families). We identified novel and predicted functional genetic variants in genes previously associated with AD. We also detected associations in three novel genes: IGHG3 (p = 9.8 × 10⁻⁷), an immunoglobulin gene whose antibodies interact with beta-amyloid, a long non-coding RNA AC099552.4 (p = 1.2 × 10⁻⁷), and a zinc-finger protein ZNF655 (gene-based p = 5.0 × 10⁻⁶). The latter two suggest an important role for transcriptional regulation in AD pathogenesis.

Helsingin yliopiston digitaalinen arkisto

Human Whole-Exome Genotype Data for Alzheimer's Disease

Author: Brkanac Zoran
Bush William S
Cantwell Laura
Chou Yi-Fan
Clark Kaylyn
Cruchaga Carlos
Destefano Anita
Farrer Lindsay
Gangadharan Prabhakaran
Haines Jonathan
Hamilton-Nelson Kara
Kuzma Amanda B
Lee Wan-Ping
Leung Yuk Yee
Lin Honghuang
Martin Eden
Mayeux Richard P
Naj Adam C
Nicaretta Heather
Pericak-Vance Margaret
Qu Liming
Schellenberg Gerard D
Schmidt Michael
Seshadri Sudha
Valladares Otto
Wang Li-San
Wheeler Nicholas
Publication venue: DigitalCommons@TMC
Publication date: 23/01/2024

The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotype called variant call format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.

DigitalCommons@The Texas Medical Center