28,839 research outputs found

Mercury BLASTN Biosequence Similarity Search System: Technical Reference Guide

Author: Buhler Jeremy
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2011

This guide documents the operation of the Mercury BLASTN system for hardware-accelerated DNA similarity search. It includes detailed information on the syntax and limitations of the system's component commands, as well as a description of the system's hardware platform suitable for administrators who need to maintain a Mercury BLASTN system. Mercury BLASTN is a product of the High Performance Computational Biology Group at Washington University

Washington University St. Louis: Open Scholarship

REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads.

Author: Chu Chong
Nielsen Rasmus
Wu Yufeng
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a powerful new computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Small Open Reading Frames, Non-Coding RNAs and Repetitive Elements in Bradyrhizobium japonicum USDA 110

Author: Cuklina Jelena
Evguenieva-Hackenberg Elena
Gelfand Mikhail S.
Hahn Julia
Thalmann Sebastian
Tsoy Olga V.
Publication venue: FB 08 - Biologie und Chemie. Biologie
Publication date: 01/01/2016

Small open reading frames (sORFs) and genes for non-coding RNAs are poorly investigated components of most genomes. Our analysis of 1391 ORFs recently annotated in the soybean symbiont Bradyrhizobium japonicum USDA 110 revealed that 78% of them contain fewer than 80 codons. Twenty-one of these sORFs are conserved in or outside Alphaproteobacteria and most of them are similar to genes found in transposable elements, in line with their broad distribution. Stabilizing selection was demonstrated for sORFs with proteomic evidence and bll1319_ISGA which is conserved at the nucleotide level in 16 alphaproteobacterial species, 79 species from other taxa and 49 other Proteobacteria. Further, we used Northern blot hybridization to validate ten small RNAs (BjsR1 to BjsR10) belonging to new RNA families. We found that BjsR1 and BjsR3 have homologs outside the genus Bradyrhizobium, and BjsR5, BjsR6, BjsR7, and BjsR10 have up to four imperfect copies in Bradyrhizobium genomes. BjsR8, BjsR9, and BjsR10 are present exclusively in nodules, while the other sRNAs are also expressed in liquid cultures. We also found that the level of BjsR4 decreases after exposure to tellurite and iron, and this down-regulation contributes to survival under high iron conditions. Analysis of additional small RNAs overlapping with 3′-UTRs revealed two new repetitive elements named Br-REP1 and Br-REP2. These REP elements may play roles in genomic plasticity and gene regulation and could be useful for strain identification by PCR-fingerprinting. Furthermore, we studied two potential toxin genes in the symbiotic island and confirmed toxicity of the yhaV homolog bll1687 but not of the newly annotated higB homolog blr0229_ISGA in E. coli. Finally, we revealed transcription interference resulting in an antisense RNA complementary to blr1853, a gene induced in symbiosis. The presented results expand our knowledge of sORFs, non-coding RNAs and repetitive elements in B. japonicum and related bacteria

Repository for Publications and Research Data

Crossref

Giessener Elektronische Bibliothek

PubMed Central

FigShare

The fate of Arabidopsis thaliana homeologous CNSs and their motifs in the Paleohexaploid Brassica rapa.

Author: Freeling Michael
Pires J
Subramaniam Sabarinath
Wang Xiaowu
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Following polyploidy, duplicate genes are often deleted, and if they are not, then duplicate regulatory regions are sometimes lost. By what mechanism is this loss and what is the chance that such a loss removes function? To explore these questions, we followed individual Arabidopsis thaliana-A. thaliana conserved noncoding sequences (CNSs) into the Brassica ancestor, through a paleohexaploidy and into Brassica rapa. Thus, a single Brassicaceae CNS has six potential orthologous positions in B. rapa; a single Arabidopsis CNS has three potential homeologous positions. We reasoned that a CNS, if present on a singlet Brassica gene, would be unlikely to lose function compared with a more redundant CNS, and this is the case. Redundant CNSs go nondetectable often. Using this logic, each mechanism of CNS loss was assigned a metric of functionality. By definition, proved deletions do not function as sequence. Our results indicated that CNSs that go nondetectable by base substitution or large insertion are almost certainly still functional (redundancy does not matter much to their detectability frequency), whereas those lost by inferred deletion or indels are approximately 75% likely to be nonfunctional. Overall, an average nondetectable, once-redundant CNS more than 30 bp in length has a 72% chance of being nonfunctional, and that makes sense because 97% of them sort to a molecular mechanism with deletion in its description, but base substitutions do cause loss. Similarly, proved-functional G-boxes go undetectable by deletion 82% of the time. Fractionation mutagenesis is a procedure that uses polyploidy as a mutagenic agent to genetically alter RNA expression profiles, and then to construct testable hypotheses as to the function of the lost regulatory site. We show fractionation mutagenesis to be a deletion machine in the Brassica lineage

PubMed Central

eScholarship - University of California

Split and Merge Functions for Supporting Multiple Processing Pipelines in Mercury BLASTN

Author: Ahir Jwalant
Buhler Jeremy
Chamberlain Roger D.
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2010

Biosequence similarity search is an important application in computational biology. Mercury BLASTN, an FPGA-based implementation of BLAST for DNA, is one of the alternatives for fast DNA sequence comparison. The re-design of BLAST into a streaming application, combined with a high-throughput hardware pipeline, has enabled Mercury BLAST to emerge as one of the fastest implementations of biosequence similarity search. This performance can be further enhanced by exploiting the data-level parallelism present within the application. Here we present a multiple FPGA-based Mercury BLASTN design in order to double the speed and throughput of DNA sequence computation. This paper describes a dual Mercury BLASTN design, the detailed design of the split and merge functions, and simulation results

Washington University St. Louis: Open Scholarship

Non-invasive genetic monitoring for the threatened valley elderberry longhorn beetle.

Author: Baerwald Melinda
Goodbla Alisha
Graves Emily
Holyoak Marcel
Nagarajan Raman P
Schreier Andrea
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

The valley elderberry longhorn beetle (VELB), Desmocerus californicus dimorphus (Coleoptera: Cerambycidae), is a federally threatened subspecies endemic to the Central Valley of California. The VELB range partially overlaps with that of its morphologically similar sister taxon, the California elderberry longhorn beetle (CELB), Desmocerus californicus californicus (Coleoptera: Cerambycidae). Current surveying methods are limited to visual identification of larval exit holes in the VELB/CELB host plant, elderberry (Sambucus spp.), into which larvae bore and excavate feeding galleries. Unbiased genetic approaches could provide a much-needed complementary approach that has more precision than relying on visual inspection of exit holes. In this study we developed a DNA sequencing-based method for indirect detection of VELB/CELB from frass (insect fecal matter), which can be easily and non-invasively collected from exit holes. Frass samples were collected from 37 locations and the 12S and 16S mitochondrial genes were partially sequenced using nested PCR amplification. Three frass-derived sequences showed 100% sequence identity to VELB/CELB barcode references from museum specimens sequenced for this study. Database queries of frass-derived sequences also revealed high similarity to common occupants of old VELB feeding galleries, including earwigs, flies, and other beetles. Overall, this non-invasive approach is a first step towards a genetic assay that could augment existing VELB monitoring and accurately discriminate between VELB, CELB, and other insects. Furthermore, a phylogenetic analysis of 12S and 16S data from museum specimens revealed evidence for the existence of a previously unrecognized, genetically distinct CELB subpopulation in southern California

Directory of Open Access Journals

eScholarship - University of California

Hyb: A bioinformatics pipeline for the analysis of CLASH (crosslinking, ligation and sequencing of hybrids) data

Author: Aleksandra Helwak
Anthony J. Travis
David Tollervey
Grzegorz Kudla
Jonathan Moody
Publication venue: Elsevier BV
Publication date: 06/11/2013

Peer reviewed. Publisher PD

Aberdeen University Research

Crossref

PubMed Central

Edinburgh Research Explorer

Reconstructing an ancestral genotype of two hexachlorocyclohexane-degrading Sphingobium species using metagenomic sequence data.

Author: Gilbert Jack A
Khurana Jitendra P
Khurana Paramjit
Kumar Roshan
Lal Rup
Lax Simon
Negi Vivek
Sangwan Naseer
Verma Helianthous
Publication venue: eScholarship, University of California
Publication date: 01/02/2014

Over the last 60 years, the use of hexachlorocyclohexane (HCH) as a pesticide has resulted in the production of >4 million tons of HCH waste, which has been dumped in open sinks across the globe. Here, the combination of the genomes of two genetic subspecies (Sphingobium japonicum UT26 and Sphingobium indicum B90A; isolated from two discrete geographical locations, Japan and India, respectively) capable of degrading HCH, with metagenomic data from an HCH dumpsite (∼450 mg HCH per g soil), enabled the reconstruction and validation of the last-common ancestor (LCA) genotype. Mapping the LCA genotype (3128 genes) to the subspecies genomes demonstrated that >20% of the genes in each subspecies were absent in the LCA. This includes two enzymes from the 'upper' HCH degradation pathway, suggesting that the ancestor was unable to degrade HCH isomers, but descendants acquired lin genes by transposon-mediated lateral gene transfer. In addition, anthranilate and homogentisate degradation traits were found to be strain (selectively retained only by UT26) and environment (absent in the LCA and subspecies, but prevalent in the metagenome) specific, respectively. One draft secondary chromosome, two near complete plasmids and eight complete lin transposons were assembled from the metagenomic DNA. Collectively, these results reinforce the elastic nature of the genus Sphingobium, and describe the evolutionary acquisition mechanism of a xenobiotic degradation phenotype in response to environmental pollution. This also demonstrates for the first time the use of metagenomic data in ancestral genotype reconstruction, highlighting its potential to provide significant insight into the development of such phenotypes

eScholarship - University of California