107 research outputs found

A quantitative reference transcriptome for Nematostella vectensis early embryonic development : a pipeline for de novo assembly in emerging model systems

Author: Aguiar Derek
Istrail Sorin
Smith Joel
Tulin Sarah
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

© The Author(s), 2013. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in EvoDevo 4 (2013): 16, doi:10.1186/2041-9139-4-16. The de novo assembly of transcriptomes from short shotgun sequences raises challenges due to random and non-random sequencing biases and inherent transcript complexity. We sought to define a pipeline for de novo transcriptome assembly to aid researchers working with emerging model systems where well annotated genome assemblies are not available as a reference. To detail this experimental and computational method, we used early embryos of the sea anemone, Nematostella vectensis, an emerging model system for studies of animal body plan evolution. We performed RNA-seq on embryos up to 24 h of development using Illumina HiSeq technology and evaluated independent de novo assembly methods. The resulting reads were assembled using either the Trinity assembler on all quality controlled reads or both the Velvet and Oases assemblers on reads passing a stringent digital normalization filter. A control set of mRNA standards from the National Institute of Standards and Technology (NIST) was included in our experimental pipeline to invest our transcriptome with quantitative information on absolute transcript levels and to provide additional quality control. We generated >200 million paired-end reads from directional cDNA libraries representing well over 20 Gb of sequence. The Trinity assembler pipeline, including preliminary quality control steps, resulted in more than 86% of reads aligning with the reference transcriptome thus generated. Nevertheless, digital normalization combined with assembly by Velvet and Oases required far less computing power and decreased processing time while still mapping 82% of reads. We have made the raw sequencing reads and assembled transcriptome publicly available.
Nematostella vectensis was chosen for its strategic position in the tree of life for studies into the origins of the animal body plan. However, the challenge of reference-free transcriptome assembly is relevant to all systems for which well annotated gene models and independently verified genome assembly may not be available. To navigate this new territory, we have constructed a pipeline for library preparation and computational analysis for de novo transcriptome assembly. The gene models defined by this reference transcriptome define the set of genes transcribed in early Nematostella development and will provide a valuable dataset for further gene regulatory network investigations.
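
The abstract does not describe how its digital normalization filter works. As a rough illustration only, the sketch below uses the commonly described median k-mer abundance idea: a read is kept only while the median count of its k-mers, among reads kept so far, is below a cutoff. The function names, the exact (non-probabilistic) counting, and the values of `k` and `cutoff` are assumptions for this example and are not taken from the paper.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalize(reads, k=20, cutoff=20):
    """Keep a read only if its median k-mer count so far is below cutoff.

    Hypothetical sketch: exact counting in a Counter, so it is only
    suitable for small inputs.
    """
    counts = Counter()
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # read shorter than k: nothing to count
        # Counter returns 0 for unseen k-mers without inserting them.
        if median(counts[km] for km in ks) < cutoff:
            kept.append(read)
            counts.update(ks)
    return kept
```

With small illustrative values such as `k=4` and `cutoff=2`, repeated copies of a highly covered read are discarded once its k-mers are sufficiently represented, while a read with unseen k-mers is still kept.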

Woods Hole Open Access Server

Springer - Publisher Connector

PubMed Central

The Pagenumber of Genus g Graph is O(g)

Author: Heath Lenwood S.
Istrail Sorin
Publication date: 01/01/1990

In 1979, Bernhart and Kainen conjectured that graphs of fixed genus g greater than or equal to 1 have unbounded pagenumber. This paper proves that genus g graphs can be embedded in O(g) pages, thus disproving the conjecture. An Ω(√g) lower bound is also derived. The first algorithm in the literature for embedding an arbitrary graph in a book with a non-trivial upper bound on the number of pages is presented. First, the algorithm computes the genus g of a graph using the algorithm of Filotti, Miller, and Reif (1979), which is polynomial-time for fixed genus. Second, it applies an optimal-time algorithm for obtaining an O(g)-page book embedding. We give separate book embedding algorithms for the cases of graphs embedded in orientable and nonorientable surfaces. An important aspect of the construction is a new decomposition algorithm, of independent interest, for a graph embedded on a surface. Book embedding has applications in several areas, two of which are directly related to the results obtained: fault-tolerant VLSI and complexity theory.
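
For readers unfamiliar with the term, a book embedding places the vertices along a line (the spine) and assigns each edge to a page so that no two edges on the same page cross. The sketch below is not the paper's embedding algorithm; it only checks whether a given spine order and page assignment form a valid book embedding. The function names and the input representation are assumptions made for this example.

```python
from itertools import combinations


def edges_cross(pos, e, f):
    """Two edges on one page cross iff their endpoints interleave on the spine."""
    a, b = sorted((pos[e[0]], pos[e[1]]))
    c, d = sorted((pos[f[0]], pos[f[1]]))
    return a < c < b < d or c < a < d < b


def is_book_embedding(spine, pages):
    """Check a candidate book embedding.

    spine: list of vertices in their order along the spine.
    pages: list of pages, each a list of (u, v) edges drawn on that page.
    """
    pos = {v: i for i, v in enumerate(spine)}
    return all(
        not edges_cross(pos, e, f)
        for page in pages
        for e, f in combinations(page, 2)
    )
```

For example, with spine order 0, 1, 2, 3 the complete graph on four vertices needs its two diagonals (0, 2) and (1, 3) on different pages, because their endpoints interleave.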

Computer Science Technical Reports @Virginia Tech

Functional cis-regulatory genomics for systems biology

Author: Eric H. Davidson
Jongmin Nam
Ping Dong
Ryan Tarpine
Sorin Istrail
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 23/02/2010

Gene expression is controlled by interactions between trans-regulatory factors and cis-regulatory DNA sequences, and these interactions constitute the essential functional linkages of gene regulatory networks (GRNs). Validation of GRN models requires experimental cis-regulatory tests of predicted linkages to authenticate their identities and proposed functions. However, cis-regulatory analysis is, at present, at a severe bottleneck in genomic system biology because of the demanding experimental methodologies currently in use for discovering cis-regulatory modules (CRMs) in the genome and for measuring their activities. Here we demonstrate a high-throughput approach to both discovery and quantitative characterization of CRMs. The unique aspect is the use of DNA sequence tags to “barcode” CRM expression constructs, which can then be mixed, injected together into sea urchin eggs, and subsequently deconvolved. This method has increased the rate of cis-regulatory analysis by >100-fold compared with conventional one-by-one reporter assays. The utility of the DNA-tag reporters was demonstrated by the rapid discovery of 81 active CRMs from 37 previously unexplored sea urchin genes. We then obtained simultaneous high-resolution temporal characterization of the regulatory activities of more than 80 CRMs. On average, 2–3 CRMs were discovered per gene. Comparison of endogenous gene expression profiles with those of the CRMs recovered from each gene showed that, for most cases, at least one CRM is active in each phase of endogenous expression, suggesting that CRM recovery was comprehensive. This approach will qualitatively alter the practice of GRN construction as well as validation, and will impact many additional areas of regulatory system biology.

Crossref

PubMed Central

Caltech Authors

DELISHUS: an efficient and exact algorithm for genome-wide detection of deletion polymorphism in autism

Author: Bjarni V. Halldórsson
Derek Aguiar
Eric M. Morrow
Sorin Istrail
Publication venue: Oxford University Press

Motivation: The understanding of the genetic determinants of complex disease is undergoing a paradigm shift. Genetic heterogeneity of rare mutations with deleterious effects is more commonly being viewed as a major component of disease. Autism is an excellent example where research is active in identifying matches between the phenotypic and genomic heterogeneities. A considerable portion of autism appears to be correlated with copy number variation, which is not directly probed by single nucleotide polymorphism (SNP) array or sequencing technologies. Identifying the genetic heterogeneity of small deletions remains a major unresolved computational problem, partly due to the inability of algorithms to detect them.

Crossref

PubMed Central