12 research outputs found

STORMSeq: An Open-Source, User-Friendly Pipeline for Processing Personal Genomics Data in the Cloud

Author: Dudley Joel T.
Fernald Guy Haskin
Karczewski Konrad J.
Martin Alicia R.
Snyder Michael
Tatonetti Nicholas P.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

The increasing public availability of personal complete genome sequencing data has ushered in an era of democratized genomics. However, read mapping and variant calling software is constantly improving, and individuals with personal genomic data may prefer to customize and update their variant calls. Here, we describe STORMSeq (Scalable Tools for Open-Source Read Mapping), a graphical interface cloud computing solution that does not require a parallel computing environment or extensive technical experience. This customizable and modular system performs read mapping, read cleaning, and variant calling and annotation. At present, STORMSeq costs approximately $2 and 5–10 hours to process a full exome sequence and $30 and 3–8 days to process a whole genome sequence. We provide this open-access and open-source resource as a user-friendly interface in Amazon EC2.

Crossref

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

FigShare

Using Molecular Features of Xenobiotics to Predict Hepatic Gene Expression Response

Author: Guy Haskin Fernald (509684)
Russ B. Altman (6158)

Despite recent advances in molecular medicine and rational drug design, many drugs still fail because toxic effects arise at the cellular and tissue level. In order to better understand these effects, cellular assays can generate high-throughput measurements of gene expression changes induced by small molecules. However, our understanding of how the chemical features of small molecules influence gene expression is very limited. Therefore, we investigated the extent to which chemical features of small molecules can reliably be associated with significant changes in gene expression. Specifically, we analyzed the gene expression response of rat liver cells to 170 different drugs and searched for genes whose expression could be related to chemical features alone. Surprisingly, we can predict the up-regulation of 87 genes (increased expression of at least 1.5 times compared to controls). We show an average cross-validation predictive area under the receiver operating characteristic curve (AUROC) of 0.7 or greater for each of these 87 genes. We applied our method to an external data set of rat liver gene expression response to a novel drug and achieved an AUROC of 0.7. We also validated our approach by predicting up-regulation of Cytochrome P450 1A2 (CYP1A2) in three drugs known to induce CYP1A2 that were not in our data set. Finally, a detailed analysis of the CYP1A2 predictor allowed us to identify which fragments made significant contributions to the predictive scores
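The evaluation described in this abstract rests on two definitions: a gene counts as up-regulated when its expression is at least 1.5 times the control, and predictor quality is summarized by the AUROC. The sketch below illustrates both in plain Python. It is not the authors' code: the function names, the toy fold changes, and the toy scores are all hypothetical, and the AUROC is computed with the standard pairwise (Mann-Whitney) formulation rather than any method confirmed by the source.

```python
from typing import List, Sequence


def label_upregulated(fold_changes: Sequence[float], threshold: float = 1.5) -> List[int]:
    """Label a drug as 1 if the gene's expression is at least `threshold`
    times the control level, else 0 (hypothetical helper)."""
    return [1 if fc >= threshold else 0 for fc in fold_changes]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (invented for illustration): fold change of one gene under four
# drugs, and a predictor's score for each drug.
toy_fold_changes = [2.0, 1.0, 1.6, 0.9]
toy_scores = [0.9, 0.4, 0.3, 0.2]

toy_labels = label_upregulated(toy_fold_changes)
toy_auroc = auroc(toy_labels, toy_scores)
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported value of 0.7 should be read.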

CiteSeerX

FigShare

A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports

Author: Guy Haskin Fernald
Nicholas P Tatonetti
Russ B Altman
Publication venue: 'BMJ'

Crossref

Bioinformatics challenges for personalized medicine

Author: Altman Russ B.
Capriotti Emidio
Daneshjou Roxana
Fernald Guy Haskin
Karczewski Konrad J.
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: Widespread availability of low-cost, full genome sequencing will introduce new challenges for bioinformatics

Crossref

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Genome-Wide Network Analysis Reveals the Global Properties of IFN-β Immediate Transcriptional Effects in Humans

Author: Andrew Pachner
Guy Haskin Fernald
Jorge R. Oksenberg
Kavitha Narayan
Parvin Mousavi
Sergio E. Baranzini
Simon Knott
Stacy J. Caillier
Publication venue: 'The American Association of Immunologists'

Crossref

Approximate costs for STORMSeq.

Author: Alicia R. Martin (509685)
Guy Haskin Fernald (509684)
Joel T. Dudley (159355)
Konrad J. Karczewski (107090)
Michael Snyder (16695)
Nicholas P. Tatonetti (387205)

Note that these costs are approximate and may depend on a number of factors related to the input files.

FigShare

Overview of the STORMSeq system.

Author: Alicia R. Martin (509685)
Guy Haskin Fernald (509684)
Joel T. Dudley (159355)
Konrad J. Karczewski (107090)
Michael Snyder (16695)
Nicholas P. Tatonetti (387205)

The user uploads short reads to Amazon S3 and starts a webserver on Amazon EC2, which controls the mapping and variant calling pipeline. Progress can be monitored on the webserver and results are uploaded to persistent storage on Amazon S3.

FigShare

Sample output.

Author: Alicia R. Martin (509685)
Guy Haskin Fernald (509684)
Joel T. Dudley (159355)
Konrad J. Karczewski (107090)
Michael Snyder (16695)
Nicholas P. Tatonetti (387205)

STORMSeq provides basic visualization for summary statistics, such as (A) genome-wide SNP density and (B) size distribution of short indels.
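As a rough illustration of the two summary statistics named in this caption, the sketch below counts SNPs per fixed-size genomic window and tallies indel sizes from reference/alternate allele pairs. It is a minimal example under stated assumptions, not STORMSeq's implementation: the function names, the window size, and the toy variants are hypothetical, positions are assumed to be 1-based, and indel size is taken as the alternate allele length minus the reference allele length.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def snp_density(positions: Iterable[int], window: int) -> Dict[int, int]:
    """Count SNPs per window on one chromosome.
    Positions are 1-based; window index 0 covers positions 1..window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return dict(Counter((pos - 1) // window for pos in positions))


def indel_size_distribution(variants: Iterable[Tuple[str, str]]) -> Dict[int, int]:
    """Tally indel sizes from (ref, alt) allele pairs.
    Positive sizes are insertions, negative sizes are deletions;
    equal-length substitutions (size 0) are skipped."""
    counts: Counter = Counter()
    for ref, alt in variants:
        size = len(alt) - len(ref)
        if size != 0:
            counts[size] += 1
    return dict(counts)


# Toy inputs, invented for illustration only.
toy_positions = [1, 500, 1000, 1001, 2500]
toy_variants = [("A", "AT"), ("ATG", "A"), ("A", "G"), ("C", "CTT"), ("GA", "G")]

density = snp_density(toy_positions, window=1000)
indel_sizes = indel_size_distribution(toy_variants)
```

Either dictionary can be passed to a plotting library to produce a per-window density track or a histogram of indel sizes.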

FigShare