577 research outputs found

Implementation of GenePattern within the Stanford Microarray Database

Author: Brazma
C. A. Ball
F. Wymore
G. Sherlock
H. Jin
J. Demeter
J. Hubble
M. Mao
M. Nitzberg
T. B. K. Reddy
Z. K. Zachariah
Publication venue: Oxford University Press

Hundreds of researchers across the world use the Stanford Microarray Database (SMD; http://smd.stanford.edu/) to store, annotate, view, analyze and share microarray data. In addition to providing registered users at Stanford access to their own data, SMD also provides access to public data, and tools with which to analyze those data, to any public user anywhere in the world. Previously, the addition of new microarray data analysis tools to SMD has been limited by available engineering resources, and in addition, the existing suite of tools did not provide a simple way to design, execute and share analysis pipelines, or to document such pipelines for the purposes of publication. To address this, we have incorporated the GenePattern software package directly into SMD, providing access to many new analysis tools, as well as a plug-in architecture that allows users to directly integrate and share additional tools through SMD. In this article, we describe our implementation of the GenePattern microarray analysis software package into the SMD code base. This extension is available with the SMD source code that is fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD with an enriched data analysis capability

Crossref

PubMed Central

The Stanford Microarray Database accommodates additional microarray platforms and data formats

Author: Awad Ihab A. B.
Ball Catherine A.
Brown Patrick O.
Demeter Janos
Gollub Jeremy
Hebert Joan M.
Hernandez-Boussard Tina
Jin Heng
Matese John C.
Nitzberg Michael
Sherlock Gavin
Wymore Farrell
Zachariah Zachariah K.
Publication venue: Oxford University Press
Publication date: 17/12/2004

The Stanford Microarray Database (SMD) (http://smd.stanford.edu) is a research tool for hundreds of Stanford researchers and their collaborators. In addition, SMD functions as a resource for the entire biological research community by providing unrestricted access to microarray data published by SMD users and by disseminating its source code. In addition to storing GenePix (Axon Instruments) and ScanAlyze output from spotted microarrays, SMD has recently added the ability to store, retrieve, display and analyze the complete raw data produced by several additional microarray platforms and image analysis software packages, so that we can also now accept data from Affymetrix GeneChips (MAS5/GCOS or dChip), Agilent Catalog or Custom arrays (using Agilent's Feature Extraction software) or data created by SpotReader (Niles Scientific). We have implemented software that allows us to accept MAGE-ML documents from array manufacturers and to submit MIAME-compliant data in MAGE-ML format directly to ArrayExpress and GEO, greatly increasing the ease with which data from SMD can be published adhering to accepted standards and also increasing the accessibility of published microarray data to the general public. We have introduced a new tool to facilitate data sharing among our users, so that datasets can be shared during, before or after the completion of data analysis. The latest version of the source code for the complete database package was released in November 2004 (http://smd.stanford.edu/download/), allowing researchers around the world to deploy their own installations of SMD

CiteSeerX

Crossref

PubMed Central

The Longhorn Array Database (LAD): An Open-Source, MIAME compliant implementation of the Stanford Microarray Database (SMD)

Author: Iyer Vishwanath R
Killion Patrick J
Sherlock Gavin
Publication venue: BioMed Central
Publication date: 20/08/2003

BACKGROUND: The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. DESCRIPTION: The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux. It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases. LAD is available at CONCLUSIONS: Our development of LAD provides a simple, free, open, reliable and proven solution for storage and analysis of two-color microarray data

Springer - Publisher Connector

PubMed Central

The Stanford Microarray Database: implementation of new analysis tools and open source release of software

Author: Ball Catherine A.
Beauheim Catherine
Brown Patrick O.
Demeter Janos
Gollub Jeremy
Hernandez-Boussard Tina
Jin Heng
Maier Donald
Matese John C.
Nitzberg Michael
Sherlock Gavin
Wymore Farrell
Zachariah Zachariah K.
Publication venue: Oxford University Press
Publication date: 20/12/2006

The Stanford Microarray Database (SMD) is a research tool and archive that allows hundreds of researchers worldwide to store, annotate, analyze and share data generated by microarray technology. SMD supports most major microarray platforms, is MIAME-supportive, and can export or import MAGE-ML. The primary mission of SMD is to be a research tool that supports researchers from the point of data generation to data publication and dissemination, but it also provides unrestricted access to analysis tools and public data from 300 publications. In addition to supporting ongoing research, SMD makes its source code fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD. In this article, we describe several data analysis tools implemented in SMD and we discuss features of our software release

Crossref

PubMed Central

Genomic data analysis using grid-based computing

Author: Rekapalli Bhanu Prasad
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2003

Microarray experiments generate a plethora of genomic data; therefore we need techniques and architectures to analyze this data more quickly. This thesis presents a solution for reducing the computation time of a highly computationally intensive data analysis part of a genomic application. The application used is the Stanford Microarray Database (SMD). SMD's implementation, working, and analysis features are described. The reasons for choosing the computationally intensive problems of the SMD, and the background importance of these problems are presented. This thesis presents an effective parallel solution to the computational problem, including the difficulties faced with the parallelization of the problem and the results achieved. Finally, future research directions for achieving even greater speedups are presented

University of Tennessee, Knoxville: Trace

Inferring gene ontologies from pairwise similarity data.

Author: Bafna Vineet
Dutkowski Janusz
Ideker Trey
Kramer Michael
Yu Michael
Publication venue: eScholarship, University of California
Publication date: 01/06/2014

MOTIVATION: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. METHODS: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. RESULTS: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ~30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall). CONCLUSION: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data
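
The idea described in the abstract above, finding maximal cliques in a network induced by progressively thresholding a similarity matrix, can be illustrated with a small sketch. This is not the published CliXO algorithm: it is a simplified illustration under stated assumptions, using plain Bron-Kerbosch clique enumeration, and the function names, toy similarity values and thresholds are all hypothetical.

```python
from itertools import combinations

def threshold_graph(sim, names, t):
    """Build adjacency sets linking every pair with similarity >= t."""
    adj = {n: set() for n in names}
    for i, j in combinations(range(len(names)), 2):
        if sim[i][j] >= t:
            adj[names[i]].add(names[j])
            adj[names[j]].add(names[i])
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found

def terms_by_threshold(sim, names, thresholds):
    """Lower the threshold step by step; record each new clique of size >= 2
    together with the threshold at which it first appears."""
    terms = {}
    for t in sorted(thresholds, reverse=True):
        for clique in maximal_cliques(threshold_graph(sim, names, t)):
            if len(clique) >= 2 and clique not in terms:
                terms[clique] = t
    return terms

# Toy, made-up similarity matrix for four hypothetical genes.
names = ["a", "b", "c", "d"]
sim = [
    [1.0, 0.9, 0.6, 0.2],
    [0.9, 1.0, 0.6, 0.2],
    [0.6, 0.6, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
]
terms = terms_by_threshold(sim, names, [0.9, 0.6, 0.2])
for clique, t in terms.items():
    print(sorted(clique), t)
```

In this toy example gene `c` ends up in two overlapping cliques at the same threshold, which is the kind of multiple membership (pleiotropy) that strict hierarchical clustering cannot represent.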

PubMed Central

eScholarship - University of California

The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

Author: Abderrahmane Tagmount
C Iseli
CA Ball
CE Cook
CM Linardic
Erika Lindquist
F Nardi
G Sherlock
J Colbourne
J Demeter
J Forment
J Gollub
J Haig
J Hubble
J Stillman
JH Stillman
JK Colbourne
JL Boore
JM Mallatt
Jonathon H. Stillman
Kristen S. Teranishi
KS Teranishi
LH Heckmann
MC Ungerer
Mei Wang
Mike Wong
PH Lenz
PJ Killion
RD Finn
RJ Waggett
S Schaack
Samir K. Brahmachari
SF Altschul
Shinichi Sunagawa
TF Smith
TheUniProtConsortium
TW Jeffries
X Huang
Y Moriya
Yoshihiro Tanaka
Publication venue: Public Library of Science
Publication date: 27/01/2010

BACKGROUND: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. METHODOLOGY/PRINCIPAL FINDINGS: A set of approximately 30K unique sequences (UniSeqs) representing approximately 19K clusters was generated from approximately 98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. CONCLUSIONS/SIGNIFICANCE: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is as diverse as the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UNT Digital Library

VisANT 3.0: new modules for pathway visualization, editing, prediction and construction

Author: Ball
Barrett
Bolan Linghu
Charles DeLisi
Chung
Chunnuan Chen
Dahlquist
David M. Ng
Demir
DeRisi
Fukuda
Gagneur
Gavin
Ge
Hasegawa
Herman
Hu
Jansen
Joe Mellor
Joshi-Tope
Joshua M. Stuart
Junker
Kanehisa
Keseler
Kitano
Klukas
Lashkari
Mellor
Minoru Kanehisa
Mlecnik
Ng
Owen
Ravasz
Saraiya
Segal
Shannon
Shuichi Kawashima
Spirin
Stuart
Sugiyama
Takuji Yamada
Zhenjun Hu
Publication venue: Oxford University Press
Publication date: 01/01/2007

With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu

CiteSeerX

Crossref

Boston University Institutional Repository (OpenBU)

PubMed Central