Systems biology via redescription and ontologies (I): finding phase changes with applications to malaria temporal data

Author: B Zeeberg
BR Zeeberg
Bud Mishra
J Ernst
Kevin Casey
M Antoniotti
M Ashburner
N Friedman
N Slonim
PT Spellman
R Cilibrasi
Samantha Kleinberg
TM Cover
W Clark
Z Bar-Joseph
Publication venue: Springer Netherlands
Publication date: 01/12/2007

Biological systems are complex and often composed of many subtly interacting components. Furthermore, such systems evolve through time and, as the underlying biology executes its genetic program, the relationships between components change and undergo dynamic reorganization. Characterizing these relationships precisely is a challenging task, but one that must be undertaken if we are to understand these systems in sufficient detail. One set of tools that may prove useful is the formal principles of model building and checking, which could allow the biologist to frame these inherently temporal questions in a sufficiently rigorous framework. In response to these challenges, GOALIE (Gene ontology algorithmic logic and information extractor) was developed and has been successfully employed in the analysis of high-throughput biological data (e.g. time-course gene-expression microarray data and neural spike train recordings). The method has applications to a wide variety of temporal data, indeed any data for which there exist ontological descriptions. This paper describes the algorithms behind GOALIE and its use in the study of the Intraerythrocytic Developmental Cycle (IDC) of Plasmodium falciparum, the parasite responsible for a deadly form of chloroquine-resistant malaria. We focus in particular on the problem of finding phase changes, times of reorganization of transcriptional control

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

SpliceMiner: a high-throughput database implementation of the NCBI Evidence Viewer for microarray splice variant analysis

Author: Jamison D Curtis
Kahn Ari B
Liu Hongfang
Ryan Michael C
Weinstein John N
Zeeberg Barry R
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: There are many fewer genes in the human genome than there are expressed transcripts. Alternative splicing is the reason. Alternatively spliced transcripts are often specific to tissue type, developmental stage, environmental condition, or disease state. Accurate analysis of microarray expression data and design of new arrays for alternative splicing require assessment of probes at the sequence and exon levels. DESCRIPTION: SpliceMiner is a web interface for querying Evidence Viewer Database (EVDB). EVDB is a comprehensive, non-redundant compendium of splice variant data for human genes. We constructed EVDB as a queryable implementation of the NCBI Evidence Viewer (EV). EVDB is based on data obtained from NCBI Entrez Gene and EV. The automated EVDB build process uses only complete coding sequences, which may or may not include partial or complete 5' and 3' UTRs, and filters redundant splice variants. Unlike EV, which supports only one-at-a-time queries, SpliceMiner supports high-throughput batch queries and provides results in an easily parsable format. SpliceMiner maps probes to splice variants, effectively delineating the variants identified by a probe. CONCLUSION: EVDB can be queried by gene symbol, genomic coordinates, or probe sequence via a user-friendly web-based tool we call SpliceMiner. The EVDB/SpliceMiner combination provides an interface with human splice variant information and, going beyond the very valuable NCBI Evidence Viewer, supports fluent, high-throughput analysis. Integration of EVDB information into microarray analysis and design pipelines has the potential to improve the analysis and bioinformatic interpretation of gene expression data, for both batch and interactive processing. For example, whenever a gene expression value is recognized as important or appears anomalous in a microarray experiment, the interactive mode of SpliceMiner can be used quickly and easily to check for possible splice variant issues

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

TGFβ Signaling Increases Net Acid Extrusion, Proliferation and Invasion in Panc-1 Pancreatic Cancer Cells: SMAD4 Dependence and Link to Merlin/NF2 Signaling

Author: Christensen Søren T.
Ludwig Mette Q
Malinda Raj R.
Pedersen Lotte B.
Pedersen Stine F.
Sharku Patricia C.
Zeeberg Katrine
Publication venue: Frontiers Media SA
Publication date: 01/01/2020

Copenhagen University Research Information System

SpliceCenter: A suite of web-based bioinformatic applications for evaluating the impact of alternative splicing on RT-PCR, RNAi, microarray, and peptide-based studies

Author: Caplen Natasha J
Cleland James A
Kahn Ari B
Liu Hongfang
Ryan Michael C
Weinstein John N
Zeeberg Barry R
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Over 60% of protein-coding genes in vertebrates express mRNAs that undergo alternative splicing. The resulting collection of transcript isoforms poses significant challenges for contemporary biological assays. For example, RT-PCR validation of gene expression microarray results may be unsuccessful if the two technologies target different splice variants. Effective use of sequence-based technologies requires knowledge of the specific splice variant(s) that are targeted. In addition, the critical roles of alternative splice forms in biological function and in disease suggest that assay results may be more informative if analyzed in the context of the targeted splice variant. RESULTS: A number of contemporary technologies are used for analyzing transcripts or proteins. To enable investigation of the impact of splice variation on the interpretation of data derived from those technologies, we have developed SpliceCenter. SpliceCenter is a suite of user-friendly, web-based applications that includes programs for analysis of RT-PCR primer/probe sets, effectors of RNAi, microarrays, and protein-targeting technologies. Both interactive and high-throughput implementations of the tools are provided. The interactive versions of SpliceCenter tools provide visualizations of a gene's alternative transcripts and probe target positions, enabling the user to identify which splice variants are or are not targeted. The high-throughput batch versions accept user query files and provide results in tabular form. When, for example, we used SpliceCenter's batch siRNA-Check to process the Cancer Genome Anatomy Project's large-scale shRNA library, we found that only 59% of the 50,766 shRNAs in the library target all known splice variants of the target gene, 32% target some but not all, and 9% do not target any currently annotated transcript. CONCLUSION: SpliceCenter (http://discover.nci.nih.gov/splicecenter) provides unique, user-friendly applications for assessing the impact of transcript variation on the design and interpretation of RT-PCR, RNAi, gene expression microarrays, antibody-based detection, and mass spectrometry proteomics. The tools are intended for use by bench biologists as well as bioinformaticists.
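
The three-way split reported for the shRNA library (targets all variants, some, or none) can be illustrated with a minimal sketch. This is not SpliceCenter's siRNA-Check code: the function name `classify_target_coverage`, the exact-substring matching rule, and the toy sequences are all assumptions made for illustration.

```python
def classify_target_coverage(target_seq, variants):
    """Classify a target sequence as hitting "all", "some", or "none" of a gene's variants.

    `variants` maps a variant name to its transcript sequence.
    Matching is by exact substring, which is an illustrative assumption only.
    """
    if not target_seq:
        raise ValueError("target sequence must be non-empty")
    if not variants:
        raise ValueError("at least one splice variant is required")
    # Collect the names of the variants that contain the target sequence.
    hits = [name for name, seq in variants.items() if target_seq in seq]
    if len(hits) == len(variants):
        return "all"
    return "some" if hits else "none"


# Toy, made-up sequences (hypothetical, not real transcripts).
toy_variants = {
    "variant_1": "AAACCCGGGTTT",
    "variant_2": "AAACCCTTT",
}
```

With these toy inputs, a target placed in the shared region is classified as hitting every variant, while a target inside a segment found in only one variant is classified as partial coverage.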

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Nonlinear gene cluster analysis with labeling for microarray gene expression data in organ development

Author: A Sturn
B Zeeberg
Barry R Zeeberg
Brian P Brooks
CA Suàrez-Quian
CL Sigulinsky
DJ Mordantameron
Gene Ontology Consortium
Jacob Brown
JD Brown
JN Weinstein
K Pearson
L Kaufman
M Ashburner
M Belkin
Martin Ehler
P Langfelder
RF Bonner
Robert F Bonner
S Reichman
SP Lloyd
SR Goldstein
T Hastie
T Hestilow
Vinodh N Rajapakse
W Czaja
Wojciech Czaja
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

RedundancyMiner: De-replication of redundant GO categories in microarray and proteomics analysis

Author: A Alexa
A Sturn
AJ Richards
Ari B Kahn
Barry R Zeeberg
BR Zeeberg
Brian P Brooks
C Herrmann
Hongfang Liu
J Wang
Jacob D Brown
JN Weinstein
John N Weinstein
K Prufer
M Ashburner
Martin Ehler
P Pehkonen
Robert F Bonner
S Bauer
S Grossmann
T Xu
Vinodh N Rajapakse
Vladimir L Larionov
William Reinhold
Y Lu
Yves G Pommier
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: The Gene Ontology (GO) Consortium organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. Tools such as GoMiner can leverage GO to perform ontological analysis of microarray and proteomics studies, typically generating a list of significant functional categories. Two or more of the categories are often redundant, in the sense that identical or nearly-identical sets of genes map to the categories. The redundancy might typically inflate the report of significant categories by a factor of three-fold, create an illusion of an overly long list of significant categories, and obscure the relevant biological interpretation. RESULTS: We now introduce a new resource, RedundancyMiner, that de-replicates the redundant and nearly-redundant GO categories that had been determined by first running GoMiner. The main algorithm of RedundancyMiner, MultiClust, performs a novel form of cluster analysis in which a GO category might belong to several category clusters. Each category cluster follows a "complete linkage" paradigm. The metric is a similarity measure that captures the overlap in gene mapping between pairs of categories. CONCLUSIONS: RedundancyMiner effectively eliminated redundancies from a set of GO categories. For illustration, we have applied it to the clarification of the results arising from two current studies: (1) assessment of the gene expression profiles obtained by laser capture microdissection (LCM) of serial cryosections of the retina at the site of final optic fissure closure in the mouse embryos at specific embryonic stages, and (2) analysis of a conceptual data set obtained by examining a list of genes deemed to be "kinetochore" genes.
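
The idea of an overlap-based similarity combined with a "complete linkage" requirement can be sketched as follows. The abstract does not specify the metric, so the Jaccard index used here is an assumption, and the sketch does not reproduce MultiClust itself. The function names, category labels, and gene identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(genes_a, genes_b):
    """Overlap of two gene sets: size of the intersection divided by size of the union."""
    union = genes_a | genes_b
    return len(genes_a & genes_b) / len(union) if union else 0.0


def is_complete_linkage_cluster(categories, members, threshold):
    """True if EVERY pair of member categories is at least `threshold` similar.

    This is the complete-linkage condition: a cluster is only as tight as its
    least similar pair. `categories` maps a category name to its set of genes.
    """
    return all(
        jaccard(categories[x], categories[y]) >= threshold
        for x, y in combinations(members, 2)
    )


# Hypothetical categories and gene identifiers, for illustration only.
toy_categories = {
    "cat_a": {"G1", "G2", "G3"},
    "cat_b": {"G1", "G2", "G3"},
    "cat_c": {"G1", "G2", "G3", "G4"},
    "cat_d": {"G7", "G8"},
}
```

Because the condition is checked per candidate cluster and not by partitioning, the same category can satisfy it in more than one cluster, which is consistent with the abstract's remark that a category might belong to several category clusters.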

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Infectious Disease Ontology

Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain

PhilPapers

CiteSeerX

Crossref

Adding a Little Reality to Building Ontologies for Biology

Author: A Rector
AP Seyed
B Russell
B Smith
B Zeeberg
G Merrill
I Johansson
Iddo Friedberg
J Shrager
K Wolstencroft
M Ashburner
M Dumontier
M Egana
P Grenon
P Lord
Phillip Lord
PL Whetzel
PW Lord
R Stevens
Robert Stevens
S Schulz
T Gruber
W Ceusters
Publication venue: Public Library of Science
Publication date: 01/01/2010

BACKGROUND: Many areas of biology are open to mathematical and computational modelling. The application of discrete, logical formalisms defines the field of biomedical ontologies. Ontologies have been put to many uses in bioinformatics. The most widespread is for description of entities about which data have been collected, allowing integration and analysis across multiple resources. There are now over 60 ontologies in active use, increasingly developed as large, international collaborations. There are, however, many opinions on how ontologies should be authored; that is, what is appropriate for representation. Recently, a common opinion has been the "realist" approach that places restrictions upon the style of modelling considered to be appropriate. METHODOLOGY/PRINCIPAL FINDINGS: Here, we use a number of case studies for describing the results of biological experiments. We investigate the ways in which these could be represented using both realist and non-realist approaches; we consider the limitations and advantages of each of these models. CONCLUSIONS/SIGNIFICANCE: From our analysis, we conclude that while realist principles may enable straight-forward modelling for some topics, there are crucial aspects of science and the phenomena it studies that do not fit into this approach; realism appears to be over-simplistic which, perversely, results in overly complex ontological models. We suggest that it is impossible to avoid compromise in modelling ontology; a clearer understanding of these compromises will better enable appropriate modelling, fulfilling the many needs for discrete mathematical models within computational biology

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

ToppCluster: a multiple gene list feature analyzer for comparative enrichment clustering and network-based dissection of biological systems

Author: A. G. Jegga
B. J. Aronow
Chen
Dennis
E. E. Bardes
Eisen
Kimura
Langfelder
Liu
Miano
Naya
PONTOGLIO
Reich
S. C. Tabar
Schnabel
Shannon
Taraviras
Usadel
V. Kaimal
Zeeberg
Zhang
Publication venue: Oxford University Press

ToppCluster is a web server application that leverages a powerful enrichment analysis and underlying data environment for comparative analyses of multiple gene lists. It generates heatmaps or connectivity networks that reveal functional features shared or specific to multiple gene lists. ToppCluster uses hypergeometric tests to obtain list-specific feature enrichment P-values for currently 17 categories of annotations of human-ortholog genes, and provides user-selectable cutoffs and multiple testing correction methods to control false discovery. Each nameable gene list represents a column input to a resulting matrix whose rows are overrepresented features, and individual cells per-list P-values and corresponding genes per feature. ToppCluster provides users with choices of tabular outputs, hierarchical clustering and heatmap generation, or the ability to interactively select features from the functional enrichment matrix to be transformed into XGMML or GEXF network format documents for use in Cytoscape or Gephi applications, respectively. Here, as an example, we demonstrate the ability of ToppCluster to enable identification of list-specific phenotypic and regulatory element features (both cis-elements and 3′UTR microRNA binding sites) among tissue-specific gene lists. ToppCluster’s functionalities enable the identification of specialized biological functions and regulatory networks and systems biology-based dissection of biological states. ToppCluster can be accessed freely at http://toppcluster.cchmc.org
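
A hypergeometric enrichment P-value of the kind mentioned in this abstract can be computed as in the sketch below. It shows the standard upper-tail test using only the Python standard library; it is not ToppCluster's implementation, and the function and parameter names are assumptions chosen for readability.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, list_size, overlap):
    """Upper-tail hypergeometric P-value.

    Probability of observing at least `overlap` annotated genes in a list of
    `list_size` genes drawn without replacement from `population` genes, of
    which `annotated` carry the feature being tested.
    """
    if not (0 <= annotated <= population and 0 <= list_size <= population):
        raise ValueError("annotated and list_size must lie between 0 and population")
    first = max(overlap, 0)                 # overlaps below zero are impossible
    last = min(annotated, list_size)        # largest overlap that can occur
    total = comb(population, list_size)     # number of equally likely gene lists
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(first, last + 1)
    )
    return tail / total


# Example: 10 genes in total, 4 annotated, a list of 3 genes, 2 of them annotated.
p_value = hypergeom_enrichment_p(population=10, annotated=4, list_size=3, overlap=2)
```

Running the test once per feature and per gene list would give a matrix of P-values like the one the abstract describes. A multiple testing correction would then be applied across those values, which this sketch leaves out.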

Crossref

PubMed Central

Gitools: Analysis and Visualisation of Genomic Data Using Interactive Heat-Maps

Author: A Floratos
A Mascarell-Creus
A Sturn
A Subramanian
AI Saeed
B Usadel
BR Zeeberg
Christian Perez-Llamas
D Smedley
DW Huang
G Gundem
I Ferreiro
I Medina
J Chen
J Hou
JN Weinstein
M Ashburner
M Hall
M Kanehisa
M Kapushesky
M Reich
MA Sartor
MC Whitlock
N Lopez-Bigas
Nuria Lopez-Bigas
P Pavlidis
R Shamir
S Holm
Stein Aerts
TJP Hubbard
V Rodilla
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2011

Intuitive visualization of data and results is very important in genomics, especially when many conditions are to be analyzed and compared. Heat-maps have proven very useful for the representation of biological data. Here we present Gitools (http://www.gitools.org), an open-source tool to perform analyses and visualize data and results as interactive heat-maps. Gitools contains data import systems from several sources (i.e. IntOGen, Biomart, KEGG, Gene Ontology), which facilitate the integration of novel data with previous knowledge

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

UPF Digital Repository