
    An Ontology-Based Framework for Clinical Research Databases

    The Ontology-Based eXtensible data model (OBX) was developed to serve as a framework for the development of a clinical research database in the Immunology Database and Analysis Portal (ImmPort) system. OBX was designed around the logical structure provided by the Basic Formal Ontology (BFO) and the Ontology for Biomedical Investigations (OBI). By using the logical structure provided by these two well-formulated ontologies, we found that a relatively simple, extensible data model could be developed to represent the complex domain of clinical research. In addition, the common framework provided by the BFO should make it straightforward to utilize OBX database data dictionaries based on reference and application ontologies from the OBO Foundry.

    A distribution-free convolution model for background correction of oligonucleotide microarray data

    Introduction: Affymetrix GeneChip® high-density oligonucleotide arrays are widely used in biological and medical research because their production reproducibility facilitates the comparison of results between experiment runs. In order to obtain high-level classification and cluster analysis that can be trusted, it is important to perform various pre-processing steps on the probe-level data to control for variability in sample processing and array hybridization. Many proposed preprocessing methods are parametric, in that they assume that the background noise generated by microarray data is a random sample from a statistical distribution, typically a normal distribution. The quality of the final results depends on the validity of such assumptions. Results: We propose a Distribution-Free Convolution Model (DFCM) to circumvent observed deficiencies in meeting and validating the distributional assumptions of parametric methods. Knowledge of array structure and the biological function of the probes indicates that the intensities of mismatch (MM) probes that correspond to the smallest perfect match (PM) intensities can be used to estimate the background noise. Specifically, we obtain the smallest q2 percent of the MM intensities that are associated with the lowest q1 percent of PM intensities, and use these intensities to estimate background. Conclusion: Using the Affymetrix Latin Square spike-in experiments, we show that the background noise generated by microarray experiments typically is not well modeled by a single overall normal distribution. We further show that the signal is not exponentially distributed, as is also commonly assumed. As a result, DFCM has better sensitivity and specificity, as measured by ROC curves and area under the curve (AUC), than MAS 5.0, RMA, RMA with no background correction (RMA-noBG), GCRMA, PLIER, and dChip (MBEI) for preprocessing of Affymetrix microarray data. These comparisons, performed on two spike-in data sets and one real data set, show that our nonparametric method is a superior alternative for background correction of Affymetrix data.
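The background-estimation step described above (keep the smallest q2 percent of MM intensities among probes whose paired PM intensities fall in the lowest q1 percent) can be sketched in a few lines. The quantile defaults and the use of the mean as the final estimate are illustrative choices, not the paper's exact settings:

```python
import numpy as np

def dfcm_background(pm, mm, q1=0.25, q2=0.25):
    """Nonparametric background estimate in the spirit of DFCM.

    pm, mm : 1-D arrays of paired perfect-match / mismatch intensities.
    q1, q2 : illustrative quantile fractions (not the paper's defaults).
    Returns the background estimate (mean of the retained MM values)
    and the retained MM values themselves.
    """
    pm = np.asarray(pm, float)
    mm = np.asarray(mm, float)
    # MM probes whose paired PM intensity is in the lowest q1 fraction
    low_pm_mm = mm[pm <= np.quantile(pm, q1)]
    # keep the smallest q2 fraction of those MM intensities
    kept = np.sort(low_pm_mm)[: max(1, int(np.ceil(q2 * low_pm_mm.size)))]
    return kept.mean(), kept

# toy usage on simulated intensities
rng = np.random.default_rng(0)
pm = rng.lognormal(6, 1, 10_000)          # simulated PM intensities
mm = 50 + rng.exponential(20, 10_000)     # simulated MM intensities
bg, kept = dfcm_background(pm, mm)
```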

    Activation of the Syk tyrosine kinase is insufficient for downstream signal transduction in B lymphocytes

    BACKGROUND: Immature B lymphocytes and certain B cell lymphomas undergo apoptotic cell death following activation of the B cell antigen receptor (BCR) signal transduction pathway. Several biochemical changes occur in response to BCR engagement, including activation of the Syk tyrosine kinase. Although Syk activation appears to be necessary for some downstream biochemical and cellular responses, the signaling events that precede Syk activation remain ill-defined. In addition, the requirements for complete activation of the Syk-dependent signaling step remain to be elucidated. RESULTS: A mutant form of Syk carrying a combination of a K395A substitution in the kinase domain and substitutions of three phenylalanines (3F) for the three C-terminal tyrosines was expressed in a murine B cell lymphoma cell line, BCL(1).3B3, to interfere with normal Syk regulation as a means to examine the Syk activation step in BCR signaling. Introduction of this kinase-inactive mutant led to the constitutive activation of the endogenous wild-type Syk enzyme in the absence of receptor engagement through a 'dominant-positive' effect. Under these conditions, Syk kinase activation occurred in the absence of phosphorylation on Syk tyrosine residues. Although Syk appears to be required for BCR-induced apoptosis in several systems, no increase in spontaneous cell death was observed in these cells. Surprisingly, although the endogenous Syk kinase was enzymatically active, no enhancement in the phosphorylation of cytoplasmic proteins, including phospholipase Cγ2 (PLCγ2), a direct Syk target, was observed. CONCLUSION: These data indicate that activation of Syk kinase enzymatic activity is insufficient for Syk-dependent signal transduction, suggesting that other events are required for efficient signaling. We speculate that localization of the active enzyme to a receptor complex specifically assembled for signal transduction may be the missing event.

    A gene selection method for GeneChip array data with small sample sizes

    Background: In microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and to choose appropriate cutoff p-values for gene selection. Although permutation-based methods have proved to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete due to the limited number of permutations available at very small sample sizes. Furthermore, estimated permutation-based p-values for true nulls are highly correlated and not uniformly distributed between zero and one, making it difficult to use current false discovery rate (FDR)-controlling methods. Results: We propose a model-based information sharing method (MBIS) that, after an appropriate data transformation, utilizes information shared among genes. We use a normal distribution to model the mean differences of true nulls across two experimental conditions. The parameters of the model are then estimated using all data in hand. Based on this model, p-values, which are uniformly distributed for true nulls, are calculated. Then, since FDR-controlling methods are generally not well suited to microarray data with very small sample sizes, we select genes for a given cutoff p-value and then estimate the false discovery rate. Conclusion: Simulation studies and analysis of real microarray data show that the proposed method, MBIS, is more powerful and reliable than current methods, and it has wide application to a variety of situations.
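A rough sketch of the information-sharing idea can make the mechanics concrete. This is not the authors' exact MBIS procedure: the data transformation step is omitted, and robust estimators (median and MAD) stand in for the paper's model fitting; the point is only that one shared null distribution, estimated from all genes, yields well-behaved p-values even with tiny per-gene sample sizes:

```python
import numpy as np
from scipy import stats

def shared_null_pvalues(x, y):
    """Score each gene against a Normal null for mean differences,
    with the null's parameters estimated robustly from ALL genes.

    x, y : (genes, replicates) arrays for the two conditions.
    Returns two-sided p-values, one per gene.
    """
    d = x.mean(axis=1) - y.mean(axis=1)          # gene-wise mean differences
    mu = np.median(d)                            # robust centre of the null
    sigma = stats.median_abs_deviation(d, scale="normal")  # robust spread
    z = (d - mu) / sigma
    return 2 * stats.norm.sf(np.abs(z))          # two-sided p-values

# toy data: 1000 genes, 3 replicates per condition, 50 truly shifted genes
rng = np.random.default_rng(1)
x = rng.normal(0, 1, (1000, 3))
y = rng.normal(0, 1, (1000, 3))
y[:50] += 3                                      # "differentially expressed" genes
p = shared_null_pvalues(x, y)
```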

    Application of Random Matrix Theory to Biological Networks

    We show that the spectral fluctuations of the interaction matrices of a yeast core protein interaction network and a metabolic network follow the description of the Gaussian orthogonal ensemble (GOE) of random matrix theory (RMT). Furthermore, we demonstrate that while the global biological networks evaluated belong to the GOE, removal of interactions between constituents transitions the networks to systems of isolated modules described by the Poisson statistics of RMT. Our results indicate that although biological networks are very different from other complex systems at the molecular level, they display the same statistical properties at large scale. The transition point provides a new objective approach for the identification of functional modules.
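The statistic underlying this analysis is the nearest-neighbour spacing distribution of the matrix spectrum: GOE spectra show level repulsion (Wigner surmise, P(s) ∝ s·exp(-πs²/4)), while uncoupled modules give Poisson statistics, P(s) = exp(-s). The mean-spacing normalization below is a crude stand-in for proper spectral unfolding:

```python
import numpy as np

def spacing_distribution(adj):
    """Nearest-neighbour eigenvalue spacings of a symmetric interaction
    matrix, normalised so the mean spacing is 1 (crude unfolding)."""
    ev = np.sort(np.linalg.eigvalsh(adj))
    s = np.diff(ev)
    return s / s.mean()

# toy check: a dense symmetric Gaussian random matrix is GOE-like,
# so very small spacings are strongly suppressed relative to Poisson
rng = np.random.default_rng(2)
a = rng.normal(size=(400, 400))
s = spacing_distribution((a + a.T) / 2)
frac_small = (s < 0.1).mean()   # Poisson would give ~1 - exp(-0.1) ≈ 0.095
```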

    Automated Analysis of Flow Cytometry Data to Reduce Inter-Lab Variation in the Detection of Major Histocompatibility Complex Multimer-Binding T Cells

    Manual analysis of flow cytometry data and subjective gate-border decisions taken by individuals continue to be a source of variation in the assessment of antigen-specific T cells when comparing data across laboratories, and also over time in individual labs. Therefore, strategies to provide automated analysis of major histocompatibility complex (MHC) multimer-binding T cells represent an attractive solution to decrease subjectivity and technical variation. The challenge of using an automated analysis approach is that MHC multimer-binding T cell populations are often rare and therefore difficult to detect. We used a highly heterogeneous dataset from a recent MHC multimer proficiency panel to assess whether MHC multimer-binding CD8+ T cells could be analyzed with currently available computational solutions, and whether such analyses would reduce the technical variation across different laboratories. We used three different methods, FLOw Clustering without K (FLOCK), Scalable Weighted Iterative Flow-clustering Technique (SWIFT), and ReFlow, to analyze flow cytometry data files from 28 laboratories. Each laboratory screened for antigen-responsive T cell populations with frequencies ranging from 0.01 to 1.5% of lymphocytes within samples from two donors. Experience from this analysis shows that all three programs can be used for the identification of high- to intermediate-frequency MHC multimer-binding T cell populations, with results very similar to those of manual gating. For the less frequent populations (<0.1% of live, single lymphocytes), SWIFT outperformed the other tools. As used in this study, none of the algorithms offered a completely automated pipeline for identification of MHC multimer populations, as varying degrees of human intervention were needed to complete the analysis. In this study, we demonstrate the feasibility of using automated analysis pipelines for assessing and identifying even rare populations of antigen-responsive T cells, and discuss the main properties, differences, and advantages of the different methods tested.
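As a toy illustration of the automated-gating problem, the sketch below uses a data-driven robust threshold gate on simulated two-channel events with a rare (0.5%) multimer-positive population. This is a deliberately simple stand-in for FLOCK, SWIFT, and ReFlow, which use far more sophisticated clustering; the simulated channel values and gate rule are invented for illustration:

```python
import numpy as np

# simulate two-channel events: a bulk negative population plus a rare
# multimer-positive population at higher intensity on both channels
rng = np.random.default_rng(3)
neg = rng.normal([1.0, 1.0], 0.3, (9950, 2))   # bulk CD8+ events
pos = rng.normal([4.0, 4.0], 0.2, (50, 2))     # rare multimer-binding events (0.5%)
events = np.vstack([neg, pos])

# place the gate several robust SDs above the bulk on each channel;
# median/MAD are barely affected by the rare population
med = np.median(events, axis=0)
mad = np.median(np.abs(events - med), axis=0) * 1.4826  # MAD -> SD scale
gate = med + 5 * mad
in_gate = (events > gate).all(axis=1)
freq = in_gate.mean() * 100                     # % of events in the positive gate
```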

    Influenza research database: an integrated bioinformatics resource for influenza research and surveillance.

    Background: The recent emergence of the 2009 pandemic influenza A/H1N1 virus has highlighted the value of free and open access to influenza virus genome sequence data integrated with information about other important virus characteristics. Design: The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization, and comparative genomics analysis, together with login-protected personal 'workbench' spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature. Results: To demonstrate the utility of the data and analysis tools available in IRD, two scientific use cases are presented. A comparison of hemagglutinin sequence conservation and epitope coverage information revealed highly conserved protein regions that can be recognized by the human adaptive immune system as possible targets for inducing cross-protective immunity. Phylogenetic and geospatial analysis of sequences from wild bird surveillance samples revealed a possible evolutionary connection between influenza virus from Delaware Bay shorebirds and Alberta ducks. Conclusions: The IRD provides a wealth of integrated data and information about influenza virus to support research on the genetic determinants dictating virus pathogenicity, host range restriction, and transmission, and to facilitate the development of vaccines, diagnostics, and therapeutics.
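The conservation scan in the first use case boils down to a per-column tally over aligned sequences: the fraction of sequences sharing the most common residue at each alignment position. The toy alignment below is invented for illustration and is not actual hemagglutinin data:

```python
from collections import Counter

def conservation(alignment):
    """Per-position conservation for equal-length aligned sequences:
    the fraction sharing the most common residue at each column."""
    columns = zip(*alignment)
    return [Counter(col).most_common(1)[0][1] / len(alignment)
            for col in columns]

# toy alignment of four 10-residue sequences (invented for illustration)
seqs = ["MKTIIALSYI",
        "MKTIIALSYI",
        "MKAIIALSHI",
        "MKTIIVLSYI"]
scores = conservation(seqs)
conserved = [i for i, s in enumerate(scores) if s == 1.0]  # fully conserved columns
```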

    An improved ontological representation of dendritic cells as a paradigm for all cell types

    Background: Recent increases in the volume and diversity of life science data and information, and an increasing emphasis on data sharing and interoperability, have resulted in the creation of a large number of biological ontologies, including the Cell Ontology (CL), designed to provide a standardized representation of cell types for data annotation. Ontologies have been shown to have significant benefits for computational analyses of large data sets and for automated reasoning applications, leading to organized attempts to improve the structure and formal rigor of ontologies to better support computation. Currently, the CL employs multiple is_a relations, defining cell types in terms of histological, functional, and lineage properties, and the majority of definitions are written with sufficient generality to hold across multiple species. This approach limits the CL's utility for computation and for cross-species data integration. Results: To enhance the CL's utility for computational analyses, we developed a method for the ontological representation of cells and applied this method to develop a dendritic cell ontology (DC-CL). DC-CL subtypes are delineated on the basis of surface protein expression, systematically including both species-general and species-specific types and optimizing DC-CL for the analysis of flow cytometry data. We avoid multiple uses of is_a by linking DC-CL terms to terms in other ontologies via additional, formally defined relations such as has_function. Conclusion: This approach brings benefits in the form of increased accuracy, support for reasoning, and interoperability with other ontology resources. Accordingly, we propose our method as a general strategy for the ontological representation of cells. DC-CL is available from http://www.obofoundry.org.
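The modelling pattern, a single is_a parent plus formally defined cross-ontology relations such as has_function, can be sketched with a handful of triples. The term names and relations below are simplified illustrations, not actual CL/DC-CL identifiers:

```python
# each cell type keeps exactly one is_a parent; everything else hangs
# off named relations (illustrative terms, not real DC-CL identifiers)
triples = [
    ("dendritic cell",            "is_a",                     "leukocyte"),
    ("dendritic cell",            "has_function",             "antigen presentation"),
    ("CD8-alpha+ dendritic cell", "is_a",                     "dendritic cell"),
    ("CD8-alpha+ dendritic cell", "has_plasma_membrane_part", "CD8-alpha"),
]

def related(term, rel="is_a"):
    """All terms reachable from `term` along one relation, in order."""
    out, frontier = [], [term]
    while frontier:
        current = frontier.pop()
        for subj, r, obj in triples:
            if subj == current and r == rel:
                out.append(obj)
                frontier.append(obj)
    return out
```

Because every term has a single is_a chain, a reasoner can walk the hierarchy unambiguously while still querying the other relations independently.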

    VO: Vaccine Ontology

    Vaccine research, as well as the development, testing, clinical trials, and commercial use of vaccines, involves complex processes and varied biological data, including gene and protein expression, analysis of molecular and cellular interactions, study of tissue and whole-body responses, and extensive epidemiological modeling. Although many data resources are available to meet different vaccine-related needs, it remains a challenge to standardize vaccine annotation, integrate data about varied vaccine types and resources, and support advanced vaccine data analysis and inference. To address these problems, the community-based Vaccine Ontology (VO) has been developed through collaboration with vaccine researchers and many national and international centers and programs, including the National Center for Biomedical Ontology (NCBO), the Infectious Disease Ontology (IDO) Initiative, and the Ontology for Biomedical Investigations (OBI). VO utilizes the Basic Formal Ontology (BFO) as its top-level ontology and the Relation Ontology (RO) for the definition of term relationships. VO is represented in the Web Ontology Language (OWL) and edited using Protégé-OWL. Currently, VO contains more than 2000 terms and relationships. VO emphasizes the classification of vaccines and vaccine components, vaccine quality and phenotypes, and host immune responses to vaccines. These reflect different aspects of vaccine composition and biology and can thus be used to model individual vaccines. More than 200 licensed vaccines and many vaccine candidates in research or clinical trials have been modeled in VO. VO is being used for vaccine literature mining through collaboration with the National Center for Integrative Biomedical Informatics (NCIBI). Multiple VO applications will be presented.

    FuGEFlow: data model and markup language for flow cytometry

    Background: Flow cytometry technology is widely used in both health care and research. The rapid expansion of flow cytometry applications has outpaced the development of data storage and analysis tools. Collaborative efforts to close this gap include building common vocabularies and ontologies, designing generic data models, and defining data exchange formats. The Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard was recently adopted by the International Society for Advancement of Cytometry. This standard guides researchers on the information that should be included in peer-reviewed publications, but it is insufficient for data exchange and integration between computational systems. The Functional Genomics Experiment (FuGE) formalizes common aspects of comprehensive and high-throughput experiments across different biological technologies. We have extended the FuGE object model to accommodate flow cytometry data and metadata. Methods: We used the MagicDraw modelling tool to design a UML model (Flow-OM) according to the FuGE extension guidelines, and the AndroMDA toolkit to transform the model to a markup language (Flow-ML). We mapped each MIFlowCyt term to either an existing FuGE class or a new FuGEFlow class. The development environment was validated by comparing the official FuGE XSD to the schema we generated from the FuGE object model using our configuration. After the Flow-OM model was completed, the final version of Flow-ML was generated and validated against an example MIFlowCyt-compliant experiment description. Results: The extension of FuGE for flow cytometry has resulted in a generic FuGE-compliant data model (FuGEFlow), which accommodates and links together all information required by MIFlowCyt. The FuGEFlow model can be used to build software and databases using FuGE software toolkits, facilitating automated exchange and manipulation of potentially large flow cytometry experimental data sets. Additional project documentation, including reusable design patterns and a guide for setting up a development environment, was contributed back to the FuGE project. Conclusion: We have shown that an extension of FuGE can be used to transform minimum information requirements in natural language into a markup language in XML. Extending FuGE required significant effort, but in our experience the benefits outweighed the costs. FuGEFlow is expected to play a central role in describing flow cytometry experiments and ultimately in facilitating data exchange, including with public flow cytometry repositories currently under development.
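As a minimal illustration of serialising an experiment description to XML programmatically, the sketch below uses Python's standard library. The element and attribute names are invented for illustration; they are not the actual Flow-ML or MIFlowCyt schema:

```python
import xml.etree.ElementTree as ET

# build a tiny, hypothetical experiment-description document
# (element names are illustrative, not the real Flow-ML schema)
exp = ET.Element("FlowExperiment", id="FE-1")
ET.SubElement(exp, "Purpose").text = "T cell phenotyping"
sample = ET.SubElement(exp, "Sample", id="S-1")
ET.SubElement(sample, "Organism").text = "Homo sapiens"

# serialise to a string; a real pipeline would validate this
# against the schema (XSD) generated from the object model
xml_text = ET.tostring(exp, encoding="unicode")
```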