An Ontology-Based Framework for Clinical Research Databases

Author: Carl Dahlke
David Karp
Megan Kong
Richard H. Scheuermann
Publication venue
Publication date: 27/07/2009
Field of study

The Ontology-Based eXtensible data model (OBX) was developed to serve as a framework for the development of a clinical research database in the Immunology Database and Analysis Portal (ImmPort) system. OBX was designed around the logical structure provided by the Basic Formal Ontology (BFO) and the Ontology for Biomedical Investigations (OBI). By using the logical structure provided by these two well-formulated ontologies, we have found that a relatively simple, extensible data model could be developed to represent the relatively complex domain of clinical research. In addition, the common framework provided by the BFO should make it straightforward to utilize OBX database data dictionaries based on reference and application ontologies from the OBO Foundry.

Nature Precedings

An improved ontological representation of dendritic cells as a paradigm for all cell types

Author: Arighi Cecilia N.
Chris Mungall
Cowell Lindsay G.
Diehl Alexander D.
Lieberman Anne E.
Maria Masci Anna
Scheuermann Richard H.
Smith Barry
Publication venue
Publication date: 01/01/2009
Field of study

The Cell Ontology (CL) is designed to provide a standardized representation of cell types for data annotation. Currently, the CL employs multiple is_a relations, defining cell types in terms of histological, functional, and lineage properties, and the majority of definitions are written with sufficient generality to hold across multiple species. This approach limits the CL’s utility for cross-species data integration. To address this problem, we developed a method for the ontological representation of cells and applied this method to develop a dendritic cell ontology (DC-CL). DC-CL subtypes are delineated on the basis of surface protein expression, systematically including both species-general and species-specific types and optimizing DC-CL for the analysis of flow cytometry data. This approach brings benefits in the form of increased accuracy, support for reasoning, and interoperability with other ontology resources.

104. Barry Smith, “Toward a Realistic Science of Environments”, Ecological Psychology, 2009, 21 (2), April-June, 121-130. Abstract: The perceptual psychologist J. J. Gibson embraces a radically externalistic view of mind and action. We have, for Gibson, not a Cartesian mind or soul, with its interior theater of contents and the consequent problem of explaining how this mind or soul and its psychological environment can succeed in grasping physical objects external to itself. Rather, we have a perceiving, acting organism, whose perceptions and actions are always already tuned to the parts and moments, the things and surfaces, of its external environment. We describe how on this basis Gibson sought to develop a realist science of environments which will be ‘consistent with physics, mechanics, optics, acoustics, and chemistry’.

PhilPapers

A distribution-free convolution model for background correction of oligonucleotide microarray data

Author: Chen Zhongxue
Deng Youping
Kong Megan
Liu Qingzhong
McGee Monnie
Scheuermann Richard H
Publication venue: BioMed Central
Publication date: 01/01/2009
Field of study

Introduction: Affymetrix GeneChip® high-density oligonucleotide arrays are widely used in biological and medical research because of production reproducibility, which facilitates the comparison of results between experiment runs. In order to obtain high-level classification and cluster analysis that can be trusted, it is important to perform various pre-processing steps on the probe-level data to control for variability in sample processing and array hybridization. Many proposed preprocessing methods are parametric, in that they assume that the background noise generated by microarray data is a random sample from a statistical distribution, typically a normal distribution. The quality of the final results depends on the validity of such assumptions. Results: We propose a Distribution Free Convolution Model (DFCM) to circumvent observed deficiencies in meeting and validating distribution assumptions of parametric methods. Knowledge of array structure and the biological function of the probes indicate that the intensities of mismatched (MM) probes that correspond to the smallest perfect match (PM) intensities can be used to estimate the background noise. Specifically, we obtain the smallest q2 percent of the MM intensities that are associated with the lowest q1 percent PM intensities, and use these intensities to estimate background. Conclusion: Using the Affymetrix Latin Square spike-in experiments, we show that the background noise generated by microarray experiments typically is not well modeled by a single overall normal distribution. We further show that the signal is not exponentially distributed, as is also commonly assumed. Therefore, DFCM has better sensitivity and specificity, as measured by ROC curves and area under the curve (AUC), than MAS 5.0, RMA, RMA with no background correction (RMA-noBG), GCRMA, PLIER, and dChip (MBEI) for preprocessing of Affymetrix microarray data. These results hold for two spike-in data sets and one real data set that were analyzed.
Comparisons with other methods on two spike-in data sets and one real data set show that our nonparametric methods are a superior alternative for background correction of Affymetrix data.
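
The probe-selection step described in the abstract above (take the MM intensities paired with the lowest q1 percent of PM intensities, then keep the smallest q2 percent of those) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the rounding of the percent cutoffs, and the final mean/standard-deviation summary are hypothetical choices made here, not the estimator used by the DFCM authors.

```python
import math
import statistics


def lowest_fraction_indices(values, percent):
    """Return indices of the smallest `percent` percent of values (at least one)."""
    k = max(1, math.ceil(len(values) * percent / 100.0))
    order = sorted(range(len(values)), key=lambda i: values[i])
    return order[:k]


def select_background_mm(pm, mm, q1, q2):
    """Pick MM intensities for background estimation.

    pm and mm are paired lists: pm[i] and mm[i] belong to the same probe pair.
    Step 1: keep probe pairs whose PM intensity is in the lowest q1 percent.
    Step 2: among their MM intensities, keep the smallest q2 percent.
    """
    if len(pm) != len(mm):
        raise ValueError("pm and mm must have the same length")
    low_pm = lowest_fraction_indices(pm, q1)
    candidate_mm = [mm[i] for i in low_pm]
    keep = lowest_fraction_indices(candidate_mm, q2)
    return sorted(candidate_mm[i] for i in keep)


def summarize_background(values):
    """Placeholder summary (assumption): mean and sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, sd


pm = [10, 200, 30, 400, 20, 50, 600, 15, 25, 800]
mm = [8, 150, 25, 300, 12, 40, 500, 9, 30, 700]
background = select_background_mm(pm, mm, q1=50, q2=60)
print(background, summarize_background(background))
```

The values of q1 and q2 in the usage lines are arbitrary toy settings; the abstract does not state which values the authors used.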

Aquila Digital Community

Scholarly Works @ SHSU (Sam Houston State University)

Springer - Publisher Connector

Activation of the Syk tyrosine kinase is insufficient for downstream signal transduction in B lymphocytes

Author: Hammill Adrienne M
Hsueh Robert C
Lee Jamie A
Scheuermann Richard H
Uhr Jonathan W
Publication venue: BioMed Central
Publication date: 01/12/2002
Field of study

BACKGROUND: Immature B lymphocytes and certain B cell lymphomas undergo apoptotic cell death following activation of the B cell antigen receptor (BCR) signal transduction pathway. Several biochemical changes occur in response to BCR engagement, including activation of the Syk tyrosine kinase. Although Syk activation appears to be necessary for some downstream biochemical and cellular responses, the signaling events that precede Syk activation remain ill defined. In addition, the requirements for complete activation of the Syk-dependent signaling step remain to be elucidated. RESULTS: A mutant form of Syk carrying a combination of a K395A substitution in the kinase domain and substitutions of three phenylalanines (3F) for the three C-terminal tyrosines was expressed in a murine B cell lymphoma cell line, BCL1.3B3, to interfere with normal Syk regulation as a means to examine the Syk activation step in BCR signaling. Introduction of this kinase-inactive mutant led to the constitutive activation of the endogenous wildtype Syk enzyme in the absence of receptor engagement through a 'dominant-positive' effect. Under these conditions, Syk kinase activation occurred in the absence of phosphorylation on Syk tyrosine residues. Although Syk appears to be required for BCR-induced apoptosis in several systems, no increase in spontaneous cell death was observed in these cells. Surprisingly, although the endogenous Syk kinase was enzymatically active, no enhancement in the phosphorylation of cytoplasmic proteins, including phospholipase Cγ2 (PLCγ2), a direct Syk target, was observed. CONCLUSION: These data indicate that activation of Syk kinase enzymatic activity is insufficient for Syk-dependent signal transduction. This observation suggests that other events are required for efficient signaling. We speculate that localization of the active enzyme to a receptor complex specifically assembled for signal transduction may be the missing event.

A gene selection method for GeneChip array data with small sample sizes

Author: Chen Zhongxue
Deng Youping
Huang Xudong
Kong Megan
Liu Qingzhong
McGee Monnie
Scheuermann Richard H
Publication venue: BioMed Central
Publication date: 01/07/2010
Field of study

Background: In microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and decide cutoff p-values for gene selection appropriately. Although permutation-based methods have proved to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete due to the limited number of permutations available in very small sample sizes. Furthermore, estimated permutation-based p-values for true nulls are highly correlated and not uniformly distributed between zero and one, making it difficult to use current false discovery rate (FDR)-controlling methods. Results: We propose a model-based information sharing method (MBIS) that, after an appropriate data transformation, utilizes information shared among genes. We use a normal distribution to model the mean differences of true nulls across two experimental conditions. The parameters of the model are then estimated using all data in hand. Based on this model, p-values, which are uniformly distributed from true nulls, are calculated. Then, since FDR-controlling methods are generally not well suited to microarray data with very small sample sizes, we select genes for a given cutoff p-value and then estimate the false discovery rate. Conclusion: Simulation studies and analysis using real microarray data show that the proposed method, MBIS, is more powerful and reliable than current methods. It has wide application to a variety of situations.
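
The workflow outlined in the abstract above (model the null mean differences with a normal distribution fitted to all genes, compute p-values from that model, select genes at a cutoff, then estimate the FDR) might look roughly like the sketch below. Several pieces are assumptions made for illustration and are not taken from the paper: the data transformation is skipped, the normal parameters are estimated with the median and scaled median absolute deviation, and the FDR estimate is the simple plug-in ratio of expected null hits to selected genes. All function names are hypothetical.

```python
import statistics
from statistics import NormalDist


def mean_differences(group_a, group_b):
    """Per-gene difference of means; each group is a list of per-gene replicate lists."""
    return [statistics.fmean(a) - statistics.fmean(b)
            for a, b in zip(group_a, group_b)]


def fit_null_normal(diffs):
    """Fit a normal null model from all genes (assumption: median and scaled MAD)."""
    center = statistics.median(diffs)
    mad = statistics.median([abs(d - center) for d in diffs])
    scale = 1.4826 * mad  # makes the MAD consistent with a normal standard deviation
    if scale == 0:
        raise ValueError("cannot fit a null model: zero spread")
    return NormalDist(center, scale)


def two_sided_p_values(diffs, null):
    """Two-sided p-values of each difference under the fitted null."""
    standard = NormalDist()
    return [2.0 * (1.0 - standard.cdf(abs(d - null.mean) / null.stdev))
            for d in diffs]


def select_genes(p_values, cutoff):
    """Indices of genes whose p-value is at or below the cutoff."""
    return [i for i, p in enumerate(p_values) if p <= cutoff]


def estimate_fdr(n_genes, n_selected, cutoff):
    """Plug-in FDR estimate (assumption): expected null hits / selected genes."""
    if n_selected == 0:
        return 0.0
    return min(1.0, cutoff * n_genes / n_selected)


group_a = [[1.0, 1.2], [0.9, 1.1], [5.0, 5.2], [1.0, 1.0], [1.1, 0.9]]
group_b = [[1.0, 1.0], [1.0, 1.2], [1.0, 1.1], [1.1, 0.9], [0.9, 1.0]]
diffs = mean_differences(group_a, group_b)
p_values = two_sided_p_values(diffs, fit_null_normal(diffs))
selected = select_genes(p_values, cutoff=0.01)
print(selected, estimate_fdr(len(diffs), len(selected), 0.01))
```

The five-gene toy data set is far smaller than any real array and serves only to show the order of the steps.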

Scholarly Works @ SHSU (Sam Houston State University)

Harvard University - DASH

Springer - Publisher Connector

arXiv.org e-Print Archive

Application of Random Matrix Theory to Biological Networks

Author: Barabasi
Bohigas
Deane
Enright
Feng Luo
Girvan
Hartwell
Hasty
Hofstetter
Jeong
Jianxin Zhong
Jizhong Zhou
Overbeek
Plerou
Ravasz
Richard H. Scheuermann
Rives
Seba
Song
Wigner
Xenarios
Yunfeng Yang
Zhong
Zhong
Publication venue: Elsevier BV
Publication date: 22/03/2005
Field of study

We show that spectral fluctuation of interaction matrices of a yeast core protein interaction network and a metabolic network follows the description of the Gaussian orthogonal ensemble (GOE) of random matrix theory (RMT). Furthermore, we demonstrate that while the global biological networks evaluated belong to GOE, removal of interactions between constituents transitions the networks to systems of isolated modules described by the Poisson statistics of RMT. Our results indicate that although biological networks are very different from other complex systems at the molecular level, they display the same statistical properties at large scale. The transition point provides a new objective approach for the identification of functional modules.
Comment: 3 pages, 2 figures
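
One common way to tell GOE-like from Poisson-like spectral fluctuation is the nearest-neighbor spacing distribution of the eigenvalues. For unit mean spacing, the GOE is well approximated by the Wigner surmise P(s) = (π/2) s exp(−π s²/4), while uncorrelated levels follow P(s) = exp(−s). The sketch below assumes that this is the statistic of interest (the abstract does not name it), replaces proper spectral unfolding with a simple rescaling to unit mean, and uses a Kolmogorov-Smirnov distance as the comparison; the function names are hypothetical.

```python
import math


def goe_spacing_density(s):
    """Wigner surmise for the GOE: P(s) = (pi/2) * s * exp(-pi * s^2 / 4)."""
    return (math.pi / 2.0) * s * math.exp(-math.pi * s * s / 4.0)


def poisson_spacing_density(s):
    """Spacing density for uncorrelated levels: P(s) = exp(-s)."""
    return math.exp(-s)


def goe_spacing_cdf(s):
    """Cumulative form of the Wigner surmise."""
    return 1.0 - math.exp(-math.pi * s * s / 4.0)


def poisson_spacing_cdf(s):
    """Cumulative form of the Poisson spacing law."""
    return 1.0 - math.exp(-s)


def ks_distance(spacings, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a model CDF."""
    xs = sorted(spacings)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        model = cdf(x)
        distance = max(distance, abs(model - i / n), abs(model - (i + 1) / n))
    return distance


def closer_model(spacings):
    """Label spacings as "GOE" or "Poisson" by the smaller KS distance.

    Spacings are rescaled to unit mean first; real spectra would need a
    proper unfolding step, which this sketch omits.
    """
    mean = sum(spacings) / len(spacings)
    unit = [s / mean for s in spacings]
    d_goe = ks_distance(unit, goe_spacing_cdf)
    d_poisson = ks_distance(unit, poisson_spacing_cdf)
    return "GOE" if d_goe < d_poisson else "Poisson"
```

In use, the spacings would be the differences between consecutive sorted eigenvalues of a network's interaction matrix; repeating the comparison while progressively removing interactions would show where the label switches from GOE to Poisson.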

Cochlear transcriptome analysis of an outbred mouse population (CFW)

Author: Clara Draf
Ely Cheikh Boussaty
Eric Du
Mark Novotny
Neil Tedeschi
Richard H. Scheuermann
Rick Friedman
Uri Manor
Yun Zhang
Yuzuru Ninoyu
Publication venue: Frontiers Media S.A.
Publication date: 01/11/2023
Field of study

Age-related hearing loss (ARHL) is the most common cause of hearing loss and one of the most prevalent conditions affecting the elderly worldwide. Despite evidence from our lab and others about its polygenic nature, little is known about the specific genes, cell types, and pathways involved in ARHL, impeding the development of therapeutic interventions. In this manuscript, we describe, for the first time, the complete cell-type specific transcriptome of the aging mouse cochlea using snRNA-seq in an outbred mouse model in relation to auditory threshold variation. Cochlear cell types were identified using unsupervised clustering and annotated via a three-tiered approach: first by linking to expression of known marker genes, then using the NSForest algorithm to select minimum cluster-specific marker genes and reduce dimensional feature space for statistical comparison of our clusters with existing publicly-available data sets on the gEAR website, and finally, by validating and refining the annotations using Multiplexed Error Robust Fluorescence In Situ Hybridization (MERFISH) and the cluster-specific marker genes as probes. We report on 60 unique cell-types expanding the number of defined cochlear cell types by more than two times. Importantly, we show significant specific cell type increases and decreases associated with loss of hearing acuity implicating specific subsets of hair cell subtypes, ganglion cell subtypes, and cell subtypes within the stria vascularis in this model of ARHL. These results provide a view into the cellular and molecular mechanisms responsible for age-related hearing loss and pathways for therapeutic targeting.

Automated Analysis of Flow Cytometry Data to Reduce Inter-Lab Variation in the Detection of Major Histocompatibility Complex Multimer-Binding T Cells

Author: Alexandra J. Lee
Charlotte Halgreen
Cliburn Chan
Cécile Gouttefangeas
Jonathan Rebhahn
Kivin Jakobsen
Mathilde Dalsgaard Hoff
Nadia Viborg Petersen
Natasja Wulff Pedersen
P. Anoop Chandran
Richard H. Scheuermann
Rick Stanton
Scott White
Sine Reker Hadrup
Tim Mosmann
Yu Qian
Publication venue: Frontiers Media SA
Publication date: 01/01/2017
Field of study

Manual analysis of flow cytometry data and subjective gate-border decisions taken by individuals continue to be a source of variation in the assessment of antigen-specific T cells when comparing data across laboratories, and also over time in individual labs. Therefore, strategies to provide automated analysis of major histocompatibility complex (MHC) multimer-binding T cells represent an attractive solution to decrease subjectivity and technical variation. The challenge of using an automated analysis approach is that MHC multimer-binding T cell populations are often rare and therefore difficult to detect. We used a highly heterogeneous dataset from a recent MHC multimer proficiency panel to assess if MHC multimer-binding CD8+ T cells could be analyzed with computational solutions currently available, and if such analyses would reduce the technical variation across different laboratories. We used three different methods, FLOw Clustering without K (FLOCK), Scalable Weighted Iterative Flow-clustering Technique (SWIFT), and ReFlow to analyze flow cytometry data files from 28 laboratories. Each laboratory screened for antigen-responsive T cell populations with frequency ranging from 0.01 to 1.5% of lymphocytes within samples from two donors. Experience from this analysis shows that all three programs can be used for the identification of high to intermediate frequency of MHC multimer-binding T cell populations, with results very similar to that of manual gating. For the less frequent populations (<0.1% of live, single lymphocytes), SWIFT outperformed the other tools. As used in this study, none of the algorithms offered a completely automated pipeline for identification of MHC multimer populations, as varying degrees of human interventions were needed to complete the analysis. 
In this study, we demonstrate the feasibility of using automated analysis pipelines for assessing and identifying even rare populations of antigen-responsive T cells and discuss the main properties, differences, and advantages of the different methods tested.

Frontiers - Publisher Connector

eScholarship - University of California

Online Research Database In Technology

Influenza research database: an integrated bioinformatics resource for influenza research and surveillance.

Author: Baumgarth Nicole
Deitrich Jon
García-Sastre Adolfo
Hunt Victoria
Klem Edward
Kumar Sanjeev
Larsen Christopher N
Macken Catherine
Noronha Jyothi
Pickett Brett E
Ramsey Alvin
Scheuermann Richard H
Squires R Burke
Suarez David
Zaremba Sam
Zhang Yun
Zhou Liwei
Publication venue: eScholarship, University of California
Publication date: 01/11/2012
Field of study

Background: The recent emergence of the 2009 pandemic influenza A/H1N1 virus has highlighted the value of free and open access to influenza virus genome sequence data integrated with information about other important virus characteristics. Design: The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly-accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user-friendly interfaces for data retrieval, visualization and comparative genomics analysis, together with personal log in-protected 'workbench' spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature. Results: To demonstrate the utility of the data and analysis tools available in IRD, two scientific use cases are presented. A comparison of hemagglutinin sequence conservation and epitope coverage information revealed highly conserved protein regions that can be recognized by the human adaptive immune system as possible targets for inducing cross-protective immunity. Phylogenetic and geospatial analysis of sequences from wild bird surveillance samples revealed a possible evolutionary connection between influenza virus from Delaware Bay shorebirds and Alberta ducks. Conclusions: The IRD provides a wealth of integrated data and information about influenza virus to support research of the genetic determinants dictating virus pathogenicity, host range restriction and transmission, and to facilitate development of vaccines, diagnostics, and therapeutics.