138 research outputs found

    AnnotCompute: annotation-based exploration and meta-analysis of genomics experiments

    The ever-increasing scale of biological data sets, particularly those arising in the context of high-throughput technologies, requires the development of rich data exploration tools. In this article, we present AnnotCompute, an information discovery platform for repositories of functional genomics experiments such as ArrayExpress. Our system leverages semantic annotations of functional genomics experiments with controlled vocabulary and ontology terms, such as those from the MGED Ontology, to compute conceptual dissimilarities between pairs of experiments. These dissimilarities are then used to support two types of exploratory analysis—clustering and query-by-example. We show that our proposed dissimilarity measures correspond to a user's intuition about conceptual dissimilarity, and can be used to support effective query-by-example. We also evaluate the quality of clustering based on these measures. While AnnotCompute can support a richer data exploration experience, its effectiveness is limited in some cases, due to the quality of available annotations. Nonetheless, tools such as AnnotCompute may provide an incentive for richer annotations of experiments. Code is available for download at http://www.cbil.upenn.edu/downloads/AnnotCompute
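The query-by-example idea in the AnnotCompute abstract can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measures the authors use, so this sketch assumes a plain Jaccard distance over sets of annotation terms. The experiment identifiers and annotation terms are hypothetical, and this is not the AnnotCompute implementation.

```python
from typing import Dict, List, Set, Tuple


def jaccard_dissimilarity(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def query_by_example(
    query_id: str,
    experiments: Dict[str, Set[str]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank all other experiments by dissimilarity to the query experiment."""
    query_terms = experiments[query_id]
    scored = [
        (other_id, jaccard_dissimilarity(query_terms, terms))
        for other_id, terms in experiments.items()
        if other_id != query_id
    ]
    # Most similar first; ties are broken by identifier for a stable order.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical annotations (assumed for illustration only).
EXPERIMENTS: Dict[str, Set[str]] = {
    "exp_A": {"Mus musculus", "liver", "time course"},
    "exp_B": {"Mus musculus", "liver", "dose response"},
    "exp_C": {"Homo sapiens", "blood", "disease state"},
}

if __name__ == "__main__":
    for experiment_id, score in query_by_example("exp_A", EXPERIMENTS):
        print(experiment_id, score)
```

The same pairwise dissimilarities could feed a clustering routine. A measure that uses the ontology hierarchy rather than exact term matches would likely sit closer to the "conceptual" dissimilarity the abstract describes.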

    A Community-Based Platform for Machine Learning Experimentation

    We demonstrate the practical uses of a community-based platform for the sharing and in-depth investigation of the thousands of machine learning experiments executed every day. It is aimed at researchers and practitioners of data mining techniques, and is publicly available at http://expdb.cs.kuleuven.be. The system offers standards and APIs for sharing experimental results, provides extensive querying capabilities over the gathered results, and allows easy integration into existing data mining toolboxes. We believe such a system may speed up scientific discovery and enhance the scientific rigor of machine learning research.

    Annotare—a tool for annotating high-throughput biomedical investigations and resulting data

    Summary: Computational methods in molecular biology will increasingly depend on standards-based annotations that describe biological experiments in an unambiguous manner. Annotare is a software tool that enables biologists to easily annotate their high-throughput experiments, biomaterials and data in a standards-compliant way that facilitates meaningful search and analysis.

    The strategies WDK: a graphical search interface and web development kit for functional genomics databases

    Web sites associated with the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) have recently introduced a graphical user interface, the Strategies WDK, intended to make advanced searching and set and interval operations easy and accessible to all users. With a design guided by usability studies, the system helps motivate researchers to perform dynamic computational experiments and explore relationships across data sets. For example, PlasmoDB users seeking novel therapeutic targets may wish to locate putative enzymes that distinguish pathogens from their hosts, and that are expressed during appropriate developmental stages. When a researcher runs one of the approximately 100 searches available on the site, the search is presented as a first step in a strategy. The strategy is extended by running additional searches, which are combined with set operators (union, intersect or minus), or genomic interval operators (overlap, contains). A graphical display uses Venn diagrams to make the strategy’s flow obvious. The interface facilitates interactive adjustment of the component searches with changes propagating forward through the strategy. Users may save their strategies, creating protocols that can be shared with colleagues. The strategy system has now been deployed on all EuPathDB databases, and successfully deployed by other projects. The Strategies WDK uses a configurable MVC architecture that is compatible with most genomics and biological warehouse databases, and is available for download at code.google.com/p/strategies-wdk
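The way a strategy chains searches with set operators (union, intersect, minus) and genomic interval operators (overlap, contains) can be sketched as follows. This is an illustrative model only and not the Strategies WDK API. The `Interval` type, the `combine` helper, the inclusive coordinate convention and all gene identifiers are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Interval:
    """A genomic interval with inclusive start and end (assumed convention)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (
        outer.chrom == inner.chrom
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


def combine(left: Set[str], right: Set[str], operator: str) -> Set[str]:
    """Combine the result sets of two strategy steps with a set operator."""
    if operator == "union":
        return left | right
    if operator == "intersect":
        return left & right
    if operator == "minus":
        return left - right
    raise ValueError(f"unknown operator: {operator}")


# Hypothetical search results, loosely following the PlasmoDB example:
# putative enzymes, minus genes with host orthologs, intersected with
# genes expressed at a stage of interest.
putative_enzymes = {"gene_1", "gene_2", "gene_3", "gene_4"}
host_orthologs = {"gene_2", "gene_5"}
stage_expressed = {"gene_1", "gene_4", "gene_6"}

step_two = combine(putative_enzymes, host_orthologs, "minus")
strategy_result = combine(step_two, stage_expressed, "intersect")

if __name__ == "__main__":
    print(sorted(strategy_result))
```

Because each step only consumes the result set of the previous one, changing an early search and recomputing the later steps mirrors the forward propagation of changes that the abstract describes.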

    Functional genomics of the beta-cell: short-chain 3-hydroxyacyl-coenzyme A dehydrogenase regulates insulin secretion independent of K+ currents

    Recent advances in functional genomics afford the opportunity to interrogate the expression profiles of thousands of genes simultaneously and examine the function of these genes in a high-throughput manner. In this study, we describe a rational and efficient approach to identifying novel regulators of insulin secretion by the pancreatic beta-cell. Computational analysis of expression profiles of several mouse and cellular models of impaired insulin secretion identified 373 candidate genes involved in regulation of insulin secretion. Using RNA interference, we assessed the requirements of 10 of these candidates and identified four genes (40%) as being essential for normal insulin secretion. Among the genes identified was Hadhsc, which encodes short-chain 3-hydroxyacyl-coenzyme A dehydrogenase (SCHAD), an enzyme of mitochondrial beta-oxidation of fatty acids whose mutation results in congenital hyperinsulinism. RNA interference-mediated gene suppression of Hadhsc in insulinoma cells and primary rodent islets revealed enhanced basal but normal glucose-stimulated insulin secretion. This increase in basal insulin secretion was not attenuated by the opening of the KATP channel with diazoxide, suggesting that SCHAD regulates insulin secretion through a KATP channel-independent mechanism. Our results suggest a molecular explanation for the hyperinsulinemic hypoglycemia seen in patients with SCHAD deficiency.

    War of Ontology Worlds: Mathematics, Computer Code, or Esperanto?

    The use of structured knowledge representations—ontologies and terminologies—has become standard in biomedicine. Definitions of ontologies vary widely, as do the values and philosophies that underlie them. In seeking to make these views explicit, we conducted and summarized interviews with a dozen leading ontologists. Their views clustered into three broad perspectives that we summarize as mathematics, computer code, and Esperanto. Ontology as mathematics puts the ultimate premium on rigor and logic, symmetry and consistency of representation across scientific subfields, and the inclusion of only established, non-contradictory knowledge. Ontology as computer code focuses on utility and cultivates diversity, fitting ontologies to their purpose. Like computer languages C++, Prolog, and HTML, the code perspective holds that diverse applications warrant custom designed ontologies. Ontology as Esperanto focuses on facilitating cross-disciplinary communication, knowledge cross-referencing, and computation across datasets from diverse communities. We show how these views align with classical divides in science and suggest how a synthesis of their concerns could strengthen the next generation of biomedical ontologies

    FungiDB: an integrated functional genomics database for fungi

    FungiDB (http://FungiDB.org) is a functional genomic resource for pan-fungal genomes that was developed in partnership with the Eukaryotic Pathogen Bioinformatic resource center (http://EuPathDB.org). FungiDB uses the same infrastructure and user interface as EuPathDB, which allows for sophisticated and integrated searches to be performed using an intuitive graphical system. The current release of FungiDB contains genome sequence and annotation from 18 species spanning several fungal classes, including the Ascomycota classes Eurotiomycetes, Sordariomycetes and Saccharomycetes, the Basidiomycota orders Pucciniomycetes and Tremellomycetes, and the basal ‘Zygomycete’ lineage Mucormycotina. Additionally, FungiDB contains cell cycle microarray data, hyphal growth RNA-sequence data and yeast two hybrid interaction data. The underlying genomic sequence and annotation, combined with functional data, additional data from the FungiDB standard analysis pipeline and the ability to leverage orthology, provide a powerful resource for in silico experimentation.