BioMart – biological queries made easy

By Damian Smedley, Syed Haider, Benoit Ballester, Richard Holland, Darin London, Gudmundur A. Thorisson and Arek Kasprzyk

Abstract

Background: Biologists need to perform complex queries, often across a variety of databases. Typically, each data resource provides its own advanced query interface, which the biologist must learn before querying it. Frequently, more than one data source is required, and for high-throughput analysis, cutting and pasting results between websites is very time consuming. Many groups therefore rely on local bioinformatics support to process queries through the resources' programmatic interfaces, where these exist. This is not an efficient solution in terms of cost and time. It would be better if the biologist only had to learn one generic interface. BioMart provides such a solution.

Results: BioMart enables scientists to perform advanced querying of biological data sources through a single web interface. The power of the system comes from integrated querying of data sources regardless of their geographical locations. Once these queries have been defined, they may be automated with its "scripting at the click of a button" functionality. BioMart's capabilities are extended by integration with several widely used software packages such as BioConductor, DAS, Galaxy, Cytoscape and Taverna. In this paper, we describe all aspects of BioMart from a user's perspective and demonstrate how it can be used to solve real biological use cases such as SNP selection for candidate gene screening or annotation of microarray results.

Conclusion: BioMart is an easy-to-use, generic and scalable system and has therefore become an integral part of large data resources including Ensembl, UniProt, HapMap, Wormbase, Gramene, Dictybase, PRIDE, MSD and Reactome. BioMart is freely accessible at http://www.biomart.org.

Peer-reviewed. Publisher Version.

Publisher: BioMed Central Ltd
Year: 2009
DOI identifier: 10.1186/1471-2164-10-22
OAI identifier: oai:lra.le.ac.uk:2381/8807

