Standards and infrastructure for managing experimental metadata
*See also "related poster":http://precedings.nature.com/documents/3144/version/1*

Today’s researchers can perform biological and biomedical studies in which the same material is run through a wide range of assays, comprising several technologies such as genomics, transcriptomics, proteomics and metabol/nomics (hereafter referred to as ‘omics’). To enable others to correctly interpret the complex data sets that result, and the conclusions drawn, it is necessary to provide contextualizing experimental metadata at an appropriate level of granularity.

Standards initiatives normally cater to particular domains. However, several synergistic standards activities foster cross-domain harmonization of the three kinds of reporting standard (minimum information checklists, ontologies and file formats). Some 29 groups participate in the "MIBBI":http://www.mibbi.org project, which offers a one-stop shop for those exploring the range of extant ‘minimum information’ checklists, and which fosters integrative development^1^. More than 60 groups participate in the "OBO Foundry":http://www.obofoundry.org ^2^, which coordinates the orthogonal development of ontologies such as "OBI":http://obi-ontology.org for describing experimental (meta)data. And several groups participate in the development of "ISA-Tab":http://isatab.sf.net, a tabular framework for presenting experimental metadata^3^ (analogous to "FuGE":http://fuge.sf.net, a generic data model to underpin various XML file formats^4^).

We have developed an infrastructure that leverages the aforementioned synergistic reporting standards to create a common structured representation and storage mechanism for experimental metadata from biological and biomedical investigations ranging from simple single-assay studies to complex, methodologically diverse multi-assay studies. 

View the "public instance":http://www.ebi.ac.uk/bioinvindex of our ISA-based infrastructure, running at EBI, and/or "download the components":http://isatab.sf.net for your local use.

*References*
1. Taylor CF, Field D, Sansone SA,… Rocca-Serra P et al. (2008) The MIBBI Project. _Nature Biotechnology_ Aug;26(8):889-896. "http://www.mibbi.org":http://www.mibbi.org

2. Smith B, Ashburner M, Rosse C,… Rocca-Serra P, …Sansone SA et al. (2007) The OBO Foundry. _Nature Biotechnology_ Nov;25(11):1251-5. "http://www.obofoundry.org":http://www.obofoundry.org

3. Sansone SA, Rocca-Serra P, Brandizi M,… Taylor CF et al. (2008) The First MGED RSBI (ISA-TAB) Workshop. _OMICS_. Jun;12(2):143-9. "http://isatab.sf.net":http://isatab.sf.net

4. Jones AR, Miller M, Aebersold R,… Sansone SA et al. (2007) The Functional Genomics Experiment model (FuGE). _Nature Biotechnology_ Oct;25(10):1127-1133. "http://fuge.sf.net":http://fuge.sf.ne
Overcoming the Ontology Enrichment Bottleneck with Quick Term Templates
The developers of the Ontology of Biomedical Investigations (OBI) primarily use Protégé for editing. However, adding many classes with similar patterns of logical definition is time consuming, error prone, and requires the editor to have some expertise in OWL. Therefore, the process is poorly suited to the large number of domain experts who have limited experience with Protégé and ontology development. We have developed a procedure to ease this task and allow such domain experts to add terms to the ontology in a way that effectively includes complex logical definitions yet requires minimal manual intervention by OBI developers. The procedure is based on editing a Quick Term Template in a spreadsheet format, which is subsequently converted into an OWL file. This procedure promises to be a robust and scalable approach for ontology enrichment.
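The spreadsheet-to-OWL idea described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the column names (`label`, `parent`, `property`, `filler`), the example terms, and the choice to emit Manchester-syntax-style class frames with quoted labels. It is not the actual OBI Quick Term Template layout or conversion tool, and a real conversion would also need IRIs, prefixes and an ontology header.

```python
import csv
import io

# Hypothetical template: one row per new term. The column names and the
# example terms are illustrative assumptions, not the real OBI template.
TEMPLATE = """label,parent,property,filler
glucose assay,analyte assay,has_specified_input,glucose
insulin assay,analyte assay,has_specified_input,insulin
"""


def quote(name: str) -> str:
    """Wrap a label in single quotes when it contains a space."""
    return f"'{name}'" if " " in name else name


def row_to_manchester(row: dict) -> str:
    """Expand one template row into a Manchester-syntax-style class frame.

    Every row follows the same logical pattern:
    <label> EquivalentTo <parent> and (<property> some <filler>)
    """
    return (
        f"Class: {quote(row['label'])}\n"
        f"    EquivalentTo: {quote(row['parent'])} and "
        f"({quote(row['property'])} some {quote(row['filler'])})\n"
    )


def template_to_manchester(text: str) -> str:
    """Convert the whole template (CSV text) into class frames."""
    reader = csv.DictReader(io.StringIO(text))
    return "\n".join(row_to_manchester(row) for row in reader)


if __name__ == "__main__":
    print(template_to_manchester(TEMPLATE))
```

The point of the pattern is that a domain expert only fills in cells, while the logical definition is generated from a fixed pattern, so the editor does not need to write OWL by hand.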
Standards and infrastructure for managing experimental metadata
*See also the "related presentation":http://precedings.nature.com/documents/3145/version/1*

We present an infrastructure that leverages synergistic reporting standards and ontologies^1,2,3,4,5^ to create a common structured representation and storage mechanism for experimental metadata from biological and biomedical investigations ranging from simple single-assay studies to complex, methodologically diverse multi-assay studies. 

The infrastructure’s components include: a data capture and editing tool (_ISAcreator_); a validator (_ISAvalidator_); a database (_BioInvestigation Index_); a converter (_ISAconverter_); and a BioConductor analysis package (_R-ISApackage_). The components are designed for local installation, and can work independently or as a unified system.

View the "public instance":http://www.ebi.ac.uk/bioinvindex running at EBI and/or "download the components":http://isatab.sf.net for your local use.

*References*
1. Taylor CF, Field D, Sansone SA,… Rocca-Serra P et al. (2008) The MIBBI Project. _Nature Biotechnology_ Aug;26(8):889-896. "http://www.mibbi.org":http://www.mibbi.org

2. Smith B, Ashburner M, Rosse C,… Rocca-Serra P, …Sansone SA et al. (2007) The OBO Foundry. _Nature Biotechnology_ Nov;25(11):1251-5. "http://www.obofoundry.org":http://www.obofoundry.org

3. Ontology for Biomedical Investigations (OBI) "http://obi-ontology.org":http://obi-ontology.org 

4. Sansone SA, Rocca-Serra P, Brandizi M,… Taylor CF et al. (2008) The First MGED RSBI (ISA-TAB) Workshop. _OMICS_. Jun;12(2):143-9. "http://isatab.sf.net":http://isatab.sf.net

5. Jones AR, Miller M, Aebersold R,… Sansone SA et al. (2007) The Functional Genomics Experiment model (FuGE). _Nature Biotechnology_ Oct;25(10):1127-1133. "http://fuge.sf.net":http://fuge.sf.ne
Survey-based naming conventions for use in OBO Foundry ontology development
A wide variety of ontologies relevant to the biological and medical domains are available through the OBO Foundry portal, and their number is growing rapidly. Integration of these ontologies, while requiring considerable effort, is extremely desirable. However, heterogeneities in format and style pose serious obstacles to such integration. In particular, inconsistencies in naming conventions can impair the readability and navigability of ontology class hierarchies, and hinder their alignment and integration. While other sources of diversity are tremendously complex and challenging, agreeing a set of common naming conventions is an achievable goal, particularly if those conventions are based on lessons drawn from pooled practical experience and surveys of community opinion. We summarize a review of existing naming conventions and highlight certain disadvantages with respect to general applicability in the biological domain. We also present the results of a survey carried out to establish which naming conventions are currently employed by OBO Foundry ontologies and to determine what their special requirements regarding the naming of entities might be. Lastly, we propose an initial set of typographic, syntactic and semantic conventions for labelling classes in OBO Foundry ontologies. Adherence to common naming conventions is more than just a matter of aesthetics. Such conventions provide guidance to ontology creators, help developers avoid flaws and inaccuracies when editing, and especially when interlinking, ontologies. Common naming conventions will also assist consumers of ontologies to more readily understand what meanings were intended by the authors of ontologies used in annotating bodies of data.
graph2tab, a library to convert experimental workflow graphs into tabular formats
Motivations: Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means of experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve
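To make the graph-to-table problem concrete, here is a minimal sketch of the most naive conversion: emit one table row per path from a source node to a sink node. This is explicitly not the algorithm used by graph2tab; the workflow, node names and function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical workflow: each node maps to the nodes derived from it.
WORKFLOW: Dict[str, List[str]] = {
    "source 1": ["sample 1", "sample 2"],
    "sample 1": ["extract 1"],
    "sample 2": ["extract 1"],  # two samples pooled into one extract
    "extract 1": ["assay 1"],
    "assay 1": [],
}


def paths_from(node: str, graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return every path from `node` down to a sink (a node with no successors)."""
    successors = graph.get(node, [])
    if not successors:
        return [[node]]
    return [
        [node] + tail
        for nxt in successors
        for tail in paths_from(nxt, graph)
    ]


def graph_to_rows(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Naive conversion: one row per source-to-sink path (assumes an acyclic graph)."""
    targets = {n for succ in graph.values() for n in succ}
    sources = [n for n in graph if n not in targets]
    return [path for s in sources for path in paths_from(s, graph)]


if __name__ == "__main__":
    for row in graph_to_rows(WORKFLOW):
        print("\t".join(row))
```

Even in this small example, the pooled extract and the assay appear in both rows. With many branching and pooling steps, the number of paths, and hence of rows with repeated values, can grow very quickly, which hints at why a good conversion is harder than it first looks.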
Using Pathway Signatures as Means of Identifying Similarities among Microarray Experiments
The widespread use of microarrays has generated large amounts of data, and interrogating public microarray repositories to identify similarities between microarray experiments is now one of the major challenges. Approaches using defined groups of genes, such as pathways and cellular networks (pathway analysis), have been proposed to improve the interpretation of microarray experiments. We propose a novel method to compare microarray experiments at the pathway level. The method consists of two steps: first, generate pathway signatures, a set of descriptors recapitulating the biologically meaningful pathways related to some clinical/biological variable of interest; second, use these signatures to interrogate microarray databases. We demonstrate that our approach provides more reliable results than gene-based approaches. While gene-based approaches tend to suffer from bias generated by the analytical procedures employed, our pathway-based method successfully groups together similar samples, independently of the experimental design. The results presented are potentially of great interest for improving the ability to query and compare experiments in public repositories of microarray data. In particular, this method can be used to retrieve data from public microarray databases and perform comparisons at the pathway level.
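The two-step idea (build a pathway-level signature, then use it to compare experiments) can be sketched as follows. The abstract does not say how signatures are scored or compared, so the choices here are stand-in assumptions: the per-pathway score is the mean of the member genes' values, and similarity is the cosine between signatures. Gene names, pathway names and values are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical pathway definitions (gene identifiers are placeholders).
PATHWAYS: Dict[str, List[str]] = {
    "pathway X": ["g1", "g2"],
    "pathway Y": ["g3", "g4"],
}


def pathway_scores(expr: Dict[str, float],
                   pathways: Dict[str, List[str]]) -> Dict[str, float]:
    """Step 1 (assumed scoring): mean value of each pathway's measured genes."""
    scores = {}
    for name, genes in pathways.items():
        values = [expr[g] for g in genes if g in expr]
        scores[name] = sum(values) / len(values) if values else 0.0
    return scores


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Step 2 (assumed comparison): cosine similarity over shared pathways."""
    keys = sorted(set(a) & set(b))
    dot = sum(a[k] * b[k] for k in keys)
    norm_a = math.sqrt(sum(a[k] ** 2 for k in keys))
    norm_b = math.sqrt(sum(b[k] ** 2 for k in keys))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Two experiments whose gene-level values differ but whose
    # pathway-level averages coincide.
    exp_a = {"g1": 2.0, "g2": 1.0, "g3": -1.0, "g4": -1.0}
    exp_b = {"g1": 1.0, "g2": 2.0, "g3": -0.5, "g4": -1.5}
    sig_a = pathway_scores(exp_a, PATHWAYS)
    sig_b = pathway_scores(exp_b, PATHWAYS)
    print(sig_a, sig_b, cosine(sig_a, sig_b))
```

In this toy case the two experiments differ gene by gene yet yield the same pathway-level signature, which is the kind of robustness to gene-level variation that motivates comparing at the pathway level.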
InChI isotopologue and isotopomer specifications
This work presents a proposed extension to the International Union of Pure and Applied Chemistry (IUPAC) International Chemical Identifier (InChI) standard that allows the representation of isotopically-resolved chemical entities at varying levels of ambiguity in isotope location. This extension includes an improved interpretation of the current isotopic layer within the InChI standard and a new isotopologue layer specification for representing chemical entities with ambiguous isotope localization. Both improvements support the unique isotopically-resolved chemical identification of features detected and measured in analytical instrumentation, specifically nuclear magnetic resonance and mass spectrometry.
Scientific contribution
This new extension to the InChI standard would enable improved annotation of analytical datasets characterizing chemical entities, supporting the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles of data stewardship for chemical datasets, ultimately promoting Open Science in chemistry
The FAIR Cookbook - the essential resource for and by FAIR doers
The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for “FAIR doers” in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges to achieve and improve data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.
We thank all book dash participants and recipe authors, as well as the FAIRplus fellows, all partners, the members of the FAIRplus Scientific Advisory Board, and the management team. In particular, we acknowledge the following colleagues for their role in the FAIRplus project: Ebitsam Alharbi (0000-0002-3887-3857), Oya Deniz Beyan (0000-0001-7611-3501), Ola Engkvist (0000-0003-4970-6461), Laura Furlong (0000-0002-9383-528X), Carole Goble (0000-0003-1219-2137), Mark Ibberson (0000-0003-3152-5670), Manfred Kohler, Nick Lynch (0000-0002-8997-5298), Scott Lusher (0000-0003-2401-4223), Jean-Marc Neefs, George Papadotas, Manuela Pruess (0000-0002-6857-5543), Ratnesh Sahay, Rudi Verbeeck (0000-0001-5445-6095), Bryn Williams-Jones, and Gesa Witt (0000-0003-2344-706X).
This work and the authors were primarily funded by FAIRplus (IMI 802750). PRS and SAS also acknowledge contributions from the following grants, in which the FAIR Cookbook is also embedded or to which it is connected: ELIXIR Interoperability Platform, EOSC-Life (H2020-EU 824087), FAIRsharing (Wellcome 212930/Z/18/Z), NIH CFDE Coordinating Center (NIH Common Fund OT3OD025459-01), Precision Toxicology (H2020-EU 965406), UKRI DASH grant (MR/V038966/1), BY-COVID (Horizon-EU 101046203), AgroServ (Horizon-EU 101058020).
Peer reviewed. Article signed by 33 authors: Philippe Rocca-Serra, Wei Gu, Vassilios Ioannidis, Tooba Abbassi-Daloii, Salvador Capella-Gutierrez, Ishwar Chandramouliswaran, Andrea Splendiani, Tony Burdett, Robert T. Giessmann, David Henderson, Dominique Batista, Ibrahim Emam, Yojana Gadiya, Lucas Giovanni, Egon Willighagen, Chris Evelo, Alasdair J. G. Gray, Philip Gribbon, Nick Juty, Danielle Welter, Karsten Quast, Paul Peeters, Tom Plasterer, Colin Wood, Eelke van der Horst, Dorothy Reilly, Herman van Vlijmen, Serena Scollen, Allyson Lister, Milo Thurston, Ramon Granell, the FAIR Cookbook Contributors & Susanna-Assunta Sansone.
Postprint (published version)
FAIR data management: what does it mean for drug discovery?
The drug discovery community faces high costs in bringing safe and effective medicines to market, in part due to the rising volume and complexity of data which must be generated during the research and development process. Fully utilising these expensively created experimental and computational data resources has become a key aim of scientists due to the clear imperative to leverage the power of artificial intelligence (AI) and machine learning-based analyses to solve the complex problems inherent in drug discovery. In turn, AI methods heavily rely on the quantity, quality, consistency, and scope of underlying training data. While pre-existing preclinical and clinical data cannot fully replace the need for de novo data generation in a project, having access to relevant historical data represents a valuable asset, as its reuse can reduce the need to perform similar experiments, therefore avoiding a “reinventing the wheel” scenario. Unfortunately, most suitable data resources are often archived within institutes, companies, or individual research groups and hence unavailable to the wider community. Hence, enabling the data to be Findable, Accessible, Interoperable, and Reusable (FAIR) is crucial for the wider community of drug discovery and development scientists to learn from the work performed and utilise the findings to enhance comprehension of their own research outcomes. In this mini-review, we elucidate the utility of FAIR data management across the drug discovery pipeline and assess the impact such FAIR data has made on the drug development process
Special issue on bio-ontologies and phenotypes
The bio-ontologies and phenotypes special issue includes eight papers selected from the 11 papers presented at the Bio-Ontologies SIG (Special Interest Group) and the Phenotype Day at ISMB (Intelligent Systems for Molecular Biology) conference in Boston in 2014. The selected papers span a wide range of topics including the automated re-use and update of ontologies, quality assessment of ontological resources, and the systematic description of phenotype variation, driven by manual, semi- and fully automatic means