Augmenting Metadata with Models of Experimental Methods: Filling in the Gaps
A rigorous and machine-readable model for representing experimental methods is a necessity for improving the replicability and reproducibility of scientific experiments. Knowledge about experimental methods can be incorporated as metadata in online data sets and used to guide the work of future investigators. The technology being developed by the Center for Expanded Data Annotation and Retrieval (CEDAR), which is designed to ease the annotation of online data sets, will use such representations of experimental methods to improve communication among researchers and to automate experimental workflows that replicate previous investigations. We have identified three distinct levels of abstraction used to represent experimental methods: (1) the level of the "abstract" (how the method would be presented in a paper's abstract), (2) the level of the "methods section," and (3) the level of the "notebook" (or that of supplementary information). Even at the level of "supplementary information," important details needed to carry out the experiment are often left out because they are assumed to be obvious. This problem is only exacerbated at the less-specific levels. With a rigorous model of an experiment, all steps, no matter how "well-known," will be expressed. Having all details available, if needed, will allow for better understanding of the method and of the resultant data. Our work will result in the identification of motifs and steps in protocols as they are written at all three levels. Representing scientific methods in this manner will provide a detailed, multi-scale model of scientific procedures that can stand in for the methods section of a publication when searching for or comparing data sets online.
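To make the idea of a multi-scale, machine-readable method representation concrete, the following is a minimal sketch in Python. The class and field names (ProtocolStep, Level, ExperimentalMethod) and the PCR example are hypothetical illustrations of the three abstraction levels described above; they are not CEDAR's actual metadata model.

```python
# A minimal, hypothetical sketch of a multi-level method representation.
# Names and structure are illustrative only, not CEDAR's actual model.
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    ABSTRACT = "abstract"          # how the method appears in a paper's abstract
    METHODS_SECTION = "methods"    # the paper's methods section
    NOTEBOOK = "notebook"          # lab-notebook / supplementary-information detail


@dataclass
class ProtocolStep:
    description: str
    level: Level
    # Finer-grained sub-steps that only appear at more detailed levels.
    substeps: list["ProtocolStep"] = field(default_factory=list)


@dataclass
class ExperimentalMethod:
    name: str
    steps: list[ProtocolStep]

    def steps_at(self, level: Level) -> list[ProtocolStep]:
        """Collect the steps stated at a given level of abstraction."""
        found: list[ProtocolStep] = []

        def walk(step: ProtocolStep) -> None:
            if step.level == level:
                found.append(step)
            for sub in step.substeps:
                walk(sub)

        for top in self.steps:
            walk(top)
        return found


if __name__ == "__main__":
    pcr = ExperimentalMethod(
        name="Genotyping by PCR",
        steps=[
            ProtocolStep(
                "Genomic DNA was amplified by PCR.",
                Level.ABSTRACT,
                substeps=[
                    ProtocolStep(
                        "35 cycles of denaturation, annealing, and extension.",
                        Level.METHODS_SECTION,
                        substeps=[
                            ProtocolStep("Denature 30 s at 95 °C.", Level.NOTEBOOK),
                            ProtocolStep("Anneal 30 s at 58 °C.", Level.NOTEBOOK),
                            ProtocolStep("Extend 60 s at 72 °C.", Level.NOTEBOOK),
                        ],
                    )
                ],
            )
        ],
    )
    for step in pcr.steps_at(Level.NOTEBOOK):
        print(step.description)
```

In such a model, the same protocol can be rendered at any of the three levels on demand, so the "obvious" steps omitted from a methods section remain recoverable from the notebook level.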
Additional file 1 of NCBO Ontology Recommender 2.0: an enhanced approach for biomedical ontology recommendation
Ontology Recommender traffic summary. Summary of traffic received by the Ontology Recommender for the period 2014–2016, compared to the other most used BioPortal services. (PDF 27 kb)
FAIR LINCS Metadata Powered by CEDAR Cloud-Based Templates and Services
The Library of Integrated Network-Based Cellular Signatures (LINCS) program generates a wide variety of cell-based perturbation-response signatures using diverse assay technologies. For example, LINCS includes large-scale transcriptional profiling of genetic and small molecule perturbations, and various proteomics and imaging datasets. We currently obtain metadata through an online platform, the metadata submission tool (MST), based on spreadsheet data templates. While functional, it remains difficult to keep metadata FAIR, specifically findable and reusable, without enforced controlled vocabularies and built-in linkages to ontologies and metadata standards. To maintain FAIR-centric metadata, we have worked with the Center for Expanded Data Annotation and Retrieval (CEDAR) to develop modular metadata templates linked to ontologies and standards present in the NCBO BioPortal. We have also developed a new LINCS Dataset Submission Tool (DST), which links new LINCS datasets to the form-fillable CEDAR templates. This metadata management framework supports authoring, curation, validation, management, and sharing of LINCS metadata, while building upon the existing LINCS metadata standards and data-release workflows. Additionally, the CEDAR technology facilitates metadata validation and testing, enabling users to ensure their input metadata are LINCS compliant prior to submission for public release. CEDAR templates have been developed to capture reagent metadata, experimental metadata, assay descriptions, and global dataset attributes. By integrating the submission of all these components into one submission tool and workflow, we aim to significantly simplify and streamline the workflow of LINCS dataset submission, processing, validation, registration, and publication. As other projects apply the same approach, many more datasets will become cross-searchable and linkable, optimizing the metadata pathway from submission to discovery.
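The core idea of template-driven validation against controlled vocabularies can be illustrated with a minimal sketch. The template structure, field names, and value sets below are assumptions for illustration only; they are not the actual CEDAR template format, the LINCS metadata standard, or the DST/MST APIs.

```python
# A minimal, hypothetical sketch of template-driven metadata validation.
# The "template" is a mapping from field name to the set of allowed terms,
# standing in for an ontology-backed, controlled-vocabulary value set.
REAGENT_TEMPLATE = {
    "small_molecule_provider": {"Sigma-Aldrich", "Tocris", "Selleck"},
    "assay_type": {"transcriptional profiling", "proteomics", "imaging"},
}


def validate(metadata: dict, template: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means compliant."""
    problems = []
    for field_name, allowed in template.items():
        if field_name not in metadata:
            problems.append(f"missing required field: {field_name}")
        elif metadata[field_name] not in allowed:
            problems.append(
                f"value {metadata[field_name]!r} for {field_name!r} "
                "is not in the controlled vocabulary"
            )
    return problems


if __name__ == "__main__":
    submission = {"small_molecule_provider": "Sigma-Aldrich", "assay_type": "imaging"}
    issues = validate(submission, REAGENT_TEMPLATE)
    print("compliant" if not issues else "\n".join(issues))
```

Validation of this kind, run before submission, is what allows contributors to confirm that their metadata conform to the standard prior to public release.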
Examples of Meta-Resources for Computational Biology.
Summary comparing iTools to other similar meta-resource environments for archival and retrieval of software tools for computational biology.
The two main displays of iTools resources provide tabular (left) and graph-based (right) human interfaces to the resource database (http://iTools.ccb.ucla.edu/).
Both of these facilitate comprehensive traversal, comparison and search of resources. There are several other human and machine interfaces to the iTools database, which are discussed in the text.
A schematic and dynamic integration of iTools resources demonstrating interoperability of multi-disciplinary tools via graphical workflow environments.
The three nodes with dashed boundaries on the left demonstrate schematically the integration of some computational biology tools. The graphical workflow on the right depicts the practical means of using iTools meta-data to construct module descriptions and generate multidisciplinary and heterogeneous data analysis protocols.
Examples of the input and output XML descriptions in the Pipeline, an integrated graphical workflow environment that mediates inter-resource communications.
If resources described in iTools include such data I/O descriptions, external interoperability environments (like the Pipeline) will be able to automatically construct and validate inter-resource computational workflows.
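The role of declared I/O descriptions can be sketched briefly: if each tool advertises the data formats it accepts and produces, a workflow environment can check whether two tools can legally be chained. The descriptor fields and format names below are assumptions for illustration; they are not the actual iTools or LONI Pipeline XML schema.

```python
# A minimal, hypothetical sketch of I/O-based workflow validation.
# Descriptor fields and format names are illustrative, not the real schema.
from dataclasses import dataclass


@dataclass
class ToolDescriptor:
    name: str
    input_formats: set[str]   # data formats the tool accepts
    output_formats: set[str]  # data formats the tool produces


def can_connect(upstream: ToolDescriptor, downstream: ToolDescriptor) -> bool:
    """True if some output format of `upstream` is an accepted input of `downstream`."""
    return bool(upstream.output_formats & downstream.input_formats)


if __name__ == "__main__":
    aligner = ToolDescriptor("image-aligner", {"NIfTI"}, {"NIfTI"})
    viewer = ToolDescriptor("volume-viewer", {"NIfTI", "MINC"}, set())
    print(can_connect(aligner, viewer))  # True: aligner output matches viewer input
```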
The left panel shows the search, traversal and comparison of tools (in this case, image alignment and visualization) based on their data input/output specifications.
The right panel illustrates how streaming data through independent tools (via an external graphical workflow environment, e.g., the LONI Pipeline) may be facilitated by the types of data I/O parameters stored as iTools resource-specific meta-data.
iTools CompBiome, the iTools Computational Biology Resourceome plug-in, consists of a decentralized collection of BioSiteMaps (sitemaps of resources for biomedical computing) and a Yahoo!Search-based crawler for discovering new and updating existing BioSiteMaps anywhere on the web.
These updates propagate automatically to iTools' SandBox and are later reviewed by expert users for inclusion in the iTools DB. The distributed nature of the NCBC CompBiome may be utilized by any tool developer, user or librarian to find, compare, integrate and expand the functionality of different resources for biomedical computing. The left and right panels illustrate the XML schema definition for the BioSiteMap.xml files and the results of a manual initiation of the Yahoo!Search using the iTools CompBiome plug-in, respectively. iTools initiates the crawler automatically every week and also supports manual triggering of the crawler.
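The harvesting step that feeds the SandBox can be sketched as follows. The element names (resource, name, url) and the sitemap URL are assumptions for illustration; the real BioSiteMap.xml schema and the crawler's discovery mechanism differ.

```python
# A minimal, hypothetical sketch of harvesting resource entries from a
# BioSiteMap-style XML file. Element names and the URL are placeholders;
# they are not the actual BioSiteMap.xml schema.
import urllib.request
import xml.etree.ElementTree as ET


def harvest(sitemap_url: str) -> list[dict]:
    """Fetch one sitemap and return the resources it advertises."""
    with urllib.request.urlopen(sitemap_url) as response:
        tree = ET.parse(response)
    resources = []
    for entry in tree.getroot().findall("resource"):
        resources.append(
            {
                "name": entry.findtext("name", default=""),
                "url": entry.findtext("url", default=""),
            }
        )
    return resources


if __name__ == "__main__":
    # Hypothetical sitemap location; in the workflow described above, newly
    # harvested entries would land in a staging area (the SandBox) and be
    # reviewed by expert users before inclusion in the resource database.
    for resource in harvest("http://example.org/BioSiteMap.xml"):
        print(resource["name"], resource["url"])
```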
This figure illustrates the use of iTools for search, comparison and integration of bioinformatics tools.
In this example, we demonstrate the use of the Basic Local Alignment Search Tool (BLAST) for comparing gene and protein sequences against other nucleic acid sequences available in various public databases. The top row shows iTools traversal and search (keyword = blast) using the hyperbolic graphical interface, and tool comparison and investigation of interoperability using the tabular resource view panel. The bottom row shows the design of a simple BLAST analysis workflow using one specific graphical workflow environment (the LONI Pipeline). This BLAST analysis protocol depicts the NCBI database formatting, index generation and filtering using miBLAST, sequence alignment, and textual visualization of the results.
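The two core steps of such a workflow, formatting a reference database and then aligning queries against it, can be sketched with the standard NCBI BLAST+ command-line tools. This sketch stands in for the graphical LONI Pipeline protocol described in the caption; it does not use miBLAST, and the file names are placeholders.

```python
# A minimal sketch of a two-step nucleotide BLAST workflow (database
# formatting, then alignment) using the standard NCBI BLAST+ tools.
# File names are placeholders; BLAST+ must be installed and on PATH.
import subprocess

# Step 1: format a FASTA file of reference sequences into a BLAST database.
subprocess.run(
    ["makeblastdb", "-in", "reference.fasta", "-dbtype", "nucl", "-out", "refdb"],
    check=True,
)

# Step 2: align query sequences against the database, tabular output (outfmt 6).
subprocess.run(
    ["blastn", "-query", "query.fasta", "-db", "refdb",
     "-out", "results.tsv", "-outfmt", "6"],
    check=True,
)

print("Alignment results written to results.tsv")
```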