The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science
The Generation Challenge Programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.
This is the publisher's final PDF. The published article is copyrighted by the author(s) and published by Hindawi Publishing Corporation. The published article can be found at: http://www.hindawi.com/journals/ijpg/
The GBIF Integrated Publishing Toolkit: Facilitating the Efficient Publishing of Biodiversity Data on the Internet
The planet is experiencing an ongoing global biodiversity crisis. Measuring the magnitude and rate of change more effectively requires access to organized, easily discoverable, and digitally-formatted biodiversity data, both legacy and new, from across the globe. Assembling this coherent digital representation of biodiversity requires the integration of data that have historically been analog, dispersed, and heterogeneous. The Integrated Publishing Toolkit (IPT) is a software package developed to support biodiversity dataset publication in a common format. The IPT's two primary functions are to 1) encode existing species occurrence datasets and checklists, such as records from natural history collections or observations, in the Darwin Core standard to enhance interoperability of data, and 2) publish and archive data and metadata for broad use in a Darwin Core Archive, a set of files following a standard format. Here we discuss the key need for the IPT, how it has developed in response to community input, and how it continues to evolve to streamline and enhance the interoperability, discoverability, and mobilization of new data types beyond basic Darwin Core records. We close with a discussion of how the IPT has impacted the biodiversity research community and how it enhances data publishing in more traditional journal venues, along with new features implemented in the latest version of the IPT and future plans for more enhancements.
B-HIT - A Tool for Harvesting and Indexing Biodiversity Data.
With the rapidly growing number of data publishers, the process of harvesting and indexing information to offer advanced search and discovery becomes a critical bottleneck in globally distributed primary biodiversity data infrastructures. The Global Biodiversity Information Facility (GBIF) implemented a Harvesting and Indexing Toolkit (HIT), which largely automates data harvesting activities for hundreds of collection and observational data providers. The team of the Botanic Garden and Botanical Museum Berlin-Dahlem has extended this well-established system with a range of additional functions, including improved processing of multiple taxon identifications, the ability to represent associations between specimen and observation units, new data quality control, and new reporting capabilities. The open source software B-HIT can be freely installed and used for setting up thematic networks serving the demands of particular user groups.
Example of an IPT summary page displaying some of the metadata provided for the dataset hosted by VertNet for the Cowan Tetrapod Collection of birds (http://ipt.vertnet.org:8080/ipt/resource.do?r=ubc_bbm_ctc_birds).
This screenshot of IPT shows how users map their local field headings to Darwin Core terms, an essential task for data publishers.
The Darwin Core term names are on the left, and the terms loaded from a database or spreadsheet are on the right, where they are selected using drop-down menus. Fields that have the same name string in both Darwin Core and the publisher dataset are matched automatically, while those that do not match must be selected manually (via a drop-down list) by the "data publisher".
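The automatic matching described in this caption can be illustrated with a minimal Python sketch. It is not the IPT's own code: the function name `auto_map`, the short `DWC_TERMS` list (a small subset of real Darwin Core term names), and the choice of exact, case-sensitive string comparison are all assumptions made for illustration.

```python
# Hypothetical sketch of name-based field matching; not the IPT implementation.
# DWC_TERMS is a small illustrative subset of Darwin Core term names.
DWC_TERMS = [
    "occurrenceID",
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
    "recordedBy",
]


def auto_map(source_headers, dwc_terms=DWC_TERMS):
    """Match source column headers to Darwin Core terms by identical name.

    Returns a pair (mapping, unmatched):
      mapping   -- dict from Darwin Core term to the source header it matched
      unmatched -- source headers left for the publisher to map by hand
    Assumption: the comparison is an exact, case-sensitive string match.
    """
    known_terms = set(dwc_terms)
    mapping = {}
    unmatched = []
    for header in source_headers:
        if header in known_terms:
            mapping[header] = header  # same name string: matched automatically
        else:
            unmatched.append(header)  # needs a manual drop-down selection
    return mapping, unmatched


if __name__ == "__main__":
    mapping, unmatched = auto_map(["scientificName", "collector", "eventDate"])
    print("auto-matched:", mapping)
    print("map manually:", unmatched)
```

In this sketch a local heading such as `collector` is returned as unmatched, so the publisher would still have to pair it with a term such as `recordedBy` by hand.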
The current workflow for biodiversity data networks has multiple steps that separate the publishing of datasets from downstream aggregation and enhanced discoverability.
The IPT supports the creation and publication of Darwin Core Archives accessible for download, with a publicly available summary web page. Aggregators harvest, process, and upload Darwin Core Archives into systems effective for searching, filtering, visualization, and download.