Meeting Report: GBIF hackathon-workshop on Darwin Core and sample data (22-24 May 2013)
This is the published version, also available at http://dx.doi.org/10.4056/sigs.4898640. The workshop-hackathon was convened by the Global Biodiversity Information Facility (GBIF) at its secretariat in Copenhagen over 22-24 May 2013, with additional support from several projects (RCN4GSC, EAGER, VertNet, BiSciCol, GGBN, and Micro B3). It assembled a team of experts to address the challenge of adapting the Darwin Core standard for a wide variety of sample data. Topics addressed in the workshop included 1) a review of outstanding issues in the Darwin Core standard, 2) issues relating to publishing of biodiversity data through Darwin Core Archives, 3) use of Darwin Core Archives for publishing sample and monitoring data, 4) the case for modifying the Darwin Core Text Guide specification to support many-to-many relations, and 5) the generalization of the Darwin Core Archive to a “Biodiversity Data Archive”. A wide variety of use cases were assembled and discussed in order to inform further developments.
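A minimal R sketch of the kind of sample data under discussion (all values invented, and expressed with today's ratified Darwin Core terms, some of which post-date the 2013 workshop): a sampling-event core table with occurrences linked back to it by eventID, the star layout a Darwin Core Archive uses.

    # Sampling events (the archive 'core'); each row is one sampling event.
    events <- data.frame(
      eventID          = c("evt-001", "evt-002"),
      eventDate        = c("2013-05-22", "2013-05-23"),
      samplingProtocol = "pitfall trap",
      sampleSizeValue  = 10,
      sampleSizeUnit   = "trap-nights",
      stringsAsFactors = FALSE
    )

    # Occurrences derived from those samples; eventID is the star-schema link.
    occurrences <- data.frame(
      occurrenceID    = c("occ-001", "occ-002", "occ-003"),
      eventID         = c("evt-001", "evt-001", "evt-002"),
      scientificName  = c("Carabus hortensis", "Pterostichus niger", "Carabus hortensis"),
      individualCount = c(3, 1, 5),
      stringsAsFactors = FALSE
    )

    # Abundance per event and species, the summary a monitoring dataset needs.
    aggregate(individualCount ~ eventID + scientificName, data = occurrences, FUN = sum)

The star layout links each extension row to exactly one core row, which is why many-to-many relations (topic 4 above) required separate discussion.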
The Global Genome Biodiversity Network (GGBN) Data Portal
Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies
The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.
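As a small illustration of the kind of annotation these ontologies enable (not an example from the paper: the sample IRI is hypothetical, the ENVO identifier for "soil" should be checked against the current release, and the Biological Collections Ontology would supply further sampling-process classes not shown here), a bulk soil sample can be described with shared term IRIs in a few lines of R that emit standard N-Triples:

    # Annotate one soil sample with Darwin Core and OBO ontology IRIs.
    sample_iri <- "http://example.org/sample/soil-001"                 # hypothetical identifier
    dwc        <- "http://rs.tdwg.org/dwc/terms/"
    obo        <- "http://purl.obolibrary.org/obo/"
    rdf_type   <- "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

    triples <- c(
      sprintf("<%s> <%s> <%s> .", sample_iri, rdf_type, paste0(dwc, "MaterialSample")),
      sprintf("<%s> <%s> <%s> .", sample_iri, rdf_type, paste0(obo, "ENVO_00001998")),  # ENVO 'soil' (verify ID)
      sprintf('<%s> <%s> "soil core, 0-10 cm" .', sample_iri, paste0(dwc, "samplingProtocol")),
      sprintf('<%s> <%s> "2013-05-22" .',         sample_iri, paste0(dwc, "eventDate"))
    )
    writeLines(triples, "soil-sample.nt")   # valid N-Triples, loadable into any triple store

Because the type and property IRIs are shared vocabularies, triples produced this way can be merged with records from other providers without prior column-mapping agreements.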
OpenBiodiv: an Implementation of a Semantic System Running on top of the Biodiversity Knowledge Graph
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System.
The need for an integrated information system serving the needs of the biodiversity community can be dated at least as far back as the sanctioning of the Bouchout declaration in 2007. The Bouchout declaration proposes to make biodiversity knowledge freely available as Linked Open Data (LOD). At TDWG 2016 (Fig. 1) we presented the prototype of the system - then called Open Biodiversity Knowledge Management System (OBKMS). The specification and design of OpenBiodiv was outlined by Senderov and Penev (2016), and in this talk we would like to showcase its pilot. We believe OpenBiodiv is possibly the first pilot-stage implementation of a semantic system running on top of the biodiversity knowledge graph.
OpenBiodiv has several components:
OpenBiodiv ontology: general data model allowing the extraction of biodiversity knowledge from taxonomic articles or from databases such as GBIF. The ontology (in preparation, Journal of Biomedical Semantics, available on GitHub) incorporates several pre-existing models: Darwin-SW (Baskauf and Webb 2016), SPAR (Peroni 2014), the Treatment Ontology, and several others. It defines classes, properties, and rules that allow these disparate ontologies to be interlinked and a LOD dataset of biodiversity knowledge to be created. A new addition is the Taxonomic Name Usage class, accompanied by a Vocabulary of Taxonomic Statuses (created via an analysis of 4,000 Pensoft articles), allowing for the automated inference of the taxonomic status of Latinized scientific names. The ontology allows for multiple backbone taxonomies via the introduction of a Taxon Concept class (equivalent to the Darwin Core Taxon) and Taxon Concept Labels as a subclass of biological name.
The Biodiversity Knowledge Graph - a LOD dataset of information extracted from taxonomic literature and databases. In practice, it realizes part of what was proposed during the pro-iBiosphere project and later discussed by Page (2016). Its main resources are articles, sub-article components (tables, figures, treatments, references), author names, institution names, geographical locations, biological names, taxon concepts, and occurrences. Authors have been disambiguated via their affiliations using fuzzy logic based on the GraphDB Lucene connector. The graph interlinks: (1) prospectively published literature via Pensoft Publishers; (2) legacy literature via Plazi; (3) well-known resources such as geographical places or institutions via DBpedia; (4) GBIF's backbone taxonomy, as a default but not preferred hierarchy of taxon concepts; and (5) OpenBiodiv identifiers with nomenclator identifiers (e.g. ZooBank) whenever possible. Names form two networks in the graph: (1) a directed acyclic graph (DAG) of supersession that can be followed to its sinks to infer the currently applicable scientific name for a given taxon, and (2) a network of bi-directional relations indicating the relatedness of names. These names may be compared to the related names inferred on the basis of distributional semantics by the co-organizers of this workshop (Nguyen et al. 2017).
ropenbio: an R package for RDF-ization of biodiversity information resources according to the OpenBiodiv ontology. It will be submitted to the rOpenSci project. While many of its high-level functions are specific to OpenBiodiv, the low-level functions and the RDF-ization framework can be used for any R-based RDF-ization effort.
OpenBiodiv.net: a front-end of the system allowing users to run low-level SPARQL queries as well as to use an extensible set of semantic apps running on top of the Biodiversity Knowledge Graph.
The talk will showcase the progress of the system from prototype to pilot stage since TDWG 2016. It will focus on the new features and on the web UI that already allows researchers and other interested parties to use the system. We will discuss several possible scenarios, including semantic search and finding related names; a minimal endpoint query from R is sketched below.
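A minimal sketch of such a low-level query from R, assuming a SPARQL 1.1 endpoint that can return CSV results. The endpoint URL is a placeholder to be replaced with the actual OpenBiodiv endpoint, and the query uses only standard RDF Schema vocabulary rather than OpenBiodiv-specific terms:

    library(httr)

    endpoint <- "https://example.org/openbiodiv/sparql"   # placeholder, not the real endpoint

    sparql <- '
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      SELECT ?resource ?label WHERE {
        ?resource rdfs:label ?label .
        FILTER(CONTAINS(LCASE(STR(?label)), "harmonia"))
      } LIMIT 10'

    # Standard SPARQL protocol: the query travels as an HTTP query parameter,
    # and the Accept header asks for the CSV results format.
    res <- GET(endpoint,
               query = list(query = sparql),
               add_headers(Accept = "text/csv"))
    read.csv(text = content(res, as = "text", encoding = "UTF-8"))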
OpenBiodiv Poster: an Implementation of a Semantic System Running on top of the Biodiversity Knowledge Graph
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System.
The need for an integrated information system serving the needs of the biodiversity community can be dated at least as far back as the sanctioning of the Bouchout declaration in 2007. The Bouchout declaration proposes to make biodiversity knowledge freely available as Linked Open Data (LOD). At TDWG 2016 (Fig. 1) we presented the prototype of the system - then called Open Biodiversity Knowledge Management System (OBKMS) (Senderov et al. 2016). The specification and design of OpenBiodiv was then outlined in more detail by Senderov and Penev (2016). In this poster, we describe the pilot implementation. We believe OpenBiodiv is possibly the first pilot-stage implementation of a semantic system running on top of a biodiversity knowledge graph.
OpenBiodiv has several components:
OpenBiodiv ontology: A general data model supporting the extraction of biodiversity knowledge from taxonomic articles or from databases such as GBIF. The ontology (in preparation, Journal of Biomedical Semantics, available on GitHub) incorporates several pre-existing models: Darwin-SW (Baskauf and Webb 2016), SPAR (Peroni 2014), the Treatment Ontology, and several others. It defines classes, properties, and rules supporting the interlinking of these disparate ontologies to create a LOD biodiversity knowledge graph. A new addition is the Taxonomic Name Usage class, accompanied by a Vocabulary of Taxonomic Statuses (created via an analysis of 4,000 Pensoft articles), enabling the automated inference of the taxonomic status of Latinized scientific names. The ontology supports multiple backbone taxonomies via the introduction of a Taxon Concept class (equivalent to the Darwin Core Taxon) and Taxon Concept Labels as a subclass of biological name.
The Biodiversity Knowledge Graph: A LOD dataset of information extracted from taxonomic literature and databases. To date, this resource has realized part of what was proposed during the pro-iBiosphere project and later discussed by Page (2016). Its main resources are articles, sub-article components (tables, figures, treatments, references), author names, institution names, geographical locations, biological names, taxon concepts, and occurrences. Authors have been disambiguated via their affiliations using fuzzy logic based on the GraphDB Lucene connector. The graph interlinks: (1) prospectively published literature via Pensoft Publishers; (2) legacy literature via Plazi; (3) well-known resources such as geographical places or institutions via DBpedia; (4) GBIF's backbone taxonomy, as a default but not preferred hierarchy of taxon concepts; and (5) OpenBiodiv identifiers with nomenclator identifiers (e.g. ZooBank) whenever possible. Names form two networks in the graph: (1) a directed acyclic graph (DAG) of supersession that can be followed to its sinks to infer the currently applicable scientific name for a given taxon, and (2) a network of bi-directional relations indicating the relatedness of names. These names may be compared to the related names inferred on the basis of distributional semantics (Nguyen et al. 2017).
ropenbio: An R package for RDF-ization of biodiversity information resources according to the OpenBiodiv ontology. We intend to submit it to the rOpenSci project. While many of its high-level functions are specific to OpenBiodiv, the low-level functions and the RDF-ization framework can be used for any R-based RDF-ization effort (a generic sketch of this kind of low-level RDF-ization follows this abstract).
OpenBiodiv.net: A front-end of the system allowing users to run low-level SPARQL queries as well as to use an extensible set of semantic apps running on top of a biodiversity knowledge graph.
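A generic sketch of the low-level RDF-ization pattern referred to above, written with the general-purpose rdflib package from CRAN rather than ropenbio itself (whose API is not shown in this abstract); the occurrence identifier and values are invented:

    library(rdflib)   # general-purpose RDF tooling; stands in for ropenbio's low-level layer

    dwc <- "http://rs.tdwg.org/dwc/terms/"   # Darwin Core term namespace
    g   <- rdf()                             # empty in-memory graph

    # One invented occurrence record expressed as triples with literal objects.
    occ <- "http://example.org/openbiodiv/occurrence/1"   # hypothetical identifier
    rdf_add(g, subject = occ, predicate = paste0(dwc, "scientificName"), object = "Harmonia axyridis")
    rdf_add(g, subject = occ, predicate = paste0(dwc, "eventDate"),      object = "2017-05-01")
    rdf_add(g, subject = occ, predicate = paste0(dwc, "country"),        object = "Bulgaria")

    # Read it back with SPARQL; rdf_query() returns a data frame of bindings.
    rdf_query(g, paste0(
      "SELECT ?occ ?name WHERE { ?occ <", dwc, "scientificName> ?name }"))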
OpenBiodiv Computer Demo: an Implementation of a Semantic System Running on top of the Biodiversity Knowledge Graph
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System. We believe OpenBiodiv is possibly the first pilot-stage implementation of a semantic system running on top of the biodiversity knowledge graph.
The need for an integrated information system serving the needs of the biodiversity community can be dated at least as far back as the sanctioning of the Bouchout declaration in 2007. The Bouchout declaration proposes to make biodiversity knowledge freely available as Linked Open Data (LOD). At TDWG 2016 (Fig. 1) we presented the prototype of the system - then called Open Biodiversity Knowledge Management System (OBKMS). The specification and design of OpenBiodiv was outlined by Senderov and Penev (2016), and in this computer demo we would like to showcase its pilot. We will show how to use the SPARQL endpoint directly, we will illustrate the semantic search capabilities of the system, and we will showcase some high-level applications that run on top of it. We will also look at the core dataset (the Biodiversity Knowledge Graph) and the R tools used to create it.
OpenBiodiv has several components:
OpenBiodiv ontology: general data model allowing the extraction of biodiversity knowledge from taxonomic articles or from databases such as GBIF. The ontology (in preparation, Journal of Biomedical Semantics, available on GitHub) incorporates several pre-existing models: Darwin-SW (Baskauf and Webb 2016), SPAR (Peroni 2014), the Treatment Ontology, and several others. It defines classes, properties, and rules that allow these disparate ontologies to be interlinked and a LOD dataset of biodiversity knowledge to be created. A new addition is the Taxonomic Name Usage class, accompanied by a Vocabulary of Taxonomic Statuses (created via an analysis of 4,000 Pensoft articles), allowing for the automated inference of the taxonomic status of Latinized scientific names. The ontology allows for multiple backbone taxonomies via the introduction of a Taxon Concept class (equivalent to the Darwin Core Taxon) and Taxon Concept Labels as a subclass of biological name.
The Biodiversity Knowledge Graph - a LOD dataset of information extracted from taxonomic literature and databases. In practice, it realizes part of what was proposed during the pro-iBiosphere project and later discussed by Page (2016). Its main resources are articles, sub-article components (tables, figures, treatments, references), author names, institution names, geographical locations, biological names, taxon concepts, and occurrences. Authors have been disambiguated via their affiliations using fuzzy logic based on the GraphDB Lucene connector. The graph interlinks: (1) prospectively published literature via Pensoft Publishers; (2) legacy literature via Plazi; (3) well-known resources such as geographical places or institutions via DBpedia; (4) GBIF's backbone taxonomy, as a default but not preferred hierarchy of taxon concepts; and (5) OpenBiodiv identifiers with nomenclator identifiers (e.g. ZooBank) whenever possible. Names form two networks in the graph: (1) a directed acyclic graph (DAG) of supersession that can be followed to its sinks to infer the currently applicable scientific name for a given taxon (a minimal sketch of this traversal follows at the end of this abstract), and (2) a network of bi-directional relations indicating the relatedness of names. These names may be compared to the related names inferred on the basis of distributional semantics by the co-organizers of this workshop (Nguyen et al. 2017).
ropenbio: an R package for RDF-ization of biodiversity information resources according to the OpenBiodiv ontology. It will be submitted to the rOpenSci project. While many of its high-level functions are specific to OpenBiodiv, the low-level functions and the RDF-ization framework can be used for any R-based RDF-ization effort.
OpenBiodiv.net: a front-end of the system allowing users to run low-level SPARQL queries as well as to use an extensible set of semantic apps running on top of the Biodiversity Knowledge Graph.
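A minimal sketch of following the supersession DAG to its sink, as described above. The name strings and "is superseded by" edges are invented, and the igraph package stands in for whatever traversal the production system uses:

    library(igraph)

    # Invented edges: each points from a name to the name that supersedes it.
    edges <- data.frame(
      from = c("Aus bus Smith, 1900", "Aus cus Jones, 1950", "Cus bus (Smith, 1900)"),
      to   = c("Cus bus (Smith, 1900)", "Cus bus (Smith, 1900)", "Dus bus (Smith, 1900)")
    )
    g <- graph_from_data_frame(edges, directed = TRUE)

    # Walk forward until a sink (out-degree 0); that sink is the currently applicable name.
    current_name <- function(g, name) {
      v <- name
      repeat {
        succ <- neighbors(g, v, mode = "out")
        if (length(succ) == 0) return(v)
        v <- as_ids(succ)[1]   # the DAG assumption guarantees the walk terminates
      }
    }

    current_name(g, "Aus bus Smith, 1900")   # returns "Dus bus (Smith, 1900)"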