18 research outputs found

    Training and hackathon on building biodiversity knowledge graphs

    Knowledge graphs have the potential to unite disconnected digitized biodiversity data, and there are a number of efforts underway to build biodiversity knowledge graphs. More generally, the recent popularity of knowledge graphs, driven in part by the advent and success of the Google Knowledge Graph, has breathed life into the ongoing development of semantic web infrastructure and prototypes in the biodiversity informatics community. We describe a one-week training event and hackathon that focused on applying three specific knowledge graph technologies (the Neptune graph database, Metaphactory, and Wikidata) to a diverse set of biodiversity use cases. We give an overview of the training, the projects that were advanced throughout the week, and the critical discussions that emerged. We believe that the main barriers to adoption of biodiversity knowledge graphs are the lack of understanding of knowledge graphs and the lack of adoption of shared unique identifiers. Furthermore, we believe an important advancement in the outlook of knowledge graph development is the emergence of Wikidata as an identifier broker and as a scoping tool. To overcome the current barriers to biodiversity knowledge graph development, we recommend continued discussions at workshops and at conferences, which we expect to increase awareness and adoption of knowledge graph technologies.

    Camtrap DP: an open standard for the FAIR exchange and archiving of camera trap data

    Camera trapping has revolutionized wildlife ecology and conservation by providing automated data acquisition, leading to the accumulation of massive amounts of camera trap data worldwide. Although management and processing of camera trap-derived Big Data are becoming increasingly solvable with the help of scalable cyber-infrastructures, harmonization and exchange of the data remain limited, hindering its full potential. There is currently no widely accepted standard for exchanging camera trap data. The only existing proposal, “Camera Trap Metadata Standard” (CTMS), has several technical shortcomings and limited adoption. We present a new data exchange format, the Camera Trap Data Package (Camtrap DP), designed to allow users to easily exchange, harmonize and archive camera trap data at local to global scales. Camtrap DP structures camera trap data in a simple yet flexible data model consisting of three tables (Deployments, Media and Observations) that supports a wide range of camera deployment designs, classification techniques (e.g., human and AI, media-based and event-based) and analytical use cases, from compiling species occurrence data through distribution, occupancy and activity modeling to density estimation. The format further achieves interoperability by building upon existing standards, Frictionless Data Package in particular, which is supported by a suite of open software tools to read and validate data. Camtrap DP is the consensus of a long, in-depth, consultation and outreach process with standard and software developers, the main existing camera trap data management platforms, major players in the field of camera trapping and the Global Biodiversity Information Facility (GBIF). Under the umbrella of the Biodiversity Information Standards (TDWG), Camtrap DP has been developed openly, collaboratively and with version control from the start. 
    We encourage camera trapping users and developers to join the discussion and contribute to the further development and adoption of this standard. Keywords: biodiversity data, camera traps, data exchange, data sharing, information standards.
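The three-table model described in this abstract can be pictured with a small sketch. It is not the Camtrap DP specification: the column names (`deploymentID`, `mediaID`, `observationID`, `locationName`, `scientificName`) and all rows are illustrative assumptions, and a real package would normally be read and validated with Frictionless tooling rather than hand-written parsing.

```python
import csv
import io

# Illustrative, made-up rows; column names are assumptions for this sketch.
DEPLOYMENTS_CSV = """deploymentID,locationName
dep1,Forest edge
dep2,River bank
"""

MEDIA_CSV = """mediaID,deploymentID,filePath
med1,dep1,img_0001.jpg
med2,dep1,img_0002.jpg
med3,dep2,img_0003.jpg
"""

OBSERVATIONS_CSV = """observationID,deploymentID,mediaID,scientificName
obs1,dep1,med1,Vulpes vulpes
obs2,dep1,med2,Vulpes vulpes
obs3,dep2,med3,Lutra lutra
"""


def read_table(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def species_by_location(deployments, observations):
    """Join observations to deployments and collect species per location."""
    location_of = {row["deploymentID"]: row["locationName"] for row in deployments}
    result = {}
    for obs in observations:
        location = location_of[obs["deploymentID"]]
        result.setdefault(location, set()).add(obs["scientificName"])
    return result


deployments = read_table(DEPLOYMENTS_CSV)
media = read_table(MEDIA_CSV)
observations = read_table(OBSERVATIONS_CSV)

# Species occurrence per location, derived from the Deployments/Observations link.
occurrences = species_by_location(deployments, observations)

# Observations whose media file is missing from the Media table (none here).
media_ids = {row["mediaID"] for row in media}
unlinked = [o["observationID"] for o in observations if o["mediaID"] not in media_ids]
```

Keeping the tables separate and linking them by identifier is what lets the same package serve both media-based and event-based classification: an observation can point at one media file or only at a deployment.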

    Having Your Cake and Eating It Too: JSON-LD as an RDF serialization format

    One impediment to the uptake of linked data technology is developers’ unfamiliarity with typical Resource Description Framework (RDF) serializations like Turtle and RDF/XML. JSON for Linking Data (JSON-LD) is designed to bypass this problem by expressing linked data in the well-known JavaScript Object Notation (JSON) format that is popular with developers. JSON-LD is now Google’s preferred format for exposing Schema.org structured data in web pages for search optimization, leading to its widespread use by web developers. Another successful use of JSON-LD is by the International Image Interoperability Framework (IIIF), which restricts its use to a narrow design pattern that is readily consumed by a variety of applications. This presentation will show how a similar design pattern has been used in Audubon Core and with Biodiversity Information Standards (TDWG) controlled vocabularies to serialize data in a manner that is easily consumed by conventional applications but can also be loaded seamlessly as RDF into triplestores or other linked data applications. The presentation will also suggest how JSON-LD might be used in other contexts within TDWG vocabularies, including with the Darwin Core Resource Relationship terms.
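The dual reading this abstract describes, plain JSON for conventional applications and RDF for linked data tools, can be sketched as follows. The record and its `example.org` IRI are hypothetical, and `to_triples` is a deliberately simplified stand-in for a JSON-LD processor: it assumes a flat object with string values and a plain term-to-IRI context.

```python
import json

# Hypothetical record; the example.org IRI is a placeholder.
RECORD = """
{
  "@context": {
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "creator": "http://purl.org/dc/terms/creator"
  },
  "@id": "http://example.org/image/1",
  "label": "Leaf, upper surface",
  "creator": "A. Photographer"
}
"""


def to_triples(doc):
    """Expand a flat JSON-LD object into (subject, predicate, literal) triples.

    Handles only string values and a simple term-to-IRI context; a real
    JSON-LD processor covers far more (nesting, @type, language maps, ...).
    """
    context = doc["@context"]
    subject = doc["@id"]
    return [
        (subject, context[key], value)
        for key, value in doc.items()
        if not key.startswith("@")
    ]


doc = json.loads(RECORD)

# Reading 1: ordinary JSON access, no RDF knowledge needed.
plain_label = doc["label"]

# Reading 2: the same data interpreted as RDF triples.
triples = to_triples(doc)
```

Restricting documents to a narrow, predictable shape like this is what makes the first reading reliable: a conventional application can depend on fixed keys, while the context still gives every key a globally unique meaning.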

    Baskauf, Steven J.


    ORGANIZATION OF BIODIVERSITY RESOURCES BASED ON THE PROCESS OF THEIR CREATION AND THE ROLE OF INDIVIDUAL ORGANISMS AS RESOURCE RELATIONSHIP NODES

    Kinds of occurrences (evidence of particular living organisms) can be grouped by common data and metadata characteristics that are determined by the way that the occurrence represents the organism. The creation of occurrence resources follows a pattern that can be used as the basis for organizing both the metadata associated with those resources and the relationships among the resources. The central feature of this organizational system is a resource representing the individual organism. This resource serves as a node that connects the organism's occurrences and any determinations of the organism's taxonomic identity. I specify a relatively small number of predicates that can define the important relationships among these resources and suggest which metadata properties should logically be associated with each kind of resource.

    Implementation of the TDWG Standards Documentation Specification

    The Standards Documentation Specification (SDS) was ratified as a TDWG standard in 2017 (Baskauf et al. 2017). It specified formatting guidelines for documents, but also established a data model for standards components and their versions. The SDS provided broad guidelines for Internationalized Resource Identifier (IRI) use, and associated specific metadata properties with particular components, but left many details to implementers. Since 2017, progress has been made toward implementing the requirements of the SDS. Part of that work was reformatting existing documents to comply with the standard, but another major effort was designating IRI patterns, assembling the necessary metadata, and creating a system that can generate the many vocabulary-related documents necessary to comply with the SDS. These documents include not only human-readable documents but also machine-readable documents that can be acquired through content negotiation. In addition, the system makes available dumps that can be loaded into a graph database for querying. In this presentation, we will review these developments and see how the newly available machine-readable metadata can be used to answer questions about existing standards and future controlled vocabularies.

    Creating and Maintaining TDWG Vocabularies using Spreadsheets

    Because TDWG vocabularies change and grow as they are developed by the community, it is nearly impossible to document their version history and generate both machine and human readable documentation by manual editing of multiple documents in several formats. In this talk, I will provide an overview of the workflow that has been established to maintain vocabularies in accordance with the TDWG Standards Documentation and Vocabulary Maintenance specifications. I will show how vocabulary creators and maintainers can use simple CSV spreadsheets to create new vocabularies or to update existing ones. I will also provide an overview of the Python scripts that TDWG infrastructure maintainers use to process those simple spreadsheets to turn them into the authoritative files in TDWG's rs.tdwg.org GitHub repository, which serves as the data source for both machine readable serializations of the vocabularies and human readable standards documents
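The spreadsheet-driven workflow in this abstract can be illustrated with a short sketch in which a single CSV source feeds both a machine-readable and a human-readable output. This is not the TDWG infrastructure scripts: the column names (`term_localName`, `label`, `definition`), the `example.org` namespace, and both functions are hypothetical.

```python
import csv
import io
import json

# Hypothetical spreadsheet content and namespace, for illustration only.
NAMESPACE = "http://example.org/vocab/"
SPREADSHEET = """term_localName,label,definition
v001,Example term A,First illustrative definition.
v002,Example term B,Second illustrative definition.
"""

SKOS = "http://www.w3.org/2004/02/skos/core#"


def rows_to_jsonld(csv_text, namespace):
    """Turn one spreadsheet row per term into a JSON-LD graph of SKOS concepts."""
    graph = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph.append({
            "@id": namespace + row["term_localName"],
            "@type": "skos:Concept",
            "skos:prefLabel": {"@value": row["label"], "@language": "en"},
            "skos:definition": {"@value": row["definition"], "@language": "en"},
        })
    return {"@context": {"skos": SKOS}, "@graph": graph}


def rows_to_markdown(csv_text):
    """Render the same rows as a human-readable term list."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append(f"- {row['label']} ({row['term_localName']}): {row['definition']}")
    return "\n".join(lines)


# Both outputs come from the same source, so they cannot drift apart.
machine_readable = json.dumps(rows_to_jsonld(SPREADSHEET, NAMESPACE), indent=2)
human_readable = rows_to_markdown(SPREADSHEET)
```

Generating every serialization from one tabular source is the design point: maintainers edit a spreadsheet, and consistency across formats follows from the build step rather than from manual editing.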

    Translating TDWG Controlled Vocabularies

    Users may be more likely to understand and utilize standards if they are able to read labels and definitions of terms in their own languages. Increasing standards usage in non-English speaking parts of the world will be important for making biodiversity data from across the globe more uniformly available. For these reasons, it is important for Biodiversity Information Standards (TDWG) to make its standards widely available in as many languages as possible. Currently, TDWG has six ratified controlled vocabularies that were originally available only in English. As an outcome of this workshop, we have made term labels and definitions in those vocabularies available in the languages of translators who participated in its sessions. In the introduction, we reviewed the concept of vocabularies, explained the distinction between term labels and controlled value strings, and described how multilingual labels and definitions fit into the standards development process. The introduction was followed by working sessions in which individual translators or small groups working in a single language filled out Google Sheets with their translations. The resulting translations were compiled along with attribution information for the translators and made freely available in JavaScript Object Notation (JSON) and comma-separated values (CSV) formats.

    Darwin-SW version 1.0

    Version 1.0 follows a period of use and testing. Some property annotations were changed to follow the conventions of Darwin Core and Dublin Core. Version 1.0 specifies which of the inverse property pairs is preferred if linking is done only in one direction. Evidence-related properties are declared to be subClassOf relevant Relations Ontology terms.

    Using the Audubon Core Controlled Vocabularies for subjectPart and subjectOrientation

    When the Audubon Core Multimedia Resources Metadata Schema was ratified, it included two terms for describing what was being viewed in an image of an organism: ac:subjectPart, to indicate the morphological component of the organism included in the view, and ac:subjectOrientation, to describe the direction or viewing angle of the subject part relative to the image acquisition device. Although it was recommended that values for those terms come from controlled vocabularies, no such vocabularies had been created by TDWG. In 2019, the Views Controlled Vocabularies Task Group was chartered to develop controlled vocabularies for these two terms. The result was two Simple Knowledge Organization System (SKOS) concept schemes and a mechanism for determining which subjectOrientation values are appropriate for a given subjectPart and which subjectParts are appropriate for various organism groups. In this presentation, we briefly review the vocabulary development process and key features of the vocabularies, and give an overview of how the vocabularies can be used in several example cases.