3,075 research outputs found

GinJinn: An object‐detection pipeline for automated feature extraction from herbarium specimens

Author: Oberprieler Christoph
Ott Tankred
Palm Christoph
Vogt Robert
Publication date: 01/01/2020

Premise: The generation of morphological data in evolutionary, taxonomic, and ecological studies of plants using herbarium material has traditionally been a labor‐intensive task. Recent progress in machine learning using deep artificial neural networks (deep learning) for image classification and object detection has facilitated the establishment of a pipeline for the automatic recognition and extraction of relevant structures in images of herbarium specimens. Methods and Results: We implemented an extendable pipeline based on state‐of‐the‐art deep‐learning object‐detection methods to collect leaf images from herbarium specimens of two species of the genus Leucanthemum. Using 183 specimens as the training data set, our pipeline extracted one or more intact leaves in 95% of the 61 test images. Conclusions: We establish GinJinn as a deep‐learning object‐detection tool for the automatic recognition and extraction of individual leaves or other structures from herbarium specimens. Our pipeline offers greater flexibility and a lower entrance barrier than previous image‐processing approaches based on hand‐crafted features.

Institutional Repository of the Freie Universität Berlin

University of Regensburg Publication Server

Extraction and parsing of herbarium specimen data: Exploring the use of the Dublin core application profile framework

Author: Huang Jane
McCotter Melody J.
Moen William E.
Publication date: 03/02/2010

Herbaria around the world house millions of plant specimens; botanists and other researchers value these resources as ingredients in biodiversity research. Even when the specimen sheets are digitized and made available online, the critical information about the specimen stored on the sheet is not in a usable (i.e., machine-processable) form. This paper describes a current research and development project that is designing and testing high-throughput workflows that combine machine and human processes to extract and parse the specimen label data. The primary focus of the paper is the metadata needs for the workflow and the creation of the structured metadata records describing the plant specimen. In the project, we are exploring the use of the new Dublin Core Metadata Initiative framework for application profiles. First articulated as the Singapore Framework for Dublin Core Application Profiles in 2007, the use of this framework is in its infancy. The promises of this framework for maximum interoperability, for documenting the use of metadata for maximum reusability, and for supporting metadata applications that conform to Web architectural principles provide the incentive to explore and add implementation experience regarding this new framework.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Specimens as research objects: reconciliation across distributed repositories to enable metadata propagation

Author: Nicolson Nicky
Paton Alan
Phillips Sarah
Tucker Allan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/09/2018

Botanical specimens are shared as long-term consultable research objects in a global network of specimen repositories. Multiple specimens are generated from a shared field collection event; generated specimens are then managed individually in separate repositories and independently augmented with research and management metadata which could be propagated to their duplicate peers. Establishing a data-derived network for metadata propagation will enable the reconciliation of closely related specimens which are currently dispersed, unconnected and managed independently. Following a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 36% or 7,102,710 specimens are assessed to participate in duplication relationships, allowing the propagation of metadata among the participants in these relationships, totalling: 93,044 type citations, 1,121,865 georeferences, 1,097,168 images and 2,191,179 scientific name determinations. The results enable the creation of networks to identify which repositories could work in collaboration. Some classes of annotation (particularly those regarding scientific name determinations) represent units of scientific work; appropriate management of this data would allow the accumulation of scholarly credit to individual researchers. Potential further work in this area is discussed. Comment: 9 pages, 1 table, 3 figures

arXiv.org e-Print Archive

Crossref

High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data

Author: Amin Aliasgar
Arsiwala Zainab
Best Jason
Huang Jane Q.
McCotter Melody
Moen William E.
Neill Amanda
Publication venue: Georgia Institute of Technology
Publication date: 01/05/2009

4th International Conference on Open Repositories. This presentation was part of the session: Conference Posters. Hundreds of thousands of specimens in herbaria and natural history museums worldwide are potential candidates for digitization, making them more accessible to researchers. An herbarium contains collections of preserved plant specimens created for scientific use. Herbarium specimens are ideal natural history objects for digitization, as the plants are pressed flat and dried, and mounted on individual sheets of paper, creating a nearly two-dimensional object. Building digital repositories of herbarium specimens can increase use and exposure of the collections while simultaneously reducing physical handling. As important as the digitized specimens are, the data contained on the associated specimen labels provide critical information about each specimen (e.g., scientific name, geographic location of specimen, etc.). The volume and heterogeneity of these printed label data present challenges in transforming them into meaningful digital form to support research. The Apiary Project is addressing these challenges by exploring and developing transformation processes in a systematic workflow that yields high-quality machine-processable label data in a cost- and time-efficient manner. The University of North Texas's Texas Center for Digital Knowledge (TxCDK) and the Botanical Research Institute of Texas (BRIT), with funding from an Institute of Museum and Library Services National Leadership Grant, are conducting fundamental research with the goal of identifying how human intelligence can be combined with machine processes for effective and efficient transformation of specimen label information. The results of this research will yield a new workflow model for effective and efficient label data transformation, correction, and enhancement. Institute of Museum and Library Services, National Leadership Grant

Scholarly Materials And Research @ Georgia Tech

Back Matter 7 (2)

Publication venue: Scholarship @ Claremont
Publication date: 01/01/1970

Digitization workflows for flat sheets and packets of plants, algae, and fungi

Author: Allard Dorothy
Brown Herrick
Carter J. Richard
Denslow Michael W.
Ellwood Elizabeth R.
Germain‐Aubrey Charlotte C.
Gilbert Ed
Gillespie Emily
Goertzen Leslie R.
Legler Ben
Marchant D. Blaine
Marsico Travis D.
Mast Austin R.
Morris Ashley B.
Murrell Zack
Nazaire Mare
Neefus Chris
Nelson Gil
Oberreiter Shanna
Paul Deborah
Rabeler Richard K.
Ruhfel Brad R.
Sasek Thomas
Shaw Joey
Soltis Pamela S.
Sweeney Patrick
Wallace Lisa E.
Watson Kimberly
Weeks Andrea
Publication venue: 'Botanical Society of America'
Publication date: 01/01/2015

Peer Reviewed https://deepblue.lib.umich.edu/bitstream/2027.42/141708/1/aps31500065.pd

PubMed Central

Carolina Digital Repository

Deep Blue Documents at the University of Michigan