8 research outputs found

    Comparative breeding biology of the Northern Rockhopper penguin Eudyptes moseleyi on Gough and Nightingale Islands

    Get PDF
    The Northern Rockhopper Penguin Eudyptes moseleyi is listed as Endangered due to an estimated 57% decrease in breeding numbers over the past 37 years. Approximately 85% of the global population breeds at the Tristan da Cunha archipelago (Tristan, Inaccessible and Nightingale Islands) and nearby Gough Island in the central South Atlantic Ocean. The population on Gough Island declined by 50-60% between 1982 and 2005, but in the Tristan da Cunha archipelago the population trend over the last few decades is believed to be stable despite long-term human exploitation (particularly egg collection on Nightingale Island in recent years). This study compares aspects of the breeding biology on Gough Island (where population numbers are decreasing) and Nightingale Island (where numbers are thought to be stable), based on data gathered from five colonies in the 2012/13 and 2013/14 breeding seasons. On Nightingale Island, breeding success was 6.5% lower and 40-day-old chick mass was 47% lower (implying poor juvenile recruitment) than on Gough Island. Poor foraging conditions for birds on Nightingale Island are the most probable explanation for these results, and future studies should focus on the foraging locations used by birds on both islands. Additionally, egg collection practices on Nightingale Island may have had a negative impact on the population, and I recommend that the temporary ban on egg collection at Nightingale Island be made permanent. It is also possible that the population on Gough Island is no longer in decline, or is declining for reasons unrelated to breeding success; to verify this and confirm the findings of this study, future population trends and chick fledging mass on both islands should be monitored over the long term.

    Presence-absence of plant habitat specialists in 15 patches of dry calcareous grassland

    Get PDF
    Background: Dry grasslands on calcareous bedrock in warm climates around the Oslo Fjord are naturally fragmented biodiversity hotspots. This habitat geographically coincides with the most densely populated area of Norway. Many habitat specialists, along with the habitat itself, are red-listed because of land-use change, forest encroachment, and invasive species that cause habitat loss and greater isolation of remaining patches. To ensure effective conservation, data on species presences and absences are necessary to quantify states, changes, and extinction risks in specific populations and habitat patches. New information: We present presence-absence data of 49 vascular plant species in 15 patches of dry calcareous grassland habitat, surveyed in 2009, 2019, and 2020. The species are considered habitat specialists and are thus unlikely to occur between the patches. Keywords: sampling event, vascular plants, specialist species, presence-absence data, calcareous grassland, habitat patch, GBIF
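    As a brief illustrative sketch (not part of the published dataset): presence-absence data of this kind are commonly encoded as one row per survey event and species, using Darwin Core terms such as eventID, scientificName and occurrenceStatus. The patch identifiers and species below are invented placeholders, shown only to make the data structure concrete.

        # Hypothetical encoding of presence-absence records as Darwin Core-style
        # rows: one row per (habitat patch x survey year x species), with
        # occurrenceStatus recording "present" or "absent". All values are
        # invented placeholders for illustration.
        import csv

        records = [
            ("patch-01-2009", "Species A", "present"),
            ("patch-01-2009", "Species B", "absent"),
            ("patch-01-2019", "Species A", "absent"),   # apparent local extinction
            ("patch-02-2020", "Species B", "present"),
        ]

        with open("occurrences.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["eventID", "scientificName", "occurrenceStatus"])
            writer.writerows(records)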

    SHARING BIODIVERSITY DATA THROUGH GBIF COLOMBIA: An invitation to the business sector

    Get PDF
    The Global Biodiversity Information Facility (GBIF) is the world's largest biodiversity data network. As an international open-data infrastructure, it allows anyone to access, share, and use information about the species of our planet.

    BioDATA - Biodiversity Data for Internationalisation in Higher Education

    Get PDF
    BioDATA is an international project on developing skills in biodiversity data management and data publishing. Between 2018 and 2021, undergraduate and postgraduate students from Armenia, Belarus, Tajikistan, and Ukraine have the opportunity to take part in intensive courses to become certified professionals in biodiversity data management. They will gain practical skills and knowledge of international data standards (Darwin Core), data-cleaning software, data-publishing software such as the Integrated Publishing Toolkit (IPT), and the preparation of data papers. Working with databases, creating datasets, managing data for statistical analyses, and publishing research papers are essential everyday tasks of a modern biologist. At the same time, these skills are rarely taught in higher education; most contemporary professionals in biodiversity have had to gain them independently, through colleagues, or through supervision. In addition, all participants familiarize themselves with one of the most important international research data infrastructures, the Global Biodiversity Information Facility (GBIF). The project is coordinated by the University of Oslo (Norway), supported by GBIF, and funded by the Norwegian Agency for International Cooperation and Quality Enhancement in Higher Education (DIKU).

    "Publish First": A Rapid, GPT-4 Based Digitisation System for Small Institutes with Minimal Resources

    No full text
    We present a streamlined technical solution ("Publish First") designed to assist smaller, resource-constrained herbaria in rapidly publishing their specimens to the Global Biodiversity Information Facility (GBIF).

    Specimen data from smaller herbaria, particularly those in biodiversity-rich regions of the world, provide a valuable and often unique contribution to the global pool of biodiversity knowledge (Marsico et al. 2020). However, these institutions often face challenges not applicable to larger herbaria, including a lack of staff with technical skills, limited staff hours for digitization work, inadequate financial resources for specialized scanning equipment, cameras, lights, and imaging stands, limited (or no) access to computers and collection management software, and unreliable internet connections. Data-scarce and biodiversity-rich countries are also often linguistically diverse (Gorenflo et al. 2012), and staff may not have English skills, which means pre-existing online data publication resources and guides are of limited use.

    The "Publish First" method we are trialing addresses several of these issues: it drastically simplifies the publication process so technical skills are not necessary; it minimizes administrative tasks, saving time; it uses simple, cheap and easily available hardware; it does not require any specialized software; and the process is so simple that there is little to no need for written instructions.

    "Publish First" requires staff to attach QR code labels containing identifiers to herbarium specimen sheets, scan these sheets using a document scanner costing around €300, then drag and drop the files to an S3 bucket (a cloud container that specialises in storing files). The images are then automatically processed by an Optical Character Recognition (OCR) service to extract text, which is passed to OpenAI's Generative Pre-trained Transformer 4 (GPT-4) Application Programming Interface (API) for standardization. The standardized data is integrated into a Darwin Core Archive file that is automatically published through GBIF's Integrated Publishing Toolkit (IPT) (GBIF 2021).

    The most technically challenging aspect of this project has been the standardization of OCR data to Darwin Core using the GPT-4 API, particularly in crafting precise prompts to address the inherent inconsistency and unreliability of these Large Language Models (LLMs). Despite this, GPT-4 outperformed our manual scraping efforts. Our choice of GPT-4 as a model was a naive one: we ran the workflow on pre-digitized specimens from previously published Norwegian collections, compared the published data on GBIF with GPT-4's Darwin Core-standardized output, and found the results satisfactory. Moving forward, we plan to undertake more rigorous research to compare the effectiveness and cost-efficiency of different LLMs as Darwin Core standardization engines. We are also particularly interested in exploring the new "function calling" feature of the GPT-4 API, as it promises to let us retrieve standardized data in a more consistent and structured format.

    This workflow is currently under trial in Tajikistan, and may be used in Uzbekistan, Armenia and Italy in the near future.
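    A minimal sketch of the central standardization step described above, assuming the official OpenAI Python client and OCR text already extracted from a scanned sheet. The prompt wording and the Darwin Core field list are illustrative assumptions, not the project's actual prompts.

        # Sketch: pass OCR'd specimen-label text to GPT-4 and ask for Darwin Core
        # fields as JSON. Requires the `openai` package and an OPENAI_API_KEY in
        # the environment. Prompt and field list are assumptions for illustration.
        import json
        from openai import OpenAI

        client = OpenAI()

        SYSTEM_PROMPT = (
            "You receive OCR text from a herbarium specimen label. "
            "Return a JSON object using Darwin Core terms: scientificName, "
            "recordedBy, eventDate, country, locality. Use null for fields "
            "you cannot determine. Return only the JSON object."
        )

        def standardize_label(ocr_text: str) -> dict:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": ocr_text},
                ],
            )
            # LLM output is not guaranteed to be valid JSON; a production
            # pipeline would validate the result and retry on failure.
            return json.loads(response.choices[0].message.content)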

    "Publish First": A Rapid, GPT-4 Based Digitisation System for Small Institutes with Minimal Resources

    No full text
    We present a streamlined technical solution ("Publish First") designed to assist smaller, resource-constrained herbaria in rapidly publishing their specimens to the Global Biodiversity Information Facility (GBIF).Specimen data from smaller herbaria, particularly those in biodiversity-rich regions of the world, provide a valuable and often unique contribution to the global pool of biodiversity knowledge (Marsico et al. 2020). However, these institutions often face challenges not applicable to larger herbaria, including a lack of staff with technical skills, limited staff hours for digitization work, inadequate financial resources for specialized scanning equipment, cameras, lights, and imaging stands, limited (or no) access to computers and collection management software, and unreliable internet connections. Data-scarce and biodiversity rich countries are also often linguistically diverse (Gorenflo et al. 2012), and staff may not have English skills, which means pre-existing online data publication resources and guides are of limited use.The "Publish First" method we are trialing, addresses several of these issues: it drastically simplifies the publication process so technical skills are not necessary; it minimizes administrative tasks saving time; it uses simple, cheap and easily available hardware; it does not require any specialized software; and the process is so simple that there is little to no need for any written instructions."Publish first" requires staff to attach QR code labels containing identifiers to herbarium specimen sheets, scan these sheets using a document scanner costing around €300, then drag and drop these files to an S3 bucket (a cloud container that specialises in storing files). Subsequently, these images are automatically processed through an Optical Character Recognition (OCR) service to extract text, which is then passed on to OpenAI's Generative Pre-Transformer 4 (GPT-4) Application Programming Interface (API), for standardization. The standardized data is integrated into a Darwin Core Archive file that is automatically published through GBIF's Integrated Publishing Toolkit (IPT) (GBIF 2021).The most technically challenging aspect of this project has been the standardization of OCR data to Darwin Core using the GPT-4 API, particularly in crafting precise prompts to address the inherent inconsistency and lack of reliability in these Large Language Models (LLMs). Despite this, GPT-4 outperformed our manual scraping efforts. Our choice of GPT-4 as a model was a naive one: we implemented the workflow on some pre-digitized specimens from previously published Norwegian collections, compared the published data on GBIF with GPT-4's Darwin Core standardized output, and found the results satisfactory. Moving forward, we plan to undertake more rigorous additional research to compare the effectiveness and cost-efficiency of different LLMs as Darwin Core standardization engines. We are also particularly interested in exploring the new "function calling" feature added to the GPT-4 API, as it promises to allow us to retrieve standardized data in a more consistent and structured format.This workflow is currently under trial in Tajikistan, and may possibly be used in Uzbekistan, Armenia and Italy in the near future

    An Update on Persistent Identifiers in Norwegian Biodiversity Data

    No full text
    Persistent identifiers (PIDs) are reference keys to pieces of digital information or digital objects (Meadows and Haak 2018). PIDs are long-lasting, trustworthy and ideally globally unique, allowing information to be unambiguously associated with a digital object. This allows, for example, collection objects to be annotated with data (e.g., improved geographic coordinates) from different web services and databases (Page 2008). In 2014, Norway began an initiative to provide all museum specimens at the University of Oslo's Natural History Museum (UiO NHM) with persistent and globally unique identifiers (PIDs) in the form of universally unique identifiers (UUIDs) prefixed by a persistent uniform resource locator (PURL) (Endresen and Svindseth 2014). This poster provides an update on progress made in this endeavor, and details the technical setup and workflow, as well as problems encountered and reflections on the process. So far, roughly 40% of UiO NHM's collections have PIDs entered in the Darwin Core materialSampleID field of the collection management system. The main technical problem has been configuring the PURL service to set up redirects correctly. In the future, we plan to migrate from PURLs to the Handle system, which provides identifiers for digital objects, or to DOIs (Digital Object Identifiers), which are an implementation of the Handle system (Paskin 2017).
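    A PID of the kind described, a UUID prefixed by a PURL, can be minted in a few lines. In the sketch below the namespace URL is a hypothetical placeholder, not UiO NHM's actual PURL prefix.

        # Mint a PURL-prefixed UUID to use as a persistent identifier, e.g. in
        # the Darwin Core materialSampleID field. The namespace is hypothetical.
        import uuid

        PURL_NAMESPACE = "http://purl.org/example-nhm/id/"  # placeholder prefix

        def mint_pid() -> str:
            return PURL_NAMESPACE + str(uuid.uuid4())

        print(mint_pid())
        # e.g. http://purl.org/example-nhm/id/1b4e28ba-2fa1-11d2-883f-0016d3cca427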

    Modelling Research Expeditions in Wikidata: Best Practice for Standardisation and Contextualisation

    No full text
    Expeditions and other collecting events are a major source of objects in natural history museums (e.g., Mesibov 2021). Historically, these trips were often transdisciplinary: biological and Earth science specimens were collected at the same time as ethnological or anthropological objects. As a result, specimens and other material gathered during the same expedition, as well as the related data and metadata, are often distributed across multiple institutions. Many expeditions were driven by colonial agendas, aiming to discover new resources to exploit, and their findings were seldom shared with the source countries and local people. Understanding these expeditions illuminates the colonial origins of museum collections, and contributes to recognizing and addressing their impacts (e.g., Das and Lowe 2018, Ashby and Machin 2021).

    Research expeditions continue to contribute to natural history collections. There is a need to link historical and contemporary research expeditions to other entities, requiring the unambiguous labelling (and persistent identifiers) of such events. Stable identifiers for expeditions, plus the sharing of metadata and descriptions in a wide range of languages, will facilitate access to scattered information about the event, the institutions housing specimens and objects, the participants and the locations visited, and assist with the linking of distributed material and related research data. However, structured data for scientific expeditions are currently lacking: while identifier systems have been created for many entities over the last few decades, there is no dedicated identifier for research expeditions and similar events. Several studies have shown the importance of people identifiers for linking collection data (e.g., Groom et al. 2022), and we argue the same is true for expeditions.

    Wikidata is a multilingual, community-curated knowledge base containing data structured in a human- and machine-readable format. It allows easy creation, updating and enrichment of items on expeditions, and provides stable identifiers for them that can be used in collection management systems. Expeditions can be linked to participants and other agents, regions, localities, objects, archival material, maps, publications, field notebooks, documentary footage and art works resulting from the expeditions, thus making historical information more easily accessible and assisting with the acknowledgment of any imperial or colonial impact of the expedition. Expeditions in Wikidata can be hierarchical, e.g., linking a series of related events together under an umbrella project, providing a machine-readable way to harvest all project data. Wikidata can also provide links between present-day countries and historical names for locations (e.g., former colonial names). Expeditions published as Linked Open Data make datasets more FAIR (Findable, Accessible, Interoperable, Reusable), and are also useful in data transcription and validation processes. Visualisation of itinerary data and travel routes also facilitates data quality checks.

    An informal working group of people interested in the topic was formed to discuss standards and share best practices and recommendations regarding terminology, data modelling and contextualisation. Building upon previous work (e.g., Bauer et al. 2022, von Mering et al. 2022, Leachman 2023), we aim to work towards the enrichment, linking and standardisation of data about research expeditions.
    If the Wikidata identifiers of these expeditions and participants are added to the records of the corresponding entities in the collection management system, institutions can link from their own collection metadata to the relations made in Wikidata, including to collections in other institutions. The participants of an expedition can be further linked to specimens gathered during it using tools such as Bionomia, which can facilitate data round-tripping between these collections and specimen records, the Global Biodiversity Information Facility (GBIF) and Wikidata (Shorthouse 2020). Other initiatives, such as the Distributed System of Scientific Collections (DiSSCo), are also interested in incorporating these identifiers as links and annotations.
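    As an illustration of the machine-readable linking described above, the sketch below queries Wikidata's public SPARQL endpoint for the participants (Wikidata property P710) of an expedition item. The QID shown is a placeholder to be replaced with a real expedition item's identifier.

        # Query Wikidata's SPARQL endpoint for the participants of an expedition.
        # Q1234567 is a placeholder QID; substitute the expedition item of interest.
        import requests

        ENDPOINT = "https://query.wikidata.org/sparql"
        QUERY = """
        SELECT ?participant ?participantLabel WHERE {
          wd:Q1234567 wdt:P710 ?participant .
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }
        """

        resp = requests.get(
            ENDPOINT,
            params={"query": QUERY, "format": "json"},
            headers={"User-Agent": "expedition-linking-demo/0.1"},
        )
        resp.raise_for_status()
        for row in resp.json()["results"]["bindings"]:
            print(row["participant"]["value"], row["participantLabel"]["value"])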

    BioDATA: Biodiversity data mobilisation and data publication training in Eurasia

    No full text
    BioDATA (Biodiversity Data for Internationalisation in Higher Education) is an international project to develop and deliver biodiversity data training for undergraduate and postgraduate students from Armenia, Belarus, Tajikistan, and Ukraine. By training early career (student) biodiversity scholars, we aim to turn the current academic and educational biodiversity landscape into a more open-data-friendly one. Professional practitioners (researchers, museum curators, and collection managers involved in data publishing) from each country were also invited to join the project as assistant teachers (mentors). The project is developed by the Research School in Biosystematics - ForBio and the Norwegian GBIF node, both at the Natural History Museum of the University of Oslo, in collaboration with the Secretariat of the Global Biodiversity Information Facility (GBIF) and partners from each of the target countries. The teaching material is based on the GBIF curriculum for data mobilization, and all students will have the opportunity to gain the respective GBIF certification. All materials are made freely available for reuse, and even in this very early phase of the project we have already seen the first successful reuse of teaching materials among the project partners. The first BioDATA training event was organized in Minsk (Belarus) in February 2019, with the objective of training a minimum of four mentors from each target country. The mentor-trainees from this event will help us teach the course to students in their home countries together with teachers from the project team. BioDATA mentors will have the opportunity to gain GBIF certification as expert mentors, which will open opportunities to contribute to future training events in the larger GBIF network. The BioDATA training events for students will take place in Dushanbe (Tajikistan) in June 2019, in Minsk (Belarus) in November 2019, in Yerevan (Armenia) in April 2020, and in Kiev (Ukraine) in October 2020. Students from each country are invited to express their interest in participating by contacting their national project partner. We will close the project with a final symposium at the University of Oslo in March 2021. The project is funded by the Norwegian Agency for International Cooperation and Quality Enhancement in Higher Education (DIKU).