Search CORE

17 research outputs found

Applications of Natural Language Processing in Biodiversity Science

Author: Cui Hong
Mozzherin Dmitry
Thessen Anne E.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2012

Centuries of biological knowledge are contained in the massive body of scientific literature, written for human readers but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life-wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.

Crossref

Woods Hole Open Access Server

Directory of Open Access Journals

PubMed Central

Semantic annotation of morphological descriptions: an overall strategy

Author: A Taylor
D Kirkup
E Riloff
G Curry
G Diggs
G Sautter
H Cui
Hong Cui
MM Wood
R Abascal
S Lydon
S Soderland
X Tang
Publication venue: Springer Science and Business Media LLC

Crossref

User-centered semantic dataset retrieval

Author: Löffler Felicitas
Publication date: 01/01/2023

Finding relevant research data is an increasingly important but time-consuming task in daily research practice. Several studies report difficulties in dataset search, e.g., scholars retrieve only part of the pertinent data, and important information cannot be displayed in the user interface. Overcoming these problems has motivated a number of research efforts in computer science, such as text mining and semantic search. In particular, the emergence of the Semantic Web opens a variety of novel research perspectives. Motivated by these challenges, the overall aim of this work is to analyze the current obstacles in dataset search and to propose and develop a novel semantic dataset search. The studied domain is biodiversity research, a domain that explores the diversity of life, habitats and ecosystems. This thesis has three main contributions: (1) We evaluate the current situation in dataset search in a user study, and we compare a semantic search with a classical keyword search to explore the suitability of Semantic Web technologies for dataset search. (2) We generate a question corpus and develop an information model to determine which scientific topics interest scholars in biodiversity research. Moreover, we analyze the gap between current metadata and scholarly search interests, and we explore whether metadata and user interests match. (3) We propose and develop an improved dataset search based on three components: (A) a text mining pipeline, enriching metadata and queries with semantic categories and URIs, (B) a retrieval component with a semantic index over categories and URIs and (C) a user interface that enables a search within categories and a search including further hierarchical relations. Following user-centered design principles, we ensure user involvement in various user studies during the development process.

Digitale Bibliothek Thüringen

The data concept behind the data: From metadata models and labelling schemes towards a generic spectral library

Author: Arnold Stephan
Canters Frank
Heiden Uta
Hueni A.
Ji Chaonan
Jilge Marianne
Priem Frederik
Publication date: 01/01/2022

Spectral libraries play a major role in imaging spectroscopy. They are commonly used to store end-member and spectrally pure material spectra, which are primarily used for mapping or unmixing purposes. However, the development of spectral libraries is time-consuming and usually sensor- and site-dependent. Spectral libraries are therefore often developed, used and tailored only for a specific case study and only for one sensor. Multi-sensor and multi-site use of spectral libraries is difficult and requires technical effort for adaptation, transformation, and data harmonization steps. In particular, the huge number of urban material specifications and their spectral variations hampers the setup of a complete spectral library consisting of all available urban material spectra. Combining different urban spectral libraries could, besides improving spectral inter- and intra-class variability, allow missing material spectra to be considered for multi-sensor and multi-site use. Publicly available spectral libraries mostly lack the metadata information that is essential for describing spectra acquisition and sampling background, and that can serve to some extent as a measure of the quality and reliability of the spectra and of the entire library itself. In the GenLib project, a concept for a generic, multi-site and multi-sensor usable spectral library for image spectra with an urban focus was developed. This presentation introduces (1) a unified, easy-to-understand hierarchical labelling scheme combined with (2) a comprehensive metadata concept that is (3) implemented in the SPECCHIO spectral information system to promote the setup and usability of a generic urban spectral library (GUSL). The labelling scheme was developed to ensure the translation of individual spectral libraries, with their own labelling schemes and their usually varying levels of detail, into the GUSL framework. It is based on a modified version of the EAGLE classification concept, combining land use, land cover, land characteristics and spectral characteristics. The metadata concept consists of 59 mandatory and optional attributes that are intended to specify the spatial context, spectral library information, references, accessibility, calibration, preprocessing steps, and spectra-specific information describing library spectra implemented in the GUSL. It was developed on the basis of existing metadata concepts and was the subject of an expert survey. The metadata concept and the labelling scheme are implemented in the spectral information system SPECCHIO, which is used for sharing and holding GUSL spectra. It allows easy implementation of spectra as well as their specification with the proposed metadata information to extend the GUSL. Therefore, the proposed data model represents a first fundamental step towards a generically usable and continuously expandable spectral library for urban areas. The metadata concept and the labelling scheme also form the basis for the necessary adaptation and transformation steps of the GUSL in order to use it entirely or in excerpts for further multi-site and multi-sensor applications.

Institute of Transport Research:Publications

AI for Everyone?

Publication venue: University of Westminster Press
Publication date: 14/09/2022

We are entering a new era of technological determinism and solutionism in which governments and business actors are seeking data-driven change, assuming that Artificial Intelligence is now inevitable and ubiquitous. But we have not even started asking the right questions, let alone developed an understanding of the consequences. Urgently needed is debate that asks and answers fundamental questions about power. This book brings together critical interrogations of what constitutes AI, its impact and its inequalities in order to offer an analysis of what it means for AI to deliver benefits for everyone. The book is structured in three parts: Part 1, AI: Humans vs. Machines, presents critical perspectives on human-machine dualism. Part 2, Discourses and Myths About AI, excavates metaphors and policies to ask normative questions about what is ‘desirable’ AI and what conditions make this possible. Part 3, AI Power and Inequalities, discusses how the implementation of AI creates important challenges that urgently need to be addressed. Bringing together scholars from diverse disciplinary backgrounds and regional contexts, this book offers a vital intervention on one of the most hyped concepts of our times.

Directory of Open Access Books (DOAB)

AI for Everyone? Critical Perspectives

Author: Akdag Salah A.A.
Araújo W.F.
Babu A.
Brevini B.
Daly A.
Dencik L.
Devit S.K.
Grohmann R.
Hofkirchner W.
Kaplan A.
Mann M.
McQuillan D.
Ng J.
O’Connell C.
Prodnik J.A.
Rehak R.
Shahin S.
Steinhoff J.
Van de Wiele C.
Verdegem P.
Publication venue: University of Westminster Press
Publication date: 01/01/2021

WestminsterResearch

Towards data justice unionism? A labour perspective on AI governance

Author: Dencik Lina
Publication venue: University of Westminster Press
Publication date: 20/09/2021

Goldsmiths Research Online

AI for Everyone?

Publication venue: University of Westminster Press

OAPEN Library

Book of abstracts, 4th World Congress on Agroforestry

Author: Dupraz Christian (ed.)
Gosme Marie (ed.)
Lawson Gerry (ed.)
Publication venue: CIRAD (Centre de Cooperation Internationale en Recherche Agronomique Pour le Developpement)
Publication date: 01/01/2019

Agritrop

HAL-CIRAD