Data standardization

Author: Gal MS
Rubinfeld DL
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

With data rapidly becoming the lifeblood of the global economy, the ability to improve its use significantly affects both social and private welfare. Data standardization is key to facilitating and improving the use of data when data portability and interoperability are needed. Absent data standardization, a “Tower of Babel” of different databases may be created, limiting synergetic knowledge production. Based on interviews with data scientists, this Article identifies three main technological obstacles to data portability and interoperability: metadata uncertainties, data transfer obstacles, and missing data. It then explains how data standardization can remove at least some of these obstacles and lead to smoother data flows and better machine learning. The Article then identifies and analyzes additional effects of data standardization. As shown, data standardization has the potential to support a competitive and distributed data collection ecosystem and lead to easier policing in cases where rights are infringed or unjustified harms are created by data-fed algorithms. At the same time, increasing the scale and scope of data analysis can create negative externalities in the form of better profiling, increased harms to privacy, and cybersecurity harms. Standardization also has implications for investment and innovation, especially if lock-in to an inefficient standard occurs. The Article then explores whether market-led standardization initiatives can be relied upon to increase welfare, and the role governmental-facilitated data standardization should play, if at all.

eScholarship - University of California

Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-Dimensional Review

Author: Mayr Philipp
Zeng Marcia Lei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/01/2018

Since the Simple Knowledge Organization System (SKOS) specification and its SKOS eXtension for Labels (SKOS-XL) became formal W3C recommendations in 2009, a significant number of conventional knowledge organization systems (KOS) (including thesauri, classification schemes, name authorities, and lists of codes and terms, produced before the arrival of the ontology-wave) have made their journeys to join the Semantic Web mainstream. This paper uses "LOD KOS" as an umbrella term to refer to all of the value vocabularies and lightweight ontologies within the Semantic Web framework. The paper provides an overview of what the LOD KOS movement has brought to various communities and users. These are not limited to the colonies of the value vocabulary constructors and providers, nor the catalogers and indexers who have a long history of applying the vocabularies to their products. The LOD dataset producers and LOD service providers, the information architects and interface designers, and researchers in sciences and humanities, are also direct beneficiaries of LOD KOS. The paper examines a set of the collected cases (experimental or in real applications) and aims to find the usages of LOD KOS in order to share the practices and ideas among communities and users. Through the viewpoints of a number of different user groups, the functions of LOD KOS are examined from multiple dimensions. This paper focuses on the LOD dataset producers, vocabulary producers, and researchers (as end-users of KOS). Comment: 31 pages, 12 figures, accepted paper in International Journal on Digital Libraries

arXiv.org e-Print Archive

SSOAR - Social Science Open Access Repository

Realizing the Promise of a Digital Ecosystem for Science and Scholarship

Author: Huerta PhD, Michael
Publication venue: 'The University of Kansas'
Publication date: 01/09/2017

The University of Kansas: Journals@KU

Biodiversity Informatics

Enhancing Scholarly Publications: Developing Hybrid Monographs in the Humanities and Social Sciences

Author: Jankowski Nicholas
Scharnhorst Andrea
Tatum Clifford
Tatum Zuotian
Publication venue: 'CISP Journal Services'
Publication date: 19/12/2012

Enhancing publications has a long history but is gaining acceleration as authors and publishers explore electronic tablets as devices for dissemination and presentation. Enhancement of scholarly publications, in contrast, more often takes place in a Web environment and is coupled with presentation of supplementary materials related to research. The approach to enhancing scholarly publications presented in this article goes a step further and involves the interlinking of the “objects” of a document: datasets, supplementary materials, secondary analyses, and post-publication interventions. This approach connects the user-centricity of Web 2.0 with the Semantic Web. It aims at facilitating long-term content structure through standardized formats intended to improve interoperability between concepts and terms within and across knowledge domains. We explored this conception of enhancement on a small set of books prepared for traditional academic publishers. While the project was primarily an exercise in development, the conclusion section of the article reflects on areas where conceptual and empirical studies could be initiated to complement this new direction in scholarly publishing.

Scholarly and Research Communication (E-Journal)

Enabling semantic queries across federated bioinformatics databases

Author: Anisimova M
Dessimoz C
Gil M
Mendes de Farias T
Robinson-Rechavi M
Sima AC
Stockinger H
Stockinger K
Zbinden E
Publication date: 07/11/2019

MOTIVATION: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases. RESULTS: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology data store; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.

UCL Discovery

Semantic Linking of Research Infrastructure Metadata

Author: A Stellato
B Jörg
C Lagoze
D Corsar
D Le-Phuoc
G Bella
G Montoya
I Altintas
J Madin
K Patroumpas
MD Wilkinson
O Hartig
O Zamazal
P Martin
T Baker
T Berners-Lee
T Miksa
X Liao
Y Marketakis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

The INCF Digital Atlasing Program: Report on Digital Atlasing Standards in the Rodent Brain

Author: Albert Burger
Fons Verbeek
G. Allan Johnson
Ilya Zaslavsky
Jonathan Nissanov
Jyl Boline
Luis Puelles
Lydia Ng
Maryann Martone
Michael Hawrylycz
Seth Ruffins
Tsutomu Hashikawa
Publication date: 23/11/2009

The goal of the INCF Digital Atlasing Program is to provide the vision and direction necessary to make the rapidly growing collection of multidimensional data of the rodent brain (images, gene expression, etc.) widely accessible and usable to the international research community. This Digital Brain Atlasing Standards Task Force was formed in May 2008 to investigate the state of rodent brain digital atlasing, and formulate standards, guidelines, and policy recommendations.

Our first objective has been the preparation of a detailed document that includes the vision and specific description of an infrastructure, systems and methods capable of serving the scientific goals of the community, as well as practical issues for achieving the goals. This report builds on the 1st INCF Workshop on Mouse and Rat Brain Digital Atlasing Systems (Boline et al., 2007, _Nature Precedings_, doi:10.1038/npre.2007.1046.1) and includes a more detailed analysis of both the current state and desired state of digital atlasing along with specific recommendations for achieving these goals.

Crossref

Nature Precedings