Dataset Search in Biodiversity Research: Do Metadata in Data Repositories Reflect Scholarly Information Needs?
Abstract
The increasing amount of publicly available research data provides the opportunity to link and integrate data in order to create and test novel hypotheses, to repeat experiments, or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for reuse is a time-consuming task in daily research practice. In this study, we explore what hampers dataset retrieval in biodiversity research, a field that produces large amounts of heterogeneous data. In particular, we focus on scholarly search interests and metadata, the primary source of data in a dataset retrieval system. We show that existing metadata currently reflect information needs poorly and are therefore the biggest obstacle to retrieving relevant data. Our findings indicate that for data seekers in the biodiversity domain, environments, materials and chemicals, species, biological and chemical processes, locations, data parameters and data types are important information categories. These interests are well covered by the metadata elements of domain-specific standards. However, instead of utilizing these standards, large data repositories tend to use metadata standards with domain-independent metadata fields that cover search interests only to some extent. A second problem is the arbitrary keywords used in descriptive fields such as title, description or subject. Keywords support scholars in a full-text search only if the provided terms match syntactically or their semantic relationship to the terms used in a user query is known.
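The last point can be illustrated with a minimal sketch. The records, keywords and synonym table below are invented examples, not taken from any actual repository; the sketch only shows why a purely syntactic full-text match fails where a known semantic relationship (here, simple synonymy) succeeds.

```python
# Toy metadata records with free-text keyword fields (invented examples).
records = {
    "ds1": {"title": "Beech forest soil samples", "subject": ["Fagus sylvatica"]},
    "ds2": {"title": "Grassland insect survey", "subject": ["Orthoptera"]},
}

# A purely syntactic full-text search: the query term must appear verbatim.
def syntactic_search(query, records):
    q = query.lower()
    return [rid for rid, r in records.items()
            if q in r["title"].lower()
            or any(q in kw.lower() for kw in r["subject"])]

# A minimal "semantic" layer: expand the query with known synonyms
# (a hand-made mapping standing in for a thesaurus or domain ontology).
SYNONYMS = {"beech": ["fagus sylvatica"], "grasshopper": ["orthoptera"]}

def semantic_search(query, records):
    terms = [query] + SYNONYMS.get(query.lower(), [])
    hits = set()
    for t in terms:
        hits.update(syntactic_search(t, records))
    return sorted(hits)

print(syntactic_search("grasshopper", records))  # [] - no verbatim match
print(semantic_search("grasshopper", records))   # ['ds2'] - via the thesaurus
```

The syntactic search finds nothing for "grasshopper" even though a relevant dataset exists; only the thesaurus lookup bridges the vocabulary gap between the user query and the arbitrary keyword.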
Recommended from our members
Results of the ontology alignment evaluation initiative 2019
The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity (from simple thesauri to expressive OWL ontologies) and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2019 campaign offered 11 tracks with 29 test cases, and was attended by 20 participants. This paper is an overall presentation of that campaign.
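Comparing matching systems on a test case typically means scoring a produced alignment against a reference alignment. The sketch below is an illustration of that standard precision/recall/F-measure scoring, with made-up entity names; it is not OAEI tooling.

```python
# Illustrative evaluation of an ontology alignment against a reference.
# Correspondences are modelled as (source_entity, target_entity) pairs.

def evaluate_alignment(system, reference):
    """Return (precision, recall, F1) of a system alignment."""
    system, reference = set(system), set(reference)
    correct = system & reference
    precision = len(correct) / len(system) if system else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented example: three reference correspondences, two system guesses.
reference = {("A#Person", "B#Human"), ("A#Paper", "B#Article"),
             ("A#Author", "B#Writer")}
system = {("A#Person", "B#Human"), ("A#Paper", "B#Document")}

p, r, f = evaluate_alignment(system, reference)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.33 0.4
```

One of the two system correspondences is correct (precision 0.5), one of three reference correspondences is found (recall 0.33), giving an F-measure of 0.4.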
Techniques for organizational memory information systems
The KnowMore project aims at providing active support to humans working on knowledge-intensive tasks. To this end, the knowledge available in the modeled business processes, or their incarnations in specific workflows, shall be used to improve information handling. We present a representation formalism for knowledge-intensive tasks and the specification of its object-oriented realization. An operational semantics is sketched by specifying the basic functionality of the Knowledge Agent, which works on the knowledge-intensive task representation.
The Knowledge Agent uses a meta-level description of all information sources available in the Organizational Memory. We discuss the main dimensions along which such a description scheme must be designed, namely information content, structure, and context. On top of relational database management systems, we essentially realize deductive object-oriented modeling with a comfortable annotation facility. The concrete knowledge descriptions are obtained by configuring the generic formalism with ontologies which describe the required modeling dimensions.
To support access to documents, data, and formal knowledge in an Organizational Memory, an integrated domain ontology and thesaurus is proposed which can be constructed semi-automatically by combining document-analysis and knowledge-engineering methods. Thereby the costs of up-front knowledge engineering and the need to consult domain experts can be considerably reduced. We present an automatic thesaurus generation tool and show how it can be applied to build and enhance an integrated ontology/thesaurus. A first evaluation shows that the proposed method does indeed facilitate knowledge acquisition and maintenance of an organizational memory.
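One common ingredient of semi-automatic thesaurus construction from documents is co-occurrence statistics: terms that repeatedly appear in the same documents are proposed as related. The sketch below illustrates only that generic idea with invented toy documents; it is an assumption for illustration, not the KnowMore tool or its actual algorithm.

```python
# Rough sketch: proposing related-term pairs from document co-occurrence.
from collections import Counter
from itertools import combinations

# Toy "documents" reduced to bags of terms (invented examples).
docs = [
    "workflow task knowledge agent",
    "knowledge agent ontology thesaurus",
    "ontology thesaurus document analysis",
]

# Count how many documents each unordered term pair co-occurs in.
pair_counts = Counter()
for doc in docs:
    for a, b in combinations(sorted(set(doc.split())), 2):
        pair_counts[(a, b)] += 1

# Terms co-occurring in more than one document are proposed as related.
related = [pair for pair, n in pair_counts.items() if n > 1]
print(related)  # [('agent', 'knowledge'), ('ontology', 'thesaurus')]
```

In a real setting the threshold would be replaced by an association measure (and the candidate pairs reviewed by a knowledge engineer), but the document-analysis step reduces how much must be elicited from domain experts up front.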
A method for creating digital signature policies.
Increased political pressure towards a more efficient public sector has resulted in the proliferation of electronic documents and associated technologies such as Digital Signatures. Whilst Digital Signatures provide electronic document security functions, they do not convey the legal meaning of a signature, which captures the conditions under which a signature can be deemed legally valid. Whilst in the paper world this information is often communicated implicitly, verbally or through notes within the document itself, in the electronic world a technological tool is required to communicate this meaning; one such technological aid is the Digital Signature Policy. In a transaction where the legality of a signature must be established, a Digital Signature Policy can convey the necessary contextual information required to make such a judgment. The Digital Signature Policy captures information such as the terms to which a signatory wishes to bind himself, the actual legal clauses and acts being invoked by the process of signing, the conditions under which a signatory's signature is deemed legally valid, and other such information. As this is a relatively new technology, little literature exists on the topic. This research was conducted as an Action Research collaboration with a Spanish public-sector organisation that sought to introduce Digital Signature Policy technology; its specific research problem was that the production of Digital Signature Policies was time-consuming, resource-intensive, arduous and suffered from a lack of quality. The research therefore sought to develop a new and improved method for creating Digital Signature Policies. The researcher collaborated with the problem owner, as is typical of Participative Action Research.
The research resulted in the development of a number of Information Systems artefacts and of a method for creating Digital Signature Policies, and finally led to a stage where the problem owner could successfully develop the research further without the researcher's input.
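The kinds of contextual information a signature policy carries can be sketched as a simple record type. The field names and example values below are invented for illustration; they are not taken from the ETSI signature-policy formats or from the method developed in this research.

```python
# Hypothetical sketch of the contextual information a Digital Signature
# Policy might capture (field names and values are illustrative only).
from dataclasses import dataclass, field

@dataclass
class SignaturePolicy:
    policy_id: str                 # unique identifier of this policy
    commitment: str                # terms the signatory binds himself to
    legal_clauses: list = field(default_factory=list)   # clauses/acts invoked
    validity_conditions: list = field(default_factory=list)  # when valid

policy = SignaturePolicy(
    policy_id="urn:example:policy:procurement-2024",
    commitment="Approval of the attached procurement contract",
    legal_clauses=["eIDAS Regulation (EU) No 910/2014, Art. 25"],
    validity_conditions=["Signature created with a qualified certificate",
                         "Timestamp present at signing time"],
)
print(policy.policy_id)
```

A verifier would evaluate the `validity_conditions` of the referenced policy at verification time to decide whether the signature can be deemed legally valid in that transaction's context.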
Business process documentation in creative work systems: a design science study in television production
This study investigates the influence of creativity on the modeling and documentation of business processes. Based on a substantive theory, a modeling method is developed that allows for the capturing of creativity-intensive processes. This method discards a strict and formal interpretation of the predominant control-flow paradigm in process modeling and thus conforms to the flexibility requirements of creative work systems. Emanating from the creative product, creative and administrative subprocesses are successively revealed, thus enabling the adequate management of both process types. The method is applied to the context of German television production. Comprehensive process models for the product lines television movie, primetime series, daily soap and entertainment are derived from qualitative interview data. The results of this evaluation are fed back to both the method and the foundational theory.
PREDON Scientific Data Preservation 2014
LPSC14037
Scientific data collected with modern sensors or dedicated detectors very often exceed the perimeter of the initial scientific design. These data are obtained more and more frequently with large material and human effort. A large class of scientific experiments is in fact unique because of its large scale, with very small chances of being repeated or superseded by new experiments in the same domain: for instance, high-energy physics and astrophysics experiments involve multi-annual developments, and a simple duplication of effort in order to reproduce old data is simply not affordable. Other scientific experiments are unique by nature (earth science, medical sciences, etc.), since the collected data are "time-stamped" and thereby non-reproducible by new experiments or observations. In addition, scientific data collection has increased dramatically in recent years, contributing to the so-called "data deluge" and inviting common reflection in the context of "big data" investigations. The new knowledge obtained using these data should be preserved over the long term, so that access and re-use remain possible and lead to an enhancement of the initial investment. Data observatories, based on open-access policies and coupled with multi-disciplinary techniques for indexing and mining, may lead to truly new paradigms in science. It is therefore of utmost importance to pursue a coherent and vigorous approach to preserving scientific data over the long term. Preservation nevertheless remains a challenge due to the complexity of data structures, the fragility of custom-made software environments, and the lack of rigorous approaches in workflows and algorithms. To address this challenge, the PREDON project was initiated in France in 2012 within the MASTODONS program, a Big Data scientific challenge initiated and supported by the Interdisciplinary Mission of the National Centre for Scientific Research (CNRS).
PREDON is a study group formed by researchers from different disciplines and institutes. Several meetings and workshops led to a rich exchange of ideas, paradigms and methods. The present document includes contributions of the participants in the PREDON Study Group, as well as invited papers, related to the scientific case, methodology and technology. This document should be read as a "fact-finding" resource pointing to a concrete and significant scientific interest in long-term research data preservation, as well as to cutting-edge methods and technologies to achieve this goal. A sustained, coherent and long-term action in the area of scientific data preservation would be highly beneficial.
A modular, open-source information extraction framework for identifying clinical concepts and processes of care in clinical narratives
In this thesis, a synthesis is presented of the knowledge models required by clinical information systems that provide decision support for longitudinal processes of care. Qualitative research techniques and thematic analysis are applied, in a novel way, to a systematic review of the literature on the challenges in implementing such systems, leading to the development of an original conceptual framework. The thesis demonstrates how these process-oriented systems make use of a knowledge base derived from workflow models and clinical guidelines, and argues that one of the major barriers to implementation is the need to extract explicit and implicit information from diverse resources in order to construct the knowledge base. Moreover, concepts in both the knowledge base and the electronic health record (EHR) must be mapped to a common ontological model. However, the majority of clinical guideline information remains in text form, and much of the useful clinical information in the EHR resides in the free-text fields of progress notes and laboratory reports. In this thesis, it is shown how natural language processing and information extraction techniques provide a means to identify and formalise the knowledge components required by the knowledge base. Original contributions are made in the development of lexico-syntactic patterns and the use of external domain knowledge resources to tackle a variety of information extraction tasks in the clinical domain, such as recognition of clinical concepts, events and temporal relations, term disambiguation and abbreviation expansion. Methods are developed for adapting existing tools and resources in the biomedical domain to the processing of clinical texts, and approaches to improving the scalability of these tools are proposed and evaluated. These tools and techniques are then combined in the creation of a novel approach to identifying processes of care in the clinical narrative.
It is demonstrated that resolution of coreferential and anaphoric relations as narratively and temporally ordered chains provides a means to extract linked narrative events and processes of care from clinical notes. Coreference performance in discharge summaries and progress notes is largely dependent on correct identification of protagonist chains (patient, clinician, family relation), pronominal resolution, and string matching that takes account of experiencer, temporal, spatial and anatomical context; for laboratory reports, additional external domain knowledge is required. The types of external knowledge and their effects on system performance are identified and evaluated. Results are compared against existing systems for solving these tasks and are found to improve on them, or to approach the performance of recently reported state-of-the-art systems. Software artefacts developed in this research have been made available as open-source components within the General Architecture for Text Engineering framework.
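To make the idea of abbreviation expansion via a simple heuristic concrete: the sketch below applies a naive Schwartz-Hearst-style first guess (the long form is assumed to span as many words before the parenthesis as the short form has letters). The example note is invented, and this is a generic illustration of the task, not the thesis's actual patterns or tools.

```python
# Naive abbreviation expansion: for each parenthesised capitalised short
# form, take as many preceding words as the short form has letters.
import re

def find_abbreviations(text):
    """Return (short_form, guessed_long_form) pairs found in text."""
    results = []
    for m in re.finditer(r"\(([A-Z]{2,6})\)", text):
        short = m.group(1)
        words = text[:m.start()].rstrip().split()
        long_form = " ".join(words[-len(short):])
        results.append((short, long_form))
    return results

note = ("Patient has a history of chronic obstructive pulmonary disease "
        "(COPD) and congestive heart failure (CHF).")

for short, long_form in find_abbreviations(note):
    print(short, "=", long_form)
# COPD = chronic obstructive pulmonary disease
# CHF = congestive heart failure
```

The heuristic fails whenever word count and letter count diverge ("atrial fibrillation (AFib)"), which is where the richer lexico-syntactic patterns and external domain knowledge discussed above earn their keep.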