66 research outputs found
How can heat maps of indexing vocabularies be utilized for information seeking purposes?
The ability to browse an information space in a structured way by exploiting
similarities and dissimilarities between information objects is crucial for
knowledge discovery. Knowledge maps use visualizations to gain insights into
the structure of large-scale information spaces, but are still far away from
being applicable for searching. The paper proposes a use case for enhancing
search term recommendations by heat map visualizations of co-word
relation-ships taken from indexing vocabulary. By contrasting areas of
different "heat" the user is enabled to indicate mainstream areas of the field
in question more easily.Comment: URL workshop proceedings: http://ceur-ws.org/Vol-1311
Bibliometric-enhanced Retrieval Models for Big Scholarly Information Systems
Bibliometric techniques are not yet widely used to enhance retrieval
processes in digital libraries, although they offer value-added effects for
users. In this paper we will explore how statistical modelling of scholarship,
such as Bradfordizing or network analysis of coauthorship network, can improve
retrieval services for specific communities, as well as for large, cross-domain
large collections. This paper aims to raise awareness of the missing link
between information retrieval (IR) and bibliometrics / scientometrics and to
create a common ground for the incorporation of bibliometric-enhanced services
into retrieval at the digital library interface.Comment: 4 pages, IEEE BigData 2013, Workshop on Scholarly Big Data:
Challenges and Idea
Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications
In this paper, we describe our effort to create a new corpus for the evaluation of detecting and linking so-called survey variables in social science publications (e.g., "Do you believe in Heaven?"). The task is to recognize survey variable mentions in a given text, disambiguate
them, and link them to the corresponding variable within a knowledge base. Since there are generally hundreds of candidates to link to and due to the wide variety of forms they can take, this is a challenging task within NLP. The contribution of our work is the first gold standard corpus for the variable detection and linking task. We describe the annotation guidelines and the annotation process. The produced corpus is multilingual - German and English - and includes manually curated word and phrase alignments. Moreover, it includes text samples that could not be assigned to any variables, denoted as negative examples. Based on the new dataset, we conduct an evaluation of several state-of-the-art text classification and textual similarity methods. The annotated corpus is made available along with an open-source baseline system for variable mention identification and linking
Mining Social Science Publications for Survey Variables
Research in Social Science is usually based on survey data where individual research questions relate to observable concepts (variables). However, due to a lack of standards for data citations a reliable identification of the variables used is often difficult. In this paper, we present a work-in-progress study that seeks to provide a solution to the variable detection task based on supervised machine learning algorithms, using a linguistic analysis pipeline to extract a rich feature set, including terminological concepts and similarity metric scores. Further, we present preliminary results on a small dataset that has been specifically designed for this task, yielding modest improvements over the baseline
Applying Science Models for Search
The paper proposes three different kinds of science models as value-added
services that are integrated in the retrieval process to enhance retrieval
quality. The paper discusses the approaches Search Term Recommendation,
Bradfordizing and Author Centrality on a general level and addresses
implementation issues of the models within a real-life retrieval environment.Comment: 14 pages, 3 figures, ISI 201
- …