Search CORE

2,524 research outputs found

Query Expansion for Survey Question Retrieval in the Social Sciences

Author: B Zapilko
C Carpineto
D Hienert
DC Blair
E Brent
GW Furnas
J Xu
K Järvelin
P Schaer
S Dallmeier-Tiessen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/06/2015

In recent years, the importance of research data and the need to archive and share it within the scientific community have increased enormously. This introduces a whole new set of challenges for digital libraries. In the social sciences, typical research data sets consist of surveys and questionnaires. In this paper we focus on the use case of social science survey question reuse and on mechanisms to support users in formulating queries for data sets. We describe and evaluate thesaurus- and co-occurrence-based approaches for query expansion to improve retrieval quality in digital libraries and research data archives. The challenge here is to translate the information need and the underlying sociological phenomena into proper queries. As we show, retrieval quality can be improved by adding related terms to the queries. In a direct comparison, automatically expanded queries using extracted co-occurring terms can provide better results than queries manually reformulated by a domain expert and better results than a keyword-based BM25 baseline. Comment: to appear in Proceedings of the 19th International Conference on Theory and Practice of Digital Libraries 2015 (TPDL 2015).
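The co-occurrence-based expansion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_cooccurrence` and `expand_query`, the whitespace tokenization, the Jaccard scoring, and the cutoff `k` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(docs):
    """Return document frequencies for single terms and for unordered term pairs."""
    term_df = Counter()
    pair_df = Counter()
    for doc in docs:
        # Assumed tokenization: lowercase and split on whitespace.
        terms = sorted(set(doc.lower().split()))
        term_df.update(terms)
        # Pairs come out in sorted order because `terms` is sorted and unique.
        pair_df.update(combinations(terms, 2))
    return term_df, pair_df


def expand_query(query, term_df, pair_df, k=2):
    """Append the k terms with the highest Jaccard co-occurrence to any query term."""
    q_terms = query.lower().split()
    scores = {}
    for cand in term_df:
        if cand in q_terms:
            continue
        best = 0.0
        for q in q_terms:
            if q not in term_df:
                continue  # query term never seen in the collection
            both = pair_df[tuple(sorted((q, cand)))]
            union = term_df[q] + term_df[cand] - both
            best = max(best, both / union)
        if best > 0:
            scores[cand] = best
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return q_terms + ranked[:k]
```

On a toy collection, a query for "unemployment" would pick up terms that frequently appear in the same survey questions. The expanded term list could then be passed to whatever ranking function the archive uses, such as BM25.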

arXiv.org e-Print Archive

Crossref

Towards a Universal Wordnet by Learning from Combined Evidence

Author: de Melo G.
Weikum G.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2009

Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.

MPG.PuRe

A Spinning Wheel for YARN: User Interface for a Crowdsourced Thesaurus

Author: Braslavski P.
Mukhin M.
Ustalov D.
Браславский П. И.
Мухин М. Ю.
Усталов Д. А.
Publication date: 01/01/2014

The YARN (Yet Another RussNet) project, started in 2013, aims to create a large open thesaurus for Russian using crowdsourcing. This paper describes the synset assembly interface developed within the project: the motivation behind it, its design, usage scenarios, implementation details, and first experimental results.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Indexing Languages for Information Management, a Promising Future or an Obsolete Resource?

Author: Fraga Anabel
Morato Lara Jorge Luis
Moreiro González José Antonio
Sánchez Cuadrado Sonia
Publication venue: University of Westminster School of Media, Arts and Design
Publication date: 01/01/2009

Indexing languages have traditionally been an essential tool for organizing and retrieving documentary information. The inclusion of indexing languages in the digital environment leads to new frontiers, but also to new opportunities. This study shows the historical evolution of indexing languages and their application in the document management field. We analyze diverse trends in their digital use from two perspectives: their integration with other digital and linguistic resources, and their adaptation to the Web environment. Finally, there is an analysis of how these languages are used in Web 2.0 and of the incorporation of ontologies in the Semantic Web. This work was carried out within the framework of a research project financed by the Spanish government (Ministerio de Educación y Ciencia, Secretaría de Estado de Universidades e Investigación, TIN 2007-67153).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Geographical information retrieval with ontologies of place

Author: A. Tversky
B. Smith
C. B. Jones
D. Tudhope
D. Walker
H. Couclelis
J. H. Lee
K. Beard
M. A. Rodriguez
M. Agosti
M.R. Curry
N. Guarino
P. Gould
P. Harpring
R. R. Larson
R. Rada
Y. W. Kirn
Publication date: 01/01/2001

Geographical context is required for many information retrieval tasks in which the target of the search may be documents, images or records that are referenced to geographical space only by means of place names. Often there may be an imprecise match between the query name and the names associated with candidate sources of information. There is therefore a need for geographical information retrieval facilities that can rank the relevance of candidate information with respect to geographical closeness of place as well as semantic closeness with respect to the information of interest. Here we present an ontology of place that combines limited coordinate data with semantic and qualitative spatial relationships between places. This parsimonious model of geographical place supports maintenance of knowledge of place names that relate to extensive regions of the Earth at multiple levels of granularity. The ontology has been implemented with a semantic modelling system linking non-spatial conceptual hierarchies with the place ontology. A hierarchical spatial distance measure is combined with Euclidean distance between place centroids to create a hybrid spatial distance measure. This is integrated with thematic distance, based on classification semantics, to create an integrated semantic closeness measure that can be used for relevance ranking of retrieved objects.
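One possible reading of the hybrid measure described in this abstract is sketched below. The abstract does not give the formulas, so everything here is an assumption made for illustration: the toy hierarchy `PARENT`, the planar `CENTROID` coordinates, the path-length definition of hierarchical distance, the normalization constants, and the linear weights.

```python
import math

# Hypothetical toy data: a part-of hierarchy (child -> parent) and planar centroids.
PARENT = {
    "Cardiff": "Wales",
    "Swansea": "Wales",
    "Wales": "UK",
    "Bristol": "England",
    "England": "UK",
}
CENTROID = {"Cardiff": (3.0, 1.0), "Swansea": (1.0, 1.0), "Bristol": (4.0, 1.0)}


def ancestors(place):
    """Return the chain from a place up to the root of the hierarchy."""
    chain = [place]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def hierarchical_distance(a, b):
    """Count edges on the path between a and b through their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_up, node in enumerate(chain_a):
        if node in chain_b:
            return steps_up + chain_b.index(node)
    return len(chain_a) + len(chain_b)  # no shared ancestor


def euclidean_distance(a, b):
    """Straight-line distance between the centroids of two places."""
    (x1, y1), (x2, y2) = CENTROID[a], CENTROID[b]
    return math.hypot(x1 - x2, y1 - y2)


def hybrid_spatial_distance(a, b, w_hier=0.5, max_hier=4, max_euclid=10.0):
    """Weighted sum of normalized hierarchical and Euclidean distances, in [0, 1]."""
    h = min(hierarchical_distance(a, b) / max_hier, 1.0)
    e = min(euclidean_distance(a, b) / max_euclid, 1.0)
    return w_hier * h + (1 - w_hier) * e


def semantic_closeness(spatial, thematic, w_spatial=0.5):
    """Turn normalized spatial and thematic distances into a closeness score in [0, 1]."""
    return 1.0 - (w_spatial * spatial + (1 - w_spatial) * thematic)
```

In this toy data Bristol is nearer to Cardiff than Swansea is by centroid distance, yet Swansea shares the parent region Wales. With these assumed weights the hybrid measure ranks Swansea as closer, which illustrates why a purely Euclidean measure may not match how place names are used in queries.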

CiteSeerX

Southampton (e-Prints Soton)

Crossref

University of South Wales Research Explorer

Open Research Online (The Open University)