6 research outputs found
Recommended from our members
Tomato tomahto: European perspectives on information science
‘And oh, if we ever part, then that might break my heart’ (Gershwin and Gershwin,). In the lyric ‘tomato tomahto’, the marked, or explicitly differentiated, term is tomahto, which corresponds more to the English than to the United States pronunciation. The marked term of a contrast characteristically designates the exception or minor term, and the distinctions contained in the unmarked term may be implicitly, and incompletely, understood. Analogously, information science has often been implicitly based in the United States and influenced by American modes of thought, while European, including English, developments have been the occasionally marked and often minor term. This panel explores European perspectives on information science, explicitly and implicitly contrasting them with United States perspectives, from a base in a number of languages and in Europe and beyond. The panel employs diverse and complementary viewpoints and should make for a lively discussion. It concludes, in sympathy with Gershwin, that cooperation and integration, corresponding to increasing globalization, are the way forward. Combining perspectives from Europe and beyond with United States perspectives on information science is especially appropriate for the first ASIS&T Annual Meeting outside North America.
Multi Word Term Queries for Focused Information Retrieval.
In this paper, we address both standard and focused retrieval tasks based on comprehensible language models and interactive query expansion (IQE). Query topics are expanded using an initial set of Multi Word Terms (MWTs) selected from the top n ranked documents. MWTs are special text units that represent domain concepts and objects. As such, they can better represent query topics than ordinary phrases or n-grams. We tested different query representations: bag-of-words, phrases, a flat list of MWTs, and subsets of MWTs. We also combined the initial set of MWTs obtained in an IQE process with automatic query expansion (AQE) using language models and a smoothing mechanism. We chose as baseline the Indri IR engine, based on the language model with Dirichlet smoothing. The experiments are carried out on two benchmarks: the TREC Enterprise track (TRECent) 2007 and 2008 collections, and the INEX 2008 Ad-hoc track using the Wikipedia collection.
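To make the baseline named in the abstract concrete, the following is a minimal sketch of query-likelihood ranking with Dirichlet smoothing, combined with a flat-list query expansion from multi-word terms. It is not the authors' implementation and does not use Indri; the function names `dirichlet_score` and `expand_query`, the toy documents, and the value of `mu` are all assumptions made for illustration.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_tokens, collection_counts, collection_len, mu=2500.0):
    """Log query likelihood of a document under a Dirichlet-smoothed unigram model.

    P(w | d) = (count(w, d) + mu * P(w | collection)) / (|d| + mu)
    The value of mu is an illustrative assumption, not a setting from the paper.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        p_coll = collection_counts.get(term, 0) / collection_len
        if p_coll == 0.0:
            # Term never seen in the collection: skipped (a simplification of this sketch).
            continue
        p = (doc_counts[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p)
    return score


def expand_query(query_terms, selected_mwts):
    """Hypothetical 'flat list' expansion: add the words of each selected multi-word term."""
    expanded = list(query_terms)
    for mwt in selected_mwts:
        expanded.extend(mwt.lower().split())
    return expanded


# Toy collection (invented for the example).
docs = {
    "d1": "language model smoothing for information retrieval".split(),
    "d2": "cooking tomato soup with fresh basil".split(),
}
collection_counts = Counter(t for tokens in docs.values() for t in tokens)
collection_len = sum(collection_counts.values())

# The user query is expanded with one multi-word term chosen interactively.
query = expand_query(["retrieval"], ["language model"])
ranking = sorted(
    docs,
    key=lambda d: dirichlet_score(query, docs[d], collection_counts, collection_len),
    reverse=True,
)
print(ranking)
```

Treating an expanded multi-word term as loose words is only one of the representations the abstract lists; a phrase-based representation would require matching the words of a term as a unit.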
Pluri-, multi-, trans-, meta- and interdisciplinary nature of LIS. Does it really matter?
The field of LIS is beset by recurrent debates as to its disciplinary status. For decades, the interdisciplinary nature of information science has been upheld without much proof from the ground. But if LIS is not an interdiscipline, is it then a meta-, a trans-, a pluri-, a multi- or simply a discipline? The different proposals for qualifying the nature of LIS or for delineating its frontiers suggest that its fundamental nature remains unclear to its community. But is LIS alone in this dilemma, and does it really matter? Does it stop the field from progressing?
Design and development of a concept-based multi-document summarization system for research abstracts
This paper describes a new concept-based multi-document summarization system that employs discourse parsing, information extraction and information integration. Dissertation abstracts in the field of sociology were selected as sample documents for this study. The summarization process includes four major steps: (1) parsing dissertation abstracts into five standard sections; (2) extracting research concepts (often operationalized as research variables) and their relationships, the research methods used and the contextual relations from specific sections of the text; (3) integrating similar concepts and relationships across different abstracts; and (4) combining and organizing the different kinds of information using a variable-based framework, and presenting them in an interactive web-based interface. The accuracy of each summarization step was evaluated by comparing the system-generated output against human coding. The user evaluation carried out in the study indicated that the majority of subjects (70%) preferred the concept-based summaries generated by the system to the sentence-based summaries generated using traditional sentence extraction techniques.