Search CORE

2,463 research outputs found

Automatic Metadata Generation using Associative Networks

Author: de Lin S.
Han H.
Herbert Van De Sompel
Johan Bollen
Mao S.
Marko A. Rodriguez
Rorvig M.
Yang H.-C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/03/2009
Field of study

In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis which can be costly and inefficient. The automatic metadata generation system proposed in this article leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because of its independence from content analysis, it can be applied to a wide variety of resource media types and is shown to be computationally inexpensive. The proposed method operates through two distinct phases. Occurrence and co-occurrence algorithms first generate an associative network of repository resources leveraging existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This article discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and the results of an experiment and validation of the proposed method using a standard bibliographic dataset

arXiv.org e-Print Archive

Crossref

The Most Influential Paper Gerard Salton Never Wrote

Author: Dubin David
Publication venue: Graduate School of Library and Information Science. University of Illinois at Urbana-Champaign.
Publication date: 01/01/2004
Field of study

Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often cited overview paper titled ???A Vector Space Model for Information Retrieval??? (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which were overviews of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specifi c computations. Citations to the phantom paper refl ect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was fi rst proposed as an IR model.published or submitted for publicatio

Illinois Digital Environment for Access to Learning and Scholarship Repository

Contextualization of topics - browsing through terms, authors, journals and cluster allocations

Author: Koopman Rob
Scharnhorst Andrea
Wang Shenghui
Publication venue
Publication date: 16/04/2015
Field of study

This paper builds on an innovative Information Retrieval tool, Ariadne. The tool has been developed as an interactive network visualization and browsing tool for large-scale bibliographic databases. It basically allows to gain insights into a topic by contextualizing a search query (Koopman et al., 2015). In this paper, we apply the Ariadne tool to a far smaller dataset of 111,616 documents in astronomy and astrophysics. Labeled as the Berlin dataset, this data have been used by several research teams to apply and later compare different clustering algorithms. The quest for this team effort is how to delineate topics. This paper contributes to this challenge in two different ways. First, we produce one of the different cluster solution and second, we use Ariadne (the method behind it, and the interface - called LittleAriadne) to display cluster solutions of the different group members. By providing a tool that allows the visual inspection of the similarity of article clusters produced by different algorithms, we present a complementary approach to other possible means of comparison. More particular, we discuss how we can - with LittleAriadne - browse through the network of topical terms, authors, journals and cluster solutions in the Berlin dataset and compare cluster solutions as well as see their context.Comment: proceedings of the ISSI 2015 conference (accepted

arXiv.org e-Print Archive

Facets and Typed Relations as Tools for Reasoning Processes in Information Retrieval

Author: A. Shiri
A.B. Buxton
L.M. Garshol
R. Green
V. Broughton
W. Gödert
W. Gödert
Publication venue
Publication date: 01/01/2014
Field of study

Faceted arrangement of entities and typed relations for representing different associations between the entities are established tools in knowledge representation. In this paper, a proposal is being discussed combining both tools to draw inferences along relational paths. This approach may yield new benefit for information retrieval processes, especially when modeled for heterogeneous environments in the Semantic Web. Faceted arrangement can be used as a se-lection tool for the semantic knowledge modeled within the knowledge repre-sentation. Typed relations between the entities of different facets can be used as restrictions for selecting them across the facets

arXiv.org e-Print Archive

Crossref

Recommended from our members

An investigation to study the feasibility of on-line bibliographic information retrieval system using an APP

Author: Dattagupta Rana
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/1977
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.This thesis reports an investigation on the feasibility study of a searching mechanism using an APP suitable for an on-line bibliographic retrieval, operation, especially for retrospective searches. From the study of the searching methods used in the conventional systems it is seen that elaborate file- and data- structures are introduced to improve the response time of the system. These consequently lead to software and hardware redundancies. To mask these complexities of the system an expensive computer with higher capabilities and more powerful instruction set is commonly used. Thus the service of the systen becomes cost-ineffective. On the other hand the primitive operations of a searching mechanism, such as, association, domain selection, intersection and unions, are the intrinsic features of an associative parallel processor. Therefore it is important to establish the feasibility of an APP as a cost-effective searching mechanise. In this thesis a searching mechanism using an 'ON-THE-FLY' searching technique has been proposed. The parallel search unit uses a Byte-oriented VRL-APP for efficient character string processing. At the time of undertaking this work the specification for neither the retrieval systems nor the BO-VRL APP's were well established; hence a two-phase investigation was originated. In the Phase I of the work a bottom up approach was adopted to derive a formal and precise specification for the BO-VRL-APP. During the Phase II of the work a top-down approach was opted for the implementation of the searching mechanism. An experimental research vehicle has been developed to establish the feasibility of an APP as a cost-effective searching mechanism. Although rigorous proof of the feasibility has not been obtained, the thesis establishes that the APP is well suited for on-line bibligraphic information retrieval operations where substring searches including boolean selection and threshold weights are efficiently supported

Brunel University Research Archive

Query Expansion of Zero-Hit Subject Searches: Using a Thesaurus in Conjunction with NLP Techniques

Author: A. Shiri
E.P. Lau
J. Greenberg
L. Hollink
L. Villén-Rueda
R. Mandala
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012
Field of study

The focus of our study is zero-hit queries in keyword subject searches and the effort of increasing recall in these cases by reformulating and, then, expanding the initial queries using an external source of knowledge, namely a thesaurus. To this end, the objectives of this study are twofold. First, we perform the mapping of query terms to the thesaurus terms. Second, we use the matched terms to expand the user’s initial query by taking advantage of the thesaurus relations and implementing natural language processing (NLP) techniques. We report on the overall procedure and elaborate on key points and considerations of each step of the process

E-LIS

Crossref