2,463 research outputs found
Automatic Metadata Generation using Associative Networks
In spite of its tremendous value, metadata is generally sparse and
incomplete, thereby hampering the effectiveness of digital information
services. Many of the existing mechanisms for the automated creation of
metadata rely primarily on content analysis which can be costly and
inefficient. The automatic metadata generation system proposed in this article
leverages resource relationships generated from existing metadata as a medium
for propagation from metadata-rich to metadata-poor resources. Because of its
independence from content analysis, it can be applied to a wide variety of
resource media types and is shown to be computationally inexpensive. The
proposed method operates through two distinct phases. Occurrence and
co-occurrence algorithms first generate an associative network of repository
resources leveraging existing repository metadata. Second, using the
associative network as a substrate, metadata associated with metadata-rich
resources is propagated to metadata-poor resources by means of a discrete-form
spreading activation algorithm. This article discusses the general framework
for building associative networks, an algorithm for disseminating metadata
through such networks, and the results of an experiment and validation of the
proposed method using a standard bibliographic dataset.
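The two phases described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the edge weighting (number of shared metadata values), the `decay` and `max_hops` parameters, the function names, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_network(link_metadata):
    """Co-occurrence step: connect resources that share a metadata value.

    The edge weight is the number of shared values (an assumption).
    """
    by_value = defaultdict(set)
    for resource, values in link_metadata.items():
        for value in values:
            by_value[value].add(resource)
    weights = defaultdict(float)
    for resources in by_value.values():
        for a, b in combinations(sorted(resources), 2):
            weights[(a, b)] += 1.0
            weights[(b, a)] += 1.0
    return dict(weights)


def spread(weights, source, energy=1.0, decay=0.5, max_hops=2):
    """Discrete spreading activation: energy moves outward one hop per step."""
    activation = defaultdict(float)
    frontier = {source: energy}
    visited = {source}
    for _ in range(max_hops):
        next_frontier = defaultdict(float)
        for node, node_energy in frontier.items():
            out = {b: w for (a, b), w in weights.items()
                   if a == node and b not in visited}
            total = sum(out.values())
            for b, w in out.items():
                # Split the decayed energy across unvisited neighbours by weight.
                next_frontier[b] += decay * node_energy * w / total
        for b, e in next_frontier.items():
            activation[b] += e
        visited.update(next_frontier)
        frontier = next_frontier
    return dict(activation)


def suggest(weights, keywords):
    """Propagate keywords from resources that have them to those that do not."""
    suggestions = {}
    for source, source_keywords in keywords.items():
        for target, e in spread(weights, source).items():
            if target in keywords:
                continue  # already metadata-rich for this field
            scores = suggestions.setdefault(target, {})
            for kw in source_keywords:
                scores[kw] = scores.get(kw, 0.0) + e
    return suggestions


# Hypothetical toy repository: authors link resources, keywords are propagated.
authors = {"p1": {"smith", "jones"}, "p2": {"smith"},
           "p3": {"jones", "lee"}, "p4": {"lee"}}
keywords = {"p1": {"retrieval"}, "p4": {"clustering"}}

network = build_network(authors)
suggestions = suggest(network, keywords)
print(suggestions)
```

In this toy run, `p3` sits one hop from `p4` and one of two hops from `p1`, so it receives "clustering" with a higher score than "retrieval". No content analysis is involved, which is the property the abstract emphasises.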
The Most Influential Paper Gerard Salton Never Wrote
Gerard Salton is often credited with developing the vector space model
(VSM) for information retrieval (IR). Citations to Salton give the impression
that the VSM must have been articulated as an IR model sometime between
1970 and 1975. However, the VSM as it is understood today evolved over a
longer time period than is usually acknowledged, and an articulation of the
model and its assumptions did not appear in print until several years after
those assumptions had been criticized and alternative models proposed. An
often cited overview paper titled "A Vector Space Model for Information
Retrieval" (alleged to have been published in 1975) does not exist, and
citations to it represent a confusion of two 1975 articles, neither of which
was an overview of the VSM as a model of information retrieval. Until the
late 1970s, Salton did not present vector spaces as models of IR generally
but rather as models of specific computations. Citations to the phantom
paper reflect an apparently widely held misconception that the operational
features and explanatory devices now associated with the VSM must have
been introduced at the same time it was first proposed as an IR model.
Contextualization of topics - browsing through terms, authors, journals and cluster allocations
This paper builds on an innovative Information Retrieval tool, Ariadne. The
tool has been developed as an interactive network visualization and browsing
tool for large-scale bibliographic databases. It allows users to gain
insights into a topic by contextualizing a search query (Koopman et al., 2015).
In this paper, we apply the Ariadne tool to a far smaller dataset of 111,616
documents in astronomy and astrophysics. Labeled the Berlin dataset, these
data have been used by several research teams to apply and later compare
different clustering algorithms. The question driving this team effort is how to
delineate topics. This paper contributes to this challenge in two different
ways. First, we produce one of the cluster solutions and, second, we
use Ariadne (the method behind it, and the interface - called LittleAriadne) to
display cluster solutions of the different group members. By providing a tool
that allows the visual inspection of the similarity of article clusters
produced by different algorithms, we present a complementary approach to other
possible means of comparison. More specifically, we discuss how we can, with
LittleAriadne, browse through the network of topical terms, authors, journals
and cluster solutions in the Berlin dataset and compare cluster solutions as
well as see their context.
Comment: proceedings of the ISSI 2015 conference (accepted
Facets and Typed Relations as Tools for Reasoning Processes in Information Retrieval
Faceted arrangement of entities and typed relations for representing
different associations between the entities are established tools in knowledge
representation. This paper discusses a proposal that combines both
tools to draw inferences along relational paths. This approach may yield new
benefits for information retrieval processes, especially when modeled for
heterogeneous environments in the Semantic Web. Faceted arrangement can be used
as a selection tool for the semantic knowledge modeled within the knowledge
representation. Typed relations between the entities of different facets can
be used as restrictions for selecting them across the facets.
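One way to picture the combination of facets and typed relations is the small sketch below. It is only an illustration under assumed names: the entities, facet labels, relation types, and the `follow` helper are hypothetical and do not come from the paper.

```python
# Each entity belongs to exactly one facet (hypothetical example data).
FACET_OF = {
    "Faust": "work",
    "Goethe": "person",
    "Schiller": "person",
    "Weimar": "place",
}

# Typed relations as (subject, relation_type, object) triples.
RELATIONS = [
    ("Faust", "written_by", "Goethe"),
    ("Goethe", "lived_in", "Weimar"),
    ("Schiller", "lived_in", "Weimar"),
]


def follow(start, path):
    """Follow a relational path from a set of start entities.

    Each step is (relation_type, target_facet): the relation type restricts
    which links may be used, and the facet restricts which entities may be
    selected at the far end.
    """
    current = set(start)
    for relation_type, target_facet in path:
        current = {
            obj
            for subj, rel, obj in RELATIONS
            if subj in current
            and rel == relation_type
            and FACET_OF.get(obj) == target_facet
        }
    return current


# "In which place did the author of Faust live?"
places = follow({"Faust"}, [("written_by", "person"), ("lived_in", "place")])
print(places)
```

Here the facet acts as the selection tool and the relation type as the restriction, so a step that names the wrong facet for a relation yields an empty result instead of an unintended match.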
An investigation to study the feasibility of on-line bibliographic information retrieval system using an APP
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. This thesis reports a feasibility study of a
searching mechanism using an associative parallel processor (APP) suitable for
on-line bibliographic retrieval operations, especially retrospective searches.
A study of the searching methods used in conventional
systems shows that elaborate file and data structures are
introduced to improve the response time of the system. These
consequently lead to software and hardware redundancies. To mask
these complexities, an expensive computer with higher
capabilities and a more powerful instruction set is commonly used.
Thus the service of the system becomes cost-ineffective.
On the other hand, the primitive operations of a searching mechanism,
such as association, domain selection, intersection and union, are
intrinsic features of an associative parallel processor. Therefore
it is important to establish the feasibility of an APP as a cost-effective
searching mechanism.
In this thesis a searching mechanism using an 'ON-THE-FLY' searching
technique has been proposed. The parallel search unit uses a Byte-oriented
VRL-APP for efficient character string processing.
At the time of undertaking this work, the specifications for neither the
retrieval systems nor the BO-VRL-APPs were well established; hence a
two-phase investigation was undertaken. In Phase I of the work a
bottom-up approach was adopted to derive a formal and precise
specification for the BO-VRL-APP. During Phase II of the work
a top-down approach was adopted for the implementation of the searching
mechanism.
An experimental research vehicle has been developed to establish
the feasibility of an APP as a cost-effective searching mechanism.
Although rigorous proof of the feasibility has not been obtained,
the thesis establishes that the APP is well suited for on-line
bibliographic information retrieval operations where substring searches
including boolean selection and threshold weights are efficiently
supported.
Query Expansion of Zero-Hit Subject Searches: Using a Thesaurus in Conjunction with NLP Techniques
The focus of our study is zero-hit queries in keyword subject searches and the effort to increase recall in these cases by reformulating and then expanding the initial queries using an external source of knowledge, namely a thesaurus. To this end, the objectives of this study are twofold. First, we map query terms to thesaurus terms. Second, we use the matched terms to expand the user’s initial query by taking advantage of the thesaurus relations and implementing natural language processing (NLP) techniques. We report on the overall procedure and elaborate on key points and considerations of each step of the process.
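The two steps named in the abstract, mapping query terms to thesaurus terms and then expanding the query through thesaurus relations, might look roughly like the sketch below. The thesaurus entries, relation names (`use_for`, `narrower`, `broader`), and function names are assumptions for illustration, and simple lowercasing stands in for the NLP techniques, which the abstract does not specify.

```python
# Hypothetical thesaurus: preferred term -> related terms by relation type.
THESAURUS = {
    "automobiles": {
        "use_for": ["cars", "motor cars"],
        "narrower": ["electric vehicles"],
        "broader": ["vehicles"],
    },
}


def normalize(term):
    """Stand-in for NLP normalisation: lowercase and trim only."""
    return term.lower().strip()


def build_index(thesaurus):
    """Map every preferred and non-preferred label to its preferred term."""
    index = {}
    for preferred, relations in thesaurus.items():
        index[normalize(preferred)] = preferred
        for label in relations.get("use_for", []):
            index[normalize(label)] = preferred
    return index


def expand_query(terms, thesaurus, relations=("use_for", "narrower")):
    """Return the original terms plus matched thesaurus terms, de-duplicated."""
    index = build_index(thesaurus)
    expanded = []

    def add(term):
        if term not in expanded:
            expanded.append(term)

    for term in terms:
        key = normalize(term)
        add(key)
        preferred = index.get(key)
        if preferred is None:
            continue  # no thesaurus match: keep the term as typed
        add(preferred)
        for relation in relations:
            for related in thesaurus[preferred].get(relation, []):
                add(related)
    return expanded


expanded = expand_query(["Cars", "zeppelins"], THESAURUS)
print(expanded)
```

Restricting the default expansion to equivalent and narrower terms is a design choice made for this sketch: broader terms raise recall further but tend to cost more precision, so they are left as an opt-in through the `relations` parameter.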