Words are Malleable: Computing Semantic Shifts in Political and Media Discourse
Recently, researchers have started to pay attention to the detection of temporal
shifts in the meaning of words. However, most (if not all) of these approaches
restricted their efforts to uncovering change over time, thus neglecting other
valuable dimensions such as social or political variability. We propose an
approach for detecting semantic shifts between different viewpoints--broadly
defined as a set of texts that share a specific metadata feature, which can be
a time-period, but also a social entity such as a political party. For each
viewpoint, we learn a semantic space in which each word is represented as a low
dimensional neural embedded vector. The challenge is to compare the meaning of
a word in one space to its meaning in another space and measure the size of the
semantic shifts. We compare the effectiveness of a measure based on optimal
transformations between the two spaces with a measure based on the similarity
of the neighbors of the word in the respective spaces. Our experiments
demonstrate that the combination of these two performs best. We show that the
semantic shifts not only occur over time, but also along different viewpoints
in a short period of time. For evaluation, we demonstrate how this approach
captures meaningful semantic shifts and can help improve other tasks such as
contrastive viewpoint summarization and ideology detection (measured as
classification accuracy) in political texts. We also show that the two laws of
semantic change which were empirically shown to hold for temporal shifts also
hold for shifts across viewpoints. These laws state that frequent words are
less likely to shift meaning while words with many senses are more likely to do
so. Comment: In Proceedings of the 26th ACM International Conference on
Information and Knowledge Management (CIKM 2017)
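The two measures named in the abstract above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the "optimal transformation" is taken here to be an orthogonal Procrustes rotation, the neighbor measure is taken to be the Jaccard overlap of the k nearest neighbors, and both spaces are assumed to be NumPy arrays whose rows are aligned by word index. All function names are hypothetical.

```python
import numpy as np


def unit_rows(matrix):
    """Scale every row to unit length so dot products become cosines."""
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def orthogonal_map(source, target):
    """Rotation R minimizing ||source @ R - target||_F (orthogonal Procrustes).

    Assumption: the abstract only says 'optimal transformations'; an
    orthogonal map is one common choice, not necessarily the authors'.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def transformation_similarity(source, target, index):
    """Cosine between a word's mapped source vector and its target vector."""
    mapped = source[index] @ orthogonal_map(source, target)
    other = target[index]
    return float(mapped @ other / (np.linalg.norm(mapped) * np.linalg.norm(other)))


def nearest_neighbors(space, index, k):
    """Indices of the k rows most cosine-similar to row `index`."""
    normalized = unit_rows(space)
    similarities = normalized @ normalized[index]
    similarities[index] = -np.inf  # a word is not its own neighbor
    return set(np.argsort(-similarities)[:k].tolist())


def neighbor_similarity(source, target, index, k=3):
    """Jaccard overlap of a word's k nearest neighbors in the two spaces."""
    a = nearest_neighbors(source, index, k)
    b = nearest_neighbors(target, index, k)
    return len(a & b) / len(a | b)
```

Under these assumptions, a low value from either function suggests that the word at `index` is used differently in the two viewpoints. How the two scores should be combined is not specified in the abstract, so the sketch leaves that open.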
Query expansion with naive bayes for searching distributed collections
The proliferation of online information resources increases the importance of effective and efficient distributed searching. However, the problem of word mismatch seriously hurts the effectiveness of distributed information retrieval. Automatic query expansion has been suggested as a technique for dealing with the fundamental issue of word mismatch. In this paper, we propose a method, query expansion with Naive Bayes, to address the problem, discuss its implementation in the IISS system, and present experimental results demonstrating its effectiveness. This technique not only enhances the discriminatory power of typical queries for choosing the right collections but also significantly improves retrieval results.
Semantic Modeling of Analytic-based Relationships with Direct Qualification
Successfully modeling state and analytics-based semantic relationships of
documents enhances representation, importance, relevancy, provenience, and
priority of the document. These attributes are the core elements that form the
machine-based knowledge representation for documents. However, modeling
document relationships that can change over time can be inelegant, limited,
complex or overly burdensome for semantic technologies. In this paper, we
present Direct Qualification (DQ), an approach for modeling any semantically
referenced document, concept, or named graph with results from associated
applied analytics. The proposed approach supplements the traditional
subject-object relationships by providing a third leg to the relationship; the
qualification of how and why the relationship exists. To illustrate, we show a
prototype of an event-based system with a realistic use case for applying DQ to
relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS). Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic
Computing (IEEE ICSC 2015)
Exploring Topic-based Language Models for Effective Web Information Retrieval
The main obstacle to providing focused search is the relative opaqueness of the search request -- searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad-hoc search. Our experimental results show that the topic-based model outperforms the standard language model and the parsimonious model.
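As a rough illustration of what "building parsimonious language models" can involve, the sketch below runs the usual expectation-maximization re-estimation of a term distribution against a background model. It is an assumption-laden sketch, not the authors' implementation: the function name, the mixing weight `lam`, the iteration count, and the toy background probabilities are all hypothetical.

```python
from collections import Counter


def parsimonious_model(tokens, background, lam=0.1, iterations=50):
    """Re-estimate P(term | document) so that mass moves away from terms
    the background model already explains well.

    tokens      -- list of terms in the document (or topic sample)
    background  -- dict mapping term -> P(term | collection)
    lam         -- assumed weight of the document model in the mixture
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # Start from the maximum-likelihood estimate.
    model = {term: count / total for term, count in counts.items()}
    for _ in range(iterations):
        expected = {}
        for term, count in counts.items():
            doc_part = lam * model[term]
            denom = doc_part + (1.0 - lam) * background.get(term, 0.0)
            # E-step: share of each occurrence credited to the document model.
            expected[term] = count * doc_part / denom if denom > 0 else 0.0
        norm = sum(expected.values())
        # M-step: renormalize the expected counts into probabilities.
        model = {term: value / norm for term, value in expected.items()}
    return model


# Toy data, invented purely for illustration.
toy_background = {"the": 0.5, "tax": 0.01, "reform": 0.01}
toy_model = parsimonious_model(
    ["the", "the", "the", "tax", "tax", "reform"], toy_background
)
```

In this toy run, the common term "the" loses probability relative to its raw frequency, while the more distinctive terms gain it, which is the behavior that makes such models compact.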
Modeling social information skills
In a modern economy, the most important resource is
human talent: competent, knowledgeable people. Locating the right person for
the task is often a prerequisite to complex problem-solving, and experienced
professionals possess the social skills required to find appropriate human
expertise. These skills can increasingly be reproduced with specific
computer software, an approach defining the new field of social information
retrieval. We will analyze the social skills involved and show how to model
them on a computer. Current methods will be described, notably information
retrieval techniques and social network theory. A generic architecture and its
functions will be outlined and compared with recent work. We will try in this
way to estimate the prospects of this recent domain.