2,931 research outputs found
Combining Thesaurus Knowledge and Probabilistic Topic Models
In this paper we present the approach of introducing thesaurus knowledge into
probabilistic topic models. The main idea of the approach is based on the
assumption that the frequencies of semantically related words and phrases,
which are met in the same texts, should be enhanced: this action leads to their
larger contribution into topics found in these texts. We have conducted
experiments with several thesauri and found that for improving topic models, it
is useful to utilize domain-specific knowledge. If a general thesaurus, such as
WordNet, is used, the thesaurus-based improvement of topic models can be
achieved with excluding hyponymy relations in combined topic models.Comment: Accepted to AIST-2017 conference (http://aistconf.ru/). The final
publication will be available at link.springer.co
A survey on the use of relevance feedback for information access systems
Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems
From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.\ud
\ud
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.\ud
\ud
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.\ud
\ud
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.\ud
\ud
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.\ud
\ud
Third, we add the functionality to process the users' relevance feedback.\ud
\ud
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.\ud
\ud
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system
- âŠ