The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.

Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.

Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.

First, we introduce a concept layer to enable reasoning over low-level concepts in the database.

Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.

Third, we add the functionality to process the users' relevance feedback.

We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.

We conclude with an outline for implementation of miRRor on top of the Monet extensible database system.
Information Retrieval Models
Many applications that handle information on the internet would be completely
inadequate without the support of information retrieval technology. How would
we find information on the world wide web if there were no web search engines?
How would we manage our email without spam filtering? Much of the development
of information retrieval technology, such as web search engines and spam
filters, requires a combination of experimentation and theory. Experimentation
and rigorous empirical testing are needed to keep up with increasing volumes of
web pages and emails. Furthermore, experimentation and constant adaptation
of technology are needed in practice to counteract the effects of people who deliberately
try to manipulate the technology, such as email spammers. However,
if experimentation is not guided by theory, engineering becomes trial and error.
New problems and challenges for information retrieval come up constantly.
They cannot possibly be solved by trial and error alone. So, what is the theory
of information retrieval?
There is not one convincing answer to this question. There are many theories,
here called formal models, and each model is helpful for the development of
some information retrieval tools, but not so helpful for the development of others.
In order to understand information retrieval, it is essential to learn about these
retrieval models. In this chapter, some of the most important retrieval models
are gathered and explained in a tutorial style.
Xu: An Automated Query Expansion and Optimization Tool
The exponential growth of information on the Internet is a big challenge for
information retrieval systems towards generating relevant results. Novel
approaches are required to reformat or expand user queries to generate a
satisfactory response and increase recall and precision. Query expansion (QE)
is a technique to broaden users' queries by introducing additional tokens or
phrases based on some semantic similarity metrics. The tradeoff is the added
computational complexity to find semantically similar words and a possible
increase in noise in information retrieval. Despite several research efforts on
this topic, QE has not yet been explored enough and more work is needed on
similarity matching and composition of query terms with an objective to
retrieve a small set of most appropriate responses. QE should be scalable,
fast, and robust in handling complex queries with a good response time and
noise ceiling. In this paper, we propose Xu, an automated QE technique, using
high dimensional clustering of word vectors and Datamuse API, an open source
query engine to find semantically similar words. We implemented Xu as a command
line tool and evaluated its performance using datasets containing news
articles and human-generated QEs. The evaluation results show that Xu
outperformed Datamuse, achieving about 88% accuracy with reference to the
human-generated QE.
Comment: Accepted to IEEE COMPSAC 201
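The query expansion idea described in this abstract can be illustrated with a minimal sketch. It is not the Xu tool and does not call the Datamuse API: the word vectors, the `expand_query` helper, and the `k` and `threshold` parameters are all hypothetical and chosen only to show how a similarity cutoff limits the noise that expansion can introduce.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load trained embeddings instead.
TOY_VECTORS = {
    "car": (0.9, 0.1, 0.0),
    "automobile": (0.85, 0.15, 0.05),
    "vehicle": (0.8, 0.2, 0.1),
    "banana": (0.0, 0.1, 0.9),
    "fruit": (0.05, 0.2, 0.85),
    "price": (0.1, 0.9, 0.1),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, vectors, k=2, threshold=0.95):
    """Append up to k nearest neighbours of each term (hypothetical helper).

    Neighbours below `threshold` are dropped, which trades recall
    for less noise in the expanded query.
    """
    expanded = list(terms)
    for term in terms:
        if term not in vectors:
            continue  # unknown words are kept but not expanded
        scored = sorted(
            ((cosine(vectors[term], vec), word)
             for word, vec in vectors.items() if word != term),
            reverse=True,
        )
        for score, word in scored[:k]:
            if score >= threshold and word not in expanded:
                expanded.append(word)
    return expanded


if __name__ == "__main__":
    print(expand_query(["car", "price"], TOY_VECTORS))
```

With these toy vectors, "car" gains "automobile" and "vehicle", while "price" has no neighbour above the threshold and is left unexpanded.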
A heuristic information retrieval study : an investigation of methods for enhanced searching of distributed data objects exploiting bidirectional relevance feedback
A thesis submitted for the degree of Doctor of Philosophy of the University of Luton
The primary aim of this research is to investigate methods of improving the effectiveness of current information retrieval systems. This aim can be achieved by accomplishing numerous supporting objectives.
A foundational objective is to introduce a novel bidirectional, symmetrical fuzzy logic theory which may prove valuable to information retrieval, including internet searches of distributed data objects. A further objective is to design, implement and apply the novel theory to an experimental information retrieval system called ANACALYPSE, which automatically computes the relevance of a large number of unseen documents from expert relevance feedback on a small number of documents read.
A further objective is to define a methodology used in this work as an experimental information retrieval framework consisting of multiple tables including various formulae which allow a plethora of syntheses of similarity functions, term weights, relative term frequencies, document weights, bidirectional relevance feedback and history adjusted term weights.
The evaluation of bidirectional relevance feedback reveals a better correspondence between system ranking of documents and users' preferences than feedback-free system ranking. The assessment of similarity functions reveals that the Cosine and Jaccard functions perform significantly better than the DotProduct and Overlap functions. The evaluation of history tracking of the documents visited from a root page reveals better system ranking of documents than tracking-free information retrieval. The assessment of stemming reveals that system information retrieval performance remains unaffected, while stop word removal does not appear to be beneficial and can sometimes be harmful. The overall evaluation of the experimental information retrieval system, in comparison to a leading-edge commercial information retrieval system and also in comparison to the expert's gold standard of judged relevance according to established statistical correlation methods, reveals enhanced system information retrieval effectiveness.
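The four similarity functions compared in this abstract can be sketched in their common textbook forms over binary term sets. The abstract does not give the exact formulae used in the thesis, so the definitions below are an assumption; the function names and the example query and document are illustrative only.

```python
import math


def dot_product(q, d):
    """Number of terms shared by query and document (binary dot product)."""
    return len(q & d)


def cosine(q, d):
    """Shared terms normalised by the geometric mean of the set sizes."""
    if not q or not d:
        return 0.0
    return len(q & d) / math.sqrt(len(q) * len(d))


def jaccard(q, d):
    """Shared terms divided by the size of the union."""
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def overlap(q, d):
    """Shared terms divided by the size of the smaller set."""
    if not q or not d:
        return 0.0
    return len(q & d) / min(len(q), len(d))


if __name__ == "__main__":
    # Illustrative term sets, not taken from the thesis.
    query = {"fuzzy", "relevance", "feedback"}
    document = {"relevance", "feedback", "ranking", "documents"}
    for fn in (dot_product, cosine, jaccard, overlap):
        print(fn.__name__, round(fn(query, document), 3))
```

The dot product is unnormalised, so it favours long documents, whereas the other three divide by some measure of set size. The overlap coefficient reaches 1.0 whenever one set contains the other, which makes it the most permissive of the normalised measures.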
A Taxonomy of Information Retrieval Models and Tools
Information retrieval is attracting significant attention due to the exponential growth of the amount of information available in digital format. The proliferation of information retrieval objects, including algorithms, methods, technologies, and tools, makes it difficult to assess their capabilities and features and to understand the relationships that exist among them. In addition, the terminology is often confusing and misleading, as different terms are used to denote the same, or similar, tasks.
This paper proposes a taxonomy of information retrieval models and tools and provides precise definitions for the key terms. The taxonomy consists of superimposing two views: a vertical taxonomy, that classifies IR models with respect to a set of basic features, and a horizontal taxonomy, which classifies IR systems and services with respect to the tasks they support.
The aim is to provide a framework for classifying existing information retrieval models and tools and a solid point to assess future developments in the field
Flexible information retrieval: some research trends
In this paper some research trends in the field of Information Retrieval are presented. The focus is on the definition of flexible systems, i.e. systems that can represent and manage the vagueness and uncertainty which is characteristic of the process of information searching and retrieval. In this paper the application of soft computing techniques is considered, in particular fuzzy set theory
The study of probability model for compound similarity searching
The main task of an information retrieval (IR) system is to retrieve relevant documents according to the user's query. One of the most popular IR retrieval models is the Vector Space Model. This model assumes relevance based on similarity, which is defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adopted the vector space model to calculate the similarity of a database entry to a query compound. However, it assumes that the fragments represented by the bits are independent of one another, which is not necessarily true. Hence, the possibility of applying another IR model, the Probabilistic Model (PM), to chemical compound searching is explored. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that by ranking chemical structures in decreasing order of their probability of relevance to the query structure, the effectiveness of a molecular similarity searching system can be increased. Both fragment dependence and fragment independence assumptions are considered in seeking to improve compound similarity searching. A series of simulated similarity searches shows that the PM approaches outperform the existing similarity searching method, giving better results on all evaluation criteria. Of the probability models, the BD model showed an improvement over the BIR model.
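Probabilistic ranking under the fragment independence assumption can be sketched with smoothed log-odds weights per fingerprint bit, in the style of the binary independence retrieval model from text retrieval. The abstract does not give the authors' formulae, so this is an assumption rather than their method; the fingerprints, the function names, and the 0.5 smoothing constant are illustrative.

```python
import math


def bir_weights(actives, inactives, n_bits):
    """Log-odds weight for each bit, estimated from labelled fingerprints.

    Fingerprints are sets of set-bit indices. Adding 0.5 to the counts
    keeps every probability strictly between 0 and 1.
    """
    n_act, n_inact = len(actives), len(inactives)
    weights = []
    for i in range(n_bits):
        r = sum(1 for fp in actives if i in fp)    # actives with bit i set
        s = sum(1 for fp in inactives if i in fp)  # inactives with bit i set
        p = (r + 0.5) / (n_act + 1.0)    # P(bit set | active)
        q = (s + 0.5) / (n_inact + 1.0)  # P(bit set | inactive)
        weights.append(math.log(p * (1 - q) / (q * (1 - p))))
    return weights


def bir_score(query, candidate, weights):
    """Sum the weights of bits set in both query and candidate."""
    return sum(weights[i] for i in query & candidate)


def rank(query, database, weights):
    """Return compound names in decreasing order of score."""
    return sorted(
        database,
        key=lambda name: bir_score(query, database[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical 4-bit fingerprints, for illustration only.
    actives = [{0, 1}, {0, 2}]
    inactives = [{2, 3}, {3}]
    weights = bir_weights(actives, inactives, n_bits=4)
    database = {"A": {0, 1}, "B": {3}, "C": {0, 3}}
    print(rank({0, 1, 3}, database, weights))
```

Because each bit contributes its weight independently, correlated fragments are counted more than once. That is the limitation a dependence-based model would aim to address.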
Visualization for Information Retrieval based on Fast Search Technology
The core of a search engine is its information retrieval technique. An information retrieval system returns many results, some more relevant than others and some not relevant at all. While the use of search engines to retrieve information has grown substantially, problems with information retrieval systems remain. Their interfaces do not help users perceive the precision of the results. It is therefore not surprising that graphical visualizations have been employed in search engines to assist users. The main objective of Internet users is to find the required information efficiently and effectively. In this paper we briefly present the role of information visualization in enhancing web information retrieval systems, covering techniques such as tree view, title view, map view, bubble view and cloud view, and tools such as highlighting and Colored Query Result.
Logical and uncertainty models for information access: current trends
The current trends of research in information access as emerged from the 1999 Workshop on Logical and Uncertainty Models for Information Systems (LUMIS'99) are briefly reviewed in this paper. We believe that some of these issues will be central to future research on theory and applications of logical and uncertainty models for information access