Probabilistic models of information retrieval based on measuring the divergence from randomness
We introduce a framework for deriving probabilistic models of Information Retrieval. The models are nonparametric models of IR obtained in the language model approach. We derive term-weighting models by measuring the divergence of the actual term distribution from that obtained under a random process. Among the random processes, we study the binomial distribution and Bose--Einstein statistics. We define two types of term frequency normalization for tuning term weights in the document--query matching process. The first normalization assumes that documents have the same length and measures the information gain with the observed term once it has been accepted as a good descriptor of the observed document. The second normalization is related to the document length and to other statistics. These two normalization methods are applied to the basic models in succession to obtain weighting formulae. Results show that our framework produces different nonparametric models forming baseline alternatives to the standard tf-idf model.
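As a rough illustration of how such a weighting formula can be assembled, the sketch below combines a geometric approximation of the Bose--Einstein model with the two normalizations applied in succession. The abstract does not state the formulae, so the specific choices here are assumptions drawn from common presentations of divergence-from-randomness weighting: a logarithmic length normalization with a free parameter `c`, and a Laplace-style gain of 1/(tfn + 1). The function and parameter names are hypothetical.

```python
import math


def dfr_geometric_weight(tf, doc_len, avg_doc_len, coll_freq, num_docs, c=1.0):
    """Weight of one term in one document (illustrative sketch only).

    tf          -- occurrences of the term in the document
    doc_len     -- length of the document in tokens
    avg_doc_len -- average document length in the collection
    coll_freq   -- total occurrences of the term in the collection
    num_docs    -- number of documents in the collection
    c           -- assumed tuning parameter for length normalization
    """
    if tf <= 0:
        return 0.0

    # Second normalization: rescale tf as if the document had standard length.
    tfn = tf * math.log2(1.0 + c * avg_doc_len / doc_len)

    # Basic randomness model: geometric approximation of Bose-Einstein,
    # where lam is the mean number of occurrences of the term per document.
    lam = coll_freq / num_docs
    inf1 = -math.log2(1.0 / (1.0 + lam)) - tfn * math.log2(lam / (1.0 + lam))

    # First normalization: information gain once the term is accepted as a
    # descriptor of the document (Laplace law of succession, assumed here).
    gain = 1.0 / (tfn + 1.0)

    return gain * inf1


if __name__ == "__main__":
    rare = dfr_geometric_weight(3, 100, 100, coll_freq=10, num_docs=1000)
    common = dfr_geometric_weight(3, 100, 100, coll_freq=500, num_docs=1000)
    print(f"rare term: {rare:.3f}, common term: {common:.3f}")
```

In this sketch a term that is rare in the collection receives a higher weight than a common one at the same within-document frequency, and the same raw frequency counts for less in a longer document.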
An adaptive technique for content-based image retrieval
We discuss an adaptive approach towards Content-Based Image Retrieval. It is based on the Ostensive Model of developing information needs, a special kind of relevance feedback model that learns from implicit user feedback and adds a temporal notion to relevance. The ostensive approach supports content-assisted browsing through visualising the interaction by adding user-selected images to a browsing path, which ends with a set of system recommendations. The suggestions are based on an adaptive query learning scheme, in which the query is learnt from previously selected images. Our approach is an adaptation of the original Ostensive Model based on textual features only, to include content-based features to characterise images. In the proposed scheme, textual and colour features are combined using the Dempster-Shafer theory of evidence combination. Results from a user-centred, work-task oriented evaluation show that the ostensive interface is preferred over a traditional interface with manual query facilities. This is due to its ability to adapt to the user's need, its intuitiveness and the fluid way in which it operates. Studying and comparing the nature of the underlying information need, it emerges that our approach elicits changes in the user's need based on the interaction, and is successful in adapting the retrieval to match the changes. In addition, a preliminary study of the retrieval performance of the ostensive relevance feedback scheme shows that it can outperform a standard relevance feedback strategy in terms of image recall in category search.
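The evidence-combination step mentioned above can be illustrated with a generic implementation of Dempster's rule of combination. This is a minimal sketch, not the authors' scheme: the abstract does not say how mass values are derived from textual or colour features, so the two mass functions below are made-up numbers over a hypothetical two-element frame ("rel", "non"), and the function name `combine` is an assumption.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so that the remaining masses sum to 1.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


if __name__ == "__main__":
    REL, NON = frozenset({"rel"}), frozenset({"non"})
    FRAME = REL | NON  # mass on the whole frame expresses ignorance

    # Hypothetical evidence from a text matcher and a colour matcher.
    text_mass = {REL: 0.6, FRAME: 0.4}
    colour_mass = {REL: 0.5, NON: 0.2, FRAME: 0.3}

    for hypothesis, mass in combine(text_mass, colour_mass).items():
        print(sorted(hypothesis), round(mass, 4))
```

Keeping some mass on the whole frame lets a weak source express uncertainty instead of being forced to vote for or against relevance.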
Is this document relevant?... probably: A survey of probabilistic models in Information Retrieval
The paper provides an introduction to and survey of probabilistic approaches to modelling Information Retrieval. The basic concepts of probabilistic approaches to Information Retrieval are outlined, and the principles and assumptions upon which the approaches are based are presented. The various models that have been proposed in the development of IR are described, classified, and compared using a common formalism. New approaches that constitute the basis of future research are described.
"Is This Document Relevant? ...Probably": A Survey of Probabilistic Models in Information Retrieval
The paper provides an introduction to and survey of probabilistic approaches to modelling Information Retrieval. The basic concepts of probabilistic approaches to Information Retrieval are outlined, and the principles and assumptions upon which the approaches are based are presented. The various models that have been proposed in the development of IR are described, classified, and compared using a common formalism. New approaches that constitute the basis of future research are described.