09101 Abstracts Collection -- Interactive Information Retrieval
From 01.03. to 06.03.2009, the Dagstuhl Seminar 09101 "Interactive Information Retrieval" was held in Schloss Dagstuhl – Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Probabilistic models of information retrieval based on measuring the divergence from randomness
We introduce a framework for deriving probabilistic models of Information Retrieval. The models are nonparametric models of IR obtained in the language model approach. We derive term-weighting models by measuring the divergence of the actual term distribution from that obtained under a random process. Among the random processes, we study the binomial distribution and Bose-Einstein statistics. We define two types of term frequency normalization for tuning term weights in the document-query matching process. The first normalization assumes that documents have the same length and measures the information gain with the observed term once it has been accepted as a good descriptor of the observed document. The second normalization is related to the document length and to other statistics. These two normalization methods are applied to the basic models in succession to obtain weighting formulae. Results show that our framework produces different nonparametric models forming baseline alternatives to the standard tf-idf model.
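The weighting idea in this abstract can be sketched in a few lines. The abstract gives no formulae, so everything concrete below is an assumption made for illustration: the binomial model with success probability 1/num_docs, the Laplace-style gain 1/(tfn + 1) standing in for the first normalization, and the logarithmic length adjustment standing in for the second. All function and parameter names are hypothetical.

```python
import math


def binomial_information(tf, total_occurrences, num_docs):
    """Bits of information in seeing `tf` occurrences of a term in one
    document if its `total_occurrences` tokens were scattered at random
    over `num_docs` documents (assumed binomial model, p = 1 / num_docs)."""
    p = 1.0 / num_docs
    # log of C(F, tf) * p^tf * (1 - p)^(F - tf); lgamma allows a real-valued tf
    log_prob = (
        math.lgamma(total_occurrences + 1)
        - math.lgamma(tf + 1)
        - math.lgamma(total_occurrences - tf + 1)
        + tf * math.log(p)
        + (total_occurrences - tf) * math.log(1.0 - p)
    )
    return -log_prob / math.log(2)


def length_normalized_tf(tf, doc_len, avg_doc_len):
    """Assumed length normalization: rescale tf toward an average-length document."""
    return tf * math.log2(1.0 + avg_doc_len / doc_len)


def dfr_weight(tf, doc_len, avg_doc_len, total_occurrences, num_docs):
    """Apply the two normalizations in succession to the basic model."""
    tfn = length_normalized_tf(tf, doc_len, avg_doc_len)
    information = binomial_information(tfn, total_occurrences, num_docs)
    gain = 1.0 / (tfn + 1.0)  # assumed Laplace-style information gain
    return gain * information


# A rare term (50 occurrences in 1000 documents) seen 5 times in one document
rare_term_weight = dfr_weight(5, 100, 100, 50, 1000)
```

In this sketch a rare term that occurs several times in a document of average length scores higher than the same term occurring once, and the same raw count in a much longer document scores lower.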
Evolving text classification rules with genetic programming
We describe a novel method for using genetic programming to create compact classification rules using combinations of N-grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters-21578 dataset. We also suggest that the rules may have a number of other uses beyond classification and provide a basis for text mining applications.
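As a rough illustration of how such a rule might be scored, the sketch below builds a rule from character N-gram tests and computes a fitness from precision and recall. The abstract does not say how the two measures are combined, so the F1 score is an assumption, as are the combinator names (`contains`, `any_of`, `all_of`) and the toy documents. The evolutionary search itself is not shown.

```python
def contains(ngram):
    """Terminal: true when the character N-gram occurs in the text."""
    return lambda text: ngram in text.lower()


def any_of(*rules):
    """Function node: logical OR over sub-rules."""
    return lambda text: any(rule(text) for rule in rules)


def all_of(*rules):
    """Function node: logical AND over sub-rules."""
    return lambda text: all(rule(text) for rule in rules)


def fitness(rule, labelled_docs):
    """F1 of the rule on (text, is_positive) training pairs (assumed measure)."""
    tp = fp = fn = 0
    for text, is_positive in labelled_docs:
        predicted = rule(text)
        if predicted and is_positive:
            tp += 1
        elif predicted:
            fp += 1
        elif is_positive:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical training data for a "grain" category
training_docs = [
    ("wheat harvest up", True),
    ("grain exports fall", True),
    ("corn futures slip", True),
    ("oil output steady", False),
]
candidate_rule = any_of(contains("whea"), contains("grai"))
score = fitness(candidate_rule, training_docs)
```

Here the candidate rule matches two of the three positive documents and no negative ones, so precision is 1, recall is 2/3, and the fitness is 0.8.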
Looking at Vector Space and Language Models for IR using Density Matrices
In this work, we conduct a joint analysis of both Vector Space and Language
Models for IR using the mathematical framework of Quantum Theory. We shed light
on how both models allocate the space of density matrices. A density matrix is
shown to be a general representational tool capable of leveraging capabilities
of both VSM and LM representations thus paving the way for a new generation of
retrieval models. We analyze the possible implications suggested by our
findings. Comment: In Proceedings of Quantum Interaction 201
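For readers unfamiliar with the formalism, a density matrix is a positive semidefinite matrix with unit trace. The sketch below shows two standard special cases: a rank-one matrix built from a unit vector, and a diagonal matrix built from a probability distribution. Reading the first as a vector space representation and the second as a unigram language model is an assumption made here for illustration, since the abstract does not spell out the correspondence. The function names are hypothetical.

```python
import math


def pure_state(vector):
    """Rank-one density matrix: outer product of the normalized vector with itself."""
    norm = math.sqrt(sum(x * x for x in vector))
    unit = [x / norm for x in vector]
    return [[a * b for b in unit] for a in unit]


def diagonal_state(probabilities):
    """Diagonal density matrix: a classical distribution over the basis terms."""
    n = len(probabilities)
    return [
        [probabilities[i] if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def trace(matrix):
    """Sum of the diagonal entries; equals 1 for any density matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


vector_like = pure_state([3.0, 4.0])              # term-weight vector (assumed VSM reading)
model_like = diagonal_state([0.5, 0.3, 0.2])      # term probabilities (assumed LM reading)
```

Both objects have unit trace, which is what lets a single representation cover the two cases; the off-diagonal entries of the rank-one matrix are what the diagonal form lacks.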
Inoculation of raccoons with a wild-type-based recombinant canine distemper virus results in viremia, lymphopenia, fever, and widespread histological lesions
Raccoons are naturally susceptible to canine distemper virus (CDV) infection and can be a potential source of spill-over events. CDV is a highly contagious morbillivirus that infects multiple species of carnivores and omnivores, resulting in severe and often fatal disease. Here, we used a recombinant CDV (rCDV) based on a full-genome sequence detected in a naturally infected raccoon to perform pathogenesis studies in raccoons. Five raccoons were inoculated intratracheally with a recombinant virus engineered to express a fluorescent reporter protein, and extensive virological, serological, histological, and immunohistochemical assessments were performed at different time points post inoculation. rCDV-infected white blood cells were detected as early as 4 days post inoculation (dpi). Raccoon necropsies at 6 and 8 dpi revealed replication in the lymphoid tissues, preceding spread into peripheral tissues observed during necropsies at 21 dpi. Whereas lymphocytes, and to a lesser extent myeloid cells, were the main target cells of CDV at early time points, CDV additionally targeted epithelia at 21 dpi. At this later time point, CDV-infected cells were observed throughout the host. We observed lymphopenia and lymphocyte depletion from lymphoid tissues after CDV infection, in the absence of detectable CDV neutralizing antibodies and an impaired ability to clear CDV, indicating that the animals were severely immunosuppressed. The use of a wild-type-based recombinant virus in a natural host species infection study allowed systematic and sensitive assessment of antigen detection by immunohistochemistry, enabling further comparative pathology studies of CDV infection in different species.
Evaluating implicit feedback models using searcher simulations
In this article we describe an evaluation of relevance feedback (RF) algorithms using searcher simulations. Since these algorithms select additional terms for query modification based on inferences made from searcher interaction, not on relevance information searchers explicitly provide (as in traditional RF), we refer to them as implicit feedback models. We introduce six different models that base their decisions on the interactions of searchers and use different approaches to rank query modification terms. The aim of this article is to determine which of these models should be used to assist searchers in the systems we develop. To evaluate these models we used searcher simulations that afforded us more control over the experimental conditions than experiments with human subjects and allowed complex interaction to be modeled without the need for costly human experimentation. The simulation-based evaluation methodology measures how well the models learn the distribution of terms across relevant documents (i.e., learn what information is relevant) and how well they improve search effectiveness (i.e., create effective search queries). Our findings show that an implicit feedback model based on Jeffrey's rule of conditioning outperformed the other models under investigation.
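Jeffrey's rule of conditioning revises a probability when the evidence itself is uncertain: P'(A) = P(A|B) P'(B) + P(A|not B) P'(not B). The sketch below implements only that general rule. Mapping A to "this term is relevant" and B to "the searcher found the viewed passage relevant" is an assumption for illustration, not the model described in the article, and the function and parameter names are hypothetical.

```python
def jeffrey_update(p_a_given_b, p_a_given_not_b, new_p_b):
    """Revised probability of A after belief in B shifts to `new_p_b`.

    With new_p_b == 1.0 this reduces to ordinary conditioning on B.
    """
    if not 0.0 <= new_p_b <= 1.0:
        raise ValueError("new_p_b must be a probability")
    return p_a_given_b * new_p_b + p_a_given_not_b * (1.0 - new_p_b)


# Hypothetical numbers: interaction evidence suggests the passage is
# relevant with probability 0.6, rather than with certainty.
revised_term_relevance = jeffrey_update(
    p_a_given_b=0.8, p_a_given_not_b=0.1, new_p_b=0.6
)
```

With these numbers the revised probability is 0.8 × 0.6 + 0.1 × 0.4 = 0.52, which sits between the two conditional probabilities in proportion to how strongly the interaction supports B.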
Cracking the code of oscillatory activity
Neural oscillations are ubiquitous measurements of cognitive processes and dynamic routing and gating of information. The fundamental and so far unresolved problem for neuroscience remains to understand how oscillatory activity in the brain codes information for human cognition. In a biologically relevant cognitive task, we instructed six human observers to categorize facial expressions of emotion while we measured the observers' EEG. We combined state-of-the-art stimulus control with statistical information theory analysis to quantify how the three parameters of oscillations (i.e., power, phase, and frequency) code the visual information relevant for behavior in a cognitive task. We make three points: First, we demonstrate that phase codes considerably more information (2.4 times) relating to the cognitive task than power. Second, we show that the conjunction of power and phase coding reflects detailed visual features relevant for behavioral response, that is, features of facial expressions predicted by behavior. Third, we demonstrate, in analogy to communication technology, that oscillatory frequencies in the brain multiplex the coding of visual features, increasing coding capacity. Together, our findings about the fundamental coding properties of neural oscillations will redirect the research agenda in neuroscience by establishing the differential role of frequency, phase, and amplitude in coding behaviorally relevant information in the brain.
The good, the bad and the implicit: a comprehensive approach to annotating explicit and implicit sentiment
We present a fine-grained scheme for the annotation of polar sentiment in text that accounts for explicit sentiment (so-called private states) as well as implicit expressions of sentiment (polar facts). Polar expressions are annotated below sentence level and classified according to their subjectivity status. Additionally, they are linked to one or more targets with a specific polar orientation and intensity. Other components of the annotation scheme include source attribution and the identification and classification of expressions that modify polarity. In previous research, little attention has been given to implicit sentiment, which represents a substantial amount of the polar expressions encountered in our data. An English and a Dutch corpus of financial newswire, consisting of over 45,000 words each, were annotated using our scheme. A subset of these corpora was used to conduct an inter-annotator agreement study, which demonstrated that the proposed scheme can be used to reliably annotate explicit and implicit sentiment in real-world textual data, making the created corpora a useful resource for sentiment analysis.