14,289 research outputs found
Utilizing sub-topical structure of documents for information retrieval.
Text segmentation in natural language processing typically refers to the process of decomposing a document into constituent subtopics. Our work centers on the application of text segmentation techniques within information retrieval (IR) tasks: for example, scoring a document by combining the retrieval scores of its constituent segments, exploiting the proximity of query terms in documents for ad hoc search, and question answering (QA), where retrieved passages from multiple documents are aggregated and presented as a single document to a searcher. Feedback in ad hoc IR tasks is shown to benefit from the use of extracted sentences instead of terms from the pseudo-relevant documents for query expansion. Retrieval effectiveness for the patent prior art search task is enhanced by applying text segmentation to the patent queries. Another aspect of our work involves augmenting text segmentation techniques to produce segments that are more readable, with fewer unresolved anaphora. This is particularly useful for QA and snippet generation tasks, where the objective is, on the one hand, to aggregate relevant and novel information from multiple documents satisfying the user's information need and, on the other, to ensure that the automatically generated content presented to the user is easily readable without reference to the original source document.
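As an illustration of scoring a document by combining the retrieval scores of its constituent segments, the following is a minimal sketch, not the authors' method. The log-scaled term-frequency score, the `alpha` mixing weight, and the function names `segment_score` and `document_score` are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def segment_score(query_terms, segment_tokens):
    """Toy relevance score: log-scaled frequency of each query term in the segment."""
    counts = Counter(segment_tokens)
    return sum(math.log(1 + counts[term]) for term in query_terms)


def document_score(query, segments, alpha=0.5):
    """Combine per-segment scores into one document score.

    Assumed combination rule (hypothetical): a weighted mix of the best
    segment score and the mean segment score, controlled by alpha.
    """
    query_terms = query.lower().split()
    scores = [segment_score(query_terms, seg.lower().split()) for seg in segments]
    if not scores:
        return 0.0
    best = max(scores)
    mean = sum(scores) / len(scores)
    return alpha * best + (1 - alpha) * mean


# Hypothetical documents, each already split into subtopic segments.
doc_a = ["text segmentation splits documents", "the weather is nice today"]
doc_b = ["the weather is nice today", "cooking pasta recipes"]

ranked = sorted(
    [("doc_a", doc_a), ("doc_b", doc_b)],
    key=lambda item: document_score("text segmentation", item[1]),
    reverse=True,
)
```

Mixing the maximum with the mean is one simple way to reward a document that contains a single highly focused segment without ignoring the rest of its content; other combination rules are equally plausible.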
Discovery Is Never By Chance: Designing for (Un)Serendipity
Serendipity has a long tradition in the history of science as having played a key role in many significant discoveries. Computer scientists, valuing the role of serendipity in discovery, have attempted to design systems that encourage serendipity. However, that research has focused primarily on only one aspect of serendipity: that of chance encounters. In reality, for serendipity to be valuable, chance encounters must be synthesized into insight. In this paper we show, through a formal consideration of serendipity and analysis of how various systems have seized on attributes of interpreting serendipity, that there is a richer space for design to support serendipitous creativity, innovation and discovery than has been tapped to date. We discuss how ideas might be encoded to be shared or discovered by "association-hunting" agents. We propose considering not only the inventor's role in perceiving serendipity, but also how that inventor's perception may be enhanced to increase the opportunity for serendipity. We explore the role of environment and how we can better enable serendipitous discoveries to find a home more readily and immediately.
Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval.
Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high-level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is impractical because of the large number of images in both online and offline collections. In this paper, we present a novel approach to automatically generate intermediate image descriptors by exploiting concept co-occurrence patterns in the pre-labeled training set, which makes it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns which could provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network whose nodes and edges represent concepts and co-occurrence relationships, respectively. A random walk process working on the inferred concept probabilities with the discovered co-occurrence patterns is applied to acquire the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches.
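To make the idea of refining concept probabilities with co-occurrence information concrete, here is a minimal sketch under stated assumptions. It uses a random walk with restart over a row-normalized co-occurrence matrix, which is a swapped-in, generic technique and not necessarily the walk used in the paper. The function name `refine_concepts`, the `restart` parameter, and the toy concepts and counts are hypothetical.

```python
def refine_concepts(initial, cooccurrence, restart=0.5, iterations=50):
    """Random walk with restart over a concept co-occurrence graph.

    initial      -- per-concept probabilities inferred from image features
    cooccurrence -- square matrix of co-occurrence counts between concepts
    restart      -- probability of jumping back to the initial distribution
    """
    n = len(initial)
    # Row-normalize counts into transition probabilities.
    # A concept with no co-occurrences keeps its mass (self-loop).
    transition = []
    for i, row in enumerate(cooccurrence):
        total = sum(row)
        if total == 0:
            transition.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            transition.append([value / total for value in row])

    scores = list(initial)
    for _ in range(iterations):
        walked = [
            sum(scores[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        scores = [
            (1 - restart) * walked[j] + restart * initial[j] for j in range(n)
        ]
    return scores


# Hypothetical toy data: "sky" and "sea" co-occur often, "car" is isolated.
concepts = ["sky", "sea", "car"]
counts = [
    [0, 5, 0],
    [5, 0, 0],
    [0, 0, 0],
]
initial_probs = [0.6, 0.1, 0.3]
refined = refine_concepts(initial_probs, counts)
```

In this toy setup the weakly detected concept "sea" gains score from its frequent companion "sky", while the isolated concept "car" is left unchanged, which is the contextual-cue effect the abstract describes.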
What makes re-finding information difficult? A study of email re-finding
Re-finding information that has been seen or accessed before is a task which can be relatively straightforward, but often it can be extremely challenging, time-consuming and frustrating. Little is known, however, about what makes one re-finding task harder or easier than another. We performed a user study to learn about the contextual factors that influence users' perception of task difficulty in the context of re-finding email messages. 21 participants were issued re-finding tasks to perform on their own personal collections. The participants' responses to questions about the tasks combined with demographic data and collection statistics for the experimental population provide a rich basis to investigate the variables that can influence the perception of difficulty. A logistic regression model was developed to examine the relationships between variables and determine whether any factors were associated with perceived task difficulty. The model reveals strong relationships between difficulty and the time lapsed since a message was read, remembering when the sought-after email was sent, remembering other recipients of the email, the experience of the user and the user's filing strategy. We discuss what these findings mean for the design of re-finding interfaces and future re-finding research.