Applying Wikipedia to Interactive Information Retrieval

Author: Milne David N.
Publication venue: 'University of Waikato'
Publication date: 15/09/2010

There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. 
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.

Research Commons@Waikato

Cutting Through the Online Review Jungle — Investigating Selective eWOM Processing

Authors: Gottschalk S. A., Mafael A.
Publication venue: 'Elsevier BV'
Publication date: 01/02/2017

Consumers frequently rely on online reviews, a prominent form of electronic word-of-mouth (eWOM), before making a purchase decision. However, consumers are usually confronted with hundreds of reviews for a single product or service, as well as rich information cues on online review websites (review texts, helpfulness ratings, author information, etc.). In turn, consumers face more information cues on online review websites than they can or want to process, and are likely to proceed selectively. This paper investigates selective processing of such eWOM information cues. Results of Study 1, an exploratory study using verbal protocols, confirm that consumers display selective eWOM processing patterns and are able to articulate them. Study 2 develops and applies a measurement instrument to capture these patterns. A subsequent cluster analysis on members of a large-scale online panel (N = 2,295) indicates five prominent eWOM processing types, termed “The Efficients”, “The Meticulous”, “The Quality-Evaluators”, “The Cautious Critics”, and “The Swift Pessimists”. Insights from this research can help firms to better understand consumers' eWOM processing and improve the user-friendliness of online review websites.

City Research Online

Modeling Narrative Discourse

Author: Elson David K.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

This thesis describes new approaches to the formal modeling of narrative discourse. Although narratives of all kinds are ubiquitous in daily life, contemporary text processing techniques typically do not leverage the aspects that separate narrative from expository discourse. We describe two approaches to the problem. The first approach considers the conversational networks to be found in literary fiction as a key aspect of discourse coherence; by isolating and analyzing these networks, we are able to comment on longstanding literary theories. The second approach proposes a new set of discourse relations that are specific to narrative. By focusing on certain key aspects, such as agentive characters, goals, plans, beliefs, and time, these relations represent a theory-of-mind interpretation of a text. We show that these discourse relations are expressive, formal, and robust, and that a software system makes them amenable to corpus collection projects using trained annotators. We have procured and released a collection of over 100 encodings, covering a set of fables as well as longer texts including literary fiction and epic poetry. We are able to inferentially find similarities and analogies between encoded stories based on the proposed relations, and an evaluation of this technique shows that human raters prefer such a measure of similarity to a more traditional one based on the semantic distances between story propositions.

Columbia University Academic Commons