
247 research outputs found

Applying Wikipedia to Interactive Information Retrieval

Author: Milne David N.
Publication venue: 'University of Waikato'
Publication date: 15/09/2010

There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. 
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.

Research Commons@Waikato

Spartan Daily, September 21, 2000

Author: San Jose State University School of Journalism and Mass Communications
Publication venue: SJSU ScholarWorks
Publication date: 21/09/2000

Volume 115, Issue 15

SJSU ScholarWorks

Information Outlook, September 2007

Author: Special Libraries Association
Publication venue: SJSU ScholarWorks
Publication date: 01/09/2007

Volume 11, Issue 9

SJSU ScholarWorks

Information visibility on the Web and conceptions of success and failure in Web searching.

Author: Mansourian Yazdan
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/01/2006

This thesis reports the procedure and findings of an empirical study of end users' interaction with web-based search tools. The first part addresses early research questions to discover web users' conceptions of the invisible web. The second part addresses primary research questions to explore web users' conceptualizations of the causes of their search success/failure and their awareness of and reaction to missed information while searching the web. The third part is devoted to a number of emergent research questions that reexamine the dataset in the light of several theoretical frameworks, including Locus of Control, Self-efficacy, Attribution Theory, and Bounded Rationality and Satisficing theory. The data collection was carried out in three phases based on in-depth, open-ended and semi-structured interviews with a sample of academic staff, research staff and research students from three biology-related departments at the University of Sheffield. A combination of inductive and deductive approaches was employed to address three sets of research questions. The first part of the analysis, which was based on Grounded Theory, led to the discovery of a new concept called 'information visibility', which distinguishes between technical objective conceptions of the invisible web that commonly appear in the literature and a cognitive subjective conception based on searchers' perceptions of search failure. Accordingly, the study introduced a 'model of information visibility on the web' which suggests a complementary definition for the invisible web. Inductive exploration of the data to address the primary research questions culminated in the identification of different kinds of success (i.e. anticipated, serendipitous, and unexpected success) and failure (i.e. unexpected, unexplained and inevitable failure).
The results also showed that the participants in the study were aware of the possibility of missing some relevant information in their searches, and that the risk of missing potentially important information is a matter of concern to them. However, depending on the context of each search, they have different perceptions of the importance and the volume of missed information, and accordingly they react to it differently. In view of that, two matrices, the "matrix of search impact" and the "matrix of search depth", were developed to address users' search behaviours regarding their awareness of and reaction to missed information. The matrix of search impact suggests that there are different perceptions of the risk of missing information, including "inconsequential", "tolerable", "damaging" and "disastrous". The matrix of search depth illustrates different search strategies, including "minimalist", "opportunistic", "nervous" and "extensive". The third part of the study indicated that Locus of Control and Attribution Theory are useful theoretical frameworks for helping us to better understand web-based information seeking. Furthermore, interpretation of the data with regard to Bounded Rationality and Satisficing theory supported the inductive findings and showed that web users' estimations of the likely volume and importance of missed information affect their decision to persist in searching. At the final stage of the study, an integrative model of information seeking behaviour on the web was developed. This six-layer model incorporates the results of both inductive and deductive stages of the study.

White Rose E-theses Online

OpenGrey Repository

The BG News August 22, 2007

Author: Bowling Green State University
Publication venue: ScholarWorks@BGSU
Publication date: 22/08/2007

The BGSU campus student newspaper, August 22, 2007. Volume 98, Issue 4

Bowling Green State University: ScholarWorks@BGSU

Questions And Answers: Exploring Mobile User Needs

Author: Su-Ping (Carole) Chang
Publication venue: 'University of Waikato'
Publication date: 03/05/2011

The users of mobile devices increasingly use networked services to address their information needs. Questions asked by mobile users are strongly influenced by context factors, such as location and user activity. However, in research which has empirically documented the link between mobile information needs and context factors, information about expected answers is scant. Therefore, the goal of this study is to explore the context factors which influence mobile information needs and the answers expected by mobile users. The results are obtained by analysing information from paper diaries and digital diaries. This project involved a user study comprising two different types of studies, one using a paper diary and one using a digital diary. The analysis of both the paper diary and the digital diary was conducted through grounded theory and a taxonomy of information needs. Our results indicate a relationship between mobile information needs, context factors and expected answers. Our study explored this relationship and provides a better understanding of the expected answers related to mobile information needs.

Research Commons@Waikato

A multiple case study exploration of undergraduate subject searching

Author: Graham Rumi Y.
Publication venue: 'University of Toronto Medical Journal'
Publication date: 01/01/2011

Subject searching (seeking information with a subject or topic in mind) is often involved in carrying out undergraduate assignments such as term papers and research reports. It is also an important component of information literacy (the abilities and experiences of effectively finding and evaluating, and appropriately using, needed information), which universities hope to cultivate in undergraduates by the time they complete their degree programs. By exploring the subject searching of a small group of upper-level, academically successful undergraduates over a school year, I sought to acquire a deeper understanding of the contexts and characteristics of their subject searching, and of the extent to which it was similar in quality to that of search and domain experts. Primary data sources for this study comprised subject searching diaries maintained by participants, and three online subject searches they demonstrated at the beginning, middle, and end of the study, during which they talked aloud while I observed, followed by focused interviews. To explore the quality of study participants’ subject searching, I looked for indications of advanced thinking in thoughts they spoke aloud during demonstration sessions relating to using strategy, evaluating, and creating personal understanding, which represent three of the most challenging and complex aspects of information literacy. Applying a layered interpretive process, I identified themes within several hundred instances of participants’ advanced thinking relating to these three information literacy elements, with evaluative themes occurring most often. I also noted three factors influencing the extent of similarity between the quality of participants’ advanced thinking and that of search and domain experts, which reflected matters that tended to be i) pragmatic or principled, ii) technical or conceptual, and iii) externally or internally focused.
Filtered through these factors, participants’ instances of advanced thinking brought to mind three levels of subject searching abilities: the competent student, the search expert, and the domain expert. Although such instances were relatively few in number, I identified at least some advanced thinking evincing domain expert qualities in the voiced thoughts of all but one participant, suggesting the gap between the higher order thinking abilities of upper-level undergraduates and those of information literate individuals is not always dauntingly large.

University of Toronto Research Repository

OPUS: Open Uleth Scholarship - University of Lethbridge Research Repository


Towards Nootropia : a non-linear approach to adaptive document filtering

Author: Nanas Nikolaos
Publication date: 01/01/2003

In recent years, it has become increasingly difficult for users to find relevant information within the accessible glut. Research in Information Filtering (IF) tackles this problem through a tailored representation of the user interests, a user profile. Traditionally, IF inherits techniques from the related and better-established domains of Information Retrieval and Text Categorisation. These include linear profile representations that exclude term dependencies and may only effectively represent a single topic of interest, and linear learning algorithms that achieve a steady profile adaptation pace. We argue that these practices are not attuned to the dynamic nature of user interests. A user may be interested in more than one topic in parallel, and both frequent variations and occasional radical changes of interests are inevitable over time. With our experimental system "Nootropia", we achieve adaptive document filtering with a single, multi-topic user profile. A hierarchical term network that takes into account topical and lexical correlations between terms and identifies topic-subtopic relations between them is used to represent a user's multiple topics of interest and distinguish between them. A series of non-linear document evaluation functions is then established on the hierarchical network. Experiments using a variation of TREC's routing subtask to test the ability of a single profile to represent two and three topics of interest reveal the approach's superiority over a linear profile representation. Adaptation of this single, multi-topic profile to a variety of changes in the user interests is achieved through a process of self-organisation that constantly readjusts the profile structurally in response to user feedback. We used virtual users and another variation of TREC's routing subtask to test the profile on two learning and two forgetting tasks.
The results clearly indicate the profile's ability to adapt to both frequent variations and radical changes in user interests.

Open Research Online (The Open University)

OpenGrey Repository

Learning Representations of Social Media Users

Author: Benton Adrian
Publication date: 02/12/2018

User representations are routinely used in recommendation systems by platform developers, in targeted advertisements by marketers, and by public policy researchers to gauge public opinion across demographic groups. Computer scientists consider the problem of inferring user representations more abstractly: how does one extract a stable user representation, effective for many downstream tasks, from a medium as noisy and complicated as social media? The quality of a user representation is ultimately task-dependent (e.g. does it improve classifier performance or make more accurate recommendations in a recommendation system?), but there are proxies that are less sensitive to the specific task. Is the representation predictive of latent properties such as a person's demographic features, socioeconomic class, or mental health state? Is it predictive of the user's future behavior? In this thesis, we begin by showing how user representations can be learned from multiple types of user behavior on social media. We apply several extensions of generalized canonical correlation analysis to learn these representations and evaluate them on three tasks: predicting future hashtag mentions, friending behavior, and demographic features. We then show how user features can be employed as distant supervision to improve topic model fit. Finally, we show how user features can be integrated into and improve existing classifiers in the multitask learning framework. We treat user representations (ground truth gender and mental health features) as auxiliary tasks to improve mental health state prediction. We also use distributed user representations learned in the first chapter to improve tweet-level stance classifiers, showing that distant user information can inform classification tasks at the granularity of a single message.

arXiv.org e-Print Archive

Johns Hopkins University

JScholarship