Applying Wikipedia to Interactive Information Retrieval
There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. 
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.
Spartan Daily, September 21, 2000
Volume 115, Issue 15
https://scholarworks.sjsu.edu/spartandaily/9583/thumbnail.jp
Information Outlook, September 2007
Volume 11, Issue 9
https://scholarworks.sjsu.edu/sla_io_2007/1008/thumbnail.jp
Information visibility on the Web and conceptions of success and failure in Web searching.
This thesis reports the procedure and findings of an empirical study about end users'
interaction with web-based search tools. The first part addresses early research
questions to discover web users' conceptions of the invisible web. The second part addresses
primary research questions to explore web users' conceptualizations of the causes of their
search success/failure and their awareness of and reaction to missed information while
searching the web. The third part is devoted to a number of emergent research questions to reexamine
the dataset in the light of a number of theoretical frameworks including Locus of
Control, Self-efficacy, Attribution Theory and Bounded Rationality and Satisficing theory.
The data collection was carried out in three phases based on in-depth, open-ended and
semi-structured interviews with a sample of academic staff, research staff and research students
from three biology-related departments at the University of Sheffield. A combination of
inductive and deductive approaches was employed to address three sets of research questions.
The first part of the analysis, which was based on Grounded Theory, led to the discovery of a new
concept called 'information visibility', which distinguishes between the technical,
objective conceptions of the invisible web that commonly appear in the literature and a
cognitive, subjective conception based on searchers' perceptions of search failure. Accordingly,
the study introduced a 'model of information visibility on the web' which suggests a
complementary definition for the invisible web. Inductive exploration of the data to address the
primary research questions culminated in identification of different kinds of success (i.e.
anticipated, serendipitous, and unexpected success) and failure (i.e. unexpected, unexplained
and inevitable failure). The results also showed that the participants in the study were aware of
the possibility of missing some relevant information in their searches, and that the risk of missing
potentially important information is a matter of concern to them. However, depending on the
context of each search, they have different perceptions of the importance and the volume of
missed information and accordingly react to it differently. In view of that, two matrices
including the "matrix of search impact" and the "matrix of search depth" were developed to
address users' search behaviours regarding their awareness of and reaction to missed
information. The matrix of search impact suggests that there are different perceptions of the
risk of missing information including "inconsequential", "tolerable", "damaging" and
"disastrous". The matrix of search depth illustrates different search strategies including
"minimalist", "opportunistic", "nervous" and "extensive".
The third part of the study indicated that Locus of Control and Attribution Theory are
useful theoretical frameworks for helping us to better understand web-based information
seeking. Furthermore, interpretation of the data with regard to Bounded Rationality and
Satisficing theory supported the inductive findings and showed that web users' estimations of
the likely volume and importance of missed information affect their decision to persist in
searching. At the final stage of the study, an integrative model of information seeking
behaviour on the web was developed. This six-layer model incorporates the results of both
inductive and deductive stages of the study.
The BG News August 22, 2007
The BGSU campus student newspaper August 22, 2007. Volume 98 - Issue 4
https://scholarworks.bgsu.edu/bg-news/8780/thumbnail.jp
Questions And Answers: Exploring Mobile User Needs
The users of mobile devices increasingly use networked services to address their information needs. Questions asked by mobile users are strongly influenced by context factors, such as location and user activity. However, in research that has empirically documented the link between mobile information needs and context factors, information about expected answers is scant. Therefore, the goal of this study is to explore the context factors that influence mobile information needs and the answers expected by mobile users. The results were obtained by analysing information from paper diaries and digital diaries. The project involved a user study comprising two types of diary study, one using a paper diary and one using a digital diary. Both were analysed using grounded theory and a taxonomy of information needs. Our results indicate a relationship between mobile information needs, context factors and expected answers, and provide a better understanding of the expected answers related to mobile information needs.
A multiple case study exploration of undergraduate subject searching
Subject searching—seeking information with a subject or topic in mind—is often involved in carrying out undergraduate assignments such as term papers and research reports. It is also an important component of information literacy—the abilities and experiences of effectively finding and evaluating, and appropriately using, needed information—which universities hope to cultivate in undergraduates by the time they complete their degree programs. By exploring the subject searching of a small group of upper-level, academically successful undergraduates over a school year I sought to acquire a deeper understanding of the contexts and characteristics of their subject searching, and of the extent to which it was similar in quality to that of search and domain experts.
Primary data sources for this study comprised subject searching diaries maintained by participants, and three online subject searches they demonstrated at the beginning, middle, and end of the study during which they talked aloud while I observed, followed by focused interviews. To explore the quality of study participants’ subject searching I looked for indications of advanced thinking in thoughts they spoke aloud during demonstration sessions relating to using strategy, evaluating, and creating personal understanding, which represent three of the most challenging and complex aspects of information literacy.
Applying a layered interpretive process, I identified themes within several hundred instances of participants’ advanced thinking relating to these three information literacy elements, with evaluative themes occurring most often. I also noted three factors influencing the extent of similarity between the quality of participants’ advanced thinking and that of search and domain experts, which reflected matters that tended to be i) pragmatic or principled, ii) technical or conceptual, and iii) externally or internally focused. Filtered through these factors, participants’ instances of advanced thinking brought to mind three levels of subject searching abilities: the competent student, the search expert, and the domain expert. Although such instances were relatively few in number, I identified at least some advanced thinking evincing domain expert qualities in voiced thoughts of all but one participant, suggesting the gap between higher order thinking abilities of upper-level undergraduates and information literate individuals is not always dauntingly large.
Towards Nootropia : a non-linear approach to adaptive document filtering
In recent years, it has become increasingly difficult for users to find relevant information within the accessible glut. Research in Information Filtering (IF) tackles this problem through a tailored representation of the user interests, a user profile. Traditionally, IF inherits techniques from the related and better-established domains of Information Retrieval and Text Categorisation. These include linear profile representations that exclude term dependencies and may only effectively represent a single topic of interest, and linear learning algorithms that achieve a steady profile adaptation pace. We argue that these practices are not attuned to the dynamic nature of user interests. A user may be interested in more than one topic in parallel, and both frequent variations and occasional radical changes of interests are inevitable over time. With our experimental system "Nootropia", we achieve adaptive document filtering with a single, multi-topic user profile. A hierarchical term network that takes into account topical and lexical correlations between terms and identifies topic-subtopic relations between them is used to represent a user's multiple topics of interest and distinguish between them. A series of non-linear document evaluation functions is then established on the hierarchical network. Experiments using a variation of TREC's routing subtask to test the ability of a single profile to represent two and three topics of interest reveal the approach's superiority over a linear profile representation. Adaptation of this single, multi-topic profile to a variety of changes in the user interests is achieved through a process of self-organisation that constantly readjusts the profile structurally, in response to user feedback. We used virtual users and another variation of TREC's routing subtask to test the profile on two learning and two forgetting tasks.
The results clearly indicate the profile's ability to adapt to both frequent variations and radical changes in user interests.
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them at three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message.
Comment: PhD thesis