4 research outputs found

    Supporting finding and re-finding through personalization

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 165-176). Although one of the most common uses for the Internet is to search for information, Web search tools often fail to connect people with what they are looking for. This is because search tools are designed to satisfy people in general, not the searcher in particular. Different individuals with different information needs often type the same search terms into a search box and expect different results. For example, the query "breast cancer" may be used by a student to find information on the disease for a fifth grade science report, and by a cancer patient to find treatment options. This thesis explores how Web search personalization can help individuals take advantage of their unique past information interactions when searching. Several studies of search behavior are presented and used to inform the design of a personalized search system that significantly improves result quality. Without requiring any extra effort from the user, the system is able to return simple breast cancer tutorials for the fifth grader's "breast cancer" query, and lists of treatment options for the patient's. While personalization can help identify relevant new information, new information can create problems for re-finding when it is presented in a way that does not account for previous information interactions. Consider the cancer patient who repeats a search for breast cancer treatments: she may want to learn about new treatments while reviewing the information she found earlier about her current treatment.
To avoid interfering with re-finding, repeat search results should be personalized not by ranking the most relevant results first, but rather by ranking them where the user most expects them to be. This thesis presents a model of what people remember about search results, and shows that it is possible to invisibly merge new information into previously viewed search result lists where information has been forgotten. Personalizing repeat search results in this way enables people to effectively find both new and old information using the same search result list. by Jaime Teevan. Ph.D.

    Predicting re-finding activity and difficulty

    In this study, we address the problem of identifying whether users are attempting to re-find information and estimating the level of difficulty of the re-finding task. We propose to consider task information (e.g. multiple queries and click information) rather than only queries. Our resulting prediction models are shown to be significantly more accurate (by 2%) than the current state of the art. While past research assumes that the previous search history of the user is available to the prediction model, we examine whether re-finding detection is possible without access to this information. Our evaluation indicates that such detection is possible, but more challenging. We further describe the first predictive model for detecting re-finding difficulty, showing it to be significantly better than existing approaches for detecting general search difficulty.

    Identification of re-finding tasks and search difficulty

    We address the problem of identifying whether users are attempting to re-find information and estimating the level of difficulty of the re-finding task. Identifying re-finding tasks and detecting search difficulties will enable search engines to respond dynamically to the search task being undertaken. To this end, we conduct user studies and query log analysis to gain a better understanding of re-finding tasks and search difficulties. Computing features gathered in our user studies, we generate training sets from query log data, which are used to construct automatic identification (prediction) models. Built using machine learning techniques, our re-finding identification model, which is the first model at the task level, significantly outperforms existing query-based identification. While past research assumes that the previous search history of the user is available to the prediction model, we examine whether re-finding detection is possible without access to this information. Our evaluation indicates that such detection is possible, but more challenging. We further describe the first predictive model for detecting re-finding difficulty, showing it to be significantly better than existing approaches for detecting general search difficulty. We also analyze the important features for identifying both re-finding and difficulty. Next, we investigate the detailed identification of re-finding tasks and difficulties in terms of the vertical type of the document to be re-found. The accuracy of the constructed predictive models indicates that re-finding tasks are indeed distinguishable across verticals and in comparison to general search tasks. This illustrates the need to adapt existing general search techniques to the re-finding context by presenting vertical-specific results.
Despite the overall reduction in accuracy for predictions made independently of the user's original search, it appears that identifying “image” re-finding is less dependent on such past information. Investigating the real-time prediction effectiveness of the models shows that predicting “image” document re-finding obtains the highest accuracy early in the search. Early predictions would benefit search engines by allowing search results to be adapted during re-finding activities. Furthermore, we study the difficulties in re-finding across verticals, given some of the established indications of difficulty in the general web search context. In terms of user effort, re-finding in the “image” vertical appears to take more queries and clicks than in the other investigated verticals, while re-finding “reference” documents seems to be more time consuming when there is a longer time gap between the re-finding and the corresponding original search. Exploring other features suggests that there could be particular indications of difficulty that apply to the re-finding context and are specific to each vertical. To sum up, this research investigates the issue of effectively supporting users with re-finding search tasks. To this end, we have identified features that allow for a more accurate distinction between re-finding and general tasks. This will enable search engines to better adapt search results to the re-finding context and improve the search experience of users. Moreover, features indicative of similar/different and easy/difficult re-finding tasks can be employed for building balanced test environments, which could address one of the main gaps in the re-finding context.