813 research outputs found

    Cross Validation Of Neural Network Applications For Automatic New Topic Identification

    Recent studies in the literature address automatic topic-shift identification in Web search engine user sessions; however, most of them applied their topic-shift identification algorithms to data logs from a single search engine. The purpose of this study is to cross-validate an artificial neural network application that automatically identifies topic changes in a Web search engine user session, using data logs from different search engines to train and test the neural network. Sample data logs from the Norwegian search engine FAST (currently owned by Overture) and from Excite are used in this study. The findings suggest that it could be possible to identify topic shifts and continuations successfully in a particular search engine's user sessions using neural networks trained on a different search engine's data log.

    Neural network applications for automatic new topic identification on excite web search engine data logs

    This work was presented as a paper at the 67th Annual Meeting of the American Society for Information Science and Technology, held in Rhode Island (United States of America) on 12-17 November 2004. The analysis of contextual information in search engine query logs is an important, yet difficult task. Users submit few queries, and sometimes search multiple topics with closely related context. Identification of topic changes within a search session is an important branch of contextual information analysis. The purpose of this study is to propose a topic identification algorithm using neural networks. A sample from the Excite data log is selected to train the neural network, and the neural network is then used to identify topic changes in the data log. As a result, 76% of topic shifts and 92% of topic continuations are identified correctly. Sponsor: Amer Soc Informat Sci & Techno

    Characterization of portuguese web searches

    Integrated master's thesis. Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201

    Analysing Web Multimedia Query Reformulation Behaviour

    Current multimedia Web search engines still use keywords as the primary means of searching. Because multimedia content is so rich, general users often have difficulty formulating textual queries that adequately represent their needs. As a result, query reformulation becomes an inevitable part of most multimedia searches. Previous Web query formulation studies did not investigate modification sequences and thus can report only limited findings on reformulation behavior. In this study, we propose an automatic approach to examining multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the most dominant type of modification in visual searches but is less important in audio searches. Image search users prefer the specified search strategy more than video and audio users do. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of search strategies in different types of multimedia searching provides some insights into users' searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.
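
As a rough illustration of the kind of term-level comparison such an analysis relies on, the sketch below labels the change between two consecutive queries by comparing their term sets. The function name and the label names ("replacement", "specialization", and so on) are assumptions made for this example, not the taxonomy or method used in the study.

```python
def classify_reformulation(prev_query: str, next_query: str) -> str:
    """Label the change between two consecutive queries (hypothetical labels)."""
    # Treat each query as a case-insensitive set of whitespace-separated terms.
    prev_terms = set(prev_query.lower().split())
    next_terms = set(next_query.lower().split())

    added = next_terms - prev_terms
    removed = prev_terms - next_terms

    if not added and not removed:
        return "repeat"            # same terms, possibly reordered
    if not (prev_terms & next_terms):
        return "new"               # no shared terms at all
    if added and removed:
        return "replacement"       # some terms swapped for others
    if added:
        return "specialization"    # terms only added
    return "generalization"        # terms only removed


if __name__ == "__main__":
    # Illustrative queries only; not taken from any transaction log.
    print(classify_reformulation("red car", "red truck"))
```

Counting these labels over all consecutive query pairs in a session log would give a simple distribution of modification types; detecting synonym or associated-term replacements would additionally need a lexical resource, which this sketch does not attempt.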

    The Symbiotic Relationship Between Information Retrieval and Informetrics

    Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting, and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author's previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.

    Analyzing geographic query reformulation: an exploratory study

    Search engine users typically engage in multiquery sessions in their quest to fulfill their information needs. Despite a plethora of research findings suggesting that a significant group of users look for information within a specific geographical scope, existing reformulation studies lack a focused analysis of how users reformulate geographic queries. This study comprehensively investigates the ways in which users reformulate such needs in an attempt to fill this gap in the literature. Reformulated sessions were sampled from a query log of a major search engine to extract 2,400 entries that were manually inspected to filter geo sessions. This filter identified 471 search sessions that included geographical intent, and these sessions were analyzed quantitatively and qualitatively. The results revealed that one in five of the users who reformulated their queries were looking for geographically related information. They reformulated their queries by changing the content of the query rather than the structure. Users were not following a unified sequence of modifications and instead performed a single reformulation action. However, in some cases it was possible to anticipate their next move. A number of tasks in geo modifications were identified, including standard, multi-needs, multi-places, and hybrid approaches. The research concludes that it is important to specialize query reformulation studies to focus on particular query types rather than generically analyzing them, as it is apparent that geographic queries have their special reformulation characteristics

    Query Log Mining to Enhance User Experience in Search Engines

    The Web is the biggest repository of documents humans have ever built, and it grows in size every day. Users rely on Web search engines (WSEs) to find information on the Web. By submitting a textual query expressing their information need, WSE users obtain a list of documents that are highly relevant to the query. Moreover, WSEs store huge amounts of user activity in "query logs". Query log mining is the set of techniques that aim to extract valuable knowledge from query logs. This knowledge is one of the most widely used means of enhancing users' search experience. In line with this vision, in this thesis we first show that the knowledge extracted from query logs suffers from aging effects, and we propose a solution to this phenomenon. Second, we propose new algorithms for query recommendation that overcome the aging problem. Moreover, we study new query recommendation techniques for efficiently producing recommendations for rare queries. Finally, we study the problem of diversifying Web search engine results. We define a methodology, based on the knowledge derived from query logs, for detecting when and how query results need to be diversified, and we develop an efficient algorithm for diversifying search results.

    Application of the Markov Chain Method in a Health Portal Recommendation System

    This study produced a recommendation system that can effectively recommend items on a health portal. Toward this aim, a transaction log that records users’ traversal activities on the Medical College of Wisconsin’s HealthLink, a health portal with a subject directory, was utilized and investigated. This study proposed a mixed-method that included the transaction log analysis method, the Markov chain analysis method, and the inferential analysis method. The transaction log analysis method was applied to extract users’ traversal activities from the log. The Markov chain analysis method was adopted to model users’ traversal activities and then generate recommendation lists for topics, articles, and Q&A items on the health portal. The inferential analysis method was applied to test whether there are any correlations between recommendation lists generated by the proposed recommendation system and recommendation lists ranked by experts. The topics selected for this study are Infections, the Heart, and Cancer. These three topics were the three most viewed topics in the portal. The findings of this study revealed the consistency between the recommendation lists generated from the proposed system and the lists ranked by experts. At the topic level, two topic recommendation lists generated from the proposed system were consistent with the lists ranked by experts, while one topic recommendation list was highly consistent with the list ranked by experts. At the article level, one article recommendation list generated from the proposed system was consistent with the list ranked by experts, while 14 article recommendation lists were highly consistent with the lists ranked by experts. At the Q&A item level, three Q&A item recommendation lists generated from the proposed system were consistent with the lists ranked by experts, while 12 Q&A item recommendation lists were highly consistent with the lists ranked by experts. 
    The findings demonstrated the significance of users' traversal data extracted from the transaction log. The methodology applied in this study offers a systematic approach to building recommendation systems for other similar portals. The outcomes of this study can facilitate users' navigation and provide a new method for building a recommendation system that recommends items at three levels: the topic level, the article level, and the Q&A item level.
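
To make the Markov chain step more concrete, here is a minimal sketch of a first-order Markov chain built from traversal sessions and used to rank likely next items. It is an illustration under stated assumptions, not the study's implementation: the function names `build_transitions` and `recommend` are hypothetical, and the three sessions are invented for the example (only the topic names come from the abstract).

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Estimate first-order transition probabilities from ordered sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Count each observed step from one item to the next.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {item: c / total for item, c in next_counts.items()}
    return probs


def recommend(probs, current, k=3):
    """Return up to k next items, most probable first (ties broken by name)."""
    ranked = sorted(probs.get(current, {}).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Invented sessions for illustration; real input would come from the log.
sessions = [
    ["Heart", "Cancer", "Infections"],
    ["Heart", "Cancer"],
    ["Heart", "Infections"],
]
probs = build_transitions(sessions)

if __name__ == "__main__":
    print(recommend(probs, "Heart"))
```

An item that never appears as the starting point of a step has no outgoing transitions, so `recommend` returns an empty list for it; a production system would need some fallback in that case, such as overall popularity.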