80,787 research outputs found

    Techniques for Improving Web Search by Understanding Queries

    This thesis investigates the refinement of web search results with a special focus on the use of clustering and the role of queries. It presents a collection of new methods for evaluating clustering methods, performing clustering effectively, and for performing query refinement. The thesis identifies different types of query, the situations where refinement is necessary, and the factors affecting search difficulty. It then analyses hard searches and argues that many of them fail because users and search engines have different query models. The thesis identifies best practice for evaluating web search results and search refinement methods. It finds that none of the commonly used evaluation measures for clustering meet all of the properties of good evaluation measures. It then presents new quality and coverage measures that satisfy all the desired properties and that rank clusterings correctly in all web page clustering situations. The thesis argues that current web page clustering methods work well when different interpretations of the query have distinct vocabulary, but still have several limitations and often produce incomprehensible clusters. It then presents a new clustering method that uses the query to guide the construction of semantically meaningful clusters. The new clustering method significantly improves performance. Finally, the thesis explores how searches and queries are composed of different aspects and shows how to use aspects to reduce the distance between the query models of search engines and users. It then presents fully automatic methods that identify query aspects, identify underrepresented aspects, and predict query difficulty. Used in combination, these methods have many applications — the thesis describes methods for two of them. The first method improves the search results for hard queries with underrepresented aspects by automatically expanding the query using semantically orthogonal keywords related to the underrepresented aspects. 
The second method helps users refine hard ambiguous queries by identifying the different query interpretations using a clustering of a diverse set of refinements. Both methods significantly outperform existing methods.

    Identification of User Search Targets Using Feed Backs

    Abstract Different users may have different search goals when they submit a broad and ambiguous query. Search engine performance can be improved by identifying and analyzing these goals. In this paper, we study an approach that identifies user search goals by analyzing search engine query logs. The search goals of different users are inferred by clustering the feedback sessions extracted from those logs. To get the best results, it is necessary to capture the different user search goals. These goals are the information on different aspects of a query that different users want to obtain. The inference and analysis of user search goals can be improved by using the relevant results returned by the search engine together with the user's feedback. Here, feedback sessions, each built from a series of both clicked and unclicked URLs, are used to discover different user search goals. Pseudo-documents are generated to better represent the feedback sessions and reflect the user's information need. The original search results are then restructured, and classified average precision is used to evaluate the restructured results. Keywords Search Goals, Feedback Sessions, Pseudo-Documents I. Introduction Web mining is an application of data mining techniques to discover knowledge from the web. In web search, users submit queries to search engines to get relevant information. However, many search engine results are uninformative and fail to match the user's search goals. Users usually supply a few vague keywords to represent their interests, and such keywords often do not match the results produced by the search engines. More work on the analysis of user search goals is therefore needed. Users who submit ambiguous queries to search engines get mostly irrelevant results. 
User search goals are classified as Navigational and Informational: queries that seek a single website or web page, and queries that reflect the user's intent to perform a particular transaction, respectively. Many related works address web search applications and user search goals. In some previous works, clustering is done on a set of top-ranked results; the user's search log information is not analyzed and feedback sessions are not considered. Other works analyze only the clicked URLs from the web search logs. They only identify whether a pair of queries belongs to the same goal or mission and do not determine what the goal is in detail. Still others perform semantic web search for a particular query using the similarity between words. Various algorithms, such as the star clustering algorithm and the k-means clustering algorithm, are used for clustering the pseudo-documents, but they also do not cluster the relevant information according to the user search goals, and the cluster labels they discover are not informative. A user search goal is the information on different aspects of a query that a user wants to obtain. An information need is a user's desire to obtain relevant information to satisfy that need. To cluster web search results, the URLs are analyzed by extracting their titles and snippets. But all those works produce noisy results and do not obtain the user search goals precisely. When more irrelevant and relevant results are produced by the search engines it is tim
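
The feedback-session idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the weighting scheme, the `unclicked_penalty` parameter, and the function names `pseudo_document` and `cosine` are all assumptions made for illustration. Each session is a list of result texts (title plus snippet) with a clicked flag; terms from clicked results are rewarded, terms from unclicked results are penalized, and the resulting vectors are what a clustering step such as k-means would then group.

```python
from collections import Counter
from math import sqrt


def pseudo_document(session, unclicked_penalty=0.5):
    """Build a term-weight vector for one feedback session.

    session: list of (text, clicked) pairs, one per URL shown to the user.
    The penalty value is a hypothetical choice, not taken from the paper.
    """
    weights = Counter()
    for text, clicked in session:
        for term in text.lower().split():
            weights[term] += 1.0 if clicked else -unclicked_penalty
    # Keep only terms the user's behaviour supports on balance.
    return {term: w for term, w in weights.items() if w > 0}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Three made-up sessions for the ambiguous query "java".
session_a = [("java programming tutorial", True), ("java island travel", False)]
session_b = [("java language tutorial", True), ("java coffee beans", False)]
session_c = [("java island travel guide", True), ("java programming tutorial", False)]

pd_a = pseudo_document(session_a)
pd_b = pseudo_document(session_b)
pd_c = pseudo_document(session_c)

print("a vs b:", cosine(pd_a, pd_b))
print("a vs c:", cosine(pd_a, pd_c))
```

In this toy data, sessions a and b share a programming-related goal and end up closer to each other than to session c, which is the property a clustering of pseudo-documents relies on.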

    Extracting consumers needs for new products a web mining approach

    Here we introduce a web mining approach for automatically identifying new product ideas extracted from web logs. A web log, also known as a blog, is a web site that provides commentary, news, and further information on a subject, written by individual persons. A large number of web logs exist for nearly every topic, and in them consumers present their needs for new products. These new product ideas are probably valuable for producers as well as for researchers and developers, because they can lead to a new product development process. Finding these new product ideas is a well-known task in marketing. Therefore, with this automatic approach we support marketing activities by extracting new and useful product ideas from textual information in internet logs. This approach is implemented by a web-based application named Product Idea Web Log Miner, where users from the marketing department provide descriptions of existing products. As a result, new product ideas are extracted from the web logs and presented to the users.

    Search Bias Quantification: Investigating Political Bias in Social Media and Web Search

    Users frequently use search systems on the Web as well as online social media to learn about ongoing events and public opinion on personalities. Prior studies have shown that the top-ranked results returned by these search engines can shape user opinion about the topic (e.g., event or person) being searched. In the case of polarizing topics like politics, where multiple competing perspectives exist, the political bias in the top search results can play a significant role in shaping public opinion towards (or away from) certain perspectives. Given the considerable impact that search bias can have on the user, we propose a generalizable search bias quantification framework that not only measures the political bias in the ranked list output by the search system but also decouples the bias introduced by the different sources: input data and ranking system. We apply our framework to study the political bias in searches related to the 2016 US Presidential primaries in Twitter social media search and find that both input data and ranking system matter in determining the final search output bias seen by the users. Finally, we use the framework to compare the relative bias for two popular search systems, Twitter social media search and Google web search, for queries related to politicians and political events. We end by discussing some potential solutions to signal the bias in the search results to make the users more aware of it.
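
The decoupling described in this abstract can be sketched as simple arithmetic. This is an illustrative simplification, not the paper's exact metrics: it assumes every item carries a bias score in [-1, 1], takes input bias as the mean score of all candidate items, takes output bias as the mean score of the top-k ranked items, and attributes the difference to the ranking system. The function name `decompose_bias` and the scores are hypothetical.

```python
def mean(values):
    """Arithmetic mean; raises on an empty list rather than guessing."""
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


def decompose_bias(candidate_scores, ranked_scores, k):
    """Split the bias a user sees into input and ranking components.

    candidate_scores: bias scores of all items matching the query (input data).
    ranked_scores: bias scores of the same items in the order the system shows them.
    k: number of top results the user is assumed to look at.
    """
    input_bias = mean(candidate_scores)
    output_bias = mean(ranked_scores[:k])
    ranking_bias = output_bias - input_bias  # what the ranking added or removed
    return input_bias, output_bias, ranking_bias


# Made-up scores: -1 leans one way, +1 the other, 0 is neutral.
candidates = [-1.0, -1.0, 0.0, 1.0, 1.0, 1.0]
ranked = [1.0, 1.0, 0.0, -1.0, 1.0, -1.0]

input_bias, output_bias, ranking_bias = decompose_bias(candidates, ranked, k=3)
print(input_bias, output_bias, ranking_bias)
```

Under these assumptions, a ranking bias with the same sign as the input bias means the ranking system amplified a lean already present in the data, while an opposite sign means it counteracted that lean.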

    A structured model metametadata technique to enhance semantic searching in metadata repository

    This paper discusses a novel technique for semantic searching and retrieval of information about learning materials. A novel structured metametadata model has been created to provide the foundation for a semantic search engine to extract, match and map queries to retrieve relevant results. Metametadata encapsulate metadata instances by using the properties and attributes provided by ontologies rather than describing learning objects. The use of ontological views assists the pedagogical content of metadata extracted from learning objects by using the control vocabularies as identified from the metametadata taxonomy. The use of metametadata (based on the metametadata taxonomy) supported by the ontologies has contributed towards a novel semantic searching mechanism. This research has presented a metametadata model for identifying semantics and describing learning objects in finer-grained detail, which allows for intelligent retrieval by automated search and retrieval software.

    Multilingual adaptive search for digital libraries

    This paper describes a framework for Adaptive Multilingual Information Retrieval (AMIR) which allows multilingual resource discovery and delivery using on-the-fly machine translation of documents and queries. Result documents are presented to the user in a contextualised manner. Challenges and affordances of both Adaptive and Multilingual IR, with a particular focus on Digital Libraries, are detailed. The framework components are motivated by a series of results from experiments on query logs and documents from The European Library. We conclude that factoring adaptivity and multilinguality aspects into the search process can enhance the user's experience with online Digital Libraries.