Techniques for Improving Web Search by Understanding Queries
This thesis investigates the refinement of web search results with a special
focus on the use of clustering and the role of queries. It presents a
collection of new methods for evaluating clustering methods, performing
clustering effectively, and for performing query refinement.
The thesis identifies different types of query, the situations where refinement
is necessary, and the factors affecting search difficulty. It then
analyses hard searches and argues that many of them fail because users
and search engines have different query models.
The thesis identifies best practice for evaluating web search results and
search refinement methods. It finds that none of the commonly used evaluation
measures for clustering meet all of the properties of good evaluation
measures. It then presents new quality and coverage measures that
satisfy all the desired properties and that rank clusterings correctly in all
web page clustering situations.
The thesis argues that current web page clustering methods work well
when different interpretations of the query have distinct vocabulary, but
still have several limitations and often produce incomprehensible clusters.
It then presents a new clustering method that uses the query to guide
the construction of semantically meaningful clusters. The new clustering
method significantly improves performance.
Finally, the thesis explores how searches and queries are composed of
different aspects and shows how to use aspects to reduce the distance between
the query models of search engines and users. It then presents fully
automatic methods that identify query aspects, identify underrepresented
aspects, and predict query difficulty. Used in combination, these methods
have many applications; the thesis describes methods for two of
them. The first method improves the search results for hard queries with
underrepresented aspects by automatically expanding the query using semantically
orthogonal keywords related to the underrepresented aspects.
The second method helps users refine hard ambiguous queries by identifying
the different query interpretations using a clustering of a diverse set
of refinements. Both methods significantly outperform existing methods.
Personalization via collaboration in web retrieval systems: a context based approach
The World Wide Web is a source of information, and searches on the Web can be analyzed to detect patterns in Web users' search behaviors and information needs, so that the users' subsequent needs can be handled effectively. The rationale is that the information need of a user at a particular time point occurs in a particular context, and queries are derived from that need. In this paper, we discuss an extension of our personalization approach that was originally developed for a traditional bibliographic retrieval system but has been adapted and extended with a collaborative model for the Web retrieval environment. We start with a brief introduction of our personalization approach in a traditional information retrieval system. Then, based on the differences in the nature of documents, users and search tasks between traditional and Web retrieval environments, we describe our extensions that integrate collaboration into personalization in the Web retrieval environment. The architecture for the extension integrates machine learning techniques for the purpose of better modeling users' search tasks. Finally, a user-oriented evaluation of Web-based adaptive retrieval systems is presented as an important aspect of the overall strategy for personalization.
Identification of User Search Targets Using Feed Backs
Abstract: Different users may have different search objectives and goals for a broad and ambiguous search topic. Search engine performance can be improved by identifying and analyzing these search goals. In this paper, we study an approach that identifies user search goals by analyzing search engine query logs. The search goals of different users are discovered by clustering the proposed feedback sessions. To get the best results, it is necessary to capture the different user search goals. These user goals are the information on different aspects of a query that different users want to obtain. The judgment and analysis of user search goals can be improved by using the relevant results obtained from the search engine together with the user's feedback. Here, feedback sessions, built from a series of both clicked and unclicked URLs, are used to discover different user search goals. Pseudo-documents are generated to better represent the feedback sessions and to reflect the user's information need. The original search results are then restructured, and classified average precision is used to evaluate the performance of the restructured results. Keywords: Search Goals, Feedback Sessions, Pseudo-Documents. I. Introduction: Web mining is one of the applications of data mining techniques to discover knowledge from the web. In web search, users submit queries to search engines to get relevant information. But many search engine results are not informative and fail to match the user's search goals. Users usually give a few vague keywords to represent the interests they have in mind, and such keywords often do not match the results produced by the search engines. More work on analyzing user search goals is therefore needed. Users who give ambiguous queries to search engines get mostly irrelevant results.
User search goals are classified as Navigational and Informational: respectively, queries that seek a single website or webpage and queries that reflect the intent of the user to perform a particular transaction. Many related works address web search applications and user search goals. In previous works, clustering is done on a set of top-ranked results; the information in user search logs is not analyzed and feedback sessions are not considered. Other works analyze only the clicked URLs from the web search logs. They only identify whether a pair of queries belongs to the same goal or mission and do not determine what the goal is in detail. Semantic-based web search for a particular query, using the similarity between words, has also been carried out. Various algorithms, such as the star clustering algorithm and the k-means clustering algorithm, are used for clustering the pseudo-documents, but they also do not cluster the relevant information according to user search goals. The cluster labels discovered in clustering are also not informative. A user search goal is the information on different aspects of a query that users want to obtain. An information need is a user's desire to obtain relevant information to satisfy his need. To cluster web search results, the URLs are analyzed by extracting the titles and snippets. But all those works produce noisy results and do not obtain the user search goals precisely. When more irrelevant and relevant results are produced by the search engines it is tim
Extracting consumers needs for new products a web mining approach
Here we introduce a web mining approach for automatically identifying new product ideas extracted from web logs. A web log, also known as a blog, is a web site that provides commentary, news, and further information on a subject, written by individual persons. A large number of web logs can be found for nearly every topic, and in them consumers present their needs for new products. These new product ideas are probably valuable for producers as well as for researchers and developers, because they can lead to a new product development process. Finding these new product ideas is a well-known task in marketing. With this automatic approach, we therefore support marketing activities by extracting new and useful product ideas from textual information in internet logs. The approach is implemented by a web-based application named Product Idea Web Log Miner, in which users from the marketing department provide descriptions of existing products. As a result, new product ideas are extracted from the web logs and presented to the users.
Search Bias Quantification: Investigating Political Bias in Social Media and Web Search
Users frequently use search systems on the Web as well as online social media to learn about ongoing events and public opinion on personalities. Prior studies have shown that the top-ranked results returned by these search engines can shape user opinion about the topic (e.g., event or person) being searched. In case of polarizing topics like politics, where multiple competing perspectives exist, the political bias in the top search results can play a significant role in shaping public opinion towards (or away from) certain perspectives. Given the considerable impact that search bias can have on the user, we propose a generalizable search bias quantification framework that not only measures the political bias in the ranked list output by the search system but also decouples the bias introduced by the different sources: input data and ranking system. We apply our framework to study the political bias in searches related to 2016 US Presidential primaries in Twitter social media search and find that both input data and ranking system matter in determining the final search output bias seen by the users. And finally, we use the framework to compare the relative bias for two popular search systems, Twitter social media search and Google web search, for queries related to politicians and political events. We end by discussing some potential solutions to signal the bias in the search results to make the users more aware of them.
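The decoupling described in this abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: each item is given a hypothetical bias score in [-1, 1] (one political leaning negative, the other positive, neutral at 0), input bias is taken as the plain mean over all relevant items, output bias as the average of the running mean over the top-k ranked items, and ranking bias as the difference between the two. The function names are hypothetical as well.

```python
from typing import Sequence


def input_bias(scores: Sequence[float]) -> float:
    """Mean bias score of all relevant items, ignoring their rank (assumed definition)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


def output_bias(ranked_scores: Sequence[float], k: int) -> float:
    """Average of the running mean bias over the top-k ranked items.

    Items near the top contribute to more prefixes, so they weigh more.
    This weighting is an illustrative assumption.
    """
    top = ranked_scores[:k]
    if not top:
        raise ValueError("need at least one ranked item and k >= 1")
    running_sum = 0.0
    total = 0.0
    for rank, score in enumerate(top, start=1):
        running_sum += score
        total += running_sum / rank  # mean bias of the top `rank` items
    return total / len(top)


def ranking_bias(all_scores: Sequence[float],
                 ranked_scores: Sequence[float],
                 k: int) -> float:
    """Bias attributed to the ranking system: output bias minus input bias."""
    return output_bias(ranked_scores, k) - input_bias(all_scores)


if __name__ == "__main__":
    # Hypothetical scores: the input set is balanced, but the ranking
    # places the positively scored items first.
    ranked = [1.0, 1.0, 0.0, -1.0, -1.0, 0.0]
    print(input_bias(ranked))
    print(output_bias(ranked, k=3))
    print(ranking_bias(ranked, ranked, k=3))
```

In this toy input the item set is balanced overall, so any bias in the top three results is attributed to the ordering rather than to the input data.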
A structured model metametadata technique to enhance semantic searching in metadata repository
This paper discusses a novel technique for semantic searching and retrieval of information about learning materials. A novel structured metametadata model has been created to provide the foundation for a semantic search engine to extract, match and map queries to retrieve relevant results. Metametadata encapsulate metadata instances by using the properties and attributes provided by ontologies rather than describing learning objects. The use of ontological views assists the pedagogical content of metadata extracted from learning objects by using the control vocabularies as identified from the metametadata taxonomy. The use of metametadata (based on the metametadata taxonomy) supported by the ontologies has contributed towards a novel semantic searching mechanism. This research has presented a metametadata model for identifying semantics and describing learning objects in finer-grain detail that allows for intelligent and smart retrieval by automated search and retrieval software.
Multilingual adaptive search for digital libraries
This paper describes a framework for Adaptive Multilingual Information Retrieval (AMIR) which allows multilingual resource discovery and delivery using on-the-fly machine translation of documents and queries. Result documents are presented to the user in a contextualised manner. Challenges and affordances of both Adaptive and Multilingual IR, with a particular focus on Digital Libraries, are detailed. The framework components are motivated by a series of results from experiments on query logs and documents from The European Library. We conclude that factoring adaptivity and multilinguality aspects into the search process can enhance the user's experience with online Digital Libraries.