Identifying Interesting Knowledge Factors from Big Data for Effective E-Market Prediction
Knowledge management plays an important role in disseminating valuable information and in improving organizational decision-making. Knowledge creation involves analyzing data and transforming information into knowledge. Data mining and predictive analytics contribute substantially to the creation of knowledge and to forecasting future outcomes. The ability to predict the performance of advertising campaigns can become an asset to advertisers. Tools like Google Analytics can capture user logs. Large amounts of information reside in those logs, ranging from visitor location and visitor flow through the website to the various actions a visitor performs after clicking an ad. This research is an effort to identify key knowledge factors in the marketing sector that can be further optimized for effective e-market prediction.
Cross Validation Of Neural Network Applications For Automatic New Topic Identification
There are recent studies in the literature on automatic topic-shift identification in Web search engine user sessions; however, most of this work applied its topic-shift identification algorithms to data logs from a single search engine. The purpose of this study is to cross-validate an artificial neural network application that automatically identifies topic changes in a Web search engine user session, using data logs from different search engines for training and testing the neural network. Sample data logs from the Norwegian search engine FAST (currently owned by Overture) and from Excite are used in this study. The findings suggest that it could be possible to identify topic shifts and continuations successfully in a particular search engine's user sessions using neural networks trained on a different search engine's data log.
A study of selection noise in collaborative web search
Collaborative Web search uses the past search behaviour (queries and selections) of a community of users to promote search results that are relevant to the community. The extent to which these promotions are likely to be relevant depends on how reliably past search behaviour can be captured. We consider this issue by analysing the results of collaborative Web search in circumstances where the behaviour of searchers is unreliable.
Semi-Automatic Query Expansion Approach to Web-Based Information Retrieval
The query used for Web searching is usually short and may not be able to reflect the intrinsic semantics of the user information need. The purpose of the paper is to take into account user information feedback, and to develop a semi-automatic query expansion approach to improve the effectiveness of Web searching. A search engine has been developed using the vector information retrieval model to validate the semi-automatic query expansion approach. The experiments show that this approach may improve the effectiveness of web searching
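The abstract above does not give the expansion procedure itself, so the following is only a minimal sketch of how feedback-driven, semi-automatic query expansion can work on top of a vector-space model. It assumes raw term-frequency vectors, cosine ranking, and a frequency-based choice of candidate terms; the function names (`suggest_terms`, `expand_query`, `rank`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_terms(query, relevant_docs, k=3):
    """Propose up to k expansion terms from documents the user marked relevant.

    Assumption: candidates are ranked by raw frequency in the feedback
    documents, with ties broken alphabetically.
    """
    query_terms = tf_vector(query)
    counts = Counter()
    for doc in relevant_docs:
        counts.update(tf_vector(doc))
    candidates = [(t, c) for t, c in counts.items() if t not in query_terms]
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))
    return [t for t, _ in candidates[:k]]


def expand_query(query, accepted_terms):
    """Append the terms the user accepted (the 'semi-automatic' step)."""
    if not accepted_terms:
        return query
    return query + " " + " ".join(accepted_terms)


def rank(query, docs):
    """Order documents by cosine similarity to the query, best first."""
    q = tf_vector(query)
    return sorted(docs, key=lambda d: cosine(q, tf_vector(d)), reverse=True)
```

In use, the system would show the output of `suggest_terms` to the user, pass only the accepted terms to `expand_query`, and re-run `rank` with the expanded query.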
Multi-Task Learning for Email Search Ranking with Auxiliary Query Clustering
User information needs vary significantly across different tasks, and
therefore their queries will also differ considerably in their expressiveness
and semantics. Many studies have been proposed to model such query diversity by
obtaining query types and building query-dependent ranking models. These
studies typically require either a labeled query dataset or clicks from
multiple users aggregated over the same document. These techniques, however,
are not applicable when manual query labeling is not viable, and aggregated
clicks are unavailable due to the private nature of the document collection,
e.g., in email search scenarios. In this paper, we study how to obtain query
type in an unsupervised fashion and how to incorporate this information into
query-dependent ranking models. We first develop a hierarchical clustering
algorithm based on truncated SVD and varimax rotation to obtain coarse-to-fine
query types. Then, we study three query-dependent ranking models, including two
neural models that leverage query type information as additional features, and
one novel multi-task neural model that views query type as the label for the
auxiliary query cluster prediction task. This multi-task model is trained to
simultaneously rank documents and predict query types. Our experiments on tens
of millions of real-world email search queries demonstrate that the proposed
multi-task model can significantly outperform the baseline neural ranking
models, which either do not incorporate query type information or just simply
feed query type as an additional feature.
Comment: CIKM 201
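The multi-task idea described above, ranking documents while also predicting the query cluster, can be illustrated with a combined objective. The abstract does not state the loss functions, so this sketch assumes softmax cross-entropy for both tasks and a hypothetical weighting parameter `aux_weight`; it is not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the target index."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(doc_scores, clicked_index, cluster_logits, cluster_label,
                   aux_weight=0.5):
    """Ranking loss plus a weighted auxiliary query-cluster prediction loss.

    Assumptions: a listwise softmax loss over the candidate documents with
    the clicked document as the target, and a fixed aux_weight.
    """
    rank_loss = cross_entropy(doc_scores, clicked_index)
    aux_loss = cross_entropy(cluster_logits, cluster_label)
    return rank_loss + aux_weight * aux_loss
```

Setting `aux_weight=0` reduces the objective to the ranking loss alone, which corresponds to a baseline that ignores query type.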
Intelligent recommendation system for e-learning platforms
As more and more digital resources become available, finding the appropriate document becomes harder.
Thus, a new kind of tool, able to recommend the most appropriate resources according to the user's needs,
becomes ever more necessary. The current project implements an intelligent recommendation system for e-learning
platforms. The recommendations are based, on the one hand, on the performance of the user during the
training process and, on the other hand, on the requests made by the user in the form of search queries. All
information necessary for the recommendation decision-making process will be represented in the user model.
This model will be updated throughout the target user's interaction with the platform.
SOTXTSTREAM: Density-based self-organizing clustering of text streams
A streaming data clustering algorithm is presented building upon the density-based self-organizing stream clustering algorithm SOSTREAM. Many density-based clustering algorithms are limited by their inability to identify clusters with heterogeneous density. SOSTREAM addresses this limitation through the use of local (nearest neighbor-based) density determinations. Additionally, many stream clustering algorithms use a two-phase clustering approach. In the first phase, a micro-clustering solution is maintained online, while in the second phase, the micro-clustering solution is clustered offline to produce a macro solution. By performing self-organization techniques on micro-clusters in the online phase, SOSTREAM is able to maintain a macro clustering solution in a single phase. Leveraging concepts from SOSTREAM, a new density-based self-organizing text stream clustering algorithm, SOTXTSTREAM, is presented that addresses several shortcomings of SOSTREAM. Gains in clustering performance of this new algorithm are demonstrated on several real-world text stream datasets.
Personalized web search using clickthrough data and web page rating
Personalized Web search carries out retrieval for each user by incorporating his or her interests. We propose a novel technique to construct a personalized information retrieval model from users' clickthrough data and Web page ratings. This model builds on user-based collaborative filtering and the top-N resource recommendation algorithm, and consists of three parts: the user profile, user-based collaborative filtering, and the personalized search model. First, we compute the user's preference score from the clicked sequence score and the Web page rating to construct the user profile. Then the model finds users similar to a given user with the user-based collaborative filtering algorithm and calculates the recommendable Web page score. Finally, personalized information retrieval is modeled for three cases (rating information from the user himself or herself; at least rating information from similar users; no rating information used). Experimental results indicate that our technique significantly improves search performance. © 2012 ACADEMY PUBLISHER
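The user-based collaborative filtering step described above can be sketched as follows. The abstract does not give the scoring formulas, so this example assumes that each user's preference scores are already available as a dictionary, that user similarity is cosine similarity, and that a page's predicted score is the similarity-weighted average of neighbours' scores. The names `cosine_sim`, `predict`, and `top_n` are hypothetical.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity between two users' page-score dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[p] * v[p] for p in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(prefs, user, page):
    """Similarity-weighted average of other users' scores for a page."""
    num = den = 0.0
    for other, scores in prefs.items():
        if other == user or page not in scores:
            continue
        sim = cosine_sim(prefs[user], scores)
        if sim <= 0:
            continue  # ignore users with no overlap with the target user
        num += sim * scores[page]
        den += sim
    return num / den if den else 0.0


def top_n(prefs, user, n=2):
    """Recommend up to n unseen pages with a positive predicted score."""
    seen = set(prefs[user])
    unseen = {p for scores in prefs.values() for p in scores} - seen
    scored = [(predict(prefs, user, p), p) for p in unseen]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))
    return [p for score, p in scored[:n] if score > 0]
```

A personalized search model could then blend these predicted scores with the search engine's own ranking; how the paper combines them is not stated in the abstract.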
Performance Analysis of Information Retrieval Systems
It has been shown that no single information retrieval system configuration works best for every query; rather, performance can vary from one query to another. It would be interesting if a meta-system could decide which system should process a new query by learning from the context of previously submitted queries. This paper reports a deep analysis of more than 80,000 search engine configurations applied to 100 queries, together with the corresponding performance. The goal of the analysis is to identify which search engine configuration responds best to a certain type of query. We considered two approaches to defining query types: one is based on clustering queries according to their performance (their difficulty), while the other uses various query features (including query difficulty predictors) to cluster queries. We identified two parameters that should be optimized first. An important outcome is that we could not obtain strong conclusive results; considering the large number of systems and methods we used, this result could lead to the conclusion that current query features do not fit the optimization problem.