
    Meta-evaluation of online and offline web search evaluation metrics

    As in most information retrieval (IR) studies, evaluation plays an essential part in Web search research. Both offline and online evaluation metrics are adopted in measuring the performance of search engines. Offline metrics are usually based on relevance judgments of query-document pairs from assessors, while online metrics exploit user behavior data, such as clicks, collected from search engines to compare search algorithms. Although both types of IR evaluation metrics have achieved success, the extent to which they predict user satisfaction remains under-investigated. To shed light on this research question, we meta-evaluate a series of existing online and offline metrics to study how well they infer actual search user satisfaction in different search scenarios. We find that both types of evaluation metrics significantly correlate with user satisfaction, while they reflect satisfaction from different perspectives for different search tasks. Offline metrics better align with user satisfaction in homogeneous search (i.e., ten blue links), whereas online metrics perform better when vertical results are federated. Finally, we also propose to incorporate mouse hover information into existing online evaluation metrics, and empirically show that the resulting metrics better align with search user satisfaction than click-based online metrics.

    Data-driven evaluation metrics for heterogeneous search engine result pages

    Evaluation metrics for search typically assume items are homogeneous. However, in the context of web search, this assumption does not hold. Modern search engine result pages (SERPs) are composed of a variety of item types (e.g., news, web, entity, etc.), and their influence on browsing behavior is largely unknown. In this paper, we perform a large-scale empirical analysis of popular web search queries and investigate how different item types influence how people interact on SERPs. We then infer a user browsing model given people’s interactions with SERP items – creating a data-driven metric based on item type. We show that the proposed metric leads to more accurate estimates of: (1) total gain, (2) total time spent, and (3) stopping depth – without requiring extensive parameter tuning or a priori relevance information. These results suggest that item heterogeneity should be accounted for when developing metrics for SERPs. While many open questions remain concerning the applicability and generalizability of data-driven metrics, they do serve as a formal mechanism to link observed user behaviors directly to how performance is measured. From this approach, we can draw new insights regarding the relationship between behavior and performance – and design data-driven metrics based on real user behavior rather than using metrics reliant on some hypothesized model of user browsing behavior.

    Masquerade Attack Detection Using a Search-Behavior Modeling Approach

    Masquerade attacks are unfortunately a familiar security problem that is a consequence of identity theft. Detecting masqueraders is very hard. Prior work has focused on user command modeling to identify abnormal behavior indicative of impersonation. This paper extends prior work by presenting one-class Hellinger distance-based and one-class SVM modeling techniques that use a set of novel features to reveal user intent. The specific objective is to model user search profiles and detect deviations indicating a masquerade attack. We hypothesize that each individual user knows their own file system well enough to search in a limited, targeted and unique fashion in order to find information germane to their current task. Masqueraders, on the other hand, will likely not know the file system and layout of another user's desktop, and would likely search more extensively and broadly in a manner that is different than the victim user being impersonated. We extend prior research that uses UNIX command sequences issued by users as the audit source by relying upon an abstraction of commands. We devise taxonomies of UNIX commands and Windows applications that are used to abstract sequences of user commands and actions. We also gathered our own normal and masquerader data sets captured in a Windows environment for evaluation. The datasets are publicly available for other researchers who wish to study masquerade attack rather than author identification as in much of the prior reported work. The experimental results show that modeling search behavior reliably detects all masqueraders with a very low false positive rate of 0.1%, far better than prior published results. The limited set of features used for search behavior modeling also results in huge performance gains over the same modeling techniques that use larger sets of features
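
The Hellinger distance named in the abstract above compares two discrete probability distributions. The following is a minimal sketch of that distance applied to per-category activity frequencies. The category names and the helper `to_distribution` are hypothetical illustrations; the abstract does not specify the paper's actual features, window sizes, or detection thresholds.

```python
import math
from collections import Counter


def to_distribution(events, categories):
    """Turn a sequence of event labels into relative frequencies.

    `categories` fixes the order of the output vector; events whose label
    is not listed in `categories` are ignored (an assumption of this sketch).
    """
    counts = Counter(events)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("no events fall into the given categories")
    return [counts[c] / total for c in categories]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q.

    H(p, q) = sqrt( 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical activity categories, for illustration only.
CATEGORIES = ["search", "edit", "browse"]

baseline = to_distribution(["search", "edit", "edit", "browse"], CATEGORIES)
observed = to_distribution(["search", "search", "search", "browse"], CATEGORIES)
score = hellinger(baseline, observed)  # larger values mean a larger deviation
```

In a one-class setting, a score like this would be compared against a threshold learned from the legitimate user's own history; how that threshold is chosen is left open here.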

    K-Nearest neighbor algorithm on implicit feedback to determine SOP

    Because a large number of Standard Operating Procedure (SOP) documents already exist, users often need time to find SOPs that fit their preferences. This calls for a recommendation system based on user content consumption, drawn from personalized usage logs, to support the establishment of SOP documents managed according to user preferences. The k-nearest neighbor (KNN) algorithm is used to identify the most relevant SOP document for the user by utilizing implicit feedback extracted by monitoring document search behavior. The research yielded 5 classifications as parameters, with a final 3:2 ratio that shows the best distance value with the majority of labels, following the KNN concept of looking at the nearest neighbors in the dataset. This shows the precision of applying the KNN algorithm in determining SOP documents according to user preferences based on implicit feedback, resulting in a share of 80% for SOPs corresponding to profiles and 20% for SOPs that do not fit the user profile. To produce more accurate results, the approach should be used in a broad SOP management system and should utilize implicit feedback with parameters drawn not only from search logs but also from performance evaluations.

    Human Swarm Interaction for Radiation Source Search and Localization

    This study shows that appropriate human interaction can benefit a swarm of robots to achieve goals more efficiently. A set of desirable features for human swarm interaction is identified based on the principles of swarm robotics. Human swarm interaction architecture is then proposed that has all of the desirable features. A swarm simulation environment is created that allows simulating a swarm behavior in an indoor environment. The swarm behavior and the results of user interaction are studied by considering radiation source search and localization application of the swarm. Particle swarm optimization algorithm is slightly modified to enable the swarm to autonomously explore the indoor environment for radiation source search and localization. The emergence of intelligence is observed that enables the swarm to locate the radiation source completely on its own. Proposed human swarm interaction is then integrated in a simulation environment and user evaluation experiments are conducted. Participants are introduced to the interaction tool and asked to deploy the swarm to complete the missions. The performance comparison of the user guided swarm to that of the autonomous swarm shows that the interaction interface is fairly easy to learn and that user guided swarm is more efficient in achieving the goals. The results clearly indicate that the proposed interaction helped the swarm achieve emergence

    News Session-Based Recommendations using Deep Neural Networks

    News recommender systems aim to personalize users' experiences and help them discover relevant articles from a large and dynamic search space. The news domain is a challenging scenario for recommendations, due to its sparse user profiling, fast-growing number of items, accelerated decay of item value, and dynamic shifts in user preferences. Some promising results have recently been achieved by applying Deep Learning techniques to Recommender Systems, especially for item feature extraction and for session-based recommendations with Recurrent Neural Networks. In this paper, an instantiation of CHAMELEON, a Deep Learning Meta-Architecture for News Recommender Systems, is proposed. This architecture is composed of two modules: the first is responsible for learning news article representations based on their text and metadata, and the second aims to provide session-based recommendations using Recurrent Neural Networks. The recommendation task addressed in this work is next-item prediction for user sessions: "what is the next most likely article a user might read in a session?" The context of user sessions is leveraged by the architecture to provide additional information in the extreme cold-start scenario of news recommendation. User behavior and item features are merged in a hybrid recommendation approach. A temporal offline evaluation method is also proposed as a complementary contribution, for a more realistic evaluation of this task, considering dynamic factors that affect global readership interests like popularity, recency, and seasonality.
Experiments with an extensive number of session-based recommendation methods were performed, and the proposed instantiation of the CHAMELEON meta-architecture obtained a significant relative improvement in top-n accuracy and ranking metrics (10% on Hit Rate and 13% on MRR) over the best benchmark methods. Comment: Accepted for the Third Workshop on Deep Learning for Recommender Systems - DLRS 2018, October 02-07, 2018, Vancouver, Canada. https://recsys.acm.org/recsys18/dlrs
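
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how Hit Rate@n and MRR@n are commonly computed for next-item prediction. It uses the standard textbook definitions; the function names and the toy data are assumptions for illustration, and the paper's own evaluation protocol (including its temporal offline setup) is not reproduced.

```python
def hit_rate_at_n(ranked_lists, targets, n):
    """Fraction of sessions whose true next item appears in the top-n list."""
    if len(ranked_lists) != len(targets) or not targets:
        raise ValueError("need one non-empty ranked list per target")
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:n])
    return hits / len(targets)


def mrr_at_n(ranked_lists, targets, n):
    """Mean reciprocal rank of the true next item, counting 0 if it is
    missing from the top-n list."""
    if len(ranked_lists) != len(targets) or not targets:
        raise ValueError("need one non-empty ranked list per target")
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top = ranked[:n]
        if target in top:
            total += 1.0 / (top.index(target) + 1)  # ranks are 1-based
    return total / len(targets)


# Toy example: three sessions, each with a ranked recommendation list
# and the article the user actually read next.
recommendations = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
actual_next = ["a", "e", "z"]

hr = hit_rate_at_n(recommendations, actual_next, n=3)
mrr = mrr_at_n(recommendations, actual_next, n=3)
```

Hit Rate only records whether the target was retrieved, while MRR also rewards placing it near the top, which is why the two are usually reported together.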

    Learning to Attend, Copy, and Generate for Session-Based Query Suggestion

    Users try to articulate their complex information needs during search sessions by reformulating their queries. To make this process more effective, search engines provide related queries to help users specify the information need in their search process. In this paper, we propose a customized sequence-to-sequence model for session-based query suggestion. In our model, we employ a query-aware attention mechanism to capture the structure of the session context. This enables us to control the scope of the session from which we infer the suggested next query, which helps not only to handle noisy data but also to automatically detect session boundaries. Furthermore, we observe that, based on user query reformulation behavior, within a single session a large portion of query terms is retained from the previously submitted queries and consists mostly of infrequent or unseen terms that are usually not included in the vocabulary. We therefore empower the decoder of our model to access the source words from the session context during decoding by incorporating a copy mechanism. Moreover, we propose evaluation metrics to assess the quality of generative models for query suggestion. We conduct an extensive set of experiments and analysis. The results suggest that our model outperforms the baselines both in generating queries and in scoring candidate queries for the task of query suggestion. Comment: Accepted to be published at The 26th ACM International Conference on Information and Knowledge Management (CIKM2017)

    VITALAS at TRECVID-2009

    This paper describes the participation of VITALAS in the TRECVID-2009 evaluation where we submitted runs for the High-Level Feature Extraction (HLFE) and Interactive Search tasks. For the HLFE task, we focus on the evaluation of low-level feature sets and fusion methods. The runs employ multiple low-level features based on all available modalities (visual, audio and text) and the results show that use of such features improves the retrieval effectiveness significantly. We also use a concept score fusion approach that achieves good results with reduced low-level feature vector dimensionality. Furthermore, a weighting scheme is introduced for cluster assignment in the "bag-of-words" approach. Our runs achieved good performance compared to a baseline run and the submissions of other TRECVID-2009 participants. For the Interactive Search task, we focus on the evaluation of the integrated VITALAS system in order to gain insights into the use and effectiveness of the system's search functionalities on (the combination of) multiple modalities and study the behavior of two user groups: professional archivists and non-professional users. Our analysis indicates that both user groups submit about the same total number of queries and use the search functionalities in a similar way, but professional users save twice as many shots and examine shots deeper in the ranked retrieved list. The agreement between the TRECVID assessors and our users was quite low. In terms of the effectiveness of the different search modalities, similarity searches retrieve on average twice as many relevant shots as keyword searches, fused searches three times as many, while concept searches retrieve even up to five times as many relevant shots, indicating the benefits of the use of robust concept detectors in multimodal video retrieval.
    High-Level Feature Extraction Runs
    1. A VITALAS.CERTH-ITI 1: Early fusion of all available low-level features.
    2. A VITALAS.CERTH-ITI 2: Concept score fusion for five low-level features and 100 concepts, text features and bag-of-words with color SIFT descriptor based on dense sampling.
    3. A VITALAS.CERTH-ITI 3: Concept score fusion for five low-level features and 100 concepts combined with text features.
    4. A VITALAS.CERTH-ITI 4: Weighting scheme for bag-of-words based on dense sampling of the color SIFT descriptor.
    5. A VITALAS.CERTH-ITI 5: Baseline run, bag-of-words based on dense sampling of the color SIFT descriptor.
    Interactive Search Runs
    1. vitalas 1: Interactive run by professional archivists
    2. vitalas 2: Interactive run by professional archivists
    3. vitalas 3: Interactive run by non-professional users
    4. vitalas 4: Interactive run by non-professional users

    Measuring Immediate Effect and Carry-over Effect of Multi-channel Online Ads

    Faced with various online ads, firms find it hard to choose the advertising channels with the best advertising effects. Online advertising has both immediate and carry-over effects. We constructed a comprehensive evaluation model of multi-channel online advertising effects that evaluates not only the immediate effect but also the carry-over effect, based on lag effect factors. We then conducted a restricted grid search and multiple linear regressions to estimate the immediate and carry-over effects of paid search ads, mobile phone message ads and e-mail ads, based on user behavior data and transaction data from an e-commerce website. The results show that the immediate effect intensity of paid search ads is the highest, the carry-over effect duration of e-mail ads is the longest, and the cumulative carry-over effect intensity of e-mail ads is the highest. This study puts forward suggestions on how to evaluate the effects of multi-channel online ads more accurately, which can guide this e-commerce website in devising a better advertising strategy for online marketing.

    Inferring Dynamic User Interests in Streams of Short Texts for User Clustering

    User clustering has been studied from different angles. In order to identify shared interests, behavior-based methods consider similar browsing or search patterns of users, whereas content-based methods use information from the contents of the documents visited by the users. So far, content-based user clustering has mostly focused on static sets of relatively long documents. Given the dynamic nature of social media, there is a need to dynamically cluster users in the context of streams of short texts. User clustering in this setting is more challenging than in the case of long documents, as it is difficult to capture the users’ dynamic topic distributions in sparse data settings. To address this problem, we propose a dynamic user clustering topic model (UCT). UCT adaptively tracks changes of each user’s time-varying topic distributions based both on the short texts the user posts during a given time period and on previously estimated distributions. To infer changes, we propose a Gibbs sampling algorithm where a set of word pairs from each user is constructed for sampling. UCT can be used in two ways: (1) as a short-term dependency model that infers a user’s current topic distribution based on the user’s topic distributions during the previous time period only, and (2) as a long-term dependency model that infers a user’s current topic distributions based on the user’s topic distributions during multiple time periods in the past. The clustering results are explainable and human-understandable, in contrast to many other clustering algorithms. For evaluation purposes, we work with a dataset consisting of users and tweets from each user. Experimental results demonstrate the effectiveness of our proposed short-term and long-term dependency user clustering models compared to state-of-the-art baselines