132 research outputs found

Differentiable Unbiased Online Learning to Rank

Author: de Rijke Maarten
Oosterhuis Harrie
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models, and their approaches do not extend well to non-linear models such as neural networks. We introduce a novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach, which relies on interleaving or multileaving and on extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks, and it can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods in terms of both learning speed and final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible. Comment: Conference on Information and Knowledge Management 201
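The idea of turning clicks into document pair preferences and a differentiable pairwise objective can be illustrated with a minimal sketch. This is not the paper's algorithm: the pair-inference rule, the unweighted logistic objective, the linear scorer, and the names `infer_pairs` and `pairwise_gradient` are all assumptions made for illustration, and the weighting that the abstract says makes the gradient unbiased is omitted.

```python
import math
from typing import List, Sequence, Tuple


def infer_pairs(clicks: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (preferred, other) position pairs inferred from one result list.

    Assumption (illustrative only): a clicked document is preferred over every
    unclicked document shown no lower than one position below the last click.
    """
    if not any(clicks):
        return []
    last_click = max(i for i, clicked in enumerate(clicks) if clicked)
    cutoff = min(last_click + 1, len(clicks) - 1)
    considered = range(cutoff + 1)
    return [
        (i, j)
        for i in considered if clicks[i]
        for j in considered if not clicks[j]
    ]


def pairwise_gradient(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    clicks: Sequence[bool],
) -> List[float]:
    """Gradient of sum over pairs of log sigmoid(s_i - s_j) for a linear scorer.

    Assumption: scores are s = w . x; a non-linear differentiable model would
    replace the feature difference with the difference of its score gradients.
    """
    scores = [sum(w * x for w, x in zip(weights, f)) for f in features]
    grad = [0.0] * len(weights)
    for i, j in infer_pairs(clicks):
        # d/dw log sigmoid(s_i - s_j) = (1 - sigmoid(s_i - s_j)) * (x_i - x_j)
        coeff = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
        for k in range(len(weights)):
            grad[k] += coeff * (features[i][k] - features[j][k])
    return grad
```

A gradient-ascent step would add a small multiple of the returned vector to `weights` after each interaction, so the model is updated from a single result list without sampling alternative rankers.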

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

Solutions to Detect and Analyze Online Radicalization: A Survey

Author: Correa Denzil
Sureka Ashish
Publication date: 21/01/2013

Online Radicalization (also called Cyber-Terrorism, Extremism, Cyber-Racism or Cyber-Hate) is widespread and has become a major and growing concern to society, governments and law enforcement agencies around the world. Research shows that various platforms on the Internet (which offer a low barrier to publishing content, allow anonymity, provide exposure to millions of users and enable very quick and widespread diffusion of messages) such as YouTube (a popular video sharing website), Twitter (an online micro-blogging service), Facebook (a popular social networking website), online discussion forums and the blogosphere are being misused for malicious intent. Such platforms are being used to form hate groups and racist communities, spread extremist agendas, incite anger or violence, promote radicalization, recruit members and create virtual organizations and communities. Automatic detection of online radicalization is a technically challenging problem because of the vast amount of data, unstructured and noisy user-generated content, dynamically changing content and adversarial behavior. Several solutions have been proposed in the literature to combat and counter cyber-hate and cyber-extremism. In this survey, we review solutions to detect and analyze online radicalization. We review 40 papers published at 12 venues from June 2003 to November 2011. We present a novel classification scheme to classify these papers. We analyze these techniques, perform trend analysis, discuss limitations of existing techniques and identify research gaps.

arXiv.org e-Print Archive

CiteSeerX

Probabilistic Multileave Gradient Descent

Author: K Hofmann
M Sanderson
O Chapelle
T Joachims
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Characterizing User Search Intent and Behavior for Click Analysis in Sponsored Search

Author: Ashkan Azin
Publication venue: 'University of Waterloo'
Publication date: 01/01/2013

Interpreting user actions to better understand their needs provides an important tool for improving information access services. In the context of organic Web search, considerable effort has been made to model user behavior and infer query intent, with the goal of improving the overall user experience. Much less work has been done in the area of sponsored search, i.e., with respect to the advertisement links (ads) displayed on search result pages by many commercial search engines. This thesis develops and evaluates new models and methods required to interpret user browsing and click behavior and understand query intent in this very different context. The initial part of the thesis is concerned with extending the query categories for commercial search and with inferring query intent, with a focus on two major tasks: i) enriching queries with contextual information obtained from search result pages returned for these queries, and ii) developing relatively simple methods for the reliable labeling of training data via crowdsourcing. A central idea of this thesis work is to study the impact of contextual factors (including query intent, ad placement, and page structure) on user behavior. Later, this information is incorporated into probabilistic models to evaluate the quality of advertisement links within the context in which they are displayed in their history of appearance. In order to account for these factors, a number of query and location biases are proposed and formulated into a group of browsing and click models. To explore user intent and behavior and to evaluate the performance of the proposed models and methods, logs of query and click information provided for research purposes are used. Overall, query intent is found to have a substantial impact on predictions of user click behavior in sponsored search. Predictions are further improved by considering ads in the context of the other ads displayed on a result page.
The parameters of the browsing and click models are learned using an expectation maximization technique applied to click signals recorded in the logs. The initial motivation of the user to browse the ad list and their browsing persistence are found to be related to query intent and browsing/click behavior. Accommodating these biases, along with the location bias, in user models provides effective contextual signals and improves the performance of the existing models.

University of Waterloo's Institutional Repository

Watching inside the Screen: Digital Activity Monitoring for Task Recognition and Proactive Information Retrieval

Author: Jacucci Giulio
Ruotsalo Tuukka
Vuong Tung
Publication date: 11/09/2017

We investigate to what extent it is possible to infer a user’s work tasks by digital activity monitoring and use the task models for proactive information retrieval. Ten participants volunteered for the study, in which their computer screen was monitored and related logs were recorded for 14 days. Corresponding diary entries were collected to provide ground truth for the task detection method. We report two experiments using this data. The unsupervised task detection experiment was conducted to detect tasks using unsupervised topic modeling. The results show an average task detection accuracy of more than 70% when using rich screen monitoring data. The single-trial task detection and retrieval experiment used unseen user inputs to detect related work tasks and retrieve task-relevant information on-line. We report an average task detection accuracy of 95%, and corresponding model-based document retrieval with a Normalized Discounted Cumulative Gain of 98%. We discuss and provide insights regarding the types of digital tasks occurring in the data, the accuracy of task detection on different task types, and the role of different data inputs such as application names, extracted keywords, and bag-of-words representations in the task detection process. We also discuss the implications of our results for ubiquitous user modeling and privacy. Peer reviewed
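For readers unfamiliar with the retrieval metric quoted above, Normalized Discounted Cumulative Gain can be computed as in the following minimal sketch. The abstract does not say which NDCG variant or cutoff was used; the linear gain, the log2 rank discount, and the function names `dcg` and `ndcg` are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevance labels given in ranked order.

    Assumption: linear gain with a log2(rank + 1) discount, ranks starting at 1.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the given ranking divided by the DCG of the ideal reordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Under this definition a ranking that already lists documents in decreasing order of relevance scores 1.0, so a value of 98% indicates a ranking close to the ideal ordering of the judged documents.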

Helsingin yliopiston digitaalinen arkisto