    Controlling Fairness and Bias in Dynamic Learning-to-Rank

    Rankings are the primary interface through which many online platforms match users to items (e.g. news, products, music, video). In these two-sided markets, not only do the users draw utility from the rankings, but the rankings also determine the utility (e.g. exposure, revenue) for the item providers (e.g. publishers, sellers, artists, studios). It has already been noted that myopically optimizing utility to the users, as done by virtually all learning-to-rank algorithms, can be unfair to the item providers. We therefore present a learning-to-rank approach for explicitly enforcing merit-based fairness guarantees to groups of items (e.g. articles by the same publisher, tracks by the same artist). In particular, we propose a learning algorithm that ensures notions of amortized group fairness, while simultaneously learning the ranking function from implicit feedback data. The algorithm takes the form of a controller that integrates unbiased estimators for both fairness and utility, dynamically adapting both as more data becomes available. In addition to its rigorous theoretical foundation and convergence guarantees, we find empirically that the algorithm is highly practical and robust. Comment: First two authors contributed equally. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval 202
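
    The controller idea can be illustrated with a small simulation. This is a simplified sketch, not the authors' implementation: it assumes the true relevances are known (the paper instead estimates them from implicit feedback with unbiased estimators, which is omitted here), it assumes a logarithmic position-bias model, and the names `exposure_at`, `simulate` and `lam` are hypothetical. Each round, items whose group has received less exposure per unit of merit than the most-served group get a proportional boost to their ranking score.

```python
import math


def exposure_at(rank):
    """Assumed position-bias model: attention decays logarithmically (rank is 0-indexed)."""
    return 1.0 / math.log2(rank + 2)


def simulate(relevance, group, lam, steps):
    """Rank the items `steps` times; return the per-step gap in exposure per unit merit."""
    groups = sorted(set(group))
    size = {g: group.count(g) for g in groups}
    # Merit of a group: average relevance of its items (assumed known in this sketch).
    merit = {
        g: sum(r for r, gi in zip(relevance, group) if gi == g) / size[g]
        for g in groups
    }
    exposure = {g: 0.0 for g in groups}  # cumulative average exposure per group

    for _ in range(steps):
        ratio = {g: exposure[g] / merit[g] for g in groups}
        most_served = max(ratio.values())
        # Proportional control: boost items whose group lags the most-served group.
        score = [r + lam * (most_served - ratio[g]) for r, g in zip(relevance, group)]
        ranking = sorted(range(len(score)), key=lambda i: -score[i])
        for rank, i in enumerate(ranking):
            exposure[group[i]] += exposure_at(rank) / size[group[i]]

    final = [exposure[g] / merit[g] for g in groups]
    return (max(final) - min(final)) / steps


if __name__ == "__main__":
    relevance = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]  # toy values, not from the paper
    group = list("AAABBB")
    print("no control:  ", simulate(relevance, group, lam=0.0, steps=200))
    print("with control:", simulate(relevance, group, lam=1.0, steps=200))
```

    With `lam=0.0` the ranking is purely relevance-sorted and the gap stays constant per step; with a positive `lam` the accumulated gap stays bounded, so the per-step gap shrinks as the number of rounds grows.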

    How simple rules determine pedestrian behavior and crowd disasters

    With the increasing size and frequency of mass events, the study of crowd disasters and the simulation of pedestrian flows have become important research areas. Yet, even successful modeling approaches such as those inspired by Newtonian force models are still not fully consistent with empirical observations and are sometimes hard to calibrate. Here, a novel cognitive science approach is proposed, which is based on behavioral heuristics. We suggest that, guided by visual information, namely the distance of obstructions in candidate lines of sight, pedestrians apply two simple cognitive procedures to adapt their walking speeds and directions. While simpler than previous approaches, this model predicts individual trajectories and collective patterns of motion in good quantitative agreement with a large variety of empirical and experimental data. This includes the emergence of self-organization phenomena, such as the spontaneous formation of unidirectional lanes or stop-and-go waves. Moreover, the combination of pedestrian heuristics with body collisions generates crowd turbulence at extreme densities, a phenomenon that has been observed during recent crowd disasters. By proposing an integrated treatment of simultaneous interactions between multiple individuals, our approach overcomes limitations of current physics-inspired pair interaction models. Understanding crowd dynamics through cognitive heuristics is therefore crucial not only for better preparation of safe mass events; it also clears the way for more realistic modeling of collective social behaviors, in particular of human crowds and biological swarms. Furthermore, our behavioral heuristics may serve to improve the navigation of autonomous robots. Comment: Article accepted for publication in PNA
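
    The abstract does not spell out the two procedures, so the following is only one plausible formalization, offered as a sketch. It assumes a direction rule that picks the candidate line of sight leaving the pedestrian closest to a target point at a fixed visual horizon, and a speed rule that keeps at least one relaxation time of travel to the nearest obstruction ahead. All function names, parameters and numeric values are hypothetical.

```python
import math


def choose_direction(obstacle_distance, target_angle, horizon, candidates):
    """Pick the candidate angle that minimizes the remaining distance to the target.

    obstacle_distance: callable mapping an angle (radians) to the distance of the
        first obstruction along that line of sight (math.inf if unobstructed).
    target_angle: direction of the destination (radians).
    horizon: maximum distance the pedestrian looks ahead.
    """
    def remaining_sq(alpha):
        # Walk along alpha until the obstruction or the horizon, whichever is nearer,
        # then measure (by the law of cosines) how far the target point still is.
        walk = min(obstacle_distance(alpha), horizon)
        return horizon**2 + walk**2 - 2 * horizon * walk * math.cos(target_angle - alpha)

    return min(candidates, key=remaining_sq)


def choose_speed(free_speed, distance_ahead, relaxation_time):
    """Slow down so the obstruction ahead is at least one relaxation time away."""
    return min(free_speed, distance_ahead / relaxation_time)


if __name__ == "__main__":
    # Toy scene: an obstacle 1 m away blocks a narrow cone straight ahead.
    def blocked_ahead(alpha):
        return 1.0 if abs(alpha) < 0.3 else math.inf

    angles = [-0.8, -0.4, 0.0, 0.4, 0.8]
    print(choose_direction(blocked_ahead, 0.0, 10.0, angles))
    print(choose_speed(free_speed=1.3, distance_ahead=0.5, relaxation_time=0.5))
```

    In the toy scene the pedestrian sidesteps the blocked cone by the smallest available deviation instead of walking straight into the obstacle, and reduces speed when the free distance ahead is short.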

    Quantifying Social Influence in an Online Cultural Market

    We revisit experimental data from an online cultural market in which 14,000 users interact to download songs, and develop a simple model that can explain seemingly complex outcomes. Our results suggest that individual behavior is characterized by a two-step process: the decision to sample and the decision to download a song. Contrary to conventional wisdom, social influence is material to the first step only. The model also identifies the role of placement in mediating social signals, and suggests that in this market with anonymous feedback cues, social influence serves an informational rather than normative role.

    Anyone Can Become a Troll: Causes of Trolling Behavior in Online Discussions

    In online communities, antisocial behavior such as trolling disrupts constructive discussion. While prior work suggests that trolling behavior is confined to a vocal and antisocial minority, we demonstrate that ordinary people can engage in such behavior as well. We propose two primary trigger mechanisms: the individual's mood, and the surrounding context of a discussion (e.g., exposure to prior trolling behavior). Through an experiment simulating an online discussion, we find that both negative mood and seeing troll posts by others significantly increase the probability of a user trolling, and together double this probability. To support and extend these results, we study how these same mechanisms play out in the wild via a data-driven, longitudinal analysis of a large online news discussion community. This analysis reveals temporal mood effects, and explores long-range patterns of repeated exposure to trolling. A predictive model of trolling behavior shows that mood and discussion context together can explain trolling behavior better than an individual's history of trolling. These results combine to suggest that ordinary people can, under the right circumstances, behave like trolls. Comment: Best Paper Award at CSCW 201

    Quantum Breaking of Elastic String

    The breaking of an atomic chain under stress is a collective many-particle tunneling phenomenon. We study classical dynamics in imaginary time using a conformal mapping technique, and derive an analytic formula for the probability of breaking. The result covers a broad temperature interval and interpolates between two regimes: tunneling and thermal activation. We also consider breaking induced by an ultrasonic wave propagating in the chain, and propose to observe it in an STM experiment. Comment: 8 pages, RevTeX 3.0, Landau Institute preprint 261/643

    Bias reduction in traceroute sampling: towards a more accurate map of the Internet

    Traceroute sampling is an important technique in exploring the Internet router graph and the autonomous system graph. Although it is one of the primary techniques used in calculating statistics about the Internet, it can introduce bias that corrupts these estimates. This paper reports on a theoretical and experimental investigation of a new technique to reduce the bias of traceroute sampling when estimating the degree distribution. We develop a new estimator for the degree of a node in a traceroute-sampled graph; validate the estimator theoretically in Erdős-Rényi graphs and, through computer experiments, for a wider range of graphs; and apply it to produce a new picture of the degree distribution of the autonomous system graph. Comment: 12 pages, 3 figures
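
    The source of the bias can be illustrated with a toy experiment. The sketch below does not implement the paper's estimator; it only shows the problem the estimator addresses, under the idealizing assumption that traceroute from a single monitor reveals a breadth-first shortest-path tree. The function names and the graph parameters are hypothetical. Because only tree edges are observed, every node's observed degree is at most its true degree.

```python
import random
from collections import deque


def erdos_renyi(n, p, rng):
    """Undirected G(n, p) random graph as an adjacency dict of sets."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_tree_degrees(adj, source):
    """Degree of each reached node in the shortest-path tree rooted at `source`.

    This stands in for an idealized single-monitor traceroute sample.
    """
    observed = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in observed:
                observed[v] = 1   # the edge back to its parent u
                observed[u] += 1  # the same edge, seen from the parent
                queue.append(v)
    return observed


if __name__ == "__main__":
    graph = erdos_renyi(200, 0.05, random.Random(0))
    seen = bfs_tree_degrees(graph, source=0)
    true_mean = sum(len(graph[v]) for v in seen) / len(seen)
    seen_mean = sum(seen.values()) / len(seen)
    print(f"mean true degree {true_mean:.2f}, mean observed degree {seen_mean:.2f}")
```

    A tree on m nodes has m - 1 edges, so the mean observed degree is always below 2 regardless of how dense the underlying graph is, which is why naive degree statistics from such samples can be badly distorted.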

    Implementation of Web-Based Respondent-Driven Sampling among Men who Have Sex with Men in Vietnam

    Objective: Lack of representative data about hidden groups, like men who have sex with men (MSM), hinders an evidence-based response to the HIV epidemic. Respondent-driven sampling (RDS) was developed to overcome sampling challenges in studies of populations like MSM for which sampling frames are absent. Internet-based RDS (webRDS) can potentially circumvent limitations of the original RDS method. We aimed to implement and evaluate webRDS among a hidden population. Methods and Design: This cross-sectional study took place from 18 February to 12 April 2011 among MSM in Vietnam. Eligible participants were men aged 18 and above who had ever had sex with another man and were living in Vietnam. Participants were invited by an MSM friend, logged in, and answered a survey. Participants could recruit up to four MSM friends. We evaluated the system by its success in generating sustained recruitment and the degree to which the sample compositions stabilized with increasing sample size. Results: Twenty starting participants generated 676 participants over 24 recruitment waves. Analyses did not show evidence of bias due to ineligible participation. Estimated mean age was 22 years and 82% came from the two large metropolitan areas. Thirty-two of the 63 provinces were represented. The median number of sexual partners during the last six months was two. The sample composition stabilized well for 16 out of 17 variables. Conclusion: Results indicate that webRDS could be implemented at a low cost among Internet-using MSM in Vietnam. WebRDS may be a promising method for sampling of Internet-using MSM and other hidden groups. Key words: Respondent-driven sampling, Online sampling, Men who have sex with men, Vietnam, Sexual risk behavio
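
    One way to check whether a sample composition stabilizes as recruitment waves accumulate is sketched below. The abstract does not state the criterion used in the study, so the rule here (the cumulative proportion over the last few waves stays within a tolerance of the final value) is an assumption, as are the function names, the default thresholds and the made-up toy records. The sketch also uses raw sample proportions rather than RDS-weighted estimates.

```python
def cumulative_proportions(records):
    """Cumulative share of participants with a trait after each recruitment wave.

    records: iterable of (wave, has_trait) pairs, one per participant.
    """
    by_wave = {}
    for wave, has_trait in records:
        count, hits = by_wave.get(wave, (0, 0))
        by_wave[wave] = (count + 1, hits + int(has_trait))

    total = hits_so_far = 0
    proportions = []
    for wave in sorted(by_wave):
        count, hits = by_wave[wave]
        total += count
        hits_so_far += hits
        proportions.append(hits_so_far / total)
    return proportions


def stabilized(proportions, tolerance=0.05, last_waves=3):
    """Assumed criterion: the last few cumulative values stay near the final one."""
    if len(proportions) < last_waves:
        return False
    final = proportions[-1]
    return all(abs(p - final) <= tolerance for p in proportions[-last_waves:])


if __name__ == "__main__":
    # Made-up toy data: four seeds who all share the trait, then mixed waves.
    records = [(0, True)] * 4
    for wave in range(1, 6):
        records += [(wave, True), (wave, True), (wave, False), (wave, False)]
    trend = cumulative_proportions(records)
    print([round(p, 3) for p in trend], stabilized(trend))
```

    In the toy data the seeds are unrepresentative, and the cumulative proportion drifts away from their composition before levelling off, which is the pattern a stabilization check is meant to detect.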
