134 research outputs found

Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization

Authors: Bhagat, Smriti; Ioannidis, Stratis; Taft, Nina; Weinsberg, Udi
Publication date: 30/07/2014

Recommender systems leverage user demographic information, such as age and gender, to personalize recommendations and better place their targeted ads. Oftentimes, users do not volunteer this information due to privacy concerns, or due to a lack of initiative in filling out their online profiles. We illustrate a new threat in which a recommender learns private attributes of users who do not voluntarily disclose them. We design both passive and active attacks that solicit ratings for strategically selected items, and could thus be used by a recommender system to pursue this hidden agenda. Our methods are based on a novel usage of Bayesian matrix factorization in an active learning setting. Evaluations on multiple datasets illustrate that such attacks are indeed feasible and use significantly fewer rated items than static inference methods. Importantly, they succeed without sacrificing the quality of recommendations to users. Comment: This is the extended version of a paper that appeared in ACM RecSys 201

arXiv.org e-Print Archive

CiteSeerX

ShutUp: End-to-End Containment of Unwanted Traffic

Authors: Guha, Saikat; Taft, Nina
Publication date: 10/07/2008

While the majority of Denial-of-Service (DoS) defense proposals assume a purely infrastructure-based architecture, some recent proposals suggest that the attacking endhost may be enlisted as part of the solution, through tamper-proof software, network-imposed incentives, or user altruism. While intriguing, these proposals ultimately raise the deployment bar by requiring both the infrastructure and endhosts to cooperate. In this paper, we explore the design of a pure end-to-end architecture based on tamper-proof endhost software implemented, for instance, with trusted platforms and virtual machines. We present the design of a "Shutup Service", whereby the recipient of unwanted traffic can ask the sender to slow down or stop. We show that this service is effective in stopping DoS attacks, and in significantly slowing down other types of unwanted traffic such as worms. The Shutup service is incrementally deployable with buy-in from OS or antivirus vendors, requiring only minimal changes to the endhost software stack and no changes to the protocol stack. We show through experimentation that the service is effective and has little impact on legitimate traffic.

eCommons@Cornell

Private Decayed Sum Estimation under Continual Observation

Authors: Bolot, Jean; Fawaz, Nadia; Muthukrishnan, S.; Nikolov, Aleksandar; Taft, Nina
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/03/2012

In monitoring applications, recent data is more important than distant data. How does this affect privacy of data analysis? We study a general class of data analyses, computing predicate sums, with privacy. Formally, we study the problem of estimating predicate sums privately, for sliding windows (and other well-known decay models of data, i.e., exponential and polynomial decay). We extend the recently proposed continual privacy model of Dwork et al. We present algorithms for decayed sum which are ε-differentially private, and are accurate. For window and exponential decay sums, our algorithms are accurate up to additive 1/ε and polylog terms in the range of the computed function; for polynomial decay sums, which are technically more challenging because partial solutions do not compose easily, our algorithms incur additional relative error. Further, we show lower bounds, tight within polylog factors and tight with respect to the dependence on the probability of error.
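For readers unfamiliar with the three decay models named in this abstract, the sketch below computes each kind of decayed sum over a stream of 0/1 predicate values. It is a plain, non-private illustration only: it adds no noise and is not the algorithm from the paper. The function names and the exact weighting conventions (weight alpha**age for exponential decay, (age + 1)**(-c) for polynomial decay, with age 0 for the newest item) are assumptions based on the common textbook definitions, not details taken from this listing.

```python
def window_sum(bits, width):
    """Sum of the most recent `width` values (sliding-window model)."""
    start = max(0, len(bits) - width)  # avoids the bits[-0:] pitfall when width == 0
    return sum(bits[start:])


def exponential_decay_sum(bits, alpha):
    """Weight a value of age a (0 = newest) by alpha**a, assuming 0 < alpha < 1."""
    return sum(b * alpha ** age for age, b in enumerate(reversed(bits)))


def polynomial_decay_sum(bits, c):
    """Weight a value of age a (0 = newest) by (a + 1)**(-c), assuming c > 0."""
    return sum(b / (age + 1) ** c for age, b in enumerate(reversed(bits)))


if __name__ == "__main__":
    stream = [1, 0, 1, 1]  # oldest value first, newest last
    print(window_sum(stream, 2))
    print(exponential_decay_sum(stream, 0.5))
    print(polynomial_decay_sum(stream, 1))
```

In all three models older values contribute less (or nothing) to the result, which is the property the abstract's privacy analysis builds on.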

arXiv.org e-Print Archive

Crossref

Privacy Tradeoffs in Predictive Analytics

Authors: Bhagat, Smriti; Fawaz, Nadia; Ioannidis, Stratis; Montanari, Andrea; Taft, Nina; Weinsberg, Udi
Publication date: 01/01/2014

Online services routinely mine user data to predict user preferences, make recommendations, and place targeted ads. Recent research has demonstrated that several private user attributes (such as political affiliation, sexual orientation, and gender) can be inferred from such data. Can a privacy-conscious user benefit from personalization while simultaneously protecting her private attributes? We study this question in the context of a rating prediction service based on matrix factorization. We construct a protocol of interactions between the service and users that has remarkable optimality properties: it is privacy-preserving, in that no inference algorithm can succeed in inferring a user's private attribute with a probability better than random guessing; it has maximal accuracy, in that no other privacy-preserving protocol improves rating prediction; and, finally, it involves a minimal disclosure, as the prediction accuracy strictly decreases when the service reveals less information. We extensively evaluate our protocol using several rating datasets, demonstrating that it successfully blocks the inference of gender, age and political affiliation, while incurring less than 5% decrease in the accuracy of rating prediction. Comment: Extended version of the paper appearing in SIGMETRICS 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Analysis of OD Flows (Raw Data)

Authors: Crovella, Mark; Diot, Christophe; Kolaczyk, Eric D.; Lakhina, Anukool; Papagiannaki, Konstantina; Taft, Nina
Publication venue: Boston University Computer Science Department
Publication date: 20/11/2003

In a recent paper, Structural Analysis of Network Traffic Flows, we analyzed the set of Origin Destination traffic flows from the Sprint-Europe and Abilene backbone networks. This report presents the complete set of results from analyzing data from both networks. The results in this report are specific to the Sprint-1 and Abilene datasets studied in the above paper. The following results are presented here, each for the Sprint-1 dataset and the Abilene dataset:

1. Rows of Principal Matrix (V)
2. Set of Eigenflows
3. Classifying Eigenflows

Funding: Centre National de la Recherche Scientifique (CNRS) France; Sprint Labs; Office of Naval Research (N000140310043); National Science Foundation (ANI-9986397, CCR-0325701

Boston University Institutional Repository (OpenBU)

Impact of IT Monoculture on Behavioral End Host Intrusion Detection

Authors: Barman, Dhiman; Chandrashekar, Jaideep; Faloutsos, Michalis; Giroire, Frédéric; Huang, Lim; Taft, Nina
Publication venue: HAL CCSD

In this paper, we study the impact of today's IT policies, defined based upon a monoculture approach, on the performance of endhost anomaly detectors. This approach leads to the uniform configuration of host intrusion detection systems (HIDS) across all hosts in an enterprise network. We assess the performance impact this policy has from the individual's point of view by analyzing network traces collected from 350 enterprise users. We uncover a great deal of diversity in the user population in terms of the "tail" behavior, i.e., the component which matters for anomaly detection systems. We demonstrate that the monoculture approach to HIDS configuration results in users that experience wildly different false positive and false negative rates. We then introduce new policies, based upon leveraging this diversity, and show that not only do they dramatically improve performance for the vast majority of users, but they also reduce the number of false positives arriving in centralized IT operation centers, and can reduce attack strength.

HAL-UNICE

HostView: Annotating end-host performance measurements with user feedback

Authors: Chandrashekar, Jaideep; Taft, Nina; Teixeira, Renata; Zeaiter Joumblatt, Diana
Publication venue: HAL CCSD
Publication date: 01/01/2010

Network disruptions can adversely impact a user's web browsing, cause video/audio interruptions, or render web sites and services unreachable. Such problems are frustrating to Internet users, who are oblivious to the underlying problems, but completely exposed to the service degradations. Ideally, users' end systems would have diagnostic tools that can automatically detect, diagnose, and possibly repair performance degradations. Hopefully, this can be done without user intervention. Clearly, the first step for any such (end-host) diagnostic tool is a methodology to automatically detect performance degradations in the network that can affect a user's perception of application performance.

CiteSeerX

INRIA a CCSD electronic archive server

Performance of Networked Applications: The Challenges in Capturing the User's Perception

Authors: Chandrashekar, Jaideep; Taft, Nina; Teixeira, Renata; Zeaiter Joumblatt, Diana
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2011

There is much interest recently in doing automated performance diagnosis on user laptops or desktops. One interesting aspect of performance diagnosis that has received little attention is the user perspective on performance. To conduct research on both end-host performance diagnosis and user perception of network and application performance, we designed an end-host data collection tool, called HostView. HostView not only collects network, application and machine level data, but also gathers feedback directly from users. User feedback is obtained via two mechanisms, a system-triggered questionnaire and a user-triggered feedback form, that, for example, ask users to rate the performance of their network and applications. In this paper, we describe our experience with the first deployment of HostView. Using data from 40 users, we illustrate the diversity of our users, articulate the challenges in this line of research, and report on initial findings in correlating user data to system-level data.

CiteSeerX

INRIA a CCSD electronic archive server