    Privacy Tradeoffs in Predictive Analytics

    Online services routinely mine user data to predict user preferences, make recommendations, and place targeted ads. Recent research has demonstrated that several private user attributes (such as political affiliation, sexual orientation, and gender) can be inferred from such data. Can a privacy-conscious user benefit from personalization while simultaneously protecting her private attributes? We study this question in the context of a rating prediction service based on matrix factorization. We construct a protocol of interactions between the service and users that has remarkable optimality properties: it is privacy-preserving, in that no inference algorithm can succeed in inferring a user's private attribute with a probability better than random guessing; it has maximal accuracy, in that no other privacy-preserving protocol improves rating prediction; and, finally, it involves a minimal disclosure, as the prediction accuracy strictly decreases when the service reveals less information. We extensively evaluate our protocol using several rating datasets, demonstrating that it successfully blocks the inference of gender, age, and political affiliation, while incurring a decrease of less than 5% in the accuracy of rating prediction. Comment: Extended version of the paper appearing in SIGMETRICS 201

    Lexical information from a minimalist point of view

    Simplicity as a methodological orientation applies to linguistic theory just as to any other field of research: ‘Occam’s razor’ is the label for the basic heuristic maxim according to which an adequate analysis must ultimately be reduced to indispensable specifications. In this sense, conceptual economy has been a strict and stimulating guideline in the development of Generative Grammar from the very beginning. Halle’s (1959) argument discarding the level of taxonomic phonemics in order to unify two otherwise separate phonological processes is an early characteristic example; a more general notion is that of an evaluation metric introduced in Chomsky (1957, 1975), which relates the relative simplicity of alternative linguistic descriptions systematically to the quest for explanatory adequacy of the theory underlying the descriptions to be evaluated. Further proposals along these lines include the theory of markedness developed in Chomsky and Halle (1968), Kean (1975, 1981), and others; the notion of underspecification proposed, e.g., in Archangeli (1984) and Farkas (1990); and the concept of default values and related notions. An important step promoting this general orientation was the idea of Principles and Parameters developed in Chomsky (1981, 1986), which reduced the notion of language-particular rule systems to universal principles, subject merely to parametrization with restricted options, largely related to properties of particular lexical items. On this account, the notion of a simplicity metric is to be dispensed with, as competing analyses of relevant data are now supposed to be essentially excluded by the restrictive system of principles