534 research outputs found

Learning a mixture of two multinomial logits

Author: Chierichetti Flavio
Kumar Ravi
Tomkins Andrew
Publication date: 01/01/2018

The classical Multinomial Logit (MNL) is a behavioral model for user choice. In this model, a user is offered a slate of choices (a subset of a finite universe of n items), and selects exactly one item from the slate, each with probability proportional to its (positive) weight. Given a set of observed slates and choices, the likelihood-maximizing item weights are easy to learn at scale, and easy to interpret. However, the model fails to represent common real-world behavior. As a result, researchers in user choice often turn to mixtures of MNLs, which are known to approximate a large class of models of rational user behavior. Unfortunately, the only known algorithms for this problem have been heuristic in nature. In this paper we give the first polynomial-time algorithms for exact learning of uniform mixtures of two MNLs. Interestingly, the parameters of the model can be learned for any n by sampling the behavior of random users only on slates of sizes 2 and 3; in contrast, we show that slates of size 2 are insufficient by themselves
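As a minimal sketch of the choice rule described in this abstract (not the authors' learning algorithm), the following computes the probability of each item being chosen from a slate under a single MNL, and under a uniform mixture of two MNLs. The function names and the dictionary representation of weights are assumptions made for illustration.

```python
from typing import Dict, Iterable


def mnl_choice_probabilities(
    weights: Dict[str, float], slate: Iterable[str]
) -> Dict[str, float]:
    """Probability of choosing each item in `slate` under one MNL.

    Each item is chosen with probability proportional to its positive weight,
    normalized over the items actually offered in the slate.
    """
    items = list(slate)
    total = sum(weights[item] for item in items)
    return {item: weights[item] / total for item in items}


def uniform_mixture_probabilities(
    weights_a: Dict[str, float],
    weights_b: Dict[str, float],
    slate: Iterable[str],
) -> Dict[str, float]:
    """Choice probabilities under a uniform (50/50) mixture of two MNLs."""
    items = list(slate)
    p_a = mnl_choice_probabilities(weights_a, items)
    p_b = mnl_choice_probabilities(weights_b, items)
    return {item: 0.5 * (p_a[item] + p_b[item]) for item in items}
```

For example, with hypothetical weights `{"a": 1.0, "b": 3.0}`, the slate `["a", "b"]` gives item `a` a choice probability of 1 / (1 + 3).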

Archivio della ricerca- Università di Roma La Sapienza

On the power laws of language: word frequency distributions

Author: Chierichetti Flavio
Kumar Ravi
Pang Bo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

About eight decades ago, Zipf postulated that the word frequency distribution of languages is a power law, i.e., it is a straight line on a log-log plot. Over the years, this phenomenon has been documented and studied extensively. For many corpora, however, the empirical distribution barely resembles a power law: when plotted on a log-log scale, the distribution is concave and appears to be composed of two differently sloped straight lines joined by a smooth curve. A simple generative model is proposed to capture this phenomenon. The word frequency distributions produced by this model are shown to match the observations both analytically and empirically.

Archivio della ricerca- Università di Roma La Sapienza

Discrete choice, permutations, and reconstruction

Author: Chierichetti Flavio
Kumar Ravi
Tomkins Andrew
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2018

In this paper we study the well-known family of Random Utility Models, developed over 50 years ago to codify rational user behavior in choosing one item from a finite set of options. In this setting each user draws i.i.d. from some distribution a utility function mapping each item in the universe to a real-valued utility. The user is then offered a subset of the items, and selects the one of maximum utility. A Max-Dist oracle for this choice model takes any subset of items and returns the probability (over the distribution of utility functions) that each will be selected. A discrete choice algorithm, given access to a Max-Dist oracle, must return a function that approximates the oracle. We show three primary results. First, we show that any algorithm exactly reproducing the oracle must make exponentially many queries. Second, we show an equivalent representation of the distribution over utility functions, based on permutations, and show that if this distribution has support size $k$, then it is possible to approximate the oracle using $O(nk)$ queries. Finally, we consider settings in which the subset of items is always small. We give an algorithm that makes less than $n^{(1-\epsilon/2)K}$ queries, each to sets of size at most $(1-\epsilon/2)K$, in order to approximate the Max-Dist oracle on every set of size $|T| \le K$ with statistical error at most $\epsilon$. In contrast, we show that any algorithm that queries for subsets of size $2^{O(\sqrt{\log n})}$ must make maximal statistical error on some large sets

Crossref

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

Towards empowerment through-out citizen participation. Triggers in tangible and intangible resources for a sense of belonging

Author: Chierichetti Nicolò
Publication venue: 'Academia.edu'
Publication date: 01/01/2022

Archivio istituzionale della ricerca - Politecnico di Milano

Descortesía en las páginas de Facebook de festivales de música

Author: Chierichetti Luisa
Publication date: 01/01/2014

This article focuses on the interactions that take place on Facebook (FB), the well-known social networking site that is also a resource for tourism communication and promotion. We first characterize this specific sociocultural context, which encompasses behaviors, attitudes, and values that are known, accepted, and practiced in a discourse community, and then describe the phenomenon of impoliteness in a limited corpus of Facebook pages of music festivals, offering some reflections on its features and functions

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

DIALNET

Fair Clustering Through Fairlets

Author: Chierichetti Flavio
Kumar Ravi
Lattanzi Silvio
Vassilvitskii Sergei
Publication date: 01/01/2017

We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the k-center and the k-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions: for instance, a point may no longer be assigned to its nearest cluster center! En route we introduce the concept of fairlets, which are minimal sets that satisfy fair representation while approximately preserving the clustering objective. We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms. While finding good fairlets can be NP-hard, we proceed to obtain efficient approximation algorithms based on minimum cost flow. We empirically quantify the value of fair clustering on real-world datasets with sensitive attributes

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Motif counting beyond five nodes

Author: Bressan Marco
Chierichetti Flavio
Kumar Ravi
Leucci Stefano
Panconesi Alessandro
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural algorithms based on Monte Carlo sampling of Markov Chains (MC), and reported encouraging results. We show, perhaps surprisingly, that such algorithms are outperformed by color coding (CC) [2], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC; furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price however. While MC is very efficient in terms of space, CC’s memory requirements become demanding when the size of the input graph and that of the graphlets grow. And yet, our experiments show that CC can push the limits of the state-of-the-art, both in terms of the size of the input graph and of that of the graphlets

Archivio della ricerca- Università di Roma La Sapienza

On sampling nodes in a network

Author: Chierichetti Flavio
Dasgupta Anirban
Kumar Ravi
Lattanzi Silvio
Sarlós Tamás
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Random walk is an important tool in many graph mining applications including estimating graph parameters, sampling portions of the graph, and extracting dense communities. In this paper we consider the problem of sampling nodes from a large graph according to a prescribed distribution by using random walk as the basic primitive. Our goal is to obtain algorithms that make a small number of queries to the graph but output a node that is sampled according to the prescribed distribution. Focusing on the uniform distribution case, we study the query complexity of three algorithms and show a near-tight bound expressed in terms of the parameters of the graph such as average degree and the mixing time. Both theoretically and empirically, we show that some algorithms are preferable in practice to the others. We also extend our study to the problem of sampling nodes according to some polynomial function of their degrees; this has implications for designing efficient algorithms for applications such as triangle counting

IIT Gandhinagar

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

Voting with Limited Information and Many Alternatives

Author: Chierichetti Flavio
Kleinberg Jon
Publication date: 08/10/2011

The traditional axiomatic approach to voting is motivated by the problem of reconciling differences in subjective preferences. In contrast, a dominant line of work in the theory of voting over the past 15 years has considered a different kind of scenario, also fundamental to voting, in which there is a genuinely "best" outcome that voters would agree on if they only had enough information. This type of scenario has its roots in the classical Condorcet Jury Theorem; it includes cases such as jurors in a criminal trial who all want to reach the correct verdict but disagree in their inferences from the available evidence, or a corporate board of directors who all want to improve the company's revenue, but who have different information that favors different options. This style of voting leads to a natural set of questions: each voter has a private signal that provides probabilistic information about which option is best, and a central question is whether a simple plurality voting system, which tabulates votes for different options, can cause the group decision to arrive at the correct option. We show that plurality voting is powerful enough to achieve this: there is a way for voters to map their signals into votes for options in such a way that, with sufficiently many voters, the correct option receives the greatest number of votes with high probability. We show further, however, that any process for achieving this is inherently expensive in the number of voters it requires: succeeding in identifying the correct option with probability at least