318 research outputs found

Average treatment effect estimation via random recursive partitioning

Author: Iacus Stefano
Porro Giuseppe
Publication date: 01/01/2004

A new matching method is proposed for the estimation of the average treatment effect of social policy interventions (e.g., training programs or health care measures). Given an outcome variable, a treatment and a set of pre-treatment covariates, the method is based on the examination of random recursive partitions of the space of covariates using regression trees. A regression tree is grown on either the treated or the untreated individuals only, using as response variable a random permutation of the indexes 1, ..., n (n being the number of units involved), while the indexes for the other group are predicted using this tree. The procedure is replicated in order to rule out the effect of specific permutations. The average treatment effect is estimated in each tree by matching treated and untreated units in the same terminal nodes. The final estimator of the average treatment effect is obtained by averaging over all the trees grown. The method does not require any specific model assumption apart from the tree's complexity, which, however, does not affect the estimator. We show that this method is both an instrument to check whether two samples can be matched (by any method) and, when this is feasible, a way to obtain reliable estimates of the average treatment effect. We further propose a graphical tool to inspect the quality of the match. The method has been applied to the National Supported Work Demonstration data, previously analyzed by Lalonde (1986) and others.
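The replicate-and-average idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' procedure: instead of growing a regression tree on a permuted index, it draws random axis-aligned splits directly on the pooled covariates (an assumption made to keep the example dependency-free). All function names are hypothetical.

```python
import random

def random_leaves(X, idx, depth, rng, min_size=2):
    """Recursively split the units in `idx` with random axis-aligned cuts.

    Stand-in (assumption) for the terminal nodes of a tree grown on a
    randomly permuted response, as described in the abstract.
    """
    if depth == 0 or len(idx) < 2 * min_size:
        return [idx]
    j = rng.randrange(len(X[0]))      # random covariate
    cut = X[rng.choice(idx)][j]       # random observed value as threshold
    left = [i for i in idx if X[i][j] <= cut]
    right = [i for i in idx if X[i][j] > cut]
    if not right:                     # degenerate split: keep node as a leaf
        return [idx]
    return (random_leaves(X, left, depth - 1, rng, min_size)
            + random_leaves(X, right, depth - 1, rng, min_size))

def att_one_partition(leaves, treated, y):
    """Match treated and untreated units sharing a leaf; return the ATT or None."""
    num, n_t = 0.0, 0
    for leaf in leaves:
        yt = [y[i] for i in leaf if treated[i]]
        yc = [y[i] for i in leaf if not treated[i]]
        if yt and yc:                 # only leaves containing both groups count
            num += sum(yt) - len(yt) * (sum(yc) / len(yc))
            n_t += len(yt)
    return num / n_t if n_t else None

def att_random_partitions(X, treated, y, n_rep=50, depth=3, seed=0):
    """Average the per-partition ATT estimates over `n_rep` random partitions."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        leaves = random_leaves(X, list(range(len(X))), depth, rng)
        est = att_one_partition(leaves, treated, y)
        if est is not None:
            estimates.append(est)
    return sum(estimates) / len(estimates) if estimates else None
```

A partition in which no leaf contains both groups contributes nothing to the average; if that happens in every replicate the function returns None, loosely mirroring the abstract's point that the method also signals when two samples cannot be matched.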

arXiv.org e-Print Archive

CiteSeerX

Invariant and Metric Free Proximities for Data Matching: An R Package

Author: Giuseppe Porro
Stefano Iacus

Data matching is a typical statistical problem in non-experimental and/or observational studies or, more generally, in cross-sectional studies in which one or more data sets are to be compared. Several methods are available in the literature, most of which are based on a particular metric or on statistical models, either parametric or nonparametric. In this paper we present two methods to calculate proximities that are invariant under monotonic transformations. These methods require at most the notion of ordering. Open-source software in the form of an R package is also presented.
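To make the invariance property concrete, here is a minimal sketch of a proximity built only from ranks. It is an illustration of the general idea, not either of the two methods presented in the paper; the function names and the normalization are assumptions.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result

def rank_proximity(columns):
    """Return an n x n proximity matrix with entries in [0, 1].

    `columns` is a list of covariate columns (n >= 2 units each).
    Only the ordering of each covariate is used, so any strictly
    increasing transformation of a column leaves the result unchanged.
    """
    n = len(columns[0])
    rk = [ranks(col) for col in columns]
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = sum(abs(r[i] - r[j]) for r in rk) / (len(rk) * (n - 1))
            prox[i][j] = 1.0 - dist
    return prox

# Hypothetical data: the proximity is the same on raw and transformed scales.
age = [23, 35, 41, 58]
income = [12.0, 30.0, 18.5, 90.0]
raw = rank_proximity([age, income])
transformed = rank_proximity([[a ** 3 for a in age], [math.log(v) for v in income]])
```

Because only ranks enter the computation, `raw` and `transformed` coincide, which a metric such as the Euclidean distance would not guarantee.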

Research Papers in Economics

cem: Software for Coarsened Exact Matching

Author: Gary King
Giuseppe Porro
Stefano Iacus

This program is designed to improve causal inference via a method of matching that is widely applicable in observational data and easy to understand and use (if you understand how to draw a histogram, you will understand this method). The program implements the coarsened exact matching (CEM) algorithm, described below. CEM may be used alone or in combination with any existing matching method. This algorithm, and its statistical properties, are described in Iacus, King, and Porro (2008).

Research Papers in Economics

Measuring Social Well Being in The Big Data Era: Asking or Listening?

Author: Curti Matteo
Iacus Stefano
Porro Giuseppe
Siletti Elena
Publication date: 01/01/2015

The literature on well-being measurement seems to suggest that "asking" for a self-evaluation is the only way to estimate a complete and reliable measure of well-being. At the same time, "not asking" is the only way to avoid biased evaluations due to self-reporting. Here we propose a method for estimating the welfare perception of a community simply by "listening" to the conversations on Social Network Sites. The Social Well Being Index (SWBI) and its components are obtained by means of an innovative technique of supervised sentiment analysis called iSA, which scales to any language and to big data. As its main methodological advantages, this approach can estimate several aspects of social well-being directly from self-declared perceptions, instead of approximating them through objective (but partial) quantitative variables like GDP; moreover, self-perceptions of welfare are spontaneous and not obtained as answers to explicit questions, which have been shown to bias the result. As an application we evaluate the SWBI in Italy over the period 2012-2015 through the analysis of more than 143 million tweets. Comment: 40 pages, 2 figures. arXiv admin note: text overlap with arXiv:1512.0156

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università dell'Insubria

Social networks, happiness and health: from sentiment analysis to a multidimensional indicator of subjective well-being

Author: Iacus Stefano Maria
Porro Giuseppe
Salini Silvia
Siletti Elena
Publication date: 01/01/2015

This paper applies a novel technique of opinion analysis to social media data with the aim of proposing a new indicator of perceived and subjective well-being. This new index, namely SWBI, examines several dimensions of individual and social life. The indicator has been compared to some other existing indexes of well-being and health conditions in Italy: the BES (Benessere Equo Sostenibile), the incidence rate of influenza and the abundance of PM10 in urban environments. SWBI is a daily measure available at province level. BES data, currently available only for 2013 and 2014, are annual and available at regional level. Flu data are weekly and distributed as regional data, and PM10 data are collected daily for different cities. Because the time scale and spatial granularity of the different indexes vary, we apply a novel statistical technique to discover nowcasting features and the classical latent analysis to study the relationships among them. A preliminary analysis suggests that the environmental and health conditions anticipate several dimensions of the perception of well-being as measured by SWBI. Moreover, the set of indicators included in the BES represents a latent dimension of well-being which shares similarities with the latent dimension represented by SWBI. Comment: 26 pages, 5 figur

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Università dell'Insubria

CEM: Coarsened Exact Matching in Stata

Author: Gary King
Giuseppe Porro
Matthew Blackwell
Stefano Iacus

We introduce a Stata implementation of coarsened exact matching, a new method for improving the estimation of causal effects by reducing imbalance in covariates between treated and control groups. Coarsened exact matching is faster, is easier to use and understand, requires fewer assumptions, is more easily automated, and possesses more attractive statistical properties for many applications than do existing matching methods. In coarsened exact matching, users temporarily coarsen their data, exact match on these coarsened data, and then run their analysis on the uncoarsened, matched data. Coarsened exact matching bounds the degree of model dependence and causal effect estimation error by ex ante user choice, is monotonic imbalance bounding (so that reducing the maximum imbalance on one variable has no effect on others), does not require a separate procedure to restrict data to common support, meets the congruence principle, is approximately invariant to measurement error, balances all nonlinearities and interactions in sample (i.e., not merely in expectation), and works with multiply imputed datasets. Other matching methods inherit many of the coarsened exact matching method’s properties when applied to further match data preprocessed by coarsened exact matching.
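The three steps named in this abstract (coarsen, exact match on the coarsened data, analyze the uncoarsened matched data) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Stata implementation: the function names, the user-supplied cutpoints and the stratum-weighted difference in means are all hypothetical choices.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)

def cem_match(X, treated, cutpoints):
    """Group units by coarsened covariates; keep strata containing both groups.

    `cutpoints[j]` lists the user-chosen bin boundaries for covariate j.
    Returns a dict mapping each retained stratum key to its unit indices.
    """
    strata = defaultdict(list)
    for i, row in enumerate(X):
        key = tuple(coarsen(v, cutpoints[j]) for j, v in enumerate(row))
        strata[key].append(i)
    return {
        key: idx for key, idx in strata.items()
        if any(treated[i] for i in idx) and any(not treated[i] for i in idx)
    }

def att_from_strata(matched, treated, y):
    """Treated-weighted average of within-stratum mean differences.

    Uses the original (uncoarsened) outcomes of the matched units.
    """
    num, n_t = 0.0, 0
    for idx in matched.values():
        yt = [y[i] for i in idx if treated[i]]
        yc = [y[i] for i in idx if not treated[i]]
        num += sum(yt) - len(yt) * (sum(yc) / len(yc))
        n_t += len(yt)
    return num / n_t if n_t else None

# Hypothetical data: columns are age and a binary indicator.
X = [[22, 1], [27, 1], [45, 0], [48, 0], [63, 1]]
treated = [True, False, True, False, True]
y = [10.0, 7.0, 6.0, 5.0, 100.0]
matched = cem_match(X, treated, cutpoints=[[30, 50], [0.5]])
att = att_from_strata(matched, treated, y)
```

In this toy data the last unit has no control in its stratum and is dropped, which is how matching on coarsened values restricts the sample to common support without a separate procedure.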

Research Papers in Economics

Invariant and Metric Free Proximities for Data Matching: An R Package

Author: Giuseppe Porro
Stefano M. Iacus
Publication venue: Foundation for Open Access Statistics
Publication date: 01/01/2008

Archivio istituzionale della ricerca - Università di Trieste

Crossref

AIR Universita degli studi di Milano

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università dell'Insubria

Journal of Statistical Software

cem: Coarsened Exact Matching in Stata

Author: Blackwell Matthew
Iacus Stefano
King Gary
Porro Giuseppe
Publication venue: StataCorp
Publication date: 01/01/2009

This paper introduces a Stata implementation of Coarsened Exact Matching (CEM), a new method for improving the estimation of causal effects by reducing imbalance in covariates between treated and control groups. CEM is faster, is easier to use and understand, requires fewer assumptions, is more easily automated, and possesses more attractive statistical properties for many applications than existing matching methods. In CEM, users temporarily coarsen their data, exact match on these coarsened data, and then run their analysis on the uncoarsened, matched data. CEM bounds the degree of model dependence and causal effect estimation error by ex ante user choice, is monotonic imbalance bounding (so that reducing the maximum imbalance on one variable has no effect on others), does not require a separate procedure to restrict data to common support, meets the congruence principle, is approximately invariant to measurement error, balances all nonlinearities and interactions in-sample (i.e., not merely in expectation), and works with multiply imputed data sets. Other matching methods inherit many of CEM's properties when applied to further match data preprocessed by CEM. The library cem implements the CEM algorithm in Stata.

AIR Universita degli studi di Milano

Harvard University - DASH

A proposal to deal with sampling bias in social network big data

Author: Iacus Stefano Maria
Porro Giuseppe
Salini Silvia
Siletti Elena
Publication venue: Universitat Politecnica de Valencia
Publication date: 01/01/2018

Selection bias is the bias introduced by the non-random selection of data; it raises the question of whether the sample obtained is representative of the target population. There are different types of selection bias, but when one works with web surveys or data from social networks such as Twitter or Facebook, one mostly needs to deal with sampling and self-selection bias. In this work we propose to use official statistics to anchor the estimates and to remove the sampling bias and unreliability that arise from the use of social network big data, following a weighting method combined with a small area estimation (SAE) approach.

Iacus, SM.; Porro, G.; Salini, S.; Siletti, E. (2018). A proposal to deal with sampling bias in social network big data. In 2nd International Conference on Advanced Research Methods and Analytics (CARMA 2018). Editorial Universitat Politècnica de València. 29-37. https://doi.org/10.4995/CARMA2018.2018.8302OCS293

Crossref

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Università dell'Insubria

RiuNet