Inference for big data assisted by small area methods: an application to OBEC (on-line based enterprise characteristics)

Abstract

The volume of data produced by a wide range of new technologies, so-called big data, keeps increasing. However, data obtainable from big data sources are often the result of a non-probability sampling process, and adjusting for the resulting selection bias is an important practical problem. In this paper, we propose a novel method for reducing the selection bias associated with a big data source in the context of Small Area Estimation (SAE). Our approach is based on data integration, combining a big data sample with a probability sample. We present an application to OBEC (on-line based enterprise characteristics) that combines an Istat sample survey with web-scraped data.
