On the two-dataset problem

MacEachern, Steven N.; Miyawaki, Koji

Authors: Steven N. MacEachern, Koji Miyawaki
Publication date: 6 August 2020

Abstract

This paper considers the two-dataset problem, in which data are collected from two potentially different populations that share common aspects. The problem arises when data are collected by two different types of researchers or from two different sources. Ignoring knowledge about the data collection process can lead to invalid conclusions. To address this problem, the paper develops statistical models that focus on the difference in measurement and proposes two prediction errors that help to evaluate the underlying data collection process. As a consequence, the heterogeneity or similarity of the data can be discussed in terms of prediction. Two real datasets are used to illustrate the method.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:1911.00204

Last time updated on 11/08/2020