This paper considers the two-dataset problem, where data are collected from
two potentially different populations sharing common aspects. This problem
arises when data are collected by two different types of researchers or from
two different sources. Ignoring knowledge of the data collection process may
lead to invalid conclusions. To address this problem, this paper develops
statistical models focusing on the difference in measurement and proposes two
prediction errors that help to evaluate the underlying data collection process.
As a consequence, it is possible to discuss the heterogeneity/similarity of
data in terms of prediction. Two real datasets are selected to illustrate our
method.