The Rapid Carbon Assessment, conducted by the U.S. Department of Agriculture,
was implemented in order to obtain a representative sample of soil organic
carbon across the contiguous United States. Combined with a statistical
model, the dataset allows predicted soil carbon to be mapped across the
U.S.; however, there are two primary challenges to such an effort. First, there
exists a large degree of heterogeneity in the data: both the first and
second moments of the data-generating process appear to vary spatially and
across land-use categories. Second, the majority of the sampled
locations do not actually have lab-measured values for soil organic carbon.
Rather, visible and near-infrared (VNIR) spectra, which act as a proxy to
help predict carbon content, were measured at most locations. Thus, we
develop a heterogeneous model for these data that allows both the mean
and the variance to vary as a function of space as well as land-use category,
while incorporating VNIR spectra as covariates. After a cross-validation study
that establishes the effectiveness of the model, we construct a complete map of
soil organic carbon for the contiguous U.S. along with uncertainty
quantification.