MAPPING INDOOR RADON-222 IN DENMARK: DESIGN AND TEST OF THE STATISTICAL MODEL USED IN THE SECOND NATION-WIDE SURVEY

Abstract

In Denmark, a new survey of indoor radon-222 has been carried out. One-year alpha track measurements (CR-39) were made in 3019 single-family houses. There are from 3 to 23 house measurements in each of the 275 municipalities. Within each municipality, houses were selected randomly. One important outcome of the survey is the prediction of the fraction of houses in each municipality with an annual average radon concentration above 200 Bq m⁻³. To obtain the most accurate estimate and to assess the associated uncertainties, a statistical model has been developed. The purpose of this paper is to describe the design of this model and to report the results of model tests. The model is based on a transformation of the data to normality and on analytical (conditionally) unbiased estimators of the quantities of interest. Bayesian statistics is used to minimize the effect of small sample size. In each municipality, the correction depends on the fraction of the area where sand and gravel dominate the surface geology. The uncertainty analysis is done with a Monte Carlo technique. It is demonstrated that the weighted sum of all municipality model estimates of fractions above 200 Bq m⁻³ (3.9 % with 95 % confidence interval [3.4, 4.5]) is consistent with the weighted sum of the observations for Denmark taken as a whole (4.6 % with 95 % confidence interval [3.8, 5.6]). The total number of single-family houses within each municipality is used as weight. Model estimates are also found to be consistent with observations at the level of individual counties, which typically include a few hundred house measurements. These tests indicate that the model is well suited for its purpose.

Keywords: Houses; Radon-222; Survey; Statistical model

INTRODUCTION

Radon is believed to cause an increased risk of lung cancer, and it is therefore of interest to identify houses with high levels of indoor radon. It is important to know how many houses have "high" levels (e.g.
annual levels above 200 or 400 Bq m⁻³), and it is important to know where these houses are located. Likewise, it is also of interest to know about the low-radon houses where there is no cause for alarm. This paper reports on a new Danish survey of indoor radon designed to tackle these problems. The survey is much larger than the first one from 1985/86.

SURVEY DESIGN

Denmark is divided into 15 counties. Each county consists of a number of smaller municipalities. In total there are 275 municipalities. One-year alpha track measurements (CR-39) were made in 3019 single-family houses from December 1995 to December 1996. Detectors were placed in living rooms. Within each municipality, houses were selected randomly by the Building and Dwelling Register (BBR). The median number of house measurements per municipality is 11. Nine municipalities have 6 or fewer measurements, and nine municipalities have 18 or more measurements. The only geological information used directly in the model is the fraction of area (later referred to as $g_k$) in each municipality which is dominated by sand and gravel. These values are found by visual inspection of a map of the surface geology of Denmark.

Radon in the Living Environment, 19-23 April 1999, Athens, Greece

MODEL

Transformations

We define the 'house concentration' $c$ of a given house to be the average radon concentration of the living room and the bedroom [...]. The distribution of the transformed concentration $x = \log(c + b)$ is closer to normality. All of the statistical analyses are therefore conducted for transformed radon concentrations $x$.

Distribution parameters

It is assumed that within each municipality $k$, the transformed radon concentration $x$ is normally distributed with a (true) mean $\mu_k$ and a (true) standard deviation $\sigma$. We allow $\mu_k$ to differ from one municipality to another, but require that all municipalities have the same $\sigma$.
The latter requirement is supported by an analysis of the homogeneity of variances with a modified version of the Levene test based on absolute deviations from the municipality medians of transformed radon concentrations.

The estimator $\hat{\sigma}$ of $\sigma$ is found as follows: First, we calculate the simple mean $\bar{x}_k$ and standard deviation $s_k$ of the $N_k$ measurements in each municipality $k$ [...]. Finally, we pool the 275 $\hat{\sigma}_k$-values into a single weighted mean value $\hat{\sigma}$. The number of house measurements ($N_k$) is used as weight. The value amounts to $\hat{\sigma} = 0.59418$.

The estimators $\hat{\mu}_k$ of $\mu_k$ are found as follows: A simple estimate would be to let $\hat{\mu}_k = \bar{x}_k$. However, as demonstrated by [...]:

$$\bar{x}_k = \beta_0 + \beta_1 g_k + \varepsilon_k \qquad (5)$$

where $g_k$ is an estimate of the fraction of the total area of municipality $k$ that has a surface geology dominated by sand and gravel. Based on all 275 municipalities, the regression coefficients amount to $\beta_0 = 4.54$ (standard error 0.0296) and $\beta_1 = -0.69$ (standard error 0.06). The R-squared value is 36 %. The variance $\sigma_\varepsilon^2$ of the residuals $\varepsilon_k$ is 0.082. For each municipality, we calculate $\theta_k = \beta_0 + \beta_1 g_k$ and use the following weighted average as the model estimate of $\mu_k$: [...] (6), where the weights are [...]. Essentially, we estimate $\mu_k$ to be equal to the observed value $\bar{x}_k$ with some weighted correction towards what on average is found for municipalities with that type of surface geology. If there are few (or no) measurements in a municipality, then $\hat{\mu}_k \approx \theta_k$. If there are many measurements, then $\hat{\mu}_k \approx \bar{x}_k$. Essentially, the influence of $\theta_k$ is equivalent to about 4 extra measurements in each municipality. The main source of uncertainty in the survey is the small sample size. We apply equation (6) as a way to gently "stabilize" modelling results in all municipalities except those on the island of Bornholm.

[...] and the bias term [...] and insert into [...], which is different from the observed value given by equation (7).

RESULTS

In the survey, house radon levels ($c$) ranged from 2 to 590 Bq m⁻³.
The middle plot of [...]

DISCUSSION

Improved estimates by modelling?

The primary purpose of the statistical model is to provide estimates of the fraction of houses above 200 Bq m⁻³ at the level of individual municipalities. The idea is to make estimates that are better (i.e. more accurate and less variable) than estimates deduced from simple observations, where the observed fraction (equation 7) is the number of measured houses above 200 Bq m⁻³ in municipality $k$ divided by $N_k$, the number of measurements. The main problem with such simple observations is that for the typical case of about 10 house measurements per municipality, the outcome will be in steps of 10 % (i.e. 0 %, 10 %, 20 % etc.).

This can be illustrated with synthetic data. We draw 3019 synthetic transformed radon concentrations $x$ from a normal distribution with mean 4.33 and standard deviation 0.5941. Subsequently we transform the data to ordinary radon concentrations ($c$-values) using the inverse of $x = \log(c + b)$. The true value of $f_{200}$ in this case is 4.60 % (about the same as the national average). The data are grouped in municipalities and counties exactly as in the survey (this is important as the number of measurements determines the variability of parameter estimates). Also, we preserve the fraction of sand and gravel ($g_k$) which is needed in equation (6). In this case, however, any regression relationship (see equation 5) will arise only by chance. The model is applied exactly as with the real data set. To evaluate the importance of the Bayesian correction, we will also consider simplified-model estimates where $\omega_0$ in equation 6 is set to 0 (such that $\hat{\mu}_k = \bar{x}_k$).

[...] the mean and standard deviation of the results are 4.9 % and 7.2 %, respectively. In one case, $f_{200}$ is found to be as high as 40 %. It is particularly problematic that about 60 % of the municipalities are without measured houses with concentrations above 200 Bq m⁻³. It is of little help that many of the remaining municipalities have observed fractions above 10 %, such that on average the correct result of about 4.6 % is observed.
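The synthetic-data experiment described above can be sketched roughly as follows. This is a minimal illustration under several assumptions that are not taken from the paper: every municipality gets 11 measurements (the survey median) instead of the real grouping, the threshold in transformed units is back-computed from the stated true value of 4.60 % because the offset $b$ is not available here, the standard deviation is treated as known, the fraction above the threshold is a plain plug-in normal-tail estimate rather than the paper's (conditionally) unbiased estimator, and the shrinkage takes a hypothetical form in which the prior mean counts as `OMEGA_0` extra measurements.

```python
import random
from statistics import NormalDist, fmean, pstdev

MU, SIGMA = 4.33, 0.5941      # synthetic population parameters from the text
TRUE_F200 = 0.046             # stated true fraction above 200 Bq m^-3
N_MUNICIPALITIES = 275
N_PER_MUNICIPALITY = 11       # assumption: survey median used for every municipality
OMEGA_0 = 4.0                 # assumption: prior weight, in "extra measurements"

# Threshold in transformed units, back-computed from TRUE_F200 because
# the offset b in x = log(c + b) is not available here.
X_THRESHOLD = NormalDist(MU, SIGMA).inv_cdf(1.0 - TRUE_F200)


def tail_fraction(mean):
    """Plug-in normal-tail estimate of the fraction above the threshold."""
    return 1.0 - NormalDist(mean, SIGMA).cdf(X_THRESHOLD)


def simulate(seed=1):
    """Return per-municipality fractions: observed, simplified, full."""
    rng = random.Random(seed)
    samples = [[rng.gauss(MU, SIGMA) for _ in range(N_PER_MUNICIPALITY)]
               for _ in range(N_MUNICIPALITIES)]
    # With no real geology effect, the grand mean stands in for theta_k.
    theta = fmean(x for s in samples for x in s)
    observed, simplified, full = [], [], []
    for s in samples:
        n = len(s)
        x_bar = fmean(s)
        # Simple observation: counted fraction, limited to multiples of 1/n.
        observed.append(sum(x > X_THRESHOLD for x in s) / n)
        # Simplified model: no shrinkage (omega_0 = 0).
        simplified.append(tail_fraction(x_bar))
        # Full model (hypothetical weight form): shrink x_bar towards theta.
        mu_hat = (OMEGA_0 * theta + n * x_bar) / (OMEGA_0 + n)
        full.append(tail_fraction(mu_hat))
    return observed, simplified, full


if __name__ == "__main__":
    for label, values in zip(("observed", "simplified", "full"), simulate()):
        print(f"{label:>10}: mean {100 * fmean(values):.1f} %, "
              f"sd {100 * pstdev(values):.1f} %")
```

Running the script prints the mean and standard deviation of the three sets of municipality estimates. The exact numbers depend on the seed and on the simplifications listed above, so they should not be expected to reproduce the values reported in the paper; the point is only the qualitative ordering of the spreads.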
The curve labelled "simplified model" shows the results of model estimates without the Bayesian correction ($\omega_0 = 0$). Compared with the first curve, these estimates are much better in the sense that the results are less variable (mean 4.9 % and standard deviation 3.7 %). The final curve, labelled "full model", presents by far the best estimates (mean 4.3 % and standard deviation 1.7 %). However, because the data in each municipality come from the same distribution, the variance of the regression residuals ($\varepsilon$ in equation 5) is lower than in the real survey. This means that in this (synthetic) example, the Bayesian correction will correspond to about 9 extra measurements in each municipality (compared to 4 in the real situation). The confidence intervals of the simulations are not shown in the [...]

Model versus measurements

The (weighted) national average of model predictions amounts to [...]. The latter agreement (that concerns the tail of the distribution) suggests that the assumption of normality is not greatly violated. As shown in the top plot of [...] To illustrate how the model treats counties with different types of geology it is of interest to study [...]

Model elements

The model includes some special elements:

Measurement uncertainty

Considerable measurement uncertainty is associated with the $c$-estimates (typically about 20 %). Part of this comes from the conversion from living room to house concentrations. Such uncertainties tend to have little impact on averages of quantities that relate linearly to the measurements (e.g. arithmetic means), as such random errors will on average tend to cancel each other. Unfortunately, the fraction of houses above 200 Bq m⁻³ is a non-linear function of the individual radon concentration results, and random errors will therefore bias the estimate. This has previously been demonstrated by [...]

CONCLUSION

A statistical model has been developed.
It predicts the fraction of single-family houses (in each municipality) with an annual radon level above 200 Bq m⁻³. The investigation suggests that these estimates are better (more accurate and less variable) than simple estimates based on direct observation of houses with levels above 200 Bq m⁻³. Also, the model provides estimates of uncertainties associated with these predictions. The main source of uncertainty relates to the small sample size (typically only about 11 measurements in each municipality). Comparison between model predictions and measurements indicates that the model is well suited for mapping of indoor radon in Denmark.

ACKNOWLEDGEMEN
