66 research outputs found
Timescale effect estimation in time-series studies of air pollution and health: A Singular Spectrum Analysis approach
A wealth of epidemiological data suggests an association between
mortality/morbidity from pulmonary and cardiovascular adverse events and air
pollution, but uncertainty remains as to the extent implied by those
associations, despite the abundance of data. In this paper we describe an
SSA (Singular Spectrum Analysis) based approach to decompose the
time series of particulate matter concentration into a set of exposure
variables, each one representing a different timescale. We apply our
methodology to investigate both acute and long-term effects of
exposure on morbidity from respiratory causes within the urban area of Bari,
Italy. Comment: Published at http://dx.doi.org/10.1214/07-EJS123 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
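The decomposition idea in this abstract can be illustrated with a basic SSA in NumPy. This is only a minimal sketch under stated assumptions: the function name `ssa_components`, the window length, and the synthetic series are hypothetical, and the grouping of components into timescale-specific exposure variables used in the paper is not reproduced here.

```python
import numpy as np

def ssa_components(x, window):
    """Basic SSA: return an array whose rows are elementary reconstructed series.

    Hypothetical helper for illustration; the rows sum back to the input series.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window]
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    comps = np.empty((s.size, n))
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])  # rank-one piece of the trajectory matrix
        # Diagonal averaging: the anti-diagonals of elem are the diagonals of its row-flip
        flipped = elem[::-1]
        comps[i] = [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
    return comps

# Synthetic series (made-up data): slow trend plus a short-period oscillation
t = np.arange(100)
series = 0.05 * t + np.sin(2 * np.pi * t / 7)
components = ssa_components(series, window=20)
```

Summing selected rows of `components` gives a smoothed series for one group of components; how rows are grouped into timescales is a modelling choice left open in this sketch.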
Singular Spectrum Analysis: a new decomposition technique applied to environmental systems
In the last few years Singular Spectrum Analysis (SSA), a powerful tool in time series […] reconstruction of components may be based on the functional clustering algorithm introduced in Bilancia and Stea (2008). We report an example concerning an application in the environmental health field
Geographical clustering of lung cancer in the province of Lecce, Italy: 1992–2001
Background
The triennial mortality rates for lung cancer in the two decades 1981–2001 in the province of Lecce, Italy, are significantly higher than those for the entire region of Apulia (to which the province of Lecce belongs) and the national reference rates. Moreover, the rates in the three-year periods 1993–95, 1996–98 and 1999–01 show a dramatic increase in mortality for both males and females, which remains essentially unexplained: to convey the extent of this phenomenon, the standardized mortality rate for males in 1999–01 is 13.92 per 10000 person-years, compared with 6.96 for Italy in the 2000–2002 period.
These data have generated considerable concern in the press and public opinion, which with little scientific reasoning have sometimes identified suspected culprits of the risk excess (for example, the emissions from a number of large industrial sites located in the provinces of Brindisi and Taranto, bordering the province of Lecce). The objective of this paper is to study on a scientifically sound basis the spatial distribution of risk for lung cancer mortality in the province of Lecce. Our goal is to demonstrate that most of the previous explanations are not supported by data: to this end, we follow a hybrid approach that combines both frequentist and Bayesian disease mapping methods. Furthermore, we define a new sequential algorithm based on a version of the Besag-York-Mollié (BYM) model, suitably modified to detect geographical clusters of disease.
Results
Standardized mortality ratios (SMRs) for lung cancer in the province of Lecce: For males, the relative risk (measured by means of the SMR, i.e. the ratio between observed and expected cases in each area under internal standardization) was judged to be significantly greater than 1 in many municipal areas, the significance being evaluated under the null hypothesis of neutral risk on the basis of area-specific p-values (denoted by ρ_i); in addition, high risk areas were not randomly distributed within the province, but showed sharp clustering. The most perceptible cluster involved a collection of municipalities around the Maglie area (Istat code: 75039), while the association among the municipalities of Otranto, Poggiardo and Santa Cesarea Terme (Istat codes: 75057, 75061, 75072) was more ambiguous. For females, the significant risk excess in the city of Lecce (Istat code: 75035) was noteworthy, with an SMR of 1.83 and ρ_i < 0.01.
BYM model for the province of Lecce: For males, Bayes estimates of relative risks varied around an overall mean of 1.04 with a standard deviation of 0.1, with a minimum of 0.77 and a maximum of 1.25. The posterior relative risks for females, although smoothed, showed more variation than for males, ranging from 0.74 to 1.65, around a mean of 0.90 with a standard deviation of 0.12. For males, 95% posterior credible intervals of relative risks included unity in every area, whereas a significantly elevated risk of mortality was confirmed in the Lecce area for females (95% posterior CI: 1.33–2.00).
BYM model for the whole of Apulia: For males, internally standardized maps showed several high risk areas bordering the province of Lecce, belonging to the province of Brindisi, and the presence of a large high risk region, including the southern part of the province of Brindisi and the eastern and southern part of the Salento peninsula, in which an increasing trend in the north-south direction was found.
Ecological correlation study with deprivation (Cadum Index): For males, the posterior mean of the ecological regression coefficient β was 0.04, with a 95% posterior credible interval of (-0.01, 0.08); similarly, β was estimated as -0.03 for females (95% posterior credible interval: -0.16, 0.10). Moreover, there was some indication of nonlinearly increasing relative risk with increasing deprivation at higher deprivation levels. For females, it was difficult to postulate the existence of any association between risk and deprivation.
Cluster detection: Cluster detection based on a modified BYM model identified two large unexplained clusters of increased risk in the central-eastern and southern part of the peninsula. Other secondary clusters, which raise several complex interpretation issues, are present.
Conclusion
Our results reduce the alleged role of the industrial facilities located around the province of Taranto: in particular, air pollution produced around the city of Taranto (which lies to the west of the province of Lecce) has often been identified as the main culprit of the mortality excess, a conclusion that was further supported by a recent study on the direction of prevailing winds on Salento. This hypothesis is contradicted by the finding that the municipalities that directly border the province of Taranto (belonging to the so-called "Jonico-Salentina" band) are those that present low mortality rates (at least for males). In the same way, the responsibility of the energy production plants located in the province of Brindisi (which lies to the north) appears to be of little relevance. For females, given the situation observed in the city of Lecce, and given the substantial increase in mortality observed in younger age classes, further investigation is required into the role played by changes in lifestyle, including the greater propensity to smoke that women have shown from the 1980s onwards (a phenomenon which could be amplified in a city as traditionally cultured and modern as Lecce, since the tobacco habit is largely a cultural phenomenon). For males, the presence of high levels of deprivation throughout eastern and southern Salento is likely to play an important role: those with lower socio-economic status smoke more, and gender differences may be explained by the fact that in less developed areas women are less accustomed to tobacco smoking and alcohol drinking (and other harmful lifestyles), which are seen as purely masculine behaviour. Research into the role of material deprivation and individual lifestyle differences between genders should be further developed.
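The SMR under internal standardization mentioned in the Results can be sketched as follows. This is a hedged illustration only: the function name `smr_internal`, the two age classes, and all counts are made up, and the paper's actual stratification is not given in the abstract.

```python
import numpy as np

def smr_internal(cases, population):
    """SMR per area; cases and population have shape (areas, age_classes)."""
    cases = np.asarray(cases, dtype=float)
    population = np.asarray(population, dtype=float)
    # Internal standardization: reference rates come from the study region itself
    ref_rates = cases.sum(axis=0) / population.sum(axis=0)
    expected = population @ ref_rates   # expected cases in each area
    observed = cases.sum(axis=1)        # observed cases in each area
    return observed / expected, observed, expected

# Hypothetical counts for three areas and two age classes (not from the paper)
cases = [[2, 10], [1, 5], [4, 20]]
population = [[1000, 2000], [1500, 1000], [800, 3000]]
smr, observed, expected = smr_internal(cases, population)
```

With internal standardization the expected counts sum to the observed total, so an SMR above 1 marks an area with more cases than the regional rates would predict.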
A Multiple Imputation Strategy for Eddy Covariance Data
Half-hourly time series of net ecosystem exchange (NEE) of CO2, latent heat flux (LE) and sensible heat flux (H) measured through the micro-meteorological eddy covariance (EC) technique are noisy and show a high percentage of missing data. Using EC measurements that are part of the FLUXNET2015 dataset, we evaluate the performance of a multiple imputation (MI) strategy based on an efficient computational strategy introduced in Honaker and King (2010), which combines the classic Expectation-Maximization (EM) algorithm with a bootstrap approach in order to take draws from a suitable approximation of the posterior distribution of the model parameters. Armed with these instruments, we introduce three new multiple imputation models, characterized by an increasing level of complexity and built on a multivariate normality assumption: 1) MLR, which imputes EC missing values using a static multiple linear regression on observed values of suitable input variables; 2) ADL, which enriches the static specification of MLR with dynamic properties, by considering an autoregressive distributed lag specification; 3) PADL, which adds further complexity by embedding the ADL model in a panel-data perspective. Under several artificial gap scenarios, we show that PADL is better able to model the complex dynamics of ecosystem fluxes and to reconstruct missing data points, thus providing unbiased imputations and preserving the original sampling distribution. The added flexibility arising from the time series cross section structure of PADL yields improved performance, exceeding that of the other imputation methods as well as of the marginal distribution sampling (MDS) algorithm, a widely used gap-filling approach introduced by Reichstein et al. (2005), especially in the case of nighttime flux data.
It is expected that the strategy proposed in this paper will become useful in creating multiple imputations for a variety of EC datasets, providing valid inferences for a broad range of scientific estimands (such as annual budgets)
Disegno ottimo degli esperimenti e modelli lineari generalizzati (Optimal design of experiments and generalized linear models)
PhD in statistics, 8th cycle. Supervisor: C. Cecchi. Consiglio Nazionale delle Ricerche - Biblioteca Centrale - P.le Aldo Moro, 7, Rome; Biblioteca Nazionale Centrale - P.za Cavalleggeri, 1, Florence / CNR - Consiglio Nazionale delle Ricerche. SIGLE, IT, Ital
A hierarchical finite mixture model for Bayesian classification in the presence of auxiliary information
Gaussian finite-mixture models are extended to include the use of auxiliary information, the dependence of component membership probabilities being modelled by a generalized linear model for polytomous responses. Among the possible applications of the proposed methodology are probabilistic classification and estimation of group conditional parameters. Identifiability features of such a model are investigated in comparison with standard finite mixtures. A full Bayesian hierarchical representation of the model is developed to implement the Gibbs sampling estimation algorithm. Two examples are presented where the methodology is applied to the analysis of real and synthetic data
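The classification step described in this abstract can be illustrated with a small sketch. It assumes a multinomial-logit link for the membership probabilities, which is only one possible generalized linear model for polytomous responses, and it treats all parameters as known rather than estimated by Gibbs sampling. The names `membership_probs` and `classify` and all numeric values are hypothetical.

```python
import numpy as np

def membership_probs(x, coefs):
    """Prior membership probabilities from auxiliary covariates (assumed multinomial logit).

    x has shape (n, p); coefs has shape (K, p), one coefficient row per component.
    """
    eta = x @ coefs.T
    eta = eta - eta.max(axis=1, keepdims=True)  # stabilise the exponentials
    w = np.exp(eta)
    return w / w.sum(axis=1, keepdims=True)

def classify(y, x, coefs, means, sds):
    """Posterior component probabilities for univariate Gaussian components."""
    prior = membership_probs(x, coefs)
    dens = np.exp(-0.5 * ((y[:, None] - means) / sds) ** 2) / (sds * np.sqrt(2 * np.pi))
    post = prior * dens
    return post / post.sum(axis=1, keepdims=True)

# Made-up example: two components, uninformative covariate effects
y = np.array([0.1, 4.8])
x = np.array([[1.0, 0.0], [1.0, 0.0]])
post = classify(y, x, coefs=np.zeros((2, 2)),
                means=np.array([0.0, 5.0]), sds=np.array([1.0, 1.0]))
```

Each row of `post` sums to 1, and assigning an observation to the component with the largest posterior probability gives a probabilistic classification.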