57 research outputs found

    Evaluating recent methods to overcome spatial confounding

    The concept of spatial confounding is closely connected to spatial regression, although no general definition has been established. A generally accepted idea of spatial confounding in spatial regression models is the change in fixed effects estimates that may occur when spatially correlated random effects collinear with the covariate are included in the model. Different methods have been proposed to alleviate spatial confounding in spatial linear regression models, but it is not clear whether they provide correct fixed effects estimates. In this article, we consider some of those proposals, namely restricted regression, the spatial+ model, and transformed Gaussian Markov random fields. The objective is to determine which one provides the best estimates of the fixed effects. Dowry death data in Uttar Pradesh in 2001, stomach cancer incidence data in Slovenia in the period 1995-2001, and lip cancer incidence data in Scotland between 1975 and 1980 are analyzed. Several simulation studies are conducted to evaluate the performance of the methods in different scenarios of spatial confounding. The results suggest that the spatial+ method provides the fixed effects estimates closest to the true values.

    A one-step spatial+ approach to mitigate spatial confounding in multivariate spatial areal models

    Ecological spatial areal models encounter the well-known and challenging problem of spatial confounding. This issue makes it difficult to distinguish between the effects of observed covariates and spatial random effects. Despite previous research and various proposed methods to tackle this problem, a definitive solution remains elusive. In this paper, we propose a one-step version of the spatial+ approach that involves dividing the covariate into two components. One component captures large-scale spatial dependence, while the other accounts for short-scale dependence. This approach eliminates the need to separately fit spatial models for the covariates. We apply this method to analyze two forms of crimes against women, namely rapes and dowry deaths, in Uttar Pradesh, India, exploring their relationship with socio-demographic covariates. To evaluate the performance of the new approach, we conduct extensive simulation studies under different spatial confounding scenarios. The results demonstrate that the proposed method provides reliable estimates of fixed effects and posterior correlations between different responses.

    High-dimensional order-free multivariate spatial disease mapping

    Despite the amount of research on disease mapping in recent years, the use of multivariate models for areal spatial data remains limited due to difficulties in implementation and computational burden. These problems are exacerbated when the number of small areas is very large. In this paper, we introduce an order-free multivariate scalable Bayesian modelling approach to smooth mortality (or incidence) risks of several diseases simultaneously. The proposal partitions the spatial domain into smaller subregions, fits multivariate models in each subdivision, and obtains the posterior distribution of the relative risks across the entire spatial domain. The approach also provides posterior correlations among the spatial patterns of the diseases in each partition, which are combined through a consensus Monte Carlo algorithm to obtain correlations for the whole study region. We implement the proposal using integrated nested Laplace approximations (INLA) in the R package bigDM and use it to jointly analyse colorectal, lung, and stomach cancer mortality data in Spanish municipalities. The new proposal permits the analysis of big data sets and provides better results than fitting a single multivariate model.
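
    The combination step described in this abstract can be illustrated with a minimal Python sketch of consensus Monte Carlo for a single scalar quantity. It assumes that every partition supplies the same number of posterior draws and that each partition is weighted by the inverse of its sample variance. The function name `consensus_draws` and the toy numbers are hypothetical, and this is not the bigDM implementation.

```python
from statistics import variance


def consensus_draws(partition_draws):
    """Combine per-partition posterior draws into consensus draws.

    partition_draws: list of equal-length lists, one list of draws of the
    same scalar quantity per partition (an assumed input format).
    """
    n_draws = len(partition_draws[0])
    if any(len(d) != n_draws for d in partition_draws):
        raise ValueError("all partitions must supply the same number of draws")

    # Inverse-variance weights: more precise partitions count more.
    weights = [1.0 / variance(d) for d in partition_draws]
    total = sum(weights)

    # The g-th consensus draw is the weighted average of the g-th draws.
    return [
        sum(w * d[g] for w, d in zip(weights, partition_draws)) / total
        for g in range(n_draws)
    ]


# Toy draws from two partitions (made-up numbers, for illustration only).
draws_a = [0.10, 0.20, 0.30]
draws_b = [0.30, 0.50, 0.70]
combined = consensus_draws([draws_a, draws_b])
```

    The sketch requires at least two draws per partition with non-zero spread, since the weights are reciprocal sample variances.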

    Estimating unemployment in very small areas

    In the last few years, European countries have shown a deep interest in applying small area techniques to produce reliable estimates at county level. However, the specificity of every European country and the heterogeneity of the available auxiliary information make the use of a common methodology a very difficult task. In this study, the performance of several design-based, model-assisted, and model-based estimators using different auxiliary information for estimating unemployment at small area level is analyzed. The results are illustrated with data from Navarre, an autonomous region located in the north of Spain and divided into seven small areas. After discussing the pros and cons of the different alternatives, a composite estimator is chosen because of its good trade-off between bias and variance. Several methods for estimating the prediction error of the proposed estimator are also provided.
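
    As a rough illustration of the idea behind a composite estimator, the Python sketch below forms a weighted average of a direct estimate (low bias, high variance) and a synthetic estimate (lower variance, possibly biased). The abstract does not state how the weight is chosen, so the sample-size rule `sample_size_weight`, the function names, and all numbers here are assumptions made for illustration.

```python
def composite_estimate(direct, synthetic, weight):
    """Return weight * direct + (1 - weight) * synthetic."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * direct + (1.0 - weight) * synthetic


def sample_size_weight(n_area, n_reference):
    """Hypothetical rule: trust the direct estimate more as the area
    sample size approaches a reference size, capped at 1."""
    return min(1.0, n_area / n_reference)


# Made-up inputs for one small area.
direct_rate = 0.12      # direct survey estimate of the unemployment rate
synthetic_rate = 0.08   # synthetic estimate borrowed from a larger region
w = sample_size_weight(n_area=50, n_reference=200)
estimate = composite_estimate(direct_rate, synthetic_rate, w)
```

    With these toy inputs the weight is 0.25, so the composite estimate lies closer to the synthetic value than to the direct one.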

    Induction of radiata pine somatic embryogenesis at high temperatures provokes a long-term decrease in DNA methylation/hydroxymethylation and differential expression of stress-related genes

    Based on the hypothesis that embryo development is a crucial stage for the formation of stable epigenetic marks that could modulate the behaviour of the resulting plants, in this study, radiata pine somatic embryogenesis was induced at high temperatures (23 °C, eight weeks, control; 40 °C, 4 h; 60 °C, 5 min) and the global methylation and hydroxymethylation levels of emerging embryonal masses and somatic plants were analysed using LC-ESI-MS/MS-MRM. In this context, the expression pattern of six genes previously described as stress mediators was studied throughout the embryogenic process up to plant level to assess whether the observed epigenetic changes could have provoked a sustained alteration of the transcriptome. Results indicated that the highest temperatures led to hypomethylation of both embryonal masses and somatic plants. Moreover, we detected for the first time in a pine species the presence of 5-hydroxymethylcytosine, and revealed its tissue specificity and potential involvement in heat-stress responses. Additionally, a heat shock protein-coding gene showed a down-regulation tendency along the process, most notably in embryonal masses at first subculture and in ex vitro somatic plants. Likewise, the transcripts of several proteins related to translation, oxidative stress response, and drought resilience were differentially expressed.

    Empirical Bayes and Fully Bayes procedures to detect high-risk areas in disease mapping

    Disease mapping studies have developed enormously in the last twenty years. Both Empirical Bayes (EB) and Fully Bayes (FB) approaches have been used for smoothing purposes. However, excessive smoothing might hinder the detection of true high-risk areas. Identifying these extreme regions while minimizing the misclassification of background (normal) areas, and thereby avoiding false alarms, is crucial in epidemiology. Bayesian decision rules, based on the posterior distribution of the relative risks, have been investigated for this task, but no similar studies have been conducted under the EB approach. Within this framework, second-order correct estimators of the MSE of the log-relative risk predictor can be used to build appropriate confidence intervals for the relative risks. Their ability to detect high-risk areas is investigated through a simulation study using the geographical structure of the well-known Scottish lip cancer data. Bayesian credibility intervals and decision rules, based on the posterior distribution of the relative risks, are also investigated to check whether any of the approaches outperforms the others when classifying high-risk regions. The conclusion is that Bayesian decision rules, which exploit the posterior distribution of the relative risks, are more powerful for detecting high-risk areas than EB confidence intervals, but no general rule can be defined as a global criterion to be routinely applied in every real setting.
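
    A Bayesian decision rule of the kind discussed in this abstract can be sketched in Python as follows: an area is flagged as high-risk when the posterior probability that its relative risk exceeds a reference value is above a cutoff. The reference value of 1, the cutoff of 0.8, the function names, and the posterior draws are all illustrative assumptions; the abstract itself stresses that no single rule suits every setting.

```python
def exceedance_probability(risk_draws, reference=1.0):
    """Share of posterior draws strictly above the reference risk."""
    return sum(r > reference for r in risk_draws) / len(risk_draws)


def flag_high_risk(draws_by_area, reference=1.0, cutoff=0.8):
    """Return the areas whose exceedance probability is above the cutoff."""
    return [
        area
        for area, draws in draws_by_area.items()
        if exceedance_probability(draws, reference) > cutoff
    ]


# Made-up posterior draws of the relative risk for two areas.
posterior = {
    "area_1": [1.4, 1.2, 1.6, 1.1, 1.3],  # every draw is above 1
    "area_2": [0.9, 1.1, 0.8, 1.0, 1.2],  # two of five draws are above 1
}
flagged = flag_high_risk(posterior)
```

    Raising the cutoff reduces false alarms at the cost of missing some true high-risk areas, which is the trade-off the study examines.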

    Follow-up tests and control sessions in the course Métodos Estadísticos de la Ingeniería (Statistical Methods for Engineering)

    An industrial engineer needs to know the scientific tools required to process information correctly, carry out statistical quality control, or run an industrial experiment. "Métodos Estadísticos de la Ingeniería" (Statistical Methods for Engineering) is a course focused on learning and practising statistical techniques, hence the breadth of its syllabus. The first part is devoted to exploratory data analysis, probability calculus, and the study of random variables; the second part focuses on statistical inference; and the third is oriented towards modelling. The course has a strong practical component, with several computer sessions using the free software R. Because the large number of students makes exhaustive individual monitoring impossible, in the 2007-2008 academic year we devised a work plan with follow-up tests and control sessions. During the course there are three control sessions in which six follow-up tests are assessed; each test consists of a set of exercises to be solved on the computer and a set of theoretical-practical exercises. In the control sessions, the instructor solves the exercises and the students correct and grade their own work. The instructor then collects the students' work to check that their corrections and grades are appropriate. The experience has been very satisfactory because it has kept students engaged and prevented them from dropping out of classes. The follow-up tests have the added value of giving students a clear idea of the level required in the course.

    A two-stage approach to estimate spatial and spatio-temporal disease risks in the presence of local discontinuities and clusters

    Disease risk maps for areal unit data are often estimated from Poisson mixed models with local spatial smoothing, for example by incorporating random effects with a conditional autoregressive prior distribution. However, one limitation is that local discontinuities in the spatial pattern are not usually modelled, leading to over-smoothing of the risk maps and a masking of clusters of hotspot or coldspot areas. In this paper, we propose a novel two-stage approach to estimate and map disease risk in the presence of such local discontinuities and clusters. We propose approaches in both spatial and spatio-temporal domains, where for the latter the clusters can either be fixed or allowed to vary over time. In the first stage, we apply an agglomerative hierarchical clustering algorithm to training data to provide sets of potential clusters, and in the second stage, a two-level spatial or spatio-temporal model is applied to each potential cluster configuration. The superiority of the proposed approach over a previous proposal is shown by simulation, and the methodology is applied to two important public health problems in Spain, namely stomach cancer mortality across Spain and brain cancer incidence in the Navarre and Basque Country regions.

    A BLUP Synthetic Versus an EBLUP Estimator: An Empirical Study of a Small Area Estimation Problem

    Model-based estimators are becoming very popular in statistical offices because governments require accurate estimates for small domains that were not planned when the study was designed, as their inclusion would have increased the cost of the study. The sample sizes in these domains are very small or even zero; consequently, traditional direct design-based estimators lead to unacceptably large standard errors. In this regard, model-based estimators that 'borrow information' from related areas by using auxiliary information are appropriate. This paper reviews, under the model-based approach, a BLUP synthetic and an EBLUP estimator. The goal is to obtain estimators of domain totals when there are several domains with very small sample sizes or without sampled units. We also provide detailed expressions of the mean squared error at different levels of aggregation. The results are illustrated with real data from the Basque Country Business Survey. Keywords: finite population, prediction theory, mixed models, mean squared error, business survey.