155 research outputs found

    Improving estimation efficiency for regression with MNAR covariates

    For regression with covariates missing not at random where the missingness depends on the missing covariate values, complete‐case (CC) analysis leads to consistent estimation when the missingness is independent of the response given all covariates, but it may not have the desired level of efficiency. We propose a general empirical likelihood framework to improve estimation efficiency over the CC analysis. We expand on methods in Bartlett et al. (2014, Biostatistics 15, 719–730) and Xie and Zhang (2017, Int J Biostat 13, 1–20) that improve efficiency by modeling the missingness probability conditional on the response and fully observed covariates by allowing the possibility of modeling other data distribution‐related quantities. We also give guidelines on what quantities to model and demonstrate that our proposal has the potential to yield smaller biases than existing methods when the missingness probability model is incorrect. Simulation studies are presented, as well as an application to data collected from the US National Health and Nutrition Examination Survey. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/154274/1/biom13131-sup-0002-web_supp.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/154274/2/biom13131_am.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/154274/3/biom13131.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/154274/4/biom13131-sup-0003-supmat.pd

    Functional Coefficient Estimation with Both Categorical and Continuous Data

    We propose a local linear functional coefficient estimator that admits a mix of discrete and continuous data for stationary time series. Under weak conditions our estimator is asymptotically normally distributed. A small set of simulation studies is carried out to illustrate the finite sample performance of our estimator. As an application, we estimate a wage determination function that explicitly allows the return to education to depend on other variables. We find evidence of the complex interacting patterns among the regressors in the wage equation, such as increasing returns to education when experience is very low, high return to education for workers with several years of experience, and diminishing returns to education when experience is high. Compared with the commonly used parametric and semi-parametric methods, our estimator performs better in both goodness-of-fit and in yielding economically interesting interpretation.

    Long‐term trends in the distribution, abundance and impact of native “injurious” weeds

    Questions: How can we quantify changes in the distribution and abundance of injurious weed species (Senecio jacobaea, Cirsium vulgare, Cirsium arvense, Rumex obtusifolius, Rumex crispus and Urtica dioica), over long time periods at wide geographical scales? What impact do these species have on plant communities? To what extent are changes driven by anthropogenically induced drivers such as disturbance, eutrophication and management? Location: Great Britain. Methods: Data from national surveys were used to assess changes in the frequency and abundance of selected weed species between 1978 and 2007. This involved novel method development to create indices of change, and to relate changes in distribution and abundance of these species to plant community diversity and inferred changes in resource availability, disturbance and management. Results: Three of the six weed species became more widespread in GB over this period and all of them increased in abundance (in grasslands, arable habitats, roadsides and streamsides). Patterns were complex and varied by landscape context and habitat type. For most of the species, there were negative relationships between abundance, total plant species richness, grassland, wetland and woodland indicators. Each individual species responds to a different combination of anthropogenic drivers but disturbance, fertility and livestock management significantly influenced most species. Conclusions: The increase in frequency and abundance of weeds over decades has implications for landscape‐scale plant diversity, fodder yield and livestock health. This includes reductions in plant species richness, loss of valuable habitat specialists and homogenisation of vegetation communities. Increasing land‐use intensity, excessive nutrient input, overgrazing, sward damage, poaching and bare ground in fields and undermanagement or too frequent cutting on linear features may have led to increases in weeds. These weeds do have conservation value so we are not advocating eradication, rather co‐existence, without dominance. Land management policy needs to adapt to benefit biodiversity and agricultural productivity.

    Socioeconomic disparities in physical health among Aboriginal and Torres Strait Islander children in Western Australia

    Objective. Few empirical studies have specifically examined the relationship between socio-economic status (SES) and health in Indigenous populations of Australia. We sought to provide insights into the nature of this relationship by examining socio-economic disparities in physical health outcomes among Aboriginal and Torres Strait Islander children in Western Australia. Design. We used a diverse set of health and SES indicators from a representative survey conducted in 2000–2002 on the health and development of 5289 Indigenous children aged 0–17 years in Western Australia. Analysis was conducted using multivariate logistic regression within a multilevel framework. Results. After controlling for age and sex, we found statistically significant socio-economic disparities in health in almost half of the associations that were investigated, although the direction, shape and magnitude of associations differed. For ear infections, recurring chest infections and sensory function problems, the patterns were generally consistent with a positive socio-economic gradient where better health was associated with higher SES. The reverse pattern was found for asthma, accidents and injuries, and oral health problems, although this was primarily observed for area-level SES indicators. Conclusion. Conventional notions of social position and class have some influence on the physical health of Indigenous children, although the diversity of results implies that there are other ways of conceptualising and measuring SES that are important for Indigenous populations. We need to consider factors that relate specifically to Indigenous circumstances and culture in the past and present day, and give more thought to how we measure social position in the Indigenous community, to gain a better understanding of the pathways from SES to Indigenous child health.

    Access Control for Data Integration in Presence of Data Dependencies

    Defining access control policies in a data integration scenario is a challenging task. In such a scenario typically each source specifies its local access control policy and cannot anticipate data inferences that can arise when data is integrated at the mediator level. Inferences, e.g., using functional dependencies, can allow malicious users to obtain, at the mediator level, prohibited information by linking multiple queries and thus violating the local policies. In this paper, we propose a framework, i.e., a methodology and a set of algorithms, to prevent such violations. First, we use a graph-based approach to identify sets of queries, called violating transactions, and then we propose an approach to forbid the execution of those transactions by identifying additional access control rules that should be added to the mediator. We also state the complexity of the algorithms and discuss a set of experiments we conducted by using both real and synthetic datasets. Tests also confirm the complexity and upper bounds in worst-case scenarios of the proposed algorithms.

    Global sensitivity analysis of stochastic computer models with joint metamodels

    The global sensitivity analysis method used to quantify the influence of uncertain input variables on the variability in numerical model responses has already been applied to deterministic computer codes; deterministic means here that the same set of input variables always gives the same output value. This paper proposes a global sensitivity analysis methodology for stochastic computer codes, for which the result of each code run is itself random. The framework of the joint modeling of the mean and dispersion of heteroscedastic data is used. To deal with the complexity of computer experiment outputs, nonparametric joint models are discussed and a new Gaussian process-based joint model is proposed. The relevance of these models is analyzed based upon two case studies. Results show that the joint modeling approach yields accurate sensitivity index estimators even when heteroscedasticity is strong.

    Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants

    Prostate cancer is a polygenic disease with a large heritable component. A number of common, low-penetrance prostate cancer risk loci have been identified through GWAS. Here we apply the Bayesian multivariate variable selection algorithm JAM to fine-map 84 prostate cancer susceptibility loci, using summary data from a large European ancestry meta-analysis. We observe evidence for multiple independent signals at 12 regions and 99 risk signals overall. Only 15 original GWAS tag SNPs remain among the catalogue of candidate variants identified; the remainder are replaced by more likely candidates. Biological annotation of our credible set of variants indicates significant enrichment within promoter and enhancer elements, and transcription factor-binding sites, including AR, ERG and FOXA1. In 40 regions at least one variant is colocalised with an eQTL in prostate cancer tissue. The refined set of candidate variants substantially increases the proportion of familial relative risk explained by these known susceptibility regions, which highlights the importance of fine-mapping studies and has implications for clinical risk profiling. © 2018 The Author(s). Peer reviewed.

    Computational improvements to multi-scale geographically weighted regression

    No full text