25 research outputs found

    Spatial regression with covariate measurement error: A semiparametric approach

    © 2016, The International Biometric Society. Spatial data have become increasingly common in epidemiology and public health research thanks to advances in GIS (Geographic Information Systems) technology. In health research, for example, it is common for epidemiologists to incorporate geographically indexed data into their studies. In practice, however, the spatially defined covariates are often measured with error. Naive estimators of regression coefficients are attenuated if measurement error is ignored. Moreover, the classical measurement error theory is inapplicable in the context of spatial modeling because of the presence of spatial correlation among the observations. We propose a semiparametric regression approach to obtain bias-corrected estimates of regression parameters and derive their large sample properties. We evaluate the performance of the proposed method through simulation studies and illustrate it using data on Ischemic Heart Disease (IHD). Both the simulations and the practical application demonstrate that the proposed method can be effective in practice.
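
The attenuation mentioned in the abstract above can be illustrated with a small simulation. This is only a sketch of the classical, non-spatial case with independent observations, which the abstract says does not carry over to spatially correlated data; it is not the authors' semiparametric method. The function names, sample size, and variances are assumptions chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0, seed=0):
    """Return (naive slope, reliability-corrected slope) for hypothetical data.

    The true covariate x is observed only as w = x + u, where u is
    independent measurement error with standard deviation sd_u.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]          # error-prone covariate
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]    # outcome depends on true x

    naive = ols_slope(w, y)
    # Classical attenuation factor: var(x) / (var(x) + var(u)).
    # It is treated as known here; in practice it must be estimated.
    reliability = sd_x ** 2 / (sd_x ** 2 + sd_u ** 2)
    corrected = naive / reliability
    return naive, corrected


if __name__ == "__main__":
    naive, corrected = simulate_attenuation()
    print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

With equal variances for the covariate and the error, the naive slope is pulled toward roughly half of the true coefficient, and dividing by the attenuation factor moves it back toward the true value.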

    Principal variable selection to explain grain yield variation in winter wheat from features extracted from UAV imagery

    Background: Automated phenotyping technologies are continually advancing the breeding process. However, collecting various secondary traits throughout the growing season and processing massive amounts of data still take considerable effort and time. Selecting a minimum number of secondary traits that have the maximum predictive power has the potential to reduce phenotyping efforts. The objective of this study was to select the principal features extracted from UAV imagery and the critical growth stages that contributed the most to explaining winter wheat grain yield. Five dates of multispectral images and seven dates of RGB images were collected by a UAV system during the spring growing season in 2018. Two classes of features (variables), totaling 172 variables, were extracted for each plot from the vegetation index and plant height maps, including pixel statistics and dynamic growth rates. A parametric algorithm, LASSO regression (the least absolute shrinkage and selection operator), and a non-parametric algorithm, random forest, were applied for variable selection. The regression coefficients estimated by LASSO and the permutation importance scores provided by random forest were used to determine the ten most important variables influencing grain yield from each algorithm. Results: Both selection algorithms assigned the highest importance score to the variables related to plant height around the grain filling stage. Some vegetation-index-related variables were also selected by the algorithms, mainly at early to mid growth stages and during senescence. Compared with yield prediction using all 172 variables derived from measured phenotypes, prediction using the selected variables performed comparably or even better. We also noticed that the prediction accuracy on the adapted NE lines (r = 0.58–0.81) was higher than on the other lines (r = 0.21–0.59) included in this study with different genetic backgrounds.
Conclusions: With the ultra-high-resolution plot imagery obtained by UAS-based phenotyping, we are now able to derive more features, such as the within-plot variation of plant height or vegetation indices rather than just a plot average, that are potentially very useful for breeding purposes. However, too many features or variables can be derived in this way. The promising results from this study suggest that a selected subset of those variables can achieve grain yield prediction accuracies comparable to those of the full set, while possibly allowing a better allocation of effort and resources for phenotypic data collection and processing.
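
The LASSO-based ranking described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses a plain coordinate-descent solver written from scratch, hypothetical toy data with 6 features instead of the 172 in the study, and an arbitrary penalty value. The helper names (`lasso_coordinate_descent`, `rank_selected`, `make_toy_data`) are made up for this example.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 (no intercept).

    X is a list of rows; features are assumed to be roughly centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual
            # (the residual with feature j's own contribution added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def rank_selected(beta):
    """Indices of nonzero coefficients, largest absolute value first."""
    kept = [j for j, b in enumerate(beta) if b != 0.0]
    return sorted(kept, key=lambda j: -abs(beta[j]))


def make_toy_data(n=200, p=6, seed=1):
    """Hypothetical data: only features 0 and 3 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    beta = lasso_coordinate_descent(X, y, alpha=0.3)
    print("selected features, most important first:", rank_selected(beta))
```

Taking the first ten entries of such a ranking would mirror the "ten most important variables" step in the abstract; the random forest permutation importance used alongside it is not shown here.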

    Environmental, Institutional, and Demographic Predictors of Environmental Literacy among Middle School Children

    Building environmental literacy (EL) in children and adolescents is critical to meeting current and emerging environmental challenges worldwide. Although environmental education (EE) efforts have begun to address this need, empirical research holistically evaluating drivers of EL is critical. This study begins to fill this gap with an examination of school-wide EE programs among middle schools in North Carolina, including the use of published EE curricula and time outdoors, while controlling for teacher education level and experience, student attributes (age, gender, and ethnicity), and school attributes (socio-economic status, student-teacher ratio, and locale). Our sample included an EE group selected from schools with registered school-wide EE programs, and a control group randomly selected from NC middle schools that were not registered as EE schools. Students were given an EL survey at the beginning and end of the spring 2012 semester. Use of published EE curricula, time outdoors, and having teachers with advanced degrees and mid-level teaching experience (between 3 and 5 years) were positively related to EL, whereas minority status (Hispanic and black) was negatively related to EL. Results suggest that school-wide EE programs were not associated with improved EL, but the use of published EE curricula paired with time outdoors represents a strategy that may improve all key components of student EL. Further, investments in teacher development and efforts to maintain enthusiasm for EE among teachers with more than 5 years of experience may help to boost student EL levels. Middle school represents a pivotal time for influencing EL, as improvement was slower among older students. Differences in EL levels based on gender suggest boys and girls may possess complementary skill sets when approaching environmental issues.
Our findings suggest ethnicity-related disparities in EL levels may be mitigated by time spent in nature, especially among black and Hispanic students.

    A comprehensive approach to haplotype-specific analysis by penalized likelihood

    Haplotypes can hold key information for understanding the role of candidate genes in disease etiology. However, standard haplotype analysis has not yet been able to fully reveal the information retained by haplotypes. In most analyses, haplotype inference focuses on relative effects compared with an arbitrarily chosen baseline haplotype. It does not depict the effect structure unless an additional inference procedure is used in a secondary post hoc analysis, and such analysis tends to lack power. In this study, we propose a penalized regression approach to systematically evaluate the pattern and structure of the haplotype effects. By specifying an L1 penalty on the pairwise differences of the haplotype effects, we present a model-based haplotype analysis to detect and characterize the haplotypic association signals. The proposed method avoids the need to choose a baseline haplotype; it simultaneously carries out the effect estimation and effect comparison of all haplotypes, and outputs the haplotype group structure based on their effect sizes. Finally, our penalty weights are theoretically designed to balance the likelihood and the penalty term in an appropriate manner. The proposed method can be used as a tool to comprehend candidate regions identified from a genome or chromosomal scan. Simulation studies show that the proposed method identifies the haplotype effect structure better than traditional haplotype association methods, demonstrating its informativeness and power.
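
The kind of objective this abstract describes, an L1 penalty on pairwise differences of haplotype effects, can be written generically as below. The notation is an assumption introduced here for illustration: $\ell(\beta)$ stands for the log-likelihood, $\beta_j$ for the effect of haplotype $j$ out of $m$ haplotypes, $\lambda$ for a tuning parameter, and $w_{jk}$ for the penalty weights. The abstract does not give the actual form of the weights, so they are left unspecified.

```latex
\hat{\beta}
  = \arg\min_{\beta}
    \Bigl\{ -\ell(\beta)
      + \lambda \sum_{1 \le j < k \le m} w_{jk}\, \lvert \beta_j - \beta_k \rvert
    \Bigr\}
```

Because the L1 penalty can shrink a difference $\beta_j - \beta_k$ exactly to zero, haplotypes whose estimated effects coincide fall into the same group, which is how a group structure can emerge without designating a baseline haplotype.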