11,196 research outputs found

    Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates

    We study semiparametric varying-coefficient partially linear models when some linear covariates are not observed, but ancillary variables are available. Semiparametric profile least-squares-based estimation procedures are developed for the parametric and nonparametric components after we calibrate the error-prone covariates. Asymptotic properties of the proposed estimators are established. We also propose a profile least-squares-based ratio test and a Wald test to identify significant parametric and nonparametric components. To improve the accuracy of the proposed tests for small or moderate sample sizes, a wild bootstrap version is also proposed to calculate the critical values. Intensive simulation experiments are conducted to illustrate the proposed approaches. Comment: Published at http://dx.doi.org/10.1214/07-AOS561 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Towards greater accuracy in individual-tree mortality regression

    Background mortality is an essential component of any forest growth and yield model. Forecasts of mortality contribute largely to the variability and accuracy of model predictions at the tree, stand and forest level. In the present study, I implement and evaluate state-of-the-art techniques to increase the accuracy of individual tree mortality models, similar to those used in many of the current variants of the Forest Vegetation Simulator, using data from North Idaho and Montana. The first technique addresses methods to correct for bias induced by measurement error typically present in competition variables. The second implements survival regression and evaluates its performance against the traditional logistic regression approach. I selected the regression calibration (RC) algorithm as a good candidate for addressing the measurement error problem. Two logistic regression models for each species were fitted, one ignoring the measurement error, which is the “naïve” approach, and the other applying RC. The models fitted with RC outperformed the naïve models in terms of discrimination when the competition variable was found to be statistically significant. The effect of RC was more obvious where measurement error variance was large and for more shade-intolerant species. The process of model fitting and variable selection revealed that past emphasis on DBH as a predictor variable for mortality, while producing models with strong metrics of fit, may make models less generalizable. The evaluation of the error variance estimator developed by Stage and Wykoff (1998), which is core to the implementation of RC, under different spatial patterns and diameter distributions revealed that the Stage and Wykoff estimate notably overestimated the true variance in all simulated stands except those that are clustered. Results show a systematic bias even when all the assumptions made by the authors are guaranteed. I argue that this is the result of the Poisson-based estimate ignoring the overlapping area of potential plots around a tree. The effects of the variance estimate, especially in the application phase, justify future efforts to improve its accuracy. The second technique implemented and evaluated is a survival regression model that accounts for the time-dependent nature of variables, such as diameter and competition variables, and the interval-censored nature of data collected from remeasured plots. The performance of the model is compared with the traditional logistic regression model as a tool to predict individual tree mortality. Validation of both approaches shows that the survival regression approach discriminates better between dead and alive trees for all species. In conclusion, I showed that the proposed techniques do increase the accuracy of individual tree mortality models, and are a promising first step towards the next generation of background mortality models. I have also identified the next steps to undertake in order to advance mortality models further.

    Optimal model averaging estimation for partially linear models

    This article studies optimal model averaging for partially linear models with heteroscedasticity. A Mallows-type criterion is proposed to choose the weights. The resulting model averaging estimator is proved to be asymptotically optimal under some regularity conditions. Simulation experiments suggest that the proposed model averaging method is superior to other commonly used model selection and averaging methods. The proposed procedure is further applied to study Japan’s sovereign credit default swap spreads.

    Pipe failure prediction and impacts assessment in a water distribution network

    Water distribution networks (WDNs) aim to provide water with desirable quantity, quality and pressure to the consumers. However, in case of pipe failure, which is the cumulative effect of physical, operational and weather-related factors, the WDN might fail to meet these objectives. Rehabilitation and replacement of some components of WDNs, such as pipes, is a common practice to improve the condition of the network to provide an acceptable level of service. The overall aim of this thesis is to predict the pipe failure propensity over the long term, annually and in the short term, and to assess the impacts of a single pipe failure on the level of service. The long-term and annual predictions support effective capital investment, whereas the short-term predictions have an operational use, enabling the water utilities to adjust the daily allocation and planning of resources to accommodate a possible increase in pipe failure. The proposed methodology was applied to the cast iron (CI) pipes in a UK WDN. The long-term and annual predictions are made using a novel combination of Evolutionary Polynomial Regression (EPR) and K-means clustering. The inclusion of K-means improves the predictions’ accuracy by using a set of models instead of a single model. The long-term predictive models consider physical factors, while the annual predictions also include weather-related factors. The analysis is conducted on a group level assuming that pipes with similar properties have similar breakage patterns. Soil type is another aggregation criterion since soil properties are associated with the corrosion of metallic pipes. The short-term predictions are based on a novel Artificial Neural Network (ANN) model that predicts the variations above a predefined threshold in the number of failures in the following days. The ANN model uses only existing weather data to make predictions, which reduces their uncertainty. The cross-validation technique is used to derive a reliable estimate of the accuracy of the EPR and ANN models by guaranteeing that all observations are used for both training and testing, and each observation is used for testing only once. The impact of pipe failure is assessed considering its duration, the topology of the network, the geographic location of the failed pipe and the time. The performance indicators used are the ratio of unsupplied demand and the number of customers with partial or no supply. Two scenarios are examined assuming that the failure occurs when there is a peak in either pressure or demand. The pressure-deficient conditions are simulated by introducing a sequence of artificial elements to all the demand nodes with pressure below the required level. This thesis proposes a new combination of a group-based method for deriving the failure rate and an individual-pipe method for evaluating the impacts on the level of service. Their conjunction indicates the most critical pipes. The long-term approach improves the accuracy of predictions, particularly for the groups with very low or very high failure frequency, considering diameter, age and length. The annual predictions accurately predict the fluctuation of failure frequency and its peak during the examined period. The EPR models indicate a strong direct relationship between low temperatures and failure frequency. The short-term predictions interpret the intra-year variation of failure frequency, with most failures occurring during the coldest months. The exhaustive trials led to the conclusion that the use of four consecutive days as input and the following two days as output results in the highest accuracy. The analysis of the relative significance of each input variable indicates that the variables that capture the intensity of low temperatures are the most influential. The outputs of the impact assessment indicate that the failure of most of the pipes in both scenarios (i.e. peak in pressure and demand) would have low impacts (i.e. low ratio of unsupplied demand and small number of affected nodes). This can be explained by the fact that the examined network is a large real-life network, and a single failure of a distribution pipe is likely to cause pressure-deficient conditions in a small part of it, whereas performance elsewhere is mostly satisfactory. Furthermore, the complex structure of the WDN allows it to recover from local pipe failures, exploiting the topological redundancy provided by closed loops, so that the flow could reach a given demand node through alternative paths.

    Quantile Models with Endogeneity

    In this article, we review quantile models with endogeneity. We focus on models that achieve identification through the use of instrumental variables and discuss conditions under which partial and point identification are obtained. We discuss key conditions, which include monotonicity and full-rank-type conditions, in detail. In providing this review, we update the identification results of Chernozhukov & Hansen (2005). We illustrate the modeling assumptions through economically motivated examples. We also briefly review the literature on estimation and inference.

    Adult Age Differences and the Role of Cognitive Resources in Perceptual-Motor Skill Acquisition: Application of a Multilevel Negative Exponential Model

    The effects of advanced age and cognitive resources on the course of skill acquisition are unclear, and discrepancies among studies may reflect limitations of data analytic approaches. We applied a multilevel negative exponential model to skill acquisition data from 80 trials (four 20-trial blocks) of a pursuit rotor task administered to healthy adults (19-80 years old). The analyses conducted at the single-trial level indicated that the negative exponential function described performance well. Learning parameters correlated with measures of task-relevant cognitive resources on all blocks except the last and with age on all blocks after the second. Thus, age differences in motor skill acquisition may evolve in 2 phases: In the first, age differences are collinear with individual differences in task-relevant cognitive resources; in the second, age differences orthogonal to these resources emerge.

    Partisan Conflict and Income Inequality in the United States: A Nonparametric Causality-in-Quantiles Approach

    This paper examines the predictive power of partisan conflict for income inequality. Our study contributes to the existing literature by using the newly introduced nonparametric causality-in-quantiles testing approach to examine how political polarization in the United States affects several measures of income inequality and distribution over time. The study uses annual time-series data for the period 1917–2013. We find evidence in support of a dynamic causal relationship between partisan conflict and income inequality, except at the upper end of the quantiles. Our empirical findings suggest that a reduction in partisan conflict will lead to a reduction in our measures of income inequality, but only when inequality is not exceptionally high.

    Causal graphical models in systems genetics: A unified framework for joint inference of causal network and genetic architecture for correlated phenotypes

    Causal inference approaches in systems genetics exploit quantitative trait loci (QTL) genotypes to infer causal relationships among phenotypes. The genetic architecture of each phenotype may be complex, and poorly estimated genetic architectures may compromise the inference of causal relationships among phenotypes. Existing methods assume QTLs are known or inferred without regard to the phenotype network structure. In this paper we develop a QTL-driven phenotype network method (QTLnet) to jointly infer a causal phenotype network and associated genetic architecture for sets of correlated phenotypes. Randomization of alleles during meiosis and the unidirectional influence of genotype on phenotype allow the inference of QTLs causal to phenotypes. Causal relationships among phenotypes can be inferred using these QTL nodes, enabling us to distinguish among phenotype networks that would otherwise be distribution equivalent. We jointly model phenotypes and QTLs using homogeneous conditional Gaussian regression models, and we derive a graphical criterion for distribution equivalence. We validate the QTLnet approach in a simulation study. Finally, we illustrate with simulated data and a real example how QTLnet can be used to infer both direct and indirect effects of QTLs and phenotypes that co-map to a genomic region. Comment: Published at http://dx.doi.org/10.1214/09-AOAS288 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Real estate markets and bank distress

    We investigate the relationship between real estate markets and bank distress among German universal and specialized mortgage banks between 1995 and 2004. Higher house prices increase the value of collateral, which reduces banks' probabilities of distress (PDs). But higher prices at given rents may also indicate excessive expectations regarding the present value of real estate assets, which can increase PDs. Increasing price-to-rent ratios are positively related to PDs, and larger real estate exposures amplify this effect. Rising real estate price levels alone reduce bank PDs, but only for banks with large real estate market exposure. This suggests a positive, but relatively small 'collateral' effect for banks with more expertise in specialized mortgage lending. Likewise, lower price-to-rent ratios are estimated to reduce the riskiness of banks. The multilevel logit model used here further shows that real estate markets are regionally segmented and location-specific effects contribute significantly to predicted bank PDs. Keywords: real estate, distress, universal vs. specialized banks