
    Errors in Survey Based Quality Evaluation Variables in Efficiency Models of Primary Care Physicians

    Efficiency analyses in the health care sector are often criticised for not incorporating quality variables. The definition of quality of primary health care has many aspects, and it is inevitably also a question of the patients’ perception of the services received. This paper uses variables derived from patient evaluation surveys as measures of the quality of the production of health care services. It uses statistical tests to judge whether such measures have a significant impact on the use of resources in various Data Envelopment Analysis (DEA) models. As the use of survey data implies that the quality variables are measured with error, the assumptions underlying a DEA model are not strictly fulfilled. This paper focuses on ways of correcting for biases that might result from the violation of selected assumptions. Firstly, any selection bias in the patient mix of each physician is controlled for by regressing the patient evaluation responses on the patient characteristics. The corrected quality evaluation variables are entered as outputs in the DEA model, and model specification tests indicate that out of 25 different quality variables, only waiting time has a systematic impact on the efficiency results. Secondly, the effect on the efficiency estimates of the remaining sampling error in the patient sample for each physician is accounted for by constructing confidence intervals based on resampling. Finally, as an alternative approach to including the quality variables in the DEA model, a regression model finds different variables significant, but not always with a trade-off between quality and quantity.
    Keywords: DEA; Health economics; Quality; Patient evaluation; Efficiency; Errors in variables; Resampling; Bootstrap; Selection bias; Sampling error
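To make the DEA machinery concrete, here is a minimal sketch of an input-oriented CCR envelopment model solved as a linear program. This is an illustrative stand-in, not the authors' implementation: the paper's models include quality outputs and bootstrap confidence intervals, all data and names below are invented.

```python
# Minimal input-oriented CCR DEA model via linear programming (illustrative).
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, unit):
    """Input-oriented CCR efficiency of one decision-making unit.
    X: (n_units, n_inputs) inputs, Y: (n_units, n_outputs) outputs."""
    n, m = X.shape
    _, s = Y.shape
    # Decision vector: [theta, lambda_1 .. lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_ub, b_ub = [], []
    for i in range(m):   # inputs: sum_j lam_j x_ij <= theta * x_i,unit
        A_ub.append(np.concatenate(([-X[unit, i]], X[:, i])))
        b_ub.append(0.0)
    for r in range(s):   # outputs: sum_j lam_j y_rj >= y_r,unit
        A_ub.append(np.concatenate(([0.0], -Y[:, r])))
        b_ub.append(-Y[unit, r])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]

X = np.array([[1.0], [2.0]])   # one input (e.g. practice cost)
Y = np.array([[1.0], [1.0]])   # one output (e.g. quality-adjusted consultations)
effs = [ccr_efficiency(X, Y, u) for u in range(2)]
```

The second unit uses twice the input for the same output, so its efficiency score is 0.5; resampling the patient surveys and re-solving this LP is, in spirit, how the paper's bootstrap confidence intervals arise.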

    Nonparametric Bayes inference on conditional independence

    In broad applications, it is routinely of interest to assess whether there is evidence in the data to refute the assumption of conditional independence of Y and X conditionally on Z. Such tests are well developed in parametric models but are not straightforward in the nonparametric case. We propose a general Bayesian approach, which relies on an encompassing nonparametric Bayes model for the joint distribution of Y, X and Z. The framework allows Y, X and Z to be random variables on arbitrary spaces, and can accommodate different dimensional vectors having a mixture of discrete and continuous measurement scales. Using conditional mutual information as a scalar summary of the strength of the conditional dependence relationship, we construct null and alternative hypotheses. We provide conditions under which the correct hypothesis will be consistently selected. Computational methods are developed, which can be incorporated within MCMC algorithms for the encompassing model. The methods are applied to variable selection and assessed through simulations and criminology applications.
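A plug-in frequency estimate of conditional mutual information I(Y; X | Z) for discrete data illustrates the scalar summary the paper builds its hypotheses around. This simple estimator is an illustrative stand-in for the paper's nonparametric Bayes model; the data are invented.

```python
# Plug-in estimate of conditional mutual information I(Y; X | Z)
# for discrete samples (illustrative, not the paper's Bayes model).
from collections import Counter
from math import log

def conditional_mutual_information(xs, ys, zs):
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz = Counter(zip(xs, zs))
    pyz = Counter(zip(ys, zs))
    pz = Counter(zs)
    cmi = 0.0
    for (x, y, z), c in pxyz.items():
        # I = sum_{x,y,z} p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]
        cmi += (c / n) * log((pz[z] * c) / (pxz[(x, z)] * pyz[(y, z)]))
    return cmi

# Y depends on X directly, so the CMI stays positive after conditioning on Z.
xs = [0, 1, 0, 1] * 50
ys = xs[:]               # Y == X
zs = [0, 0, 1, 1] * 50   # Z carries no information about X
print(conditional_mutual_information(xs, ys, zs))  # ≈ 0.693 (log 2)
```

When Y is instead a function of Z alone (e.g. `ys = zs`), the estimate is exactly zero here, matching the null hypothesis of conditional independence.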

    Testing exogeneity in cross-section regression by sorting data

    We introduce a framework to test for exogeneity of a variable in a regression based on cross-sectional data. By sorting data with respect to a function (sorting score) of known exogenous variables, it is possible to utilize a battery of tools originally developed for detecting model misspecification in a time series context. Thus, we are able to propose graphical tools for the identification of endogeneity, as well as formal tests, including a simple-to-use Chow test, needing a minimum of assumptions on the alternative endogeneity hypothesis. Models of endogenous treatment and selectivity are utilized to illustrate the methods. With Monte Carlo experiments, including continuous and discrete response cases, we compare small sample performances with existing tests for exogeneity.
    Keywords: Chow test; Endogenous treatment; Propensity score; Recursive residuals; Sample selection; Sorting score
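A hedged sketch of the core idea: sort the sample by a sorting score built from known exogenous variables, split it, and apply a Chow test for parameter stability; instability along the score is a symptom of endogeneity. The data-generating process and split point below are invented for illustration.

```python
# Chow test on a sample sorted by a "sorting score" (illustrative sketch).
import numpy as np

def chow_stat(X, y, split):
    """Chow F statistic comparing pooled OLS with a two-regime fit."""
    def ssr(Xs, ys):
        beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        resid = ys - Xs @ beta
        return resid @ resid
    n, k = X.shape
    s_pool = ssr(X, y)
    s1 = ssr(X[:split], y[:split])
    s2 = ssr(X[split:], y[split:])
    return ((s_pool - s1 - s2) / k) / ((s1 + s2) / (n - 2 * k))

rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=n)                    # known exogenous sorting score
x = z + rng.normal(size=n)               # regressor of interest
y = 1.0 + 2.0 * x + rng.normal(size=n)   # exogenous case: stable coefficients
order = np.argsort(z)                    # sort the sample by the score
X = np.column_stack([np.ones(n), x[order]])
F = chow_stat(X, y[order], n // 2)       # near 1 under stability
```

Under exogeneity the coefficients are stable across the sorted halves and F behaves like an F(k, n − 2k) draw; under endogenous treatment or selection, the coefficients drift with the score and F inflates.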

    SNIFFER WFD119: Enhancement of the River Invertebrate Classification Tool (RICT)

    EXECUTIVE SUMMARY
    Project funders/partners: Environment Agency (EA), Northern Ireland Environment Agency (NIEA), Scotland & Northern Ireland Forum for Environmental Research (SNIFFER), Scottish Environment Protection Agency (SEPA)
    Background to research: The regulatory agencies in the UK (the Environment Agency, the Scottish Environment Protection Agency and the Northern Ireland Environment Agency) now use the River Invertebrate Classification Tool (RICT) to classify the ecological quality of rivers for Water Framework Directive (WFD) compliance monitoring. RICT incorporates the RIVPACS IV predictive models and is a highly capable tool written in a modern software programming language. While RICT classifies waters for general degradation and organic pollution stress, producing assessments of status class and uncertainty, WFD compliance monitoring also requires the UK agencies to assess the impacts of a wide range of pressures, including hydromorphological and acidification stresses. Some of these pressures alter the predictor variables that the current RIVPACS models use to derive predicted biotic indices. This project has therefore sought to broaden the scope of RICT by developing one or more RIVPACS models that do not use predictor variables affected by these stressors, but instead use alternative GIS-based variables that are wholly independent of these pressures. The project has also included a review of the wide range of biotic indices now available in RICT, identifying published sources, examining index performance and, where necessary, making recommendations on further needs for index testing and development.
    Objectives of research:
    • To remove predictor variables affected by stressors and derive alternatives, with particular emphasis on hydrological/acidification metric predictors.
    • To construct one or more new RIVPACS model(s) using stressor-independent variables.
    • To review WFD reporting indices, notably AWIC (species), LIFE (species), PSI and WHPT.
    Key findings and recommendations
    Predictor variables and intellectual property rights: An extensive suite of new variables has been derived by GIS for the RIVPACS reference sites and shown to act as stressor-independent predictor variables. These include measures of stream order, solid and drift geology, and a range of upstream catchment characteristics (e.g. catchment area, mean altitude of upstream catchment, and catchment aspect). It is recommended that decisions are reached on which of the newly derived models are implemented in RICT so that IPR issues for the relevant datasets can be quickly resolved and the datasets licensed. It is also recommended that licensing is sought for a point-and-click system (where the dataset cannot be reverse-engineered) capable of calculating any of the time-invariant RIVPACS environmental predictor variables used by any of the newly derived (and existing) RIVPACS models, for any potential user.
    New stressor-independent RIVPACS models: Using the existing predictor variables, together with new ones derived for their stressor-independence, initial step-wise forward-selection discriminant models suggested 36 possible models that merited further testing. Following that testing, the following models are recommended for assessing watercourses affected by flow/hydromorphological and/or acidity stress:
    • For flow/hydromorphological stressors that may have modified width, depth and/or substrate in GB, a new ‘RIVPACS IV – Hydromorphology Independent’ model (Model 24) is suggested (this does not use the predictor variables width, depth and substratum, but includes a suite of new stressor-independent variables).
    • For acidity-related stressors in GB, a new ‘RIVPACS IV – Alkalinity Independent’ model (Model 35) is suggested (this does not use the predictor variable alkalinity, but includes new stressor-independent variables).
    • For combined flow/hydromorphological and acidity-related stressors in GB, a new ‘RIVPACS IV – Hydromorphology & Alkalinity Independent’ model (Model 13) is suggested (this does not use the predictor variables width, depth, substratum and alkalinity, but includes a suite of new stressor-independent variables).
    • Reduced availability of appropriate GIS tools at this time has meant that no new models have been developed for Northern Ireland.
    Discriminant functions and end group means have now been calculated to enable any of these models to be easily implemented in the RICT software.
    Biotic indices: The RIVPACS models in RICT can now produce expected values for a wide range of biotic indices addressing a variety of stressors. These indices will support the use of RICT as a primary tool for WFD classification and reporting of the quality of UK streams and rivers. There are, however, a number of outstanding issues with the indices that need to be addressed:
    • There is a need to develop a biotic index for assessing metal pollution.
    • WFD EQR banding schemes are required for many of the indices to report what is considered an acceptable degree of stress (High–Good) and what is not (Moderate, Poor or Bad).
    • A comprehensive objective testing process needs to be undertaken on the indices in RICT using UK-wide, large-scale, independent test datasets to quantify their index-stressor relationships and associated uncertainty, for example following the approach to acidity index testing in Murphy et al. (in review) or to organic/general degradation indices in Banks & McFarland (2010).
    • Following objective testing, the UK agencies should address any index under-performance issues identified; where necessary, new work should be commissioned to modify existing indices, or to develop new ones, so that indices for all stress types meet minimum performance criteria.
    • Testing needs to examine index-stressor relationships with both observed index scores and RIVPACS observed/expected ratios. Work should also compare the existing RIVPACS IV models and the new stressor-independent models (developed in this project) as alternative sources of the expected index values for these tests.
    • Consideration should be given to assessing the extent to which chemical and biological monitoring points co-occur. Site-matched (rather than reach-matched) chemical and biological monitoring points would i) generate the substantial training datasets needed to refine or develop new indices and ii) generate the independent datasets needed for testing.
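The prediction step described above can be caricatured in a few lines: discriminant scores give each site a probability of membership in each reference end group, and the expected biotic index is the membership-weighted mean of the end-group index means. The nearest-centroid "discriminant", the softmax weighting, and every number below are invented for illustration and are not the RIVPACS coefficients.

```python
# Toy RIVPACS-style expected index: membership-weighted end-group means.
import numpy as np

def expected_index(site, centroids, group_index_means):
    # Squared distance of the site to each end-group centroid in predictor space
    d2 = ((centroids - site) ** 2).sum(axis=1)
    # Soft membership probabilities (stand-in for the discriminant functions)
    w = np.exp(-d2)
    w /= w.sum()
    return w @ group_index_means

# Two end groups described by two stressor-independent predictors
# (say, log catchment area and mean upstream altitude; values invented)
centroids = np.array([[2.0, 100.0], [4.0, 300.0]])
group_index_means = np.array([5.8, 7.2])   # hypothetical end-group index means
site = np.array([2.0, 100.0])              # site sits on group 1's centroid
print(expected_index(site, centroids, group_index_means))  # → 5.8
```

Dividing an observed index score by such an expected value gives the observed/expected ratio the recommendations above refer to.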


    ABSTRACT: Two models are said to be non-nested if neither can be derived as a special case of the other. Much attention in classical statistics has been devoted to testing non-nested regression models. Within the classical framework, there are three alternative general approaches to testing non-nested models: the use of specification error tests; the use of the comprehensive model method; and the use of procedures based upon …
    Keywords: Non-Nested Model, Studentized Residuals
    INTRODUCTION
    The selection of a good model is an art. The basic question in statistics is how to select a good model for the purpose of the study. Once a model is given, however, there are statistical criteria for judging whether it is bad. Since many models can explain the same set of data about equally well, a given set of data can be used to screen out bad models but not to generate good ones, whatever statistical techniques are used. The subject of model selection is treated in classical statistics, which deals with the two topics of estimation and hypothesis testing. The problem of determining an appropriate model based on a subset of the original set of variables has three basic ingredients: i) the computational technique used to provide the information for the analysis; ii) the criterion used to analyse the variables and select a subset, if that is appropriate; and iii) the estimation of coefficients in the final model. In model selection there are two important classes of problem: those arising from nested and those arising from non-nested model structures. Nested models arise when, for instance, two models are specified in such a way that one is a special case of the other; non-nested models arise when neither model follows as a special case of the other. Model selection is a problem of choice among competing models, and the choice of a model follows some preliminary data search.
    In the context of the linear model, this leads to the specification of the explanatory variables that appear most important on prior grounds. Often some explanatory variables of one model reappear in another, giving rise to nested models; often, again, neither of two models is a special case of the other, giving rise to non-nested models. In the process of choosing models, statisticians have developed a variety of diagnostic tests, classified into two categories: (i) tests of nested regression models and (ii) tests of non-nested regression models. If model I can be derived as a special case of model II, then model I is said to be nested within model II. Two models are non-nested if neither can be derived as a special case of the other.
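One classical member of the comprehensive-model family of non-nested procedures is the Davidson-MacKinnon J test: the fitted values of model II are added as an extra regressor to model I, and a significant coefficient on them is evidence against model I. A minimal numpy sketch on invented data:

```python
# Davidson-MacKinnon J test for two non-nested regressions (illustrative).
import numpy as np

def ols_fit(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x2 + rng.normal(scale=0.5, size=n)   # the truth uses x2

# Model I: y ~ x1   versus   Model II: y ~ x2 (non-nested in each other)
_, yhat2 = ols_fit(np.column_stack([np.ones(n), x2]), y)
XJ = np.column_stack([np.ones(n), x1, yhat2])          # model I + model II fits
beta, yhatJ = ols_fit(XJ, y)
resid = y - yhatJ
sigma2 = resid @ resid / (n - XJ.shape[1])
cov = sigma2 * np.linalg.inv(XJ.T @ XJ)
t_stat = beta[2] / np.sqrt(cov[2, 2])   # large |t| rejects model I
```

Here the data favour model II, so the coefficient on its fitted values is near one and the t statistic is large, rejecting model I; swapping the roles of the two models tests in the other direction.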

    Kernel based methods for accelerated failure time model with ultra-high dimensional data

    Abstract
    Background: Most genomic data have ultra-high dimensions with more than 10,000 genes (probes). Regularization methods with L1 and Lp penalties have been extensively studied in survival analysis with high-dimensional genomic data. However, when the sample size n ≪ m (the number of genes), directly identifying a small subset of genes from ultra-high (m > 10,000) dimensional data is time-consuming and not computationally efficient. In current microarray analysis, what people really do is select a couple of thousand (or hundred) genes using univariate analysis or statistical tests, and then apply a LASSO-type penalty to further reduce the number of disease-associated genes. This two-step procedure may introduce bias and inaccuracy and lead us to miss biologically important genes.
    Results: The accelerated failure time (AFT) model is a linear regression model and a useful alternative to the Cox model for survival analysis. In this paper, we propose a nonlinear kernel based AFT model and an efficient variable selection method with adaptive kernel ridge regression. Our proposed variable selection method is based on the kernel matrix and the dual problem with a much smaller n × n matrix. It is very efficient when the number of unknown variables (genes) is much larger than the number of samples. Moreover, the primal variables are explicitly updated and the sparsity in the solution is exploited.
    Conclusions: Our proposed methods can simultaneously identify survival-associated prognostic factors and predict survival outcomes with ultra-high dimensional genomic data. We have demonstrated the performance of our methods with both simulated and real data. The proposed method performs superbly in the limited computational studies conducted.
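The dual computation that makes this feasible when m ≫ n can be sketched in a few lines: kernel ridge regression needs only the n × n kernel matrix K = XXᵀ (a linear kernel here), never an m-dimensional problem. The survival-specific details (AFT loss, censoring, adaptive weights) are omitted, and all data are simulated.

```python
# Dual-form kernel ridge regression: n x n system even with m >> n features.
import numpy as np

def kernel_ridge_fit(X, y, lam=1.0):
    K = X @ X.T                       # n x n kernel matrix (linear kernel)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)  # dual variables
    return alpha

def kernel_ridge_predict(X_train, alpha, X_new):
    return (X_new @ X_train.T) @ alpha

rng = np.random.default_rng(0)
n, m = 30, 10_000                     # 30 samples, 10,000 "genes"
X = rng.normal(size=(n, m))
beta = np.zeros(m)
beta[:5] = 1.0                        # only 5 genes actually matter
y = X @ beta                          # stand-in for log survival times
alpha = kernel_ridge_fit(X, y, lam=1e-6)
pred = kernel_ridge_predict(X, alpha, X)   # fits y closely at tiny lambda
```

The primal coefficients are recoverable as Xᵀα when needed, which is the "explicitly updated primal variables" step the abstract alludes to.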

    Exact Post-Selection Inference for Sequential Regression Procedures

    We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection event that can be characterized as y falling into a polyhedral set. This framework allows us to derive conditional (post-selection) hypothesis tests at any step of forward stepwise or least angle regression, or any step along the lasso regularization path, because, as it turns out, selection events for these procedures can be expressed as polyhedral constraints on y. The p-values associated with these tests are exactly uniform under the null distribution, in finite samples, yielding exact type I error control. The tests can also be inverted to produce confidence intervals for appropriate underlying regression parameters. The R package "selectiveInference", freely available on the CRAN repository, implements the new inference tools described in this paper.
    Comment: 26 pages, 5 figures
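The key observation in miniature: the event "forward stepwise picks variable j* first, with sign s" is a polyhedron {y : Ay ≤ 0}. The sketch below builds A for the first step on toy data and checks that the observed y lies in its own selection polyhedron; the real machinery (implemented in the R package "selectiveInference") handles every step and the resulting truncated-Gaussian p-values.

```python
# The first forward-stepwise selection event as polyhedral constraints on y.
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 4
X = rng.normal(size=(n, p))
X /= np.linalg.norm(X, axis=0)       # unit-norm columns
y = X[:, 1] * 3.0 + rng.normal(size=n)

scores = X.T @ y
jstar = int(np.argmax(np.abs(scores)))   # first selected variable
s = np.sign(scores[jstar])               # its sign

# Event {argmax_j |x_j' y| = jstar with sign s} as linear inequalities:
#   +/- x_j' y <= s * x_jstar' y   for every j != jstar
rows = []
for j in range(p):
    if j != jstar:
        rows.append(X[:, j] - s * X[:, jstar])    #  x_j' y - s x_jstar' y <= 0
        rows.append(-X[:, j] - s * X[:, jstar])   # -x_j' y - s x_jstar' y <= 0
A = np.array(rows)
print(np.all(A @ y <= 1e-10))   # True: y lies in the selection polyhedron
```

Conditioning a Gaussian y on the event Ay ≤ 0 truncates it to this polyhedron, which is what yields the exactly uniform post-selection p-values described above.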

    Choosing the best model in the presence of zero trade: a fish product analysis

    The purpose of this paper is to test the hypothesis that food safety (chemical) standards act as barriers to international seafood imports. We use zero-accounting gravity models to test this hypothesis. The chemical standards on which we focus include the chloramphenicol required performance limit, the oxytetracycline maximum residue limit, the fluoroquinolones maximum residue limit, and the dichlorodiphenyltrichloroethane (DDT) pesticide residue limit. The study focuses on the three most important seafood markets: the European Union’s 15 members, Japan, and North America
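Zero trade flows are the technical problem here: they break the log-linear gravity model (log 0 is undefined). One standard zero-accounting device is Poisson pseudo-maximum-likelihood (PPML), which keeps zero flows in the sample; the sketch below is a minimal IRLS implementation on invented data, not the paper's own specification, and the "strict standard" dummy is a made-up stand-in for the residue-limit variables.

```python
# Poisson pseudo-maximum-likelihood gravity regression via IRLS (illustrative).
import numpy as np

def ppml(X, y, n_iter=50):
    """Poisson regression of trade flows y (zeros allowed) on X."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        W = mu                           # Poisson working weights
        z = X @ beta + (y - mu) / mu     # working response
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
    return beta

rng = np.random.default_rng(3)
n = 500
dist = rng.uniform(1, 5, size=n)         # log distance between partners
strict = rng.integers(0, 2, size=n)      # 1 if importer sets a strict limit
mu = np.exp(3.0 - 0.8 * dist - 0.5 * strict)
y = rng.poisson(mu).astype(float)        # trade flows, many exact zeros
X = np.column_stack([np.ones(n), dist, strict])
beta_hat = ppml(X, y)                    # recovers roughly (3, -0.8, -0.5)
```

A negative coefficient on the standards dummy is the pattern the paper's hypothesis predicts; its zero-accounting gravity models pursue the same question with richer structure for the zeros.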