3,481 research outputs found

    Model Assessment Tools for a Model False World

    A standard goal of model evaluation and selection is to find a model that approximates the truth well while being as parsimonious as possible. In this paper we emphasize the point of view that the models under consideration are almost always false, if viewed realistically, and so we should analyze model adequacy from that point of view. We investigate this issue in large samples by looking at a model credibility index, which is designed to serve as a one-number summary measure of model adequacy. We define the index to be the maximum sample size at which samples from the model and those from the true data generating mechanism are nearly indistinguishable. We use standard notions from hypothesis testing to make this definition precise. We use data subsampling to estimate the index. We show that the definition leads us to some new ways of viewing models as flawed but useful. The concept is an extension of the work of Davies [Statist. Neerlandica 49 (1995) 185--245]. Comment: Published at http://dx.doi.org/10.1214/09-STS302 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A method for determining venous contribution to BOLD contrast sensory activation

    While BOLD contrast reflects haemodynamic changes within capillaries serving neural tissue, it also has a venous component. Studies that have determined the relation of large blood vessels to the activation map indicate that veins are the source of the largest response, and the most delayed in time. It would be informative if the location of these large veins could be extracted from the properties of the functional responses, since vessels are not visible in BOLD contrast images. The present study describes a method for investigating whether measures taken from the functional response can reliably predict vein location, or at least be useful in down-weighting the venous contribution to the activation response, and illustrates this method using data from one subject. We combined fMRI at 3 Tesla with high-resolution anatomical imaging and MR venography to test whether the intrinsic properties of activation time courses corresponded to tissue type. Measures were taken from a gamma fit to the functional response. Mean magnitude showed a significant effect of tissue type (P veins ≈ grey matter > white matter. Mean delays displayed the same ranking across tissue types (P grey matter. However, measures for all tissue types were distributed across an overlapping range. A logistic regression model correctly discriminated 72% of the veins from grey matter in the absence of independent information of macroscopic vessels (ROC=0.72). Whilst tissue classification was not perfect for this subject, weighting the T contrast by the predicted probabilities materially reduced the venous component to the activation map

    Evaluating the success of a workplace health and wellbeing intervention using a small group of repeat-respondents from a large repeated cross-sectional survey

    The Healthy@Work intervention in the Tasmanian State Service was responsible for increased availability of and participation in health and wellbeing activities, but there was little evidence of improvement in health-related factors for this group of respondents over the three year period of this study. Changes in the health-related factors were expected outcomes of the intervention but a study duration of just three years is possibly too short to allow change to be manifest

    MODELING THE IMPACTS OF E-GOVERNMENT SERVICES ON CORRUPTION REDUCTION IN RWANDA: A CASE EVIDENCE FROM NYAMASHEKE DISTRICT, RWANDA

    This study, "Modeling the impacts of e-government services on corruption reduction in Rwanda: case evidence from Nyamasheke District, Rwanda", assessed the contribution of e-government service use to reducing corruption in the study area. Its objective was to explore the use of multinomial logistic regression (MLR) in modeling the impact of e-government services on corruption reduction status. The MLR model was fitted by maximum likelihood estimation to the collected data set to obtain parameter estimates describing the relationship between the explanatory and outcome variables, and to determine which explanatory variables contribute significantly to corruption reduction status in the study area. The study used both qualitative and quantitative approaches to collect data from 381 respondents drawn from a target population of 8041, with the sample size calculated using Slovin's formula. Data were collected using questionnaires and interview schedules and analyzed using SPSS-23. The results show that, of the eleven independent variables, explanatory variables such as age, income, ownership of the devices used to apply for local government services, and advice type were dropped from the set of explanatory variables that contribute significantly to the reduction of corruption in the study area. In the selected model, which fit the data well overall, the variables that contributed significantly to the outcome variable were education, e-government service use status, cost of accessing e-government services, and type of e-government service delivery.
The parameter estimates of the selected model showed that the variables that best predicted the probability of reducing corruption once e-government services are delivered online were education, e-government service use status, and type of e-government services delivered online, while the cost of accessing e-government services decreased the logit (and hence the probability) of reducing corruption. The main challenges faced by users of e-government services were the high cost of applying for these services and a lack of the skills needed to cope with the technology. Finally, the study recommended that local leaders in the study area strengthen the online system for delivering local services, educate people about the use of e-government services, since more educated people are more likely to use them, and reduce the cost of using e-government services when applying for local services, since cost was the only explanatory variable that decreased the logit of reducing corruption in the study area.
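
The sample size reported in this abstract can be checked with a short sketch of Slovin's formula, n = N / (1 + N e^2). The abstract does not state the margin of error, so the value e = 0.05 below is an assumption, and the function name `slovin_sample_size` is illustrative only.

```python
def slovin_sample_size(population: int, margin_of_error: float) -> float:
    """Slovin's formula: n = N / (1 + N * e^2)."""
    if population <= 0 or margin_of_error <= 0:
        raise ValueError("population and margin_of_error must be positive")
    return population / (1 + population * margin_of_error ** 2)


# N = 8041 comes from the abstract; e = 0.05 is an assumed margin of error.
n = slovin_sample_size(8041, 0.05)
print(round(n))
```

Under that assumed margin of error, the rounded result agrees with the 381 respondents reported in the abstract.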

    Pseudo-R2 Measures for Some Common Limited Dependent Variable Models

    A large number of different Pseudo-R2 measures for some common limited dependent variable models are surveyed. Measures include those based solely on the maximized likelihoods with and without the restriction that slope coefficients are zero, those which require further calculations based on parameter estimates of the coefficients and variances and those that are based solely on whether the qualitative predictions of the model are correct or not. The theme of the survey is that while there is no obvious criterion for choosing which Pseudo-R2 to use, if the estimation is in the context of an underlying latent dependent variable model, a case can be made for basing the choice on the strength of the numerical relationship to the OLS-R2 in the latent dependent variable. As such an OLS-R2 can be known in a Monte Carlo simulation, we summarize Monte Carlo results for some important latent dependent variable models (binary probit, ordinal probit and Tobit) and find that a Pseudo-R2 measure due to McKelvey and Zavoina scores consistently well under our criterion. We also very briefly discuss Pseudo-R2 measures for count data, for duration models and for prediction-realization tables
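
As a rough illustration of two of the families of measures this abstract describes, the sketch below computes a likelihood-based Pseudo-R2 (McFadden's form, chosen here as an assumed example of a measure built from the maximized likelihoods with and without slope coefficients) and the McKelvey and Zavoina measure, which estimates the share of latent-variable variance explained by the fitted linear predictor. The function names and inputs are hypothetical, and the error variances used (1 for probit, pi^2/3 for logit) are the standard latent-variable normalizations rather than anything stated in the abstract.

```python
import math
from statistics import pvariance


def mcfadden_r2(loglik_full: float, loglik_null: float) -> float:
    """1 - lnL(full model) / lnL(intercept-only model)."""
    return 1.0 - loglik_full / loglik_null


def mckelvey_zavoina_r2(linear_predictor: list[float], link: str = "probit") -> float:
    """Explained variance of the latent variable divided by its total variance."""
    # Error variance fixed by the usual identification: 1 (probit), pi^2/3 (logit).
    if link == "probit":
        error_variance = 1.0
    elif link == "logit":
        error_variance = math.pi ** 2 / 3
    else:
        raise ValueError("link must be 'probit' or 'logit'")
    explained = pvariance(linear_predictor)
    return explained / (explained + error_variance)


# Hypothetical log-likelihoods and fitted linear predictors, for illustration only.
print(mcfadden_r2(loglik_full=-80.0, loglik_null=-100.0))
print(mckelvey_zavoina_r2([-1.0, 1.0], link="probit"))
```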

    Investigating the ironwood tree (Casuarina equisetifolia) decline on Guam using applied multinomial modeling

    The ironwood tree (Casuarina equisetifolia), a protector of coastlines of the sub-tropical and tropical Western Pacific, is in decline on the island of Guam where aggressive data collection and efforts to mitigate the problem are underway. For each sampled tree the level of decline was measured on an ordinal scale consisting of five categories ranging from healthy to near dead. Several predictors were also measured including tree diameter, fire damage, typhoon damage, presence or absence of termites, presence or absence of basidiocarps, and various geographical or cultural factors. The five decline response levels can be viewed as categories of a multinomial distribution where the multinomial probability profile depends on the levels of these various predictors. Such data structure is well suited to a proportional odds model thereby leading to odds ratios involving cumulative probabilities which can be estimated and summarized using information from the predictor coefficient. Various modeling techniques were applied to address data set issues: reduced logistic models, spatial relationships of residuals using latitude and longitude coordinates, and correlation structure induced by the fact that trees were sampled in clusters at various sites. Among our findings, factors related to ironwood decline were found to be basidiocarps, termites, and level of human management
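
To make the proportional odds structure concrete, here is a minimal sketch assuming the common parametrization logit P(Y <= j | x) = theta_j - x'beta, with five ordered categories and therefore four thresholds. The threshold and coefficient values are hypothetical and are not estimates from the study; the single predictor stands in for something like termite presence.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_probabilities(thresholds, beta, x):
    """P(Y <= j | x) for j = 1..J-1 under logit P(Y <= j) = theta_j - x'beta."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    return [logistic(theta - eta) for theta in thresholds]


def category_probabilities(thresholds, beta, x):
    """Multinomial probability profile obtained by differencing cumulative probabilities."""
    cumulative = cumulative_probabilities(thresholds, beta, x) + [1.0]
    return [cumulative[0]] + [
        cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
    ]


# Hypothetical values: four increasing thresholds, one binary predictor.
thresholds = [-1.0, 0.0, 1.0, 2.0]
beta = [0.8]
print(category_probabilities(thresholds, beta, [0.0]))  # predictor absent
print(category_probabilities(thresholds, beta, [1.0]))  # predictor present
```

Under this parametrization the cumulative odds ratio between x = 1 and x = 0 equals exp(-beta) at every cut point, which is why a single predictor coefficient summarizes the effect across all five decline levels.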

    Modelling Road Work Zone Crashes’ Nature and Type of Person Involved Using Multinomial Logistic Regression

    The sustainable development goals “Good health and well-being” and “Sustainable cities and communities” of the United Nations and World Health Organization alert governments and researchers and raise awareness about road safety problems and the need to mitigate them. In Portugal, after the economic crisis of 2008–2013, a significant number of road assets require investment in maintenance and rehabilitation. The areas where these actions take place are called work zones. Considering the particularities of these areas, the proposed work aims to identify the main factors that affect the occurrence of work zone crashes. It uses the statistical technique of multinomial logistic regression, applied to official data on road crashes that occurred in mainland Portugal during the period 2010–2015. Usually, multinomial logistic regression models are developed for crash and injury severity. In this work, the feasibility of developing predictive models for crash nature (collision, run off road and running over pedestrians) and for type of person involved in the crash (driver, passenger and pedestrian), considering only one covariate (the number of persons involved in the crash), was studied. For the two predictive models obtained, the variables road environment (urban/rural), horizontal geometric design (straight/curve), pavement grip conditions (good/bad), heavy vehicle involvement, and injury severity (fatalities, serious and slight injuries) were identified as the preponderant factors in a universe of 230 investigated variables. Results point to an increase in work zone crash probability due to driver actions such as running straight and excessive speed for the prevailing conditions.
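
For readers unfamiliar with the technique, the following sketch shows how a multinomial logistic regression turns a covariate into category probabilities for a three-level outcome such as crash nature. The choice of reference category and all coefficient values are hypothetical and are not taken from the study; the only covariate is the number of persons involved, as in the abstract.

```python
import math


def multinomial_logit_probabilities(reference, coefficients, x):
    """Category probabilities when the reference category's linear predictor is fixed at 0.

    coefficients maps each non-reference category to (intercept, [slopes]).
    """
    scores = {reference: 0.0}
    for category, (intercept, slopes) in coefficients.items():
        scores[category] = intercept + sum(s * xi for s, xi in zip(slopes, x))
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical coefficients, for illustration only.
coefficients = {
    "run off road": (0.5, [-0.4]),
    "running over pedestrians": (-1.0, [0.3]),
}
probabilities = multinomial_logit_probabilities("collision", coefficients, [2.0])
print(probabilities)
```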

    Regression Methods for Categorical Dependent Variables: Effects on a Model of Student College Choice

    Thesis (Ph.D.) - Indiana University, School of Education, 2012. The use of categorical dependent variables with the classical linear regression model (CLRM) violates many of the model's assumptions and may result in biased estimates (Long, 1997; O'Connell, Goldstein, Rogers, & Peng, 2008). Many dependent variables of interest to educational researchers (e.g., professorial rank, educational attainment) are categorical in nature but are analyzed using the CLRM (Harwell & Gatti, 2001) even though alternate regression techniques for categorical dependent variables are recommended (Agresti, 1996; Long, 1997). Data obtained from ACT®, Inc., on 5,200 high school seniors in Illinois and Colorado were used to analyze effects of regression method on a model of ascriptive and academic influences on selectivity of postsecondary institution attended. The dependent variable was measured in rank-ordered categories based on self-reported institutional admissions policies and analyzed with classical linear, multinomial logistic, and ordered logistic regressions. Choice of regression method did not affect overall model performance as evidenced by significant F and Likelihood Ratio χ2 tests. The full CLRM fit the data moderately well (R2 = .391), surpassing some previous findings (Hearn, 1988, 1991; Davies & Guppy, 1997). McFadden's R2L measure of strength of association was larger in the multinomial regression than in the ordered regression (R2L = .191 vs. R2L = .158). The multinomial logistic method also correctly predicted dependent variable category with the greatest accuracy (46.3% correct), but Somers' Dyx measure of association was smallest for the multinomial model. Direction and significance of the relationship between predictors and the dependent variable were substantively consistent across the CLRM and logistic methods. In all regressions, ACT® score had the most impact on selectivity of institution attended.
Threshold values were significant, supporting the assumption of an ordered dependent variable. Due to the CLRM's theoretical and predictive shortcomings and the multinomial model's complexity in interpretation, ordered logistic regression was determined to be the most appropriate for explaining influences on selectivity of postsecondary institution attended

    Graphical diagnostics to check model misspecification for the proportional odds regression model

    The cumulative logit or the proportional odds regression model is commonly used to study covariate effects on ordinal responses. This paper provides some graphical and numerical methods for checking the adequacy of the proportional odds regression model. The methods focus on evaluating functional misspecification for specific covariate effects, but misspecification of the link function can also be dealt with under the same framework. For the logistic regression model with binary responses, Arbogast and Lin (Statist. Med. 2005; 24:229–247) developed similar graphical and numerical methods for assessing the adequacy of the model using the cumulative sums of residuals. The paper generalizes their methods to ordinal responses and illustrates them using an example from the VA Normative Aging Study. Simulation studies comparing the performance of the different diagnostic methods indicate that some of the graphical methods are more powerful in detecting model misspecification than the Hosmer–Lemeshow-type goodness-of-fit statistics for the class of models studied. Copyright © 2008 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/61528/1/3386_ftp.pd

    Relationship Between Health Literacy and End-Stage Renal Disease among Type II Diabetics

    The progression of End Stage Renal Disease (ESRD) among type II diabetics is preventable, yet complications continue to plague many. Reports show that 29.1 million people (9.3%) in the United States have diabetes, and 40% of those individuals develop ESRD. Four research questions explored the relationship between ESRD, health literacy, and healthcare. Data from 2010-2015 from the National Institutes of Health (NIH) were quantitatively analyzed. The conceptual framework was the revised health service utilization theory. The target population included 3939 diverse males and females between the ages of 20-75 diagnosed with type II diabetes. Results from Chi-square, cross-tabulation, binary, and multinomial logistic regression revealed that there is a statistically significant relationship between inadequate health literacy and ESRD (p < 0.05), inadequate health literacy and healthcare services (p < 0.05), and healthcare services and development of ESRD (p < .001). Findings exposed significant demographic co-factor differences. Males developed ESRD more than females, and African American and Hispanic populations were almost 2 times more likely than Caucasians to develop ESRD. As participants age, odds for developing ESRD increase about 2-3 times. Both race and education were significant predictors of inadequate health literacy. African Americans and Hispanics were 3 times more likely to have inadequate health literacy than Caucasian participants. Lower education increased the odds of having inadequate health literacy approximately 7.6 times. Results show that Caucasian participants had higher education levels and private health insurance, whereas African Americans and Hispanics had lower education and no insurance or Medicaid. Implications from this research show that social determinants among vulnerable populations are impacting an individual's health literacy and ability to adequately manage their health.
Evidence from this study generates social change through recognition that health literacy is fundamental when attempting to prevent chronic disease complications and promote positive health