
    Improving the Estimation of Site-Specific Effects and their Distribution in Multisite Trials

    In multisite trials, statistical goals often include obtaining individual site-specific treatment effects, determining their rankings, and examining their distribution across multiple sites. This paper explores two strategies for improving inferences related to site-specific effects: (a) semiparametric modeling of the prior distribution using Dirichlet process mixture (DPM) models to relax the normality assumption, and (b) using estimators other than the posterior mean, such as the constrained Bayes or triple-goal estimators, to summarize the posterior. We conduct a large-scale simulation study, calibrated to multisite trials common in education research, and then explore the conditions under which, and the degrees to which, these strategies and their combinations succeed or falter in limited-data environments. We find that the average reliability of within-site effect estimates is crucial for determining effective estimation strategies. In settings with low-to-moderate data informativeness, flexible DPM models perform no better than the simple parametric Gaussian model coupled with a posterior summary method tailored to a specific inferential goal. DPM models outperform Gaussian models only in select high-information settings, indicating considerable sensitivity to the level of cross-site information available in the data. We discuss the implications of our findings for balancing the trade-offs associated with shrinkage in the design and analysis of future multisite randomized experiments.
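
    The abstract contrasts posterior-mean shrinkage with alternative posterior summaries such as the constrained Bayes estimator. As a rough, self-contained sketch of that trade-off (not the paper's actual models or simulation design), the Python snippet below fits a simple normal-normal empirical Bayes model to hypothetical site-level estimates and applies one common constrained-Bayes rescaling that restores the ensemble spread; all data, parameter values, and function names are illustrative assumptions.

```python
import numpy as np

def eb_normal_shrinkage(y, se):
    """Empirical-Bayes normal-normal shrinkage for site-specific effects.

    y  : observed site-level effect estimates
    se : their standard errors (treated as known)
    Returns posterior means and variances under a Normal(mu, tau^2) prior,
    with mu and tau^2 estimated by simple moment-style formulas.
    """
    w = 1.0 / se**2
    mu_hat = np.sum(w * y) / np.sum(w)                       # precision-weighted grand mean
    tau2 = max(np.var(y, ddof=1) - np.mean(se**2), 1e-12)    # crude between-site variance
    post_var = 1.0 / (1.0 / se**2 + 1.0 / tau2)
    post_mean = post_var * (y / se**2 + mu_hat / tau2)
    return post_mean, post_var

def constrained_bayes(post_mean, post_var):
    """Constrained-Bayes rescaling: spread the posterior means back out so their
    sample variance matches the posterior expected variance of the true effects
    (ignoring posterior covariances, as a simplification)."""
    m_bar = post_mean.mean()
    dev2 = np.sum((post_mean - m_bar) ** 2)
    a = np.sqrt(1.0 + np.sum(post_var) / dev2)
    return m_bar + a * (post_mean - m_bar)

# toy illustration with 20 simulated sites
rng = np.random.default_rng(0)
theta = rng.normal(0.1, 0.2, size=20)         # true site effects
se = rng.uniform(0.05, 0.3, size=20)          # within-site standard errors
y = rng.normal(theta, se)                     # observed site estimates
pm, pv = eb_normal_shrinkage(y, se)
cb = constrained_bayes(pm, pv)
print(np.std(theta), np.std(pm), np.std(cb))  # posterior means under-disperse; CB restores spread
```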

    Hierarchical Bayesian Estimation of Small Area Means Using Complex Survey Data

    In survey data analysis, there are two main approaches, design-based and model-based, for making inferences about different characteristics of the population. A design-based approach tends to produce unreliable estimates for small geographical regions or cross-classified demographic regions due to the small sample sizes. Moreover, when no samples are available in those areas, a design-based method cannot be used. In the case of estimating population characteristics for a small area, model-based methods are used. They provide a flexible modeling approach that can incorporate relevant information from similar areas and external databases. To provide suitable estimates, many model-building techniques, both frequentist and Bayesian, have been developed, and when the model-based method makes explicit use of prior distributions on the hyperparameters, inference can be carried out in the Bayesian paradigm. For estimating small area proportions, mixed models are often used because of their flexibility in combining information from different sources and the tractability of their error sources. Mixed models fall into two broad classes, area-level and unit-level models, and the choice between them depends on the availability of information. Generally, estimation of small area proportions with the hierarchical Bayes (HB) method involves a transformation of the direct survey-weighted estimates that stabilizes the sampling variance. Additionally, it is commonly assumed that the survey-weighted proportion has a normal distribution with a known sampling variance.

    We find that these assumptions and application methods may introduce some complications. First, the transformation of direct estimates can introduce bias when they are back-transformed to obtain the original parameter of interest. Second, transformation of direct estimates can introduce additional measures of uncertainty. Third, certain commonly used transformation functions cannot be applied, such as a log transformation of a zero survey count. Fourth, applying fixed values for sampling variances may fail to capture the additional variability. Last, the assumption of normality for the model distribution may be inappropriate when the true parameter of interest lies near the extremes (near 0 or 1).

    To address these complications, we first expand the Fay-Herriot area-level model for estimating proportions so that it can model the survey-weighted proportions directly, without any transformation functions. Second, we introduce a logit function for the linking model, which is more appropriate for estimating proportions. Third, we model the sampling variance to capture the additional variability. Additionally, we develop a model that can be used for modeling the survey-weighted counts directly. We also explore a new benchmarking approach for the estimates: estimates are benchmarked when the aggregate of the estimates from the smaller regions matches that of the corresponding larger region. Benchmarking techniques involve a number of constraints, and our approach introduces a simple method that can be applied to complicated constraints where a traditional method may fail. Finally, we investigate the "triple-goal" estimation method, which can concurrently achieve three specific goals relatively well as an ensemble.
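
    To make the area-level modeling and benchmarking ideas concrete, here is a minimal sketch of a generic Gaussian Fay-Herriot-style empirical Bayes estimator followed by a simple ratio benchmarking step. It is not the dissertation's proposed models (which work with proportions, a logit linking function, and modeled sampling variances); the crude moment estimator of the model variance, the data, and the function names are illustrative assumptions.

```python
import numpy as np

def fay_herriot_eb(y, D, X):
    """Empirical-Bayes estimates under an area-level (Fay-Herriot-type) model.

    y : direct survey estimates for the small areas
    D : known sampling variances of those direct estimates
    X : area-level covariates, including an intercept column
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # crude moment estimate of the between-area model variance, truncated at 0
    A = max(np.sum(resid**2) / (n - p) - np.mean(D), 0.0)
    gamma = A / (A + D)                               # shrinkage weights
    return gamma * y + (1 - gamma) * (X @ beta)

def ratio_benchmark(theta, weights, target):
    """Scale small-area estimates so their weighted aggregate equals the
    direct estimate for the larger region (simple ratio benchmarking)."""
    return theta * target / np.sum(weights * theta)

# toy illustration with 15 hypothetical areas
rng = np.random.default_rng(3)
n = 15
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])
D = rng.uniform(0.001, 0.01, n)                       # sampling variances
true_theta = X @ np.array([0.3, 0.2]) + rng.normal(0, 0.05, n)
y = rng.normal(true_theta, np.sqrt(D))                # direct estimates
theta_eb = fay_herriot_eb(y, D, X)
pop_weights = rng.dirichlet(np.ones(n))               # area population shares
target = np.sum(pop_weights * y)                      # larger-region direct estimate
theta_bench = ratio_benchmark(theta_eb, pop_weights, target)
print(np.sum(pop_weights * theta_bench), target)      # aggregates now match
```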

    Dynamic modeling of web purchase behavior and e-mailing impact by Petri net

    In this article, the authors introduce Petri nets to model the dynamics of web site visits and purchase behaviors in the case of wish list systems. They describe web site activities and their transitions with probability distributions and model the sequential impact of influential factors through links that better explain web purchase behavior dynamics. The basic model, which analyzes site connections and purchases to explain visit and purchase behavior, performs better than a classical negative binomial regression model. To demonstrate its flexibility, the authors extend the wish list Petri net model to measure the impact of e-mailing intervals on visit frequency and purchase.
    Keywords: internet; wish list; e-mail; Petri net; dynamic model
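
    For readers unfamiliar with the formalism, the sketch below simulates a toy stochastic Petri net in Python in the spirit of the description above; the places ("visit", "wish_list", "purchase", "exit"), the transition structure, and the firing probabilities are invented for illustration and are not the authors' estimated model.

```python
import random

# Places hold tokens (a single visitor token here); an enabled transition consumes
# a token from its input place and deposits one in its output place, chosen at
# random according to the firing weights.
PLACES = {"visit": 1, "wish_list": 0, "purchase": 0, "exit": 0}

# (input place, output place, firing weight given the input place is marked)
TRANSITIONS = [
    ("visit", "wish_list", 0.4),
    ("visit", "exit", 0.6),
    ("wish_list", "purchase", 0.3),
    ("wish_list", "exit", 0.7),
]

def simulate(places, transitions, rng=None):
    """Fire enabled transitions until the visitor token reaches an absorbing place."""
    rng = rng or random.Random()
    marking = dict(places)
    while marking["purchase"] == 0 and marking["exit"] == 0:
        enabled = [t for t in transitions if marking[t[0]] > 0]
        src, dst, _ = rng.choices(enabled, weights=[t[2] for t in enabled], k=1)[0]
        marking[src] -= 1    # consume the token from the input place
        marking[dst] += 1    # deposit it in the output place
    return marking

# Monte Carlo estimate of the session-level purchase probability
runs = [simulate(PLACES, TRANSITIONS, random.Random(i)) for i in range(10_000)]
print(sum(m["purchase"] for m in runs) / len(runs))   # roughly 0.4 * 0.3 = 0.12
```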

    Ranking USRDS Provider-Specific SMRs from 1998-2001

    Provider profiling (ranking, "league tables") is prevalent in health services research. Similarly, comparing educational institutions and identifying differentially expressed genes depend on ranking. Effective ranking procedures must be structured by a hierarchical (Bayesian) model and guided by a ranking-specific loss function; however, even optimal methods can perform poorly, and estimates must be accompanied by uncertainty assessments. We use the 1998-2001 Standardized Mortality Ratio (SMR) data from the United States Renal Data System (USRDS) as a platform to identify issues and approaches. Our analyses extend Liu et al. (2004) by combining evidence over multiple years via an AR(1) model; by considering estimates that minimize errors in classifying providers above or below a percentile cutpoint in addition to those that minimize rank-based, squared-error loss; by considering ranks based on the posterior probability that a provider's SMR exceeds a threshold; by comparing these ranks to those produced by ranking MLEs and by ranking P-values associated with testing whether a provider's SMR = 1; by comparing results for a parametric and a non-parametric prior; and by reporting on a suite of uncertainty measures. Results show that MLE-based and hypothesis-test-based ranks are far from optimal; that uncertainty measures effectively calibrate performance; that in the USRDS context ranks based on single-year data perform poorly, but performance improves substantially when using the AR(1) model; and that ranks based on posterior probabilities of exceeding a properly chosen SMR threshold are essentially identical to those produced by minimizing classification loss. These findings highlight areas requiring additional research and the need to educate stakeholders on the uses and abuses of ranks, on their proper role in science and policy, and on the absolute necessity of accompanying estimated ranks with uncertainty assessments and ensuring that these uncertainties influence decisions.
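
    The loss-function-guided rank summaries the abstract describes can all be computed directly from posterior draws. The sketch below assumes a matrix of MCMC samples of provider SMRs is available; the toy posterior, cutpoint, and function names are illustrative assumptions rather than the USRDS analysis itself.

```python
import numpy as np
from scipy.stats import rankdata

def rank_summaries(draws, cut_pct=80, smr_threshold=1.0):
    """Posterior ranking summaries from MCMC draws of provider SMRs.

    draws : (n_draws, n_providers) array of posterior samples.
    Returns each provider's posterior mean rank (the basis of ranks that are
    optimal under squared-error loss on ranks), the posterior probability of
    being ranked above a percentile cutpoint, and P(SMR > threshold | data).
    """
    ranks = np.apply_along_axis(rankdata, 1, draws)    # rank 1 = lowest SMR in a draw
    mean_rank = ranks.mean(axis=0)
    cut = np.ceil(cut_pct / 100 * draws.shape[1])
    prob_above_cut = (ranks > cut).mean(axis=0)        # classification-loss summary
    prob_exceed = (draws > smr_threshold).mean(axis=0) # exceedance probability
    return mean_rank, prob_above_cut, prob_exceed

# toy posterior draws for 50 hypothetical providers
rng = np.random.default_rng(4)
true_smr = rng.gamma(shape=25, scale=0.04, size=50)    # SMRs centered near 1
draws = rng.normal(true_smr, 0.15, size=(4000, 50))
mean_rank, p_top, p_gt1 = rank_summaries(draws)
worst = np.argsort(-mean_rank)[:5]                     # providers ranked worst
print(worst, p_gt1[worst].round(2))
```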

    Statistical Issues in Assessing Hospital Performance

    From the Preface: The Centers for Medicare and Medicaid Services (CMS), through a subcontract with Yale New Haven Health Services Corporation, Center for Outcomes Research and Evaluation (YNHHSC/CORE), is supporting a committee appointed by the Committee of Presidents of Statistical Societies (COPSS) to address statistical issues identified by CMS and stakeholders about CMS's approach to modeling hospital quality based on outcomes. In the spring of 2011, with the direct support of YNHHSC/CORE, COPSS formed a committee comprised of one member from each of its constituent societies, a chair, and a staff member from the American Statistical Association, and held a preliminary meeting in April. In June, YNHHSC/CORE executed a subcontract with COPSS under its CMS contract to support the development of a White Paper on statistical modeling. Specifically, YNHHSC/CORE contracted with COPSS to "provide guidance on statistical approaches . . . when estimating performance metrics" and to "consider and discuss concerns commonly raised by stakeholders (hospitals, consumers, and insurers) about the use of hierarchical generalized linear models in profiling hospital quality." The committee convened in June and August of 2011, and exchanged a wide variety of materials. To ensure the committee's independence, YNHHSC/CORE did not comment on the white paper findings, and CMS pre-cleared COPSS' publication of an academic manuscript based on the White Paper.

    Bayesian ranking and selection methods using hierarchical mixture models in microarray studies.

    The main purpose of microarray studies is screening to identify differentially expressed genes as candidates for further investigation. Because of limited resources at this stage, prioritizing or ranking genes is a relevant statistical task in microarray studies. In this article, we develop three empirical Bayes methods for gene ranking on the basis of differential expression, using hierarchical mixture models. These methods are based on (i) minimizing mean squared errors of estimation for parameters, (ii) minimizing mean squared errors of estimation for ranks of parameters, and (iii) maximizing sensitivity in selecting prespecified numbers of differential genes with the largest effects. Our methods incorporate the mixture structure of differential and nondifferential components in empirical Bayes models to allow information borrowing across differential genes, with separation from nuisance, nondifferential genes. The accuracy of our ranking methods is compared with that of conventional methods through simulation studies. An application to a clinical study of breast cancer is provided.
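
    As a simplified stand-in for the empirical Bayes mixture idea (not the article's three proposed methods or its specific hierarchical mixture models), the sketch below fits a two-component mixture to hypothetical gene-level z-scores by EM and ranks genes by the posterior probability of differential expression; all data and names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def eb_mixture_ranking(z, n_iter=200):
    """Rank genes by the posterior probability of differential expression under a
    two-component empirical-Bayes mixture: a fixed N(0, 1) null component and a
    Gaussian non-null component whose weight, mean, and spread are fitted by EM."""
    p1, mu1, sd1 = 0.1, 2.0, 1.0                  # rough starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each gene comes from the non-null component
        f0 = norm.pdf(z, 0.0, 1.0)
        f1 = norm.pdf(z, mu1, sd1)
        w = p1 * f1 / ((1 - p1) * f0 + p1 * f1)
        # M-step: update the non-null weight, mean, and standard deviation
        p1 = w.mean()
        mu1 = np.sum(w * z) / np.sum(w)
        sd1 = np.sqrt(np.sum(w * (z - mu1) ** 2) / np.sum(w))
    return np.argsort(-w), w                      # genes ordered by P(differential | data)

# toy data: 1000 null genes plus 50 up-regulated ones
rng = np.random.default_rng(2)
z = np.concatenate([rng.normal(0, 1, 1000), rng.normal(3, 1, 50)])
top, post_prob = eb_mixture_ranking(z)
print(top[:10], post_prob[top[:10]].round(3))
```

    Ranking by this posterior probability is the generic "select the most likely differential genes" criterion; loss functions aimed at estimation error for parameters or ranks, as in the article, would summarize the same posterior differently.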