97 research outputs found

    Comment: Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data

    Comment on ``Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data'' [arXiv:0804.2958]. Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/07-STS227C

    Generalized Boosted Models: A Guide to the gbm Package

    Boosting takes on various forms, with different programs using different loss functions, different base models, and different optimization schemes. The gbm package takes the approach described in […]
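The gbm package itself is written in R; as an illustration of the underlying idea rather than of the package's API, the following is a minimal pure-NumPy sketch of gradient boosting under squared-error loss, where the negative gradient is simply the residual and the base model is a depth-1 regression tree (stump). All data values and settings here are made up for illustration.

```python
import numpy as np

def fit_stump(x, r):
    """Depth-1 regression tree: pick the threshold split minimizing squared error."""
    best = (np.inf, None, r.mean(), r.mean())
    for thr in np.unique(x)[:-1]:
        left, right = r[x <= thr], r[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    return best[1:]  # (threshold, left-leaf value, right-leaf value)

def predict_stump(stump, x):
    thr, left_val, right_val = stump
    return np.where(x <= thr, left_val, right_val)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + rng.normal(0, 0.1, 200)   # toy regression target

shrinkage, n_rounds = 0.1, 100            # gbm calls shrinkage the learning rate
F = np.full_like(y, y.mean())             # initial constant prediction
stumps = []
for _ in range(n_rounds):
    resid = y - F                          # negative gradient of squared-error loss
    s = fit_stump(x, resid)
    stumps.append(s)
    F += shrinkage * predict_stump(s, x)   # small step along the fitted gradient

mse = np.mean((y - F) ** 2)                # training error shrinks as stumps accumulate
```

Swapping the residual computation for the negative gradient of another loss (e.g., the sign of the residual for absolute-error loss) recovers the other boosting variants the abstract alludes to.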

    Loss Function Based Ranking in Two-Stage, Hierarchical Models

    Several authors have studied the performance of optimal, squared error loss (SEL) estimated ranks. Though these are effective, in many applications interest focuses on identifying the relatively good (e.g., in the upper 10%) or relatively poor performers. We construct loss functions that address this goal and evaluate candidate rank estimates, some of which optimize specific loss functions. We study performance for a fully parametric hierarchical model with a Gaussian prior and Gaussian sampling distributions, evaluating performance for several loss functions. Results show that though SEL-optimal ranks and percentiles do not specifically focus on classifying with respect to a percentile cut point, they perform very well over a broad range of loss functions. We compare inferences produced by the candidate estimates using data from The Community Tracking Study.
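In the Gaussian–Gaussian hierarchical model the abstract describes, SEL-optimal ranks are the posterior mean ranks, which can be approximated by ranking units within each posterior draw and averaging. The sketch below assumes made-up values for the number of units and the variance components; it is an illustration of the estimator, not the paper's simulation design.

```python
import numpy as np

rng = np.random.default_rng(1)
K, tau2, sigma2 = 50, 1.0, 0.5              # units, prior variance, sampling variance
theta = rng.normal(0, np.sqrt(tau2), K)     # true unit-level effects
Y = rng.normal(theta, np.sqrt(sigma2))      # one observation per unit

# Gaussian-Gaussian conjugacy: theta_i | Y_i ~ N(B*Y_i, B*sigma2), B = tau2/(tau2+sigma2)
B = tau2 / (tau2 + sigma2)
post_mean, post_sd = B * Y, np.sqrt(B * sigma2)

# Monte Carlo over the joint posterior: rank each unit within each draw
draws = rng.normal(post_mean, post_sd, size=(4000, K))
ranks_per_draw = draws.argsort(axis=1).argsort(axis=1) + 1   # 1 = smallest

sel_ranks = ranks_per_draw.mean(axis=0)            # SEL-optimal (posterior mean) ranks
p_top10 = (ranks_per_draw > 0.9 * K).mean(axis=0)  # P(unit falls in the upper 10%)
```

Ranking units by `p_top10` and cutting at the percentile targets the classification-type losses the abstract constructs, rather than squared-error loss on the ranks.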

    Ranking USRDS Provider-Specific SMRs from 1998-2001

    Provider profiling (ranking, "league tables") is prevalent in health services research. Similarly, comparing educational institutions and identifying differentially expressed genes depend on ranking. Effective ranking procedures must be structured by a hierarchical (Bayesian) model and guided by a ranking-specific loss function; however, even optimal methods can perform poorly, and estimates must be accompanied by uncertainty assessments. We use the 1998-2001 Standardized Mortality Ratio (SMR) data from the United States Renal Data System (USRDS) as a platform to identify issues and approaches. Our analyses extend Liu et al. (2004) by combining evidence over multiple years via an AR(1) model; by considering estimates that minimize errors in classifying providers above or below a percentile cutpoint in addition to those that minimize rank-based, squared-error loss; by considering ranks based on the posterior probability that a provider's SMR exceeds a threshold; by comparing these ranks to those produced by ranking MLEs and by ranking P-values associated with testing whether a provider's SMR = 1; by comparing results for a parametric and a non-parametric prior; and by reporting on a suite of uncertainty measures. Results show that MLE-based and hypothesis-test-based ranks are far from optimal; that uncertainty measures effectively calibrate performance; that in the USRDS context ranks based on single-year data perform poorly, but performance improves substantially when using the AR(1) model; and that ranks based on posterior probabilities of exceeding a properly chosen SMR threshold are essentially identical to those produced by minimizing classification loss.
These findings highlight areas requiring additional research and the need to educate stakeholders on the uses and abuses of ranks; on their proper role in science and policy; and on the absolute necessity of accompanying estimated ranks with uncertainty assessments and ensuring that these uncertainties influence decisions.
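The exceedance-probability ranking the abstract compares against MLE ranking can be sketched in a single-year conjugate Gamma–Poisson model (an assumed simplification: the paper's analysis combines years via an AR(1) model and also uses a non-parametric prior). All counts, prior settings, and the threshold below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
K = 40
E = rng.uniform(5, 50, K)                           # expected deaths per provider
true_smr = rng.gamma(shape=4, scale=0.25, size=K)   # true SMRs, centered near 1
O = rng.poisson(E * true_smr)                       # observed deaths

# Conjugate Gamma(a, b) prior on each SMR_i: posterior is Gamma(a + O_i, rate b + E_i)
a, b = 4.0, 4.0                                     # prior mean a/b = 1
post = rng.gamma(a + O, 1.0 / (b + E), size=(5000, K))  # posterior draws of the SMRs

thresh = 1.25                                       # SMR threshold of interest
p_exceed = (post > thresh).mean(axis=0)             # P(SMR_i > 1.25 | data)

rank_by_exceed = (-p_exceed).argsort().argsort() + 1  # rank 1 = most likely to exceed
mle = O / E                                           # raw (MLE) SMRs, for comparison
```

Because `p_exceed` accounts for each provider's caseload `E_i`, small providers with extreme raw SMRs are shrunk toward the prior, which is why MLE-based ranks can differ sharply from these ranks.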

    Cardiff Model Toolkit: Community Guidance For Violence Prevention

    More than half of violent crime in the United States is not reported to law enforcement, according to the U.S. Department of Justice. That means cities and communities lack a complete understanding of where violence occurs, which limits the ability to develop successful solutions. The Cardiff Violence Prevention Model provides a way for communities to gain a clearer picture of where violence is occurring by combining and mapping both hospital and police data on violence. But more than just an approach to map and understand violence, the Cardiff Model provides a straightforward framework for hospitals, law enforcement agencies, public health agencies, community groups, and others interested in violence prevention to work together and develop collaborative violence prevention strategies.

    Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment

    Automated data-driven decision-making systems are increasingly being used to assist, or even replace, humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of disparate mistreatment for decision boundary-based classifiers, which can be easily incorporated into their formulation as convex-concave constraints. Experiments on synthetic as well as real-world datasets show that our methodology is effective at avoiding disparate mistreatment, often at a small cost in terms of accuracy.
    Comment: To appear in Proceedings of the 26th International World Wide Web Conference (WWW), 2017. Code available at: https://github.com/mbilalzafar/fair-classificatio
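Disparate mistreatment is measured on predictions, as gaps in group-conditional error rates. The sketch below only computes those gaps for two groups; it does not implement the paper's convex-concave training constraints, and the toy labels are made up to show the measure catching error asymmetry that overall accuracy hides.

```python
import numpy as np

def disparate_mistreatment(y_true, y_pred, group):
    """Absolute gaps in false-positive and false-negative rates between groups 0 and 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in (0, 1):
        m = group == g
        # FPR: share of true negatives predicted positive; FNR: share of true positives predicted negative
        rates[g] = (np.mean(y_pred[m & (y_true == 0)] == 1),
                    np.mean(y_pred[m & (y_true == 1)] == 0))
    return (abs(rates[0][0] - rates[1][0]),
            abs(rates[0][1] - rates[1][1]))

# Toy data: both groups have the same base rate and the same overall accuracy,
# but group 0's errors are false positives while group 1's are false negatives.
y_true = np.array([0, 0, 1, 1, 0, 0, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 1, 1, 1, 0, 0, 0, 1])
fpr_gap, fnr_gap = disparate_mistreatment(y_true, y_pred, group)  # both gaps are 0.5
```

A classifier satisfying the paper's criterion would drive both gaps toward zero, in contrast to disparate-impact measures that only compare rates of positive outcomes.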

    Remarkable resilience of forest structure and biodiversity following fire in the peri-urban bushland of Sydney, Australia

    In rapidly urbanizing areas, natural vegetation becomes fragmented, making conservation planning challenging, particularly as climate change accelerates fire risk. We studied urban forest fragments in two threatened eucalypt-dominated (scribbly gum woodland, SGW, and ironbark forest, IF) communities across ~2000 ha near Sydney, Australia, to evaluate effects of fire frequency (0–4 in last 25 years) and time since fire (0.5 to >25 years) on canopy structure, habitat quality and biodiversity (e.g., species richness). Airborne lidar was used to assess canopy height and density, and ground-based surveys of 148 (400 m²) plots measured leaf area index (LAI), plant species composition and habitat metrics such as litter cover and hollow-bearing trees. LAI, canopy density, litter, and microbiotic soil crust increased with time since fire in both communities, while tree and mistletoe cover increased in IF. Unexpectedly, plant species richness increased with fire frequency, owing to increased shrub richness which offset decreased tree richness in both communities. These findings indicate biodiversity and canopy structure are generally resilient to a range of times since fire and fire frequencies across this study area. Nevertheless, reduced arboreal habitat quality and subtle shifts in community composition of resprouters and obligate seeders signal early concern for a scenario of increasing fire frequency under climate change. Ongoing assessment of fire responses is needed to ensure that biodiversity, canopy structure and ecosystem function are maintained in the remaining fragments of urban forests under future climate change, which will likely drive hotter and more frequent fires.