7 research outputs found

    Unsupervised Machine Learning for Explainable Medicare Fraud Detection

    The US federal government spends more than a trillion dollars per year on health care, largely provided by private third parties and reimbursed by the government. A major concern in this system is overbilling, waste and fraud by providers, who face incentives to misreport on their claims in order to receive higher payments. In this paper, we develop novel machine learning tools to identify providers that overbill Medicare, the US federal health insurance program for elderly adults and the disabled. Using large-scale Medicare claims data, we identify patterns consistent with fraud or overbilling among inpatient hospitalizations. Our proposed approach for Medicare fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing reasoning and interpretable insights into the potentially suspicious behavior of the flagged providers. Data from the Department of Justice on providers facing anti-fraud lawsuits and several case studies validate our approach and findings both quantitatively and qualitatively. Comment: Working paper.

    Maimonides rule redux

    We use Maimonides' rule as an instrument for class size in large Israeli samples from 2002–2011. In contrast with Angrist and Lavy (1999), newer estimates show no evidence of class size effects. The new data also reveal enrollment manipulation near Maimonides cutoffs. A modified rule that uses birthdays to impute enrollment circumvents manipulation while still generating precisely estimated zeros. In both old and new data, Maimonides' rule is unrelated to socioeconomic characteristics conditional on a few controls. Enrollment manipulation therefore appears to be innocuous. We briefly discuss possible explanations for the disappearance of Israeli class size effects since the early 1990s.

    Structural Topic Models for Open-Ended Survey Responses

    Collection and especially analysis of open-ended survey responses are relatively rare in the discipline and when conducted are almost exclusively done through human coding. We present an alternative, semiautomated approach, the structural topic model (STM) (Roberts, Stewart, and Airoldi 2013; Roberts et al. 2013), that draws on recent developments in machine learning based analysis of textual data. A crucial contribution of the method is that it incorporates information about the document, such as the author's gender, political affiliation, and treatment assignment (if an experimental study). This article focuses on how the STM is helpful for survey researchers and experimentalists. The STM makes analyzing open-ended responses easier, more revealing, and capable of being used to estimate treatment effects. We illustrate these innovations with analysis of text from surveys and experiments.

    The economics of fraud and corruption

    No full text
    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Economics, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references. Fraud and corruption are serious issues which undermine the provision of public goods. This thesis consists of three papers which analyze the economics of fraud and the mechanisms by which it can be detected and averted. An introductory chapter presents an overview of the economic ideas surrounding these topics. In the first paper, I analyze a US federal law that incentivizes whistleblowers to litigate against fraud and misreporting committed against the Medicare program. I provide a theoretical framework for understanding the economic tradeoffs associated with privatized whistleblowing enforcement and then empirically analyze the deterrence effects of whistleblower lawsuits. In the second paper, conducted as joint research, we consider the incentives for misreported enrollment statistics in Israeli public school data and the way in which data manipulation undermines economic estimates of the returns to smaller class sizes. We provide evidence of enrollment manipulation and show that smaller class sizes have no effect on student achievement, overturning earlier literature. In the third paper, we develop a mechanism for detecting misreported financial data and apply it to reports from a World Bank project. Our results are consistent with strategic and profitable falsification of data, and our method matches the results of an audit conducted independently by the World Bank on the same project. By Jetson Leder-Luis. Ph.D., Massachusetts Institute of Technology, Department of Economics.

    Structural Topic Models for Open-Ended Survey Responses

    No full text
    Collection and especially analysis of open-ended survey responses are relatively rare in the discipline and when conducted are almost exclusively done through human coding. We present an alternative, semiautomated approach, the structural topic model (STM) (Roberts, Stewart, and Airoldi 2013; Roberts et al. 2013), that draws on recent developments in machine learning based analysis of textual data. A crucial contribution of the method is that it incorporates information about the document, such as the author's gender, political affiliation, and treatment assignment (if an experimental study). This article focuses on how the STM is helpful for survey researchers and experimentalists. The STM makes analyzing open-ended responses easier, more revealing, and capable of being used to estimate treatment effects. We illustrate these innovations with analysis of text from surveys and experiments.