
    Maximizing Insight from Modern Economic Analysis

    The last decade has seen a growing trend of economists exploring how to extract new economic insights from "big data" sources such as the Web. As economists move toward this model of analysis, their traditional workflow becomes infeasible: the amount of noisy data from which to draw insights presents data management challenges and limits their ability to discover meaningful information. Economists therefore need to invest a great deal of energy in training to be data scientists (a catch-all role that has grown to describe the use of statistics, data mining, and data management in the big data age), leaving little time for applying their domain knowledge to the problem at hand. We envision an ideal workflow that generates accurate and reliable results, where results are produced in near-interactive time and systems handle the "heavy lifting" required for working with big data. This dissertation presents several systems and methodologies that bring economists closer to this ideal workflow, helping them address many of the challenges faced in transitioning to big data sources like the Web. To help users generate accurate and reliable results, we present approaches to identifying relevant predictors in nowcasting applications, as well as methods for identifying potentially invalid nowcasting models and their inputs. We show how a streamlined workflow, combined with pruning and shared computation, can handle the heavy lifting of big data analysis, allowing users to generate results in near-interactive time. We also present a novel user model and architecture for helping users avoid undesirable bias during data preparation: users interactively define constraints for transformation code and for the data that the code produces, and an explain-and-repair system satisfies these constraints as best it can, providing an explanation for any problems along the way. Together, these systems represent a unified effort to streamline economists' transition to this new big data workflow.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144007/1/dol_1.pd
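    The data-preparation idea in this abstract (user-defined constraints plus explain-and-repair) can be pictured with a small sketch. Everything here, including the Constraint class, check_and_repair, and the sample data, is a hypothetical illustration rather than the dissertation's actual system; the real system repairs violations as best it can, while this toy version only reports and drops violating rows.

```python
# Minimal sketch: a user declares constraints on a transformation's
# output; the system explains each violation and applies a crude repair
# (dropping the offending rows). All names here are invented.
import pandas as pd

class Constraint:
    def __init__(self, name, predicate):
        self.name = name            # human-readable label used in explanations
        self.predicate = predicate  # row-wise boolean check

def check_and_repair(df, constraints):
    """Drop rows violating any constraint and explain each violation."""
    keep = pd.Series(True, index=df.index)
    for c in constraints:
        ok = df.apply(c.predicate, axis=1)
        for idx in df.index[~ok]:
            print(f"row {idx} violates '{c.name}': {df.loc[idx].to_dict()}")
        keep &= ok
    return df[keep]

raw = pd.DataFrame({"state": ["MI", "mi", None], "unemployment": [4.1, -2.0, 3.5]})
cleaned = check_and_repair(
    raw.assign(state=raw["state"].str.upper()),  # the transformation step
    [Constraint("state is present", lambda r: pd.notna(r["state"])),
     Constraint("rate is non-negative", lambda r: r["unemployment"] >= 0)],
)
print(cleaned)
```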

    MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry

    Fraud in healthcare insurance claims is a significant research challenge that affects the growth of healthcare services. Healthcare fraud is committed by subscribers, companies, and providers. A decision support system is developed to automate the processing of claim data from service providers and to offset the challenges faced by patients. In this paper, a novel hybridized big data and statistical machine learning technique, named the MapReduce-based iterative support vector machine (MR-ISVM), is proposed; it provides a set of sophisticated steps for the automatic detection of fraudulent claims in health insurance databases. The experimental results show that the MR-ISVM classifier outperforms other support vector machine (SVM) kernel classifiers in classification and detection. The results also show a positive impact: computational time for processing healthcare insurance claims declines without compromising classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear and 79.98% for the radial basis function kernels.
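    The MapReduce-iterative SVM idea (fit an SVM per data partition in the map phase, pool the resulting support vectors in the reduce phase, and repeat until the support set stabilizes) can be sketched roughly as below. This is a generic reconstruction run on synthetic data, not the paper's exact algorithm or its healthcare dataset.

```python
# Rough sketch of a MapReduce-style iterative SVM on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def mr_isvm(X, y, n_partitions=4, max_iter=5):
    idx = np.arange(len(X))
    for _ in range(max_iter):
        pooled = []
        for part in np.array_split(idx, n_partitions):   # "map" phase
            svm = SVC(kernel="linear").fit(X[part], y[part])
            pooled.append(part[svm.support_])            # keep support vectors
        new_idx = np.unique(np.concatenate(pooled))      # "reduce" phase
        if len(new_idx) == len(idx):                     # support set converged
            break
        idx = new_idx
    return SVC(kernel="linear").fit(X[idx], y[idx])      # final global model

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
model = mr_isvm(X, y)
print("training accuracy:", model.score(X, y))
```

    Restricting each round to the pooled support vectors is what drives the computational savings the abstract reports: the later iterations train on a much smaller set than the full claims data.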

    Optimal auditing with scoring: theory and application to insurance fraud

    This article makes a bridge between the theory of optimal auditing and the scoring methodology in an asymmetric information setting. Our application is meant for insurance claims fraud, but it can be applied to many other activities that use the scoring approach. Fraud signals are classified based on the degree to which they reveal an increasing probability of fraud. We show that the optimal auditing strategy takes the form of a "Red Flags Strategy", which consists in referring claims to a Special Investigative Unit (SIU) when certain fraud indicators are observed. The auditing policy acts as a deterrence device, and we explain why it requires the commitment of the insurer and how it should affect the incentives of SIU staff. The characterization of the optimal auditing strategy is robust to some degree of signal manipulation by defrauders, as well as to defrauders' imperfect information about the audit frequency. The model is calibrated with data from a large European insurance company. We show that it is possible to improve our results by separating groups of insureds with different moral costs of fraud. Finally, our results indicate how the deterrence effect of the audit scheme can be taken into account and how it affects the optimal auditing strategy.
    Keywords: audit, scoring, insurance fraud, red flags strategy, fraud indicators, suspicion index, moral cost of fraud, deterrence effect, signal manipulation
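    A toy version of the red-flags routing step might look like the following. The indicator names, weights, and SIU threshold are invented for illustration; in the paper, the optimal policy is derived from the insurer's commitment and the audit-cost/deterrence trade-off rather than hand-set like this.

```python
# Illustrative "Red Flags Strategy": weight observed fraud indicators
# into a suspicion index and refer high-scoring claims to the SIU.
from dataclasses import dataclass

# Hypothetical indicator weights (log-odds-style contributions).
WEIGHTS = {
    "claim_soon_after_policy_start": 1.2,
    "no_police_report": 0.8,
    "prior_suspicious_claims": 1.5,
}
SIU_THRESHOLD = 2.0  # calibrated elsewhere to the insurer's audit budget

@dataclass
class Claim:
    claim_id: str
    indicators: dict  # indicator name -> bool

def suspicion_index(claim):
    return sum(w for name, w in WEIGHTS.items() if claim.indicators.get(name))

def route(claims):
    """Refer red-flagged claims to the SIU; settle the rest routinely."""
    return [(c.claim_id, "SIU" if suspicion_index(c) >= SIU_THRESHOLD else "pay")
            for c in claims]

claims = [
    Claim("A-1", {"no_police_report": True}),
    Claim("A-2", {"claim_soon_after_policy_start": True,
                  "prior_suspicious_claims": True}),
]
print(route(claims))  # [('A-1', 'pay'), ('A-2', 'SIU')]
```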

    Deep Learning-based Detection of Health Insurance Abuse Using Medical Treatment Data

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Industrial Engineering, College of Engineering, August 2020. Sungzoon Cho.
    As global life expectancy increases, healthcare spending grows accordingly to improve quality of life. However, because medical care is expensive, the bare cost of healthcare services inevitably places a great financial burden on individuals and households. In this light, many countries have devised and established their own public healthcare insurance systems to help people receive medical services at a lower price. Since reimbursements are made ex post, unethical practices arise that exploit the post-payment structure of the insurance system. The archetypes of such behavior are overdiagnosis, the act of manipulating patients' diseases, and overtreatment, the prescription of unnecessary drugs. These abusive behaviors are considered one of the main sources of financial loss incurred in the healthcare system. In order to detect and prevent abuse, the national healthcare insurance hires medical professionals to manually examine whether a claim filing is medically legitimate. This review process, however, is unquestionably very costly and time-consuming. To address these limitations, data mining techniques have been employed to detect problematic claims or abusive providers showing abnormal billing patterns. These efforts, however, used only coarse-grained information such as claim-level or provider-level data, which may degrade model performance. In this thesis, we propose abuse detection methods using medical treatment data, the lowest-level information in a healthcare insurance claim. First, we propose a scoring model with which abusive providers are detected, and we show that the review process with the proposed model is more efficient than with the previous model, which uses provider-level variables as inputs. At the same time, we devise evaluation metrics to quantify the efficiency of the review process. Second, we propose a method for detecting overtreatment under seasonality, which makes the model more realistic. The model embodies multiple structures specific to the DRG codes selected as important for each department. We show that the proposed method is more robust to seasonality than the previous method. Third, we propose an overtreatment detection model that accounts for heterogeneous treatment between practitioners, using a network-based approach through which the relationship between diseases and treatments is considered during the detection process. Experimental results show that the proposed method correctly classifies treatments that do not explicitly appear in the training set. From these works, we show that using treatment data allows abuse detection to be modeled at the treatment, claim, and provider levels.
    Korean abstract: As people's life expectancy increases, the amount spent on healthcare to improve quality of life is growing. However, expensive medical service costs inevitably place a great financial burden on individuals and households. To prevent this, many countries have introduced public health insurance systems so that people can receive medical services at a reasonable price. In general, the patient first receives the service and pays only part of the cost, and the insurer later reimburses the medical institution for the remaining amount. However, this structure is sometimes abused, giving rise to improper claims such as manipulating a patient's diseases or providing overtreatment. Such behavior is one of the main sources of financial loss in the healthcare system; to prevent it, insurers hire medical professionals who examine the medical legitimacy of each claim one by one. This review process, however, is very expensive and time-consuming. To make the review more efficient, studies have applied data mining techniques to detect problematic claims or providers with abnormal billing patterns. These studies, however, trained models on claim-level or provider-level variables derived from the data and failed to exploit the lowest-level data, the medical treatment records.
    This thesis proposes methodologies for detecting improper claims using medical treatment records, the lowest-level data in a claim. First, we propose a method for detecting providers with abnormal billing patterns; applied to real data, it produced a more efficient review than the existing method based on provider-level variables, and we also propose evaluation measures to quantify that efficiency. Second, we propose a method for detecting overtreatment in the presence of seasonality in claims, training and evaluating models per diagnosis-related group (DRG) rather than per medical department; on real data, the proposed method proved more robust to seasonality than the existing one. Third, we propose a method for detecting overtreatment when physicians show different treatment patterns for the same patient, based on network modeling of the relationship between a patient's diseases and treatments. Experiments show that the proposed method also classifies well treatment patterns that do not appear in the training data. From these studies, we confirm that using treatment records enables the detection of improper claims at the treatment, claim, and provider levels.
    Contents:
    Chapter 1 Introduction
    Chapter 2 Detection of Abusive Providers by Department with Neural Network
        2.1 Background
        2.2 Literature Review
            2.2.1 Abnormality Detection in Healthcare Insurance with Data Mining Techniques
            2.2.2 Feed-Forward Neural Network
        2.3 Proposed Method
            2.3.1 Calculating the Likelihood of Abuse for Each Treatment with a Deep Neural Network
            2.3.2 Calculating the Abuse Score of the Provider
        2.4 Experiments
            2.4.1 Data Description
            2.4.2 Experimental Settings
            2.4.3 Evaluation Measure (1): Relative Efficiency
            2.4.4 Evaluation Measure (2): Precision at k
        2.5 Results
            2.5.1 Results in the Test Set
            2.5.2 The Relationship among the Claimed Amount, the Abused Amount and the Abuse Score
            2.5.3 The Relationship between the Performance of the Treatment Scoring Model and Review Efficiency
            2.5.4 Treatment Scoring Model Results
            2.5.5 Post-deployment Performance
        2.6 Summary
    Chapter 3 Detection of Overtreatment by Diagnosis-related Group with Neural Network
        3.1 Background
        3.2 Literature Review
            3.2.1 Seasonality in Disease
            3.2.2 Diagnosis-related Group
        3.3 Proposed Method
            3.3.1 Training a Deep Neural Network Model for Treatment Classification
            3.3.2 Comparing the Performance of the DRG-based Model against the Department-based Model
        3.4 Experiments
            3.4.1 Data Description and Preprocessing
            3.4.2 Performance Measures
            3.4.3 Experimental Settings
        3.5 Results
            3.5.1 Overtreatment Detection
            3.5.2 Abnormal Claim Detection
        3.6 Summary
    Chapter 4 Detection of Overtreatment with Graph Embedding of Disease-Treatment Pairs
        4.1 Background
        4.2 Literature Review
            4.2.1 Graph Embedding Methods
            4.2.2 Application of Graph Embedding Methods to Biomedical Data Analysis
            4.2.3 Medical Concept Embedding Methods
        4.3 Proposed Method
            4.3.1 Network Construction
            4.3.2 Link Prediction between the Disease and the Treatment
            4.3.3 Overtreatment Detection
        4.4 Experiments
            4.4.1 Data Description
            4.4.2 Experimental Settings
        4.5 Results
            4.5.1 Network Construction
            4.5.2 Link Prediction between the Disease and the Treatment
            4.5.3 Overtreatment Detection
        4.6 Summary
    Chapter 5 Conclusion
        5.1 Contribution
        5.2 Future Work
    Bibliography
    Abstract in Korean
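    As a rough illustration of the treatment-level scoring idea in Chapter 2 (a small feed-forward network estimates an abuse likelihood for each treatment line, and provider scores aggregate those likelihoods), here is a minimal sketch. The features, labels, and the amount-weighted aggregation are stand-ins for illustration, not the thesis's actual variables or architecture.

```python
# Sketch: score each treatment line with a small neural network, then
# rank providers by an amount-weighted mean of treatment-level scores.
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 1000
treatments = pd.DataFrame({
    "provider": rng.integers(0, 20, n),
    "amount": rng.gamma(2.0, 50.0, n),   # claimed amount per treatment line
    "qty": rng.integers(1, 10, n),
    "abused": rng.integers(0, 2, n),     # expert-review label (synthetic here)
})

X = treatments[["amount", "qty"]].to_numpy()
net = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
net.fit(X, treatments["abused"])

treatments["p_abuse"] = net.predict_proba(X)[:, 1]  # per-treatment likelihood

# Provider abuse score: amount-weighted mean of treatment likelihoods,
# so expensive suspicious lines dominate the ranking sent for review.
scores = (treatments.assign(w=treatments["amount"] * treatments["p_abuse"])
          .groupby("provider")[["w", "amount"]].sum())
print((scores["w"] / scores["amount"]).sort_values(ascending=False).head())
```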

    Data-Driven Models, Techniques, and Design Principles for Combatting Healthcare Fraud

    In the U.S., approximately $700 billion of the $2.7 trillion spent on healthcare is linked to fraud, waste, and abuse. This presents a significant challenge for healthcare payers as they navigate fraudulent activities from dishonest practitioners, sophisticated criminal networks, and even well-intentioned providers who inadvertently submit incorrect billing for legitimate services. This thesis adopts Hevner's research methodology to guide the creation, assessment, and refinement of a healthcare fraud detection framework and recommended design principles for fraud detection. The thesis provides the following significant contributions to the field:
    1. A formal literature review of the field of fraud detection in Medicaid. Chapters 3 and 4 provide formal reviews of the available literature on healthcare fraud. Chapter 3 focuses on defining the types of fraud found in healthcare. Chapter 4 reviews fraud detection techniques in the literature across healthcare and other industries. Chapter 5 focuses on literature covering fraud detection methodologies utilized explicitly in healthcare.
    2. A multidimensional data model and analysis techniques for fraud detection in healthcare. Chapter 5 applies Hevner et al. to help develop a framework for fraud detection in Medicaid that provides specific data models and techniques to identify the most prevalent fraud schemes. A multidimensional schema based on Medicaid data and a set of multidimensional models and techniques to detect fraud are presented. These artifacts are evaluated through functional testing against known fraud schemes. This chapter contributes a set of multidimensional data models and analysis techniques that can be used to detect the most prevalent known fraud types.
    3. A framework for deploying outlier-based fraud detection methods in healthcare. Chapter 6 proposes and evaluates methods for applying outlier detection to healthcare fraud based on literature review, comparative research, direct application to healthcare claims data, and known fraudulent cases. A method for outlier-based fraud detection is presented and evaluated using Medicaid dental claims, providers, and patients.
    4. Design principles for fraud detection in complex systems. Based on literature and applied research in Medicaid healthcare fraud detection, Chapter 7 offers generalized design principles for fraud detection in similar complex, multi-stakeholder systems.
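    The outlier-based step of contribution 3 might be sketched as follows: aggregate claims to provider-level features and flag statistical outliers for manual review. IsolationForest is one plausible stand-in for whichever outlier methods the thesis actually evaluates; the feature names and data below are synthetic.

```python
# Sketch: provider-level outlier detection over aggregated claims data.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
claims = pd.DataFrame({
    "provider": rng.integers(0, 100, 5000),
    "paid": rng.gamma(2.0, 80.0, 5000),
    "procedure": rng.integers(0, 30, 5000),
})

# Provider-level features: volume, average payment, procedure diversity.
feats = claims.groupby("provider").agg(
    n_claims=("paid", "size"),
    avg_paid=("paid", "mean"),
    n_procedures=("procedure", "nunique"),
)

iso = IsolationForest(contamination=0.05, random_state=0).fit(feats)
feats["flagged"] = iso.predict(feats) == -1   # -1 marks outliers
print(feats[feats["flagged"]].head())         # candidates for manual review
```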

    Law, Technology and Patient Safety

    Medical error is the third leading cause of death in the United States. In an effort to increase patient safety, various regulatory agencies require reporting of adverse events, but reported counts tend to be inaccurate. In 2005, in an effort to reduce adverse event rates, Congress proposed a list of "never events": adverse events, such as wrong-site surgery, that should never occur in hospitals, and authorized CMS to refuse payment for care required following such events. CMS has since pushed for further regulation, "such as putting more payment at risk, increasing transparency, increasing frequency of quality data reviews, and stepping up media scrutiny." Evidence suggests these public reporting and pay-for-performance initiatives compel hospitals to manipulate reports, and in some cases patient treatment, to conceal adverse events. The purpose of this Essay is to consider how we might use law, coupled with technological advances, to increase adverse event count accuracy. On the technology front, three advances are particularly relevant. First, digitization of medical records, billing data, and other sources of germane information has made collecting large amounts of data easier than ever. Second, current adverse event counters employ powerful computer algorithms, and we are likely moving toward detecting adverse events through analysis of large datasets using artificial intelligence. Third, governmental entities have started to team up with computer scientists who use cryptographic techniques to collect sensitive data in ways that protect the anonymity of data producers. We explore how law might harness the power of these technological developments to increase adverse event count accuracy without creating incentives for providers to hide data or alter treatment practices in harmful or wasteful ways. This Essay is organized as follows. Part II describes current methods used by hospitals, CMS, and researchers to count adverse events. It also attempts to explain the wide disparities in counts produced by various counting methods. A close look at count disparities illuminates two problems with today's methods: first, the most reliable count estimates are not generalizable; second, evidence suggests that providers act to shroud true counts, sometimes in ways that put patients at risk. Part III suggests that recent technological advances might make it possible to use law to improve the accuracy of adverse event counts. In particular, we explore the law's possible annexing of three technological advances (digitized patient data, artificial intelligence, and cryptography) to assemble a state-of-the-art adverse events dataset that could make it possible for policy makers, in conjunction with providers, to take well-informed steps toward increasing patient safety. Part IV discusses a number of possible hurdles and concludes.

    SOME ISSUES CONCERNING THE ELEMENTS OF CONTROL FUNCTION OF MANAGEMENT

    In the field literature and in specific practice, the use of terms such as control, verification, evaluation, and audit, on one side, and, on the other, the definitions of the control function of management (respectively, the control-evaluation function) remain highly ambiguous. Considering these observations, the authors point out several useful aspects meant to clarify this issue. In order to highlight the complexity and integrality of this management function, an analysis of the elements composing a control system is undertaken. The parts are consistently related to the whole; therefore, those concerning evaluation and verification are related to the system providing the exercise of the control function. To eliminate ambiguity, for each of the concepts involved, meanings are proposed that are considered better delimited and oriented.
    Keywords: management, control, verification, evaluation, audit

    Big Data and the Internet of Things

    Advances in sensing and computing capabilities are making it possible to embed increasing computing power in small devices. This has enabled sensing devices not just to passively capture data at very high resolution but also to take sophisticated actions in response. Combined with advances in communication, this is resulting in an ecosystem of highly interconnected devices referred to as the Internet of Things (IoT). In conjunction, advances in machine learning have allowed building models on these ever-increasing amounts of data. Consequently, devices all the way from heavy assets such as aircraft engines to wearables such as health monitors can now not only generate massive amounts of data but can also draw on aggregate analytics to "improve" their performance over time. Big data analytics has been identified as a key enabler for the IoT. In this chapter, we discuss various avenues of the IoT where big data analytics either is already making a significant impact or is on the cusp of doing so. We also discuss social implications and areas of concern.
    Comment: 33 pages. Draft of an upcoming book chapter in Japkowicz and Stefanowski (eds.), Big Data Analysis: New Algorithms for a New Society, Springer Series on Studies in Big Data, to appear.