155 research outputs found

    Show Me Your Claims and I'll Tell You Your Offenses: Machine Learning-Based Decision Support for Fraud Detection on Medical Claim Data

    Health insurance claim fraud is a serious issue for the healthcare industry as it drives up costs and inefficiency. Therefore, claim fraud must be effectively detected to provide economical and high-quality healthcare. In practice, however, fraud detection is mainly performed by domain experts, resulting in significant cost and resource consumption. This paper presents a novel Convolutional Neural Network-based fraud detection approach that was developed, implemented, and evaluated on Medicare Part B records. The model aids manual fraud detection by classifying potential types of fraud, which can then be specifically analyzed. Our model is the first of its kind for Medicare data, yields an AUC of 0.7 for selected fraud types, and provides an applicable method for medical claim fraud detection.

    Exploration of Data Science Toolbox and Predictive Models to Detect and Prevent Medicare Fraud, Waste, and Abuse

    The Federal Department of Health and Human Services spends approximately $830 billion annually on Medicare, of which an estimated $30 to $110 billion is some form of fraud, waste, or abuse (FWA). Despite the Federal Government’s ongoing auditing efforts, fraud, waste, and abuse is rampant and requires modern machine learning approaches to generalize and detect such patterns. Novel machine learning algorithms offer hope for detecting fraud, waste, and abuse. The publicly accessible datasets compiled by the Centers for Medicare & Medicaid Services (CMS) contain vast quantities of structured data. This data, coupled with industry-standardized billing codes, provides many opportunities for the application of machine learning to fraud, waste, and abuse detection. This research aims to develop a new model utilizing machine learning to generalize the patterns of fraud, waste, and abuse in Medicare. This task is accomplished by linking provider and payment data with the list of excluded individuals and entities to train an Isolation Forest algorithm on previously fraudulent behavior. Results indicate anomalous instances occurring in 0.2% of all analyzed claims, demonstrating machine learning models’ predictive ability to detect FWA.

    Data-Driven Models, Techniques, and Design Principles for Combatting Healthcare Fraud

    In the U.S., approximately $700 billion of the $2.7 trillion spent on healthcare is linked to fraud, waste, and abuse. This presents a significant challenge for healthcare payers as they navigate fraudulent activities from dishonest practitioners, sophisticated criminal networks, and even well-intentioned providers who inadvertently submit incorrect billing for legitimate services. This thesis adopts Hevner’s research methodology to guide the creation, assessment, and refinement of a healthcare fraud detection framework and recommended design principles for fraud detection. The thesis provides the following significant contributions to the field:
    1. A formal literature review of the field of fraud detection in Medicaid. Chapters 3 and 4 provide formal reviews of the available literature on healthcare fraud. Chapter 3 focuses on defining the types of fraud found in healthcare. Chapter 4 reviews fraud detection techniques in literature across healthcare and other industries. Chapter 5 focuses on literature covering fraud detection methodologies utilized explicitly in healthcare.
    2. A multidimensional data model and analysis techniques for fraud detection in healthcare. Chapter 5 applies Hevner et al. to help develop a framework for fraud detection in Medicaid that provides specific data models and techniques to identify the most prevalent fraud schemes. A multidimensional schema based on Medicaid data and a set of multidimensional models and techniques to detect fraud are presented. These artifacts are evaluated through functional testing against known fraud schemes. This chapter contributes a set of multidimensional data models and analysis techniques that can be used to detect the most prevalent known fraud types.
    3. A framework for deploying outlier-based fraud detection methods in healthcare. Chapter 6 proposes and evaluates methods for applying outlier detection to healthcare fraud based on literature review, comparative research, direct application on healthcare claims data, and known fraudulent cases. A method for outlier-based fraud detection is presented and evaluated using Medicaid dental claims, providers, and patients.
    4. Design principles for fraud detection in complex systems. Based on literature and applied research in Medicaid healthcare fraud detection, Chapter 7 offers generalized design principles for fraud detection in similar complex, multi-stakeholder systems.

    MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry

    Fraud in healthcare insurance claims is one of the significant research challenges that affect the growth of healthcare services. Healthcare fraud is committed by subscribers, companies, and providers. The decision support system is developed to automate the processing of claim data from service providers and to offset the patient’s challenges. This paper presents a novel hybridized big data and statistical machine learning technique, named MapReduce-based iterative support vector machine (MR-ISVM), that provides a set of sophisticated steps for the automatic detection of fraudulent claims in health insurance databases. The experimental results show that the MR-ISVM classifier outperforms other support vector machine (SVM) kernel classifiers in classification and detection. The results also show reduced computational time for processing healthcare insurance claims without compromising classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear and 79.98% for the radial basis function kernel.

    Unsupervised learning for anomaly detection in Australian medical payment data

    Fraudulent or wasteful medical insurance claims made by health care providers are costly for insurers. Typically, OECD healthcare organisations lose 3-8% of total expenditure due to fraud. As Australia’s universal public health insurer, Medicare Australia, spends approximately A$34 billion per annum on the Medicare Benefits Schedule (MBS) and Pharmaceutical Benefits Scheme, wasted spending of A$1–2.7 billion could be expected. However, fewer than 1% of claims to Medicare Australia are detected as fraudulent, below international benchmarks. Variation is common in medicine, and health conditions, along with their presentation and treatment, are heterogenous by nature. Increasing volumes of data and rapidly changing patterns bring challenges which require novel solutions. Machine learning and data mining are becoming commonplace in this field, but no gold standard is yet available. In this project, requirements are developed for real-world application to compliance analytics at the Australian Government Department of Health and Aged Care (DoH), covering: unsupervised learning; problem generalisation; human interpretability; context discovery; and cost prediction. Three novel methods are presented which rank providers by potentially recoverable costs. These methods use association analysis, topic modelling, and sequential pattern mining to provide interpretable, expert-editable models of typical provider claims. Anomalous providers are identified through comparison to the typical models, using metrics based on costs of excess or upgraded services. Domain knowledge is incorporated in a machine-friendly way in two of the methods through the use of the MBS as an ontology. Validation by subject-matter experts and comparison to existing techniques shows that the methods perform well. The methods are implemented in a software framework which enables rapid prototyping and quality assurance. The code is implemented at the DoH, and further applications as decision-support systems are in progress. The developed requirements will apply to future work in this field.

    A model for the automated detection of fraudulent healthcare claims using data mining methods

    The menace of fraud today cannot be underestimated. The healthcare system put in place to facilitate rendering medical services as well as improving access to medical services has not been an exception to fraudulent activities. Traditional healthcare claims fraud detection methods no longer suffice due to the increased complexity in the medical billing process. Machine learning has become a very important technique in the computing world today. The abundance of computing power has aided the adoption of machine learning by different problem domains including healthcare claims fraud detection. The study explores the application of different machine learning methods in the process of detecting possible fraudulent healthcare claims. We propose a data mining model that incorporates several knowledge discovery processes in the pipeline. The model makes use of the Medicare payment data from the Centre for Medicare and Medicaid Services as well as data from the List of Excluded Individual or Entities (LEIE) database. The data is passed through the data pre-processing and transformation stages to get it to a desirable state. Once the data is in the desired state, we apply several machine learning methods to derive knowledge as well as classify the data into fraudulent and non-fraudulent claims. The results derived from the comprehensive benchmark used on the implemented version of the model have shown that machine learning methods can be used to detect possible fraudulent healthcare claims. The models based on the Gradient Boosted Tree Classifier and Artificial Neural Network performed best while the Naïve Bayes model couldn’t classify the data. By applying the correct pre-processing method as well as data transformation methods to the Medicare data, along with the appropriate machine learning methods, the healthcare fraud detection system yields nominal results for identification of possible fraudulent claims in the medical billing process. M.Sc. (Computer Science)

    A TAXONOMY OF MACHINE LEARNING-BASED FRAUD DETECTION SYSTEMS

    As fundamental changes in information systems drive digitalization, the heavy reliance on computers today significantly increases the risk of fraud. Existing literature promotes machine learning as a potential solution approach for the problem of fraud detection as it is able to detect patterns in large datasets efficiently. However, there is a lack of clarity and awareness on which components and functionalities of machine learning-based fraud detection systems exist and how these systems can be classified consistently. We draw on 54 identified relevant machine learning-based fraud detection systems to address this research gap and develop a taxonomic scheme. By deriving three archetypes of machine learning-based fraud detection systems, the taxonomy paves the way for research and practice to understand and advance fraud detection knowledge to combat fraud and abuse.

    Holding down health care costs: a guide for the financial executive


    A study assessing the characteristics of big data environments that predict high research impact: application of qualitative and quantitative methods

    BACKGROUND: Big data offers new opportunities to enhance healthcare practice. While researchers have shown increasing interest in using it, little is known about what drives research impact. We explored predictors of research impact across three major sources of healthcare big data derived from the government and the private sector.
    METHODS: This study was based on a mixed methods approach. Using quantitative analysis, we first clustered peer-reviewed original research that used data from government sources derived through the Veterans Health Administration (VHA), and private sources of data from IBM MarketScan and Optum, using social network analysis. We analyzed a battery of research impact measures as a function of the data sources. Other main predictors were topic clusters and authors’ social influence. Additionally, we conducted key informant interviews (KII) with a purposive sample of high impact researchers who have knowledge of the data. We then compiled findings of KIIs into two case studies to provide a rich understanding of drivers of research impact.
    RESULTS: Analysis of 1,907 peer-reviewed publications using VHA, IBM MarketScan and Optum found that the overall research enterprise was highly dynamic and growing over time. With less than 4 years of observation, research productivity, use of machine learning (ML), natural language processing (NLP), and the Journal Impact Factor showed substantial growth. Studies that used ML and NLP, however, showed limited visibility. After adjustments, VHA studies had generally higher impact (10% and 27% higher annualized Google citation rates) compared to MarketScan and Optum (p<0.001 for both). Analysis of co-authorship networks showed that no single social actor, either a community of scientists or institutions, was dominating. Other key opportunities to achieve high impact based on KIIs include methodological innovations, under-studied populations and predictive modeling based on rich clinical data.
    CONCLUSIONS: Big data for purposes of research analytics has grown within the three data sources studied between 2013 and 2016. Despite important challenges, the research community is reacting favorably to the opportunities offered both by big data and advanced analytic methods. Big data may be a logical and cost-efficient choice to emulate research initiatives where RCTs are not possible.