6 research outputs found

    Half a century of amyloids: past, present and future

    Get PDF
    Amyloid diseases are global epidemics with profound health, social and economic implications and yet remain without a cure. This dire situation calls for research into the origin and pathological manifestations of amyloidosis to stimulate continued development of new therapeutics. In basic science and engineering, the cross-ß architecture has been a constant thread underlying the structural characteristics of pathological and functional amyloids, and realizing that amyloid structures can be both pathological and functional in nature has fuelled innovations in artificial amyloids, whose use today ranges from water purification to 3D printing. At the conclusion of a half century since Eanes and Glenner's seminal study of amyloids in humans, this review commemorates the occasion by documenting the major milestones in amyloid research to date, from the perspectives of structural biology, biophysics, medicine, microbiology, engineering and nanotechnology. We also discuss new challenges and opportunities to drive this interdisciplinary field moving forward. This journal i

    Big Data Analytics and Engineering for Medicare Fraud Detection

    No full text
    The United States (U.S.) healthcare system produces an enormous volume of data with a vast number of financial transactions generated by physicians administering healthcare services. This makes healthcare fraud difficult to detect, especially when there are considerably less fraudulent transactions than non-fraudulent. Fraud is an extremely important issue for healthcare, as fraudulent activities within the U.S. healthcare system contribute to significant financial losses. In the U.S., the elderly population continues to rise, increasing the need for programs, such as Medicare, to help with associated medical expenses. Unfortunately, due to healthcare fraud, these programs are being adversely affected, draining resources and reducing the quality and accessibility of necessary healthcare services. In response, advanced data analytics have recently been explored to detect possible fraudulent activities. The Centers for Medicare and Medicaid Services (CMS) released several 'Big Data' Medicare claims datasets for different parts of their Medicare program to help facilitate this effort. In this dissertation, we employ three CMS Medicare Big Data datasets to evaluate the fraud detection performance available using advanced data analytics techniques, specifically machine learning. We use two distinct approaches, designated as anomaly detection and traditional fraud detection, where each have very distinct data processing and feature engineering. Anomaly detection experiments classify by provider specialty, determining whether outlier physicians within the same specialty signal fraudulent behavior. Traditional fraud detection refers to the experiments directly classifying physicians as fraudulent or non-fraudulent, leveraging machine learning algorithms to discriminate between classes. We present our novel data engineering approaches for both anomaly detection and traditional fraud detection including data processing, fraud mapping, and the creation of a combined dataset consisting of all three Medicare parts. We incorporate the List of Excluded Individuals and Entities database to identify real-world fraudulent physicians for model evaluation. Regarding features, the final datasets for anomaly detection contain only claim counts for every procedure a physician submits while traditional fraud detection incorporates aggregated counts and payment information, specialty, and gender. Additionally, we compare cross-validation to the real-world application of building a model on a training dataset and evaluating on a separate test dataset for severe class imbalance and rarity

    The effects of class rarity on the evaluation of supervised healthcare fraud detection models

    No full text
    Abstract The United States healthcare system produces an enormous volume of data with a vast number of financial transactions generated by physicians administering healthcare services. This makes healthcare fraud difficult to detect, especially when there are considerably less fraudulent transactions (documented and readily available) than non-fraudulent. The ability to successfully detect fraudulent activities in healthcare, given such discrepancies, can garner up to $350 billion in recovered monetary losses. In machine learning, when one class has a substantially larger number of instances (majority) compared to the other (minority), this is known as class imbalance. In this paper, we focus specifically on Medicare, utilizing three ‘Big Data’ Medicare claims datasets with real-world fraudulent physicians. We create a training and test dataset for all three Medicare parts, both separately and combined, to assess fraud detection performance. To emulate class rarity, which indicates particularly severe levels of class imbalance, we generate additional datasets, by removing fraud instances, to determine the effects of rarity on fraud detection performance. Before a machine learning model can be distributed for real-world use, a performance evaluation is necessary to determine the best configuration (e.g. learner, class sampling ratio) and whether the associated error rates are low, indicating good detection rates. With our research, we demonstrate the effects of severe class imbalance and rarity using a training and testing (Train_Test) evaluation method via a hold-out set, and provide our recommendations based on the supervised machine learning results. Additionally, we repeat the same experiments using Cross-Validation, and determine it is a viable substitute for Medicare fraud detection. For machine learning with the severe class imbalance datasets, we found that, as expected, fraud detection performance decreased as the fraudulent instances became more rare. We apply Random Undersampling to both Train_Test and Cross-Validation, for all original and generated datasets, in order to assess potential improvements in fraud detection by reducing the adverse effects of class imbalance and rarity. Overall, our results indicate that the Train_Test method significantly outperforms Cross-Validation

    Big Data fraud detection using multiple medicare data sources

    No full text
    Abstract In the United States, advances in technology and medical sciences continue to improve the general well-being of the population. With this continued progress, programs such as Medicare are needed to help manage the high costs associated with quality healthcare. Unfortunately, there are individuals who commit fraud for nefarious reasons and personal gain, limiting Medicare’s ability to effectively provide for the healthcare needs of the elderly and other qualifying people. To minimize fraudulent activities, the Centers for Medicare and Medicaid Services (CMS) released a number of “Big Data” datasets for different parts of the Medicare program. In this paper, we focus on the detection of Medicare fraud using the following CMS datasets: (1) Medicare Provider Utilization and Payment Data: Physician and Other Supplier (Part B), (2) Medicare Provider Utilization and Payment Data: Part D Prescriber (Part D), and (3) Medicare Provider Utilization and Payment Data: Referring Durable Medical Equipment, Prosthetics, Orthotics and Supplies (DMEPOS). Additionally, we create a fourth dataset which is a combination of the three primary datasets. We discuss data processing for all four datasets and the mapping of real-world provider fraud labels using the List of Excluded Individuals and Entities (LEIE) from the Office of the Inspector General. Our exploratory analysis on Medicare fraud detection involves building and assessing three learners on each dataset. Based on the Area under the Receiver Operating Characteristic (ROC) Curve performance metric, our results show that the Combined dataset with the Logistic Regression (LR) learner yielded the best overall score at 0.816, closely followed by the Part B dataset with LR at 0.805. Overall, the Combined and Part B datasets produced the best fraud detection performance with no statistical difference between these datasets, over all the learners. Therefore, based on our results and the assumption that there is no way to know within which part of Medicare a physician will commit fraud, we suggest using the Combined dataset for detecting fraudulent behavior when a physician has submitted payments through any or all Medicare parts evaluated in our study
    corecore