Medicaid is the largest health insurance program in the U.S. It provides health coverage to over 68
million individuals, costs the nation over $600 billion a year, and is subject to improper payments
(fraud, waste, and abuse) or inaccurate payments (claims processed erroneously). Medicaid
programs partially use Fee-For-Service (FFS) to provide coverage to beneficiaries by
adjudicating claims, and they leverage traditional inferential statistics to verify the quality of
adjudicated claims. These quality methods only provide an interval estimate of the quality errors
and are incapable of detecting most claim adjudication errors, potentially incurring millions of
dollars in opportunity costs. This dissertation studied a method of applying supervised learning to
detect erroneous payments in the entire population of adjudicated claims in each Medicaid Management
Information System (MMIS), focusing on two specific claim types: inpatient and outpatient. A
synthesized source of adjudicated claims generated by the Centers for Medicare & Medicaid
Services (CMS) was used to create the original dataset. Quality reports from California FFS
Medicaid were used to extract the underlying statistical pattern of claim adjudication errors in
each Medicaid FFS program and to label the data, utilizing goodness-of-fit and Anderson-Darling tests.
Principal Component Analysis (PCA) and business knowledge were applied for dimensionality
reduction, resulting in the selection of sixteen (16) features for the outpatient and nineteen (19)
features for the inpatient claims models. Ten (10) supervised learning algorithms were trained
and tested on the labeled data: Decision Tree with two configurations (Entropy and Gini),
Random Forest with two configurations (Entropy and Gini), Naïve Bayes, K-Nearest Neighbors,
Logistic Regression, Neural Network, Discriminant Analysis, and Gradient Boosting. Five-fold
cross-validation and event-based sampling were applied during the training process (with
oversampling using the SMOTE method and stratification within oversampling). The prediction
power (Gini importance) of the selected features was measured using the Mean Decrease in
Impurity (MDI) method across three algorithms. A one-way ANOVA and Tukey and Fisher LSD
pairwise comparisons were conducted. Results show that the Claim Payment Amount
significantly outperforms the other features in prediction power (highest mean F-value for Gini
importance at the α = 0.05 significance level) for both claim types. Finally, all algorithms' recall and
F1-score were measured for both claim types (inpatient and outpatient), both with and without
oversampling. A one-way ANOVA and Tukey and Fisher LSD pairwise comparisons were
conducted. The results show a statistically significant difference in the algorithms' performance
in detecting quality issues in the outpatient and inpatient claims. Gradient Boosting and Decision
Tree (with various configurations and sampling strategies) outperform the rest of the algorithms
in recall and F1-score on both datasets. Logistic Regression shows better recall on the
outpatient than on the inpatient data, and Naïve Bayes performs considerably better in recall and
F1-score on outpatient data. Medicaid FFS programs and consultants, Medicaid administrators, and
researchers could use this study to develop machine learning models to detect quality issues in
the Medicaid FFS claim datasets at scale, potentially saving millions of dollars.
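
For readers less familiar with the two comparison metrics named above, the following is a minimal Python sketch of how recall and F1-score can be computed from true and predicted labels. It is an illustration only, not the dissertation's code: the function name recall_and_f1, the label encoding (1 = erroneously adjudicated claim, 0 = correctly adjudicated claim), and the sample labels are all hypothetical.

```python
def recall_and_f1(y_true, y_pred, positive=1):
    """Return (recall, f1) for the positive class of two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the positive (erroneous-claim) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Recall: share of truly erroneous claims that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged claims that were truly erroneous.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical labels (assumption): 1 = erroneous claim, 0 = correct claim.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
recall, f1 = recall_and_f1(y_true, y_pred)
```

Recall matters in this setting because a missed erroneous claim (a false negative) is an undetected payment error, while F1 balances that against the cost of reviewing claims that were flagged incorrectly.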