7 research outputs found

    Fine-Tuning a k-Nearest Neighbors Machine Learning Model for the Detection of Insurance Fraud

    Insurance companies lose billions of dollars to fraud. Large monetary losses force insurers to increase premiums and/or restrict policies, which negatively affects a company's loyal customers. Although the problem is prevalent, companies are not urgently working to improve their machine learning algorithms. Underskilled workers paired with inefficient computer algorithms make it difficult to detect fraud accurately and reliably. The goal of this study is to understand the idea of k-Nearest Neighbors (k-NN) and to use this classification technique to accurately detect fraudulent auto insurance claims. Using k-NN requires choosing a k value and a distance metric. The best choice of k value and distance metric will be unique to every dataset. This study aims to break down the processes involved in determining an accurate k value and distance metric for a sample auto insurance claims dataset. Odd k values 1 through 19 and the Euclidean, Manhattan, Chebyshev, and Hassanat metrics are analyzed using Excel and R. Results support the idea that unique k values and distance metrics are needed depending on the dataset being worked with. Keywords: machine learning, insurance, fraud, detection, k-NN, distanc
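
The search over k values and distance metrics described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the study's Excel and R workflow: the toy dataset `TOY_CLAIMS`, the helper names, and the use of leave-one-out accuracy as the score are all assumptions made for the example. The Hassanat distance is written in its commonly cited per-dimension form.

```python
from collections import Counter


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    # Per-dimension bounded distance, summed over dimensions.
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1 - (1 + lo) / (1 + hi)
        else:
            # Shift both values by |lo| so the ratio stays well defined.
            total += 1 - (1 + lo + abs(lo)) / (1 + hi + abs(lo))
    return total


METRICS = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "chebyshev": chebyshev,
    "hassanat": hassanat,
}


def knn_predict(train, query, k, metric):
    # Majority vote among the k training rows closest to the query.
    neighbors = sorted(train, key=lambda row: metric(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k, metric):
    # Leave-one-out: classify each row using all the other rows.
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k, metric) == label
    return hits / len(data)


def grid_search(data, ks=range(1, 20, 2)):
    # Score every (metric, k) pair; odd k from 1 through 19 by default.
    return {
        (name, k): loo_accuracy(data, k, fn)
        for name, fn in METRICS.items()
        for k in ks
    }


# Hypothetical two-feature data (label 1 = fraudulent), made up for this sketch.
TOY_CLAIMS = (
    [((float(i), float(i % 3)), 0) for i in range(12)]
    + [((float(i + 20), float(i % 3)), 1) for i in range(12)]
)

if __name__ == "__main__":
    results = grid_search(TOY_CLAIMS)
    best = max(results, key=results.get)
    print("best (metric, k):", best, "accuracy:", results[best])
```

On a real claims dataset the features would normally be scaled first, since all four metrics are sensitive to the units of each column.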

    A Calibrated Data-Driven Approach for Small Area Estimation using Big Data

    Where the response variable in a big data set is consistent with the variable of interest for small area estimation, the big data by itself can provide the estimates for small areas. These estimates are often subject to the coverage and measurement error bias inherited from the big data. However, if a probability survey of the same variable of interest is available, the survey data can be used as a training data set to develop an algorithm to impute for the data missed by the big data and adjust for measurement errors. In this paper, we outline a methodology for such imputations based on a kNN algorithm calibrated to an asymptotically design-unbiased estimate of the national total, and illustrate the use of a training data set to estimate the imputation bias and the fixed-k asymptotic bootstrap to estimate the variance of the small area hybrid estimator. We illustrate the methodology using a public use data set and use it to compare the accuracy and precision of our hybrid estimator with the Fay-Herriot (FH) estimator. Finally, we also examine numerically the accuracy and precision of the FH estimator when the auxiliary variables used in the linking models are subject to under-coverage errors. Comment: 26 pages, 2 figures, 2 tables and 2 appendices

    On Identifying Terrorists Using Their Victory Signs

    In certain cases, the only evidence available to identify terrorists seen in digital images or videos is the shape of their hands, particularly the victory sign that many of them make while intentionally hiding their faces and/or distorting their voices. This paper proposes, for the first time, methods to identify such persons from their victory sign. These methods are based on features extracted from the finger areas using shape moments, in addition to other features related to finger contours. To evaluate the proposed methods and to show the feasibility of this study, we have created a victory sign database of 400 volunteers using a mobile phone camera. Experiments with different classifiers show encouraging identification results: the best precision and recall were achieved by merging normalized features from both methods using a linear discriminant analysis classifier, with 96.6% precision and 96.3% recall. Such high performance shows the great potential of the proposed methods for identifying terrorists from their victory sign.