832 research outputs found

    Predicting automobile insurance fraud using classical and machine learning models

    Fraudulent insurance claims have become a major problem in the insurance industry, causing the loss of billions of dollars, and several investigations have been carried out to eliminate their negative impact. In this paper, a comparative study was carried out to assess the performance of various classification models, namely logistic regression, neural network (NN), support vector machine (SVM), tree augmented naïve Bayes (NB), decision tree (DT), random forest (RF) and AdaBoost, with different model settings, for predicting fraudulent automobile insurance claims. Results reveal that the tree augmented NB outperformed the other models on several performance metrics, with accuracy (79.35%), sensitivity (44.70%), misclassification rate (20.65%), area under curve (0.81) and Gini (0.62). In addition, the results show that the AdaBoost algorithm can improve the classification performance of the decision tree. These findings are useful for insurance professionals in identifying potentially fraudulent insurance claims.
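
The metrics quoted in this abstract are linked by two standard identities: the misclassification rate is 1 minus the accuracy, and the Gini coefficient is 2 × AUC − 1. The sketch below illustrates them in Python. The function names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive summary metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives (illustrative inputs only).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        # Sensitivity (recall): share of actual fraud cases that were caught.
        "sensitivity": tp / (tp + fn),
        # Misclassification rate is the complement of accuracy.
        "misclassification_rate": 1.0 - accuracy,
    }


def gini_from_auc(auc):
    """Gini coefficient implied by an area-under-ROC-curve value."""
    return 2.0 * auc - 1.0


# Hypothetical counts, chosen only to exercise the functions.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
gini = gini_from_auc(0.81)
```

With the abstract's own figures, an accuracy of 79.35% implies the reported misclassification rate of 20.65%, and an AUC of 0.81 implies the reported Gini of 0.62.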

    A Multi-view Context-aware Approach to Android Malware Detection and Malicious Code Localization

    Existing Android malware detection approaches use a variety of features such as security-sensitive APIs, system calls, control-flow structures and information flows in conjunction with Machine Learning classifiers to achieve accurate detection. Each of these feature sets provides a unique semantic perspective (or view) of apps' behaviours, with inherent strengths and limitations. That is, some views are better suited to detecting certain attacks but may not be suitable for characterising several others. Most of the existing malware detection approaches use only one (or a selected few) of the aforementioned feature sets, which prevents them from detecting a vast majority of attacks. Addressing this limitation, we propose MKLDroid, a unified framework that systematically integrates multiple views of apps for performing comprehensive malware detection and malicious code localisation. The rationale is that, while a malware app can disguise itself in some views, disguising itself in every view while maintaining malicious intent will be much harder. MKLDroid uses a graph kernel to capture structural and contextual information from apps' dependency graphs and identify malicious code patterns in each view. Subsequently, it employs Multiple Kernel Learning (MKL) to find a weighted combination of the views which yields the best detection accuracy. Besides multi-view learning, MKLDroid's unique and salient trait is its ability to locate fine-grained malicious code portions in dependency graphs (e.g., methods/classes). Through our large-scale experiments on several datasets (incl. wild apps), we demonstrate that MKLDroid consistently outperforms three state-of-the-art techniques in terms of accuracy while maintaining comparable efficiency. In our malicious code localisation experiments on a dataset of repackaged malware, MKLDroid was able to identify all the malicious classes with 94% average recall.
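
The "weighted combination of the views" can be pictured as a weighted sum of one kernel (Gram) matrix per view. The Python sketch below shows only that combination step, with fixed weights supplied by the caller. In MKL the weights are learned, and nothing here reproduces MKLDroid's graph kernel or its optimisation. The function name and the toy matrices are hypothetical.

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square Gram matrices.

    kernels: list of n x n matrices (lists of lists), one per view.
    weights: non-negative numbers, one per view; normalised to sum to 1.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")

    normalised = [w / total for w in weights]
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, normalised):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


# Two toy views over two apps (values are illustrative only).
api_view = [[1.0, 0.0], [0.0, 1.0]]
flow_view = [[1.0, 1.0], [1.0, 1.0]]
combined = combine_kernels([api_view, flow_view], weights=[1, 3])
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so the combined matrix can be passed to any kernel classifier.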

    Light Auditor: Power Measurement Can Tell Private Data Leakage Through IoT Covert Channels

    Despite the many conveniences of IoT devices, they have suffered from various attacks due to their weak security. Besides well-known botnet attacks, IoT devices are vulnerable to recent covert-channel attacks. However, no study to date has considered these IoT covert-channel attacks. Among these attacks, researchers have demonstrated exfiltrating users' private data by exploiting the smart bulb's capability of infrared emission. In this paper, we propose a power-auditing-based system that defends against the data exfiltration attack on the smart bulb as a case study. We first implement this infrared-based attack in a lab environment. With a newly collected power consumption dataset, we pre-process the data and transform them into two-dimensional images through Continuous Wavelet Transformation (CWT). Next, we design a two-dimensional convolutional neural network (2D-CNN) model to identify the CWT images generated by malicious behavior. Our experimental results show that the proposed design is efficient in identifying infrared-based anomalies: 1) With far fewer parameters than transfer-learning classifiers, it achieves an accuracy of 88% in identifying the attacks, including unseen patterns. Its accuracy is comparable to that of sophisticated transfer-learning CNNs such as AlexNet and GoogLeNet; 2) We validate that our system can classify the CWT images in real time.

    Complex networks in audit: A data-driven modelling approach

    In this thesis, we introduce data-driven audit methods using a network-based approach. Utilizing data from over 300 companies, the approach transforms transaction data into a network format, providing auditors with a clear overview of a company's financial structure. Chapter 2 details the financial statements network, designed for straightforward interpretation by auditors. This network effectively represents the company's financial structure, aiding in developing universal data-driven audit methods. Chapter 3's analysis reveals that the degree distribution of the financial account nodes typically follows a heavy-tailed distribution. Moreover, we found only minor variations in network statistics across industries. These findings help establish baseline expectations for network statistics, facilitating risk assessment. Chapter 4 addresses the complexity of these networks, proposing a method to simplify them into a more understandable high-level structure for auditors. Chapter 5 explores a similarity measure to compare financial structures, helping auditors identify deviations in a client's financial network compared to peers or historical data. Deviations could signal increased audit risks. In summary, we pioneer data-driven audit methods using financial statement networks, providing new insights and tools for auditors and paving the way for more efficient and effective audit processes.
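
To make "degree distribution" concrete, the Python sketch below counts how many nodes have each degree in a small edge list. The edge list, the node names and the idea of linking accounts to business processes are assumptions made for illustration; the abstract does not specify how the thesis constructs its network.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(node_degrees(edges).values())


# Hypothetical network: financial accounts linked to business processes.
edges = [
    ("Revenue", "sales_process"),
    ("Receivables", "sales_process"),
    ("Receivables", "collection_process"),
    ("Cash", "collection_process"),
]
distribution = degree_distribution(edges)
```

In a heavy-tailed distribution, most nodes have a low degree while a few hub nodes have a very high one, which is why a baseline for these counts is useful when flagging unusual structures.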

    Extending Capability and Implementing a Web Interface for the XALT Software Monitoring Tool

    As high performance computing centers evolve in terms of hardware, software, and user base, monitoring and managing such systems requires specialized tools. The tool discussed in this thesis is XALT, which is a collaborative effort between the National Institute for Computational Sciences and the Texas Advanced Computing Center. XALT is designed to track link-time and job-level information for applications that are compiled and executed on any Linux cluster, workstation, or high-end supercomputer. The key objectives of this work are to extend the existing functionality of XALT and implement a real-time web portal to easily visualize the tracked data. A prototype is developed to track function calls resolved by external libraries, which helps software management. The web portal generates reports and metrics which would improve efficiency and effectiveness for an extensive community of stakeholders including users, support organizations, and development teams. In addition, we discuss use cases of interest to center support staff and researchers on identifying users based on given counters and generating provenance reports. This work details the opportunities and challenges in pushing XALT further towards becoming a complete package.

    Machine Learning Solution to Organ-At-Risk Segmentation for Radiation Treatment Planning

    In the treatment of cancer using ionizing radiation, it is important to design a treatment plan such that the dose to normal, healthy organs is sufficiently low. Today, segmentation requires a trained human to carefully outline, or segment, organs on each slice of a treatment planning computed tomography (CT) scan, a process that is laborious and time-consuming and is subject to intra- and inter-rater variability. Existing clinical automation technology relies on atlas-based automation, which has limited segmentation accuracy, so the auto-segmentations require post-process editing by an expert. In this paper, we propose a machine learning solution that shortens the segmentation time of organs-at-risk (OARs) in the thoracic cavity. The overall system will include preprocessing, model processing, and postprocessing steps to make the system easy to integrate into the radiotherapy planning process. For our model, we chose to use a 3D deep convolutional neural network with a U-net based architecture because this machine learning strategy takes into account local spatial relationships, restores the original image resolution and has been widely used in image segmentation, especially in medical image analysis. Training and testing were done with a 60-patient dataset of thoracic CT scans from the AAPM 2017 Grand Challenge. To assess and improve our system we calculated accuracy metrics (Dice similarity coefficient (DSC), mean surface distance (MSD)) and compared our model’s segmentation performance to that of an expert and the top two performing machine learning methods of the challenge. We explored using preprocessing steps such as cropping and image enhancement to improve the model segmentation accuracy. Our final model was able to segment the lungs as accurately as a dosimetrist and the heart and spinal cord within acceptable DSC ranges. All DSC values of the OARs from our method were as accurate as other machine learning methods. The DSC for the esophagus was below tolerable error for radiotherapy planning, but our mean surface distance was superior to other auto-segmentation methods. We were successful in significantly reducing manual segmentation time by developing a machine learning system. Though our approach still necessitates a single preparatory step of manually cropping anatomical regions to isolate the segmentation volume, a general hospital technician could complete this task, which removes the need for an expert in one time-consuming step of radiotherapy planning. Implementation of our methods to provide radiotherapy in lower-middle income countries brings us closer to accessibility of treatment for a wider population.
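
The Dice similarity coefficient used as an accuracy metric in this abstract is defined for two binary masks A and B as DSC = 2|A ∩ B| / (|A| + |B|). The Python sketch below computes it for flattened masks. The function name, the toy masks and the choice to return 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two flattened binary masks.

    predicted, reference: equal-length sequences of 0/1 (or falsy/truthy)
    voxel labels. Returns a value between 0.0 (no overlap) and 1.0
    (identical masks).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")

    predicted_size = sum(1 for p in predicted if p)
    reference_size = sum(1 for r in reference if r)
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)

    if predicted_size + reference_size == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * overlap / (predicted_size + reference_size)


# Toy 4-voxel masks (illustrative only): one voxel of overlap.
model_mask = [1, 1, 0, 0]
expert_mask = [1, 0, 0, 0]
score = dice_coefficient(model_mask, expert_mask)
```

Because the score normalises overlap by the combined mask size, small structures such as the esophagus are penalised heavily for even a few mislabelled voxels, which is one reason surface-distance metrics are reported alongside it.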