3 research outputs found

    Using Feature Selection with Machine Learning for Generation of Insurance Insights

    Insurance is a data-rich sector, holding large volumes of customer data that are analysed to evaluate risk. Machine learning techniques are increasingly used in the effective management of insurance risk. By their nature, however, insurance datasets are often of poor quality, with noisy subsets of data (or features). Choosing the right features is a significant pre-processing step in the creation of machine learning models, and the inclusion of irrelevant and redundant features has been demonstrated to adversely affect the performance of learning models. In this article, we propose a framework for improving predictive machine learning techniques in the insurance sector via the selection of relevant features. The experimental results, based on five publicly available real insurance datasets, show the importance of applying feature selection to remove noisy features before performing machine learning, so that the algorithm can focus on influential features. An additional business benefit is the revelation of the most and least important features in the datasets. These insights can prove useful for decision making and strategy development in areas and business problems beyond the direct target of the downstream algorithms. In our experiments, machine learning techniques trained on the features chosen by feature selection algorithms outperformed those trained on the full feature set. Specifically, subsets containing 20% and 50% of the features in our five datasets improved downstream clustering and classification performance compared with the whole datasets. This indicates the potential for feature selection in the insurance sector both to improve model performance and to highlight influential features for business insights.
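To make the idea of keeping only a fraction of the features concrete, the following is a minimal sketch of a filter-style selector. It is an illustration under stated assumptions, not the framework from the article: the abstract does not name the feature selection algorithms used, so ranking by absolute Pearson correlation with the target is swapped in here, and the names `pearson` and `select_top_fraction` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_top_fraction(rows, target, fraction):
    """Return the column indices of the top `fraction` of features.

    Assumption: features are scored by |Pearson correlation| with the target;
    the article may use different scoring methods.
    """
    n_features = len(rows[0])
    scores = [
        abs(pearson([row[j] for row in rows], target))
        for j in range(n_features)
    ]
    # Keep at least one feature, rounding the requested fraction up.
    k = max(1, math.ceil(fraction * n_features))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (invented for illustration): four features, one of them constant.
rows = [
    [1.0, 5.0, 0.0, 2.0],
    [2.0, 3.0, 0.0, 1.0],
    [3.0, 6.0, 0.0, 4.0],
    [4.0, 2.0, 0.0, 3.0],
]
target = [1.0, 2.0, 3.0, 4.0]
selected = select_top_fraction(rows, target, 0.5)
```

A downstream classifier or clustering method would then be trained only on the columns in `selected`, which is where the comparison against the full feature set would take place.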

    Supporting Telecommunication Alarm Management System with Trouble Ticket Prediction

    Fault alarm data emanating from heterogeneous telecommunication network services and infrastructures are exploding as networks expand. Managing and tracking the alarms with Trouble Tickets using manual or expert rule-based methods has become challenging due to the increasing complexity of Alarm Management Systems and the demand for highly trained experts. As the size and complexity of networks grow immensely, identifying semantically identical alarms, generated by heterogeneous network elements from diverse vendors, with data-driven methodologies has become imperative to enhance efficiency. In this paper, data-driven Trouble Ticket prediction models are proposed to support Alarm Management Systems. To improve performance, feature extraction from related historical alarm streams, using a sliding time window and feature engineering, is also introduced. The models were trained and validated with a dataset provided by the largest telecommunication provider in Italy. The experimental results showed the promising efficacy of the proposed approach in suppressing false positive alarms with Trouble Ticket prediction.
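As a rough illustration of sliding time-window feature extraction over an alarm stream, the sketch below counts alarms per type in each window. The abstract does not describe the actual features, window length, or data schema, so everything here is an assumption: alarms are modelled as `(timestamp_seconds, alarm_type)` pairs, and the function name `sliding_window_counts` and the alarm type labels are hypothetical.

```python
from collections import Counter


def sliding_window_counts(alarms, window, step):
    """Count alarm types inside each time window.

    alarms: list of (timestamp_seconds, alarm_type) pairs (assumed schema).
    window: window length in seconds; step: offset between window starts.
    Returns a list of (window_start, Counter) pairs, where each Counter holds
    the alarm types seen in the half-open interval [start, start + window).
    """
    if not alarms:
        return []
    start = min(ts for ts, _ in alarms)
    end = max(ts for ts, _ in alarms)
    features = []
    while start <= end:
        counts = Counter(
            kind for ts, kind in alarms if start <= ts < start + window
        )
        features.append((start, counts))
        start += step
    return features


# Toy alarm stream (invented for illustration).
alarms = [
    (0, "link_down"),
    (30, "link_down"),
    (70, "power"),
    (130, "link_down"),
]
feats = sliding_window_counts(alarms, window=60, step=60)
```

Each `(window_start, Counter)` pair could be turned into one feature row for a Trouble Ticket classifier; choosing `step` smaller than `window` gives overlapping windows.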