A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs
Representing the reservoir as a network of discrete compartments with
neighbor and non-neighbor connections is a fast, yet accurate method for
analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale
compartments with distinct static and dynamic properties is an integral part of
such high-level reservoir analysis. In this work, we present a hybrid framework
specific to reservoir analysis that automatically detects clusters in space
using spatial and temporal field data. The framework couples these data-driven
clustering techniques with a physics-based non-local modeling approach to
provide fast and accurate multiscale modeling of compartmentalized reservoirs.
This research also adds to the literature by presenting a comprehensive study
of spatio-temporal clustering for reservoir applications that accounts for the
clustering complexities, the intrinsically sparse and noisy nature of the data,
and the interpretability of the outcome.
Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal
Clustering; Physics-Based Data-Driven Formulation; Multiscale Modeling
Auto Insurance Business Analytics Approach for Customer Segmentation Using Multiple Mixed-Type Data Clustering Algorithms
Customer segmentation is critical for auto insurance companies to gain competitive advantage by mining useful customer-related information. While some efforts have been made to use customer segmentation in support of auto insurance decision making, their segmentation results tend to be affected by the characteristics of the algorithm used and are not validated against multiple algorithms. To this end, we propose an auto insurance business analytics approach that segments customers by using three mixed-type data clustering algorithms: k-prototypes, improved k-prototypes and similarity-based agglomerative clustering. The customer segmentation results of these algorithms can complement and reinforce each other and provide as much information as possible to support decision making. To confirm its practical value, the proposed approach extracts seven rules for an auto insurance company that may help the company make customer-related decisions and develop insurance products.
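The abstract above does not spell out how mixed-type records are compared. As a rough illustration only, the sketch below uses the dissimilarity commonly associated with k-prototypes: squared Euclidean distance on the numeric attributes plus a weight (here called `gamma`) times the number of categorical mismatches. The function names, the example attributes and the value of `gamma` are all assumptions for illustration, not details taken from the paper.

```python
def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Squared Euclidean distance on numeric parts plus gamma times categorical mismatches."""
    numeric_part = sum((a - b) ** 2 for a, b in zip(num_a, num_b))
    categorical_part = sum(1 for a, b in zip(cat_a, cat_b) if a != b)
    return numeric_part + gamma * categorical_part


def nearest_prototype(record, prototypes, gamma=1.0):
    """Return the index of the prototype closest to a (numeric, categorical) record."""
    num, cat = record
    costs = [
        mixed_dissimilarity(num, cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return costs.index(min(costs))


# Hypothetical data: (age in decades, premium in thousands), (vehicle type, region).
prototypes = [((3.0, 1.2), ("sedan", "urban")), ((5.5, 2.0), ("suv", "rural"))]
customer = ((5.0, 1.8), ("suv", "urban"))
print(nearest_prototype(customer, prototypes, gamma=0.5))
```

A full clustering run would alternate this assignment step with a prototype update (means for numeric attributes, modes for categorical ones); that loop is omitted here to keep the sketch short.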
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
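To make the definition of an outlier above concrete, here is a minimal sketch of one simple unsupervised rule for univariate numeric data, the interquartile-range (Tukey) fence. It is not claimed to be one of the techniques the paper surveys; the function name, the multiplier `k=1.5` and the sample values are assumptions chosen for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < lower or v > upper]


# Made-up sample: one reading is far from the rest.
readings = [10, 11, 12, 12, 13, 14, 95]
print(iqr_outliers(readings))
```

Rules like this only cover a single numeric attribute; categorical, mixed-type or multivariate data need other classes of technique, which is the selection problem the taxonomy addresses.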
Applications of Clustering with Mixed Type Data in Life Insurance
Death benefits are generally the largest cash flow item that affects
financial statements of life insurers where some still do not have a systematic
process to track and monitor death claims experience. In this article, we
explore data clustering to examine and understand how actual death claims
differ from expected, an early stage of developing a monitoring system crucial
for risk management. We extend the k-prototypes clustering algorithm to draw
inference from a life insurance dataset using only the insured's
characteristics and policy information without regard to known mortality. This
clustering has the feature to efficiently handle categorical, numerical, and
spatial attributes. Using gap statistics, the optimal clusters obtained from
the algorithm are then used to compare actual to expected death claims
experience of the life insurance portfolio. Our empirical data contains
observations, during 2014, of approximately 1.14 million policies with a total
insured amount of over 650 billion dollars. For this portfolio, the algorithm
produced three natural clusters, with each cluster having a lower actual to
expected death claims but with differing variability. The analytical results
provide management a process to identify policyholders' attributes that
dominate significant mortality deviations, and thereby enhance decision making
for taking necessary actions.
Comment: 25 pages, 6 figures, 5 tables
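The comparison of actual to expected death claims by cluster can be sketched as a simple ratio: total actual claims divided by total expected claims within each cluster. The code below is an illustrative assumption about how such a ratio might be computed; the function name, the tuple layout and the amounts are made up and do not come from the paper's data.

```python
from collections import defaultdict


def actual_to_expected(policies):
    """Map each cluster label to (sum of actual claims) / (sum of expected claims)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for cluster, actual, expected in policies:
        totals[cluster][0] += actual
        totals[cluster][1] += expected
    # Skip clusters with no expected claims to avoid dividing by zero.
    return {c: a / e for c, (a, e) in totals.items() if e > 0}


# Made-up rows: (cluster label, actual claim amount, expected claim amount).
policies = [
    (0, 80.0, 100.0),
    (0, 40.0, 50.0),
    (1, 30.0, 60.0),
]
print(actual_to_expected(policies))
```

A ratio below 1 means a cluster paid out less than expected; the variability of that ratio across clusters would need a separate calculation.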
An Extended RFM Model for Customer Behaviour and Demographic Analysis in Retail Industry
Background: Customer segmentation has become one of the most innovative ways to help businesses adopt appropriate marketing campaigns and reach targeted customers. The combination of the RFM model and machine learning has been widely applied in various areas. Motivations: With the rapid increase of transactional data, the RFM model can accurately segment customers and provide deeper insights into customers’ purchasing behaviour. However, the traditional RFM model is limited to three variables, Recency, Frequency and Monetary, and does not reveal segments based on demographic features, even though demographic characteristics contribute substantially to marketing strategies. Methods/Approach: The article proposes an extended RFMD model (D for Demographic) that combines behavioural and demographic variables. Customer segmentation can be performed effectively using the RFMD model with the K-Means and K-Prototype algorithms. Results: The extended model is applied to a retail dataset, and the experimental result shows 5 clusters with different features. The effectiveness of the new model is measured by the Adjusted Rand Index and Adjusted Mutual Information. Furthermore, we use Cohort analysis to analyse customer retention rates and recommend marketing strategies for each segment. Conclusions: According to the evaluation, the proposed RFMD model produced stable results across the two clustering algorithms. Businesses can apply this model to better understand customer behaviour alongside demographics and launch efficient campaigns.
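For readers unfamiliar with the three RFM variables, the sketch below derives them from a transaction log: Recency as days since the most recent purchase, Frequency as the number of purchases, and Monetary as total spend. The function name, the tuple layout, the reference date and the sample transactions are assumptions for illustration; the demographic (D) variables of the extended model would be joined on separately and are not shown.

```python
from datetime import date


def rfm_table(transactions, as_of):
    """Build {customer: {"recency", "frequency", "monetary"}} from (customer, day, amount) rows."""
    table = {}
    for customer, day, amount in transactions:
        days_ago = (as_of - day).days
        row = table.setdefault(
            customer, {"recency": days_ago, "frequency": 0, "monetary": 0.0}
        )
        row["recency"] = min(row["recency"], days_ago)  # most recent purchase wins
        row["frequency"] += 1
        row["monetary"] += amount
    return table


# Made-up transactions for two hypothetical customers.
transactions = [
    ("c1", date(2023, 1, 10), 20.0),
    ("c1", date(2023, 3, 1), 30.0),
    ("c2", date(2023, 2, 15), 100.0),
]
print(rfm_table(transactions, as_of=date(2023, 3, 31)))
```

The resulting numeric columns are typically scaled before being passed to a clustering algorithm such as K-Means.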