SynthETIC: an individual insurance claim simulator with feature control
Recent years have seen a rapid increase in the application of machine learning
methods to insurance loss reserving. These methods yield most value when applied to large data
sets, such as individual claims, or large claim triangles. In short, they are
likely to be useful in the analysis of any data set whose volume is sufficient
to obscure a naked-eye view of its features. Unfortunately, such large data
sets are in short supply in the actuarial literature. Accordingly, one needs to
turn to synthetic data. Although the ultimate objective of these methods is
application to real data, the use of synthetic data containing features
commonly observed in real data is also to be encouraged.
While there are a number of claims simulators in existence, each valuable
within its own context, the inclusion of a number of desirable (but
complicated) data features requires further development. Accordingly, in this
paper we review those desirable features, and propose a new simulator of
individual claim experience called SynthETIC.
Our simulator is publicly available, open source, and fills a gap in the
non-life actuarial toolkit. The simulator specifically allows for desirable
(but optionally complicated) data features typically occurring in practice,
such as variations in rates of settlement and development patterns,
superimposed inflation, and various discontinuities, and also enables various
dependencies between variables. The user has full control of the mechanics of
the evolution of an individual claim. As a result, the complexity of the data
set generated (meaning the level of difficulty of analysis) may be dialled
anywhere from extremely simple to extremely complex.
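
As a rough illustration of what dialling complexity can mean, the following is a minimal Python sketch of a toy individual claim generator. It does not use or reproduce the SynthETIC interface: the function name `simulate_claims`, its parameters, and every distributional choice (uniform occurrence times, lognormal sizes, exponential settlement delays) are assumptions made purely for this example. With `superimposed_inflation=0.0` the data set is simple; a positive rate adds one of the complicating features mentioned above.

```python
import random


def simulate_claims(n_claims, n_periods=10, superimposed_inflation=0.0, seed=0):
    """Toy individual claim generator (hypothetical, not the SynthETIC API).

    Each claim gets an occurrence time, a settlement delay and a size.
    Superimposed inflation is applied per period up to the settlement time.
    """
    rng = random.Random(seed)
    claims = []
    for claim_id in range(n_claims):
        # Assumed distributions, chosen only for illustration.
        occurrence = rng.uniform(0, n_periods)   # occurrence time, in periods
        delay = rng.expovariate(0.5)             # settlement delay, mean 2 periods
        size = rng.lognormvariate(8.0, 1.0)      # claim size before inflation
        settlement = occurrence + delay
        # Inflation factor grows with calendar time of settlement.
        paid = size * (1.0 + superimposed_inflation) ** settlement
        claims.append(
            {
                "id": claim_id,
                "occurrence": occurrence,
                "settlement": settlement,
                "size": size,
                "paid": paid,
            }
        )
    return claims


# Same seed, two complexity settings: no inflation versus 5% per period.
simple = simulate_claims(100, seed=1)
inflated = simulate_claims(100, superimposed_inflation=0.05, seed=1)
```

Because both calls share a seed, the two data sets differ only in the inflation feature, which makes it easy to check whether a reserving method detects it.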