Market events such as order placement and order cancellation are examples of
the complex and substantial flow of data that surrounds a modern financial
engineer. New mathematical techniques, developed to describe the interactions
of complex oscillatory systems and known as the theory of rough paths, provide
new tools for analysing and describing these data streams and extracting the
vital information. In this paper we illustrate how a very small number of
coefficients obtained from the signature of financial data can be sufficient to
classify this data for subtle underlying features and make useful predictions.
This paper presents financial examples in which we learn from data and then
proceed to classify fresh streams. The classification is based on features of
streams that are specified through the coordinates of the signature of the
path. At a mathematical level the signature is a faithful transform of a
multidimensional time series (Ben Hambly and Terry Lyons \cite{uniqueSig}).
Hao Ni and Terry Lyons \cite{NiLyons} introduced the possibility of its use to
understand financial data and pointed to the potential this approach has for
machine learning and prediction.
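
For orientation, the signature coordinates used as features can be sketched as follows. The notation is assumed here for illustration and is not fixed by the text above. For a path $X=(X^1,\dots,X^d):[0,T]\to\mathbb{R}^d$ of bounded variation, the coordinate of the signature indexed by the word $(i_1,\dots,i_k)$, with each $i_j\in\{1,\dots,d\}$, is the $k$-fold iterated integral
\[
S(X)^{i_1,\dots,i_k}_{0,T}
  \;=\; \int_{0<t_1<\cdots<t_k<T} dX^{i_1}_{t_1}\cdots dX^{i_k}_{t_k}.
\]
At level $k=1$ these are simply the increments $X^{i}_T-X^{i}_0$. Truncating at a low level $k$ leaves $d+d^2+\cdots+d^k$ coordinates, which is the sense in which a small number of coefficients can serve as classification features.
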
We evaluate and refine these theoretical suggestions against practical
examples of interest and present a few motivating experiments which demonstrate
the information the signature can easily capture in a non-parametric way,
avoiding traditional statistical modelling of the data. In the first experiment we
identify atypical market behaviour across standard 30-minute time buckets
sampled from the WTI crude oil futures market (NYMEX). The second and third
experiments aim to characterise the market "impact" of parent orders generated
by two different trade execution algorithms on the FTSE 100 Index futures
market listed on NYSE Liffe, and to distinguish between the orders produced by
each algorithm.