12 research outputs found

    Using Text Mining to Analyze Quality Aspects of Unstructured Data: A Case Study for “stock-touting” Spam Emails

    The growth in the use of text mining tools and techniques over the last decade has been driven primarily by the sheer volume of unstructured text and the need to extract useful and, more importantly, high-quality information from it. The impetus to analyse unstructured data efficiently and effectively as part of an organization's decision-making processes has further motivated the need to better understand how to use text mining tools and techniques. This paper describes a case study of a stock spam e-mail architecture that demonstrates the process of refining linguistic resources to extract relevant, high-quality information, including stock profiles, financial keywords, stock and company news (positive/negative), and compound phrases, from stock spam e-mails. The aim of the study is to identify high-quality information patterns that can support the relevant authorities in detecting and analyzing fraudulent activities.

    Detecting disturbances in supply chains: the case of capacity constraints

    Purpose – The ability to detect disturbances quickly as they arise in a supply chain helps to manage them efficiently and effectively. This paper aims to demonstrate the feasibility of automatically, and therefore quickly, detecting a specific disturbance: constrained capacity at a supply chain echelon. Design/methodology/approach – Different echelons of a simulated four-echelon supply chain were individually capacity constrained to assess the impact on the profiles of system variables, and to develop a signature that relates those profiles to the echelon at which the capacity constraint is located. A review of disturbance detection techniques across various domains formed the basis for choosing the signature-based technique. Findings – The signature for detecting a capacity-constrained echelon was found to be based on cluster profiles of the shipping and net inventory variables for that echelon and for the other echelons in the supply chain, where the variables are represented as spectra. Originality/value – Detection of disturbances in a supply chain, including constrained capacity at an echelon, has seen limited research, and this study contributes to that area.

    Adversarial Attacks on Probabilistic Autoregressive Forecasting Models

    We develop an effective method for generating adversarial attacks on neural models that output a sequence of probability distributions rather than a sequence of single values. This setting includes the recently proposed deep probabilistic autoregressive forecasting models, which estimate the probability distribution of a time series given its past and achieve state-of-the-art results in a diverse set of application domains. The key technical challenge we address is effectively differentiating through the Monte Carlo estimation of statistics of the joint distribution of the output sequence. Additionally, we extend prior work on probabilistic forecasting to the Bayesian setting, which allows conditioning on future observations instead of only on past observations. We demonstrate that our approach can successfully generate attacks with small input perturbations in two challenging tasks where robust decision making is crucial: stock market trading and prediction of electricity consumption. Comment: 15 pages, 6 figures

    Detecting market manipulation in stock market data

    Anomaly detection is an extensively researched problem with diverse applications in many domains. It is the process of finding data points or patterns in a dataset that do not conform to expected behavior. Solutions to this problem have used techniques from disciplines such as statistics, machine learning, data mining, spectral theory and information theory. In the case of stock market data, the input is a non-linear, complex time series, which renders statistical methods ineffective. The aim of this thesis is to detect anomalies within the Standard and Poor and Qatar Stock Exchange using the behavior of similar time series. Many works on stock market manipulation focus on supervised learning techniques, which require labeled datasets. The labeling process requires substantial effort, and anomalous behavior is dynamic in nature. For those reasons, an unsupervised technique for detecting market manipulation would be valuable. The Contextual Anomaly Detector (CAD) is an unsupervised method that finds anomalies by looking at similarly behaving time series and using them to predict expected values. When the predicted value differs from the actual value in the time series by more than a certain threshold, the point is considered an anomaly. This thesis examines the CAD and implements a different preprocessing step to improve recall and precision.
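
The thresholding idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's CAD implementation: the peer group is fixed by hand rather than selected by similarity, the expected value is assumed to be the plain average of the peers, and the names `contextual_anomalies`, `target`, `peers` and `threshold` are hypothetical.

```python
from statistics import mean

def contextual_anomalies(target, peers, threshold):
    """Return time indices where the target deviates from the peer average
    by more than the threshold. Assumption: peers are already known to
    behave similarly to the target."""
    flagged = []
    for t, actual in enumerate(target):
        # Expected value at time t, predicted from the peer series
        expected = mean(series[t] for series in peers)
        if abs(actual - expected) > threshold:
            flagged.append(t)
    return flagged

# Made-up daily returns, for illustration only
target = [0.01, 0.00, 0.09, -0.01]
peers = [
    [0.01, 0.01, 0.00, -0.02],
    [0.02, -0.01, 0.01, 0.00],
    [0.00, 0.00, 0.02, -0.01],
]
anomalies = contextual_anomalies(target, peers, threshold=0.05)
```

Here only the third observation moves away from its peers by more than the threshold, so it is the only index flagged.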

    Detection of Stock Price Manipulation Using Kernel Based Principal Component Analysis and Multivariate Density Estimation

    Stock price manipulation uses illegitimate means to artificially influence the market prices of several stocks. It causes massive losses and undermines investors’ confidence and the integrity of the stock market. Several existing research works have focused on detecting a specific manipulation scheme using supervised learning but lack the adaptive capability to capture different manipulative strategies. This forces the assumption of model parameter values specific to the underlying manipulation scheme. In addition, supervised learning requires labelled data, which is difficult to acquire owing to the confidential and proprietary nature of trading data. The proposed research establishes a detection model based on unsupervised learning using Kernel Principal Component Analysis (KPCA), applying the increased variance of selected latent features in higher dimensions. A proposed Multidimensional Kernel Density Estimation (MKDE) clustering is then applied to the selected components to identify abnormal patterns of manipulation in the data. This research has an advantage over existing methods in overcoming the ambiguity of assuming values for several parameters and in reducing the high dimensionality obtained from conventional KPCA, thereby reducing computational complexity. The robustness of the detection model has also been evaluated when two or more manipulative activities occur within a short duration of each other, and by varying the window length of the dataset fed to the model. The results give a comprehensive assessment of the model on multiple datasets and show a significant improvement in F-measure together with a significant reduction in the false alarm rate (FAR).
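
The density-based step can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's MKDE clustering: the KPCA projection is omitted, the two-dimensional `points` merely stand in for selected components, and the isotropic Gaussian kernel, `bandwidth` and `quantile` cutoff are hypothetical choices.

```python
import math

def gaussian_kde_density(point, data, bandwidth):
    """Kernel density estimate at `point`: the average of isotropic
    Gaussian kernels centred on each data point."""
    dim = len(point)
    norm = (2 * math.pi * bandwidth ** 2) ** (dim / 2)
    total = 0.0
    for other in data:
        sq_dist = sum((a - b) ** 2 for a, b in zip(point, other))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(data) * norm)

def low_density_indices(data, bandwidth, quantile):
    """Flag points whose estimated density is strictly below the density
    found at the given quantile of the sample."""
    densities = [gaussian_kde_density(p, data, bandwidth) for p in data]
    cutoff = sorted(densities)[int(quantile * len(data))]
    return [i for i, d in enumerate(densities) if d < cutoff]

# Five clustered points and one isolated point (made-up values)
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (3.0, 3.0)]
flagged = low_density_indices(points, bandwidth=0.5, quantile=0.2)
```

The isolated point sits in a region of low estimated density, so it is the one flagged as abnormal.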

    Computational intelligent hybrid model for detecting disruptive trading activity

    The term “disruptive trading behaviour” was first proposed by the U.S. Commodity Futures Trading Commission and is now widely used in US and EU regulation (MiFID II) to describe activities that create a misleading appearance of market liquidity or depth, or an artificial upward or downward price movement, to serve the trader's own purposes. Such activities, identified as a new form of financial fraud in EU regulations, damage the proper functioning and integrity of capital markets and are hence extremely harmful. Existing studies of this issue have, in most cases, either focused on empirical analysis of such cases or proposed detection models based on certain assumptions about the market. Effective methods that analyse and detect such disruptive activities through direct study of trading behaviours have not been investigated to date, which leaves a knowledge gap in the literature. This paper seeks to address that gap and provides a hybrid model composed of two data-mining-based detection modules that effectively identify disruptive trading behaviours. The hybrid model is designed to work online: the limit order stream is transformed into a stream of extracted features. One detection module, “Single Order Detection,” detects disruptive behaviours by identifying abnormal patterns in every single trading order. The other module, “Order Sequence Detection,” examines the contextual relationships within a sequence of trading orders using an extended hidden Markov model, which identifies whether sequential changes in the extracted features are manipulative activities. Both modules were evaluated using large volumes of real tick data from NASDAQ, which demonstrated that both are able to identify a range of disruptive trading behaviours and that they outperform the selected traditional benchmark models. Thus, this hybrid model is shown to make a substantial contribution to the literature on financial market surveillance and to offer a practical and effective approach to identifying disruptive trading behaviour.
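
The sequence-based idea can be illustrated with a minimal sketch that scores a sequence of discretized order features under a hidden Markov model of normal trading. It uses a plain discrete HMM with the scaled forward algorithm, not the paper's extended HMM, and every parameter value, state and symbol meaning below is a hypothetical assumption.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of an observation sequence under a discrete HMM,
    computed with the scaled forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

# Hypothetical "normal trading" model: 2 hidden states, 3 feature symbols
# (0 = small change, 1 = medium change, 2 = large change)
start = [0.8, 0.2]
trans = [[0.9, 0.1], [0.3, 0.7]]
emit = [[0.7, 0.25, 0.05], [0.3, 0.4, 0.3]]

ll_typical = forward_log_likelihood([0, 0, 1, 0, 0, 1], start, trans, emit)
ll_suspicious = forward_log_likelihood([2, 2, 2, 2, 2, 2], start, trans, emit)
```

A sequence that the normal-behaviour model finds much less likely than typical sequences of the same length would be a candidate for closer inspection; the cutoff for "much less likely" is left open here.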

    WALDATA: Wavelet transform based adversarial learning for the detection of anomalous trading activities

    Detecting manipulative activities in stock market trading poses a significant challenge because of the complex temporal correlations inherent in dynamically changing stock price data. This challenge is exacerbated by the limited availability of labelled anomalous trading data. Stock price manipulations, which appear as infrequent anomalies in stock price trading data, are difficult to capture owing to their sporadic occurrence and dynamically evolving nature. This scarcity and inherent complexity significantly complicate the creation of labelled datasets and hence hinder the development of robust detection of different stock price manipulation schemes through supervised learning methods. Overcoming these challenges is crucial for enhancing our understanding of market dynamics and implementing robust market surveillance systems. To address these challenges, we introduce a novel stock price manipulation detection approach called WALDATA (Wavelet Transform based Adversarial Learning for the Detection of Anomalous Trading Activities). We leverage the Wavelet Transform (WT) to decompose non-stationary stock price time series into informative features and capture multi-scale dynamics within the data. We encode stock price data by transforming it into scalogram images through the Continuous Wavelet Transform, effectively converting stock price time series data into a 2D image representation. Subsequently, we employ a Generative Adversarial Network (GAN) architecture, originally applied to computer vision, to learn the underlying distribution of normal trading behaviour from the encoded images. We then train the discriminator as an anomaly detector for identifying manipulative trading activities in the stock market. The efficacy of WALDATA is evaluated on diverse real-world stock datasets using 1-level tick data from the LOBSTER project. The experimental results show an average AUC of 0.99 with low false alarm rates across various market conditions. These findings validate the effectiveness of the proposed WALDATA approach in accurately identifying stock price manipulations and provide investors and regulators with valuable insights for the development of advanced market surveillance systems. This research demonstrates the potential of combining wavelet-based feature extraction and time-series-to-image representation of stock prices with generative adversarial learning frameworks for anomaly detection in financial time series data. The successful implementation of WALDATA paves the way for further advances in market surveillance, contributing towards a more efficient and robust financial system and a fair market environment.
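
The time-series-to-scalogram encoding can be sketched as below. This is a simplified illustration, not the WALDATA pipeline: the abstract does not name the mother wavelet, so the Ricker (Mexican hat) wavelet is an assumption, the transform is a direct discrete sum rather than an optimized routine, the GAN stage is omitted, and `ricker`, `scalogram` and the input values are hypothetical.

```python
import math

def ricker(t, scale):
    """Ricker (Mexican hat) wavelet evaluated at offset t for a given scale."""
    x = t / scale
    norm = 2 / (math.sqrt(3 * scale) * math.pi ** 0.25)
    return norm * (1 - x * x) * math.exp(-x * x / 2)

def scalogram(series, scales):
    """Return a 2D list: one row per scale, one column per time step,
    each entry the absolute value of the wavelet coefficient."""
    n = len(series)
    image = []
    for scale in scales:
        row = []
        for centre in range(n):
            # Correlate the series with the wavelet centred at `centre`
            coeff = sum(series[k] * ricker(k - centre, scale) for k in range(n))
            row.append(abs(coeff))
        image.append(row)
    return image

# A flat series with a single spike at index 10 (made-up values)
series = [0.0] * 10 + [1.0] + [0.0] * 10
image = scalogram(series, scales=[1.0, 2.0, 4.0])
```

Each row of `image` shows the same spike at a different scale, which is the kind of multi-scale 2D representation an image-based model could then consume.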

    Data analytic approach for manipulation detection in stock market

    The term “price manipulation” describes the actions of “rogue” traders who employ carefully designed trading tactics to push equity prices up or down for profit. Such activities damage the proper functioning, integrity, and stability of financial markets. In response, regulators proposed new regulatory guidance to prohibit such activities. However, owing to the lack of existing research and the complexity of implementation, the application of that guidance, i.e. MiFID II in the EU, was postponed to 2018. Existing studies of this issue either focus on empirical analysis of such cases or propose detection models based on certain assumptions. Effective methods based on analysing trading behaviour data have not yet been studied. This paper seeks to address that gap and provides two models based on data analytics. The first, a static model, detects manipulative behaviours by identifying abnormal patterns of trading activity. The activities are represented by transformed limit orders, where the proposed transformation partially reduces the non-stationary nature of the financial data. The second, a dynamic model based on a hidden Markov model, identifies sequential and contextual changes in trading behaviours. Both models are evaluated using real stock tick data, which demonstrates their effectiveness in identifying a range of price manipulation scenarios and shows that they outperform the selected benchmarks. Thus, both models are shown to make a substantial contribution to the literature and to offer a practical and effective approach to identifying market manipulation.