6 research outputs found

    The cross-association relation based on intervals ratio in fuzzy time series

    Fuzzy time series (FTS) is a forecasting model based on linguistic values. The method was developed in recent years because existing approaches were insufficiently accurate. This research modifies existing methods to improve their accuracy in three respects: partitioning the universe of discourse, building the fuzzy logic relationships (FLR), and the variation of historical data, using the intervals ratio, the cross-association relationship, and Indonesian rubber production data, respectively. The modified steps start with the intervals ratio, which partitions the determined universe of discourse. Triangular fuzzy sets are then built, allowing fuzzification. After this, the FLRs are built based on the cross-association relationship, leading to defuzzification. The average forecasting error rate (AFER) was used to compare the modified results with the existing methods. The simulations were conducted using Indonesian rubber production data from 2000 to 2020. With an AFER of 4.77% (below 10%), the modified method has a smaller error than previous methods, meeting the criterion for very good forecasting. In addition, the coefficient values of D1 and D2 were obtained automatically from the intervals ratio algorithm. Future work will modify the partitioning of the universe of discourse using frequency density to eliminate unused partition intervals.
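The AFER metric mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the commonly used definition (the mean of |actual - forecast| / actual, expressed in percent); the function names `afer` and `is_very_good` and the sample numbers are hypothetical and are not taken from the paper.

```python
def afer(actual, forecast):
    """Average forecasting error rate in percent (assumed standard definition)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    # Mean of the relative absolute errors, scaled to a percentage.
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def is_very_good(afer_percent, threshold=10.0):
    """The abstract treats an AFER below 10% as very good forecasting."""
    return afer_percent < threshold


if __name__ == "__main__":
    # Made-up values, for illustration only.
    observed = [100.0, 200.0, 400.0]
    predicted = [90.0, 210.0, 400.0]
    score = afer(observed, predicted)
    print(f"AFER = {score:.2f}%, very good: {is_very_good(score)}")
```

Because each error is scaled by its observed value, the metric is unit-free, which is what makes a single 10% threshold usable across datasets.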

    Triangular Fuzzy Time Series for Two Factors High-order based on Interval Variations

    Fuzzy time series (FTS), first introduced by Song and Chissom, has been developed to forecast data such as enrollment figures, stock indices, and air pollution. In FTS forecasting, several authors define the universe of discourse using coefficient values, substituting arbitrary integers or real numbers. This study focuses on interval variation in order to obtain better evaluation results. Coefficient values are analyzed and compared for unequal and equal partition intervals, with base and triangular fuzzy membership functions applied in a two-factor high-order model. The study was implemented on Shen-hu stock index data. The models were evaluated by average forecasting error rate (AFER) and compared with existing methods. The AFER was 0.28% for the Shen-hu stock index daily data. Based on this result, the research can serve as a reference for determining a better interval and membership degree value in fuzzy time series.
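To illustrate what a triangular membership function over equal partition intervals looks like, here is a minimal sketch. It assumes triangles peaked at the midpoint of each equal-width interval and overlapping their neighbours by half a width; the helper names (`triangular`, `equal_partition_sets`, `fuzzify`) are hypothetical. It does not reproduce the paper's coefficient-based or unequal-interval scheme.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangle with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


def equal_partition_sets(low, high, n):
    """Build n triangular sets whose peaks sit at the midpoints of n equal intervals."""
    width = (high - low) / n
    return [
        (low + (i - 0.5) * width, low + (i + 0.5) * width, low + (i + 1.5) * width)
        for i in range(n)
    ]


def fuzzify(x, sets):
    """Return the index of the fuzzy set in which x has the highest membership."""
    degrees = [triangular(x, a, b, c) for a, b, c in sets]
    return max(range(len(sets)), key=degrees.__getitem__)


if __name__ == "__main__":
    sets = equal_partition_sets(0.0, 100.0, 4)  # universe of discourse [0, 100]
    print(fuzzify(30.0, sets))  # 1: 30 lies in the second interval, [25, 50)
```

Swapping `equal_partition_sets` for a function that returns unequal triangles would leave `fuzzify` unchanged, which is one way to compare partitioning schemes on the same data.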

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the huge volumes of accumulated data and leverage this information to gain insights into the finance industry. Because the finance industry has no physical products to manufacture, data has become the most valuable asset of financial organisations and the basis for actionable business insights. This is where data mining techniques come to their rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing, and prediction of price movements, to name a few. This work aims to survey the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering, and time series prediction. Due to the dynamics, uncertainty, and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been proved that hybrid methods are more accurate in prediction, closely followed by the neural network technique. This survey offers an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it could give a good picture of data mining techniques in computational finance to beginners who want to work in the field.

    Evolutionary multivariate time series prediction

    Multivariate time series (MTS) prediction plays a significant role in many practical data mining applications, such as finance, energy supply, and medical care. Over the years, various prediction models have been developed to obtain robust and accurate predictions. However, this is not an easy task, given a variety of key challenges. First, not all channels (each channel represents one time series) are informative (channel selection). Given the complexity of each selected time series, it is difficult to predefine a time window to use for inputs. Second, since the selected time series may come from different domains and be collected with different devices, they may require different feature extraction techniques with suitable parameters to extract meaningful features (feature extraction), which influences the selection and configuration of the predictor, i.e., prediction (configuration). The challenge arising from channel selection, feature extraction, and prediction (configuration) is to perform them jointly to improve prediction performance. Third, we resort to ensemble learning to solve the MTS prediction problem composed of the previously mentioned operations, where the challenge is to obtain a set of models that are both accurate and diverse. Each of these challenges leads to an NP-hard combinatorial optimization problem, which cannot be solved using traditional methods since it is non-differentiable. The evolutionary algorithm (EA), an efficient metaheuristic stochastic search technique that is highly competent at solving complex combinatorial optimization problems with mixed types of decision variables, may provide an effective way to address the challenges arising from MTS prediction. The main contributions are supported by the following investigations.
First, we propose a discrete evolutionary model, which mainly focuses on seeking the influential subset of channels of MTS and the optimal time windows for each of the selected channels for the MTS prediction task. A comprehensive experimental study on real-world electricity consumption data with auxiliary environmental factors demonstrates the efficiency and effectiveness of the proposed method in searching for the informative time series and the respective time windows and parameters in a predictor, in comparison to the result obtained through enumeration. Subsequently, we define the basic MTS prediction pipeline containing channel selection, feature extraction, and prediction (configuration). To perform these key operations, we propose an evolutionary model construction (EMC) framework to seek the optimal subset of channels of MTS, suitable feature extraction methods and respective time windows applied to the selected channels, and parameter settings in the predictor simultaneously for the best prediction performance. To implement EMC, a two-step EA is proposed, where the first-step EA mainly focuses on channel selection, while in the second step a specially designed EA works on feature extraction and prediction (configuration). A real-world electricity dataset with exogenous environmental information is used, and the whole dataset is split into two further datasets according to holiday and non-holiday events. The performance of EMC is demonstrated on all three datasets in comparison to hybrid models and some existing methods. Then, based on the prediction pipeline defined previously, we propose an evolutionary multi-objective ensemble learning model (EMOEL) by employing a multi-objective evolutionary algorithm (MOEA) subject to two conflicting objectives, i.e., accuracy and model diversity.
The MOEA leads to a Pareto front (PF) composed of non-dominated optimal solutions, each of which represents the optimal subset of selected channels, the selected feature extraction methods, the selected time windows, and the selected parameters in the predictor. To boost ultimate prediction accuracy, the models corresponding to these optimal solutions are linearly combined, with the combination coefficients optimized via a single-objective task-oriented EA. The superiority of EMOEL is demonstrated on electricity consumption data with climate information in comparison to several state-of-the-art models. We also propose a multi-resolution selective ensemble learning model, where multiple resolutions are constructed from the minimal granularity using statistics. At the current time stamp, the preceding time series data is sampled at different time intervals (i.e., resolutions) to constitute the time windows. For each resolution, multiple base learners with different parameters are first trained. A feature selection technique is applied to search for the optimal set of trained base learners, and least squares regression is used to combine them. The performance of the proposed ensemble model is verified on the electricity consumption data for next-step and next-day prediction. Finally, based on EMOEL and multi-resolution, instead of only combining the models generated from each PF, we propose an evolutionary ensemble learning (EEL) framework, where multiple PFs are aggregated to produce a composite PF (CPF) after duplicate solutions are removed from the PFs and the remainder is sorted into different levels of non-dominated fronts (NDFs). Feature selection techniques are applied to find the optimal subset of models in the level-accumulated NDF, and least squares is used to combine the selected models. The performance of EEL, which uses three different predictors as base learners, is evaluated through a comprehensive analysis of parameter sensitivity.
The superiority of EEL is demonstrated in comparison to the best result from the single-objective EA, the best individual from the PF, and several state-of-the-art models across electricity consumption and air quality datasets, both of which use environmental factors from other domains as auxiliary factors. In summary, this thesis provides studies on how to build efficient and effective models for MTS prediction. The resulting frameworks investigate the influential factors, consider the pipeline composed of channel selection, feature extraction, and prediction (configuration) simultaneously, and maintain good generalization and accuracy across different applications. The proposed algorithms that implement the frameworks use techniques from evolutionary computation (single-objective EA and MOEA), machine learning, and data mining. We believe that this research provides a significant step towards constructing robust and accurate models for solving MTS prediction problems. In addition, with the case study on electricity consumption prediction, it will help decision-makers determine the trend of future energy consumption for scheduling and planning the operations of the energy supply system.
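The notion of a Pareto front of non-dominated solutions under two conflicting objectives can be sketched as below. This is a generic brute-force filter, not the thesis's MOEA: it assumes each candidate model is summarised by a hypothetical `(error, diversity)` pair, where lower error and higher diversity are better, and the sample values are made up.

```python
def dominates(p, q):
    """True if p is at least as good as q on both objectives and better on one.

    Each point is (error, diversity): error is minimised, diversity maximised.
    """
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    strictly_better = p[0] < q[0] or p[1] > q[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (error, diversity) scores for five candidate models.
    candidates = [(0.10, 0.2), (0.12, 0.5), (0.15, 0.4), (0.20, 0.9), (0.10, 0.1)]
    print(pareto_front(candidates))
    # [(0.1, 0.2), (0.12, 0.5), (0.2, 0.9)]
```

No point on the resulting front can improve one objective without giving up the other, which is why the abstract goes on to combine several of these models rather than pick a single one.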

    Data mining in computational finance

    Computational finance is a relatively new discipline whose birth can be traced back to the early 1950s. Its major objective is to develop and study practical models, focusing on techniques that apply directly to financial analyses. The large number of decisions and computationally intensive problems involved in this discipline make data mining and machine learning models integral to improving, automating, and expanding current processes. One of the objectives of this research is to present the state of the art in data mining and machine learning techniques applied in the core areas of computational finance. Next, a detailed analysis of public and private finance datasets is performed in an attempt to find interesting facts in the data and draw conclusions regarding the usefulness of features within the datasets. Credit risk evaluation is one of the crucial modern concerns in this field. Credit scoring is essentially a classification problem where models are built using information about past applicants to categorise new applicants as ‘creditworthy’ or ‘non-creditworthy’. We appraise the performance of a few classical machine learning algorithms on the problem of credit scoring. Typically, credit scoring databases are large and characterised by redundant and irrelevant features, making the classification task more computationally demanding. Feature selection is the process of selecting an optimal subset of relevant features. We propose an improved information-gain directed wrapper feature selection method using genetic algorithms and successfully evaluate its effectiveness against baseline and generic wrapper methods using three benchmark datasets. One of the tasks of financial analysts is to estimate a company’s worth. In the last piece of work, this study predicts the growth rate of company earnings using three machine learning techniques.
We employed the technique of lagged features, which allowed varying amounts of recent history to be brought into the prediction task and transformed the time series forecasting problem into a supervised learning problem. This work was applied to a private time series dataset.
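The lagged-features idea described here can be sketched in a few lines. This is a generic sliding-window construction under assumed conventions (the previous `n_lags` values form one feature row and the current value is the target); the function name `make_lagged` and the toy series are hypothetical and do not reflect the study's private dataset.

```python
def make_lagged(series, n_lags):
    """Turn a univariate series into (X, y) pairs for supervised learning.

    Row i of X holds the n_lags values that precede target y[i].
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X = [list(series[t - n_lags:t]) for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y


if __name__ == "__main__":
    X, y = make_lagged([1, 2, 3, 4, 5], n_lags=2)
    print(X)  # [[1, 2], [2, 3], [3, 4]]
    print(y)  # [3, 4, 5]
```

Varying `n_lags` controls how much recent history each row carries, and the resulting `X` and `y` can be passed to any standard regressor.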