
    Robust optimization of algorithmic trading systems

    GAs (Genetic Algorithms) and GP (Genetic Programming) are investigated for finding robust Technical Trading Strategies (TTSs). TTSs evolved with standard GA/GP techniques tend to suffer from over-fitting, as the solutions evolved are very fragile to small disturbances in the data. The main objective of this thesis is to explore optimization techniques for GA/GP which produce robust TTSs that have similar performance during both optimization and evaluation, and are also able to operate in all market conditions and withstand severe market shocks. In this thesis, two novel techniques that increase the robustness of TTSs and reduce over-fitting are described and compared to standard GA/GP optimization techniques and the traditional investment strategy Buy & Hold. The first technique is a robust multi-market optimization methodology using a GA. Robustness is incorporated via the environmental variables of the problem, i.e. variability in the dataset is introduced by conducting the search for the optimum parameters over several market indices, in the hope of exposing the GA to differing market conditions. This technique shows an increase in the robustness of the solutions produced, with results also showing an improvement in performance when compared to optimization over a single market. The second technique is a random sampling method used to discover robust TTSs with GP. Variability is introduced into the dataset by randomly sampling segments and evaluating each individual on different random samples. This technique has shown promising results, substantially beating Buy & Hold. Overall, this thesis concludes that Evolutionary Computation techniques such as GA and GP, combined with robust optimization methods, are well suited to developing trading systems, and that the systems developed using these techniques can provide significant economic profits in all market conditions.
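The random-sampling idea described above can be sketched as follows. All names, parameter values, and the toy buy-and-hold scoring function below are illustrative placeholders, not the thesis's actual implementation: each candidate strategy is scored on several randomly drawn data segments, so a fragile strategy that only performs well on one stretch of prices receives a poor average fitness.

```python
import random

def sample_segments(prices, n_samples, seg_len, rng):
    """Draw random contiguous segments from a price series."""
    starts = [rng.randrange(0, len(prices) - seg_len) for _ in range(n_samples)]
    return [prices[s:s + seg_len] for s in starts]

def robust_fitness(strategy, prices, n_samples=5, seg_len=50, seed=0):
    """Average a strategy's return over several random segments, so fitness
    reflects many market stretches rather than one lucky one."""
    rng = random.Random(seed)
    segments = sample_segments(prices, n_samples, seg_len, rng)
    scores = [strategy(seg) for seg in segments]
    return sum(scores) / len(scores)

# Toy "strategy": buy-and-hold return over the segment.
buy_hold = lambda seg: seg[-1] / seg[0] - 1.0

prices = [100 + i * 0.1 for i in range(500)]  # synthetic up-trending series
print(round(robust_fitness(buy_hold, prices), 4))
```

In a GP setting, `strategy` would be an evolved individual and this average would replace single-dataset fitness during evolution.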

    A superior active portfolio optimization model for stock exchange

    Due to the vast number of stocks and the many possible ways of constructing investment portfolios, investors in the financial market face multiple investment opportunities. The investor's task is therefore extremely difficult, as investors must define their preferences for expected return and the extent to which they want to avoid potential investment risks. This research attempts to design active portfolios that outperform the appropriate market index. To achieve this aim, technical analysis and optimization procedures were used based on a hybrid model that combines the strong features of the Markowitz model with the Generalized Reduced Gradient (GRG) algorithm to maintain a good compromise between diversification and exploitation. The proposed model is used to construct an active portfolio optimization model for the Iraq Stock Exchange (ISX) for the period from January 2010 to February 2020, applied to all 132 companies registered on the exchange. In addition to the market portfolio, two methods, namely Equal Weight (EW) and Markowitz, were used to generate active portfolios against which to compare the research findings. After a thorough evaluation based on the Sharpe ratio criterion, the suggested model demonstrated its robustness, maximizing earnings at low risk.
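Since the comparison above rests on the Sharpe ratio, here is a minimal sketch of that criterion computed from Markowitz-style inputs (a vector of expected returns and a covariance matrix). The two-asset numbers are invented for illustration, not taken from the ISX study.

```python
import math

def portfolio_sharpe(weights, mean_returns, cov, risk_free=0.0):
    """Sharpe ratio = (expected portfolio return - risk-free rate) / volatility."""
    exp_ret = sum(w * m for w, m in zip(weights, mean_returns))
    # Portfolio variance: w' * Sigma * w
    var = sum(wi * wj * cov[i][j]
              for i, wi in enumerate(weights)
              for j, wj in enumerate(weights))
    return (exp_ret - risk_free) / math.sqrt(var)

# Two-asset toy example with equal weights (the EW benchmark in the abstract).
mu  = [0.08, 0.12]
cov = [[0.04, 0.01],
       [0.01, 0.09]]
print(round(portfolio_sharpe([0.5, 0.5], mu, cov), 4))  # → 0.5164
```

An optimizer such as GRG would search over `weights` (subject to budget and non-negativity constraints) to maximize this ratio.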

    NEW METHODS FOR MINING SEQUENTIAL AND TIME SERIES DATA

    Data mining is the process of extracting knowledge from large amounts of data. It covers a variety of techniques aimed at discovering diverse types of patterns on the basis of the requirements of the domain. These techniques include association rules mining, classification, cluster analysis and outlier detection. The availability of applications that produce massive amounts of spatial, spatio-temporal (ST) and time series data (TSD) is the rationale for developing specialized techniques to excavate such data. In spatial data mining, the spatial co-location rule problem is different from the association rule problem, since there is no natural notion of transactions in spatial datasets that are embedded in continuous geographic space. Therefore, we have proposed an efficient algorithm (GridClique) to mine interesting spatial co-location patterns (maximal cliques). These patterns are used as the raw transactions for an association rule mining technique to discover complex co-location rules. Our proposal includes certain types of complex relationships – especially negative relationships – in the patterns. The relationships can be obtained from only the maximal clique patterns, which have never been used until now. Our approach is applied on a well-known astronomy dataset obtained from the Sloan Digital Sky Survey (SDSS). ST data is continuously collected and made accessible in the public domain. We present an approach to mine and query large ST data with the aim of finding interesting patterns and understanding the underlying process of data generation. An important class of queries is based on the flock pattern. A flock is a large subset of objects moving along paths close to each other for a predefined time. 
One approach to processing a “flock query” is to map ST data into high-dimensional space and to reduce the query to a sequence of standard range queries that can be answered using a spatial indexing structure; however, the performance of spatial indexing structures rapidly deteriorates in high-dimensional space. This thesis sets out a preprocessing strategy that uses a random projection to reduce the dimensionality of the transformed space. We use probabilistic arguments to prove the accuracy of the projection and present experimental results showing that the curse of dimensionality can be managed in an ST setting by combining random projections with traditional data structures. In time series data mining, we devised a new space-efficient algorithm (SparseDTW) to compute the dynamic time warping (DTW) distance between two time series, which always yields the optimal result. This is in contrast to other approaches, which typically sacrifice optimality to attain space efficiency. The main idea behind our approach is to dynamically exploit the existence of similarity and/or correlation between the time series: the greater the similarity between the time series, the less space is required to compute the DTW between them. Other techniques for speeding up DTW impose a priori constraints and do not exploit similarity characteristics that may be present in the data. Our experiments demonstrate that SparseDTW outperforms these approaches. By applying the SparseDTW algorithm we discover an interesting pattern, “pairs trading”, in a large stock-market dataset of daily index prices from the Australian Stock Exchange (ASX) from 1980 to 2002.
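For reference, the textbook dense DTW recurrence that SparseDTW also solves to optimality can be sketched as below. This is the standard O(nm) time-and-space baseline, not the thesis's space-efficient algorithm, which prunes cells where the two series are dissimilar.

```python
def dtw(a, b):
    """Classic dynamic time warping distance between two numeric sequences.
    D[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j]."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # A step may extend the alignment by insertion, deletion, or match.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

print(dtw([1, 2, 3], [1, 2, 2, 3]))  # → 0.0: the series align perfectly
```

DTW's tolerance of local time shifts is what makes it suitable for comparing price series of co-moving stocks, as in the pairs-trading application mentioned above.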

    From metaheuristics to learnheuristics: Applications to logistics, finance, and computing

    A large number of decision-making processes in strategic sectors such as transport and production involve NP-hard problems, which are frequently characterized by high levels of uncertainty and dynamism. Metaheuristics have become the predominant method for solving challenging optimization problems in reasonable computing times. However, they frequently assume that inputs, objective functions, and constraints are deterministic and known in advance. These strong assumptions lead to work on oversimplified problems, and the solutions may demonstrate poor performance when implemented. Simheuristics integrate simulation into metaheuristics as a way to naturally solve stochastic problems and, in a similar fashion, learnheuristics combine statistical learning and metaheuristics to tackle problems in dynamic environments, where inputs may depend on the structure of the solution. The main contributions of this thesis include (i) a design for learnheuristics; (ii) a classification of works that hybridize statistical and machine learning with metaheuristics; and (iii) several applications in the fields of transport, production, finance, and computing.

    Using Particle Swarm Optimization for Market Timing Strategies

    Market timing is the issue of deciding when to buy or sell a given asset on the market. As one of the core issues of algorithmic trading systems, designers of such systems have turned to computational intelligence methods to aid them in this task. In this thesis, we explore the use of Particle Swarm Optimization (PSO) within the domain of market timing. PSO is a search metaheuristic first introduced in 1995 [28] and based on the behavior of birds in flight. Since its inception, the PSO metaheuristic has been extended to a variety of problems, including single-objective optimization, multiobjective optimization, niching, and dynamic optimization problems. Although popular in other domains, PSO has seen limited application to market timing. The current incumbent algorithm within the market timing domain is the Genetic Algorithm (GA), based on the volume of publications as noted in [40] and [84]. In this thesis, we use PSO to compose market timing strategies using technical analysis indicators. Our first contribution is a formulation that considers both the selection of components and the tuning of their parameters simultaneously, approaching market timing as a single-objective optimization problem. Current approaches consider only one of these aspects at a time: either selecting from a set of components with fixed parameter values, or tuning the parameters of a preset selection of components. Our second contribution is a novel training and testing methodology that explicitly exposes candidate market timing strategies to numerous price trends, reducing the likelihood of overfitting to a particular trend and giving a better approximation of performance under various market conditions.
Our final contribution is to consider market timing as a multiobjective optimization problem, optimizing five financial metrics and comparing the performance of our PSO variants against a well-established multiobjective optimization algorithm. To the best of our knowledge, these algorithms address unexplored research areas in the context of PSO, and are therefore original contributions. The computational results over a range of datasets show that the proposed PSO algorithms are competitive with GAs using the same formulation. Additionally, the multiobjective variant of our PSO algorithm achieves statistically significant improvements over NSGA-II.
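A minimal sketch of the global-best PSO update that such work builds on. The sphere objective and all parameter values below are illustrative placeholders, not the thesis's market-timing fitness function: each particle is pulled toward its own best position and the swarm's best, which is how a vector of strategy parameters could be tuned.

```python
import random

def pso(objective, dim, n_particles=20, iters=100, seed=0,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Minimize `objective` with canonical global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm's best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective: sphere function, minimum 0 at the origin.
best, best_val = pso(lambda x: sum(v * v for v in x), dim=2)
print(round(best_val, 6))
```

For market timing, the position vector would encode both which indicators are active and their parameter values, matching the simultaneous select-and-tune formulation described above.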

    Machine Learning

    Machine Learning can be defined in various ways, but broadly it refers to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems that exhibit some human-like intelligent behavior. More specifically, machine learning addresses the ability of systems to improve automatically through experience.

    Application of evolutionary computation techniques for the identification of innovators in open innovation communities

    Open innovation represents an emergent paradigm by which organizations make use of internal and external resources to drive their innovation processes. The growth of information and communication technologies has facilitated direct contact with customers and users, who can be organized as open innovation communities through the Internet. The main drawback of this scheme is the huge amount of information generated by users, which can negatively affect the correct identification of potentially applicable ideas. This paper proposes the use of evolutionary computation techniques for the identification of innovators, that is, those users with the ability to generate attractive and applicable ideas for the organization. For this purpose, several characteristics related to the participation activity of users throughout open innovation communities have been collected and combined in the form of discriminant functions to maximize their correct classification. The correct classification of innovators can be used to improve the idea-evaluation process carried out by the organization's innovation team. Moreover, the results obtained can also be used to test lead user theory and to measure the extent to which lead users are aligned with the organization's strategic innovation policies.

    Earnings prediction using machine learning methods and analyst comparison

    In this dissertation we propose an experimental study of how technical, macroeconomic, and financial variables, alongside analysts' forecasts, can be used with machine learning to optimize the prediction of the subsequent quarter's earnings, comparing the performance of the models against analysts' forecasts. The dissertation comprises three steps. In step one, an event study is conducted to test for abnormal returns in firms' stock prices on the day following earnings announcements, grouped by earnings-per-share (EPS) growth into classes of size 3, 6, and 9, computed for each quarter. In step two, several machine learning models are built to maximize the accuracy of EPS predictions. In the last step, investment strategies are constructed to take advantage of investors' expectations, which are closely correlated with analysts' predictions. Against the backdrop of an exhaustive analysis of quarterly earnings predictions using machine learning methods, conclusions are drawn regarding the superiority of the CatBoost classifier. All machine learning models tested underperform analyst predictions, which could be explained by the time and privileged information at analysts' disposal, as well as their selection of firms to cover. Regardless, machine learning models can be used as a confirmation of analyst predictions, and statistically significant investment strategies are pursued on those fundamentals. Importantly, high-confidence predictions by machine learning models are significantly more accurate than the average analyst forecast.
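The finding that high-confidence predictions beat the average forecast suggests a simple confidence filter over classifier outputs. The sketch below uses fabricated prediction records purely for illustration; it is not the dissertation's data, models, or results.

```python
def high_confidence(preds, threshold=0.8):
    """Keep only predictions whose reported probability meets the threshold."""
    return [p for p in preds if p["prob"] >= threshold]

def accuracy(preds):
    """Fraction of predictions whose label matches the realized outcome."""
    return sum(p["label"] == p["actual"] for p in preds) / len(preds)

# Hypothetical model outputs: predicted EPS-growth class, confidence, outcome.
preds = [
    {"label": "up",   "prob": 0.95, "actual": "up"},
    {"label": "down", "prob": 0.90, "actual": "down"},
    {"label": "up",   "prob": 0.55, "actual": "down"},
    {"label": "down", "prob": 0.60, "actual": "up"},
]
print(accuracy(preds))                   # → 0.5
print(accuracy(high_confidence(preds)))  # → 1.0
```

Acting only on the filtered subset is one way to turn a classifier that underperforms analysts on average into a usable confirmation signal.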

    Can bank interaction during rating measurement of micro and very small enterprises ipso facto determine the collapse of PD status?

    This paper begins with an analysis of trends, over the period 2012-2018, in total bank loans, non-performing loans, and the number of active, working enterprises. A review survey was conducted on national data from Italy, with a comparison developed on a local subset from the Sardinia Region. Empirical evidence appears to support the hypothesis of the paper: can the rating class assigned by banks, using current IRB and A-IRB systems, to micro and very small enterprises, whose ability to replace financial resources by endogenous means is structurally impaired, ipso facto orient the results of performance in the same terms as the PD assigned by the algorithm, thereby upending the principle of cause and effect? The thesis is developed through mathematical modeling that demonstrates the effect of the measurement tool (the rating algorithm applied by banks) on the collapse of the loan status (default, performing, or some intermediate point) of the assessed micro-entity. In conclusion, emphasis is given to the phenomenon through evidence of the intrinsically mutualistic link between the two populations of banks and (micro) enterprises, provided by a system of differential equations.
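A mutualistic two-population system of the kind alluded to can be written in a generic Lotka-Volterra form; this is a standard textbook formulation shown for orientation, not necessarily the paper's exact model:

```latex
\begin{aligned}
\frac{dB}{dt} &= B\,(r_B - a_B B + m_B E),\\
\frac{dE}{dt} &= E\,(r_E - a_E E + m_E B),
\end{aligned}
```

where $B$ and $E$ denote the bank and (micro) enterprise populations, $r_B, r_E$ are intrinsic growth rates, $a_B, a_E$ are self-limitation terms, and $m_B, m_E > 0$ capture the mutualistic benefit each population draws from the other.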