218 research outputs found

    DISCOVERING INTERESTING PATTERNS FOR INVESTMENT DECISION MAKING WITH GLOWER - A GENETIC LEARNER OVERLAID WITH ENTROPY REDUCTION

    Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or non-existent, which makes problem formulation open-ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search space. Second, the weak relationships among variables tend to be nonlinear, and may hold only in limited areas of the search space. Third, in financial practice, where analysts conduct extensive manual analysis of historically well-performing indicators, a key is to find the hidden interactions among variables that perform well in combination. Unfortunately, these are exactly the patterns that the greedy search biases incorporated by many standard rule algorithms will miss. In this paper, we describe and evaluate several variations of a new genetic learning algorithm (GLOWER) on a variety of data sets. The design of GLOWER has been motivated by financial prediction problems, but incorporates successful ideas from tree induction and rule learning. We examine the performance of several GLOWER variants on two UCI data sets as well as on a standard financial prediction problem (S&P500 stock returns), using the results to identify and use one of the better variants for further comparisons. We introduce a new (to KDD) financial prediction problem (predicting positive and negative earnings surprises), and experiment with GLOWER, contrasting it with tree- and rule-induction approaches. Our results are encouraging, showing that GLOWER has the ability to uncover effective patterns for difficult problems that have weak structure and significant nonlinearities. Information Systems Working Papers Series

    Sample Efficient Policy Search for Optimal Stopping Domains

    Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on three domains taken from diverse fields. Comment: To appear in IJCAI-201
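
    The abstract does not spell out how GFSE reuses data, so the following is only a minimal sketch of one kind of problem structure that optimal stopping offers: when stopping does not influence the observations, a single set of recorded full-length trajectories can score many candidate stopping rules. The threshold policy and the names `stop_value` and `best_threshold` are hypothetical illustrations, not the authors' method.

```python
def stop_value(trajectory, threshold):
    """Return obtained by stopping at the first observation >= threshold.

    Assumption for this sketch: the return equals the observation at the
    stopping time, and the process must stop at the final step at the latest.
    """
    for x in trajectory[:-1]:
        if x >= threshold:
            return x
    return trajectory[-1]  # forced stop at the horizon


def best_threshold(trajectories, thresholds):
    """Score every candidate threshold on the SAME recorded trajectories."""
    def average_return(threshold):
        total = sum(stop_value(t, threshold) for t in trajectories)
        return total / len(trajectories)

    return max(thresholds, key=average_return)


# Hypothetical recorded data: three full-length observation sequences.
trajectories = [[1, 5, 2], [4, 3, 6], [2, 2, 1]]
chosen = best_threshold(trajectories, thresholds=[3, 5])
```

    Because each trajectory is recorded to the horizon, no new environment samples are needed to compare the two thresholds; every policy is evaluated on all of the data.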

    GPR: A Data Mining Tool Using Genetic Programming

    This paper proposes an inductive data mining technique (named GPR) based on genetic programming. Unlike other mining systems, our technique can discover business rules that satisfy multiple (and possibly conflicting) decision or search criteria simultaneously. We present a step-by-step method to implement GPR, and introduce a prototype that generates production rules from real-life data. We also report in this article on the use of GPR in an organization that seeks to understand how its employees make decisions in a voluntary separation program. Using a personnel database of 12,787 employees with 35 descriptive variables, our technique is able to discover employees' hidden decision-making patterns in the form of production rules. As our approach does not require any domain-specific knowledge, it can be used without any major modification in different domains.

    A new sequential covering strategy for inducing classification rules with ant colony algorithms

    Ant colony optimization (ACO) algorithms have been successfully applied to discover a list of classification rules. In general, these algorithms follow a sequential covering strategy, where a single rule is discovered at each iteration of the algorithm in order to build a list of rules. The sequential covering strategy has the drawback of not coping with the problem of rule interaction, i.e., the outcome of a rule affects the rules that can be discovered subsequently, since the search space is modified by the removal of examples covered by previous rules. This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules. Our experiments using 18 publicly available data sets show that the predictive accuracy obtained by a new ACO classification algorithm implementing the proposed sequential covering strategy is statistically significantly higher than the predictive accuracy of state-of-the-art rule induction classification algorithms.
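
    To make the conventional strategy described in this abstract concrete, here is a minimal sketch of plain sequential covering: find one rule, remove the examples it covers, repeat. It uses a greedy single-condition rule finder in place of an ant colony search, so it illustrates the baseline strategy and its rule-interaction effect, not the new strategy the paper proposes. All function names and the toy data are hypothetical.

```python
from collections import Counter


def best_rule(examples):
    """Greedy stand-in for the rule search: pick the single
    (attribute, value) -> label rule with the highest precision,
    breaking ties by coverage. Assumes every example has attributes."""
    best = None
    for attrs, _ in examples:
        for attr, value in attrs.items():
            covered = [label for a, label in examples if a.get(attr) == value]
            label, hits = Counter(covered).most_common(1)[0]
            score = (hits / len(covered), len(covered))
            if best is None or score > best[0]:
                best = (score, (attr, value, label))
    return best[1]


def sequential_covering(examples):
    """Build an ordered rule list, one rule per iteration."""
    rules, remaining = [], list(examples)
    while remaining:
        attr, value, label = best_rule(remaining)
        rules.append((attr, value, label))
        # Removing covered examples changes what later rules can see.
        remaining = [(a, l) for a, l in remaining if a.get(attr) != value]
    return rules


def predict(rules, attrs, default=None):
    """The first matching rule in the list decides the class."""
    for attr, value, label in rules:
        if attrs.get(attr) == value:
            return label
    return default


data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rain", "windy": "yes"}, "stay"),
    ({"outlook": "rain", "windy": "no"}, "play"),
]
rules = sequential_covering(data)
```

    On this toy data the second rule (`windy = yes -> stay`) is only found after the sunny examples have been removed, and it is only correct because it sits behind the first rule in the list. That dependence on earlier rules is the interaction the abstract refers to.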

    Analysis of Production and Location Decisions by Means of Multi-Criteria Analysis

    During the last few years economists and operations researchers have paid much attention to multi-criteria analysis as a tool in modern decision-making. The basic feature of multi-criteria analysis is that a wide variety of relevant decision aspects can be taken into account without the need to translate all these aspects into monetary terms. This article gives a brief survey of these new methods in both a quantitative and a qualitative sense. After this survey, the relevance of multi-criteria analysis for entrepreneurial decisions in the field of production and investments is discussed. The analysis is illustrated by two examples of entrepreneurial decision problems that have been solved by means of multi-criteria analysis.

    PREDICTING INTRADAY STOCK RETURNS BY INTEGRATING MARKET DATA AND FINANCIAL NEWS REPORTS

    Forecasting in the financial domain is undoubtedly a challenging undertaking in data mining. While the majority of previous studies in this field utilize historical market data to predict future stock returns, we explore whether there is benefit in augmenting the prediction model with supplementary domain knowledge obtained from financial news reports. To this end, we empirically evaluate how the integration of these data sources helps to predict intraday stock returns. We consider several types of integration methods: variable-based as well as bundling methods. To discern whether the integration methods are sensitive to the type of forecasting algorithm, we have implemented each integration method using three different data mining algorithms. The results show several scenarios in which appending market-based data with textual news-based data helps to improve forecasting performance. Successful integration strongly depends on which forecasting algorithm and variable representation method are used. The findings are promising enough to warrant further studies in this direction.

    Expert Stock Picker: The Wisdom of (Experts in) Crowds

    The phrase "the wisdom of crowds" suggests that good verdicts can be achieved by averaging the opinions and insights of large, diverse groups of people who possess varied types of information. Online user-generated content enables researchers to view the opinions of large numbers of users publicly. These opinions, in the form of reviews and votes, can be used to automatically generate remarkably accurate verdicts (collective estimations of future performance) about companies, products, and people on the Web to resolve very tough problems. The wealth and richness of user-generated content may enable firms and individuals to aggregate consumer-think for better business understanding. Our main contribution, here applied to user-generated stock pick votes from a widely used online financial newsletter, is a genetic algorithm approach that can be used to identify the appropriate vote weights for users based on their prior individual voting success. Our method allows us to identify and rank experts within the crowd, enabling better stock pick decisions than the S&P 500. We show that the online crowd performs better, on average, than the S&P 500 for two test time periods, 2008 and 2009, in terms of both overall returns and risk-adjusted returns, as measured by the Sharpe ratio. Furthermore, we show that giving more weight to the votes of the experts in the crowds increases the accuracy of the verdicts, yielding an even greater return in the same time periods. We test our approach by utilizing more than three years of publicly available stock pick data. We compare our method to approaches derived from both the computer science and finance literature. We believe that our approach can be generalized to other domains where user opinions are publicly available early and where those opinions can be evaluated. For example, YouTube video ratings may be used to predict downloads, or online reviewer ratings on Digg may be used to predict the success or popularity of a story.
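
    Two quantities in this abstract are easy to illustrate: a weighted crowd verdict and the Sharpe ratio. The sketch below is a simplification under stated assumptions. It sets each user's weight to their past hit rate rather than searching for weights with a genetic algorithm as the paper does, and it computes an unannualized Sharpe ratio from the sample standard deviation. The function names, users, and numbers are all hypothetical.

```python
import statistics


def hit_rate_weights(history):
    """history: {user: list of booleans, True if a past pick was correct}.
    Assumption: weight = fraction of correct past picks (not the paper's GA)."""
    return {user: sum(picks) / len(picks) for user, picks in history.items()}


def weighted_verdict(votes, weights):
    """votes: {user: +1 (outperform) or -1 (underperform)}.
    Returns the sign of the weighted vote sum; 0 means a tie."""
    total = sum(weights.get(user, 0.0) * vote for user, vote in votes.items())
    return 1 if total > 0 else -1 if total < 0 else 0


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


history = {
    "ann": [True, True, True, False],    # 0.75 hit rate
    "bob": [True, False, False, False],  # 0.25 hit rate
    "cy": [False, True, False, False],   # 0.25 hit rate
}
votes = {"ann": 1, "bob": -1, "cy": -1}

equal = weighted_verdict(votes, {user: 1.0 for user in votes})
expert = weighted_verdict(votes, hit_rate_weights(history))
ratio = sharpe_ratio([0.01, 0.03, 0.02])
```

    With equal weights the two less reliable voters outvote the third, while the hit-rate weights let the more reliable voter decide the verdict. That is the intuition behind weighting experts within the crowd.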

    A hybrid decision tree/genetic algorithm method for data mining
