
    Learning to solve planning problems efficiently by means of genetic programming

    Declarative problem solving, such as planning, poses interesting challenges for Genetic Programming (GP). Recent attempts to apply GP to planning fit two approaches: (a) using GP to search in plan space, or (b) using GP to evolve a planner. In this article, we propose to evolve only the heuristics that make a particular planner more efficient. This approach is more feasible than (b) because it does not have to build a planner from scratch but can take advantage of existing planning systems. It is also more efficient than (a) because, once the heuristics have been evolved, they can be used to solve a whole class of planning problems in a planning domain, instead of running GP anew for every planning problem. Empirical results show that our approach (EVOCK) is able to evolve heuristics in two planning domains (the blocks world and the logistics domain) that improve PRODIGY4.0 performance. Additionally, we experiment with a new genetic operator, Instance-Based Crossover, which uses traces of the base planner as raw genetic material to be injected into the evolving population.
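    By way of illustration, a toy GP loop in this spirit might look as follows. This is a minimal sketch, not the EVOCK system itself: the state features, the fitness proxy and the trace fragments are hypothetical placeholders (EVOCK actually evolves control knowledge for PRODIGY4.0).

```python
import random

OPS = ("+", "-", "max")
FEATURES = ("n_misplaced", "n_clear", "goal_depth")   # hypothetical state features

def random_tree(depth=2):
    """Grow a random heuristic expression tree."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(FEATURES)
    return [random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1)]

def evaluate(tree, state):
    """Evaluate a heuristic tree on a dict of state features."""
    if isinstance(tree, str):
        return state[tree]
    op, left, right = tree
    a, b = evaluate(left, state), evaluate(right, state)
    return {"+": a + b, "-": a - b, "max": max(a, b)}[op]

def instance_based_crossover(parent, trace_fragments):
    """Splice in a subtree harvested from a solution trace of the base planner."""
    if isinstance(parent, str) or random.random() < 0.3:
        return random.choice(trace_fragments)
    child = list(parent)
    idx = random.choice((1, 2))
    child[idx] = instance_based_crossover(child[idx], trace_fragments)
    return child

def fitness(tree, training_states):
    """Placeholder fitness; a real system would score planner performance."""
    return -sum(evaluate(tree, s) for s in training_states)

states = [{"n_misplaced": 3, "n_clear": 1, "goal_depth": 2},
          {"n_misplaced": 1, "n_clear": 2, "goal_depth": 1}]
fragments = [["+", "n_misplaced", "goal_depth"]]              # mined from traces
pop = [random_tree() for _ in range(20)]
for _ in range(30):                                           # one generation per loop
    pop.sort(key=lambda t: fitness(t, states), reverse=True)
    pop = pop[:10] + [instance_based_crossover(random.choice(pop[:10]), fragments)
                      for _ in range(10)]
print(pop[0])                                                 # best evolved heuristic
```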

    Automatic Generation of Logical Knowledge

    We study the problems that arise when logical knowledge is generated automatically in artificial intelligence systems, above all in automated theorem-proving systems. Three necessary requirements for a generator of logical knowledge are formulated, and it is shown how they can be met.

    The 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies

    This publication comprises the papers presented at the 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies, held at the NASA/Goddard Space Flight Center, Greenbelt, Maryland, on May 9-11, 1995. The purpose of this annual conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed.

    Air Vehicle Path Planning

    This dissertation explores optimal path planning for air vehicles. An air vehicle exposed to illumination by a tracking radar is considered, and the problem of determining an optimal planar trajectory connecting two prespecified points is addressed. An analytic solution yielding the trajectory that minimizes the received radar energy reflected from the target is derived using the Calculus of Variations. The related problem of an air vehicle tracked by a passive sensor is also solved. Using the insights gained from the single-vehicle radar exposure minimization problem, a hierarchical cooperative control law is formulated to determine the optimal trajectories that minimize the cumulative exposure of multiple air vehicles during a rendezvous maneuver. The problem of one air vehicle minimizing exposure to multiple radars is also addressed using a variational approach, as well as a sub-optimal minimax argument. Local and global optimality issues are explored. A novel decision criterion is developed that determines the geometric conditions dictating when it is preferable to go between, or around, two radars. Lastly, an optimal minimum-time control law is obtained for the search and target identification mission of an autonomous air vehicle. This work demonstrates that an awareness of the consequences of embracing sub-optimal and non-globally optimal solutions to optimization problems such as air vehicle path planning is essential.
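    To make the exposure-minimization setup concrete, here is a numerical sketch rather than the dissertation's closed-form variational solution. It assumes the standard monostatic-radar model in which received energy scales as 1/r^4, and it minimizes the discretized exposure integral over the path's interior waypoints; the geometry and endpoint values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

radar = np.array([0.0, 0.0])                         # radar at the origin
start, goal = np.array([-5.0, 1.0]), np.array([5.0, 1.0])
n = 20                                               # interior waypoints

def cost(flat):
    """Discretized exposure integral: sum of ds / r^4 over path segments."""
    pts = np.vstack([start, flat.reshape(n, 2), goal])
    mids = 0.5 * (pts[:-1] + pts[1:])                # segment midpoints
    ds = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    r = np.linalg.norm(mids - radar, axis=1)         # range to radar
    return np.sum(ds / r**4)

x0 = np.linspace(start, goal, n + 2)[1:-1].ravel()   # straight-line initial guess
res = minimize(cost, x0, method="L-BFGS-B")
path = np.vstack([start, res.x.reshape(n, 2), goal])
print(path.round(2))                                 # the path bows away from the radar
```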

    Self Monitoring Goal Driven Autonomy Agents

    The growing abundance of autonomous systems is driving the need for robust performance. Most current systems are not fully autonomous and often fail when placed in real environments. Via self-monitoring, agents can identify when their own, or externally given, boundaries are violated, thereby increasing their performance and reliability. Specifically, self-monitoring is the identification of unexpected situations that either (1) prohibit the agent from reaching its goal(s) or (2) result in the agent acting outside of its boundaries. Increasingly complex and open environments warrant the use of such robust autonomy (e.g., self-driving cars, delivery drones, and all types of future digital and physical assistants). The techniques presented herein advance the current state of the art in self-monitoring, demonstrating improved performance in a variety of challenging domains. In such domains it is impossible to plan for every situation: in many cases not all aspects of a domain are known beforehand, and, even if they were, the cost of encoding them is high. Self-monitoring agents are able to identify and then respond to previously unexpected, or never-before-encountered, situations. When dealing with unknown situations, one must start from expected behavior and use it to derive unexpected behavior. The representation of expectations varies among domains: in a real-time strategy game like Starcraft, it could be logically inferred concepts; in a Mars rover domain, it could be an accumulation of actions' effects. Nonetheless, explicit expectations are necessary to identify the unexpected. This thesis lays the foundation for self-monitoring in goal driven autonomy agents in both rich, expressive domains and partially observable domains. We introduce multiple techniques for handling such environments. We show how inferred expectations are needed to enable high-level planning in real-time strategy games. We show how a hierarchical structure of Goal-driven Autonomy (GDA) enables agents to operate within large state spaces. Within Hierarchical Task Network planning, we show how informed expectations identify states that are likely to prevent an agent from reaching its goals in dynamic domains. Finally, we give a model of expectations for self-monitoring at the metacognitive level, and empirical results for agents equipped with and without metacognitive expectations.
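    As a concrete illustration of the "accumulated effects" style of expectations mentioned above, a minimal discrepancy-detection loop for a rover-like domain might look as follows. The action model and facts are hypothetical, and real GDA systems add goal formulation and explanation on top of the detection step shown here.

```python
def expected_state(initial_state, actions, effects):
    """Accumulate each executed action's add/delete effects into an expected state."""
    state = set(initial_state)
    for a in actions:
        adds, dels = effects[a]
        state = (state - set(dels)) | set(adds)
    return state

def discrepancies(expected, observed):
    """Unexpected situations: expected facts missing, or unexpected facts present."""
    return expected - observed, observed - expected

# Hypothetical action model for a toy rover domain.
effects = {"drive(a,b)": ({"at(b)"}, {"at(a)"}),
           "sample(b)":  ({"has_sample"}, set())}

exp = expected_state({"at(a)"}, ["drive(a,b)", "sample(b)"], effects)
missing, surprising = discrepancies(exp, {"at(b)", "wheel_stuck"})
print(missing, surprising)   # {'has_sample'} {'wheel_stuck'} -> trigger goal reasoning
```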

    Systematic Trading: Calibration Advances through Machine Learning

    Systematic trading in finance uses computer models to define trade goals, risk controls and rules that can execute trade orders in a methodical way. This thesis investigates how performance in systematic trading can be crucially enhanced by both i) persistently reducing the bid-offer spread quoted by the trader through optimized and realistically backtested strategies and ii) improving the out-of-sample robustness of the selected strategy through the injection of theory into the typically data-driven calibration processes. In doing so it brings to the foreground sound scientific reasons that, for the first time to my knowledge, technically underpin popular academic observations about the recent nature of the financial markets. The thesis conducts consecutive experiments across strategies within the three important building blocks of systematic trading: a) execution, b) quoting and c) risk-reward, allowing me to progressively generate more complex and accurate backtested scenarios, as recently demanded in the literature (Cahan et al. (2010)). The three experiments conducted are:

    1. Execution: an execution model based on support vector machines. The first experiment is deployed to improve the realism of the other two. It analyses a popular model of execution: the volume-weighted average price (VWAP). The VWAP algorithm aims to split the size of an order across the trading session according to the expected intraday volume profile, since market activity typically exhibits convex seasonality, with more activity around the open and the closing auctions than during the rest of the day. The main challenge is therefore to provide the model with a reasonable expected profile. After showing on my data sample that two simple static approaches to the profile outperform the PCA-ARMA model of Bialkowski et al. (2008) (a popular two-fold model composed of a dynamic component around an unsupervised learning structure), a further combination of both through an index based on supervised learning is proposed. The Sample Sensitivity Index successfully allows the expected volume profile to be estimated more accurately by identifying, via support vector machines, those ranges of time where the model should be less sensitive to past data. Only once the intraday execution risk has been defined can the quoting policy of a mid-frequency (in general, up to a week) hedging strategy be accurately analysed.
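    By way of illustration of the VWAP scheduling above, here is a minimal sketch that splits a parent order against a simple static expected profile (the cross-day median, one of the "simple static approaches" mentioned). Synthetic data stands in for historical volumes, and the Sample Sensitivity Index and SVM blending are not reproduced.

```python
import numpy as np

# Synthetic history: 60 days x 13 intraday buckets, with U-shaped seasonality
# (more volume near the open and the close), standing in for real market data.
hist = np.random.lognormal(mean=0.0, sigma=0.3, size=(60, 13))
hist *= 1 + 0.8 * np.cos(np.linspace(0, 2 * np.pi, 13))

profile = np.median(hist, axis=0)
profile /= profile.sum()                 # expected fraction of volume per bucket

order_size = 100_000                     # shares in the parent order (illustrative)
child_orders = np.round(order_size * profile).astype(int)
print(child_orders)                      # larger slices near the open and the close
```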
    2. Quoting: a quoting model built upon particle swarm optimization. The second experiment analyses, for the first time to my knowledge, how to achieve the disruptive 50% bid-offer spread discount observed in Menkveld (2013) without increasing the risk profile of the trading agent. The experiment depends crucially on a series of variables, of which market impact and slippage are typically the most difficult to estimate. By adapting the market impact model of Almgren et al. (2005) to the VWAP developed in the previous experiment, and by estimating its slippage through the distribution of its errors, a framework within which the bid-offer spread can be assessed is generated. First, a full-replication spread (one set out following the strict definition of a product in order to hedge it completely) is calculated and fixed as a benchmark. Then, by allowing the agent to benefit from a lower market impact at the cost of assuming deviation risk (tracking error and tail risk), a non-full-replication spread is calibrated through particle swarm optimization (PSO), as in Diez et al. (2012), and compared with the benchmark. Finally, it is shown that the latter can reach a discount of 50% with respect to the benchmark if a certain number of trades is granted, which typically occurs on the most liquid securities. This result not only underpins Menkveld's observations but also points out that there is room for further reductions. When seeking additional performance, once the quoting policy has been defined, a further layer with a calibrated risk-reward policy shall be deployed.
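    A minimal PSO loop in this spirit follows, with a toy one-dimensional objective standing in for the thesis's backtested impact/deviation-risk trade-off; the objective, its coefficients and the swarm parameters are all illustrative.

```python
import numpy as np

def objective(spread):
    """Toy stand-in: tighter spreads attract flow but incur deviation risk."""
    impact_saving = -np.log1p(spread)          # hypothetical benefit of quoting tighter
    deviation_risk = 4.0 * (spread - 0.5)**2   # hypothetical tracking-error penalty
    return impact_saving + deviation_risk

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=30)             # particle positions (candidate spreads)
v = np.zeros_like(x)                           # particle velocities
pbest, pbest_f = x.copy(), objective(x)        # personal bests
gbest = pbest[np.argmin(pbest_f)]              # global best

for _ in range(100):
    r1, r2 = rng.random(30), rng.random(30)
    v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
    x = x + v
    f = objective(x)
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    gbest = pbest[np.argmin(pbest_f)]

print(f"calibrated spread: {gbest:.3f}")
```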
    3. Risk-Reward: a calibration model defined within a Q-learning framework. The third experiment analyses how the calibration process of a risk-reward policy can be enhanced to achieve more robust out-of-sample performance, a cornerstone in quantitative trading. It responds to the literature that has recently focused on the detrimental role of overfitting (Bailey et al. (2013a)). The experiment was motivated by the assumption that techniques underpinned by financial theory should behave better (show a lower deviation between in-sample and out-of-sample performance) than purely data-driven processes. Both approaches are therefore compared within a framework of active trading upon a novel indicator. The indicator, called the Expectations' Shift, is rooted in the expectations of the markets' evolution embedded in the dynamics of prices. The crucial challenge of the experiment is the injection of theory into the calibration process, which is achieved through reinforcement learning (RL), an area of machine learning inspired by behaviourist psychology that is concerned with how software agents make decisions in a specific environment when incentivised by a policy of rewards. By analysing the Q-learning matrix that collects the state-action pairs learnt by the agent within the environment defined by each combination of parameters in the calibration universe, the rationale that an autonomous agent would have learnt in terms of risk management can be recovered. Finally, by selecting the combination of parameters whose attached rationale is closest to that of the portfolio manager, a data-driven solution that converges to the theory-driven solution can be found, and this is shown to outperform out-of-sample the classical approaches followed in finance.
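    A tabular Q-learning sketch of the kind of agent this experiment inspects follows. The states (drawdown buckets), actions (position sizes), dynamics and reward are hypothetical placeholders for the thesis's calibration universe; only the Q-learning update itself is standard.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions = 4, 3            # drawdown buckets x {cut, hold, add}
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1    # learning rate, discount, exploration

def step(s, a):
    """Toy dynamics: larger positions earn more but risk deeper drawdowns."""
    pnl = rng.normal(0.1 * a, 0.5 * (a + 1))
    s2 = min(n_states - 1, max(0, s + (1 if pnl < 0 else -1)))
    reward = pnl - 0.5 * s2           # penalize ending in a deep drawdown bucket
    return s2, reward

s = 0
for _ in range(20_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # Q-learning update
    s = s2

# The greedy action per drawdown bucket is the learnt risk-management "rationale"
# one would compare against the portfolio manager's priors.
print(np.argmax(Q, axis=1))
```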
    The thesis contributes to science by addressing what techniques could underpin recent academic findings about the nature of the trading industry for which a scientific explanation had not yet been given:
    • A novel agent-based approach that allows for robust out-of-sample performance by crucially providing the trader with a way to inject financial insights into the generally data-driven-only calibration processes. In this way it surpasses the generic model limitations present in the literature (Bailey et al. (2013b), Schorfheide and Wolpin (2012), Van Belle and Kerr (2012) or Weiss and Kulikowski (1991)) by finding a point where theory-driven patterns (the trader's priors, which tend to enhance out-of-sample robustness) merge with data-driven ones (those that allow latent information to be exploited).
    • The provision of a technique that, to the best of my knowledge, explains for the first time how to reduce the bid-offer spread quoted by a traditional trader without modifying her risk appetite. Such a reduction had not previously been addressed in the literature, despite the fact that the increasing regulation against the assumption of risk by market makers (e.g. the Dodd–Frank Wall Street Reform and Consumer Protection Act) coincides with the aggressive discounts observed by Menkveld (2013). As a result, this thesis could further contribute to science by serving as a framework for future analyses in the context of systematic trading.
    • The completion of a mid-frequency trading experiment with high-frequency execution information. It is shown how the latter can have a significant effect on the former, not only through the erosion of its performance but, more subtly, by changing its entire strategic design (both its optimal composition and its parameterization). This tends to be highly disregarded in the financial literature.
    More importantly, the methodologies disclosed herein have been crucial in underpinning the setup of a new unit in the industry, BBVA's Global Strategies & Data Science. This disruptive, global and cross-asset team gives an enhanced role to science by becoming chiefly responsible for the risk management of the Bank's strategies in both electronic trading and electronic commerce. Other contributions include: the provision of a novel risk measure (flowVaR); the proposal of a novel trading indicator (the Expectations' Shift); and the definition of a novel index that improves the estimation of the intraday volume profile (the Sample Sensitivity Index).