
    Temporal Feature Selection with Symbolic Regression

    Building and discovering useful features when constructing machine learning models is the central task for the machine learning practitioner. Good features are useful not only in increasing the predictive power of a model but also in illuminating the underlying drivers of a target variable. In this research we propose a novel feature learning technique in which symbolic regression is endowed with a "Range Terminal" that allows it to explore functions of the aggregate of variables over time. We test the Range Terminal on a synthetic data set and a real-world data set in which we predict seasonal greenness using satellite-derived temperature and snow data over a portion of the Arctic. On the synthetic data set we find that symbolic regression with the Range Terminal outperforms standard symbolic regression and Lasso regression. On the Arctic data set we find that it outperforms standard symbolic regression and fails to beat the Lasso regression, but finds useful features describing the interaction between Land Surface Temperature, Snow, and seasonal vegetative growth in the Arctic.
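
The abstract does not spell out how the Range Terminal is defined, so the following is only a minimal sketch of the general idea: a terminal that collapses a window of a time series into a single value that a symbolic regression search could then combine with other terms. The function name `range_terminal`, its parameters, and the temperature values are all assumptions made for illustration, not taken from the paper.

```python
from statistics import mean


def range_terminal(series, start, end, aggregate=mean):
    """Aggregate series[start:end] (end exclusive) into one feature value.

    Hypothetical helper: the window bounds and the aggregate function are
    the quantities a search procedure would be free to vary.
    """
    if not 0 <= start < end <= len(series):
        raise ValueError("window must satisfy 0 <= start < end <= len(series)")
    return aggregate(series[start:end])


# Made-up weekly temperatures, used only to exercise the helper.
weekly_temperature = [-12.0, -8.0, -3.0, 1.0, 4.0, 9.0]

# Two candidate features built from different windows and aggregates.
late_mean = range_terminal(weekly_temperature, 3, 6)       # mean of the last three weeks
early_min = range_terminal(weekly_temperature, 0, 3, min)  # minimum of the first three weeks
```

In this sketch the window bounds are plain integers; a genetic programming system would presumably treat them as evolvable parameters, but that detail is not stated in the abstract.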

    Fast, accurate, and transferable many-body interatomic potentials by symbolic regression

    The length and time scales of atomistic simulations are limited by the computational cost of the methods used to predict material properties. In recent years there has been great progress in the use of machine learning algorithms to develop fast and accurate interatomic potential models, but it remains a challenge to develop models that generalize well and are fast enough to be used at extreme time and length scales. To address this challenge, we have developed a machine learning algorithm based on symbolic regression in the form of genetic programming that is capable of discovering accurate, computationally efficient many-body potential models. The key to our approach is to explore a hypothesis space of models based on fundamental physical principles and select models within this hypothesis space based on their accuracy, speed, and simplicity. The focus on simplicity reduces the risk of overfitting the training data and increases the chances of discovering a model that generalizes well. Our algorithm was validated by rediscovering an exact Lennard-Jones potential and a Sutton-Chen embedded atom method potential from training data generated using these models. By using training data generated from density functional theory calculations, we found potential models for elemental copper that are simple, as fast as embedded atom models, and capable of accurately predicting properties outside of their training set. Our approach requires relatively small sets of training data, making it possible to generate training data using highly accurate methods at a reasonable computational cost. We present our approach, the forms of the discovered models, and assessments of their transferability, accuracy, and speed.
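
For readers unfamiliar with the validation target mentioned above, the standard 12-6 Lennard-Jones pair potential is V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates that textbook form in reduced units; the function name and default parameters are illustrative assumptions and say nothing about the authors' code or the models their algorithm discovered.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair energy at separation r.

    epsilon is the well depth and sigma the separation at which the energy
    crosses zero; the defaults of 1.0 correspond to reduced units.
    """
    if r <= 0:
        raise ValueError("separation r must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy is zero at r = sigma and reaches its minimum, -epsilon,
# at r = 2**(1/6) * sigma.
energy_at_sigma = lennard_jones(1.0)
energy_at_minimum = lennard_jones(2 ** (1 / 6))
```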

    Mechanisms for the generation and regulation of sequential behaviour

    A critical aspect of much human behaviour is the generation and regulation of sequential activities. Such behaviour is seen in both naturalistic settings, such as routine action and language production, and laboratory tasks, such as serial recall and many reaction time experiments. There are a variety of computational mechanisms that may support the generation and regulation of sequential behaviours, ranging from those underlying Turing machines to those employed by recurrent connectionist networks. This paper surveys a range of such mechanisms, together with a range of empirical phenomena related to human sequential behaviour. It is argued that the empirical phenomena pose difficulties for most sequencing mechanisms, but that converging evidence from behavioural flexibility, error data arising when the system is stressed or when it is damaged following brain injury, and between-trial effects in reaction time tasks points to a hybrid symbolic activation-based mechanism for the generation and regulation of sequential behaviour. Some implications of this view for the nature of mental computation are highlighted.

    Rerepresenting and Restructuring Domain Theories: A Constructive Induction Approach

    Theory revision integrates inductive learning and background knowledge by combining training examples with a coarse domain theory to produce a more accurate theory. There are two challenges that theory revision and other theory-guided systems face. First, a representation language appropriate for the initial theory may be inappropriate for an improved theory. While the original representation may concisely express the initial theory, a more accurate theory forced to use that same representation may be bulky, cumbersome, and difficult to reach. Second, a theory structure suitable for a coarse domain theory may be insufficient for a fine-tuned theory. Systems that produce only small, local changes to a theory have limited value for accomplishing complex structural alterations that may be required. Consequently, advanced theory-guided learning systems require flexible representation and flexible structure. An analysis of various theory revision systems and theory-guided learning systems reveals specific strengths and weaknesses in terms of these two desired properties. Designed to capture the underlying qualities of each system, a new system uses theory-guided constructive induction. Experiments in three domains show improvement over previous theory-guided systems. This leads to a study of the behavior, limitations, and potential of theory-guided constructive induction. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article.