    A divide and conquer method for symbolic regression

    Symbolic regression aims to find a function that best explains the relationship between the independent variables and the objective value, based on a given set of sample data. Genetic programming (GP) is usually considered an appropriate method for the problem because it can optimize functional structure and coefficients simultaneously. However, GP may converge too slowly on large-scale problems that involve many variables. Fortunately, in many applications the target function is separable or partially separable. This feature motivated us to develop a new method, divide and conquer (D&C), for symbolic regression, in which the target function is divided into a number of sub-functions and each sub-function is then determined by any GP algorithm. Separability is probed by a newly proposed technique, the Bi-Correlation test (BiCT). D&C-powered GP has been tested on some real-world applications, and the study shows that D&C can help GP find the target function much more rapidly.
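
The abstract does not spell out how BiCT works, so the following is only a minimal sketch of a correlation-based separability probe in the same spirit, not the authors' algorithm. It assumes a noise-free black-box function `f` that can be evaluated at chosen points. The function name `looks_separable` and the parameters `c1`, `c2`, and `tol` are hypothetical. The idea: if `f` is additively or multiplicatively separable between one group of variables and the rest, then holding the rest at two different constant settings while sweeping the group over the same samples yields two output vectors that are linearly related, so their correlation has magnitude 1.

```python
import numpy as np

def looks_separable(f, n_vars, group, c1=0.3, c2=-0.7,
                    n_samples=200, tol=1e-6, seed=0):
    """Heuristic probe: is f separable between `group` and the other variables?

    Assumption (not from the source): f takes a 1-D array of length n_vars
    and returns a float, and its outputs are noise-free.
    """
    rng = np.random.default_rng(seed)
    rest = [i for i in range(n_vars) if i not in group]

    # Sample the variables in `group` once and reuse them in both runs.
    base = rng.uniform(-1.0, 1.0, size=(n_samples, n_vars))
    run_a, run_b = base.copy(), base.copy()

    # Hold the remaining variables at two different constant settings.
    run_a[:, rest] = c1
    run_b[:, rest] = c2

    y_a = np.array([f(row) for row in run_a])
    y_b = np.array([f(row) for row in run_b])

    # For f = g(group) + h(rest) or f = g(group) * h(rest), y_a and y_b are
    # linearly related, so the correlation coefficient is +1 or -1.
    r = np.corrcoef(y_a, y_b)[0, 1]
    return bool(abs(abs(r) - 1.0) < tol)
```

With measured rather than simulated data, the tolerance would need to be loosened to allow for noise. The probe is also inconclusive when one of the two runs is constant, because the correlation is then undefined and the function returns `False`.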

    Predicting the energy output of wind farms based on weather data: important variables and their correlation

    Pre-print available at: http://arxiv.org/abs/1109.1922
    Wind energy plays an increasing role in the supply of energy worldwide. The energy output of a wind farm is highly dependent on the weather conditions present at its site. If the output can be predicted more accurately, energy suppliers can coordinate the collaborative production of different energy sources more efficiently to avoid costly overproduction. In this paper, we take a computer science perspective on energy prediction based on weather data and analyze the important parameters as well as their correlation with the energy output. To deal with the interaction of the different parameters, we use symbolic regression based on the genetic programming tool DataModeler. Our studies are carried out on publicly available weather and energy data for a wind farm in Australia. We report on the correlation of the different variables with the energy output. The model obtained for energy prediction gives a very reliable prediction of the energy output for newly supplied weather data.
    © 2012 Elsevier Ltd.
    Ekaterina Vladislavleva, Tobias Friedrich, Frank Neumann, Markus Wagne

    Interpretable Categorization of Heterogeneous Time Series Data

    Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classification of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X).
    Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 201
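
To make the idea of branching on grammar-derived logical expressions concrete, here is a small sketch of how a single tree node might pick such an expression. It is an illustration under stated assumptions, not the learning algorithm from the paper. The mini-grammar (`always`/`eventually` applied to a threshold comparison on one feature), the exhaustive enumeration, the Gini criterion, and all function names are hypothetical choices made for this example.

```python
from itertools import product

# Hypothetical mini-grammar (an assumption for illustration):
#   expr -> TEMPORAL_OP '(' FEATURE COMPARATOR THRESHOLD ')'
TEMPORAL_OPS = {"always": all, "eventually": any}
COMPARATORS = {"<": lambda v, t: v < t, ">": lambda v, t: v > t}

def enumerate_expressions(features, thresholds):
    """List every expression the mini-grammar can derive."""
    return list(product(TEMPORAL_OPS, features, COMPARATORS, thresholds))

def evaluate(expr, series):
    """Evaluate an expression on one multivariate series (dict: feature -> values)."""
    op, feature, cmp, threshold = expr
    return TEMPORAL_OPS[op](COMPARATORS[cmp](v, threshold) for v in series[feature])

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(dataset, labels, features, thresholds):
    """Pick the expression whose true/false partition has the lowest weighted Gini."""
    best_expr, best_score = None, float("inf")
    for expr in enumerate_expressions(features, thresholds):
        left = [y for x, y in zip(dataset, labels) if evaluate(expr, x)]
        right = [y for x, y in zip(dataset, labels) if not evaluate(expr, x)]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_expr, best_score = expr, score
    return best_expr, best_score
```

A node chosen this way stays readable: an expression tuple such as `("eventually", "alt", ">", 5)` can be read as "the feature `alt` eventually exceeds 5". A full tree would apply `best_split` recursively to the two partitions. A richer grammar would make exhaustive enumeration impractical, which is one reason a dedicated learning algorithm is needed.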

    Late-Breaking Papers of EuroGP-99

    This booklet contains the late-breaking papers of the Second European Workshop on Genetic Programming (EuroGP’99), held in Göteborg, Sweden, 26–27 May 1999. EuroGP’99 was one of the EvoNet workshops on evolutionary computing, EvoWorkshops’99. The purpose of the late-breaking papers was to provide attendees with information about research that was initiated, enhanced, improved, or completed after the original paper submission deadline in December 1998. To ensure coverage of the most up-to-date research, the deadline for submission was set only a month before the workshop. Late-breaking papers were examined for relevance and quality by the organisers of EuroGP’99, but no formal review process took place. The 3 late-breaking papers in this booklet (which was distributed at the workshop) were presented during a poster session held on Thursday 27 May 1999 during EuroGP’99. Authors individually retain copyright (and all other rights) to their late-breaking papers. This booklet is available as technical report SEN-R9913 from Centrum voor Wiskunde en Informatica, Kruislaan 413, NL-1098 SJ Amsterdam http://www.cwi.nl/static/publications/reports/reports.htm