A divide and conquer method for symbolic regression
Symbolic regression aims to find a function that best explains the
relationship between independent variables and the objective value based on a
given set of sample data. Genetic programming (GP) is usually considered an
appropriate method for the problem since it can optimize functional structure
and coefficients simultaneously. However, the convergence speed of GP might be
too slow for large scale problems that involve a large number of variables.
Fortunately, in many applications, the target function is separable or
partially separable. This feature motivated us to develop a new method, divide
and conquer (D&C), for symbolic regression, in which the target function is
divided into a number of sub-functions and the sub-functions are then
determined by any GP algorithm. Separability is probed by a newly
proposed technique, the Bi-Correlation test (BiCT). D&C-powered GP has been tested
on some real-world applications, and the study shows that D&C can help GP
find the target function much more rapidly.
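The abstract does not spell out how BiCT works, so the following is only a minimal sketch of a correlation-based separability probe in the same spirit, not the authors' method. It assumes a two-variable function f(x1, x2): if f is additively or multiplicatively separable, then fixing x2 at two different values and sampling x1 yields two output series that are linearly related, so their Pearson correlation has magnitude 1. The names `pearson` and `separability_score`, the sampling range, and the fixed values `b0` and `b1` are all illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        # A constant series is trivially a linear function of the other one.
        return 1.0
    return sxy / math.sqrt(sxx * syy)


def separability_score(f, b0=-1.0, b1=1.0, n_samples=200, seed=0):
    """Return |r| between f(x1, b0) and f(x1, b1) over random x1 in [-1, 1].

    Hypothetical probe: a score near 1 is consistent with
    f(x1, x2) = g(x1) + h(x2) or f(x1, x2) = g(x1) * h(x2).
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    y0 = [f(x, b0) for x in xs]
    y1 = [f(x, b1) for x in xs]
    return abs(pearson(y0, y1))


additive = separability_score(lambda a, b: a ** 2 + math.sin(b))
multiplicative = separability_score(lambda a, b: a * math.exp(b))
coupled = separability_score(lambda a, b: (a - b) ** 2)
```

A score near 1 is a necessary condition under these assumptions, not a proof of separability; a practical test would probably repeat the probe for several pairs of fixed values and tolerate noise in the data.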
Predicting the energy output of wind farms based on weather data: important variables and their correlation
Pre-print available at: http://arxiv.org/abs/1109.1922
Wind energy plays an increasing role in the supply of energy worldwide. The energy output of a wind farm is highly dependent on the weather conditions present at its site. If the output can be predicted more accurately, energy suppliers can coordinate the collaborative production of different energy sources more efficiently to avoid costly overproduction. In this paper, we take a computer science perspective on energy prediction based on weather data and analyze the important parameters as well as their correlation with the energy output. To deal with the interaction of the different parameters, we use symbolic regression based on the genetic programming tool DataModeler. Our studies are carried out on publicly available weather and energy data for a wind farm in Australia. We report on the correlation of the different variables with the energy output. The model obtained for energy prediction gives a very reliable prediction of the energy output for newly supplied weather data. © 2012 Elsevier Ltd.
Ekaterina Vladislavleva, Tobias Friedrich, Frank Neumann, Markus Wagne
Interpretable Categorization of Heterogeneous Time Series Data
Understanding heterogeneous multivariate time series data is important in
many applications ranging from smart homes to aviation. Learning models of
heterogeneous multivariate time series that are also human-interpretable is
challenging and not adequately addressed by the existing literature. We propose
grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs
extend decision trees with a grammar framework. Logical expressions derived
from a context-free grammar are used for branching in place of simple
thresholds on attributes. The added expressivity enables support for a wide
range of data types while retaining the interpretability of decision trees. In
particular, when a grammar based on temporal logic is used, we show that GBDTs
can be used for the interpretable classification of high-dimensional and
heterogeneous time series data. Furthermore, we show how GBDTs can also be used
for categorization, which is a combination of clustering and generating
interpretable explanations for each cluster. We apply GBDTs to analyze the
classic Australian Sign Language dataset as well as data on near mid-air
collisions (NMACs). The NMAC data comes from aircraft simulations used in the
development of the next-generation Airborne Collision Avoidance System (ACAS
X).
Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data
Mining (SDM) 201
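To illustrate branching on grammar-derived logical expressions instead of attribute thresholds, here is a minimal sketch of a single split. The toy grammar (`always` / `eventually` applied to a comparison on one feature), the exhaustive enumeration, the Gini criterion, and all function names are assumptions made for illustration; the abstract does not say how GBDTs search the grammar or score candidate expressions.

```python
from itertools import product

# Hypothetical toy grammar over a multivariate time series record:
#   expr := op "(" feature cmp threshold ")"
#   op   := "always" | "eventually"
#   cmp  := "<" | ">"


def make_expr(op, feat, cmp, thresh):
    """Build (label, predicate) for one expression derived from the toy grammar."""
    test = (lambda v: v < thresh) if cmp == "<" else (lambda v: v > thresh)
    quantifier = all if op == "always" else any
    label = f"{op}({feat} {cmp} {thresh})"
    return label, lambda record: quantifier(test(v) for v in record[feat])


def enumerate_exprs(feats, thresholds):
    """Enumerate every expression the toy grammar can derive."""
    ops, cmps = ("always", "eventually"), ("<", ">")
    for op, feat, cmp, t in product(ops, feats, cmps, thresholds):
        yield make_expr(op, feat, cmp, t)


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(records, labels, feats, thresholds):
    """Pick the expression with the lowest weighted Gini impurity."""
    best = None
    for name, pred in enumerate_exprs(feats, thresholds):
        left = [y for r, y in zip(records, labels) if pred(r)]
        right = [y for r, y in zip(records, labels) if not pred(r)]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, name, pred)
    return best


records = [{"alt": [5, 6, 7]}, {"alt": [5, 1, 7]}, {"alt": [8, 9, 9]}, {"alt": [2, 1, 3]}]
labels = ["high", "low", "high", "low"]
score, rule, _ = best_split(records, labels, feats=["alt"], thresholds=[4])
```

The chosen rule is itself a readable statement about the whole series, which is the interpretability property the abstract emphasizes; a full tree would apply `best_split` recursively to each side.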
Late-Breaking Papers of EuroGP-99
This booklet contains the late-breaking papers of the Second European Workshop on Genetic Programming (EuroGP’99), held in Göteborg, Sweden, 26–27 May 1999. EuroGP’99 was one of the EvoNet workshops on evolutionary computing, EvoWorkshops’99. The purpose of the late-breaking papers was to provide attendees with information about research that was initiated, enhanced, improved, or completed after the original paper submission deadline in December 1998. To ensure coverage of the most up-to-date research, the deadline for submission was set only a month before the workshop. Late-breaking papers were examined for relevance and quality by the organisers of EuroGP’99, but no formal review process took place. The 3 late-breaking papers in this booklet (which was distributed at the workshop) were presented during a poster session held on Thursday 27 May 1999 during EuroGP’99. Authors individually retain copyright (and all other rights) to their late-breaking papers. This booklet is available as technical report SEN-R9913 from Centrum voor Wiskunde en Informatica, Kruislaan 413, NL-1098 SJ Amsterdam http://www.cwi.nl/static/publications/reports/reports.htm