Elite Bases Regression: A Real-time Algorithm for Symbolic Regression
Symbolic regression is an important but challenging research topic in data
mining. It can detect the underlying mathematical models. Genetic programming
(GP) is one of the most popular methods for symbolic regression. However, its
convergence speed might be too slow for large-scale problems with a large
number of variables. This drawback has become a bottleneck in practical
applications. In this paper, a new non-evolutionary real-time algorithm for
symbolic regression, Elite Bases Regression (EBR), is proposed. EBR generates a
set of candidate basis functions coded with parse-matrix in specific mapping
rules. Meanwhile, a certain number of elite bases are preserved and updated
iteratively according to the correlation coefficients with respect to the
target model. The regression model is then spanned by the elite bases. A
comparative study between EBR and a recently proposed machine learning method for
symbolic regression, Fast Function eXtraction (FFX), is conducted. Numerical
results indicate that EBR can solve symbolic regression problems more
effectively.
Comment: The 2017 13th International Conference on Natural Computation, Fuzzy
Systems and Knowledge Discovery (ICNC-FSKD 2017
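The elite-basis idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the parse-matrix encoding is replaced by a small hand-written candidate library, and selection is done in a single pass rather than iteratively. The names `candidate_bases`, `elite_bases_fit`, and `n_elite` are hypothetical.

```python
import numpy as np

def candidate_bases(X):
    """Build a hypothetical candidate library: a few unary functions of each column."""
    funcs = {"id": lambda v: v, "sq": lambda v: v ** 2, "sin": np.sin, "cos": np.cos}
    names, cols = [], []
    for j in range(X.shape[1]):
        for fname, f in funcs.items():
            names.append(f"{fname}(x{j})")
            cols.append(f(X[:, j]))
    return names, np.column_stack(cols)

def elite_bases_fit(X, y, n_elite=2):
    """Keep the n_elite candidates most correlated with y, then fit a linear model."""
    names, B = candidate_bases(X)
    # Absolute Pearson correlation of every candidate column with the target.
    Bc = B - B.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Bc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Bc.T @ yc) / np.where(denom > 0, denom, np.inf)
    elite = np.argsort(corr)[::-1][:n_elite]
    # The model is spanned by the elite bases plus a constant term (an assumption here).
    A = np.column_stack([np.ones(len(y)), B[:, elite]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return [names[i] for i in elite], coef, A @ coef
```

On data generated from `y = 2 * x0**2 + 1`, this sketch would be expected to rank `sq(x0)` first and reproduce the target with the fitted linear combination.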
Temporal Feature Selection with Symbolic Regression
Building and discovering useful features when constructing machine learning models is the central task for the machine learning practitioner. Good features are useful not only in increasing the predictive power of a model but also in illuminating the underlying drivers of a target variable. In this research we propose a novel feature learning technique in which symbolic regression is endowed with a "Range Terminal" that allows it to explore functions of the aggregate of variables over time. We test the Range Terminal on a synthetic data set and a real-world data set in which we predict seasonal greenness using satellite-derived temperature and snow data over a portion of the Arctic. On the synthetic data set we find that symbolic regression with the Range Terminal outperforms standard symbolic regression and Lasso regression. On the Arctic data set we find that it outperforms standard symbolic regression and fails to beat Lasso regression, but finds useful features describing the interaction between Land Surface Temperature, Snow, and seasonal vegetative growth in the Arctic.
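One way to picture a terminal that aggregates a variable over a time range is the sketch below. It is an assumption about the general shape of such a terminal, not the authors' definition: the function name `range_terminal`, its parameters, and the choice of a half-open index window are all hypothetical.

```python
import numpy as np

def range_terminal(X, start, stop, agg=np.mean):
    """Aggregate each sample's time series over the window [start, stop).

    X has shape (n_samples, n_time_steps); the result has shape (n_samples,).
    In a symbolic regression search, start, stop, and agg would be evolved
    parameters; here they are passed in by hand.
    """
    return agg(X[:, start:stop], axis=1)

# Two samples with six time steps each.
X = np.arange(12.0).reshape(2, 6)
feature = range_terminal(X, 1, 4)  # mean over time steps 1, 2, 3
```

The resulting vector can then be used like any other input variable inside a candidate expression.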
Using Markov Models and Statistics to Learn, Extract, Fuse, and Detect Patterns in Raw Data
Many systems are partially stochastic in nature. We have derived data-driven
approaches for extracting stochastic state machines (Markov models) directly
from observed data. This chapter provides an overview of our approach with
numerous practical applications. We have used this approach for inferring
shipping patterns, exploiting computer system side-channel information, and
detecting botnet activities. For contrast, we include a related data-driven
statistical inference approach that detects and localizes radiation sources.
Comment: Accepted by 2017 International Symposium on Sensor Networks, Systems
and Securit
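As a rough illustration of extracting a Markov model from observed data, the sketch below estimates first-order transition probabilities by counting consecutive symbol pairs. This is the textbook maximum-likelihood estimate and only an assumption about the flavor of the approach; the chapter's actual inference procedure is not described in the abstract, and `estimate_transitions` is a hypothetical name.

```python
from collections import Counter, defaultdict

def estimate_transitions(sequence):
    """Estimate P(next | current) from one observed sequence of discrete symbols."""
    counts = defaultdict(Counter)
    # Count every observed (current, next) pair.
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[cur][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }

P = estimate_transitions("AABAB")
```

For the sequence `AABAB`, state `A` is followed once by `A` and twice by `B`, and state `B` is always followed by `A`.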