Integer Echo State Networks: Hyperdimensional Reservoir Computing
We propose an approximation of Echo State Networks (ESN) that can be
efficiently implemented on digital hardware based on the mathematics of
hyperdimensional computing. The reservoir of the proposed Integer Echo State
Network (intESN) is a vector containing only n-bit integers (where n<8 is
normally sufficient for a satisfactory performance). The recurrent matrix
multiplication is replaced with an efficient cyclic shift operation. The intESN
architecture is verified with typical tasks in reservoir computing: memorizing
a sequence of inputs, classifying time series, and learning dynamic processes.
Such an architecture results in dramatic improvements in memory footprint and
computational efficiency, with minimal performance loss.
Comment: 10 pages, 10 figures, 1 table
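The reservoir update described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the random bipolar input encoding, the clipping threshold `kappa`, and all function names are assumptions introduced here. With `kappa=3` every entry stays in the range -3..3, which fits in 3 bits.

```python
import random

def make_item_memory(symbols, dim, seed=0):
    """Assign each input symbol a random bipolar (+1/-1) vector (assumed encoding)."""
    rng = random.Random(seed)
    return {s: [rng.choice((-1, 1)) for _ in range(dim)] for s in symbols}

def clip(value, kappa):
    """Keep a reservoir entry inside the integer range [-kappa, kappa]."""
    return max(-kappa, min(kappa, value))

def int_esn_step(state, input_vec, kappa):
    """One reservoir update: cyclic shift by one position, add the input, clip."""
    # The cyclic shift stands in for the recurrent matrix multiplication.
    shifted = state[-1:] + state[:-1]
    return [clip(s + u, kappa) for s, u in zip(shifted, input_vec)]

def run_reservoir(sequence, item_memory, dim, kappa=3):
    """Feed a symbol sequence through the reservoir and return the final state."""
    state = [0] * dim
    for symbol in sequence:
        state = int_esn_step(state, item_memory[symbol], kappa)
    return state

if __name__ == "__main__":
    memory = make_item_memory("abc", dim=16)
    print(run_reservoir("abcabca", memory, dim=16))
```

Because the state holds only small integers and the update needs no multiplications, the whole step reduces to a shift, an addition, and a comparison per entry.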
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
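As a rough illustration of count featurization in general (not of Pyramid's own code), the sketch below replaces each categorical value with a smoothed estimate of the positive-label rate computed from count tables. Binary labels, the `smoothing` parameter, and the function names are assumptions made for this example.

```python
from collections import defaultdict

def build_count_tables(rows, labels, feature_names):
    """Count, per feature value, how often each binary label (0 or 1) was seen."""
    tables = {name: defaultdict(lambda: [0, 0]) for name in feature_names}
    for row, label in zip(rows, labels):
        for name in feature_names:
            tables[name][row[name]][label] += 1
    return tables

def featurize(row, tables, feature_names, smoothing=1.0):
    """Replace each categorical value with a smoothed estimate of P(label=1 | value)."""
    features = []
    for name in feature_names:
        # .get() avoids creating an entry for values never seen in the counts.
        neg, pos = tables[name].get(row[name], (0, 0))
        features.append((pos + smoothing) / (neg + pos + 2 * smoothing))
    return features

if __name__ == "__main__":
    rows = [{"ad": "a"}, {"ad": "a"}, {"ad": "b"}]
    labels = [1, 0, 1]
    tables = build_count_tables(rows, labels, ["ad"])
    print(featurize({"ad": "b"}, tables, ["ad"]))
```

The count tables are a compact summary of past observations, so a model trained on the featurized rows needs direct access to far fewer raw records than one trained on the original categorical data.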
Node harvest
When choosing a suitable technique for regression and classification with
multivariate predictor variables, one is often faced with a tradeoff between
interpretability and high predictive accuracy. To give a classical example,
classification and regression trees are easy to understand and interpret. Tree
ensembles like Random Forests usually provide more accurate predictions. Yet
tree ensembles are also more difficult to analyze than single trees and are
often criticized, perhaps unfairly, as "black box" predictors. Node harvest
tries to reconcile the two aims of interpretability and predictive accuracy by
combining positive aspects of trees and tree ensembles. Results are very sparse
and interpretable and predictive accuracy is extremely competitive, especially
for low signal-to-noise data. The procedure is simple: an initial set of a few
thousand nodes is generated randomly. If a new observation falls into just a
single node, its prediction is the mean response of all training observations
within this node, identical to a tree-like prediction. A new observation
typically falls into several nodes, and its prediction is then the weighted average of
the mean responses across all these nodes. The only role of node harvest is to
"pick" the right nodes from the initial large ensemble of nodes by choosing
node weights, which amounts in the proposed algorithm to a quadratic
programming problem with linear inequality constraints. The solution is sparse
in the sense that only very few nodes are selected with a nonzero weight. This
sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to
select a tuning parameter for optimal predictive accuracy. Node harvest can
handle mixed data and missing values and is shown to be simple to interpret and
competitive in predictive accuracy on a variety of data sets.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS367 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
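The prediction rule described in the abstract can be sketched as follows. This covers only the prediction step and assumes the node weights are already given; the quadratic program that selects them is not shown. Representing a node as a predicate function, and the names used, are choices made for this illustration.

```python
def node_means(nodes, X, y):
    """Mean training response inside each node; a node is a predicate on an observation."""
    means = []
    for contains in nodes:
        members = [yi for xi, yi in zip(X, y) if contains(xi)]
        means.append(sum(members) / len(members) if members else 0.0)
    return means

def predict(x, nodes, means, weights):
    """Weighted average of the mean responses of all selected nodes that contain x."""
    active = [(w, m) for contains, m, w in zip(nodes, means, weights)
              if w > 0 and contains(x)]
    total = sum(w for w, _ in active)
    if total == 0:
        raise ValueError("x falls into no node with nonzero weight")
    return sum(w * m for w, m in active) / total

if __name__ == "__main__":
    X = [[0.1], [0.4], [0.6], [0.9]]
    y = [1.0, 1.0, 3.0, 5.0]
    nodes = [lambda x: True, lambda x: x[0] > 0.5, lambda x: x[0] > 0.8]
    weights = [0.5, 0.5, 0.0]  # assumed given; only two nodes are selected
    means = node_means(nodes, X, y)
    print(predict([0.9], nodes, means, weights))
```

An observation that falls into a single selected node gets that node's mean response, which matches the tree-like case mentioned above; nodes with zero weight never contribute.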
Identifying static and dynamic prediction models for NOx emissions with evolving fuzzy systems
Antipollution legislation for automotive internal combustion engines requires active control and prediction of pollutant formation and emissions. Predictive emission models are of great use in the system calibration phase, and can also be integrated into engine control and on-board diagnosis tasks. In this paper, fuzzy modelling of the NOx emissions of a diesel engine is investigated, which overcomes some drawbacks of pure engine mapping or analytical physical-oriented models. For building up the fuzzy NOx prediction models, the FLEXFIS approach (short for FLEXible Fuzzy Inference Systems) is applied, which automatically extracts an appropriate number of rules and fuzzy sets by an evolving version of vector quantization (eVQ) and estimates the consequent parameters of Takagi-Sugeno fuzzy systems with the local learning approach in order to optimize the least squares functional. The predictive power of the fuzzy NOx prediction models is compared with that achieved by physical-oriented models based on high-dimensional engine data recorded during steady-state and dynamic engine states.
This work was supported by the Upper Austrian Technology and Research Promotion. This publication reflects only the author's view. Furthermore, we acknowledge PSA for providing the engine and partially supporting our investigation. Special thanks are given to PO Calendini, P Gaillard and C. Bares at the Diesel Engine Control Department.
Lughofer, E.; Macian Martinez, V.; Guardiola García, C.; Klement, EP. (2011). Identifying static and dynamic prediction models for NOx emissions with evolving fuzzy systems. Applied Soft Computing. 11(2):2487-2500. doi:10.1016/j.asoc.2010.10.004
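To make the local learning idea concrete, the sketch below fits a one-input Takagi-Sugeno system in which each rule gets its own weighted least squares fit, with weights equal to that rule's normalized activation. It is not FLEXFIS: the rule centers and width are fixed by hand rather than extracted by eVQ, the Gaussian membership functions are an assumption, and nothing is updated incrementally.

```python
import math

def memberships(x, centers, width):
    """Normalized Gaussian rule activations for a scalar input x."""
    raw = [math.exp(-0.5 * ((x - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def fit_local(xs, ys, centers, width):
    """Per rule, fit y ~ a + b*x by weighted least squares (weights = rule activation)."""
    params = []
    for i in range(len(centers)):
        w = [memberships(x, centers, width)[i] for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        b = sxy / sxx
        params.append((my - b * mx, b))  # (intercept, slope) of this rule
    return params

def ts_predict(x, centers, width, params):
    """Takagi-Sugeno output: activation-weighted sum of the local linear models."""
    psi = memberships(x, centers, width)
    return sum(p * (a + b * x) for p, (a, b) in zip(psi, params))

if __name__ == "__main__":
    xs = [i * 0.25 for i in range(9)]
    ys = [math.sin(x) for x in xs]
    centers, width = [0.5, 1.5], 0.5  # hand-picked for illustration
    params = fit_local(xs, ys, centers, width)
    print(ts_predict(1.0, centers, width, params))
```

Fitting each rule separately keeps every consequent interpretable as a local linear model around its rule center, at the cost of not minimizing the global squared error directly.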