1,468 research outputs found

    Integer Echo State Networks: Hyperdimensional Reservoir Computing

    We propose an approximation of Echo State Networks (ESN) that can be efficiently implemented on digital hardware based on the mathematics of hyperdimensional computing. The reservoir of the proposed Integer Echo State Network (intESN) is a vector containing only n-bit integers (where n<8 is normally sufficient for satisfactory performance). The recurrent matrix multiplication is replaced with an efficient cyclic shift operation. The intESN architecture is verified on typical reservoir computing tasks: memorizing a sequence of inputs, classifying time series, and learning dynamic processes. Such an architecture results in dramatic improvements in memory footprint and computational efficiency, with minimal performance loss. Comment: 10 pages, 10 figures, 1 table
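
    The reservoir update described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clipping threshold `kappa`, the random bipolar `codebook` used to encode input symbols, and the function names are assumptions introduced here for illustration. Only the integer-valued state and the cyclic shift come from the abstract.

```python
import numpy as np


def int_esn_step(state, input_hv, kappa):
    """One reservoir update: cyclic shift, add the encoded input, clip.

    The cyclic shift stands in for the recurrent matrix multiplication.
    Clipping to [-kappa, kappa] is an assumption made here to keep every
    entry within a few bits.
    """
    shifted = np.roll(state, 1)
    return np.clip(shifted + input_hv, -kappa, kappa)


def run_reservoir(symbols, codebook, kappa=3):
    """Feed a sequence of symbol indices through the integer reservoir."""
    state = np.zeros(codebook.shape[1], dtype=np.int8)
    for s in symbols:
        state = int_esn_step(state, codebook[s], kappa)
    return state


# Hypothetical setup: 4 input symbols, each mapped to a random +/-1 vector.
rng = np.random.default_rng(0)
codebook = rng.choice(np.array([-1, 1], dtype=np.int8), size=(4, 1000))
final_state = run_reservoir([0, 2, 1, 3], codebook, kappa=3)
```

    With `kappa=3` every entry of the state stays in the range -3..3, so the whole reservoir fits in small signed integers and the update needs no multiplications.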

    Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization

    Protecting vast quantities of data poses a daunting challenge for the growing number of organizations that collect, stockpile, and monetize it. The ability to distinguish data that is actually needed from data collected "just in case" would help these organizations to limit the latter's exposure to attack. A natural approach might be to monitor data use and retain only the working set of in-use data in accessible storage; unused data can be evicted to a highly protected store. However, many of today's big data applications rely on machine learning (ML) workloads that are periodically retrained by accessing, and thus exposing to attack, the entire data store. Training set minimization methods, such as count featurization, are often used to limit the data needed to train ML workloads to improve performance or scalability. We present Pyramid, a limited-exposure data management system that builds upon count featurization to enhance data protection. As such, Pyramid uniquely introduces both the idea and proof-of-concept for leveraging training set minimization methods to instill rigor and selectivity into big data management. We integrated Pyramid into Spark Velox, a framework for ML-based targeting and personalization. We evaluate it on three applications and show that Pyramid approaches state-of-the-art models while training on less than 1% of the raw data.
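
    Count featurization, as generally understood, replaces a categorical value with statistics of how often that value has co-occurred with each label, so that a model can train on compact count tables instead of the raw records. The sketch below is a generic illustration of that idea and not Pyramid's implementation; the class name `CountTable`, the binary labels, and the add-one smoothing are assumptions made here.

```python
from collections import defaultdict


class CountTable:
    """Per-value label counts for one categorical feature (binary labels)."""

    def __init__(self):
        # value -> [count with label 0, count with label 1]
        self.counts = defaultdict(lambda: [0, 0])

    def update(self, value, label):
        """Record one historical observation of (value, label)."""
        self.counts[value][label] += 1

    def featurize(self, value):
        """Return (total count, smoothed rate of label 1) for a value.

        Add-one smoothing is an assumption of this sketch; it keeps unseen
        values at a neutral rate of 0.5.
        """
        neg, pos = self.counts.get(value, (0, 0))
        total = neg + pos
        return total, (pos + 1) / (total + 2)


# Hypothetical click history: (ad identifier, clicked?)
history = [("ad_1", 1), ("ad_1", 0), ("ad_1", 1), ("ad_2", 0)]
table = CountTable()
for value, label in history:
    table.update(value, label)

seen = table.featurize("ad_1")    # (3, 0.6)
unseen = table.featurize("ad_3")  # (0, 0.5)
```

    Once the table is built, only the counts and a small recent window of raw data are needed for training, which is the property the abstract relies on to limit exposure.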

    Node harvest

    When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret. Tree ensembles like Random Forests usually provide more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as 'black box' predictors. Node harvest tries to reconcile the two aims of interpretability and predictive accuracy by combining positive aspects of trees and tree ensembles. Results are very sparse and interpretable, and predictive accuracy is extremely competitive, especially for low signal-to-noise data. The procedure is simple: an initial set of a few thousand nodes is generated randomly. If a new observation falls into just a single node, its prediction is the mean response of all training observations within this node, identical to a tree-like prediction. A new observation typically falls into several nodes, and its prediction is then the weighted average of the mean responses across all these nodes. The only role of node harvest is to 'pick' the right nodes from the initial large ensemble of nodes by choosing node weights, which in the proposed algorithm amounts to a quadratic programming problem with linear inequality constraints. The solution is sparse in the sense that only very few nodes are selected with a nonzero weight. This sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to select a tuning parameter for optimal predictive accuracy. Node harvest can handle mixed data and missing values and is shown to be simple to interpret and competitive in predictive accuracy on a variety of data sets. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS367 by the Institute of Mathematical Statistics (http://www.imstat.org)
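
    The prediction rule described in this abstract (a weighted average of mean responses over all selected nodes that contain the observation) can be sketched as follows. This covers prediction only: the node weights are taken as given rather than obtained from the quadratic program, and the `Node` class, the example nodes, and the error raised for uncovered observations are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Node:
    contains: Callable[[Sequence[float]], bool]  # membership test for an observation
    mean_response: float                         # mean training response inside the node
    weight: float                                # nonnegative weight (given here, not fitted)


def predict(nodes, x):
    """Weighted average of mean responses over selected nodes containing x."""
    active = [n for n in nodes if n.weight > 0 and n.contains(x)]
    total = sum(n.weight for n in active)
    if total == 0:
        raise ValueError("observation is not covered by any selected node")
    return sum(n.weight * n.mean_response for n in active) / total


# Hypothetical nodes: a root node covering everything plus two sub-regions.
nodes = [
    Node(lambda x: True, mean_response=2.0, weight=0.5),
    Node(lambda x: x[0] > 1.0, mean_response=4.0, weight=1.5),
    Node(lambda x: x[1] < 0.0, mean_response=10.0, weight=0.0),  # not selected
]

several_nodes = predict(nodes, [2.0, -1.0])  # (0.5*2.0 + 1.5*4.0) / 2.0 = 3.5
single_node = predict(nodes, [0.0, 5.0])     # only the root node applies: 2.0
```

    The second call shows the tree-like special case from the abstract: when an observation falls into a single selected node, the prediction is just that node's mean response.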

    Identifying static and dynamic prediction models for NOx emissions with evolving fuzzy systems

    Antipollution legislation in automotive internal combustion engines requires active control and prediction of pollutant formation and emissions. Predictive emission models are of great use in the system calibration phase and can also be integrated into engine control and on-board diagnosis tasks. In this paper, fuzzy modelling of the NOx emissions of a diesel engine is investigated, which overcomes some drawbacks of pure engine mapping or analytical physical-oriented models. To build the fuzzy NOx prediction models, the FLEXFIS approach (short for FLEXible Fuzzy Inference Systems) is applied, which automatically extracts an appropriate number of rules and fuzzy sets by an evolving version of vector quantization (eVQ) and estimates the consequent parameters of Takagi-Sugeno fuzzy systems with the local learning approach in order to optimize the least squares functional. The predictive power of the fuzzy NOx prediction models is compared with that achieved by physical-oriented models based on high-dimensional engine data recorded during steady-state and dynamic engine states.
    This work was supported by the Upper Austrian Technology and Research Promotion. This publication reflects only the author's view. Furthermore, we acknowledge PSA for providing the engine and partially supporting our investigation. Special thanks are given to PO Calendini, P Gaillard and C. Bares at the Diesel Engine Control Department.
    Lughofer, E.; Macian Martinez, V.; Guardiola García, C.; Klement, EP. (2011). Identifying static and dynamic prediction models for NOx emissions with evolving fuzzy systems. Applied Soft Computing. 11(2):2487-2500. doi:10.1016/j.asoc.2010.10.004
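
    A Takagi-Sugeno fuzzy system with locally estimated consequents can be illustrated with a small batch sketch. It is not FLEXFIS itself: the rule centers and widths are fixed by hand instead of being extracted by eVQ, the fit is done in one batch rather than incrementally, and the Gaussian membership functions, the function names, and the toy data are assumptions made here. Each rule gets its own weighted least-squares fit, with the normalized rule activations as weights.

```python
import numpy as np


def activations(X, centers, widths):
    """Normalized Gaussian rule activations, shape (n_samples, n_rules)."""
    z = (X[:, None, :] - centers[None, :, :]) / widths[None, :, :]
    mu = np.exp(-0.5 * (z ** 2).sum(axis=2))
    return mu / mu.sum(axis=1, keepdims=True)


def fit_consequents_local(X, y, centers, widths):
    """Local learning: one weighted least-squares fit per rule."""
    psi = activations(X, centers, widths)
    R = np.hstack([np.ones((len(X), 1)), X])  # regressors with intercept
    params = []
    for i in range(centers.shape[0]):
        sw = np.sqrt(psi[:, i])               # square roots of the sample weights
        w, *_ = np.linalg.lstsq(R * sw[:, None], y * sw, rcond=None)
        params.append(w)
    return np.array(params)                   # shape (n_rules, n_inputs + 1)


def ts_predict(X, centers, widths, params):
    """Takagi-Sugeno output: activation-weighted sum of local linear models."""
    psi = activations(X, centers, widths)
    R = np.hstack([np.ones((len(X), 1)), X])
    return (psi * (R @ params.T)).sum(axis=1)


# Toy data (hypothetical): a linear target that every local model can fit exactly.
X = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0
centers = np.array([[0.25], [0.75]])
widths = np.array([[0.3], [0.3]])

params = fit_consequents_local(X, y, centers, widths)
y_hat = ts_predict(X, centers, widths, params)
```

    Because each rule is fitted separately, adding or changing one rule does not require re-estimating the consequents of the others, which is the usual motivation for local over global learning.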