    Edge-centric inferential modeling & analytics

    This work contributes a real-time, edge-centric inferential modeling and analytics methodology, introducing fundamental mechanisms for (i) predictive model updating and (ii) diverse model selection in distributed computing. Our objective in edge-centric analytics is time-optimized model caching and selective forwarding at the network edge using optimal stopping theory; communication overhead is significantly reduced because only inferred knowledge and sufficient statistics, rather than raw data, are delivered, while high quality of analytics is preserved. Novel model selection algorithms are introduced that fuse the inherent diversity of models across distributed edge nodes to support real-time inferential analytics tasks for end-users, analysts, and applications. We provide statistical learning models and the corresponding mathematical analyses of our mechanisms, along with a comprehensive performance and comparative assessment using real data from different domains that shows the benefits of our approach in edge computing.
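
    As a concrete illustration of the selective-forwarding idea, the following is a minimal sketch assuming a simple threshold-type stopping rule as a stand-in for the paper's optimal-stopping analysis; the names (EdgeCache, tolerance, decay) are illustrative assumptions, not the paper's API. An edge node keeps serving a cached model and forwards an update (sufficient statistics rather than raw data) only once its discounted prediction error crosses the stopping boundary.

        # Minimal sketch: threshold-type stopping rule for model forwarding at an
        # edge node. All names here are illustrative assumptions.
        import numpy as np

        class EdgeCache:
            def __init__(self, dim, tolerance=5.0, decay=0.95):
                self.w = np.zeros(dim)      # cached linear model
                self.err = 0.0              # discounted cumulative prediction error
                self.tolerance = tolerance  # stands in for the optimal-stopping boundary
                self.decay = decay

            def observe(self, x, y):
                """Score one sample; return True when the node should stop waiting
                and forward a model update upstream."""
                self.err = self.decay * self.err + (y - self.w @ x) ** 2
                if self.err > self.tolerance:
                    self.err = 0.0
                    return True  # ship sufficient statistics, not raw data
                return False

        # Usage: stream samples through the cache; retrain only when it fires.
        rng = np.random.default_rng(0)
        cache = EdgeCache(dim=3)
        for _ in range(100):
            x = rng.normal(size=3)
            y = x @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1)
            if cache.observe(x, y):
                print("forward model update / sufficient statistics")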

    Best Linear Unbiased Estimation Fusion with Constraints

    Estimation fusion, or data fusion for estimation, is the problem of how best to utilize the useful information contained in multiple data sets to estimate an unknown quantity, a parameter or a process. Estimation fusion with constraints gives rise to challenging theoretical problems given observations from multiple geometrically dispersed sensors: Under dimensionality constraints, how should data be preprocessed at each local sensor to achieve the best estimation accuracy at the fusion center? Under communication bandwidth constraints, how should local sensor data be quantized to minimize the estimation error at the fusion center? Under storage constraints, how should state estimates at the fusion center be optimally updated with out-of-sequence measurements (OOSM), and how can the OOSM update algorithm be applied to multi-sensor multi-target tracking in clutter? The present work addresses these topics by applying best linear unbiased estimation (BLUE) fusion. We propose optimal data compression that reduces sensor data from a higher dimension to a lower dimension with minimal or no performance loss at the fusion center. For single-sensor and some particular multiple-sensor systems, we obtain the explicit optimal compression rule. For a multi-sensor system with a general dimensionality requirement, we propose a Gauss-Seidel iterative algorithm to search for the optimal compression rule. Another way to accomplish sensor data compression is to find an optimal sensor quantizer. Using BLUE fusion rules, we develop optimal sensor data quantization schemes that respect the bit-rate constraints on communication between each sensor and the fusion center. For a dynamic system, we also establish how to perform state estimation and sensor quantization updates simultaneously, along with a closed-form recursion for a linear system with additive white Gaussian noise. A globally optimal OOSM update algorithm and a constrained optimal update algorithm are derived to solve one-lag as well as multi-lag OOSM update problems. To extend the OOSM update algorithms to multi-sensor multi-target tracking in clutter, we also study the performance of the OOSM update associated with the Probabilistic Data Association (PDA) algorithm.
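
    For intuition, a textbook special case of the BLUE fusion rule is the fusion of two unbiased, mutually uncorrelated local estimates, where the fused covariance is the harmonic combination of the local covariances. The sketch below implements only this special case, not the thesis's general (possibly correlated, constrained) setting.

        # BLUE fusion of two unbiased, uncorrelated local estimates x1, x2 with
        # covariances P1, P2 (textbook special case only).
        import numpy as np

        def blue_fuse(x1, P1, x2, P2):
            """P = (P1^-1 + P2^-1)^-1,  x = P @ (P1^-1 x1 + P2^-1 x2)."""
            P1i, P2i = np.linalg.inv(P1), np.linalg.inv(P2)
            P = np.linalg.inv(P1i + P2i)
            return P @ (P1i @ x1 + P2i @ x2), P

        # Two sensors estimating the same 2-D state with different accuracies.
        x1, P1 = np.array([1.0, 2.0]), np.diag([1.0, 4.0])
        x2, P2 = np.array([1.2, 1.8]), np.diag([2.0, 1.0])
        x, P = blue_fuse(x1, P1, x2, P2)
        print(x, np.diag(P))  # fused variances are smaller than either input's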

    Structural Generative Descriptions for Temporal Data

    In data mining problems, the representation or description of data plays a fundamental role, since it defines the set of essential properties for the extraction and characterisation of patterns. For temporal data such as time series and data streams, however, one outstanding issue when developing mining algorithms is finding an appropriate data description or representation. In this thesis, two novel domain-independent representation frameworks for temporal data, suitable for off-line and online mining tasks, are formulated. First, a domain-independent temporal data representation framework is developed, based on a novel data description strategy that combines structural and statistical pattern recognition approaches. The key idea is to move the structural pattern recognition problem to the probability domain. The framework comprises three general tasks: a) decomposing input temporal patterns into subpatterns in time or in a transformed domain (for instance, the wavelet domain); b) mapping these subpatterns into the probability domain to find attributes of elemental probability subpatterns called primitives; and c) mining input temporal patterns according to the attributes of their corresponding probability-domain subpatterns. This framework is referred to as Structural Generative Descriptions (SGDs). Two off-line and two online algorithmic instantiations of the SGDs framework are then formulated: i) for the off-line case, the first instantiation is based on the Discrete Wavelet Transform (DWT) and Wavelet Density Estimators (WDE), while the second combines the DWT with finite Gaussian mixtures; ii) for the online case, the first instantiation relies on an online implementation of the DWT and a recursive version of WDE (RWDE), whereas the second is based on a multi-resolution exponentially weighted moving average filter and RWDE. The proposed SGDs-based algorithms are evaluated empirically in the context of time series classification for the off-line algorithms, and of change detection and clustering for the online algorithms, using synthetic and publicly available real-world data. Additionally, a novel framework for diagnosing the evolution of multidimensional data streams is formulated, incorporating RWDE into the context of Velocity Density Estimation (VDE). Changes in streaming data and in their correlation structure are characterised by means of local and global evolution coefficients as well as recursive correlation coefficients. The proposed VDE framework is evaluated using temperature data from the UK and air pollution data from Hong Kong.
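
    A minimal sketch of steps a) and b) of the SGDs pipeline, assuming PyWavelets for the DWT and using a Gaussian kernel density estimate as a stand-in for the thesis's wavelet density estimators:

        # Decompose a temporal pattern with the DWT, then map each subband into
        # the probability domain; the density evaluations on a fixed grid act as
        # the probability-domain attributes used for mining.
        import numpy as np
        import pywt
        from scipy.stats import gaussian_kde

        signal = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * np.random.randn(256)

        # a) subpatterns in the wavelet domain: [cA3, cD3, cD2, cD1]
        coeffs = pywt.wavedec(signal, 'db4', level=3)

        # b) one density estimate per subband, evaluated on a common grid
        grid = np.linspace(-2, 2, 64)
        attributes = np.concatenate([gaussian_kde(c)(grid) for c in coeffs])
        print(attributes.shape)  # fixed-length probability-domain description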

    Aircraft Fault Detection Using Real-Time Frequency Response Estimation

    A real-time method for estimating time-varying aircraft frequency responses from input and output measurements was demonstrated. The Bat-4 subscale airplane was used with NASA Langley Research Center's AirSTAR unmanned aerial flight test facility to conduct flight tests and collect data for dynamic modeling. Orthogonal phase-optimized multisine inputs, summed with pilot stick and pedal inputs, were used to excite the responses. The aircraft was tested in its normal configuration and with emulated failures, which included a stuck left ruddervator and an increased command-path latency. No prior knowledge of a dynamic model was used or available for the estimation. The longitudinal short-period dynamics were investigated in this work. Time-varying frequency responses and stability margins were tracked well using a 20-second sliding window of data, as compared to a post-flight analysis using output-error parameter estimation and a low-order equivalent system model. This method could be used in a real-time fault detection system, or for other applications of dynamic modeling such as real-time verification of stability margins during envelope expansion tests.
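
    As a rough illustration (not the flight-test implementation), a frequency response can be estimated from a sliding window of input/output data as H(f) = S_uy(f) / S_uu(f) using standard Welch/CSD spectral estimates; the flight-test method instead exploits the known multisine excitation frequencies. The signals and parameters below are synthetic assumptions.

        # Frequency response estimate over a 20-second data window.
        import numpy as np
        from scipy.signal import csd, welch

        fs = 50.0                     # sample rate [Hz], illustrative
        t = np.arange(0, 20, 1 / fs)  # 20-second sliding window
        u = np.sin(2 * np.pi * 1.0 * t) + np.sin(2 * np.pi * 2.5 * t)  # excitation
        y = 0.8 * np.roll(u, 5) + 0.05 * np.random.randn(t.size)       # response

        f, Suy = csd(u, y, fs=fs, nperseg=256)   # cross-spectral density
        _, Suu = welch(u, fs=fs, nperseg=256)    # input auto-spectral density
        H = Suy / Suu                            # complex frequency response

        gain_db = 20 * np.log10(np.abs(H))       # gain and phase feed the
        phase_deg = np.degrees(np.angle(H))      # stability-margin tracking
        print(gain_db[:3], phase_deg[:3])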

    Adaptive Regression Methods with Application to Streaming Financial Data

    This thesis is concerned with the analysis of adaptive incremental regression algorithms for data streams. The development of these algorithms is motivated by issues pertaining to financial data streams, which are very noisy, non-stationary, and exhibit high degrees of dependence. These incremental regression techniques are subsequently used to develop efficient and adaptive algorithms for portfolio allocation. We develop a number of temporally incremental regression algorithms with the following attributes: efficiency (the algorithms are iterative), robustness (the algorithms have built-in safeguards against outliers and/or use regularisation techniques to mitigate estimation error), and adaptiveness (the algorithms' estimates adapt to the underlying streaming data). These algorithms build on known regression techniques: EWRLS (Exponentially Weighted Recursive Least Squares), TSVD (Truncated Singular Value Decomposition), and FLS (Flexible Least Squares). We focus most of our attention on a proposed robust version of the EWRLS algorithm, denoted R-EWRLS, and assess its robustness using a purpose-built simulation engine. This engine generates correlated data streams whose drift and correlation change over time and which can be subjected to randomly generated outliers of varying magnitude and direction. The R-EWRLS algorithm is developed further to allow a self-tuned forgetting factor in the formulation. The forgetting factor is an important tool for accommodating non-stationarity in the data: its exponential decay profile assigns more weight to more recent data. The new algorithm is assessed against the R-EWRLS algorithm using various performance measures. A number of applications with real data from equities and foreign exchange are used, and various measures are computed to compare our algorithms to established portfolio allocation techniques. The results are promising and in many cases outperform benchmark allocation techniques.
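
    A minimal sketch of the core EWRLS recursion with forgetting factor lam (the thesis's robust R-EWRLS adds outlier safeguards and regularisation not shown here):

        # One exponentially weighted recursive least squares (EWRLS) update.
        import numpy as np

        def ewrls_step(theta, P, x, y, lam=0.99):
            """lam < 1 discounts old data, adapting to non-stationary streams."""
            Px = P @ x
            k = Px / (lam + x @ Px)              # gain vector
            theta = theta + k * (y - x @ theta)  # correct by prediction residual
            P = (P - np.outer(k, Px)) / lam      # update covariance-like matrix
            return theta, P

        # Usage: regress a streaming target on three noisy predictors.
        rng = np.random.default_rng(1)
        theta, P = np.zeros(3), 100.0 * np.eye(3)
        for _ in range(500):
            x = rng.normal(size=3)
            y = x @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.1)
            theta, P = ewrls_step(theta, P, x, y)
        print(theta)  # approaches [0.5, -1.0, 2.0]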