8 research outputs found

    A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks

    Full text link
    We study the problem of scheduling sensors in a resource-constrained linear dynamical system, where the objective is to select a small subset of sensors from a large network to perform the state estimation task. We formulate this problem as the maximization of a monotone set function under a matroid constraint. We propose a randomized greedy algorithm that is significantly faster than state-of-the-art methods. By introducing the notion of curvature, which quantifies how close a function is to being submodular, we analyze the performance of the proposed algorithm and find a bound on the expected mean square error (MSE) of the estimator that uses the selected sensors, expressed in terms of the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for the scenario where the measurements are i.i.d. random vectors with bounded ℓ₂ norm. Simulation results demonstrate the efficacy of the randomized greedy algorithm in comparison with greedy and semidefinite programming relaxation methods.
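    The core idea of a randomized greedy selection, as described in the abstract, can be sketched as follows: instead of evaluating the marginal gain of every remaining candidate at each step (as standard greedy does), evaluate only a small random sample and add the best element from it. This is a minimal illustrative sketch, not the paper's exact construction; the `gain` callback and the `sample_size` parameter are assumptions introduced here for illustration.

    ```python
    import random

    def randomized_greedy(ground_set, k, gain, sample_size):
        """Select k elements greedily, but at each step evaluate the marginal
        gain of only a random sample of the remaining candidates.
        gain(S, e) returns the marginal value of adding e to the current set S.
        """
        selected = []
        remaining = set(ground_set)
        for _ in range(k):
            # Sample a subset of candidates; with sample_size >= |remaining|
            # this degenerates to ordinary greedy selection.
            sample = random.sample(sorted(remaining),
                                   min(sample_size, len(remaining)))
            best = max(sample, key=lambda e: gain(selected, e))
            selected.append(best)
            remaining.remove(best)
        return selected
    ```

    Shrinking `sample_size` trades solution quality for speed, which is the source of the speed-up over full greedy that the abstract reports; the curvature analysis in the paper is what bounds the resulting loss in MSE.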

    kD-STR: a method for spatio-temporal data reduction and modelling

    Get PDF
    Analysing and learning from spatio-temporal datasets is an important process in many domains, including transportation, healthcare and meteorology. In particular, data collected by sensors in the environment allows us to understand and model the processes acting within the environment. Recently, the volume of spatio-temporal data collected has increased significantly, presenting several challenges for data scientists. Methods are therefore needed to reduce the quantity of data that needs to be processed in order to analyse and learn from spatio-temporal datasets. In this article, we present the k-Dimensional Spatio-Temporal Reduction method (kD-STR) for reducing the quantity of data used to store a dataset whilst enabling multiple types of analysis on the reduced dataset. kD-STR uses hierarchical partitioning to find spatio-temporal regions of similar instances, and models the instances within each region to summarise the dataset. We demonstrate the generality of kD-STR with three datasets exhibiting different spatio-temporal characteristics and present results for a range of data modelling techniques. Finally, we compare kD-STR with other techniques for reducing the volume of spatio-temporal data. Our results demonstrate that kD-STR is effective in reducing spatio-temporal data and generalises to datasets that exhibit different properties.
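    The hierarchical partition-and-model idea described in the abstract can be sketched in a few lines: if a simple model of a region's values exceeds an error threshold, split the region and recurse; otherwise store only the model as the region's summary. This is an illustrative sketch under assumptions, not the published kD-STR algorithm: the mean as the per-region model, the max-deviation error measure, and the median split are all choices made here for brevity.

    ```python
    import statistics

    def partition(points, values, threshold, depth=0, max_depth=8):
        """Hierarchically partition (point, value) data: recurse while a
        region's model error exceeds `threshold`, alternating the split
        dimension at each level; return a list of (count, model) summaries.
        """
        mean = statistics.fmean(values)
        error = max(abs(v - mean) for v in values)
        if error <= threshold or len(points) <= 1 or depth >= max_depth:
            return [(len(points), mean)]          # summarise this region
        dim = depth % len(points[0])              # alternate split dimension
        pivot = statistics.median(p[dim] for p in points)
        left = [(p, v) for p, v in zip(points, values) if p[dim] <= pivot]
        right = [(p, v) for p, v in zip(points, values) if p[dim] > pivot]
        if not left or not right:                 # degenerate split: stop
            return [(len(points), mean)]
        regions = []
        for part in (left, right):
            ps, vs = zip(*part)
            regions.extend(partition(list(ps), list(vs),
                                     threshold, depth + 1, max_depth))
        return regions
    ```

    The reduction comes from storing only a model per region rather than every instance; the threshold controls the trade-off between storage saved and error incurred, mirroring the storage/error trade-off the abstract describes.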

    Reducing spatio-temporal data: methods and analysis

    Get PDF
    Analysing and learning from spatio-temporal datasets is an important process in many domains, including transportation, healthcare and meteorology. However, in recent years, the volume of data generated for such datasets has increased significantly. This poses several challenges for data scientists, including increased processing overheads and costs. Thus, several methods have been proposed for reducing the volume of data stored and processed to analyse and learn from these datasets. However, existing methods fail to take advantage of the spatial and temporal autocorrelation present in spatio-temporal data, incur unnecessary overheads when retrieving the data, or fail to retain information about all instances and features. This thesis introduces several data reduction methods to address these limitations. First, the kD-STR algorithm is introduced, which hierarchically partitions and models the data, thereby reducing the storage overhead of the dataset. This method minimises the storage used and error incurred. Second, this reduction method is adapted for the context of data linking, and an alternative heuristic proposed that minimises error in the features engineered during linking. Third, adapted algorithms are presented for reducing multiple datasets simultaneously, and reducing large datasets in a distributed manner. Through empirical analysis using real-world datasets, the utility of these algorithms is investigated. The results presented demonstrate the data reduction that can be achieved using these algorithms, as well as the impact of using different spatial referencing systems and modelling techniques. Further analysis is presented that demonstrates the effect of error in location and time, noise and missing data on the data reduction. 
Combined, the algorithms presented offer an improvement over the state-of-the-art in spatio-temporal data reduction, and the analysis presented demonstrates the results that may be achieved for datasets exhibiting a range of characteristics.

    Data Sketching for Large-Scale Kalman Filtering

    No full text