117,861 research outputs found

    A Robust Zero-Calibration RF-based Localization System for Realistic Environments

    Full text link
    Due to the noisy indoor radio propagation channel, Radio Frequency (RF)-based location determination systems usually require a tedious calibration phase to construct an RF fingerprint of the area of interest. This fingerprint varies with the used mobile device, changes of the transmit power of smart access points (APs), and dynamic changes in the environment; requiring re-calibration of the area of interest; which reduces the technology ease of use. In this paper, we present IncVoronoi: a novel system that can provide zero-calibration accurate RF-based indoor localization that works in realistic environments. The basic idea is that the relative relation between the received signal strength from two APs at a certain location reflects the relative distance from this location to the respective APs. Building on this, IncVoronoi incrementally reduces the user ambiguity region based on refining the Voronoi tessellation of the area of interest. IncVoronoi also includes a number of modules to efficiently run in realtime as well as to handle practical deployment issues including the noisy wireless environment, obstacles in the environment, heterogeneous devices hardware, and smart APs. We have deployed IncVoronoi on different Android phones using the iBeacons technology in a university campus. Evaluation of IncVoronoi with a side-by-side comparison with traditional fingerprinting techniques shows that it can achieve a consistent median accuracy of 2.8m under different scenarios with a low beacon density of one beacon every 44m2. Compared to fingerprinting techniques, whose accuracy degrades by at least 156%, this accuracy comes with no training overhead and is robust to the different user devices, different transmit powers, and over temporal changes in the environment. This highlights the promise of IncVoronoi as a next generation indoor localization system.Comment: 9 pages, 13 figures, published in SECON 201

    The multi-program performance model: debunking current practice in multi-core simulation

    Get PDF
    Composing a representative multi-program multi-core workload is non-trivial. A multi-core processor can execute multiple independent programs concurrently, and hence, any program mix can form a potential multi-program workload. Given the very large number of possible multiprogram workloads and the limited speed of current simulation methods, it is impossible to evaluate all possible multi-program workloads. This paper presents the Multi-Program Performance Model (MPPM), a method for quickly estimating multiprogram multi-core performance based on single-core simulation runs. MPPM employs an iterative method to model the tight performance entanglement between co-executing programs on a multi-core processor with shared caches. Because MPPM involves analytical modeling, it is very fast, and it estimates multi-core performance for a very large number of multi-program workloads in a reasonable amount of time. In addition, it provides confidence bounds on its performance estimates. Using SPEC CPU2006 and up to 16 cores, we report an average performance prediction error of 2.3% and 2.9% for system throughput (STP) and average normalized turnaround time (ANTT), respectively, while being up to five orders of magnitude faster than detailed simulation. Subsequently, we demonstrate that randomly picking a limited number of multi-program workloads, as done in current pactice, can lead to incorrect design decisions in practical design and research studies, which is alleviated using MPPM. In addition, MPPM can be used to quickly identify multi-program workloads that stress multi-core performance through excessive conflict behavior in shared caches; these stress workloads can then be used for driving the design process further

    Racing to hardware-validated simulation

    Get PDF
    Processor simulators rely on detailed timing models of the processor pipeline to evaluate performance. The diversity in real-world processor designs mandates building flexible simulators that expose parts of the underlying model to the user in the form of configurable parameters. Consequently, the accuracy of modeling a real processor relies on both the accuracy of the pipeline model itself, and the accuracy of adjusting the configuration parameters according to the modeled processor. Unfortunately, processor vendors publicly disclose only a subset of their design decisions, raising the probability of introducing specification inaccuracies when modeling these processors. Inaccurately tuning model parameters deviates the simulated processor from the actual one. In the worst case, using improper parameters may lead to imbalanced pipeline models compromising the simulation output. Therefore, simulation models should be hardware-validated before using them for performance evaluation. As processors increase in complexity and diversity, validating a simulator model against real hardware becomes increasingly more challenging and time-consuming. In this work, we propose a methodology for validating simulation models against real hardware. We create a framework that relies on micro-benchmarks to collect performance statistics on real hardware, and machine learning-based algorithms to fine-tune the unknown parameters based on the accumulated statistics. We overhaul the Sniper simulator to support the ARM AArch64 instruction-set architecture (ISA), and introduce two new timing models for ARM-based in-order and out-of-order cores. Using our proposed simulator validation framework, we tune the in-order and out-of-order models to match the performance of a real-world implementation of the Cortex-A53 and Cortex-A72 cores with an average error of 7% and 15%, respectively, across a set of SPEC CPU2017 benchmarks

    BarrierPoint: sampled simulation of multi-threaded applications

    Get PDF
    Sampling is a well-known technique to speed up architectural simulation of long-running workloads while maintaining accurate performance predictions. A number of sampling techniques have recently been developed that extend well- known single-threaded techniques to allow sampled simulation of multi-threaded applications. Unfortunately, prior work is limited to non-synchronizing applications (e.g., server throughput workloads); requires the functional simulation of the entire application using a detailed cache hierarchy which limits the overall simulation speedup potential; leads to different units of work across different processor architectures which complicates performance analysis; or, requires massive machine resources to achieve reasonable simulation speedups. In this work, we propose BarrierPoint, a sampling methodology to accelerate simulation by leveraging globally synchronizing barriers in multi-threaded applications. BarrierPoint collects microarchitecture-independent code and data signatures to determine the most representative inter-barrier regions, called barrierpoints. BarrierPoint estimates total application execution time (and other performance metrics of interest) through detailed simulation of these barrierpoints only, leading to substantial simulation speedups. Barrierpoints can be simulated in parallel, use fewer simulation resources, and define fixed units of work to be used in performance comparisons across processor architectures. Our evaluation of BarrierPoint using NPB and Parsec benchmarks reports average simulation speedups of 24.7x (and up to 866.6x) with an average simulation error of 0.9% and 2.9% at most. On average, BarrierPoint reduces the number of simulation machine resources needed by 78x
    corecore