1,504 research outputs found

    Complications for Computational Experiments from Modern Processors

    Get PDF
    In this paper, we revisit the approach to empirical experiments for combinatorial solvers. We provide a brief survey on tools that can help to make empirical work easier. We illustrate origins of uncertainty in modern hardware and show how strong the influence of certain aspects of modern hardware and its experimental setup can be in an actual experimental evaluation. More specifically, there can be situations where (i) two different researchers run a reasonable-looking experiment comparing the same solvers and come to different conclusions and (ii) one researcher runs the same experiment twice on the same hardware and reaches different conclusions based upon how the hardware is configured and used. We investigate these situations from a hardware perspective. Furthermore, we provide an overview on standard measures, detailed explanations on effects, potential errors, and biased suggestions for useful tools. Alongside the tools, we discuss their feasibility as experiments often run on clusters to which the experimentalist has only limited access. Our work sheds light on a number of benchmarking-related issues which could be considered to be folklore or even myths

    Dynamic Scheduling for Energy Minimization in Delay-Sensitive Stream Mining

    Get PDF
    Numerous stream mining applications, such as visual detection, online patient monitoring, and video search and retrieval, are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness (i.e., delay) constraints for user interactivity and, at the same time, must be optimized for energy efficiency. The increasingly heterogeneous power-versus-performance profile of modern hardware presents new opportunities for energy saving as well as challenges. For example, employing low-performance processing nodes can save energy but may violate delay requirements, whereas employing high-performance processing nodes can deliver a fast response but may unnecessarily waste energy. Existing scheduling algorithms balance energy versus delay assuming constant processing and power requirements throughout the execution of a stream mining task and without exploiting hardware heterogeneity. In this paper, we propose a novel framework for dynamic scheduling for energy minimization (DSE) that leverages this emerging hardware heterogeneity. By optimally determining the processing speeds for hardware executing classifiers, DSE minimizes the average energy consumption while satisfying an average delay constraint. To assess the performance of DSE, we build a face detection application based on the Viola-Jones classifier chain and conduct experimental studies via heterogeneous processor system emulation. The results show that, under the same delay requirement, DSE reduces the average energy consumption by up to 50% in comparison to conventional scheduling that does not exploit hardware heterogeneity. We also demonstrate that DSE is robust against processing node switching overhead and model inaccuracy

    Adaptive Dispatching of Tasks in the Cloud

    Full text link
    The increasingly wide application of Cloud Computing enables the consolidation of tens of thousands of applications in shared infrastructures. Thus, meeting the quality of service requirements of so many diverse applications in such shared resource environments has become a real challenge, especially since the characteristics and workload of applications differ widely and may change over time. This paper presents an experimental system that can exploit a variety of online quality of service aware adaptive task allocation schemes, and three such schemes are designed and compared. These are a measurement driven algorithm that uses reinforcement learning, secondly a "sensible" allocation algorithm that assigns jobs to sub-systems that are observed to provide a lower response time, and then an algorithm that splits the job arrival stream into sub-streams at rates computed from the hosts' processing capabilities. All of these schemes are compared via measurements among themselves and with a simple round-robin scheduler, on two experimental test-beds with homogeneous and heterogeneous hosts having different processing capacities.Comment: 10 pages, 9 figure

    Low power digital signal processing

    Get PDF

    Fast algorithm for real-time rings reconstruction

    Get PDF
    The GAP project is dedicated to study the application of GPU in several contexts in which real-time response is important to take decisions. The definition of real-time depends on the application under study, ranging from answer time of μs up to several hours in case of very computing intensive task. During this conference we presented our work in low level triggers [1] [2] and high level triggers [3] in high energy physics experiments, and specific application for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solution to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for triggers application, to accelerate the ring reconstruction in RICH detector when it is not possible to have seeds for reconstruction from external trackers

    CrowdAdaptor: A Crowd Sourcing Approach toward Adaptive Energy-Efficient Configurations of Virtual Machines Hosting Mobile Applications

    Get PDF
    Applications written by end-user programmers are hardly energy-optimized by these programmers. The end users of such applications thus suffer significant energy issues. In this paper, we propose CrowdAdaptor, a novel approach toward locating energy-efficient configurations to execute the applications hosted in virtual machines on handheld devices. CrowdAdaptor innovatively makes use of the development artifacts (test cases) and the very large installation base of the same application to distribute the test executions and performance data collection of the whole test suites against many different virtual machine configurations among these installation bases. It synthesizes these data, continuously discovers better energy-efficient configurations, and makes them available to all the installations of the same applications. We report a multi-subject case study on the ability of the framework to discover energy-efficient configurations in three power models. The results show that Crowd Adaptor can achieve up to 50% of energy savings based on a conservative linear power model.published_or_final_versio

    KALwEN: a new practical and interoperable key management scheme for body sensor networks

    Get PDF
    Key management is the pillar of a security architecture. Body sensor networks (BSNs) pose several challenges–some inherited from wireless sensor networks (WSNs), some unique to themselves–that require a new key management scheme to be tailor-made. The challenge is taken on, and the result is KALwEN, a new parameterized key management scheme that combines the best-suited cryptographic techniques in a seamless framework. KALwEN is user-friendly in the sense that it requires no expert knowledge of a user, and instead only requires a user to follow a simple set of instructions when bootstrapping or extending a network. One of KALwEN's key features is that it allows sensor devices from different manufacturers, which expectedly do not have any pre-shared secret, to establish secure communications with each other. KALwEN is decentralized, such that it does not rely on the availability of a local processing unit (LPU). KALwEN supports secure global broadcast, local broadcast, and local (neighbor-to-neighbor) unicast, while preserving past key secrecy and future key secrecy (FKS). The fact that the cryptographic protocols of KALwEN have been formally verified also makes a convincing case. With both formal verification and experimental evaluation, our results should appeal to theorists and practitioners alike

    Visual-Inertial Odometry on Chip: An Algorithm-and-Hardware Co-design Approach

    Get PDF
    Autonomous navigation of miniaturized robots (e.g., nano/pico aerial vehicles) is currently a grand challenge for robotics research, due to the need of processing a large amount of sensor data (e.g., camera frames) with limited on-board computational resources. In this paper we focus on the design of a visual-inertial odometry (VIO) system in which the robot estimates its ego-motion (and a landmark-based map) from on- board camera and IMU data. We argue that scaling down VIO to miniaturized platforms (without sacrificing performance) requires a paradigm shift in the design of perception algorithms, and we advocate a co-design approach in which algorithmic and hardware design choices are tightly coupled. Our contribution is four-fold. First, we discuss the VIO co-design problem, in which one tries to attain a desired resource-performance trade-off, by making suitable design choices (in terms of hardware, algorithms, implementation, and parameters). Second, we characterize the design space, by discussing how a relevant set of design choices affects the resource-performance trade-off in VIO. Third, we provide a systematic experiment-driven way to explore the design space, towards a design that meets the desired trade-off. Fourth, we demonstrate the result of the co-design process by providing a VIO implementation on specialized hardware and showing that such implementation has the same accuracy and speed of a desktop implementation, while requiring a fraction of the power.United States. Air Force Office of Scientific Research. Young Investigator Program (FA9550-16-1-0228)National Science Foundation (U.S.) (NSF CAREER 1350685
    corecore