23 research outputs found

    Fast Parallel Exact Inference on Bayesian Networks: Poster

    Full text link
    Bayesian networks (BNs) are attractive, because they are graphical and interpretable machine learning models. However, exact inference on BNs is time-consuming, especially for complex problems. To improve the efficiency, we propose a fast BN exact inference solution named Fast-BNI on multi-core CPUs. Fast-BNI enhances the efficiency of exact inference through hybrid parallelism that tightly integrates coarse- and fine-grained parallelism. We also propose techniques to further simplify the bottleneck operations of BN exact inference. Fast-BNI source code is freely available at https://github.com/jjiantong/FastBN

    Dynamic Scheduling for Energy Minimization in Delay-Sensitive Stream Mining

    Get PDF
    Numerous stream mining applications, such as visual detection, online patient monitoring, and video search and retrieval, are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness (i.e., delay) constraints for user interactivity and, at the same time, must be optimized for energy efficiency. The increasingly heterogeneous power-versus-performance profile of modern hardware presents new opportunities for energy saving as well as challenges. For example, employing low-performance processing nodes can save energy but may violate delay requirements, whereas employing high-performance processing nodes can deliver a fast response but may unnecessarily waste energy. Existing scheduling algorithms balance energy versus delay assuming constant processing and power requirements throughout the execution of a stream mining task and without exploiting hardware heterogeneity. In this paper, we propose a novel framework for dynamic scheduling for energy minimization (DSE) that leverages this emerging hardware heterogeneity. By optimally determining the processing speeds for hardware executing classifiers, DSE minimizes the average energy consumption while satisfying an average delay constraint. To assess the performance of DSE, we build a face detection application based on the Viola-Jones classifier chain and conduct experimental studies via heterogeneous processor system emulation. The results show that, under the same delay requirement, DSE reduces the average energy consumption by up to 50% in comparison to conventional scheduling that does not exploit hardware heterogeneity. We also demonstrate that DSE is robust against processing node switching overhead and model inaccuracy

    ACCURACY AND MULTI-CORE PERFORMANCE OF MACHINE LEARNING ALGORITHMS FOR HANDWRITTEN CHARACTER RECOGNITION

    Get PDF
    There have been considerable developments in the quest for intelligent machines since the beginning of the cybernetics revolution and the advent of computers. In the last two decades with the onset of the internet the developments have been extensive. This quest for building intelligent machines have led into research on the working of human brain, which has in turn led to the development of pattern recognition models which take inspiration in their structure and performance from biological neural networks. Research in creating intelligent systems poses two main problems. The first one is to develop algorithms which can generalize and predict accurately based on previous examples. The second one is to make these algorithms run fast enough to be able to do real time tasks. The aim of this thesis is to study and compare the accuracy and multi-core performance of some of the best learning algorithms to the task of handwritten character recognition. Seven algorithms are compared for their accuracy on the MNIST database, and the test set accuracy (generalization) for the different algorithms are compared. The second task is to implement and compare the performance of two of the hierarchical Bayesian based cortical algorithms, Hierarchical Temporal Memory (HTM) and Hierarchical Expectation Refinement Algorithm (HERA) on multi-core architectures. The results indicate that the HTM and HERA algorithms can make use of the parallelism in multi-core architectures

    Towards Real-time, On-board, Hardware-Supported Sensor and Software Health Management for Unmanned Aerial Systems

    Get PDF
    Unmanned aerial systems (UASs) can only be deployed if they can effectively complete their missions and respond to failures and uncertain environmental conditions while maintaining safety with respect to other aircraft as well as humans and property on the ground. In this paper, we design a real-time, on-board system health management (SHM) capability to continuously monitor sensors, software, and hardware components for detection and diagnosis of failures and violations of safety or performance rules during the flight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and/or software signals; (2) signal analysis, preprocessing, and advanced on the- fly temporal and Bayesian probabilistic fault diagnosis; (3) an unobtrusive, lightweight, read-only, low-power realization using Field Programmable Gate Arrays (FPGAs) that avoids overburdening limited computing resources or costly re-certification of flight software due to instrumentation. Our implementation provides a novel approach of combining modular building blocks, integrating responsive runtime monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. We demonstrate this approach using actual data from the NASA Swift UAS, an experimental all-electric aircraft

    Optimising Memory Management for Belief Propagation in Junction Trees using GPGPUs

    Get PDF
    Belief Propagation (BP) in Junction Trees (JT) is one of the most popular approaches to compute posteriors in Bayesian Networks (BN). Such approach has significant computational requirements that can be addressed by using highly parallel architectures (i.e., General Purpose Graphic Processing Units) to parallelise the message update phases of BP. In this paper, we propose a novel approach to parallelise BP with GPGPUs, which focuses on optimising the memory layout of the BN tables so to achieve better performance in terms of increased speedup, reduced data transfers between the host and the GPGPU, and scalability. Our empirical comparison with the state of the art approach on standard datasets confirms significant improvements in speedups (up to +594%), and scalability (as our method can operate on networks whose potential tables exceed the global memory of the GPGPU)

    Towards Real-Time, On-Board, Hardware-Supported Sensor and Software Health Management for Unmanned Aerial Systems

    Get PDF
    For unmanned aerial systems (UAS) to be successfully deployed and integrated within the national airspace, it is imperative that they possess the capability to effectively complete their missions without compromising the safety of other aircraft, as well as persons and property on the ground. This necessity creates a natural requirement for UAS that can respond to uncertain environmental conditions and emergent failures in real-time, with robustness and resilience close enough to those of manned systems. We introduce a system that meets this requirement with the design of a real-time onboard system health management (SHM) capability to continuously monitor sensors, software, and hardware components. This system can detect and diagnose failures and violations of safety or performance rules during the flight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and software signals; (2) signal analysis, preprocessing, and advanced on-the-fly temporal and Bayesian probabilistic fault diagnosis; and (3) an unobtrusive, lightweight, read-only, low-power realization using Field Programmable Gate Arrays (FPGAs) that avoids overburdening limited computing resources or costly re-certification of flight software. We call this approach rt-R2U2, a name derived from its requirements. Our implementation provides a novel approach of combining modular building blocks, integrating responsive runtime monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. We demonstrate this approach using actual flight data from the NASA Swift UAS

    Acceleration of Spiking Neural Networks on Multicore Architectures

    Get PDF
    The human cortex is the seat of learning and cognition. Biological scale implementations of cortical models have the potential to provide significantly more power problem solving capabilities than traditional computing algorithms. The large scale implementation and design of these models has attracted significant attention recently. High performance implementations of the models are needed to enable such large scale designs. This thesis examines the acceleration of the spiking neural network class of cortical models on several modern multicore processors. These include the Izhikevich, Wilson, Morris-Lecar, and Hodgkin-Huxley models. The architectures examined are the STI Cell, Sun UltraSPARC T2+, and Intel Xeon E5345. Results indicate that these modern multicore processors can provide significant speed-ups and thus are useful in developing large scale cortical models. The models are then implemented on a 50 TeraFLOPS 336 node PlayStation 3 cluster. Results indicate that the models scale well on this cluster and can emulate 108 neurons and 1010 synapses. These numbers are comparable to the large scale cortical model implementation studies performed by IBM using the Blue Gene/L supercomputer. This study indicates that a cluster of PlayStation 3s can provide an economical, yet powerful, platform for simulating large scale biological models

    Towards Real-time, On-board, Hardware-supported Sensor and Software Health Management for Unmanned Aerial Systems

    Get PDF
    For unmanned aerial systems (UAS) to be successfully deployed and integrated within the national airspace, it is imperative that they possess the capability to effectively complete their missions without compromising the safety of other aircraft, as well as persons and property on the ground. This necessity creates a natural requirement for UAS that can respond to uncertain environmental conditions and emergent failures in real-time, with robustness and resilience close enough to those of manned systems. We introduce a system that meets this requirement with the design of a real-time onboard system health management (SHM) capability to continuously monitor sensors, software, and hardware components. This system can detect and diagnose failures and violations of safety or performance rules during the flight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and software signals; (2) signal analysis, preprocessing, and advanced on-the-fly temporal and Bayesian probabilistic fault diagnosis; and (3) an unobtrusive, lightweight, read-only, low-power realization using Field Programmable Gate Arrays (FPGAs) that avoids overburdening limited computing resources or costly re-certification of flight software. We call this approach rt-R2U2, a name derived from its requirements. Our implementation provides a novel approach of combining modular building blocks, integrating responsive runtime monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. We demonstrate this approach using actual flight data from the NASA Swift UAS
    corecore