4,800 research outputs found

    PPF - A Parallel Particle Filtering Library

    Full text link
    We present the parallel particle filtering (PPF) software library, which enables hybrid shared-memory/distributed-memory parallelization of particle filtering (PF) algorithms combining the Message Passing Interface (MPI) with multithreading for multi-level parallelism. The library is implemented in Java and relies on OpenMPI's Java bindings for inter-process communication. It includes dynamic load balancing, multi-thread balancing, and several algorithmic improvements for PF, such as input-space domain decomposition. The PPF library hides the difficulties of efficient parallel programming of PF algorithms and provides application developers with the necessary tools for parallel implementation of PF methods. We demonstrate the capabilities of the PPF library using two distributed PF algorithms in two scenarios with different numbers of particles. The PPF library runs a 38 million particle problem, corresponding to more than 1.86 GB of particle data, on 192 cores with 67% parallel efficiency. To the best of our knowledge, the PPF library is the first open-source software that offers a parallel framework for PF applications.Comment: 8 pages, 8 figures; will appear in the proceedings of the IET Data Fusion & Target Tracking Conference 201

    Scalable multimodal convolutional networks for brain tumour segmentation

    Get PDF
    Brain tumour segmentation plays a key role in computer-assisted surgery. Deep neural networks have increased the accuracy of automatic segmentation significantly, however these models tend to generalise poorly to different imaging modalities than those for which they have been designed, thereby limiting their applications. For example, a network architecture initially designed for brain parcellation of monomodal T1 MRI can not be easily translated into an efficient tumour segmentation network that jointly utilises T1, T1c, Flair and T2 MRI. To tackle this, we propose a novel scalable multimodal deep learning architecture using new nested structures that explicitly leverage deep features within or across modalities. This aims at making the early layers of the architecture structured and sparse so that the final architecture becomes scalable to the number of modalities. We evaluate the scalable architecture for brain tumour segmentation and give evidence of its regularisation effect compared to the conventional concatenation approach.Comment: Paper accepted at MICCAI 201

    Complete methods set for scalable ion trap quantum information processing

    Full text link
    Large-scale quantum information processors must be able to transport and maintain quantum information, and repeatedly perform logical operations. Here we demonstrate a combination of all the fundamental elements required to perform scalable quantum computing using qubits stored in the internal states of trapped atomic ions. We quantify the repeatability of a multi-qubit operation, observing no loss of performance despite qubit transport over macroscopic distances. Key to these results is the use of different pairs of beryllium ion hyperfine states for robust qubit storage, readout and gates, and simultaneous trapping of magnesium re-cooling ions along with the qubit ions.Comment: 9 pages, 4 figures. Accepted to Science, and thus subject to a press embarg

    TaskPoint: sampled simulation of task-based programs

    Get PDF
    Sampled simulation is a mature technique for reducing simulation time of single-threaded programs, but it is not directly applicable to simulation of multi-threaded architectures. Recent multi-threaded sampling techniques assume that the workload assigned to each thread does not change across multiple executions of a program. This assumption does not hold for dynamically scheduled task-based programming models. Task-based programming models allow the programmer to specify program segments as tasks which are instantiated many times and scheduled dynamically to available threads. Due to system noise and variation in scheduling decisions, two consecutive executions on the same machine typically result in different instruction streams processed by each thread. In this paper, we propose TaskPoint, a sampled simulation technique for dynamically scheduled task-based programs. We leverage task instances as sampling units and simulate only a fraction of all task instances in detail. Between detailed simulation intervals we employ a novel fast-forward mechanism for dynamically scheduled programs. We evaluate the proposed technique on a set of 19 task-based parallel benchmarks and two different architectures. Compared to detailed simulation, TaskPoint accelerates architectural simulation with 64 simulated threads by an average factor of 19.1 at an average error of 1.8% and a maximum error of 15.0%.This work has been supported by the Spanish Government (Severo Ochoa grants SEV2015-0493, SEV-2011-00067), the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), the RoMoL ERC Advanced Grant (GA 321253), the European HiPEAC Network of Excellence and the Mont-Blanc project (EU-FP7-610402 and EU-H2020-671697). M. Moreto has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship JCI-2012-15047. M. Casas is supported by the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the EUFP7 (contract 2013BP B 00243). T.Grass has been partially supported by the AGAUR of the Generalitat de Catalunya (grant 2013FI B 0058).Peer ReviewedPostprint (author's final draft

    A Parallel Histogram-based Particle Filter for Object Tracking on SIMD-based Smart Cameras

    Get PDF
    We present a parallel implementation of a histogram-based particle filter for object tracking on smart cameras based on SIMD processors. We specifically focus on parallel computation of the particle weights and parallel construction of the feature histograms since these are the major bottlenecks in standard implementations of histogram-based particle filters. The proposed algorithm can be applied with any histogram-based feature sets—we show in detail how the parallel particle filter can employ simple color histograms as well as more complex histograms of oriented gradients (HOG). The algorithm was successfully implemented on an SIMD processor and performs robust object tracking at up to 30 frames per second—a performance difficult to achieve even on a modern desktop computer
    • …
    corecore