2,442 research outputs found

    High Performance Algorithms for Counting Collisions and Pairwise Interactions

    The problem of counting collisions or interactions is common in areas such as computer graphics and scientific simulations. Since it is a major bottleneck in applications in these areas, a lot of research has been carried out on the subject, mainly focused on techniques that allow calculations to be performed within pruned sets of objects. This paper focuses on how interaction calculation (such as collisions) within these sets can be done more efficiently than with existing approaches. Two algorithms are proposed: a sequential algorithm that has linear complexity at the cost of high memory usage; and a parallel algorithm, mathematically proved to be correct, that manages to use GPU resources more efficiently than existing approaches. The proposed and existing algorithms were implemented, and experiments show a speedup of 21.7 for the sequential algorithm (on small problem size), and 1.12 for the parallel proposal (large problem size). By improving interaction calculation, this work contributes to research areas that promote interconnection in the modern world, such as computer graphics and robotics. Comment: Accepted in ICCS 2019 and published in Springer's LNCS series. Supplementary content at https://mjsaldanha.com/articles/1-hpc-ssp
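
    For orientation, the problem the abstract refers to can be sketched as a brute-force count over all pairs within one pruned set. This is only the quadratic baseline, not either of the paper's proposed algorithms; the function name `count_collisions` and the equal-radius sphere model are assumptions made for illustration.

```python
from itertools import combinations
import math


def count_collisions(centers, radius):
    """Count unordered pairs of equal-radius spheres that overlap.

    Brute-force O(n^2) baseline; the sphere model is an illustrative assumption.
    """
    limit = 2.0 * radius  # two spheres overlap when their centers are closer than this
    return sum(1 for a, b in combinations(centers, 2) if math.dist(a, b) < limit)


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(count_collisions(points, radius=0.6))  # prints 1
```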

    Throughput Analysis of CSMA Wireless Networks with Finite Offered-load

    This paper proposes an approximate method, equivalent access intensity (EAI), for the throughput analysis of CSMA wireless networks in which links have finite offered-load and their MAC-layer transmit buffers may be empty from time to time. Unlike prior works, which mainly considered the saturated network, our analysis takes into account the impact of empty transmit buffers on the interactions and dependencies among links, a situation that is more common in practice. It is known that an empty transmit buffer incurs extra waiting time for a link to compete for channel airtime, since the link does not perform channel competition when it has no packet waiting for transmission. The basic idea behind EAI is that this extra waiting time can be mapped to an equivalent "longer" backoff countdown time for the unsaturated link, yielding a lower link access intensity, which is defined as the mean packet transmission time divided by the mean backoff countdown time. That is, we can compute the "equivalent access intensity" of an unsaturated link to incorporate the effects of the empty transmit buffer on its channel-competition behavior. Then, the prior saturated ideal CSMA network (ICN) model can be adopted for link throughput computation. Specifically, we propose an iterative algorithm, "Compute-and-Compare", to identify which links are unsaturated under current offered-load and protocol settings, compute their "equivalent access intensities" and calculate link throughputs. Simulation shows that our algorithm has high accuracy under various offered-load and protocol settings. We believe the ability to identify unsaturated links and compute link throughputs as established in this paper will serve as an important first step toward the design and optimization of general CSMA wireless networks with offered-load control. Comment: 6 pages. arXiv admin note: text overlap with arXiv:1007.5255 by other author
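
    The definition of access intensity given in the abstract can be written out as a small sketch. The abstract does not state how the extra waiting time is obtained (that is the job of the "Compute-and-Compare" iteration), so the additive mapping below and the names `access_intensity`, `equivalent_access_intensity` and `mean_extra_wait` are assumptions for illustration only.

```python
def access_intensity(mean_tx_time, mean_backoff_time):
    """Access intensity: mean packet transmission time / mean backoff countdown time."""
    return mean_tx_time / mean_backoff_time


def equivalent_access_intensity(mean_tx_time, mean_backoff_time, mean_extra_wait):
    """Hypothetical EAI: treat the empty-buffer waiting time as extra backoff time.

    Assumption: the extra wait simply adds to the mean backoff countdown time.
    """
    return mean_tx_time / (mean_backoff_time + mean_extra_wait)


saturated = access_intensity(2.0, 1.0)
unsaturated = equivalent_access_intensity(2.0, 1.0, mean_extra_wait=1.0)
print(saturated, unsaturated)  # prints 2.0 1.0
```

    Under this assumption, any positive waiting time lowers the intensity below its saturated value, which matches the "lower link access intensity" described above.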

    Protein Structure Prediction with Parallel Algorithms Orthogonal to Parallel Platforms

    The problem of Protein Structure Prediction (PSP) is known to be computationally expensive, which calls for the application of high performance techniques. In this project, parallel PSP algorithms found in the literature are being accelerated and ported to different parallel platforms, producing a set of algorithms that is diverse in terms of the parallel architectures and parallel programming models used. The algorithms are intended to help other research projects, and they have also been made publicly available so as to support the development of more elaborate prediction algorithms. We have thus far produced a set of 16 algorithms (mixing CUDA, OpenMP, MPI and/or complexity reduction optimizations); during their development, two algorithms that promote high performance were proposed, and they are described in an article that was accepted at the International Conference on Computational Science (ICCS).

    Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles

    We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes. Our formulation is based on an adaptive particle filtering scheme that uses a multi-agent motion model based on velocity-obstacles, and takes into account local interactions as well as physical and personal constraints of each pedestrian. Our method dynamically changes the number of particles allocated to each pedestrian based on different confidence metrics. Additionally, we introduce a new high-definition crowd video dataset, which is used to evaluate the performance of different pedestrian tracking algorithms. This dataset consists of videos of indoor and outdoor scenes, recorded at different locations with 30-80 pedestrians. We highlight the performance benefits of our algorithm over prior techniques using this dataset. In practice, our algorithm can compute trajectories of tens of pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per second). To the best of our knowledge, our approach is 4-5 times faster than prior methods that provide similar accuracy.

    Energy flow polynomials: A complete linear basis for jet substructure

    We introduce the energy flow polynomials: a complete set of jet substructure observables which form a discrete linear basis for all infrared- and collinear-safe observables. Energy flow polynomials are multiparticle energy correlators with specific angular structures that are a direct consequence of infrared and collinear safety. We establish a powerful graph-theoretic representation of the energy flow polynomials which allows us to design efficient algorithms for their computation. Many common jet observables are exact linear combinations of energy flow polynomials, and we demonstrate the linear spanning nature of the energy flow basis by performing regression for several common jet observables. Using linear classification with energy flow polynomials, we achieve excellent performance on three representative jet tagging problems: quark/gluon discrimination, boosted W tagging, and boosted top tagging. The energy flow basis provides a systematic framework for complete investigations of jet substructure using linear methods. Comment: 41+15 pages, 13 figures, 5 tables; v2: updated to match JHEP version
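
    The graph-theoretic representation mentioned in the abstract can be illustrated with a brute-force sketch. The abstract itself gives no formula; the code below assumes the commonly cited form, in which each graph vertex carries an energy fraction z and each edge carries a pairwise angular distance theta, summed over all assignments of particles to vertices. The function name `efp` and the input layout are hypothetical, and this naive loop is not one of the efficient algorithms the paper describes.

```python
from itertools import product


def efp(z, theta, edges):
    """Brute-force energy flow polynomial for a multigraph (illustrative sketch).

    z     : list of energy fractions, one per particle
    theta : square matrix of pairwise angular distances, theta[i][j]
    edges : list of (k, l) vertex pairs; repeated pairs act as multi-edges
    """
    n_vertices = 1 + max(max(edge) for edge in edges)
    total = 0.0
    # Sum over every assignment of a particle to each graph vertex.
    for assignment in product(range(len(z)), repeat=n_vertices):
        term = 1.0
        for particle in assignment:
            term *= z[particle]  # one energy factor per vertex
        for k, l in edges:
            term *= theta[assignment[k]][assignment[l]]  # one angle per edge
        total += term
    return total


z = [0.5, 0.5]
theta = [[0.0, 1.0], [1.0, 0.0]]
print(efp(z, theta, edges=[(0, 1)]))  # prints 0.5
```

    The cost grows as M^N for M particles and N vertices, which is why a more efficient evaluation strategy matters in practice.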