High Performance Algorithms for Counting Collisions and Pairwise Interactions
The problem of counting collisions or interactions is common in areas such as
computer graphics and scientific simulations. Since it is a major bottleneck in
applications in these areas, a lot of research has been carried out on the
subject, mainly focused on techniques that allow calculations to be performed
within pruned sets of objects. This paper focuses on how interaction
calculation (such as collisions) within these sets can be done more efficiently
than existing approaches. Two algorithms are proposed: a sequential algorithm
that has linear complexity at the cost of high memory usage; and a parallel
algorithm, mathematically proved to be correct, that manages to use GPU
resources more efficiently than existing approaches. The proposed and existing
algorithms were implemented, and experiments show a speedup of 21.7 for the
sequential algorithm (on small problem size), and 1.12 for the parallel
proposal (large problem size). By improving interaction calculation, this work
contributes to research areas that promote interconnection in the modern world,
such as computer graphics and robotics.
Comment: Accepted in ICCS 2019 and published in Springer's LNCS series.
Supplementary content at https://mjsaldanha.com/articles/1-hpc-ssp
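The trade-off mentioned in the abstract above (linear time at the cost of extra memory) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, for illustration only, that objects are points on an integer lattice and that two objects collide when they occupy the same cell. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_collisions_quadratic(points):
    """Baseline: compare every pair of points, O(n^2) time, O(1) extra memory."""
    return sum(1 for a, b in combinations(points, 2) if a == b)


def count_collisions_linear(points):
    """Bucket points by cell, then count the pairs inside each bucket.

    Expected O(n) time, but it stores one counter per occupied cell
    (assumption: a collision means two points share the same cell).
    """
    occupancy = Counter(points)
    # k points in one cell form k * (k - 1) / 2 colliding pairs.
    return sum(k * (k - 1) // 2 for k in occupancy.values())


points = [(0, 0, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0), (0, 0, 0), (2, 1, 0)]
print(count_collisions_quadratic(points), count_collisions_linear(points))
```

Both functions return the same count; the second replaces the pairwise loop with a table lookup, which is where the additional memory goes.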
Throughput Analysis of CSMA Wireless Networks with Finite Offered-load
This paper proposes an approximate method, equivalent access intensity (EAI),
for the throughput analysis of CSMA wireless networks in which links have
finite offered-load and their MAC-layer transmit buffers may be empty from time
to time. Unlike prior works, which mainly considered the saturated
network, we take into account in our analysis the impact of empty transmit
buffers on the interactions and dependencies among links in the network, a
situation that is more common in practice. It is known that the empty transmit buffer incurs
extra waiting time for a link to compete for the channel airtime usage, since
when it has no packet waiting for transmission, the link will not perform
channel competition. The basic idea behind EAI is that this extra waiting time
can be mapped to an equivalent "longer" backoff countdown time for the
unsaturated link, yielding a lower link access intensity that is defined as the
mean packet transmission time divided by the mean backoff countdown time. That
is, we can compute the "equivalent access intensity" of an unsaturated link to
incorporate the effects of the empty transmit buffer on its behavior of channel
competition. Then, the prior saturated ideal CSMA network (ICN) model can be
adopted for link throughput computation. Specifically, we propose an iterative
algorithm, "Compute-and-Compare", to identify which links are unsaturated under
current offered-load and protocol settings, compute their "equivalent access
intensities" and calculate link throughputs. Simulation shows that our
algorithm has high accuracy under various offered-load and protocol settings.
We believe the ability to identify unsaturated links and compute link
throughputs as established in this paper will serve as an important first step
toward the design and optimization of general CSMA wireless networks with
offered-load control.
Comment: 6 pages. arXiv admin note: text overlap with arXiv:1007.5255 by other
author
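The definition of access intensity given in the abstract above (mean packet transmission time divided by mean backoff countdown time) can be written out as a short sketch. The additive mapping of the extra waiting time onto the countdown is an assumption made here for illustration; the abstract only says the wait is mapped to a "longer" backoff, and the function and parameter names are hypothetical.

```python
def access_intensity(mean_tx_time, mean_backoff_time):
    """Access intensity as defined in the abstract: transmission / backoff."""
    return mean_tx_time / mean_backoff_time


def equivalent_access_intensity(mean_tx_time, mean_backoff_time, mean_extra_wait):
    """Access intensity of an unsaturated link under an assumed additive model.

    Assumption: the empty-buffer waiting time is simply added to the mean
    backoff countdown time, which lengthens it and lowers the intensity.
    """
    return mean_tx_time / (mean_backoff_time + mean_extra_wait)


saturated = access_intensity(2.0, 0.5)
unsaturated = equivalent_access_intensity(2.0, 0.5, 0.5)
print(saturated, unsaturated)
```

With these example values the unsaturated link ends up with half the access intensity of the saturated one, which matches the qualitative claim that an empty buffer yields a lower access intensity.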
Protein Structure Prediction with Parallel Algorithms Orthogonal to Parallel Platforms
The problem of Protein Structure Prediction (PSP) is known to be computationally expensive, which calls for the application of high performance techniques. In this project, parallel PSP algorithms found in the literature are being accelerated and ported to different parallel platforms, producing a set of algorithms that is diverse in terms of the parallel architectures and parallel programming models used. The algorithms are intended to help other research projects, and they have also been made publicly available so as to support the development of more elaborate prediction algorithms. We have thus far produced a set of 16 algorithms (mixing CUDA, OpenMP, MPI and/or complexity reduction optimizations); during their development, two algorithms that promote high performance were proposed, and they have been described in an article that was accepted at the International Conference on Computational Science (ICCS).
Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles
We present a novel, realtime algorithm to compute the trajectory of each
pedestrian in moderately dense crowd scenes. Our formulation is based on an
adaptive particle filtering scheme that uses a multi-agent motion model based
on velocity-obstacles, and takes into account local interactions as well as
physical and personal constraints of each pedestrian. Our method dynamically
changes the number of particles allocated to each pedestrian based on different
confidence metrics. Additionally, we use a new high-definition crowd video
dataset, which is used to evaluate the performance of different pedestrian
tracking algorithms. This dataset consists of videos of indoor and outdoor
scenes, recorded at different locations with 30-80 pedestrians. We highlight
the performance benefits of our algorithm over prior techniques using this
dataset. In practice, our algorithm can compute trajectories of tens of
pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per
second). To the best of our knowledge, our approach is 4-5 times faster than
prior methods that provide similar accuracy.
Energy flow polynomials: A complete linear basis for jet substructure
We introduce the energy flow polynomials: a complete set of jet substructure
observables which form a discrete linear basis for all infrared- and
collinear-safe observables. Energy flow polynomials are multiparticle energy
correlators with specific angular structures that are a direct consequence of
infrared and collinear safety. We establish a powerful graph-theoretic
representation of the energy flow polynomials which allows us to design
efficient algorithms for their computation. Many common jet observables are
exact linear combinations of energy flow polynomials, and we demonstrate the
linear spanning nature of the energy flow basis by performing regression for
several common jet observables. Using linear classification with energy flow
polynomials, we achieve excellent performance on three representative jet
tagging problems: quark/gluon discrimination, boosted W tagging, and boosted
top tagging. The energy flow basis provides a systematic framework for complete
investigations of jet substructure using linear methods.
Comment: 41+15 pages, 13 figures, 5 tables; v2: updated to match JHEP version
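The graph-based energy correlators described in the abstract above can be illustrated with a brute-force sketch. It assumes the standard form in which each graph vertex is assigned a particle, each vertex contributes that particle's energy fraction, and each edge contributes the pairwise angular distance between its two endpoints. This is a naive enumeration for illustration, not the efficient algorithms the abstract refers to, and the function and argument names are hypothetical.

```python
from itertools import product
from math import prod


def efp(energy_fractions, angles, edges, n_vertices):
    """Naive energy flow polynomial for one graph.

    energy_fractions: list of z_i, one per particle.
    angles: square matrix with angles[i][j] = angular distance of i and j.
    edges: list of (k, l) vertex pairs defining the graph.
    n_vertices: number of graph vertices N.
    Cost is O(M^N) for M particles, so this only suits tiny inputs.
    """
    total = 0.0
    particles = range(len(energy_fractions))
    # Sum over every assignment of a particle to each graph vertex.
    for assignment in product(particles, repeat=n_vertices):
        energy_weight = prod(energy_fractions[i] for i in assignment)
        angular_weight = prod(
            angles[assignment[k]][assignment[l]] for k, l in edges
        )
        total += energy_weight * angular_weight
    return total


z = [0.5, 0.5]
theta = [[0.0, 1.0], [1.0, 0.0]]
print(efp(z, theta, edges=[(0, 1)], n_vertices=2))
```

For the single-edge graph this reduces to the familiar two-point energy correlator; a graph with one vertex and no edges simply sums the energy fractions.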