12,625 research outputs found
Simple Online and Realtime Tracking with a Deep Association Metric
Simple Online and Realtime Tracking (SORT) is a pragmatic approach to
multiple object tracking with a focus on simple, effective algorithms. In this
paper, we integrate appearance information to improve the performance of SORT.
Due to this extension we are able to track objects through longer periods of
occlusions, effectively reducing the number of identity switches. In spirit of
the original framework we place much of the computational complexity into an
offline pre-training stage where we learn a deep association metric on a
large-scale person re-identification dataset. During online application, we
establish measurement-to-track associations using nearest neighbor queries in
visual appearance space. Experimental evaluation shows that our extensions
reduce the number of identity switches by 45%, achieving overall competitive
performance at high frame rates.Comment: 5 pages, 1 figur
Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles
We present a novel, realtime algorithm to compute the trajectory of each
pedestrian in moderately dense crowd scenes. Our formulation is based on an
adaptive particle filtering scheme that uses a multi-agent motion model based
on velocity-obstacles, and takes into account local interactions as well as
physical and personal constraints of each pedestrian. Our method dynamically
changes the number of particles allocated to each pedestrian based on different
confidence metrics. Additionally, we use a new high-definition crowd video
dataset, which is used to evaluate the performance of different pedestrian
tracking algorithms. This dataset consists of videos of indoor and outdoor
scenes, recorded at different locations with 30-80 pedestrians. We highlight
the performance benefits of our algorithm over prior techniques using this
dataset. In practice, our algorithm can compute trajectories of tens of
pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per
second). To the best of our knowledge, our approach is 4-5 times faster than
prior methods, which provide similar accuracy
Multi-camera Realtime 3D Tracking of Multiple Flying Animals
Automated tracking of animal movement allows analyses that would not
otherwise be possible by providing great quantities of data. The additional
capability of tracking in realtime - with minimal latency - opens up the
experimental possibility of manipulating sensory feedback, thus allowing
detailed explorations of the neural basis for control of behavior. Here we
describe a new system capable of tracking the position and body orientation of
animals such as flies and birds. The system operates with less than 40 msec
latency and can track multiple animals simultaneously. To achieve these
results, a multi target tracking algorithm was developed based on the Extended
Kalman Filter and the Nearest Neighbor Standard Filter data association
algorithm. In one implementation, an eleven camera system is capable of
tracking three flies simultaneously at 60 frames per second using a gigabit
network of nine standard Intel Pentium 4 and Core 2 Duo computers. This
manuscript presents the rationale and details of the algorithms employed and
shows three implementations of the system. An experiment was performed using
the tracking system to measure the effect of visual contrast on the flight
speed of Drosophila melanogaster. At low contrasts, speed is more variable and
faster on average than at high contrasts. Thus, the system is already a useful
tool to study the neurobiology and behavior of freely flying animals. If
combined with other techniques, such as `virtual reality'-type computer
graphics or genetic manipulation, the tracking system would offer a powerful
new way to investigate the biology of flying animals.Comment: pdfTeX using libpoppler 3.141592-1.40.3-2.2 (Web2C 7.5.6), 18 pages
with 9 figure
An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual contents and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to ef- fectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in details
and provide quantitative performance results, demonstrating our system can be
run comfortably and reliably for different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning.Comment: IEEE International Symposium on Multimedia 201
Realtime State Estimation with Tactile and Visual sensing. Application to Planar Manipulation
Accurate and robust object state estimation enables successful object
manipulation. Visual sensing is widely used to estimate object poses. However,
in a cluttered scene or in a tight workspace, the robot's end-effector often
occludes the object from the visual sensor. The robot then loses visual
feedback and must fall back on open-loop execution.
In this paper, we integrate both tactile and visual input using a framework
for solving the SLAM problem, incremental smoothing and mapping (iSAM), to
provide a fast and flexible solution. Visual sensing provides global pose
information but is noisy in general, whereas contact sensing is local, but its
measurements are more accurate relative to the end-effector. By combining them,
we aim to exploit their advantages and overcome their limitations. We explore
the technique in the context of a pusher-slider system. We adapt iSAM's
measurement cost and motion cost to the pushing scenario, and use an
instrumented setup to evaluate the estimation quality with different object
shapes, on different surface materials, and under different contact modes
A Robust Zero-Calibration RF-based Localization System for Realistic Environments
Due to the noisy indoor radio propagation channel, Radio Frequency (RF)-based
location determination systems usually require a tedious calibration phase to
construct an RF fingerprint of the area of interest. This fingerprint varies
with the used mobile device, changes of the transmit power of smart access
points (APs), and dynamic changes in the environment; requiring re-calibration
of the area of interest; which reduces the technology ease of use. In this
paper, we present IncVoronoi: a novel system that can provide zero-calibration
accurate RF-based indoor localization that works in realistic environments. The
basic idea is that the relative relation between the received signal strength
from two APs at a certain location reflects the relative distance from this
location to the respective APs. Building on this, IncVoronoi incrementally
reduces the user ambiguity region based on refining the Voronoi tessellation of
the area of interest. IncVoronoi also includes a number of modules to
efficiently run in realtime as well as to handle practical deployment issues
including the noisy wireless environment, obstacles in the environment,
heterogeneous devices hardware, and smart APs. We have deployed IncVoronoi on
different Android phones using the iBeacons technology in a university campus.
Evaluation of IncVoronoi with a side-by-side comparison with traditional
fingerprinting techniques shows that it can achieve a consistent median
accuracy of 2.8m under different scenarios with a low beacon density of one
beacon every 44m2. Compared to fingerprinting techniques, whose accuracy
degrades by at least 156%, this accuracy comes with no training overhead and is
robust to the different user devices, different transmit powers, and over
temporal changes in the environment. This highlights the promise of IncVoronoi
as a next generation indoor localization system.Comment: 9 pages, 13 figures, published in SECON 201
- …