45,357 research outputs found
Learning Human Pose Estimation Features with Convolutional Networks
This paper introduces a new architecture for human pose estimation using a
multi- layer convolutional network architecture and a modified learning
technique that learns low-level features and higher-level weak spatial models.
Unconstrained human pose estimation is one of the hardest problems in computer
vision, and our new architecture and learning schema shows significant
improvement over the current state-of-the-art results. The main contribution of
this paper is showing, for the first time, that a specific variation of deep
learning is able to outperform all existing traditional architectures on this
task. The paper also discusses several lessons learned while researching
alternatives, most notably, that it is possible to learn strong low-level
feature detectors on features that might even just cover a few pixels in the
image. Higher-level spatial models improve somewhat the overall result, but to
a much lesser extent then expected. Many researchers previously argued that the
kinematic structure and top-down information is crucial for this domain, but
with our purely bottom up, and weak spatial model, we could improve other more
complicated architectures that currently produce the best results. This mirrors
what many other researchers, like those in the speech recognition, object
recognition, and other domains have experienced
Generalized Kernel-based Visual Tracking
In this work we generalize the plain MS trackers and attempt to overcome
standard mean shift trackers' two limitations.
It is well known that modeling and maintaining a representation of a target
object is an important component of a successful visual tracker.
However, little work has been done on building a robust template model for
kernel-based MS tracking. In contrast to building a template from a single
frame, we train a robust object representation model from a large amount of
data. Tracking is viewed as a binary classification problem, and a
discriminative classification rule is learned to distinguish between the object
and background. We adopt a support vector machine (SVM) for training. The
tracker is then implemented by maximizing the classification score. An
iterative optimization scheme very similar to MS is derived for this purpose.Comment: 12 page
Energy efficient enabling technologies for semantic video processing on mobile devices
Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This
thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the
human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and
reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing
any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art
Unmanned Aerial Systems for Wildland and Forest Fires
Wildfires represent an important natural risk causing economic losses, human
death and important environmental damage. In recent years, we witness an
increase in fire intensity and frequency. Research has been conducted towards
the development of dedicated solutions for wildland and forest fire assistance
and fighting. Systems were proposed for the remote detection and tracking of
fires. These systems have shown improvements in the area of efficient data
collection and fire characterization within small scale environments. However,
wildfires cover large areas making some of the proposed ground-based systems
unsuitable for optimal coverage. To tackle this limitation, Unmanned Aerial
Systems (UAS) were proposed. UAS have proven to be useful due to their
maneuverability, allowing for the implementation of remote sensing, allocation
strategies and task planning. They can provide a low-cost alternative for the
prevention, detection and real-time support of firefighting. In this paper we
review previous work related to the use of UAS in wildfires. Onboard sensor
instruments, fire perception algorithms and coordination strategies are
considered. In addition, we present some of the recent frameworks proposing the
use of both aerial vehicles and Unmanned Ground Vehicles (UV) for a more
efficient wildland firefighting strategy at a larger scale.Comment: A recent published version of this paper is available at:
https://doi.org/10.3390/drones501001
- …