425 research outputs found
DeepPose: Human Pose Estimation via Deep Neural Networks
We propose a method for human pose estimation based on Deep Neural Networks
(DNNs). The pose estimation is formulated as a DNN-based regression problem
towards body joints. We present a cascade of such DNN regressors which results
in high precision pose estimates. The approach has the advantage of reasoning
about pose in a holistic fashion and has a simple yet powerful formulation
which capitalizes on recent advances in Deep Learning. We present a detailed
empirical analysis with state-of-the-art or better performance on four academic
benchmarks of diverse real-world images.
Comment: IEEE Conference on Computer Vision and Pattern Recognition, 201
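As a rough illustration of the cascaded-regression idea (not the paper's actual networks: the stage functions below are hypothetical stand-ins for the DNN regressors), stage 0 predicts joints holistically and each later stage predicts a correction to the current estimate:

```python
import numpy as np

def cascade_pose_estimate(image_feats, stages):
    """Cascade of regressors in the spirit of DeepPose: stage 0 predicts
    joint coordinates holistically from the whole image; each later stage
    predicts a displacement that refines the current estimate."""
    pose = stages[0](image_feats, None)           # initial holistic estimate
    for refine in stages[1:]:
        pose = pose + refine(image_feats, pose)   # additive refinement
    return pose

# Toy demonstration: each refinement stage moves halfway toward a known
# target pose, mimicking how later stages correct earlier errors.
target = np.array([0.3, 0.7])
stage0 = lambda feats, pose: np.zeros(2)
refine = lambda feats, pose: 0.5 * (target - pose)
pose = cascade_pose_estimate(None, [stage0, refine, refine])
```

In the paper the refinement stages additionally re-crop the image around the current estimate; that detail is omitted here for brevity.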
Racah Sum Rule and Biedenharn-Elliott Identity for the Super-Rotation 6-j Symbols
It is shown that the well-known Racah sum rule and Biedenharn-Elliott
identity satisfied by the recoupling coefficients, or 6-j symbols, of
the usual rotation algebra can be extended to the corresponding
features of the super-rotation superalgebra. The structure of the
sum rules is completely similar in both cases; the only difference
concerns the signs, which are more involved in the super-rotation case.
Comment: 9 pages. Two misprints corrected
Testing and Comparing Value-at-Risk Measures in the Bulgarian Stock Market
The purpose of this thesis is to compare commonly used Value-at-Risk measures calculated through Historical and Monte Carlo simulations and to answer the question of whether these measures adequately capture market risk in an EU new-member country. A data set of daily price returns for the ten-year period from 24 October 2000 to 30 April 2010 was collected for the following market indices: SOFIX, S&P 500, NASDAQ, OMXS, FTSE 100 and DAX, to give a representative overview of the developed world markets and compare them with the new EU member state Bulgaria. The behaviour of Value-at-Risk models with 99% and 95% confidence levels using rolling data windows of 100 and 250 days is analyzed with the help of a range of backtesting procedures.
The employed tests revealed that the distribution of daily returns of the SOFIX index differs significantly from the Normal distribution, with high kurtosis and large negative skewness. The highest Value-at-Risk violation levels were observed during periods with steep volatility jumps, which indicates that the measure reacts poorly to volatility changes and underestimates risk in turbulent market conditions. Based on the backtesting results, it can be concluded that VaR models that are commonly used in developed stock markets are not well suited for measuring market risk in EU new member states.
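A minimal sketch of the historical-simulation variant described above (rolling window, 99% confidence), together with a hypothetical violation-rate helper of the kind a backtest such as Kupiec's compares against 1%. This is illustrative only, not the thesis's actual code:

```python
import numpy as np

def historical_var(returns, window=250, alpha=0.99):
    """Rolling historical-simulation VaR: for each day, VaR is the
    (1 - alpha) empirical quantile of the previous `window` returns,
    reported as a positive loss number."""
    var = np.full(len(returns), np.nan)
    for t in range(window, len(returns)):
        var[t] = -np.quantile(returns[t - window:t], 1 - alpha)
    return var

def violation_rate(returns, var):
    """Fraction of days on which the realised loss exceeded VaR --
    the quantity a backtest compares to the nominal 1 - alpha."""
    mask = ~np.isnan(var)
    violations = returns[mask] < -var[mask]
    return float(violations.mean())

# Synthetic i.i.d. returns as a stand-in for index data.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, 2000)
var99 = historical_var(returns, window=250, alpha=0.99)
rate = violation_rate(returns, var99)
```

On i.i.d. data the violation rate should sit near 1%; the thesis's finding is precisely that on SOFIX data it clusters far above that during volatility jumps.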
Risks and Prospects of Smart Electric Grids Systems measured with Real Options
Scalable Object Detection using Deep Neural Networks
Deep convolutional neural networks have recently achieved state-of-the-art
performance on a number of image recognition benchmarks, including the ImageNet
Large-Scale Visual Recognition Challenge (ILSVRC-2012). The winning model on
the localization sub-task was a network that predicts a single bounding box and
a confidence score for each object category in the image. Such a model captures
the whole-image context around the objects but cannot handle multiple instances
of the same object in the image without naively replicating the number of
outputs for each instance. In this work, we propose a saliency-inspired neural
network model for detection, which predicts a set of class-agnostic bounding
boxes along with a single score for each box, corresponding to its likelihood
of containing any object of interest. The model naturally handles a variable
number of instances for each class and allows for cross-class generalization at
the highest levels of the network. We are able to obtain competitive
recognition performance on VOC2007 and ILSVRC2012, while using only the top few
predicted locations in each image and a small number of neural network
evaluations.
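The selection step can be caricatured as follows (an illustrative sketch, not the paper's network: the scores stand in for the single class-agnostic "objectness" score the model predicts per box):

```python
def top_scoring_boxes(boxes, scores, k=10):
    """Keep the k class-agnostic boxes most likely to contain any object;
    a recognizer then only needs to run on these few locations, which is
    what keeps the number of network evaluations small."""
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    return [boxes[i] for i in order], [scores[i] for i in order]

# Hypothetical predictions: three boxes (x1, y1, x2, y2) with objectness scores.
boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (8, 2, 12, 9)]
kept, kept_scores = top_scoring_boxes(boxes, [0.2, 0.9, 0.5], k=2)
```

Because the score is class-agnostic, the same k outputs cover any number of instances of any class, instead of one output slot per category.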
Sim2Real View Invariant Visual Servoing by Recurrent Control
Humans are remarkably proficient at controlling their limbs and tools from a
wide range of viewpoints and angles, even in the presence of optical
distortions. In robotics, this ability is referred to as visual servoing:
moving a tool or end-point to a desired location using primarily visual
feedback. In this paper, we study how viewpoint-invariant visual servoing
skills can be learned automatically in a robotic manipulation scenario. To this
end, we train a deep recurrent controller that can automatically determine
which actions move the end-point of a robotic arm to a desired object. The
problem that must be solved by this controller is fundamentally ambiguous:
under severe variation in viewpoint, it may be impossible to determine the
actions in a single feedforward operation. Instead, our visual servoing system
must use its memory of past movements to understand how the actions affect the
robot motion from the current viewpoint, correcting mistakes and gradually
moving closer to the target. This ability is in stark contrast to most visual
servoing methods, which either assume known dynamics or require a calibration
phase. We show how we can learn this recurrent controller using simulated data
and a reinforcement learning objective. We then describe how the resulting
model can be transferred to a real-world robot by disentangling perception from
control and only adapting the visual layers. The adapted model can servo to
previously unseen objects from novel viewpoints on a real-world Kuka IIWA
robotic arm. For supplementary videos, see:
https://fsadeghi.github.io/Sim2RealViewInvariantServo
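The control loop described above can be caricatured with a single recurrent update (a toy numpy sketch under assumed dimensions, not the paper's architecture; in the paper the weights are learned with reinforcement learning in simulation and the visual layers are later adapted to the real robot):

```python
import numpy as np

def recurrent_servo_step(obs, action_prev, hidden, W_h, W_x, W_out):
    """One step of a simplified recurrent servoing controller: the hidden
    state accumulates past observations and actions, letting the policy
    infer how its commands move the arm under an unknown viewpoint."""
    x = np.concatenate([obs, action_prev])
    hidden = np.tanh(W_h @ hidden + W_x @ x)   # memory of past movements
    action = W_out @ hidden                    # next motor command
    return action, hidden

# Toy rollout with fixed (untrained) weights: 3-D observation, 2-D action.
rng = np.random.default_rng(1)
W_h = rng.normal(size=(4, 4))
W_x = rng.normal(size=(4, 5))
W_out = rng.normal(size=(2, 4))
hidden, action = np.zeros(4), np.zeros(2)
for _ in range(5):
    obs = rng.normal(size=3)   # stand-in for visual features of the scene
    action, hidden = recurrent_servo_step(obs, action, hidden, W_h, W_x, W_out)
```

The point of the recurrence is exactly the ambiguity the abstract describes: a single feedforward pass cannot resolve the viewpoint, but the hidden state can, by observing how previous actions moved the arm.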
Show and Tell: A Neural Image Caption Generator
Automatically describing the content of an image is a fundamental problem in
artificial intelligence that connects computer vision and natural language
processing. In this paper, we present a generative model based on a deep
recurrent architecture that combines recent advances in computer vision and
machine translation and that can be used to generate natural sentences
describing an image. The model is trained to maximize the likelihood of the
target description sentence given the training image. Experiments on several
datasets show the accuracy of the model and the fluency of the language it
learns solely from image descriptions. Our model is often quite accurate, which
we verify both qualitatively and quantitatively. For instance, while the
current state-of-the-art BLEU-1 score (the higher the better) on the Pascal
dataset is 25, our approach yields 59, to be compared to human performance
around 69. We also show BLEU-1 score improvements on Flickr30k, from 56 to 66,
and on SBU, from 19 to 28. Lastly, on the newly released COCO dataset, we
achieve a BLEU-4 of 27.7, which is the current state-of-the-art.
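For reference, the BLEU-1 metric quoted above (reported on a 0-100 scale in the abstract) is clipped unigram precision with a brevity penalty; a minimal single-reference sketch, not the paper's evaluation code:

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """BLEU-1 against a single reference: clipped unigram precision
    times a brevity penalty. Returns a score in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Each candidate word is credited at most as often as it appears
    # in the reference ("clipping"), so "a a a" cannot game the metric.
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = clipped / max(len(cand), 1)
    # Brevity penalty discourages trivially short captions.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision
```

Standard BLEU (including the BLEU-4 figure on COCO) averages clipped n-gram precisions up to n = 4 over multiple references; this sketch shows only the unigram case.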
Deep Convolutional Ranking for Multilabel Image Annotation
Multilabel image annotation is one of the most important challenges in
computer vision with many real-world applications. While existing work usually
uses conventional visual features for multilabel annotation, features based on
Deep Neural Networks have shown the potential to significantly boost
performance. In this work, we propose to leverage the advantage of such
features and analyze the key components that lead to better performance.
Specifically, we show that a significant performance gain can be obtained by
combining convolutional architectures with approximate top-k ranking
objectives, as they naturally fit the multilabel tagging problem. In our
experiments on the NUS-WIDE dataset, our approach outperforms conventional
visual features by about 10%, obtaining the best reported performance in the
literature.
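One way such an approximate top-k objective can look (a hypothetical hinge surrogate for illustration; the paper's exact loss may differ): each positive label is penalised for falling below the k-th highest negative score plus a margin:

```python
import numpy as np

def topk_ranking_loss(scores, positive, k=3, margin=1.0):
    """Hinge surrogate for top-k ranking: each positive label pays a
    penalty proportional to how far its score falls below the k-th
    highest negative score plus a margin, pushing the true tags into
    the top of the predicted label ranking."""
    scores = np.asarray(scores, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    neg = np.sort(scores[~positive])[::-1]    # negative scores, descending
    thresh = neg[min(k, len(neg)) - 1]        # k-th highest negative score
    return float(np.sum(np.maximum(0.0, margin + thresh - scores[positive])))
```

Such a loss matches multilabel tagging better than per-label classification because evaluation only looks at the few top-ranked tags per image.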