CT-SRCNN: Cascade Trained and Trimmed Deep Convolutional Neural Networks for Image Super Resolution
We propose methodologies to train highly accurate and efficient deep
convolutional neural networks (CNNs) for image super resolution (SR). A cascade
training approach to deep learning is proposed to improve the accuracy of the
neural networks while gradually increasing the number of network layers. Next,
we explore how to improve the SR efficiency by making the network slimmer. Two
methodologies, the one-shot trimming and the cascade trimming, are proposed.
With the cascade trimming, the network's size is gradually reduced layer by
layer, without significant loss on its discriminative ability. Experiments on
benchmark image datasets show that our proposed SR network achieves the
state-of-the-art super resolution accuracy, while being more than 4 times
faster compared to existing deep super resolution networks.
Comment: Accepted to IEEE Winter Conf. on Applications of Computer Vision
(WACV) 2018, Lake Tahoe, US
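The layer-by-layer trimming described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how filters are chosen for removal, so ranking filters by L1 norm is an assumption here, as are the function names, the nested-list weight layout, and the `fine_tune` callback (a hypothetical placeholder for retraining between steps).

```python
def filter_l1(filt):
    """L1 norm of one filter, stored as a list of per-input-channel weight lists."""
    return sum(abs(w) for channel in filt for w in channel)


def trim_layer(layers, i, keep):
    """Keep the `keep` largest-norm filters of layer i (assumed criterion) and
    drop the matching input channels of layer i + 1 so shapes stay consistent."""
    ranked = sorted(range(len(layers[i])),
                    key=lambda j: filter_l1(layers[i][j]),
                    reverse=True)
    kept = sorted(ranked[:keep])
    out = list(layers)
    out[i] = [layers[i][j] for j in kept]
    if i + 1 < len(layers):
        out[i + 1] = [[f[j] for j in kept] for f in layers[i + 1]]
    return out


def cascade_trim(layers, keep_per_layer, fine_tune=lambda net: net):
    """Trim one layer at a time, calling the (hypothetical) fine_tune hook after
    each step. keep_per_layer should stop before the last layer so that the
    number of network outputs is unchanged."""
    net = layers
    for i, keep in enumerate(keep_per_layer):
        net = fine_tune(trim_layer(net, i, keep))
    return net


# Toy network: 3 filters -> 2 filters -> 1 filter, one weight per channel.
toy = [
    [[[1.0]], [[0.1]], [[-2.0]]],
    [[[1.0], [5.0], [1.0]], [[0.5], [0.5], [0.5]]],
    [[[1.0], [1.0]]],
]
slim = cascade_trim(toy, keep_per_layer=[2, 1])
```

A one-shot variant under the same assumptions would trim every layer first and call `fine_tune` once at the end.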
Cascaded 3D Full-body Pose Regression from Single Depth Image at 100 FPS
Real-time live applications in virtual reality are increasingly common, and
capturing and retargeting 3D human pose plays an important role in them. But it
is still challenging to estimate accurate 3D pose from consumer imaging devices
such as depth cameras. This paper presents a novel cascaded 3D full-body pose
regression method to estimate accurate pose from a single depth image at 100
fps. The key idea is to train cascaded regressors based on the Gradient Boosting
algorithm from a pre-recorded human motion capture database. By incorporating
a hierarchical kinematics model of human pose into the learning procedure, we can
directly estimate accurate 3D joint angles instead of joint positions. The
biggest advantage of this model is that the bone length can be preserved during
the whole 3D pose estimation procedure, which leads to more effective features
and higher pose estimation accuracy. Our method can be used as an
initialization procedure when combined with tracking methods. We demonstrate
the power of our method on a wide range of synthesized human motion data from
the CMU mocap database, the Human3.6M dataset, and real human movement data
captured in real time. In our comparison against previous 3D pose estimation
methods and commercial systems such as Kinect 2017, we achieve state-of-the-art
accuracy.
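The claim that regressing joint angles preserves bone length can be seen from forward kinematics: positions are computed from fixed bone lengths plus angles, so no predicted angle can stretch a bone. The sketch below uses a planar (2D) chain for brevity, whereas the paper works with a 3D hierarchical model; the function name and the example values are illustrative assumptions, not taken from the paper.

```python
import math


def forward_kinematics(bone_lengths, joint_angles, root=(0.0, 0.0)):
    """Return joint positions of a planar kinematic chain.

    Each angle is relative to the parent bone's direction, so the chain is
    hierarchical: rotating a parent joint moves every joint below it.
    """
    x, y = root
    heading = 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle                      # accumulate rotation down the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical two-bone limb (lengths in metres) with arbitrary angles.
lengths = [0.30, 0.25]
pose = forward_kinematics(lengths, [math.pi / 2, -math.pi / 4])

# Bone lengths recovered from the output positions match the inputs.
recovered = [math.dist(a, b) for a, b in zip(pose, pose[1:])]
```

By contrast, a regressor that outputs joint positions directly has no such constraint, and noisy predictions can change the distance between connected joints.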
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution
Convolutional neural networks have recently demonstrated high-quality
reconstruction for single-image super-resolution. In this paper, we propose the
Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively
reconstruct the sub-band residuals of high-resolution images. At each pyramid
level, our model takes coarse-resolution feature maps as input, predicts the
high-frequency residuals, and uses transposed convolutions for upsampling to
the finer level. Our method does not require the bicubic interpolation as the
pre-processing step and thus dramatically reduces the computational complexity.
We train the proposed LapSRN with deep supervision using a robust Charbonnier
loss function and achieve high-quality reconstruction. Furthermore, our network
generates multi-scale predictions in one feed-forward pass through the
progressive reconstruction, thereby facilitating resource-aware applications.
Extensive quantitative and qualitative evaluations on benchmark datasets show
that the proposed algorithm performs favorably against the state-of-the-art
methods in terms of speed and accuracy.
Comment: This work is accepted in CVPR 2017. The code and datasets are
available on http://vllab.ucmerced.edu/wlai24/LapSRN
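As a rough illustration of the Charbonnier loss with deep supervision mentioned in the abstract above, the sketch below computes the penalty sqrt(x^2 + eps^2) on the residual at each pyramid level and sums the per-level averages. The abstract gives neither the value of eps nor the level weighting, so `eps=1e-3`, the equal weighting, and the flat-list image representation are assumptions made for this example.

```python
import math


def charbonnier(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt((p - t)^2 + eps^2) over all pixels.

    Behaves like an L1 loss for large errors but stays differentiable at zero.
    """
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)


def deeply_supervised_loss(preds_per_level, targets_per_level, eps=1e-3):
    """Sum the loss over pyramid levels, so every intermediate scale is
    supervised by a ground-truth image of the matching resolution."""
    return sum(charbonnier(p, t, eps)
               for p, t in zip(preds_per_level, targets_per_level))


# Toy example: two pyramid levels with images flattened to lists of pixels.
preds = [[0.5, 0.5], [0.2, 0.4, 0.6, 0.8]]
targets = [[0.5, 0.7], [0.2, 0.4, 0.6, 0.8]]
loss = deeply_supervised_loss(preds, targets)
```

In a real training setup the same computation would be expressed with a tensor library so that gradients flow back through every level.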
- …