Fast and Efficient Model for Real-Time Tiger Detection In The Wild
The highest-accuracy object detectors to date are based either on a
two-stage approach such as Faster R-CNN or on one-stage detectors such as
RetinaNet or SSD with deep and complex backbones. In this paper we present
TigerNet, a simple yet efficient FPN-based network architecture for Amur
tiger detection in the wild. The model has 600k parameters, requires 0.071
GFLOPs per image, and can run on edge devices (smart cameras) in near real
time. In addition, we introduce a two-stage semi-supervised learning
approach based on pseudo-labelling to distill knowledge from larger
networks. On the ATRW-ICCV 2019 tiger detection sub-challenge, judged by
the public leaderboard score, our approach shows superior performance in
comparison to other methods.
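The abstract does not spell out the pseudo-labelling procedure, so the
following is only a minimal sketch of the general idea: a large teacher
detector labels unlabelled images, and the small student trains on those
labels. The torchvision-style detector interface, the confidence
threshold, and all names here are illustrative assumptions, not the
paper's actual setup.

```python
# Sketch of semi-supervised distillation via pseudo-labelling.
# `teacher` and `student` are hypothetical torchvision-style detectors;
# the paper's architectures and thresholds are not given in the abstract.
import torch

@torch.no_grad()
def make_pseudo_labels(teacher, images, score_threshold=0.7):
    """Stage 1: keep only the teacher's confident detections as targets."""
    teacher.eval()
    outputs = teacher(images)  # list of {"boxes", "labels", "scores"} dicts
    targets = []
    for out in outputs:
        keep = out["scores"] > score_threshold
        targets.append({"boxes": out["boxes"][keep],
                        "labels": out["labels"][keep]})
    return targets

def distill_step(student, optimizer, images, pseudo_targets):
    """Stage 2: train the compact student on the pseudo-labels."""
    student.train()
    loss = sum(student(images, pseudo_targets).values())  # detection losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```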
DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
We present DeblurGAN, an end-to-end learned method for motion deblurring.
The learning is based on a conditional GAN and a content loss. DeblurGAN
achieves state-of-the-art performance in both the structural similarity
measure and visual appearance. The quality of the deblurring model is also
evaluated in a novel way on a real-world problem -- object detection on
(de-)blurred images. The method is 5 times faster than the closest
competitor, DeepDeblur. We also introduce a novel method for generating
synthetic motion-blurred images from sharp ones, allowing realistic
dataset augmentation.
The model, code and the dataset are available at
https://github.com/KupynOrest/DeblurGAN
Comment: CVPR 2018 camera-ready
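As a rough illustration of the objective the abstract describes (an
adversarial term plus a content loss), here is a minimal sketch; the VGG
layer cut-off, the WGAN-style critic term, and the loss weight are
assumptions for illustration, not necessarily the paper's exact settings.

```python
# Sketch of a generator objective combining an adversarial loss with a
# VGG-based content (perceptual) loss. Weights and layers are illustrative.
import torch.nn.functional as F
from torchvision.models import vgg19

vgg_features = vgg19(weights="IMAGENET1K_V1").features[:15].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)  # the content-loss network stays frozen

def generator_loss(critic, restored, sharp, content_weight=100.0):
    adv = -critic(restored).mean()  # WGAN-style adversarial term (assumed)
    content = F.mse_loss(vgg_features(restored), vgg_features(sharp))
    return adv + content_weight * content
```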
FEAR: Fast, Efficient, Accurate and Robust Visual Tracker
We present FEAR, a family of fast, efficient, accurate, and robust Siamese
visual trackers. We introduce a novel and efficient way to benefit from a
dual-template representation for object model adaptation, which
incorporates temporal information with only a single learnable parameter.
We further improve the tracker architecture with a pixel-wise fusion
block. By plugging in sophisticated backbones with the above modules, the
FEAR-M and FEAR-L trackers surpass most Siamese trackers on several
academic benchmarks in both accuracy and efficiency. With a lightweight
backbone, the optimized version FEAR-XS offers tracking more than 10 times
faster than current Siamese trackers while maintaining near
state-of-the-art results. The FEAR-XS tracker is 2.4x smaller and 4.3x
faster than LightTrack, with superior accuracy. In addition, we expand the
definition of model efficiency by introducing the FEAR benchmark, which
assesses energy consumption and execution speed. We show that energy
consumption is a limiting factor for trackers on mobile devices. Source
code, pretrained models, and the evaluation protocol are available at
https://github.com/PinataFarms/FEARTracker
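The dual-template idea with a single learnable parameter suggests a
learned interpolation between a static (first-frame) template and a
dynamically updated one. The sketch below is one plausible reading of
that description, not the released implementation.

```python
# Hypothetical dual-template object model: one learnable scalar mixes the
# static template features with dynamic (recent-frame) template features.
import torch
import torch.nn as nn

class DualTemplate(nn.Module):
    def __init__(self):
        super().__init__()
        self.mix = nn.Parameter(torch.zeros(1))  # the single learnable parameter

    def forward(self, static_feat, dynamic_feat):
        w = torch.sigmoid(self.mix)  # keep the mixing weight in (0, 1)
        return (1 - w) * static_feat + w * dynamic_feat
```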
DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image
We present DAD-3DHeads, a dense and diverse large-scale dataset, and a robust
model for 3D Dense Head Alignment in the wild. It contains annotations of
over 3.5K landmarks that accurately represent 3D head shape, validated
against ground-truth scans. The data-driven model, DAD-3DNet, trained on
our dataset,
learns shape, expression, and pose parameters, and performs 3D reconstruction
of a FLAME mesh. The model also incorporates a landmark prediction branch to
take advantage of rich supervision and co-training of multiple related tasks.
Experimentally, DAD-3DNet outperforms or is comparable to the state-of-the-art
models in (i) 3D Head Pose Estimation on AFLW2000-3D and BIWI, (ii) 3D Face
Shape Reconstruction on NoW and Feng, and (iii) 3D Dense Head Alignment
and 3D Landmark Estimation on the DAD-3DHeads dataset. Finally, the
diversity of
DAD-3DHeads in camera angles, facial expressions, and occlusions enables a
benchmark to study in-the-wild generalization and robustness to distribution
shifts. The dataset webpage is https://p.farm/research/dad-3dheads
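To make the multi-task design concrete, a hypothetical head layout could
look like the sketch below: a shared backbone feature feeds separate
regressors for FLAME shape, expression, and pose, plus a landmark branch
for co-training. The head sizes follow common FLAME conventions but are
assumptions, not the paper's exact configuration.

```python
# Hypothetical multi-head regressor over a shared backbone feature.
import torch.nn as nn

class MultiHead3DFace(nn.Module):
    def __init__(self, feat_dim=512, n_landmarks=68):
        super().__init__()
        self.shape = nn.Linear(feat_dim, 300)       # FLAME shape coefficients
        self.expression = nn.Linear(feat_dim, 100)  # FLAME expression coefficients
        self.pose = nn.Linear(feat_dim, 6)          # rotation + jaw pose (assumed)
        self.landmarks = nn.Linear(feat_dim, n_landmarks * 2)  # co-training branch

    def forward(self, feat):
        return {"shape": self.shape(feat),
                "expression": self.expression(feat),
                "pose": self.pose(feat),
                "landmarks": self.landmarks(feat)}
```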
Self-supervised Blur Detection from Synthetically Blurred Scenes
Blur detection aims at segmenting the blurred areas of a given image. Recent deep learning-based methods approach this problem by learning an end-to-end mapping between the blurred input and a binary mask representing the localization of its blurred areas. Nevertheless, the effectiveness of such deep models is limited by the scarcity of datasets annotated in terms of blur segmentation, as blur annotation is labour intensive. In this work, we bypass the need for such annotated datasets for end-to-end learning and instead rely on object proposals and a model for blur generation to produce a dataset of synthetically blurred images. This allows us to perform self-supervised learning over the generated image and ground-truth blur mask pairs using CNNs, defining a framework that can be employed in purely self-supervised, weakly supervised or semi-supervised configurations. Interestingly, experimental results of such setups over the largest blur segmentation datasets available show that this approach achieves state-of-the-art results in blur segmentation, even without ever observing any real blurred image.
This research was partially funded by the Basque Government's Industry Department under the ELKARTEK program's project ONKOIKER under agreement KK2018/00090. We thank the Spanish project TIN2016-79717-R and the Generalitat de Catalunya CERCA Program.
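The key trick is that the blur mask comes for free: whatever region is
synthetically blurred is, by construction, the ground truth. A minimal
sketch of generating such training pairs follows; the rectangular
"proposal" and the Gaussian blur model stand in for the paper's object
proposals and blur generation model.

```python
# Generate a (partially blurred image, blur mask) training pair from a
# sharp image. Region and blur model are simplified stand-ins.
import numpy as np
import cv2

def make_pair(sharp, box, ksize=15):
    """sharp: HxWx3 uint8 image; box: (x0, y0, x1, y1) proposal region."""
    x0, y0, x1, y1 = box
    mask = np.zeros(sharp.shape[:2], dtype=np.uint8)
    mask[y0:y1, x0:x1] = 1                  # ground-truth blur mask, for free
    blurred = cv2.GaussianBlur(sharp, (ksize, ksize), 0)
    out = sharp.copy()
    out[mask == 1] = blurred[mask == 1]     # blur only inside the proposal
    return out, mask
```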
Conditional Adversarial Networks for Blind Image Deblurring
We present DeblurGAN, an end-to-end learning approach for motion
deblurring based on a conditional GAN and a content loss. DeblurGAN
achieves state-of-the-art results in the structural similarity measure
and in visual appearance. The quality of the deblurring model is also
evaluated in a novel way on a real-world problem -- object detection on
(de-)blurred images. The method is 5 times faster than the closest
competitor.
In addition, we present a novel method of generating synthetic
motion-blurred images from sharp ones, which allows realistic dataset
augmentation.
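One simple way to synthesise motion blur from a sharp image, in the
spirit of the method described above, is to generate a random camera-shake
trajectory, rasterise it into a blur kernel, and convolve. The random-walk
trajectory and all parameters below are illustrative assumptions; the
paper's trajectory model is more elaborate.

```python
# Sketch: random trajectory -> blur kernel -> synthetic motion blur.
import numpy as np
import cv2

def random_motion_kernel(size=17, steps=64, jitter=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    velocity = rng.normal(size=2)
    position, points = np.zeros(2), []
    for _ in range(steps):
        velocity += rng.normal(scale=jitter, size=2)  # random walk in velocity
        position += 0.1 * velocity
        points.append(position.copy())
    pts = np.array(points)
    pts -= pts.min(axis=0)
    pts = (pts / (pts.max() + 1e-8) * (size - 1)).astype(int)
    kernel = np.zeros((size, size), dtype=np.float32)
    kernel[pts[:, 1], pts[:, 0]] = 1.0        # rasterise the trajectory
    return kernel / kernel.sum()

def synthetic_motion_blur(sharp, kernel):
    return cv2.filter2D(sharp, -1, kernel)    # convolve the sharp image
```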
Money For Nothing? Opportunity Zones and Causal Inference
The Tax Cuts and Jobs Act of 2017 permitted US state governments to designate selected low-income census tracts as "Opportunity Zones" (OZs). This designation permitted investors in projects located in OZs to avoid or defer capital gains taxes on their investments. The provision was intended to increase investment in OZs, raising the incomes of households in the designated census tracts. The process of OZ designation was not uniformly transparent, with some indications that areas with significant outside investment already planned were more likely to receive OZ designations. This situation poses a challenge for traditional causal inference techniques, such as difference-in-differences. In this paper, an alternative set of assumptions is used to evaluate the effect of OZ designation on growth in median household income. The results suggest that the Opportunity Zones program has had a positive effect on income growth in areas that received the designation, but they highlight the significant uncertainty involved in such an estimate.
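For readers unfamiliar with the baseline the paper departs from, a
two-period difference-in-differences estimate reduces to comparing income
changes across designated and non-designated tracts, as in the sketch
below (data and column names are hypothetical). Its parallel-trends
assumption is precisely what non-random OZ designation undermines.

```python
# Two-period difference-in-differences on hypothetical tract-level data.
import pandas as pd

def did_estimate(df: pd.DataFrame) -> float:
    """Columns (hypothetical): 'oz' (0/1 designation), 'period'
    ('pre'/'post'), 'income' (median household income)."""
    means = df.groupby(["oz", "period"])["income"].mean()
    treated_change = means[(1, "post")] - means[(1, "pre")]
    control_change = means[(0, "post")] - means[(0, "pre")]
    return treated_change - control_change  # DiD estimate of the designation effect
```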