360-degree Video Stitching for Dual-fisheye Lens Cameras Based On Rigid Moving Least Squares
Dual-fisheye lens cameras are becoming popular for 360-degree video capture,
especially for user-generated content (UGC), since they are affordable and
portable. Images generated by the dual-fisheye cameras have limited overlap and
hence require non-conventional stitching techniques to produce high-quality
360x180-degree panoramas. This paper introduces a novel method to align these
images using interpolation grids based on rigid moving least squares.
Furthermore, jitter is a critical issue when image-based stitching
algorithms are applied to video. It stems from the unconstrained
movement of the stitching boundary from one frame to another. Therefore, we also
propose a new algorithm to maintain the temporal coherence of the stitching
boundary to provide jitter-free 360-degree videos. Results show that the method
proposed in this paper can produce higher quality stitched images and videos
than prior work.
Comment: Preprint version
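
The abstract above does not describe how its interpolation grids are built, so the following is only a minimal sketch of the generic rigid moving least squares mapping for a single 2D point. It assumes points are stored as complex numbers, and the function name `rigid_mls_point` and the falloff parameter `alpha` are illustrative choices, not taken from the paper. Each control point is weighted by inverse distance to the query point, and the best weighted rotation plus translation is applied.

```python
import cmath


def rigid_mls_point(v, src, dst, alpha=1.0):
    """Map one 2D point with rigid moving least squares.

    Points are complex numbers (x + 1j*y). `src` and `dst` are matched
    control points. `alpha` (an illustrative parameter) controls how fast
    a control point's influence falls off with distance.
    """
    weights = []
    for p, q in zip(src, dst):
        d2 = abs(v - p) ** 2
        if d2 == 0.0:
            return q  # v lies on a control point: interpolate it exactly
        weights.append(1.0 / d2 ** alpha)

    total = sum(weights)
    # Weighted centroids of the source and target control points.
    p_star = sum(w * p for w, p in zip(weights, src)) / total
    q_star = sum(w * q for w, q in zip(weights, dst)) / total

    # In complex form, the best weighted rotation is the unit-length
    # version of sum(w * conj(p_hat) * q_hat).
    s = sum(
        w * (p - p_star).conjugate() * (q - q_star)
        for w, p, q in zip(weights, src, dst)
    )
    rotation = s / abs(s) if abs(s) > 0.0 else 1.0
    return (v - p_star) * rotation + q_star


if __name__ == "__main__":
    # Control points moved by a pure rotation and shift (made-up values).
    turn = cmath.exp(1j * 0.3)
    src = [0j, 1 + 0j, 1 + 1j, 1j]
    dst = [turn * p + (2 + 1j) for p in src]
    print(rigid_mls_point(0.25 + 0.6j, src, dst))
```

When the control points move by a single rotation and translation, every query point is mapped by that same transform. With other displacements, the rotation varies smoothly across the image. Evaluating the function at the vertices of a coarse grid and interpolating between them is one way such a mapping could be applied to whole frames.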
Predicting Aesthetic Score Distribution through Cumulative Jensen-Shannon Divergence
Aesthetic quality prediction is a challenging task in the computer vision
community because of the complex interplay with semantic contents and
photographic technologies. Recent deep-learning-based studies of
aesthetic quality assessment usually use a binary high-low label or a numerical
score to represent aesthetic quality. However, a scalar representation
cannot adequately describe the underlying variety in human perception of
aesthetics. In this work, we propose to predict the aesthetic score
distribution (i.e., a score distribution vector of the ordinal basic human
ratings) using a Deep Convolutional Neural Network (DCNN). Conventional DCNNs,
which aim to minimize the difference between the predicted scalar numbers or
vectors and the ground truth, cannot be directly used for the ordinal basic
rating distribution. Thus, a novel CNN based on the Cumulative distribution
with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic
score distribution of human ratings, with a new reliability-sensitive learning
method based on the kurtosis of the score distribution, which eliminates the
requirement of the original full data of human ratings (without normalization).
Experimental results on a large-scale aesthetic dataset demonstrate the
effectiveness of the proposed CJS-CNN in this task.
Comment: AAAI Conference on Artificial Intelligence (AAAI), New Orleans,
Louisiana, USA. 2-7 Feb. 201
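
The abstract above does not give the exact CJS loss, so the sketch below shows only one plausible Jensen-Shannon-style divergence computed on cumulative distributions. The function names and the particular formula are assumptions for illustration, not the paper's definition. The point is that comparing cumulative sums makes the measure sensitive to the ordinal distance between score bins.

```python
import math
from itertools import accumulate


def cumulative(dist):
    """Normalize a non-empty score histogram and return its cumulative sums."""
    total = sum(dist)
    return [c / total for c in accumulate(dist)]


def _half_term(a, m):
    # Sum of a_i * log(a_i / m_i), using the convention 0 * log(0) = 0.
    return sum(x * math.log(x / y) for x, y in zip(a, m) if x > 0.0)


def cumulative_js(p, q):
    """Jensen-Shannon-style divergence between two cumulative distributions.

    Illustrative formula (an assumption, not taken from the paper):
    0.5 * sum(P log(P/M)) + 0.5 * sum(Q log(Q/M)), with M = (P + Q) / 2,
    where P and Q are the cumulative versions of p and q.
    """
    cp, cq = cumulative(p), cumulative(q)
    mid = [(x + y) / 2.0 for x, y in zip(cp, cq)]
    return 0.5 * _half_term(cp, mid) + 0.5 * _half_term(cq, mid)


if __name__ == "__main__":
    low = [1, 0, 0, 0, 0]  # all raters chose the lowest score
    print(cumulative_js(low, [0, 1, 0, 0, 0]))  # neighbouring score bin
    print(cumulative_js(low, [0, 0, 0, 0, 1]))  # opposite end of the scale
```

A plain Jensen-Shannon divergence on the raw histograms would rate both pairs in the demo identically, because the supports do not overlap in either case. The cumulative version penalizes the distant pair more heavily.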
High Speed Mid-Wave Infrared Uni-traveling Carrier Photodetector
Mid-wave infrared (MWIR) frequency combs are expected to dramatically improve
the precision and sensitivity of molecular spectroscopy. For high-resolution
applications, a high-speed MWIR photodetector is one of the key components;
however, commercially available high-speed MWIR photodetectors currently have
only sub-GHz bandwidth. In this paper, we demonstrate, for the first time
to our knowledge, a high-speed MWIR uni-traveling carrier
photodetector based on an InAs/GaSb type-II superlattice (T2SL) at room
temperature. The device exhibits a cutoff wavelength of 5.6 μm and a 3 dB
bandwidth of 6.58 GHz for a 20 μm diameter device at 300 K. These promising
results show that the device has the potential to be used in high-speed applications
such as frequency comb spectroscopy and free-space communication, among others. The
limitations on the high-frequency performance of the photodetectors are also
discussed.
Traffic Danger Recognition With Surveillance Cameras Without Training Data
We propose a traffic danger recognition model that works with arbitrary
traffic surveillance cameras to identify and predict car crashes. There are too
many cameras to monitor manually. Therefore, we developed a model to predict
and identify car crashes from surveillance cameras based on a 3D reconstruction
of the road plane and prediction of trajectories. For normal traffic, it
supports real-time proactive safety checks of speeds and distances between
vehicles to provide insights about possible high-risk areas. We achieve good
prediction and recognition of car crashes without using any labeled training
data of crashes. Experiments on the BrnoCompSpeed dataset show that our model
can accurately monitor the road, with mean errors of 1.80% for distance
measurement, 2.77 km/h for speed measurement, 0.24 m for car position
prediction, and 2.53 km/h for speed prediction.
Comment: To be published in proceedings of Advanced Video and Signal-based
Surveillance (AVSS), 2018 15th IEEE International Conference on, pp. 378-383,
IEEE
Road Crack Detection Using Deep Convolutional Neural Network and Adaptive Thresholding
Cracks are among the most common road distresses and may pose road safety
hazards. Generally, crack detection is performed by either certified inspectors
or structural engineers. This task is, however, time-consuming, subjective and
labor-intensive. In this paper, we propose a novel road crack detection
algorithm based on deep learning and adaptive image segmentation. Firstly, a
deep convolutional neural network is trained to determine whether an image
contains cracks or not. The images containing cracks are then smoothed using
bilateral filtering, which greatly reduces the number of noisy pixels.
Finally, we utilize an adaptive thresholding method to extract the cracks from
the road surface. The experimental results illustrate that our network can classify
images with an accuracy of 99.92%, and the cracks can be successfully extracted
from the images using our proposed thresholding algorithm.
Comment: 6 pages, 8 figures, 2019 IEEE Intelligent Vehicles Symposium
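
The abstract above does not say which adaptive thresholding rule is used, so the following is a minimal sketch of the common local-mean variant, written in plain Python on a small grayscale grid. The function name, the `radius` and `offset` parameters, and the toy image are all hypothetical, and the bilateral smoothing step is omitted. A pixel is marked as a crack when it is darker than the mean of its neighbourhood by more than `offset`.

```python
def adaptive_threshold(img, radius=1, offset=10):
    """Mark pixels darker than their local mean minus `offset`.

    `img` is a list of rows of grayscale values. The result is a mask of
    the same shape, with 1 for candidate crack pixels and 0 elsewhere.
    `radius` and `offset` are illustrative tuning parameters.
    """
    height, width = len(img), len(img[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clip the square window at the image borders.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            if img[y][x] < local_mean - offset:
                mask[y][x] = 1
    return mask


if __name__ == "__main__":
    # Bright pavement (200) with a dark vertical crack (50) in column 2.
    image = [[200, 200, 50, 200, 200] for _ in range(5)]
    for row in adaptive_threshold(image):
        print(row)
```

Because the threshold follows the local mean, the same rule can cope with uneven illumination across the road surface, which a single global threshold cannot.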
Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons
Current methods for skeleton-based human action recognition usually work with
completely observed skeletons. However, in real scenarios, captured skeletons
are often incomplete and noisy, which degrades the performance
of traditional models. To enhance the robustness of action recognition models
to incomplete skeletons, we propose a multi-stream graph convolutional network
(GCN) for exploring sufficient discriminative features distributed over all
skeleton joints. Here, each stream of the network is only responsible for
learning features from currently unactivated joints, which are distinguished by
the class activation maps (CAM) obtained by preceding streams, so that the
proposed method activates noticeably more joints than traditional
methods. Thus, the proposed method is termed richly activated GCN (RA-GCN),
where the richly discovered features will improve the robustness of the model.
Compared to the state-of-the-art methods, the RA-GCN achieves comparable
performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion
dataset, the RA-GCN significantly alleviates the performance deterioration.
Comment: Accepted by ICIP 2019, 5 pages, 3 figures, 3 tables
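
The abstract above says that each stream learns only from joints that preceding streams have not yet activated. The sketch below illustrates that idea under stated assumptions: per-joint CAM scores are thresholded to decide which joints count as activated, and those joints are zeroed in the next stream's input. The function names, the `threshold` value, and the zeroing strategy are hypothetical and not taken from the paper.

```python
def unactivated_mask(num_joints, cams, threshold=0.5):
    """Return 1 for joints no preceding stream has activated, else 0.

    `cams` holds one list of per-joint activation scores per preceding
    stream. `threshold` is an illustrative cut-off for "activated".
    """
    mask = [1] * num_joints
    for cam in cams:
        for joint, score in enumerate(cam):
            if score >= threshold:
                mask[joint] = 0
    return mask


def mask_skeleton(joints, mask):
    """Zero the coordinates of joints whose mask entry is 0."""
    return [[coord * keep for coord in joint] for joint, keep in zip(joints, mask)]


if __name__ == "__main__":
    skeleton = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    cams = [[0.9, 0.1, 0.2], [0.0, 0.8, 0.1]]  # made-up scores from two streams
    mask = unactivated_mask(len(skeleton), cams)
    print(mask_skeleton(skeleton, mask))
```

The first stream sees the full skeleton because no CAM exists yet. Each later stream receives a progressively sparser input, which pushes it toward joints the earlier streams ignored.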