1,482 research outputs found

    360-degree Video Stitching for Dual-fisheye Lens Cameras Based On Rigid Moving Least Squares

    Dual-fisheye lens cameras are becoming popular for 360-degree video capture, especially for user-generated content (UGC), since they are affordable and portable. Images generated by dual-fisheye cameras have limited overlap and hence require non-conventional stitching techniques to produce high-quality 360x180-degree panoramas. This paper introduces a novel method to align these images using interpolation grids based on rigid moving least squares. Furthermore, jitter is a critical issue that arises when image-based stitching algorithms are applied to video. It stems from the unconstrained movement of the stitching boundary from one frame to the next. Therefore, we also propose a new algorithm that maintains the temporal coherence of the stitching boundary to provide jitter-free 360-degree videos. Results show that the proposed method produces higher-quality stitched images and videos than prior work. Comment: Preprint version

    Predicting Aesthetic Score Distribution through Cumulative Jensen-Shannon Divergence

    Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay between semantic content and photographic technique. Recent deep-learning-based studies of aesthetic quality assessment usually use a binary high-low label or a numerical score to represent aesthetic quality. However, a scalar representation cannot adequately describe the underlying variety of human perception of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector of the ordinal basic human ratings) using a Deep Convolutional Neural Network (DCNN). Conventional DCNNs, which aim to minimize the difference between the predicted scalar numbers or vectors and the ground truth, cannot be directly used for the ordinal basic rating distribution. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, with a new reliability-sensitive learning method based on the kurtosis of the score distribution, which eliminates the requirement of the original full data of human ratings (without normalization). Experimental results on a large-scale aesthetic dataset demonstrate the effectiveness of the introduced CJS-CNN in this task. Comment: AAAI Conference on Artificial Intelligence (AAAI), New Orleans, Louisiana, USA. 2-7 Feb. 201
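To illustrate why a cumulative divergence suits ordinal rating distributions, here is a minimal sketch in plain Python. The abstract does not give the exact formula, so the form below (a Jensen-Shannon-style sum over cumulative distribution values) is an assumption, and the names `cumulative` and `cjs_divergence` are hypothetical rather than taken from the paper.

```python
import math
from itertools import accumulate


def cumulative(dist):
    """Normalize a histogram of ratings and return its cumulative distribution."""
    total = sum(dist)
    return [c / total for c in accumulate(dist)]


def cjs_divergence(p, q, eps=1e-12):
    """Jensen-Shannon-style divergence computed on cumulative distributions.

    Assumed form (not confirmed by the abstract): for each score bin, compare
    the cumulative values of p and q against their midpoint, as in the
    Jensen-Shannon divergence, and sum over bins.
    """
    pc, qc = cumulative(p), cumulative(q)
    div = 0.0
    for a, b in zip(pc, qc):
        m = 0.5 * (a + b)
        # eps guards against log(0) when a cumulative value is zero
        div += 0.5 * a * math.log((a + eps) / (m + eps))
        div += 0.5 * b * math.log((b + eps) / (m + eps))
    return div


# All raters chose score 1, versus all chose score 2, versus all chose score 3.
low, mid, high = [1, 0, 0], [0, 1, 0], [0, 0, 1]
near = cjs_divergence(low, mid)
far = cjs_divergence(low, high)
```

Because the comparison runs over cumulative values, moving probability mass to a distant score bin (`far`) is penalized more than moving it to an adjacent bin (`near`), which a bin-by-bin divergence would not distinguish.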

    High Speed Mid-Wave Infrared Uni-traveling Carrier Photodetector

    Mid-wave infrared (MWIR) frequency combs are expected to dramatically improve the precision and sensitivity of molecular spectroscopy. For high-resolution applications, a high-speed MWIR photodetector is one of the key components; however, commercially available high-speed MWIR photodetectors currently have only sub-GHz bandwidth. In this paper, we demonstrate, for the first time to our knowledge, a high-speed MWIR uni-traveling carrier photodetector based on an InAs/GaSb type-II superlattice (T2SL) at room temperature. The device exhibits a cutoff wavelength of 5.6 µm and a 3 dB bandwidth of 6.58 GHz for a 20 µm diameter device at 300 K. These promising results show that the device has the potential to be used in high-speed applications such as frequency comb spectroscopy and free-space communication. The limitations on the high-frequency performance of the photodetectors are also discussed.

    Traffic Danger Recognition With Surveillance Cameras Without Training Data

    We propose a traffic danger recognition model that works with arbitrary traffic surveillance cameras to identify and predict car crashes. There are too many cameras to monitor manually. Therefore, we developed a model to predict and identify car crashes from surveillance cameras based on a 3D reconstruction of the road plane and prediction of trajectories. For normal traffic, it supports real-time proactive safety checks of speeds and distances between vehicles to provide insights about possible high-risk areas. We achieve good prediction and recognition of car crashes without using any labeled training data of crashes. Experiments on the BrnoCompSpeed dataset show that our model can accurately monitor the road, with mean errors of 1.80% for distance measurement, 2.77 km/h for speed measurement, 0.24 m for car position prediction, and 2.53 km/h for speed prediction. Comment: To be published in proceedings of Advanced Video and Signal-based Surveillance (AVSS), 2018 15th IEEE International Conference on, pp. 378-383, IEEE

    Road Crack Detection Using Deep Convolutional Neural Network and Adaptive Thresholding

    Cracks are among the most common road distresses and may pose road safety hazards. Generally, crack detection is performed by either certified inspectors or structural engineers. This task is, however, time-consuming, subjective and labor-intensive. In this paper, we propose a novel road crack detection algorithm based on deep learning and adaptive image segmentation. First, a deep convolutional neural network is trained to determine whether an image contains cracks. The images containing cracks are then smoothed using bilateral filtering, which greatly reduces the number of noisy pixels. Finally, we use an adaptive thresholding method to extract the cracks from the road surface. The experimental results show that our network can classify images with an accuracy of 99.92%, and that the cracks can be successfully extracted from the images using our proposed thresholding algorithm. Comment: 6 pages, 8 figures, 2019 IEEE Intelligent Vehicles Symposium
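The final extraction step can be pictured with a small sketch. The abstract does not say which adaptive thresholding method the authors use, so the local-mean rule below is an assumption chosen for illustration; the function name `adaptive_threshold` and its `radius` and `offset` parameters are hypothetical. The CNN classification and bilateral filtering stages are omitted.

```python
def adaptive_threshold(image, radius=1, offset=10):
    """Return a binary mask marking pixels darker than their neighborhood.

    image:  2D list of grayscale intensities (0-255).
    radius: half-width of the square window used for the local mean (assumed).
    offset: how much darker than the local mean a pixel must be (assumed).
    """
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clip the window at the image borders
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in rows for i in cols]
            local_mean = sum(window) / len(window)
            # Cracks appear darker than the surrounding road surface
            if image[y][x] < local_mean - offset:
                mask[y][x] = 1
    return mask


# Toy 5x5 road patch: bright surface (200) with a dark vertical crack (50).
patch = [[200, 200, 50, 200, 200] for _ in range(5)]
crack_mask = adaptive_threshold(patch)
```

A threshold computed per neighborhood, rather than one global value, keeps the rule usable when illumination varies across the road surface.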

    Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons

    Current methods for skeleton-based human action recognition usually work with completely observed skeletons. However, in real scenarios, captured skeletons are often incomplete and noisy, which degrades the performance of traditional models. To enhance the robustness of action recognition models to incomplete skeletons, we propose a multi-stream graph convolutional network (GCN) that explores sufficient discriminative features distributed over all skeleton joints. Here, each stream of the network is only responsible for learning features from currently unactivated joints, which are distinguished by the class activation maps (CAM) obtained by preceding streams, so that the proposed method activates noticeably more joints than traditional methods. Thus, the proposed method is termed richly activated GCN (RA-GCN), where the richly discovered features improve the robustness of the model. Compared to state-of-the-art methods, the RA-GCN achieves comparable performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion dataset, the RA-GCN significantly alleviates the performance deterioration. Comment: Accepted by ICIP 2019, 5 pages, 3 figures, 3 tables