Two images comparison with invariance to illumination properties
ICIP 19 - IEEE International Conference on Image Processing, Taipei, Taiwan, 22/09/2019 - 25/09/2019

We propose a new way of performing pixel-by-pixel comparison between two images, taking advantage of interesting invariance properties with respect to illumination conditions and camera settings. Moreover, we show that the proposed operator is relatively robust to strong noise on one of the compared images. The new operator can be used for background subtraction, which inherits its invariance properties. The useful properties of the proposed operator are illustrated in the experiments.
SSSDET: Simple Short and Shallow Network for Resource Efficient Vehicle Detection in Aerial Scenes
Detection of small-sized targets is of paramount importance in many aerial
vision-based applications. The commonly deployed low cost unmanned aerial
vehicles (UAVs) for aerial scene analysis are highly resource constrained in
nature. In this paper, we propose a simple, short and shallow network (SSSDet) to
robustly detect and classify small-sized vehicles in aerial scenes. The
proposed SSSDet is up to 4x faster, requires 4.4x fewer FLOPs, has 30x fewer
parameters, requires 31x less memory, and provides better accuracy in
comparison to existing state-of-the-art detectors. Thus, it is more suitable
for hardware implementation in real-time applications. We also created a new
airborne image dataset (ABD) by annotating 1396 new objects in 79 aerial images
for our experiments. The effectiveness of the proposed method is validated on
the existing VEDAI, DLR-3K, DOTA and Combined dataset. The SSSDet outperforms
state-of-the-art detectors in terms of accuracy, speed, compute and memory
efficiency.

Comment: International Conference on Image Processing (ICIP) 2019, Taipei,
Taiwan
Fusion-Based Versatile Video Coding Intra Prediction Algorithm with Template Matching and Linear Prediction
The new generation video coding standard Versatile Video Coding (VVC) has adopted many novel technologies to improve compression performance, and consequently, remarkable results have been achieved. In practical applications, a lower bitrate would reduce the burden on the sensors and improve their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper. Specifically, to better predict areas with similar texture information, we propose a fusion-based adaptive template matching method, which directly takes the error between the reference and objective templates into account. Furthermore, to better utilize the correlation between reference pixels and the pixels to be predicted, we propose a fusion-based linear prediction method, which can compensate for the deficiency of single linear prediction. We implemented our algorithm on top of the VVC Test Model (VTM) 9.1. When compared with VVC, our proposed fusion-based algorithm saves 0.89%, 0.84%, and 0.90% bitrate on average for the Y, Cb, and Cr components, respectively. In addition, when compared with some other existing works, our algorithm shows superior performance in bitrate savings.
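The fusion idea in this abstract, combining several template-matching candidates weighted by their template error, can be sketched as follows. This is a minimal illustrative sketch of error-weighted template-matching fusion in general; the search range, L-shaped template, inverse-error weighting, and block sizes are assumptions for illustration, not the paper's exact design:

```python
import numpy as np

def fused_template_matching_prediction(recon, block_top_left, block_size,
                                       tmpl_thickness=2, k=3):
    """Predict a block by fusing the k best template-matching candidates,
    weighted inversely by their template SAD error (illustrative sketch)."""
    y0, x0 = block_top_left
    bs, t = block_size, tmpl_thickness

    # L-shaped template: rows above and columns left of a block position.
    def template_at(y, x):
        top = recon[y - t:y, x - t:x + bs]   # top strip (incl. corner)
        left = recon[y:y + bs, x - t:x]      # left strip
        return np.concatenate([top.ravel(), left.ravel()])

    target_tmpl = template_at(y0, x0)
    candidates = []
    # Search the previously reconstructed area above the current block.
    for y in range(t, y0 - bs + 1):
        for x in range(t, recon.shape[1] - bs + 1):
            err = np.abs(template_at(y, x) - target_tmpl).sum()
            candidates.append((err, recon[y:y + bs, x:x + bs]))
    best = sorted(candidates, key=lambda c: c[0])[:k]

    # Fusion: weight each candidate inversely by its template error.
    weights = np.array([1.0 / (1.0 + e) for e, _ in best])
    weights /= weights.sum()
    return sum(w * blk for w, (_, blk) in zip(weights, best))
```

Because the prediction is a convex combination of reconstructed blocks, candidates whose templates match the target template closely dominate the fused result.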
BiRA-Net: Bilinear Attention Net for Diabetic Retinopathy Grading
Diabetic retinopathy (DR) is a common retinal disease that leads to
blindness. For diagnosis purposes, DR image grading aims to provide automatic
DR grade classification, which is not addressed by conventional binary DR
image classification methods. Small objects in the eye images,
like lesions and microaneurysms, are essential to DR grading in medical
imaging, but they could easily be influenced by other objects. To address these
challenges, we propose a new deep learning architecture, called BiRA-Net, which
combines the attention model for feature extraction and bilinear model for
fine-grained classification. Furthermore, considering the distance between
different DR grades, we propose a new loss function,
called grading loss, which leads to improved training convergence of the
proposed approach. Experimental results are provided to demonstrate the
superior performance of the proposed approach.

Comment: Accepted at ICIP 2019
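The bilinear model mentioned in the abstract typically refers to bilinear pooling, where two feature maps are combined by an outer product at every spatial location and averaged over space. A minimal sketch of this general technique (with the common signed-square-root and L2 normalization), not BiRA-Net's exact implementation:

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps shaped (C1, H, W) and (C2, H, W):
    average of per-location outer products, then signed sqrt and L2
    normalization. Illustrative sketch of the general technique."""
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    biln = (a @ b.T) / (h * w)                 # (C1, C2) pooled outer product
    vec = biln.ravel()
    vec = np.sign(vec) * np.sqrt(np.abs(vec))  # signed square root
    return vec / (np.linalg.norm(vec) + 1e-12) # L2 normalization
```

The pooled C1 x C2 matrix captures pairwise channel interactions, which is what makes bilinear models effective for fine-grained classification; in an attention-augmented network, feat_a and feat_b would be attention-weighted feature maps.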
Shallow Triple Stream Three-dimensional CNN (STSTNet) for Micro-expression Recognition
In recent years, the state of the art in facial micro-expression recognition
has been significantly advanced by deep neural networks. The robustness of
deep learning has yielded promising performance beyond that of traditional
handcrafted approaches. Most works in the literature emphasize increasing the
depth of networks and employing highly complex objective functions to learn
more features. In this paper, we design a Shallow Triple Stream
Three-dimensional CNN (STSTNet) that is computationally light whilst capable of
extracting discriminative high level features and details of micro-expressions.
The network learns from three optical flow features (i.e., optical strain,
horizontal and vertical optical flow fields) computed based on the onset and
apex frames of each video. Our experimental results demonstrate the
effectiveness of the proposed STSTNet, which obtained an unweighted average
recall rate of 0.7605 and unweighted F1-score of 0.7353 on the composite
database consisting of 442 samples from the SMIC, CASME II and SAMM databases.

Comment: 5 pages, 1 figure. Accepted and published in IEEE FG 2019
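Of the three input features named above, optical strain is the least standard: it is derived from the horizontal and vertical flow fields via the symmetric strain tensor eps = 0.5 * (grad(F) + grad(F)^T). A minimal sketch of one common formulation of the strain magnitude (the paper's exact implementation may differ):

```python
import numpy as np

def optical_strain_magnitude(u, v):
    """Optical strain magnitude from horizontal (u) and vertical (v) optical
    flow fields. Uses the symmetric strain tensor of the flow; a common
    formulation, assumed here for illustration."""
    du_dy, du_dx = np.gradient(u)   # gradients of horizontal flow (rows=y)
    dv_dy, dv_dx = np.gradient(v)   # gradients of vertical flow
    eps_xx = du_dx                  # normal strain components
    eps_yy = dv_dy
    eps_xy = 0.5 * (du_dy + dv_dx)  # shear strain component
    return np.sqrt(eps_xx**2 + eps_yy**2 + 2.0 * eps_xy**2)
```

Here the strain map highlights regions of subtle facial deformation between the onset and apex frames, which is why it complements the raw flow fields as a network input.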