249 research outputs found
Fusion-Based Versatile Video Coding Intra Prediction Algorithm with Template Matching and Linear Prediction
Versatile Video Coding (VVC), the new-generation video coding standard, has adopted many novel technologies to improve compression performance, with remarkable results. In practical applications, a lower bitrate reduces the burden on sensors and improves their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper. Specifically, to better predict areas with similar texture information, we propose a fusion-based adaptive template matching method, which directly takes the error between reference and objective templates into account. Furthermore, to better utilize the correlation between reference pixels and the pixels to be predicted, we propose a fusion-based linear prediction method, which can compensate for the deficiency of single linear prediction. We implemented our algorithm on top of the VVC Test Model (VTM) 9.1. Compared with VVC, the proposed fusion-based algorithm achieves average bitrate savings of 0.89%, 0.84%, and 0.90% for the Y, Cb, and Cr components, respectively. In addition, compared with some existing works, our algorithm shows superior bitrate savings.
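As an illustration only, the idea of letting the template error drive the fusion can be sketched as an inverse-error weighted blend of candidate blocks. The abstract does not give the actual weighting rule, so the function names, the sum-of-absolute-differences error, and the `eps` smoothing term below are all assumptions, not the authors' method.

```python
def template_sad(a, b):
    """Sum of absolute differences between two equally sized templates."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fuse_candidates(objective_template, candidates, eps=1.0):
    """Blend candidate blocks, weighting each by its inverse template error.

    candidates: list of (reference_template, predicted_block) pairs, each a
    flat list of pixel values. The weighting rule is a hypothetical stand-in.
    """
    weights = [1.0 / (template_sad(objective_template, ref) + eps)
               for ref, _ in candidates]
    total = sum(weights)
    fused = [0.0] * len(candidates[0][1])
    for w, (_, block) in zip(weights, candidates):
        for i, pixel in enumerate(block):
            fused[i] += (w / total) * pixel
    return fused


# Two candidates: the first template matches exactly, so it dominates the blend.
candidates = [([10, 10], [100, 100]), ([12, 12], [200, 200])]
fused = fuse_candidates([10, 10], candidates)
```

With these toy values the better-matching candidate receives five times the weight of the other, so the fused block stays closer to 100 than to 200.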
Two stream network for stroke detection in table tennis
This paper presents a method for detecting table tennis strokes in videos. The method relies on a two-stream Convolutional Neural Network that processes the RGB stream and its computed optical flow in parallel. The method was developed as part of the MediaEval 2021 benchmark for the Sport task. Our contribution did not outperform the provided baseline on the test set but performed best among the other participants with regard to the mAP metric.
BiRA-Net: Bilinear Attention Net for Diabetic Retinopathy Grading
Diabetic retinopathy (DR) is a common retinal disease that leads to
blindness. For diagnosis purposes, DR image grading aims to provide automatic
DR grade classification, which is not addressed in conventional research
methods of binary DR image classification. Small objects in the eye images,
like lesions and microaneurysms, are essential to DR grading in medical
imaging, but they could easily be influenced by other objects. To address these
challenges, we propose a new deep learning architecture, called BiRA-Net, which
combines the attention model for feature extraction and bilinear model for
fine-grained classification. Furthermore, to account for the distance between
different grades of different DR categories, we propose a new loss function,
called grading loss, which leads to improved training convergence of the
proposed approach. Experimental results are provided to demonstrate the
superior performance of the proposed approach. Comment: Accepted at ICIP 201
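The abstract does not define the grading loss, so the following is only a sketch of one way a loss can take the distance between grades into account: cross-entropy plus the expected absolute distance between the predicted and true grade. The function names and the `alpha` weight are hypothetical, and this should not be read as the BiRA-Net formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def distance_aware_loss(logits, true_grade, alpha=1.0):
    """Cross-entropy plus expected |predicted grade - true grade| (assumed form)."""
    probs = softmax(logits)
    cross_entropy = -math.log(probs[true_grade])
    expected_distance = sum(p * abs(k - true_grade)
                            for k, p in enumerate(probs))
    return cross_entropy + alpha * expected_distance


# True grade is 0; a confident guess of grade 4 costs more than one of grade 1.
near_miss = distance_aware_loss([0.0, 3.0, 0.0, 0.0, 0.0], true_grade=0)
far_miss = distance_aware_loss([0.0, 0.0, 0.0, 0.0, 3.0], true_grade=0)
```

Both toy predictions give the true grade the same probability, so the difference between the two losses comes entirely from the distance term.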
Robust Backdoor Attacks on Object Detection in Real World
Deep learning models are widely deployed in many applications, such as object
detection in various security fields. However, these models are vulnerable to
backdoor attacks. Most backdoor attacks have been studied intensively on classification
models, but little on object detection. Previous works mainly focused on
backdoor attacks in the digital world and neglected the real world. In particular,
the effect of a backdoor attack in the real world is easily influenced by
physical factors like distance and illumination. In this paper, we propose a
variable-size backdoor trigger to adapt to the different sizes of attacked
objects, overcoming the disturbance caused by the distance between the viewing
point and attacked object. In addition, we propose a backdoor training method named
malicious adversarial training, enabling the backdoor object detector to learn
the feature of the trigger with physical noise. The experimental results show that
this robust backdoor attack (RBA) could enhance the attack success rate in the
real world. Comment: 22 pages, 13 figures
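A variable-size trigger can be pictured as a patch whose side length scales with the bounding box of the attacked object, so that it occupies a similar share of the object at any viewing distance. The sketch below assumes integer pixel boxes, a square trigger centered on the object, and hypothetical `ratio` and `min_side` parameters; the abstract does not specify the actual sizing rule or placement.

```python
def trigger_box(bbox, ratio=0.2, min_side=4):
    """Return (x1, y1, x2, y2) of a square trigger scaled to the object box.

    bbox is (x1, y1, x2, y2) in integer pixels. ratio and min_side are
    illustrative assumptions, not values from the paper.
    """
    x1, y1, x2, y2 = bbox
    w, h = x2 - x1, y2 - y1
    # Scale with the shorter side, but never below min_side or beyond the box.
    side = max(min_side, int(round(ratio * min(w, h))))
    side = min(side, w, h)
    cx, cy = x1 + w // 2, y1 + h // 2
    tx, ty = cx - side // 2, cy - side // 2
    return tx, ty, tx + side, ty + side


large_object = trigger_box((0, 0, 100, 200))
small_object = trigger_box((0, 0, 10, 10))
```

For the small object the `min_side` floor takes over, which keeps the trigger from shrinking to a size that would no longer be visible.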
Distorted Representation Space Characterization Through Backpropagated Gradients
In this paper, we utilize weight gradients from backpropagation to
characterize the representation space learned by deep learning algorithms. We
demonstrate the utility of such gradients in applications including perceptual
image quality assessment and out-of-distribution classification. The
applications are chosen to validate the effectiveness of gradients as features
when the test image distribution is distorted from the train image
distribution. In both applications, the proposed gradient-based features
outperform activation features. In image quality assessment, the proposed
approach is compared with other state-of-the-art approaches and is generally
the top performing method on TID 2013 and MULTI-LIVE databases in terms of
accuracy, consistency, linearity, and monotonic behavior. Finally, we analyze
the effect of regularization on gradients using the CURE-TSR dataset for
out-of-distribution classification. Comment: 5 pages, 5 figures, 2 tables, ICIP 201
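To make the notion of weight gradients as features concrete, here is a minimal sketch on a single linear layer with a squared-error loss, where the gradient with respect to the weights is the outer product of the output error and the input. The layer, the loss, and the use of the gradient norm as a scalar feature are simplifying assumptions chosen for illustration; the paper's networks and feature construction are not described in the abstract.

```python
import math


def weight_gradient(W, x, target):
    """Gradient of 0.5 * ||W x - target||^2 with respect to W."""
    out = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    err = [o - t for o, t in zip(out, target)]
    # Outer product of the output error and the input vector.
    return [[e * x_j for x_j in x] for e in err]


def gradient_norm(grad):
    """L2 norm of the flattened gradient, used here as a scalar feature."""
    return math.sqrt(sum(g * g for row in grad for g in row))


W = [[1.0, 0.0], [0.0, 1.0]]
# An input the layer already reproduces yields a zero gradient ...
fitted = gradient_norm(weight_gradient(W, [1.0, 2.0], [1.0, 2.0]))
# ... while a mismatched target yields a larger one.
shifted = gradient_norm(weight_gradient(W, [1.0, 2.0], [0.0, 2.0]))
```

The intuition carried by the toy example is that inputs the model handles poorly require a larger weight update, so the gradient magnitude separates them from well-modeled inputs.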
- …