9 research outputs found

Quasi Real-Time Apple Defect Segmentation Using Deep Learning

Author: Mirko Agarla
Paolo Napoletano
Raimondo Schettini
Publication venue: MDPI
Publication date: 01/01/2023

Defect segmentation of apples is an important task in the agriculture industry for quality control and food safety. In this paper, we propose a deep learning approach for the automated segmentation of apple defects using convolutional neural networks (CNNs) based on a U-shaped architecture with skip connections only within the noise reduction block. An ad hoc data synthesis technique has been designed to increase the number of samples and, at the same time, to reduce neural network overfitting. We evaluate our model on a dataset of multi-spectral apple images with pixel-wise annotations for several types of defects. We show that our proposal outperforms general-purpose deep learning architectures commonly used for segmentation tasks in terms of segmentation accuracy. From the application point of view, we improve on previous methods for apple defect segmentation. A measure of the computational cost shows that our proposal can be employed in real-time (about 100 frames per second on GPU) and quasi-real-time (about 7 to 8 frames per second on CPU) visual-based apple inspection. To further improve the applicability of the method, we investigate the potential of using only RGB images instead of multi-spectral images as input. The results show that the accuracy in this case is almost comparable to that of the multi-spectral case.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
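
The throughput figures quoted in the abstract above (about 100 frames per second on GPU, about 7 to 8 on CPU) can be read as a per-frame time budget. The following is only an arithmetic sketch; the function name per_frame_budget_ms is hypothetical and does not come from the paper.

```python
def per_frame_budget_ms(frames_per_second: float) -> float:
    """Convert a throughput in frames per second to milliseconds per frame."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


# Approximate throughput values taken from the abstract.
for label, fps in [("GPU, about 100 fps", 100.0),
                   ("CPU, about 8 fps", 8.0),
                   ("CPU, about 7 fps", 7.0)]:
    print(f"{label}: {per_frame_budget_ms(fps):.1f} ms per frame")
```

Under these figures, the GPU case leaves roughly 10 ms per frame, while the CPU case leaves roughly 125 to 143 ms per frame.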

Semi-supervised cross-lingual speech emotion recognition

Author: Agarla Mirko
Bianco Simone
Celona Luigi
Napoletano Paolo
Petrovsky Alexey
Piccoli Flavio
Schettini Raimondo
Shanin Ivan
Publication date: 14/07/2022

Speech emotion recognition (SER) on a single language has achieved remarkable results through deep learning approaches over the last decade. However, cross-lingual SER remains a challenge in real-world applications due to (i) a large difference between the source and target domain distributions, and (ii) the availability of few labeled and many unlabeled utterances for the new language. Taking these aspects into account, we propose a Semi-Supervised Learning (SSL) method for cross-lingual emotion recognition when only a few labels from the new language are available. Based on a Convolutional Neural Network (CNN), our method adapts to a new language by exploiting a pseudo-labeling strategy for the unlabeled utterances. In particular, the use of both hard and soft pseudo-labels is investigated. We thoroughly evaluate the performance of the method in a speaker-independent setup on both the source and the new language and show its robustness across five languages belonging to different linguistic strains.

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
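
The abstract above distinguishes hard and soft pseudo-labels for unlabeled utterances. The sketch below illustrates the general difference between the two in plain Python. It is not the authors' implementation: the confidence threshold, the temperature sharpening, and all function names are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_pseudo_label(probs, threshold=0.9):
    """One-hot target for the top class, or None if the model is not confident.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    if probs[top] < threshold:
        return None  # utterance is skipped for this training round
    return [1.0 if i == top else 0.0 for i in range(len(probs))]


def soft_pseudo_label(probs, temperature=0.5):
    """Sharpened distribution that keeps some of the model's uncertainty.

    Temperature sharpening is one common choice, assumed here for illustration.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def cross_entropy(target, probs, eps=1e-12):
    """Loss of predicted probabilities against a hard or soft target."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


# Hypothetical model scores for one unlabeled utterance over four emotions.
probs = softmax([2.0, 0.5, 0.1, -1.0])
print("hard (threshold 0.9):", hard_pseudo_label(probs))
print("hard (threshold 0.5):", hard_pseudo_label(probs, threshold=0.5))
print("soft:", [round(p, 3) for p in soft_pseudo_label(probs)])
```

A hard label commits fully to one class and discards low-confidence utterances, whereas a soft label lets every unlabeled utterance contribute with a weight that reflects the model's confidence.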

NTIRE 2023 Quality Assessment of Video Enhancement Challenge

Author: Azadeh Mansouri
Chunzheng Zhu
Guangtao Zhai
Hanene Brachemi Meftah
Hang Shi
Haoning Wu
Haotian Fan
Heng Cong
Hongye Liu
Ironhead Chuang
Kai Zhang
Kai Zhao
Mirko Agarla
Radu Timofte
Shiling Zhao
Shiqi Zhou
Tengchuan Kou
Tengfei Shi
Wei Sun
Wenqi Wang
Xiaohong Liu
Xiongkuo Min
Yilin Li
Yixuan Gao
Yu Lai
Yulun Zhang
Yunlong Dong
Yuqin Cao
Zhiliang Ma
Zhiwei Huang
Ziheng Jia
Publication venue: IEEE/CVF
Publication date: 01/01/2023

This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. The challenge addresses a major problem in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. It uses the VQA Dataset for Perceptual Video Enhancement (VDPVE), which has a total of 1211 enhanced videos, including 600 videos with color, brightness, and contrast enhancements, 310 videos with deblurring, and 301 deshaked videos. The challenge has a total of 167 registered participants. During the development phase, 61 participating teams submitted their prediction results, with a total of 3168 submissions. During the final testing phase, 37 participating teams made a total of 176 submissions. Finally, 19 participating teams submitted their models and fact sheets, and detailed the methods they used. Some methods have achieved better results than the baseline methods, and the winning methods have demonstrated superior prediction performance.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

On the Semantic Dependency of Video Quality Assessment Methods

Author: Agarla Mirko
Celona Luigi
Publication venue: Korean Society for Imaging Science and Technology
Publication date: 01/01/2021

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

No-Reference Quality Assessment of In-Capture Distorted Videos

Author: Luigi Celona
Mirko Agarla
Raimondo Schettini
Publication venue: MDPI AG
Publication date: 01/01/2020

We introduce a no-reference method for assessing the quality of videos affected by in-capture distortions due to camera hardware and processing software. The proposed method encodes both quality attributes and semantic content of each video frame by using two Convolutional Neural Networks (CNNs) and then estimates the quality score of the whole video by using a Recurrent Neural Network (RNN), which models the temporal information. Extensive experiments conducted on four benchmark databases (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC) containing in-capture distortions demonstrate the effectiveness of the proposed method and its ability to generalize in a cross-database setup.

Multidisciplinary Digital Publishing Institute

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

An Efficient Method for No-Reference Video Quality Assessment

Author: Luigi Celona
Mirko Agarla
Raimondo Schettini
Publication venue: MDPI AG
Publication date: 01/01/2021

Methods for No-Reference Video Quality Assessment (NR-VQA) of consumer-produced video content are widely investigated due to the spread of databases containing videos affected by natural distortions. In this work, we design an effective and efficient method for NR-VQA. The proposed method exploits a novel sampling module capable of selecting a predetermined number of frames from the whole video sequence on which to base the quality assessment. It encodes both the quality attributes and semantic content of video frames using two lightweight Convolutional Neural Networks (CNNs). Then, it estimates the quality score of the entire video using a Support Vector Regressor (SVR). We compare the proposed method against several relevant state-of-the-art methods using four benchmark databases containing user-generated videos (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC). The results show that the proposed method, at a substantially lower computational cost, predicts subjective video quality in line with state-of-the-art methods on individual databases and generalizes better than existing methods in a cross-database setup.

Multidisciplinary Digital Publishing Institute

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
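
The abstract above describes selecting a predetermined number of frames from a video and encoding them before regression. The sketch below shows one simple way to do this: evenly spaced frame selection followed by mean pooling of per-frame feature vectors. This is a stand-in under stated assumptions, not the paper's sampling module, whose design is not given in the abstract; the function names and the pooling step are hypothetical.

```python
def sample_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    """Pick num_samples evenly spaced frame indices (segment midpoints).

    Uniform spacing is an assumption for illustration only.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples >= num_frames:
        return list(range(num_frames))  # short video: use every frame
    # Integer arithmetic avoids floating-point rounding at segment midpoints.
    return [(2 * i + 1) * num_frames // (2 * num_samples)
            for i in range(num_samples)]


def mean_pool(frame_features: list[list[float]]) -> list[float]:
    """Average per-frame feature vectors into one video-level descriptor."""
    if not frame_features:
        raise ValueError("frame_features must not be empty")
    count = len(frame_features)
    return [sum(column) / count for column in zip(*frame_features)]


# Hypothetical 100-frame video reduced to 4 frames.
indices = sample_frame_indices(100, 4)
print(indices)

# Hypothetical 2-dimensional features for the sampled frames.
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
print(mean_pool(features))
```

In a full pipeline, the pooled descriptor would be passed to a regressor such as an SVR. Fixing the number of sampled frames keeps the feature-extraction cost independent of video length.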

Quality assessment of enhanced videos guided by aesthetics and technical quality attributes

Author: Claudio Rota
Luigi Celona
Mirko Agarla
Raimondo Schettini
Publication venue: IEEE/CVF
Publication date: 01/01/2023

In this work, we propose a novel method to evaluate the quality of enhanced videos. The perceived quality of a video depends on both technical aspects, such as the presence of distortions like noise and blur, and non-technical factors, such as content preference and recommendation. Our approach involves the use of three deep-learning-based models that encode video sequences in terms of their overall technical quality, quality-related attributes, and aesthetic quality. The resulting feature vectors are adaptively combined and used as input to a Support Vector Regressor to estimate the video quality score. Quantitative results on the recently released VQA Dataset for Perceptual Video Enhancement (VDPVE), introduced for the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, demonstrate the effectiveness of the proposed method.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Fast-N-Squeeze: Towards Real-Time Spectral Reconstruction From RGB Images

Author: Agarla Mirko
Bianco Simone
Buzzelli Marco
Celona Luigi
Schettini Raimondo
Publication venue: IEEE/CVF
Publication date: 01/01/2022

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

NTIRE 2022 Spectral Recovery Challenge and Data Set

Author: Arad Boaz
Bernat Amir
Cai Yuanhao
Lin Jing
Lin Zudi
Mirko Agarla
Morag Nimrod
others
Timofte Radu
Wang Haoqian
Yahel Rony
Zhang Yulun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2022

This paper reviews the third biennial challenge on spectral reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image. This challenge presents the "ARAD_1K" data set: a new, larger-than-ever natural hyperspectral image data set containing 1,000 images. Challenge participants were required to recover hyperspectral information from synthetically generated JPEG-compressed RGB images simulating capture by a known calibrated camera, operating under partially known parameters, in a setting which includes acquisition noise. The challenge was attended by 241 teams, with 60 teams competing in the final testing phase, 12 of which provided detailed descriptions of their methodology which are included in this report. The performance of these submissions is reviewed and provided here as a gauge for the current state-of-the-art in spectral reconstruction from natural RGB images.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)