13 research outputs found
Deep learning based face beauty prediction via dynamic robust losses and ensemble regression
In the last decade, several studies have shown that facial attractiveness can be learned by machines. In this paper, we address Facial Beauty Prediction from static images. The paper contains three main contributions. First, we propose a two-branch architecture (REX-INCEP) based on merging the architectures of two already trained networks to deal with the complicated high-level features associated with the FBP problem. Second, we introduce the use of a dynamic law to control the behaviour of the following robust loss functions during training: ParamSmoothL1, Huber and Tukey. Third, we propose an ensemble regression based on Convolutional Neural Networks (CNNs). In this ensemble, we use both the basic networks and our proposed network (REX-INCEP). The proposed individual CNN regressors are trained with different loss functions, namely MSE, dynamic ParamSmoothL1, dynamic Huber and dynamic Tukey. Our approach is evaluated on the SCUT-FBP5500 database using the two evaluation scenarios provided by the database creators: 60%-40% split and five-fold cross-validation. In both evaluation scenarios, our approach outperforms the state of the art on several metrics. These comparisons highlight the effectiveness of the proposed solutions for FBP. They also show that the proposed dynamic robust losses lead to more flexible and accurate estimators. This work was partially funded by the University of the Basque Country, GUI19/027.
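The abstract does not spell out the dynamic law or the parameter values; as a hedged sketch of the idea, a robust loss such as Huber can have its parameter re-computed each epoch from a schedule, so that the loss gradually downweights outlier residuals as training progresses. The linear schedule and the bounds below are illustrative assumptions, not the authors' law:

```python
import numpy as np

def huber(residuals, delta):
    """Huber loss: quadratic for small residuals, linear in the tails."""
    a = np.abs(residuals)
    return np.where(a <= delta,
                    0.5 * residuals ** 2,
                    delta * (a - 0.5 * delta))

def dynamic_delta(epoch, n_epochs, d_start=2.0, d_end=0.5):
    """Hypothetical dynamic law (linear here for illustration): shrink the
    Huber parameter over training so the loss becomes increasingly robust
    to outliers as the regressor converges."""
    t = epoch / max(n_epochs - 1, 1)
    return d_start + t * (d_end - d_start)
```

At each epoch the trainer would call `dynamic_delta` and pass the result to `huber`; the same pattern applies to the ParamSmoothL1 and Tukey parameters.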
D-TrAttUnet: Toward Hybrid CNN-Transformer Architecture for Generic and Subtle Segmentation in Medical Images
Over the past two decades, machine analysis of medical imaging has advanced
rapidly, opening up significant potential for several important medical
applications. As complicated diseases increase and the number of cases rises,
the role of machine-based imaging analysis has become indispensable. It serves
as both a tool and an assistant to medical experts, providing valuable insights
and guidance. A particularly demanding task in this area is lesion
segmentation, which is challenging even for experienced radiologists. The
complexity of this task highlights the urgent need for robust machine learning
approaches to support medical staff. In response, we present our novel
solution: the D-TrAttUnet architecture. This framework is based on the
observation that different diseases often target specific organs. Our
architecture includes an encoder-decoder structure with a composite
Transformer-CNN encoder and dual decoders. The encoder includes two paths: the
Transformer path and the Encoders Fusion Module path. The Dual-Decoder
configuration uses two identical decoders, each with attention gates. This
allows the model to simultaneously segment lesions and organs and integrate
their segmentation losses.
To validate our approach, we performed evaluations on the Covid-19 and Bone
Metastasis segmentation tasks. We also investigated the adaptability of the
model by testing it without the second decoder in the segmentation of glands
and nuclei. The results confirmed the superiority of our approach, especially
in Covid-19 infections and the segmentation of bone metastases. In addition,
the hybrid encoder showed exceptional performance in the segmentation of glands
and nuclei, solidifying its role in modern medical image analysis.
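The attention gates in the dual decoders follow the additive-attention pattern popularised by Attention U-Net; a minimal per-vector sketch (the real module operates on convolutional feature maps, and the projections `Wx`, `Wg`, `psi` stand in for learned layers) could look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, psi):
    """Additive attention gate sketch: a gating signal g from the decoder
    weights the skip-connection features x before they are fused.
    Wx, Wg and psi stand in for learned projections (plain matrices here)."""
    q = np.maximum(Wx @ x + Wg @ g, 0.0)   # ReLU over the combined projections
    alpha = sigmoid(psi @ q)               # attention coefficient in (0, 1)
    return alpha * x                       # gated skip features
```

Because `alpha` lies in (0, 1), the gate can only attenuate skip features, letting the decoder suppress regions irrelevant to its task.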
D-TrAttUnet: Dual-Decoder Transformer-Based Attention Unet Architecture for Binary and Multi-classes Covid-19 Infection Segmentation
In the last three years, the world has been facing a global crisis caused by
the Covid-19 pandemic. Medical imaging has been playing a crucial role in the
fight against this disease and in saving human lives. Indeed, CT scans have
proved their efficiency in diagnosing, detecting, and following up Covid-19
infection. In this paper, we propose a new Transformer-CNN based approach for
Covid-19 infection segmentation from CT slices. The proposed D-TrAttUnet
architecture has an Encoder-Decoder structure, where a compound Transformer-CNN
encoder and Dual-Decoders are proposed. The Transformer-CNN encoder is built
using Transformer layers, UpResBlocks, ResBlocks and max-pooling layers. The
Dual-Decoder consists of two identical CNN decoders with attention gates. The
two decoders segment the infection and the lung regions simultaneously, and the
losses of the two tasks are joined. The proposed D-TrAttUnet architecture is
evaluated for both Binary and Multi-classes Covid-19 infection segmentation.
The experimental results demonstrate the efficiency of the proposed approach in
dealing with the complexity of the Covid-19 segmentation task from limited
data. Furthermore, the D-TrAttUnet architecture outperforms three baseline CNN
segmentation architectures (Unet, AttUnet and Unet++) and three
state-of-the-art architectures (AnamNet, SCOATNet and CopleNet) in both Binary
and Multi-classes segmentation tasks.
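Joining the two decoders' losses can be sketched as a weighted sum of two per-task objectives. The binary cross-entropy terms and the equal weighting below are illustrative assumptions; the abstract does not state the exact loss terms or weights:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy over flattened masks."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))

def joint_loss(infection_pred, infection_gt, lung_pred, lung_gt, w=0.5):
    """Hypothetical weighting w: combine the infection-decoder and
    lung-decoder losses into one training objective."""
    return w * bce(infection_pred, infection_gt) + (1.0 - w) * bce(lung_pred, lung_gt)
```

Training on the joint objective lets the lung-segmentation task act as anatomical guidance for the infection decoder.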
CNN based facial aesthetics analysis through dynamic robust losses and ensemble regression
In recent years, estimating beauty of faces has attracted growing interest in the fields of computer vision and machine
learning. This is due to the emergence of face beauty datasets (such as SCUT-FBP, SCUT-FBP5500 and KDEF-PT) and
the prevalence of deep learning methods in many tasks. The goal of this work is to leverage the advances in Deep
Learning architectures to provide stable and accurate face beauty estimation from static face images. To this end, our
proposed approach has three main contributions. First, to deal with the complicated high-level features associated with
the FBP problem, we propose an architecture with two pre-trained Convolutional Neural Network (CNN) backbones
(2B-IncRex). Second, we introduce a parabolic dynamic law to control the behavior
of the robust loss parameters during training. These robust losses are ParamSmoothL1, Huber, and Tukey. As a third
contribution, we propose an ensemble regression based on five regressors, namely Resnext-50, Inception-v3 and three
regressors based on our proposed 2B-IncRex architecture. These models are trained with the following dynamic loss
functions: Dynamic ParamSmoothL1, Dynamic Tukey, Dynamic ParamSmoothL1, Dynamic Huber, and Dynamic Tukey,
respectively. To evaluate the performance of our approach, we used two datasets: SCUT-FBP5500 and KDEF-PT. The
SCUT-FBP5500 dataset contains two evaluation scenarios provided by the database developers: a 60%-40% split and
five-fold cross-validation. Our approach outperforms state-of-the-art methods on several metrics in both evaluation scenarios of
SCUT-FBP5500. Moreover, experiments on the KDEF-PT dataset demonstrate the efficiency of our approach for estimating
facial beauty using transfer learning, despite the presence of facial expressions and limited data. These comparisons highlight
the effectiveness of the proposed solutions for FBP. They also show that the proposed Dynamic robust losses lead to more
flexible and accurate estimators. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
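The parabolic dynamic law and the five-regressor ensemble can be sketched as follows. The vertex placement and bounds of the parabola, and plain averaging as the fusion rule, are assumptions for illustration; the paper's exact schedule and combination scheme may differ:

```python
import numpy as np

def parabolic_param(epoch, n_epochs, p_min=0.5, p_max=2.0):
    """Hypothetical parabolic dynamic law: the robust-loss parameter follows
    a parabola in the normalised epoch index, starting at p_max, dipping to
    p_min mid-training, and rising back to p_max."""
    t = epoch / max(n_epochs - 1, 1)           # normalised epoch in [0, 1]
    return p_min + (p_max - p_min) * (2.0 * t - 1.0) ** 2

def ensemble_predict(regressor_outputs):
    """Average the beauty scores predicted by the individual regressors
    (plain averaging is one common fusion rule)."""
    return np.mean(np.asarray(regressor_outputs, dtype=float), axis=0)
```

Each of the five regressors would be trained with its own dynamic loss, then `ensemble_predict` fuses their per-image scores at test time.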
Recognition of COVID-19 from CT Scans Using Two-Stage Deep-Learning-Based Approach: CNR-IEMN
Since the appearance of the COVID-19 pandemic (at the end of 2019, Wuhan, China), the recognition of COVID-19 with medical imaging has become an active research topic for the machine learning and computer vision community. This paper is based on the results obtained in the 2021 COVID-19 SPGC challenge, which aims to classify volumetric CT scans into normal, COVID-19, or community-acquired pneumonia (Cap) classes. To this end, we proposed a deep-learning-based approach (CNR-IEMN) that consists of two main stages. In the first stage, we trained four deep learning architectures with a multi-task strategy for slice-level classification. In the second stage, we used the previously trained models with an XG-boost classifier to classify the whole CT scan into normal, COVID-19, or Cap classes. Our approach achieved good results on the validation set, with an overall accuracy of 87.75% and sensitivities of 96.36%, 52.63%, and 95.83% for COVID-19, Cap, and normal, respectively. On the three test datasets of the SPGC COVID-19 challenge, our approach ranked fifth overall while achieving the best result for COVID-19 sensitivity. In addition, it achieved second place on two of the three testing sets.
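Bridging the two stages requires turning variable-length per-slice predictions into one fixed-length feature vector per scan for the boosted classifier. The mean/max/std pooling below is an illustrative aggregation, not the paper's stated feature set:

```python
import numpy as np

def scan_level_features(slice_probs):
    """Aggregate per-slice class probabilities (n_slices, n_classes) into a
    single feature vector for the whole CT scan. Mean/max/std pooling per
    class is a hypothetical choice for the second-stage classifier input."""
    p = np.asarray(slice_probs, dtype=float)
    return np.concatenate([p.mean(axis=0), p.max(axis=0), p.std(axis=0)])
```

A gradient-boosting classifier such as XGBoost would then be fitted on these scan-level vectors to produce the final normal/COVID-19/Cap decision.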
Deep learning techniques for hyperspectral image analysis in agriculture: A review
In recent years, there has been a growing emphasis on assessing and ensuring the quality of horticultural and agricultural produce. Traditional methods involving field measurements, investigations, and statistical analyses are labour-intensive, time-consuming, and costly. As a solution, Hyperspectral Imaging (HSI) has emerged as a non-destructive and environmentally friendly technology. HSI has gained significant popularity as a new technology, particularly for its promising applications in remote sensing, notably in agriculture. However, classifying HSI data is highly complex because it involves several challenges, such as the excessive redundancy of spectral bands, scarcity of training samples, and the intricate non-linear relationship between spatial positions and spectral bands. Notably, Deep Learning (DL) techniques have demonstrated remarkable efficacy in various HSI analysis tasks, including those within agriculture. As interest continues to surge in leveraging HSI data for agricultural applications through DL approaches, a pressing need exists for a comprehensive survey that can effectively navigate researchers through the significant strides achieved and the future promising research directions in this domain. This literature review diligently compiles, analyzes, and discusses recent endeavours employing DL methodologies. These methodologies encompass a spectrum of approaches, ranging from Autoencoders (AE) to Convolutional Neural Networks (CNN) (in 1D, 2D, and 3D configurations), Recurrent Neural Networks (RNN), Deep Belief Networks (DBN), Generative Adversarial Networks (GAN), Transfer Learning (TL), Semi-Supervised Learning (SSL), Few-Shot Learning (FSL) and Active Learning (AL). These approaches are tailored to address the unique challenges posed by agricultural HSI analysis. This review evaluates and discusses the performance exhibited by these diverse approaches. 
To this end, the efficiency of these approaches has been rigorously analyzed and discussed based on the results of the state-of-the-art papers on widely recognized land cover datasets.
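Of the CNN configurations the review covers, the 1D variant is the simplest: it convolves along the spectral axis of each pixel. A minimal sketch of that core operation (a plain valid-mode convolution standing in for a learned 1D convolutional layer):

```python
import numpy as np

def spectral_conv1d(pixel_spectrum, kernel):
    """Valid-mode 1-D convolution along the spectral axis of one HSI pixel --
    the basic operation behind the '1D CNN' family of HSI classifiers."""
    return np.convolve(np.asarray(pixel_spectrum, dtype=float),
                       np.asarray(kernel, dtype=float), mode="valid")
```

2D CNNs instead convolve over spatial neighbourhoods, and 3D CNNs over spatial and spectral axes jointly, trading parameter count for richer spatial-spectral features.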
COVID-19 Recognition Using Ensemble-CNNs in Two New Chest X-ray Databases
The recognition of COVID-19 infection from X-ray images is an emerging field in the machine learning and computer vision community. Despite the great efforts that have been made in this field since the appearance of COVID-19 (2019), the field still suffers from two drawbacks. First, the number of available X-ray scans labeled as COVID-19-infected is relatively small. Second, all the works that have been carried out in the field are separate; there are no unified data, classes, and evaluation protocols. In this work, based on public and newly collected data, we propose two X-ray COVID-19 databases: a three-class COVID-19 dataset and a five-class COVID-19 dataset. For both databases, we evaluate different deep learning architectures. Moreover, we propose an Ensemble-CNNs approach which outperforms the individual deep learning architectures and shows promising results in both databases. In particular, our proposed Ensemble-CNNs achieved high performance in the recognition of COVID-19 infection, resulting in accuracies of 100% and 98.1% in the three-class and five-class scenarios, respectively. In addition, our approach achieved promising overall recognition accuracies of 75.23% and 81.0% for the three-class and five-class scenarios, respectively. We make our databases of COVID-19 X-ray scans publicly available to encourage other researchers to use them as a benchmark for their studies and comparisons.
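A common way to fuse member CNNs in such an ensemble is soft voting over their class-probability outputs. Averaging followed by argmax is one standard rule, used here as an illustration; the paper's exact combination scheme may differ:

```python
import numpy as np

def soft_vote(member_probs):
    """Soft-voting fusion: average the class-probability vectors produced by
    the member CNNs and return the winning class index."""
    return int(np.argmax(np.mean(np.asarray(member_probs, dtype=float), axis=0)))
```

Soft voting lets a confident member outweigh several uncertain ones, which majority (hard) voting cannot do.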
Fusion of transformed shallow features for facial expression recognition
Facial expression conveys important signs about the human affective state, cognitive activity, intention and personality. In fact, automatic facial expression recognition systems are attracting more interest year after year due to their wide range of applications in several interesting fields such as human computer/robot interaction, medical applications, animation and video gaming. In this study, the authors propose to combine different descriptor features (histogram of oriented gradients, local phase quantisation and binarised statistical image features), after applying principal component analysis to each of them, to recognise the six basic expressions and the neutral face from static images. Their proposed fusion method has been tested on four popular databases, JAFFE, MMI, CASIA and CK+, using two different cross-validation schemes: subject independent and leave-one-subject-out. The obtained results show that their method outperforms both raw feature concatenation and state-of-the-art methods.
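The PCA-then-concatenate fusion can be sketched directly: reduce each descriptor matrix independently, then stack the reduced representations. The SVD-based PCA and the shared component count `k` below are a simplified stand-in for the paper's pipeline:

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto its top-k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def fuse_descriptors(descriptor_mats, k):
    """Apply PCA to each descriptor matrix (rows = images; e.g. HOG, LPQ and
    BSIF features) and concatenate the reduced representations."""
    return np.hstack([pca_project(D, k) for D in descriptor_mats])
```

Reducing each descriptor before concatenation keeps the fused vector compact and prevents one high-dimensional descriptor from dominating the others.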
Per-COVID-19: A Benchmark Dataset for COVID-19 Percentage Estimation from CT-Scans
COVID-19 infection recognition is a very important step in the fight against the COVID-19 pandemic. In fact, many methods have been used to recognize COVID-19 infection, including Reverse Transcription Polymerase Chain Reaction (RT-PCR), X-ray scans, and Computed Tomography scans (CT-scans). Beyond recognizing the infection, CT scans can provide more important information about the evolution of this disease and its severity. With the extensive number of COVID-19 infections, estimating the COVID-19 percentage can help intensive care units free up resuscitation beds for critical cases and follow other protocols for less severe cases. In this paper, we introduce a COVID-19 percentage estimation dataset from CT-scans, where the labeling process was accomplished by two expert radiologists. Moreover, we evaluate the performance of three Convolutional Neural Network (CNN) architectures: ResneXt-50, Densenet-161, and Inception-v3. For the three CNN architectures, we use two loss functions: MSE and Dynamic Huber. In addition, two pretraining scenarios are investigated (ImageNet pretrained models and models pretrained using X-ray data). The evaluated approaches achieved promising results on the estimation of COVID-19 infection. Inception-v3 with the Dynamic Huber loss function and X-ray pretraining achieved the best slice-level results: 0.9365, 5.10, and 9.25 for Pearson Correlation coefficient (PC), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE), respectively. The same approach achieved 0.9603, 4.01, and 6.79 for PCsubj, MAEsubj, and RMSEsubj, respectively, at the subject level. These results prove that CNN architectures can provide an accurate and fast solution for estimating the COVID-19 infection percentage and monitoring the evolution of the patient's state.
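The three reported metrics are standard and can be computed directly from the true and predicted infection percentages; the sketch below follows their usual definitions:

```python
import numpy as np

def percentage_metrics(y_true, y_pred):
    """Pearson Correlation (PC), Mean Absolute Error (MAE) and Root Mean
    Square Error (RMSE) between true and predicted infection percentages."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    pc = np.corrcoef(y_true, y_pred)[0, 1]     # linear correlation in [-1, 1]
    mae = np.abs(y_true - y_pred).mean()       # average absolute error, in %
    rmse = np.sqrt(((y_true - y_pred) ** 2).mean())  # penalises large errors
    return pc, mae, rmse
```

Subject-level variants (PCsubj, MAEsubj, RMSEsubj) would apply the same formulas after averaging the slice-level percentages per patient.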