2,164 research outputs found

    A deep representation for depth images from synthetic data

    Full text link
    Convolutional Neural Networks (CNNs) trained on large-scale RGB databases have become the secret sauce in the majority of recent approaches for object categorization from RGB-D data. Thanks to colorization techniques, these methods exploit the filters learned from 2D images to extract meaningful representations in 2.5D. Still, the perceptual signatures of these two kinds of images are very different, with the first usually strongly characterized by textures and the second mostly by object silhouettes. Ideally, one would like to have two CNNs, one for RGB and one for depth, each trained on a suitable data collection and able to capture the perceptual properties of its channel for the task at hand. This has not been possible so far due to the lack of a suitable depth database. This paper addresses this issue, proposing to opt for synthetically generated images rather than collecting a large-scale 2.5D database by hand. While clearly a proxy for real data, synthetic images allow one to trade quality for quantity, making it possible to generate a virtually infinite amount of data. We show that training on such a data collection, using the very same architecture typically used on visual data, yields very different filters, resulting in depth features that are (a) better able to characterize the different facets of depth images and (b) complementary to those derived from CNNs pre-trained on 2D datasets. Experiments on two publicly available databases show the power of our approach.
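    The colorization step the abstract alludes to can be sketched minimally; this is a naive normalize-and-replicate stand-in so an RGB-pretrained CNN can consume a depth map, not the specific colorization technique the paper uses:

```python
import numpy as np

def colorize_depth(depth):
    """Toy depth colorization: rescale depth to [0, 1] and replicate it to
    three channels, yielding an image shaped like an RGB input."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-9)
    return np.repeat(d[..., None], 3, axis=-1)

depth = np.linspace(0.5, 3.0, 12).reshape(3, 4)  # synthetic depth map (meters)
rgb = colorize_depth(depth)                      # shape (3, 4, 3), values in [0, 1]
```

    Real pipelines typically use richer encodings (e.g. surface normals or a jet colormap) for the three channels.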

    Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation

    Full text link
    The electrocardiogram (ECG) is one of the most extensively employed signals used in the diagnosis and prediction of cardiovascular diseases (CVDs). ECG signals can capture the heart's rhythmic irregularities, commonly known as arrhythmias. A careful study of ECG signals is crucial for precise diagnosis of patients' acute and chronic heart conditions. In this study, we propose a two-dimensional (2-D) convolutional neural network (CNN) model for the classification of ECG signals into eight classes, namely normal beat, premature ventricular contraction beat, paced beat, right bundle branch block beat, left bundle branch block beat, atrial premature contraction beat, ventricular flutter wave beat, and ventricular escape beat. The one-dimensional ECG time-series signals are transformed into 2-D spectrograms through the short-time Fourier transform. The 2-D CNN model, consisting of four convolutional layers and four pooling layers, is designed to extract robust features from the input spectrograms. Our proposed methodology is evaluated on the publicly available MIT-BIH arrhythmia dataset. We achieved a state-of-the-art average classification accuracy of 99.11%, which is better than recently reported results in classifying similar types of arrhythmias. The performance is significant in other indices as well, including sensitivity and specificity, which indicates the success of the proposed method.
    Comment: 14 pages, 5 figures, accepted for future publication in the MDPI journal Remote Sensing
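    The 1-D-to-2-D transformation described above can be sketched with `scipy.signal.stft`. The sampling rate matches MIT-BIH (360 Hz), but the window parameters and the synthetic signal below are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.signal import stft

fs = 360                                  # MIT-BIH sampling rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)           # 2 seconds of signal
rng = np.random.default_rng(0)
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)  # stand-in for a beat

# Short-time Fourier transform: 1-D time series -> 2-D time-frequency image
f, frames, Zxx = stft(ecg, fs=fs, nperseg=64, noverlap=32)
spectrogram = np.abs(Zxx)                 # magnitude spectrogram fed to the 2-D CNN
```

    The magnitude array has one row per frequency bin (`nperseg // 2 + 1` for a one-sided transform) and one column per time frame; in practice it is usually log-scaled before being fed to the CNN.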

    Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy

    Get PDF
    With the advent of agriculture 3.0 and 4.0, researchers are increasingly focusing on the development of innovative smart farming and precision agriculture technologies by introducing automation and robotics into agricultural processes. Autonomous agricultural field machines have been gaining significant attention from farmers and industries as a way to reduce costs, human workload, and required resources. Nevertheless, achieving sufficient autonomous navigation capabilities requires the simultaneous cooperation of different processes: localization, mapping, and path planning are just some of the steps that aim to provide the machine with the right set of skills to operate in semi-structured and unstructured environments. In this context, this study presents a low-cost local motion planner for autonomous navigation in vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer control algorithm. The first algorithm exploits the disparity map and its depth representation to generate a proportional control command for the robotic platform. Concurrently, a second back-up algorithm, based on representation learning and resilient to illumination variations, can take control of the machine in case of a momentary failure of the first block. Moreover, owing to the dual nature of the system, after initial training of the deep learning model on an initial dataset, the strict synergy between the two algorithms opens the possibility of exploiting new automatically labeled data, coming from the field, to extend the existing model's knowledge. The machine learning algorithm has been trained and tested, using transfer learning, with images acquired during different field surveys in northern Italy and then optimized for on-device inference with model pruning and quantization. Finally, the overall system has been validated with a customized robot platform in the relevant environment.
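    The first control layer, a proportional command derived from the depth representation, might look roughly like the toy sketch below. The split-and-compare law and the `gain` parameter are assumptions for illustration, not the paper's exact controller:

```python
import numpy as np

def proportional_steering(depth_map, gain=0.8):
    """Toy depth-based proportional controller: split the depth image into
    left and right halves and steer toward the side with more free space
    (larger mean depth). Returns a signed steering command."""
    h, w = depth_map.shape
    left = np.mean(depth_map[:, : w // 2])
    right = np.mean(depth_map[:, w // 2 :])
    # Normalized imbalance in [-1, 1]: positive -> steer right (more room there)
    error = (right - left) / max(right + left, 1e-6)
    return gain * error

depth = np.full((48, 64), 2.0)   # 2 m of free space everywhere...
depth[:, :32] = 1.0              # ...except an obstacle closer on the left
cmd = proportional_steering(depth)  # positive: steer away from the obstacle
```

    A real vineyard controller would also account for row geometry and invalid disparity pixels; this only illustrates the proportional idea.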

    Evaluating color texture descriptors under large variations of controlled lighting conditions

    Full text link
    The recognition of color texture under varying lighting conditions is still an open issue. Several features have been proposed for this purpose, ranging from traditional statistical descriptors to features extracted with neural networks. Still, it is not completely clear under what circumstances one feature performs better than the others. In this paper we report an extensive comparison of old and new texture features, with and without a color normalization step, with a particular focus on how they are affected by small and large variations in the lighting conditions. The evaluation is performed on a new texture database including 68 samples of raw food acquired under 46 conditions that present single and combined variations of light color, direction, and intensity. The database allows systematic investigation of the robustness of texture descriptors across a large range of imaging conditions.
    Comment: Submitted to the Journal of the Optical Society of America
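    One widely used color normalization step of the kind such comparisons include is gray-world normalization; a minimal sketch follows (the abstract does not specify which normalization the paper uses, so this is an illustrative choice):

```python
import numpy as np

def gray_world(img):
    """Gray-world color normalization: scale each channel so its mean
    matches the global mean intensity, discounting the illuminant color."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    global_mean = channel_means.mean()
    return img * (global_mean / channel_means)

# A flat image with a strong color cast: channel means 100, 150, 200
img = np.stack([np.full((4, 4), v) for v in (100.0, 150.0, 200.0)], axis=-1)
norm = gray_world(img)  # all channel means become 150 (the global mean)
```

    The assumption behind gray-world is that the average scene reflectance is achromatic, which is exactly what varying light color violates or satisfies to different degrees across the 46 acquisition conditions.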

    Precise Prediction of Sweet Pepper Fruit Development Stages Using an Ensemble Model of a Convolutional Neural Network and Fully Connected Layers

    Get PDF
    Master's thesis, Seoul National University Graduate School, Department of Agriculture, Forestry and Bioresources, College of Agriculture and Life Sciences, February 2021. Advisor: ์†์ •์ต. Accurate detection of individual fruits and prediction of their development stages enable growers to allocate labor efficiently and manage strategically. However, predicting the fruit development stage is challenging, especially in sweet peppers, because the fruit harvest is discrete and the immature stages are visually indistinguishable. An ensemble model of convolutional and fully connected neural networks was developed to detect sweet pepper (Capsicum annuum L.) fruits in images and predict their development stages. The plants were grown in four rows in a greenhouse, and plant images were collected from both sides of each row. From April 6 to June 24, 2020, environmental data were collected every minute and plant growth data every month. For predicting the fruit stage, an ensemble of convolutional neural network (CNN) and multilayer perceptron (MLP) models was used. The fruit development stage was classified into immature, breaking, and mature stages with a CNN using images. Moreover, the immature stage was internally divided into four stages with an MLP. The plant growth and environmental data and the information from the CNN output were used as the MLP input. That is, a total of six stages were classified using the CNN-MLP ensemble model. The ensemble model showed good agreement in predicting fruit development stages: averaged over the six stages, it achieved an F1 score of 0.77 and an IoU of 0.86. The CNN-only model could classify the mature and breaking stages well but could not distinguish the immature stages, while the MLP-only model could hardly classify any fruit stage except the immature ones. The most influential factor in classification was the CNN output, while the plant growth and environmental data contributed to improving model accuracy.
The ensemble models can help in appropriate labor allocation and strategic management by detecting individual fruits in images and predicting precise fruit development stages.
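    The ensemble described above concatenates the CNN's stage probabilities with tabular growth and environment features before the fully connected layers; a minimal numpy sketch with hypothetical feature sizes (the thesis does not publish its layer dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two input branches (shapes are illustrative assumptions):
cnn_probs = rng.dirichlet(np.ones(3), size=8)   # CNN softmax: immature/breaking/mature
growth_env = rng.standard_normal((8, 5))        # plant growth + environment features

# Ensemble input: CNN output concatenated with tabular features, fed to the MLP
x = np.concatenate([cnn_probs, growth_env], axis=1)  # shape (batch, 3 + 5)

def mlp_forward(x, w1, b1, w2, b2):
    """One-hidden-layer MLP: ReLU hidden layer, softmax over the four
    immature sub-stages the thesis distinguishes."""
    h = np.maximum(x @ w1 + b1, 0.0)
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

w1, b1 = rng.standard_normal((8, 16)), np.zeros(16)   # untrained weights, for shape only
w2, b2 = rng.standard_normal((16, 4)), np.zeros(4)
probs = mlp_forward(x, w1, b1, w2, b2)                # (8, 4), rows sum to 1
```

    The design choice this illustrates is late fusion: the CNN compresses the image into class probabilities, and the MLP combines them with non-image data rather than both models seeing raw pixels.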

    A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU

    Full text link
    Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and artificial intelligence (AI), outperforming traditional ML methods, especially in handling unstructured and large datasets. Its impact spans various domains, including speech recognition, healthcare, autonomous vehicles, cybersecurity, predictive analytics, and more. However, the complexity and dynamic nature of real-world problems present challenges in designing effective deep learning models. Consequently, several deep learning models have been developed to address different problems and applications. In this article, we conduct a comprehensive survey of various deep learning models, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Models, Deep Reinforcement Learning (DRL), and Deep Transfer Learning. We examine the structure, applications, benefits, and limitations of each model. Furthermore, we perform an analysis using three publicly available datasets: IMDB, ARAS, and Fruit-360. We compare the performance of six renowned deep learning models: CNN, Simple RNN, Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU.
    Comment: 16 pages, 29 figures