2,164 research outputs found

    A deep representation for depth images from synthetic data

    Full text link
    Convolutional Neural Networks (CNNs) trained on large-scale RGB databases have become the secret sauce in the majority of recent approaches for object categorization from RGB-D data. Thanks to colorization techniques, these methods exploit the filters learned from 2D images to extract meaningful representations in 2.5D. Still, the perceptual signatures of these two kinds of images are very different, with the first usually strongly characterized by textures and the second mostly by object silhouettes. Ideally, one would like to have two CNNs, one for RGB and one for depth, each trained on a suitable data collection and able to capture the perceptual properties of its channel for the task at hand. This has not been possible so far due to the lack of a suitable depth database. This paper addresses this issue, proposing to opt for synthetically generated images rather than collecting a large-scale 2.5D database by hand. While clearly a proxy for real data, synthetic images allow one to trade quality for quantity, making it possible to generate a virtually infinite amount of data. We show that training on such a data collection, using the very same architecture typically used on visual data, yields very different filters, resulting in depth features that are (a) better able to characterize the different facets of depth images and (b) complementary to those derived from CNNs pre-trained on 2D datasets. Experiments on two publicly available databases show the power of our approach.
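    The colorization step the abstract alludes to can be sketched minimally; this is a naive normalize-and-replicate stand-in so an RGB-pretrained CNN can consume a depth map, not the specific colorization technique the paper uses:

```python
import numpy as np

def colorize_depth(depth):
    """Toy depth colorization: rescale depth to [0, 1] and replicate it to
    three channels, yielding an image shaped like an RGB input."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-9)
    return np.repeat(d[..., None], 3, axis=-1)

depth = np.linspace(0.5, 3.0, 12).reshape(3, 4)  # synthetic depth map (meters)
rgb = colorize_depth(depth)                      # shape (3, 4, 3), values in [0, 1]
```

    Real pipelines typically use richer encodings (e.g. surface normals or a jet colormap) for the three channels.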

    Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation

    Full text link
    The electrocardiogram (ECG) is one of the most extensively employed signals used in the diagnosis and prediction of cardiovascular diseases (CVDs). ECG signals can capture the heart's rhythmic irregularities, commonly known as arrhythmias. A careful study of ECG signals is crucial for precise diagnosis of patients' acute and chronic heart conditions. In this study, we propose a two-dimensional (2-D) convolutional neural network (CNN) model for the classification of ECG signals into eight classes, namely normal beat, premature ventricular contraction beat, paced beat, right bundle branch block beat, left bundle branch block beat, atrial premature contraction beat, ventricular flutter wave beat, and ventricular escape beat. The one-dimensional ECG time-series signals are transformed into 2-D spectrograms through the short-time Fourier transform. The 2-D CNN model, consisting of four convolutional layers and four pooling layers, is designed to extract robust features from the input spectrograms. Our proposed methodology is evaluated on the publicly available MIT-BIH arrhythmia dataset. We achieved a state-of-the-art average classification accuracy of 99.11%, which is better than recently reported results in classifying similar types of arrhythmias. The performance is significant in other indices as well, including sensitivity and specificity, which indicates the success of the proposed method.
    Comment: 14 pages, 5 figures, accepted for future publication in the MDPI journal Remote Sensing
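    The 1-D-to-2-D transformation described above can be sketched with `scipy.signal.stft`. The sampling rate matches MIT-BIH (360 Hz), but the window parameters and the synthetic signal below are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.signal import stft

fs = 360                                  # MIT-BIH sampling rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)           # 2 seconds of signal
rng = np.random.default_rng(0)
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)  # stand-in for a beat

# Short-time Fourier transform: 1-D time series -> 2-D time-frequency image
f, frames, Zxx = stft(ecg, fs=fs, nperseg=64, noverlap=32)
spectrogram = np.abs(Zxx)                 # magnitude spectrogram fed to the 2-D CNN
```

    The magnitude array has one row per frequency bin (`nperseg // 2 + 1` for a one-sided transform) and one column per time frame; in practice it is usually log-scaled before being fed to the CNN.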

    Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy

    Get PDF
    With the advent of agriculture 3.0 and 4.0, researchers are increasingly focusing on the development of innovative smart farming and precision agriculture technologies by introducing automation and robotics into agricultural processes. Autonomous agricultural field machines have been gaining significant attention from farmers and industries as a way to reduce costs, human workload, and required resources. Nevertheless, achieving sufficient autonomous navigation capabilities requires the simultaneous cooperation of different processes: localization, mapping, and path planning are just some of the steps that aim to provide the machine with the right set of skills to operate in semi-structured and unstructured environments. In this context, this study presents a low-cost local motion planner for autonomous navigation in vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer control algorithm. The first algorithm exploits the disparity map and its depth representation to generate a proportional control command for the robotic platform. Concurrently, a second back-up algorithm, based on representation learning and resilient to illumination variations, can take control of the machine in case of a momentary failure of the first block. Moreover, owing to the dual nature of the system, after initial training of the deep learning model on an initial dataset, the strict synergy between the two algorithms opens the possibility of exploiting new automatically labeled data, coming from the field, to extend the existing model's knowledge. The machine learning algorithm has been trained and tested, using transfer learning, with images acquired during different field surveys in northern Italy and then optimized for on-device inference with model pruning and quantization. Finally, the overall system has been validated with a customized robot platform in the relevant environment.
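    The first control layer, a proportional command derived from the depth representation, might look roughly like the toy sketch below. The split-and-compare law and the `gain` parameter are assumptions for illustration, not the paper's exact controller:

```python
import numpy as np

def proportional_steering(depth_map, gain=0.8):
    """Toy depth-based proportional controller: split the depth image into
    left and right halves and steer toward the side with more free space
    (larger mean depth). Returns a signed steering command."""
    h, w = depth_map.shape
    left = np.mean(depth_map[:, : w // 2])
    right = np.mean(depth_map[:, w // 2 :])
    # Normalized imbalance in [-1, 1]: positive -> steer right (more room there)
    error = (right - left) / max(right + left, 1e-6)
    return gain * error

depth = np.full((48, 64), 2.0)   # 2 m of free space everywhere...
depth[:, :32] = 1.0              # ...except an obstacle closer on the left
cmd = proportional_steering(depth)  # positive: steer away from the obstacle
```

    A real vineyard controller would also account for row geometry and invalid disparity pixels; this only illustrates the proportional idea.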

    Evaluating color texture descriptors under large variations of controlled lighting conditions

    Full text link
    The recognition of color texture under varying lighting conditions is still an open issue. Several features have been proposed for this purpose, ranging from traditional statistical descriptors to features extracted with neural networks. Still, it is not completely clear under what circumstances one feature performs better than the others. In this paper we report an extensive comparison of old and new texture features, with and without a color normalization step, with a particular focus on how they are affected by small and large variations in the lighting conditions. The evaluation is performed on a new texture database including 68 samples of raw food acquired under 46 conditions that present single and combined variations of light color, direction, and intensity. The database allows systematic investigation of the robustness of texture descriptors across a large range of imaging conditions.
    Comment: Submitted to the Journal of the Optical Society of America
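    One widely used color normalization step of the kind such comparisons include is gray-world normalization; a minimal sketch follows (the abstract does not specify which normalization the paper uses, so this is an illustrative choice):

```python
import numpy as np

def gray_world(img):
    """Gray-world color normalization: scale each channel so its mean
    matches the global mean intensity, discounting the illuminant color."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    global_mean = channel_means.mean()
    return img * (global_mean / channel_means)

# A flat image with a strong color cast: channel means 100, 150, 200
img = np.stack([np.full((4, 4), v) for v in (100.0, 150.0, 200.0)], axis=-1)
norm = gray_world(img)  # all channel means become 150 (the global mean)
```

    The assumption behind gray-world is that the average scene reflectance is achromatic, which is exactly what varying light color violates or satisfies to different degrees across the 46 acquisition conditions.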

    Precise Prediction of Sweet Pepper Fruit Development Stages Using an Ensemble Model of a Convolutional Neural Network and Fully Connected Layers

    Get PDF
    Master's thesis, Seoul National University Graduate School, Department of Agriculture, Forestry and Bioresources, College of Agriculture and Life Sciences, February 2021. Advisor: ์†์ •์ต. Accurate detection of individual fruits and prediction of their development stages enable growers to allocate labor efficiently and manage strategically. However, predicting the fruit development stage is challenging, especially in sweet peppers, because the fruit harvest is discrete and the immature stages are visually indistinguishable. An ensemble model of convolutional and fully connected neural networks was developed to detect sweet pepper (Capsicum annuum L.) fruits in images and predict their development stages. The plants were grown in four rows in a greenhouse, and plant images were collected from both sides of each row. From April 6 to June 24, 2020, environmental data were collected every minute and plant growth data every month. For predicting the fruit stage, an ensemble of convolutional neural network (CNN) and multilayer perceptron (MLP) models was used. The fruit development stage was classified into immature, breaking, and mature stages with a CNN using images. Moreover, the immature stage was internally divided into four stages with an MLP. The plant growth and environmental data and the information from the CNN output were used as the MLP input. That is, a total of six stages were classified using the CNN-MLP ensemble model. The ensemble model showed good agreement in predicting fruit development stages: averaged over the six stages, it achieved an F1 score of 0.77 and an IoU of 0.86. The CNN-only model could classify the mature and breaking stages well but could not distinguish the immature stages, while the MLP-only model could hardly classify any fruit stage except the immature ones. The most influential factor in classification was the CNN output, while the plant growth and environmental data contributed to improving model accuracy.
The ensemble models can help in appropriate labor allocation and strategic management by detecting individual fruits in images and predicting precise fruit development stages.
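    The ensemble described above concatenates the CNN's stage probabilities with tabular growth and environment features before the fully connected layers; a minimal numpy sketch with hypothetical feature sizes (the thesis does not publish its layer dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two input branches (shapes are illustrative assumptions):
cnn_probs = rng.dirichlet(np.ones(3), size=8)   # CNN softmax: immature/breaking/mature
growth_env = rng.standard_normal((8, 5))        # plant growth + environment features

# Ensemble input: CNN output concatenated with tabular features, fed to the MLP
x = np.concatenate([cnn_probs, growth_env], axis=1)  # shape (batch, 3 + 5)

def mlp_forward(x, w1, b1, w2, b2):
    """One-hidden-layer MLP: ReLU hidden layer, softmax over the four
    immature sub-stages the thesis distinguishes."""
    h = np.maximum(x @ w1 + b1, 0.0)
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

w1, b1 = rng.standard_normal((8, 16)), np.zeros(16)   # untrained weights, for shape only
w2, b2 = rng.standard_normal((16, 4)), np.zeros(4)
probs = mlp_forward(x, w1, b1, w2, b2)                # (8, 4), rows sum to 1
```

    The design choice this illustrates is late fusion: the CNN compresses the image into class probabilities, and the MLP combines them with non-image data rather than both models seeing raw pixels.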

    A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU

    Full text link
    Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and artificial intelligence (AI), outperforming traditional ML methods, especially in handling unstructured and large datasets. Its impact spans various domains, including speech recognition, healthcare, autonomous vehicles, cybersecurity, predictive analytics, and more. However, the complexity and dynamic nature of real-world problems present challenges in designing effective deep learning models. Consequently, several deep learning models have been developed to address different problems and applications. In this article, we conduct a comprehensive survey of various deep learning models, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Models, Deep Reinforcement Learning (DRL), and Deep Transfer Learning. We examine the structure, applications, benefits, and limitations of each model. Furthermore, we perform an analysis using three publicly available datasets: IMDB, ARAS, and Fruit-360. We compare the performance of six renowned deep learning models: CNN, Simple RNN, Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU.
    Comment: 16 pages, 29 figures