18 research outputs found

    Real-time image detection for edge devices: a peach fruit detection application

    Within the scope of precision agriculture, many applications have been developed to support decision making and yield enhancement. Fruit detection has attracted considerable attention from researchers, and it can be used offline. In contrast, some applications, such as robot vision in orchards, require computer vision models to run on edge devices while performing inferences at high speed. In this area, most modern applications use an integrated graphics processing unit (GPU). In this work, we propose the use of a tensor processing unit (TPU) accelerator with a Raspberry Pi target device and the state-of-the-art, lightweight, and hardware-aware MobileDet detector model. Our contribution is the extension of the possibilities of using accelerators (the TPU) for edge devices in precision agriculture. The proposed method was evaluated using a novel dataset of peaches with three cultivars, which will be made available for further studies. The model achieved an average precision (AP) of 88.2% and a performance of 19.84 frames per second (FPS) at an image size of 640 × 480. The results obtained show that the TPU accelerator can be an excellent alternative for processing on the edge in precision agriculture.

    Peaches Detection Using a Deep Learning Technique - A Contribution to Yield Estimation, Resources Management, and Circular Economy

    Fruit detection is crucial for yield estimation and for the performance of fruit-picking systems. Many state-of-the-art methods for fruit detection use convolutional neural networks (CNNs). This paper presents results for peach detection obtained by applying the Faster R-CNN framework to images captured in an outdoor orchard. Although this method has been used in other studies to detect fruits, there is no research on peaches. Because fruit colors, sizes, shapes, tree branches, fruit bunches, and distributions in trees are particular to each crop, a fruit detection procedure has to be developed for each specific fruit. The results show great potential for using this method to detect this type of fruit. An average precision (AP) of 0.90 was achieved for fruit detection. Precision agriculture applications based on deep neural networks (DNNs), such as the one proposed in this paper, can help to mitigate climate change due to horticultural activities through accurate product prediction, leading to improved resource management (e.g., irrigation water, nutrients, herbicides, pesticides) and helping to reduce food loss and waste via improved scheduling of agricultural activities. The authors are thankful to Fundação para a Ciência e Tecnologia (FCT) and R&D Unit “Center for Mechanical and Aerospace Science and Technologies” (C-MAST), under project UIDB/00151/2020, for the opportunity and the financial support to carry on this project. The contributions of Hugo Proença and Pedro Inácio in this work were supported by FCT/MEC through FEDER-PT2020 Partnership Agreement under Project UIDB//50008/2021.

    Cancer diagnosis using deep learning: A bibliographic review

    In this paper, we first describe the basics of the field of cancer diagnosis, covering the steps of cancer diagnosis followed by the typical classification methods used by doctors, to give readers a historical view of cancer classification techniques. These methods include the Asymmetry, Border, Color and Diameter (ABCD) method, the seven-point detection method, the Menzies method, and pattern analysis. They are used regularly by doctors for cancer diagnosis, although they are not considered efficient enough to achieve better performance. Moreover, for the benefit of all types of audience, the basic evaluation criteria are also discussed. The criteria include the receiver operating characteristic curve (ROC curve), area under the ROC curve (AUC), F1 score, accuracy, specificity, sensitivity, precision, Dice coefficient, average accuracy, and Jaccard index. Previously used methods are considered inefficient, calling for better and smarter methods for cancer diagnosis. Artificial intelligence and cancer diagnosis are gaining attention as a way to define better diagnostic tools. In particular, deep neural networks can be successfully used for intelligent image analysis. The basic framework of how this machine learning works on medical imaging is provided in this study, i.e., pre-processing, image segmentation and post-processing. The second part of this manuscript describes the different deep learning techniques, such as convolutional neural networks (CNNs), generative adversarial models (GANs), deep autoencoders (DANs), restricted Boltzmann machines (RBM), stacked autoencoders (SAE), convolutional autoencoders (CAE), recurrent neural networks (RNNs), long short-term memory (LSTM), multi-scale convolutional neural networks (M-CNN), and multi-instance learning convolutional neural networks (MIL-CNN). For each technique, we provide Python code, to allow interested readers to experiment with the cited algorithms on their own diagnostic problems. The third part of this manuscript compiles the successfully applied deep learning models for different types of cancers. Considering the length of the manuscript, we restrict ourselves to the discussion of breast cancer, lung cancer, brain cancer, and skin cancer. The purpose of this bibliographic review is to provide researchers who choose to work on implementing deep learning and artificial neural networks for cancer diagnosis with knowledge, from scratch, of the state-of-the-art achievements.
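
    Several of the evaluation criteria listed in this abstract (accuracy, specificity, sensitivity, precision, F1 score, Dice coefficient, Jaccard index) can be computed from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not code from the review itself; the function name `binary_metrics` and the example counts are hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "f1": f1,
        "dice": f1,                          # identical to F1 for binary labels
        "jaccard": ratio(tp, tp + fp + fn),
    }


# Illustrative counts only, not results from any paper.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    For binary labels the Dice coefficient coincides with the F1 score, and the Jaccard index follows from it as Dice / (2 - Dice), which is why the two are often reported together in segmentation work.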

    Enhanced 3D Point Cloud from a Light Field Image

    The importance of three-dimensional (3D) point cloud technologies in agricultural and environmental research has increased in recent years. Obtaining dense and accurate 3D reconstructions of plants and urban areas provides useful information for remote sensing. In this paper, we propose a novel strategy for the enhancement of 3D point clouds from a single 4D light field (LF) image. Using a light field camera in this way offers an easy way to obtain 3D point clouds from one snapshot and enables diversity in monitoring and modelling applications for remote sensing. Taking an LF image and its associated depth map as input, we first apply histogram equalization and histogram stretching to enhance the separation between depth planes. We then apply multi-modal edge detection by using feature matching and fuzzy logic on the central sub-aperture LF image and the depth map. These two depth map enhancement steps are a significant part of the novelty of this work. After combining the two previous steps and transforming the point–plane correspondence, we obtain the 3D point cloud. We tested our method on synthetic and real-world image databases. To verify its accuracy, we compared our results with two different state-of-the-art algorithms. The results showed that our method can reliably mitigate noise and had the highest level of detail compared to other existing methods.
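
    The first enhancement step named in this abstract, histogram stretching and histogram equalization of a depth map, can be illustrated with the textbook versions of both operations. This is only a sketch under stated assumptions: the abstract does not give the authors' exact formulation or the order of the two operations, so the functions `stretch` and `equalize`, the stretch-then-equalize order, and the tiny example depth map are all hypothetical.

```python
def stretch(depth, lo=0, hi=255):
    """Linearly rescale depth values so that they span [lo, hi]."""
    flat = [v for row in depth for v in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == d_min:
        # A constant map has no contrast to stretch.
        return [[lo for _ in row] for row in depth]
    span = d_max - d_min
    return [[round(lo + (v - d_min) * (hi - lo) / span) for v in row]
            for row in depth]


def equalize(depth, levels=256):
    """Histogram-equalize an integer depth map with values in [0, levels)."""
    flat = [v for row in depth for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1

    # Cumulative distribution of the depth values.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    n = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Only one depth value is present; nothing to equalize.
        return [row[:] for row in depth]

    # Lookup table mapping each old level to its equalized level.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (n - cdf_min)))
           for c in cdf]
    return [[lut[v] for v in row] for row in depth]


# Hypothetical depth map: three nearby planes (10, 11, 12) and two far ones.
depth_map = [[10, 10, 10, 11],
             [11, 12, 50, 200]]

stretched = stretch(depth_map)
enhanced = equalize(stretched)
print(stretched)  # [[0, 0, 0, 1], [1, 3, 54, 255]]
print(enhanced)   # [[0, 0, 0, 102], [102, 153, 204, 255]]
```

    In this toy example stretching alone leaves the three nearby planes almost indistinguishable (0, 1, 3), whereas equalization spreads them across the full range, which is the kind of separation between depth planes that the abstract describes.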

    An Attention-Guided Framework for Explainable Biometric Presentation Attack Detection

    Despite the high performance achieved using deep learning techniques in biometric systems, the inability to rationalise the decisions reached by such approaches is a significant drawback for the usability and security requirements of many applications. For Facial Biometric Presentation Attack Detection (PAD), deep learning approaches can provide good classification results but cannot answer questions such as “Why did the system make this decision?” To overcome this limitation, an explainable deep neural architecture for Facial Biometric Presentation Attack Detection is introduced in this paper. Both visual and verbal explanations are produced using the saliency maps from a Grad-CAM approach and the gradient from a Long Short-Term Memory (LSTM) network with a modified gate function. These explanations have also been used in the proposed framework as additional information to further improve the classification performance. The proposed framework utilises both spatial and temporal information to help the model focus on anomalous visual characteristics that indicate spoofing attacks. The performance of the proposed approach is evaluated using the CASIA-FA, Replay Attack, MSU-MFSD, and HKBU MARs datasets and indicates the effectiveness of the proposed method for improving performance and producing usable explanations.

    Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review

    Modern hyperspectral imaging systems produce huge datasets potentially conveying a great abundance of information; such a resource, however, poses many challenges in the analysis and interpretation of these data. Deep learning approaches certainly offer a great variety of opportunities for solving classical imaging tasks and also for approaching new stimulating problems in the spatial–spectral domain. This is fundamental in the driving sector of Remote Sensing where hyperspectral technology was born and has mostly developed, but it is perhaps even more true in the multitude of current and evolving application sectors that involve these imaging technologies. The present review develops on two fronts: on the one hand, it is aimed at domain professionals who want to have an updated overview on how hyperspectral acquisition techniques can combine with deep learning architectures to solve specific tasks in different application fields. On the other hand, we want to target the machine learning and computer vision experts by giving them a picture of how deep learning technologies are applied to hyperspectral data from a multidisciplinary perspective. The presence of these two viewpoints and the inclusion of application fields other than Remote Sensing are the original contributions of this review, which also highlights some potentialities and critical issues related to the observed development trends

    ArithFusion: An Arithmetic Deep Model for Temporal Remote Sensing Image Fusion

    Different satellite images may consist of variable numbers of channels which have different resolutions, and each satellite has a unique revisit period. For example, the Landsat-8 satellite images have 30 m resolution in their multispectral channels, the Sentinel-2 satellite images have 10 m resolution in the pan-sharp channel, and the National Agriculture Imagery Program (NAIP) aerial images have 1 m resolution. In this study, we propose a simple yet effective arithmetic deep model for multimodal temporal remote sensing image fusion. The proposed model takes both low- and high-resolution remote sensing images at t1 together with low-resolution images at a future time t2 from the same location as inputs and fuses them to generate high-resolution images for the same location at t2. We propose an arithmetic operation applied to the low-resolution images at the two time points in feature space to take care of temporal changes. We evaluated the proposed model on three modality pairs for multimodal temporal image fusion, including downsampled WorldView-2/original WorldView-2, Landsat-8/Sentinel-2, and Sentinel-2/NAIP. Experimental results show that our model outperforms traditional algorithms and recent deep learning-based models by large margins in most scenarios, achieving sharp fused images while appropriately addressing temporal changes

    Task-driven learned hyperspectral data reduction using end-to-end supervised deep learning

    An important challenge in hyperspectral imaging tasks is to cope with the large number of spectral bins. Common spectral data reduction methods do not take prior knowledge about the task into account. Consequently, sparsely occurring features that may be essential for the imaging task may not be preserved in the data reduction step. Convolutional neural network (CNN) approaches are capable of learning the specific features relevant to the particular imaging task, but applying them directly to the spectral input data is constrained by the computational efficiency. We propose a novel supervised deep learning approach for combining data reduction and image analysis in an end-to-end architecture. In our approach, the neural network component that performs the reduction is trained such that image features most relevant for the task are preserved in the reduction step. Results for two convolutional neural network architectures and two types of generated datasets show that the proposed Data Reduction CNN (DRCNN) approach can produce more accurate results than existing popular data reduction methods, and can be used in a wide range of problem settings. The integration of knowledge about the task allows for more image compression and higher accuracies compared to standard data reduction methods

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, generating a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity.