14 research outputs found

    Deep neural network for traffic sign recognition systems: An analysis of spatial transformers and stochastic optimisation methods

    This paper presents a Deep Learning approach for traffic sign recognition systems. Several classification experiments are conducted over publicly available traffic sign datasets from Germany and Belgium using a Deep Neural Network which comprises Convolutional layers and Spatial Transformer Networks. These trials are designed to measure the impact of diverse factors, with the end goal of designing a Convolutional Neural Network that improves on the state of the art in traffic sign classification. First, different adaptive and non-adaptive stochastic gradient descent optimisation algorithms such as SGD, SGD-Nesterov, RMSprop and Adam are evaluated. Subsequently, multiple combinations of Spatial Transformer Networks placed at distinct positions within the main neural network are analysed. The proposed Convolutional Neural Network reaches a recognition accuracy of 99.71% on the German Traffic Sign Recognition Benchmark, outperforming previous state-of-the-art methods while also being more efficient in terms of memory requirements.
    Ministerio de Economía y Competitividad TIN2017-82113-C2-1-R; Ministerio de Economía y Competitividad TIN2013-46801-C4-1-
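
    The four optimisers named in this abstract differ only in how they turn a gradient into a parameter update. The sketch below is not the paper's code: it writes out the textbook update rules in NumPy and runs them on a toy quadratic. The function names, learning rates, step counts and the test function are all assumptions chosen for illustration.

```python
import numpy as np

def sgd(grad, x, lr=0.1, steps=200):
    # Plain gradient descent step: x <- x - lr * g
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

def sgd_nesterov(grad, x, lr=0.1, momentum=0.9, steps=200):
    # Nesterov momentum: evaluate the gradient at the look-ahead point
    v = np.zeros_like(x)
    for _ in range(steps):
        v = momentum * v - lr * grad(x + momentum * v)
        x = x + v
    return x

def rmsprop(grad, x, lr=0.05, rho=0.9, eps=1e-8, steps=500):
    # RMSprop: scale each step by a running RMS of recent gradients
    s = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        s = rho * s + (1.0 - rho) * g ** 2
        x = x - lr * g / (np.sqrt(s) + eps)
    return x

def adam(grad, x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    # Adam: bias-corrected running means of the gradient and its square
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

def quadratic_grad(x):
    # Gradient of the toy objective f(x) = x^2 (illustrative only)
    return 2.0 * x

x0 = np.array([5.0])
results = {
    "SGD": sgd(quadratic_grad, x0),
    "SGD-Nesterov": sgd_nesterov(quadratic_grad, x0),
    "RMSprop": rmsprop(quadratic_grad, x0),
    "Adam": adam(quadratic_grad, x0),
}
```

    On this toy problem all four end close to the minimum at 0. RMSprop with a fixed learning rate tends to hover near the minimum rather than settle exactly on it, which is one reason optimiser choice can matter in practice.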

    A Parametric Algorithm for Skyline Extraction

    This paper is dedicated to the problem of automatic skyline extraction in digital images. The study is motivated by the need, expressed by urbanists, to describe in terms of geometrical features the global shape created by man-made buildings in urban areas. Skyline extraction has been widely studied for navigation of Unmanned Aerial Vehicles (drones) or for geolocalization, both in natural and urban contexts. In most of these studies, the skyline is defined as the boundary between sky and ground objects, so its extraction reduces to the sky segmentation problem in images. In our context, we need a more generic definition of skyline, which makes its extraction more complex and even variable. The skyline can be extracted for different depths, depending on the interest of the user (far horizon, intermediate buildings, near constructions, ...), and thus requires human interaction. The main steps of our method are as follows: we use a Canny filter to extract edges and allow the user to interact with the filter's parameters. With a high sensitivity, all the edges will be detected, whereas with lower values, only the most contrasted contours will be kept by the filter. From the resulting edge map, an upper envelope is extracted, which is a disconnected approximation of the skyline. A graph is then constructed and a shortest path algorithm is used to link discontinuities. Our approach has been tested on several public domain urban and natural databases, and has proven to give better results than previously published methods.
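
    The upper-envelope step described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a binary edge map has already been computed (for example by a Canny filter), and it fills gaps by linear interpolation as a stand-in for the graph-based shortest path linking. The function names are hypothetical.

```python
import numpy as np

def upper_envelope(edge_map):
    """Row index of the topmost edge pixel in each column, or -1 if the column has no edge."""
    edges = np.asarray(edge_map, dtype=bool)
    has_edge = edges.any(axis=0)
    top_row = edges.argmax(axis=0)  # index of the first True going down each column
    return np.where(has_edge, top_row, -1)

def link_gaps(envelope):
    """Fill columns marked -1 by linear interpolation between known neighbours.

    Assumption: this replaces the paper's shortest path linking with a much
    simpler rule, purely to show where the linking step fits.
    """
    env = np.asarray(envelope, dtype=float)
    cols = np.arange(env.size)
    known = env >= 0
    if not known.any():
        return env  # nothing to interpolate from
    return np.interp(cols, cols[known], env[known])

# Tiny synthetic edge map: rows are image rows (0 = top), columns are image columns
edges = np.array([
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
])
envelope = upper_envelope(edges)   # column 2 has no edge, so it is marked -1
skyline = link_gaps(envelope)      # column 2 is filled from its neighbours
```

    Lowering the edge detector's sensitivity would remove weaker contours from the edge map, which changes which pixel is topmost in each column. That is how the user interaction described in the abstract selects a nearer or farther skyline.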

    Human Action Recognition with RGB-D Sensors

    Human action recognition, also known as HAR, is at the foundation of many different applications related to behavioral analysis, surveillance, and safety, and has therefore been a very active research area in recent years. The release of inexpensive RGB-D sensors encouraged researchers working in this field, because depth data simplify visual processing that would otherwise be difficult with classic RGB devices. Furthermore, the availability of depth data makes it possible to implement solutions that are unobtrusive and privacy preserving compared with classic video-based analysis. In this scenario, the aim of this chapter is to review the most salient techniques for HAR based on depth signal processing, providing some details on a specific method based on a temporal pyramid of key poses, evaluated on the well-known MSR Action3D dataset.
    Cippitelli, Enea; Gambi, Ennio; Spinsante, Susanna

    An Efficient Human Activity Recognition Technique Based on Deep Learning

    In this paper, we present a new deep learning-based human activity recognition technique. First, we track and extract the human body from each frame of the video stream. Next, we abstract human silhouettes and use them to create binary space-time maps (BSTMs), which summarize human activity within a defined time interval. Finally, we use a convolutional neural network (CNN) to extract features from the BSTMs and classify the activities. To evaluate our approach, we carried out several tests using three public datasets: Weizmann, Keck Gesture and KTH Database. Experimental results show that our technique outperforms conventional state-of-the-art methods in terms of recognition accuracy and provides performance comparable to recent deep learning techniques. It is simple to implement, requires less computing power, and can be used for multi-subject activity recognition.

    On the efficacy of handcrafted and deep features for seed image classification

    Computer vision techniques have become important in agriculture and plant sciences due to their wide variety of applications. In particular, the analysis of seeds can provide meaningful information on their evolution, the history of agriculture, the domestication of plants, and knowledge of diets in ancient times. This work presents an exhaustive comparison of several different types of features in the context of multiclass seed classification, leveraging two public plant seed data sets to classify their families or species. In detail, we studied possible optimisations of five traditional machine learning classifiers trained with seven different categories of handcrafted features. We also fine-tuned several well-known convolutional neural networks (CNNs) and the recently proposed SeedNet to determine whether and to what extent using their deep features may be advantageous over handcrafted features. The experimental results demonstrated that CNN features are appropriate to the task and representative of the multiclass scenario. In particular, SeedNet achieved a mean F-measure of at least 96%. Nevertheless, in several cases the handcrafted features performed well enough to be considered a valid alternative. In detail, we found that the Ensemble strategy combined with all the handcrafted features can achieve a mean F-measure of at least 90.93% in considerably less time. We consider the obtained results an excellent preliminary step towards realising an automatic seed recognition and classification framework.

    Técnicas de visión por computador para la detección del verdor y la detección de obstáculos en campos de maíz

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, defended on 22/06/2017.
    There is an increasing demand for Computer Vision techniques in Precision Agriculture (PA) based on images captured with cameras on board autonomous vehicles. Two techniques have been developed in this research: the first for greenness identification and the second for obstacle detection in maize fields, including people and animals, for tractors in the RHEA (Robot Fleets for Highly Effective Agriculture and Forestry Management) project, equipped with on-board monocular cameras. For vegetation identification in agricultural images, the usual strategy combines colour vegetation indices (CVIs) with thresholding techniques, where the remaining elements of the image are also extracted. The main goal of this research line is the development of an alternative strategy for vegetation detection. To achieve this goal, we propose a methodology based on two well-known techniques in computer vision: Bag of Words representation (BoW) and Support Vector Machines (SVM). Each image is first partitioned into several Regions Of Interest (ROIs). A feature descriptor is then obtained for each ROI and evaluated with a classifier model (previously trained to discriminate between vegetation and background) to determine whether or not the ROI is vegetation...
    Translated from the Spanish abstract: There is growing demand for Computer Vision techniques in Precision Agriculture, through the processing of images captured by cameras installed on autonomous vehicles. Two kinds of techniques were developed in this research: one for identifying green plants and one for detecting obstacles in maize fields, including people and animals, for tractors of the RHEA project. The final objective of the autonomous vehicles was to identify and eliminate weeds in maize fields. In agricultural images, vegetation is generally detected using vegetation indices and thresholding methods. The indices are computed from the spectral properties of the colour images. This thesis proposes a new method for this purpose, which is a primary objective of the research. The proposal is based on a strategy known as "bag of words" together with a supervised learning model. Both techniques are widely used in image recognition and classification. The image is first divided into homogeneous regions or regions of interest (ROIs). Given a collection of ROIs obtained from a set of agricultural images, their local features are computed and grouped by similarity. Each group represents a "visual word", and the set of visual words found forms a "visual dictionary". Each ROI is represented by a set of visual words, which are quantified according to their occurrence within the region, yielding a code vector or "codebook" that serves as the descriptor of the ROI. Finally, Support Vector Machines are used to evaluate the code vectors and thus discriminate between ROIs that are vegetation and the rest...
    Depto. de Ingeniería de Software e Inteligencia Artificial (ISIA), Fac. de Informática
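
    The Bag of Words descriptor outlined in this abstract can be illustrated with a short NumPy sketch. It is not the thesis code: it assumes the visual dictionary already exists as an array of cluster centres (for instance from k-means, which the abstract does not name), and it leaves out the SVM that would classify the resulting vectors. The function and variable names are hypothetical.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Normalised visual-word histogram for one region of interest.

    descriptors: (n, d) local feature vectors extracted from the region
    codebook:    (k, d) visual words, assumed to be precomputed cluster centres
    """
    codebook = np.asarray(codebook, dtype=float)
    descriptors = np.asarray(descriptors, dtype=float).reshape(-1, codebook.shape[1])
    # Squared Euclidean distance from every descriptor to every visual word
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Each descriptor votes for its nearest visual word
    words = dists.argmin(axis=1)
    counts = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    total = counts.sum()
    # Normalise so regions with different numbers of descriptors are comparable
    return counts / total if total > 0 else counts

# Toy example: a two-word dictionary and four 2-D descriptors
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
roi_descriptors = np.array([[1.0, 1.0], [9.0, 9.0], [8.0, 10.0], [10.0, 11.0]])
roi_vector = bow_histogram(roi_descriptors, codebook)
```

    Each region then has a fixed-length vector regardless of how many local features it contains, which is what allows a classifier such as an SVM to be trained on vegetation and background examples.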

    Human action recognition and mobility assessment in smart environments with RGB-D sensors

    This research activity is focused on the development of algorithms and solutions for smart environments exploiting RGB and depth sensors. In particular, the topics addressed are the mobility assessment of a subject and human action recognition. Regarding the first topic, the goal is to implement algorithms for the extraction of objective parameters that can support the assessment of mobility tests performed by healthcare staff. The first proposed algorithm extracts six joints on the sagittal plane using depth data provided by the Kinect sensor. Its accuracy in estimating torso and knee angles in the sit-to-stand phase is evaluated against a marker-based stereophotogrammetric system used as a reference. A second algorithm is proposed to simplify running the test in a home environment and to allow the extraction of a greater number of parameters from the execution of the Timed Up and Go test. Kinect data are combined with those of an accelerometer through a synchronization algorithm, constituting a setup that can also be used for other applications that benefit from the joint usage of RGB, depth and inertial data. Fall detection algorithms exploiting the same configuration as the Timed Up and Go test are then proposed. Regarding the second topic, the goal is to classify human actions that can be carried out in a home environment. Two algorithms for human action recognition are therefore proposed, which exploit the Kinect skeleton joints and a multi-class SVM for the recognition of actions belonging to publicly available datasets, achieving results comparable with the state of the art on the CAD-60, KARD and MSR Action3D datasets.
    Ingegneria dell'Informazione; Cippitelli, Enea
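
    The torso and knee angles mentioned above are angles between body segments that meet at a joint. As a minimal sketch, assuming joint positions are available as 2-D or 3-D coordinates (the names and sample values below are hypothetical and not taken from the thesis), such an angle could be computed like this:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at joint b between the segments b->a and b->c.

    For a knee angle, a, b and c would be the hip, knee and ankle positions.
    Coincident points give a zero-length segment and an undefined (NaN) angle.
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    u, v = a - b, c - b
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against rounding slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# Hypothetical sagittal-plane coordinates (x forward, y up), arbitrary units
hip, knee, ankle = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
seated_knee_angle = joint_angle(hip, knee, ankle)                 # right angle at the knee
standing_knee_angle = joint_angle((0.0, 1.0), (0.0, 0.0), (0.0, -1.0))  # straight leg
```

    Tracking this value frame by frame during a sit-to-stand movement gives the kind of objective parameter that could be compared against a reference measurement system.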