22 research outputs found
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition
Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation so that it can capture long-range joint interactions involved in actions as well as filter noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state of the art on the NTU RGB+D 120 dataset.
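The motion encoding described in the SkeleMotion abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `motion_maps`, the input layout (a list of frames, each a list of `(x, y, z)` joint positions), and the choice of three plane-wise angles as the "orientation" are all assumptions made for this example. The `scale` parameter stands in for the temporal scale, that is, the frame gap over which joint displacement is measured.

```python
import math

def motion_maps(frames, scale=1):
    """Compute per-joint motion magnitude and orientation between frames
    that are `scale` steps apart.

    frames: list of frames; each frame is a list of (x, y, z) joint positions.
    Returns (magnitudes, orientations), both indexed as [time][joint].
    """
    magnitudes, orientations = [], []
    for t in range(len(frames) - scale):
        mag_row, ori_row = [], []
        for (x0, y0, z0), (x1, y1, z1) in zip(frames[t], frames[t + scale]):
            dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
            # Magnitude: Euclidean length of the joint displacement.
            mag_row.append(math.sqrt(dx * dx + dy * dy + dz * dz))
            # Orientation (assumed form): displacement angle in the
            # xy, yz and zx planes.
            ori_row.append((math.atan2(dy, dx),
                            math.atan2(dz, dy),
                            math.atan2(dx, dz)))
        magnitudes.append(mag_row)
        orientations.append(ori_row)
    return magnitudes, orientations

# Two joints over three frames; the second joint never moves.
frames = [
    [(0, 0, 0), (1, 1, 1)],
    [(3, 4, 0), (1, 1, 1)],
    [(3, 4, 12), (1, 1, 1)],
]
short_range, _ = motion_maps(frames, scale=1)
long_range, _ = motion_maps(frames, scale=2)
```

Calling the function with several values of `scale` and stacking the resulting matrices is one plausible way to obtain a multi-scale image for a CNN; how the scales are actually combined is not stated in the abstract.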
AiRound and CV-BrCT: Novel Multiview Datasets for Scene Classification
It is undeniable that aerial/satellite images can provide useful information for a large variety of tasks. However, since these images are always taken from above, some applications can benefit from complementary information provided by other perspectives of the scene, such as ground-level images. Despite a large number of public repositories for both georeferenced photographs and aerial images, there is a lack of benchmark datasets that allow the development of approaches that exploit the benefits and complementarity of aerial/ground imagery. In this article, we present two new publicly available datasets named AiRound and CV-BrCT. The first one contains triplets of images from the same geographic coordinate, taken from different perspectives and extracted from various places around the world. Each triplet is composed of an aerial RGB image, a ground-level perspective image, and a Sentinel-2 sample. The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil. We design an extensive set of experiments concerning multiview scene classification, using early and late fusion. These experiments were conducted to show that image classification can be enhanced using multiview data.
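The difference between the early and late fusion mentioned in the AiRound and CV-BrCT abstract can be illustrated with a short sketch. It assumes a generic setup that the abstract does not specify: early fusion concatenates per-view feature vectors before a single classifier, while late fusion averages per-view class probabilities. The function names and the averaging rule are illustrative choices, not the paper's method.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def early_fusion(view_features):
    """Early fusion: concatenate the feature vectors of all views into one
    vector, which would then be fed to a single classifier."""
    fused = []
    for features in view_features:
        fused.extend(features)
    return fused

def late_fusion(view_scores):
    """Late fusion: each view has its own classifier; average their class
    probabilities and pick the most probable class."""
    probs = [softmax(scores) for scores in view_scores]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg

# Hypothetical aerial, ground-level and Sentinel-2 views of one location.
fused_vector = early_fusion([[1, 2], [3], [4, 5]])
label, avg_probs = late_fusion([[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

In this toy case the ground-level view disagrees with the other two, and the averaged probabilities still favor class 0, which is the kind of complementarity a multiview classifier is meant to exploit.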
Semantic segmentation of tree-canopy in urban environment with pixel-wise deep learning
Urban forests are an important part of any city, given that they provide several environmental benefits, such as improving urban drainage, climate regulation, public health, and biodiversity, among others. However, tree detection in cities is challenging, given the irregular shape, size, occlusion, and complexity of urban areas. With the advance of environmental technologies, deep learning segmentation mapping methods can map urban forests accurately. We applied a region-based CNN object instance segmentation algorithm for the semantic segmentation of tree canopies in urban environments based on aerial RGB imagery. To the best of our knowledge, no study has investigated the performance of deep learning-based methods for segmentation tasks inside the Cerrado biome, specifically for urban tree segmentation. Five state-of-the-art architectures were evaluated, namely: Fully Convolutional Network; U-Net; SegNet; Dynamic Dilated Convolution Network; and DeepLabV3+. The experimental analysis showed the effectiveness of these methods, reporting results such as a pixel accuracy of 96.35%, an average accuracy of 91.25%, an F1-score of 91.40%, a Kappa of 82.80%, and an IoU of 73.89%. We also determined the inference time needed per area; after training, the deep learning methods investigated proved suitable for this task, providing fast and effective solutions with inference times varying from 0.042 to 0.153 minutes per hectare. We conclude that the semantic segmentation of trees inside urban environments is highly achievable with deep neural networks. This information could be of high importance to decision-making and may contribute to the management of urban systems. The dataset used in this work is available on our website.
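The evaluation measures listed in the abstract above (pixel accuracy, average accuracy, F1-score, Kappa, IoU) can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions for a tree / non-tree labeling; it assumes that "average accuracy" means the mean of the per-class accuracies, which the abstract does not spell out, and the function name is hypothetical.

```python
def segmentation_metrics(pred, truth):
    """Binary segmentation metrics from flat lists of 0/1 pixel labels
    (1 = tree canopy). Assumes both classes occur in `truth`."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    n = tp + fp + fn + tn

    pixel_acc = (tp + tn) / n
    # Assumed meaning of "average accuracy": mean of per-class accuracies.
    avg_acc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    # Cohen's kappa: agreement beyond what chance alone would produce.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (pixel_acc - expected) / (1 - expected)
    return {"pixel_acc": pixel_acc, "avg_acc": avg_acc,
            "f1": f1, "iou": iou, "kappa": kappa}

# 100 pixels: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
pred = [1] * 40 + [1] * 10 + [0] * 10 + [0] * 40
truth = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40
metrics = segmentation_metrics(pred, truth)
```

Note that IoU is always at most the F1-score for the same confusion matrix, which is consistent with the reported IoU (73.89%) being lower than the reported F1-score (91.40%).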
Classification semi-automatique des images de télédétection [Semi-automatic classification of remote sensing images]
A huge effort has been made in the development of image classification systems with the objective of creating high-quality thematic maps and establishing precise inventories of land cover use. The peculiarities of Remote Sensing Images (RSIs), combined with the traditional image classification challenges, make RSI classification a hard task. Many of the problems are related to the representation scale of the data, and to both the size and the representativeness of the training set used. In this work, we addressed four research issues in order to develop effective solutions for interactive classification of remote sensing images. The first research issue concerns the fact that image descriptors proposed in the literature achieve good results in various applications, but many of them have never been used in remote sensing classification tasks. We have tested twelve descriptors that encode spectral/color properties and seven texture descriptors. We have also proposed a methodology based on the K-Nearest Neighbor (KNN) classifier for the evaluation of descriptors in a classification context. Experiments demonstrate that Joint Auto-Correlogram (JAC), Color Bitmap, Invariant Steerable Pyramid Decomposition (SID), and Quantized Compound Change Histogram (QCCH) yield the best results in coffee and pasture recognition tasks. The second research issue refers to the problem of selecting the scale of segmentation for object-based remote sensing classification. Recently proposed methods exploit features extracted from segmented objects to improve high-resolution image classification. However, the definition of the scale of segmentation is a challenging task. We have proposed two multiscale classification approaches based on boosting of weak classifiers. The first approach, Multiscale Classifier (MSC), builds a strong classifier that combines features extracted from multiple scales of segmentation. The other, Hierarchical Multiscale Classifier (HMSC), exploits the hierarchical topology of segmented regions to improve training efficiency without accuracy loss when compared to the MSC. Experiments show that it is better to use multiple scales than a single segmentation scale. We have also analyzed and discussed the correlation among the descriptors used and the scales of segmentation. The third research issue concerns the selection of training examples and the refinement of classification results through multiscale segmentation. We have proposed an approach for interactive multiscale classification of remote sensing images. It is an active learning strategy that allows the user to refine the classification result over successive iterations. Experimental results show that the combination of scales produces better results than isolated scales in a relevance feedback process. Furthermore, the interactive method achieves good results with few user interactions. The proposed method needs only a small portion of the training set to build classifiers that are as strong as the ones generated by a supervised method that uses the whole available training set. The fourth research issue refers to the problem of extracting features from a hierarchy of regions for multiscale classification. We have proposed a strategy that exploits the existing relationships among regions in a hierarchy. This approach, called BoW-Propagation, exploits the bag-of-visual-word model to propagate features along multiple scales. We also extend this idea to propagate histogram-based global descriptors, the H-Propagation method. The proposed methods speed up the feature extraction process and yield good results when compared with global low-level extraction approaches.
Résumé: The objective of this thesis is to develop effective solutions for the interactive classification of remote sensing images. This objective was achieved by answering four research questions. The first question concerns the fact that image descriptors proposed in the literature achieve good results in various applications, but many of them have never been used for remote sensing image classification. We tested twelve descriptors that encode spectral and color properties, as well as seven texture descriptors. We also proposed a methodology based on the KNN (K-nearest neighbors) classifier for evaluating descriptors in the context of classification. The Joint Auto-Correlogram (JAC), Color Bitmap, Invariant Steerable Pyramid Decomposition (SID), and Quantized Compound Change Histogram (QCCH) descriptors obtained the best results in the coffee plantation and pasture recognition experiments. The second question relates to the choice of the segmentation scale for object-based image classification. Some recently proposed methods exploit features extracted from segmented objects to improve high-resolution image classification. However, choosing a good segmentation scale is a difficult task. We therefore proposed two approaches for multiscale classification founded on the principles of boosting, which combines weak classifiers to form a strong classifier. The first approach, Multiscale Classifier (MSC), builds a strong classifier that combines features extracted from several segmentation scales. The other, Hierarchical Multiscale Classifier (HMSC), exploits the hierarchical topology of segmented regions to improve classification efficiency without loss of accuracy compared to the MSC. Experiments show that it is preferable to use several scales rather than a single segmentation scale. We also analyzed and discussed the correlation between the descriptors and the segmentation scales. The third question concerns the selection of training examples and the improvement of classification results based on multiscale segmentation. We proposed an approach for interactive multiscale classification of remote sensing images. It is an active learning strategy that allows the user to refine the classification results. Experimental results show that combining scales produces better results than each scale in isolation in a relevance feedback process. Moreover, the interactive method achieves good results with few user interactions. It needs only a small part of the training set to build classifiers that are as strong as those generated by a supervised method that uses the complete training set. The fourth question refers to the problem of extracting features from a hierarchy of regions for multiscale classification. We proposed a strategy that exploits the existing relationships among regions in a hierarchy. This approach, called BoW-Propagation, exploits the bag-of-visual-word model to propagate features between the scales of the hierarchy. We also extended this idea to propagate histogram-based global descriptors, the H-Propagation approach. These approaches speed up the extraction process and give good results compared with global descriptor extraction.
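The idea of building a strong classifier by boosting weak classifiers drawn from several segmentation scales can be sketched as below. This is not the thesis's MSC implementation, whose details are not given in the abstract. It is a generic AdaBoost loop, chosen here as one common boosting algorithm, in which each weak learner is a decision stump on one feature of one scale. The data layout (`features[scale][i]` is the feature vector of sample `i` at that scale), the scale names, and all function names are assumptions for illustration.

```python
import math

def stump_predict(value, threshold, polarity):
    """Weak learner: threshold a single feature value into +1 / -1."""
    return polarity if value >= threshold else -polarity

def best_stump(features, labels, weights):
    """Search every (scale, feature, threshold, polarity) combination and
    return the stump with the lowest weighted error."""
    best = None
    for scale, rows in features.items():
        for j in range(len(rows[0])):
            for thr in sorted({r[j] for r in rows}):
                for pol in (1, -1):
                    err = sum(w for w, r, y in zip(weights, rows, labels)
                              if stump_predict(r[j], thr, pol) != y)
                    if best is None or err < best[0]:
                        best = (err, scale, j, thr, pol)
    return best

def train(features, labels, rounds=3):
    """AdaBoost: reweight samples so later stumps focus on earlier mistakes."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, scale, j, thr, pol = best_stump(features, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) / division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, scale, j, thr, pol))
        weights = [w * math.exp(-alpha * y * stump_predict(r[j], thr, pol))
                   for w, r, y in zip(weights, features[scale], labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, sample):
    """sample: dict mapping scale name -> feature vector of one region."""
    score = sum(alpha * stump_predict(sample[scale][j], thr, pol)
                for alpha, scale, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data: the class is only separable using the "coarse" scale feature.
features = {
    "fine": [[0.1], [0.9], [0.2], [0.8]],
    "coarse": [[1.0], [1.2], [3.0], [3.5]],
}
labels = [-1, -1, 1, 1]
model = train(features, labels)
```

Because each weak learner records which scale it came from, inspecting the trained ensemble shows which scales the boosting procedure found informative, which relates to the abstract's claim that multiple scales outperform a single one.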
Semi-automatic recognition and vectorization of regions in remote sensing images
Advisor: Ricardo da Silva Torres. Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação.
Resumo: The use of remote sensing images (RSIs) as a source of information in agribusiness applications is quite common. In these applications, knowing how the land is occupied is fundamental. However, recognizing and differentiating agricultural crop regions in RSIs is still not a trivial task. Although automatic methods have been proposed for this purpose, users often prefer to perform the recognition manually. This happens because such methods are normally designed to solve specific problems or, when they are general-purpose, do not produce satisfactory results, so that the user invariably has to review the results manually. The research aimed at the specification and partial implementation of a system for semi-automatic recognition and vectorization of regions in remote sensing images. To this end, an interactive strategy called relevance feedback was used, which relies on the classification system being able to learn which regions are of interest from relevance indications made by the user over successive iterations. The idea is to use image descriptors to encode spectral and texture information of image partitions and to use relevance feedback with Genetic Programming (GP) to combine the descriptor features. GP is a machine learning technique based on the theory of evolution. The main contributions of this work are: a comparative study of image vectorization techniques; the adaptation of a recently proposed content-based image retrieval model to perform relevance feedback using image regions; the adaptation of the relevance feedback model for region recognition in RSIs; the partial implementation of a system for semi-automatic recognition and vectorization of regions in RSIs; and a proposed methodology for validating the developed system.
Abstract: The use of remote sensing images as a source of information in agribusiness applications is very common. In these applications, it is fundamental to know how the land is occupied. However, the identification and recognition of crop regions in remote sensing images are not yet trivial tasks. Although automatic methods have been proposed for this, users sometimes prefer to identify regions manually. This happens because these methods are usually developed to solve specific problems or, when they have a general purpose, do not yield satisfactory results. This work presents a semi-automatic method to vectorize regions from remote sensing images using relevance feedback based on genetic programming (GP). Relevance feedback is a technique used in content-based image retrieval (CBIR). Its objective is to incorporate user preferences into the search process. The proposed solution consists of using image descriptors to encode texture and spectral features from the images, applying relevance feedback based on GP to combine these features with information obtained from user interactions, and then segmenting the image. Finally, the segmented image (raster) is converted into a vector representation. The main contributions of this work are: a comparative study of image vectorization techniques; the extension of a recently proposed relevance feedback approach to deal with image regions; the extension of the relevance feedback model for region recognition in remote sensing images; the partial implementation of the semi-automatic recognition and vectorization system for regions in remote sensing images; and the proposal of a validation methodology.
Master's degree. Master in Computer Science.
Classificação semi-automática de imagens de sensoriamento remoto [Semi-automatic classification of remote sensing images]
Advisors: Ricardo da Silva Torres, Alexandre Xavier Falcão. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação.
Resumo: A great effort has been made to develop image classification systems capable of creating high-quality thematic maps and establishing precise inventories of land use. The peculiarities of remote sensing images (RSIs), combined with the traditional challenges of image classification, make RSI classification a difficult task. Many of the research challenges are related to the representation scale of the data and, at the same time, to the size and representativeness of the training set used. The main focus of this work is on problems related to data representation and feature extraction. The objective is to develop effective solutions for interactive classification of remote sensing images. This objective was achieved by developing four lines of research. The first line of research relates to the fact that, although image descriptors proposed in the literature obtain good results in various applications, many of them have never been used for remote sensing image classification. In this thesis, twelve descriptors that encode spectral properties and seven texture descriptors were tested. A methodology based on the K-nearest neighbors (KNN) classifier was also proposed for evaluating descriptors in the classification context. The Joint Auto-Correlogram (JAC), Color Bitmap, Invariant Steerable Pyramid Decomposition (SID), and Quantized Compound Change Histogram (QCCH) descriptors presented the best experimental results in identifying coffee and pasture targets. The second line of research refers to the problem of selecting segmentation scales for object-based classification of remote sensing images. Recently proposed methods exploit features extracted from segmented objects to improve high-resolution image classification. However, defining a suitable segmentation scale is a challenging task. In this thesis, two multiscale classification approaches based on the AdaBoost algorithm were proposed. The first approach, Multiscale Classifier (MSC), builds a strong classifier that combines features extracted from multiple segmentation scales. The other, Hierarchical Multiscale Classifier (HMSC), exploits the hierarchical relationship of the segmented regions to improve efficiency without reducing classification quality when compared to the MSC approach. The experiments show that it is better to use multiple scales than only one segmentation scale. The correlation between the descriptors and the segmentation scales is also analyzed and discussed. The third line of research deals with the selection of training samples and the refinement of classification results using multiscale segmentation. To this end, an interactive method for multiscale classification of remote sensing images was proposed. This method uses a strategy based on active learning that allows the user to refine the classification results over successive interactions. The experimental results showed that combining scales produces better results than using isolated scales in a relevance feedback process. In addition, the interactive method obtains good results with few interactions. The proposed method needs only a small portion of the training set to build classifiers as strong as those generated by a supervised method using the entire available training set. The fourth line of research refers to the extraction of features from a hierarchy of regions for multiscale classification. Thus, an approach was proposed that exploits the existing relationships among the regions of the hierarchy. This approach, called BoW-Propagation, uses the bag-of-visual-word model to propagate features along multiple scales. This idea was extended to propagate histogram-based global descriptors, the H-Propagation approach. The proposed approaches speed up the extraction process and obtain good results when compared to global descriptors.
Abstract: A huge effort has been made in the development of image classification systems with the objective of creating high-quality thematic maps and establishing precise inventories of land cover use. The peculiarities of Remote Sensing Images (RSIs), combined with the traditional image classification challenges, make RSI classification a hard task. Many of the problems are related to the representation scale of the data, and to both the size and the representativeness of the training set used. In this work, we addressed four research issues in order to develop effective solutions for interactive classification of remote sensing images. The first research issue concerns the fact that image descriptors proposed in the literature achieve good results in various applications, but many of them have never been used in remote sensing classification tasks. We have tested twelve descriptors that encode spectral/color properties and seven texture descriptors. We have also proposed a methodology based on the K-Nearest Neighbor (KNN) classifier for the evaluation of descriptors in a classification context. Experiments demonstrate that Joint Auto-Correlogram (JAC), Color Bitmap, Invariant Steerable Pyramid Decomposition (SID), and Quantized Compound Change Histogram (QCCH) yield the best results in coffee and pasture recognition tasks. The second research issue refers to the problem of selecting the scale of segmentation for object-based remote sensing classification. Recently proposed methods exploit features extracted from segmented objects to improve high-resolution image classification. However, the definition of the scale of segmentation is a challenging task. We have proposed two multiscale classification approaches based on boosting of weak classifiers. The first approach, Multiscale Classifier (MSC), builds a strong classifier that combines features extracted from multiple scales of segmentation. The other, Hierarchical Multiscale Classifier (HMSC), exploits the hierarchical topology of segmented regions to improve training efficiency without accuracy loss when compared to the MSC. Experiments show that it is better to use multiple scales than a single segmentation scale. We have also analyzed and discussed the correlation among the descriptors used and the scales of segmentation. The third research issue concerns the selection of training examples and the refinement of classification results through multiscale segmentation. We have proposed an approach for interactive multiscale classification of remote sensing images. It is an active learning strategy that allows the user to refine the classification result over successive iterations. Experimental results show that the combination of scales produces better results than isolated scales in a relevance feedback process. Furthermore, the interactive method achieves good results with few user interactions. The proposed method needs only a small portion of the training set to build classifiers that are as strong as the ones generated by a supervised method that uses the whole available training set. The fourth research issue refers to the problem of extracting features from a hierarchy of regions for multiscale classification. We have proposed a strategy that exploits the existing relationships among regions in a hierarchy. This approach, called BoW-Propagation, exploits the bag-of-visual-word model to propagate features along multiple scales. We also extend this idea to propagate histogram-based global descriptors, the H-Propagation method. The proposed methods speed up the feature extraction process and yield good results when compared with global low-level extraction approaches.
Doctorate. Computer Science. Doctor in Computer Science.
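The notion of propagating histogram-based descriptors along a hierarchy of regions, rather than recomputing them from pixels at every scale, can be sketched as follows. The abstract does not state the propagation rule used by H-Propagation, so this example assumes the simplest one: a parent region's histogram is the bin-wise sum of its children's histograms. The region names, the dictionary layout, and the function names are hypothetical.

```python
def propagate_histograms(hierarchy, leaf_hists, root):
    """Compute a histogram for every region in a hierarchy by summing the
    histograms of its children (assumed propagation rule).

    hierarchy: dict mapping a parent region to the list of its children.
    leaf_hists: dict mapping each finest-scale region to its bin counts.
    Returns a dict mapping every region to its histogram.
    """
    hists = dict(leaf_hists)

    def visit(region):
        if region not in hists:
            child_hists = [visit(child) for child in hierarchy[region]]
            # Bin-wise sum: pixels of the parent are the union of its children.
            hists[region] = [sum(bins) for bins in zip(*child_hists)]
        return hists[region]

    visit(root)
    return hists

def normalize(hist):
    """L1-normalize a histogram so regions of different sizes are comparable."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)

# Features are extracted from pixels only for the finest regions a1, a2, b.
hierarchy = {"root": ["a", "b"], "a": ["a1", "a2"]}
leaf_hists = {"a1": [1, 0, 2], "a2": [0, 3, 1], "b": [4, 0, 0]}
all_hists = propagate_histograms(hierarchy, leaf_hists, "root")
```

Under this assumption each pixel is visited once, at the finest scale, and coarser scales cost only a few additions per region, which is one way the speed-up claimed in the abstract could arise.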