145 research outputs found

    Using Prior Knowledge for Verification and Elimination of Stationary and Variable Objects in Real-time Images

    With evolving technologies in the autonomous vehicle industry, it has become possible for automobile passengers to sit back and relax instead of driving. Technologies such as object detection, object identification, and image segmentation enable an autonomous car to detect and identify objects on the road in order to drive safely. While an autonomous car drives itself, the objects surrounding it can be dynamic (e.g., cars and pedestrians), stationary (e.g., buildings and benches), or variable (e.g., trees), depending on whether an object's location or shape changes. Unlike existing image-based approaches to detecting and recognizing objects in a scene, this research employs a 3D virtual world to verify and eliminate stationary and variable objects, allowing the autonomous car to focus on the dynamic objects that may endanger its driving. The methodology takes advantage of prior knowledge of the stationary and variable objects represented in a virtual city and verifies their existence in a real-time scene by matching keypoints between the virtual and real objects. When a stationary or variable object does not exist in the virtual world due to incomplete pre-existing information, the method falls back on machine learning for object detection. Verified objects are then removed from the real-time image with a combined algorithm using contour detection and class activation maps (CAM), which improves both the efficiency and the accuracy of recognizing moving objects.
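The verification step above can be sketched as a keypoint-agreement check. This is a simplified illustration, not the paper's implementation: the descriptor sets, threshold value, and exact-match comparison are stand-ins for real keypoint descriptors (e.g., ORB/SIFT) matched by distance.

```python
def verify_object(real_descriptors, virtual_descriptors, match_threshold=0.6):
    """Verify a detected object against prior knowledge from the virtual city.

    Descriptors are modeled here as hashable feature signatures; a real
    system would match ORB/SIFT keypoint descriptors by distance instead
    of exact set intersection.
    """
    if not real_descriptors:
        return "dynamic-candidate"
    matched = len(set(real_descriptors) & set(virtual_descriptors))
    ratio = matched / len(real_descriptors)
    # Enough keypoint agreement -> the object exists in the virtual city,
    # so it is stationary/variable and can be removed from the scene.
    return "verified-static" if ratio >= match_threshold else "dynamic-candidate"
```

Objects that fail verification remain in the image as dynamic candidates and are passed to the downstream detector.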

    Methodology for generating synthetic labeled datasets for visual container inspection

    Containerized freight transport is one of the most important transportation systems today, and it is undergoing an automation process driven by the success of deep learning. However, it suffers from a lack of the annotated data needed to incorporate state-of-the-art neural network models into its systems. In this paper we present an innovative methodology to generate a realistic, varied, balanced, and labelled dataset for the visual inspection of containers in a dock environment. In addition, we validate this methodology on multiple visual tasks recurrently found in the state of the art, and we show that the generated synthetic labelled dataset can train a deep neural network usable in a real-world scenario. Finally, using this methodology we provide the first open synthetic labelled dataset, called SeaFront, available at: https://datasets.vicomtech.org/di21-seafront/readme.txt
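Because the rendering engine places every object itself, labels come for free with each synthetic frame. A minimal sketch of emitting one COCO-style record is shown below; the function name and field selection are illustrative assumptions, not the paper's actual schema.

```python
def make_coco_record(image_id, file_name, width, height, boxes):
    """Build a minimal COCO-style record for one synthetic render.

    `boxes` is a list of (category_id, x, y, w, h) tuples that the
    renderer knows exactly, since it placed the objects itself -- this
    is what makes synthetic data essentially free to label.
    """
    images = [{"id": image_id, "file_name": file_name,
               "width": width, "height": height}]
    annotations = [
        {"id": i, "image_id": image_id, "category_id": cat,
         "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0}
        for i, (cat, x, y, w, h) in enumerate(boxes)
    ]
    return {"images": images, "annotations": annotations}
```

Records for all renders would then be merged (together with a `categories` list) into a single annotation file for training.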

    Panoptic Segmentation Meets Remote Sensing

    Panoptic segmentation combines instance and semantic predictions, allowing the detection of "things" and "stuff" simultaneously. Effectively approaching panoptic segmentation in remotely sensed data can be auspicious in many challenging problems, since it allows continuous mapping and specific target counting. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labelling must encompass both "things" and "stuff" classes, and (c) the annotation format is complex. Thus, aiming to solve these problems and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation, (2) propose an annotation conversion software to generate panoptic annotations, (3) propose a novel dataset of urban areas, (4) modify Detectron2 for the task, and (5) evaluate the difficulties of this task in the urban setting. We used an aerial image with a 0.24-meter spatial resolution covering 14 classes. Our pipeline considers three image inputs, and the proposed software uses point shapefiles for creating samples in the COCO format. Our study generated 3,400 samples with 512x512-pixel dimensions. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation in remote sensing and an extensive database for other researchers to use with other data or related problems requiring thorough scene understanding.
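The patch-extraction step (cutting a large aerial scene into fixed 512x512 samples) can be sketched as computing patch origins over the image grid. The function name and the inward shift of the final row/column are illustrative assumptions, not the study's exact pipeline.

```python
def patch_origins(width, height, patch=512, stride=512):
    """Top-left corners of fixed-size patches covering a large scene.

    The final row/column is shifted inward so every patch lies fully
    inside the image, at the cost of some overlap near the borders.
    """
    def axis(length):
        starts = list(range(0, max(length - patch, 0) + 1, stride))
        if length > patch and starts[-1] != length - patch:
            starts.append(length - patch)  # cover the remainder
        return starts
    return [(x, y) for y in axis(height) for x in axis(width)]
```

Each origin then indexes a crop of the image and of the corresponding panoptic label raster, producing aligned training samples.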

    Unsupervised Segmentation in Real-World Images via Spelke Object Inference

    Self-supervised, category-agnostic segmentation of real-world images is a challenging open problem in computer vision. Here, we show how to learn static grouping priors from motion self-supervision by building on the cognitive science concept of a Spelke Object: a set of physical stuff that moves together. We introduce the Excitatory-Inhibitory Segment Extraction Network (EISEN), which learns to extract pairwise affinity graphs for static scenes from motion-based training signals. EISEN then produces segments from affinities using a novel graph propagation and competition network. During training, objects that undergo correlated motion (such as robot arms and the objects they move) are decoupled by a bootstrapping process: EISEN explains away the motion of objects it has already learned to segment. We show that EISEN achieves a substantial improvement in the state of the art for self-supervised image segmentation on challenging synthetic and real-world robotics datasets.

    An Evaluation of Deep Learning-Based Object Identification

    Identification of instances of semantic objects of a particular class, which has been woven into people's lives through applications such as autonomous driving and security monitoring, is one of the most crucial and challenging areas of computer vision. Recent developments in deep learning networks for detection have improved object detector accuracy. To provide a detailed review of the current state of object detection pipelines, we begin by analyzing the methodologies employed by classical detection models and presenting the benchmark datasets used in this study. We then examine one- and two-stage detectors in detail before summarizing several object detection approaches. In addition, we survey both established and emerging applications, rather than examining only a single branch of object detection. Finally, we look at how various object detection algorithms can be combined into a system that is both efficient and effective, and we identify a number of emerging trends to guide the use of the most recent algorithms and future research.

    INQUIRIES IN INTELLIGENT INFORMATION SYSTEMS: NEW TRAJECTORIES AND PARADIGMS

    Rapid digital transformation drives organizations to continually revitalize their business models so they can excel in aggressive global competition. Intelligent Information Systems (IIS) have enabled organizations to achieve many strategic and market advantages. Despite the increasing intelligence competencies offered by IIS, they are still limited in many cognitive functions, and elevating those competencies would strengthen organizations' strategic positions. With the advent of Deep Learning (DL), IoT, and Edge Computing, IIS have witnessed a leap in their intelligence competencies. DL has been applied to many business areas and industries, such as real estate and manufacturing. Moreover, despite the complexity of DL models, much research has been dedicated to applying DL to computationally limited devices such as IoT devices; applying deep learning to IoT turns everyday devices into intelligent interactive assistants. IIS suffer from many challenges that affect their service quality, process quality, and information quality, which in turn affect user acceptance in terms of satisfaction, use, and trust. Moreover, Information Systems (IS) research has paid very little attention to IIS development and to the foreseeable contribution of new paradigms to addressing IIS challenges. Therefore, this research investigates how the employment of new AI paradigms would enhance the overall quality, and consequently the user acceptance, of IIS. It employs different AI paradigms to develop two different IIS. The first system uses deep learning, edge computing, and IoT to develop scene-aware ridesharing monitoring, enhancing the efficiency, privacy, and responsiveness of current ridesharing monitoring solutions. The second system aims to enhance the real-estate search process by formulating the search problem as a multi-criteria decision.
The system also allows users to filter properties based on their degree of damage, where a deep learning network locates damage in each real-estate image. The system enhances real-estate website service quality by improving flexibility, relevancy, and efficiency. The research contributes to Information Systems research by developing two Design Science artifacts. Both artifacts add to the IS knowledge base by integrating different components, measurements, and techniques coherently and logically to address important issues in IIS effectively. The research also adds to the IS environment by addressing important business requirements that current methodologies and paradigms do not fulfill, and it highlights that most IIS overlook important design guidelines due to the lack of relevant evaluation metrics for different business problems.
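The multi-criteria formulation of the real-estate search can be sketched as a weighted-sum ranking over normalized criteria. The min-max normalization, criteria names, and sign convention for cost-like criteria (negative weights for price or degree of damage) are illustrative assumptions, not the dissertation's exact model.

```python
def rank_properties(properties, weights):
    """Rank listings by a weighted sum over min-max-normalized criteria.

    `properties` maps listing id -> {criterion: raw value}; `weights`
    maps criterion -> importance, with negative weights for criteria the
    user wants minimized (e.g., price, degree of damage).
    """
    criteria = list(weights)
    lo = {c: min(p[c] for p in properties.values()) for c in criteria}
    hi = {c: max(p[c] for p in properties.values()) for c in criteria}

    def score(p):
        total = 0.0
        for c in criteria:
            span = hi[c] - lo[c]
            norm = (p[c] - lo[c]) / span if span else 0.0
            total += weights[c] * norm
        return total

    return sorted(properties, key=lambda k: score(properties[k]), reverse=True)
```

A damage criterion produced by the image network would simply enter `weights` with a negative value, so heavily damaged listings sink in the ranking.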

    Deep learning & remote sensing: pushing the frontiers in image segmentation

    Dissertation (Master's in Informatics) — Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, Brasília, 2022. Image segmentation aims to simplify the understanding of digital images. Deep learning-based methods using convolutional neural networks have been game-changing, allowing the exploration of different tasks (e.g., semantic, instance, and panoptic segmentation). Semantic segmentation assigns a class to every pixel in an image, instance segmentation classifies objects at a pixel level with a unique identifier for each target, and panoptic segmentation combines instance-level predictions with different backgrounds. Remote sensing data largely benefits from these methods, being very suitable for developing new DL algorithms and creating solutions using top-view images. However, some peculiarities prevent remote sensing using orbital and aerial imagery from growing at the pace of traditional ground-level images (e.g., camera photos): (1) the images are extensive, (2) they present different characteristics (e.g., number of channels and image format), (3) they require a high number of pre-processing and post-processing steps (e.g., extracting patches and classifying large scenes), and (4) most open software for labeling and deep learning applications is not friendly to remote sensing for the aforementioned reasons.
This dissertation aimed to improve all three main categories of image segmentation. Within the instance segmentation domain, we proposed three experiments. First, we enhanced the box-based instance segmentation approach for classifying large scenes, allowing practical pipelines to be implemented. Second, we created a bounding-box-free method that reaches instance segmentation results using semantic segmentation models in a scenario with sparse objects. Third, we improved the previous method for crowded scenes and developed the first study considering semi-supervised learning using remote sensing and GIS data. Subsequently, in the panoptic segmentation domain, we presented the first remote sensing panoptic segmentation dataset, containing fourteen classes, and provided software and a methodology for converting GIS data into the panoptic segmentation format. Since our first study considered RGB images, we extended our approach to multispectral data. Finally, we extended the box-free method initially designed for instance segmentation to the panoptic segmentation task. This dissertation analyzed various segmentation methods and image types, and the developed solutions enable the exploration of new tasks (such as panoptic segmentation), the simplification of labeling data (using the proposed semi-supervised learning procedure), and a simplified way to obtain instance and panoptic predictions using simple semantic segmentation models.
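The box-free idea of recovering instances from a plain semantic mask when objects are sparse (non-touching) reduces, in its simplest form, to connected-component labelling. The following is a minimal sketch of that reduction, not the dissertation's actual method:

```python
from collections import deque

def instances_from_mask(mask):
    """Box-free instance extraction: label 4-connected components of a
    binary semantic mask, assigning one id per blob. When objects do not
    touch, each component is one instance."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not labels[i][j]:
                next_id += 1
                queue = deque([(i, j)])
                labels[i][j] = next_id
                while queue:  # flood-fill this blob
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id
```

Crowded scenes with touching objects break this assumption, which is exactly why the dissertation's later experiments refine the method beyond plain connectivity.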

    Inteligência artificial e sistemas de irrigação por pivô central : desenvolvimento de estratégias e técnicas para o aprimoramento do mapeamento automático

    Doctoral thesis — Universidade de Brasília, Instituto de Ciências Humanas, Departamento de Geografia, Programa de Pós-Graduação em Geografia, 2022. Irrigation is primarily responsible for increasing crop productivity. Center pivot irrigation systems (CPIS) are the leading form of mechanized irrigation in Brazil, with significant growth in recent decades and a projected increase of more than 134% in area by 2040. The most widely used method for identifying CPIS is based on visual interpretation and manual mapping of circular features, making the task time-consuming and laborious. In this context, methods based on Deep Learning (DL) using Convolutional Neural Networks (CNNs) have great potential for classifying remote sensing images. The use of DL has revolutionized image classification, surpassing traditional methods and achieving greater precision and efficiency, allowing regional and continuous monitoring at low cost and with agility.
This research aimed to apply DL techniques using CNN-based algorithms to identify CPIS in remote sensing images. The work is divided into three main chapters: (a) identification of CPIS in Landsat-8/OLI images, using semantic segmentation with three CNN algorithms (U-Net, Deep ResUnet, and SharpMask); (b) CPIS detection using instance segmentation of multitemporal Sentinel-1/SAR images (two polarizations, VV and VH) with the Mask-RCNN algorithm and the ResNeXt-101-32x8d backbone; and (c) CPIS detection using multitemporal Sentinel-2/MSI images with different cloud-cover percentages and instance segmentation using Mask-RCNN with the ResNeXt-101 backbone. The methodological steps differed between the chapters, and all achieved high metric values and strong CPIS detection capacity. The classifications using Landsat-8/OLI images and the U-Net, Deep ResUnet, and SharpMask algorithms reached Kappa coefficients of 0.96, 0.95, and 0.92, respectively. Classifications using Sentinel-1/SAR images showed the best metrics when the two polarizations were combined, VV+VH (75% AP, 91% AP50, and 86% AP75). The classification of Sentinel-2/MSI images with clouds yielded metrics on the set of six cloud-free images (80% AP and 93% AP50) very close to those on the set with an extreme cloud scenario (74% AP and 88% AP50), demonstrating that the use of multitemporal images increases predictive power during learning. A significant contribution of the research is the proposed reconstruction of images of large areas using a sliding-window algorithm, allowing multiple overlaps of classified images and a better per-pixel pivot estimate. The study establishes an adequate methodology for automatic center pivot detection using three different types of freely available remote sensing images, in addition to a database of CPIS vectors for Central Brazil.
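The sliding-window reconstruction with overlapping classified patches can be sketched as per-pixel averaging over every patch that covers a pixel. This is a simplified illustration of the idea with hypothetical names, not the thesis's exact algorithm:

```python
def merge_overlapping(pred_patches, width, height, patch):
    """Reconstruct a large-scene score map from overlapping patch
    predictions by averaging every patch that covers each pixel.

    `pred_patches` is a list of (x0, y0, scores) with `scores` a
    patch x patch grid of per-pixel probabilities.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for x0, y0, scores in pred_patches:
        for dy in range(patch):
            for dx in range(patch):
                total[y0 + dy][x0 + dx] += scores[dy][dx]
                count[y0 + dy][x0 + dx] += 1
    # Average where at least one patch contributed; 0.0 elsewhere.
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)] for y in range(height)]
```

Averaging several overlapping predictions per pixel smooths boundary artifacts between windows, which is what yields the better per-pixel pivot estimate described above.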

    Online Audio-Visual Multi-Source Tracking and Separation: A Labeled Random Finite Set Approach

    The dissertation proposes an online solution for separating an unknown and time-varying number of moving sources using audio and visual data. The random finite set framework is used to model and fuse the audio and visual data, enabling an online tracking algorithm to estimate source positions and identities at each time step. With this information, a set of beamformers can be designed to separate each desired source and suppress the interfering sources.
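The final separation stage can be illustrated with the simplest beamformer, delay-and-sum with integer-sample steering delays derived from an estimated source position. This is a generic sketch of the technique, not the dissertation's labeled-random-finite-set design:

```python
def delay_and_sum(signals, delays):
    """Delay-and-sum beamformer sketch: advance each microphone signal
    by its (integer-sample) steering delay toward the tracked source
    position, then average. Summation is coherent for the desired
    source and incoherent (attenuated) for interferers arriving from
    other directions.
    """
    # Usable length after applying each channel's delay.
    n = min(len(s) - d for s, d in zip(signals, delays))
    return [sum(s[d + t] for s, d in zip(signals, delays)) / len(signals)
            for t in range(n)]
```

In the proposed system, the tracker's position estimates at each time point would supply the steering delays, with one beamformer instantiated per tracked source identity.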