145 research outputs found
Using Prior Knowledge for Verification and Elimination of Stationary and Variable Objects in Real-time Images
With evolving technologies in the autonomous vehicle industry, it has now become possible for automobile passengers to sit back and relax instead of driving the car. Technologies like object detection, object identification, and image segmentation have enabled an autonomous car to identify and detect objects on the road in order to drive safely. While an autonomous car drives by itself, the objects surrounding it can be dynamic (e.g., cars and pedestrians), stationary (e.g., buildings and benches), or variable (e.g., trees), depending on whether the location or shape of an object changes. Unlike existing image-based approaches to detecting and recognizing objects in the scene, this research employs a 3D virtual world to verify and eliminate stationary and variable objects, allowing the autonomous car to focus on the dynamic objects that may endanger its driving. The methodology takes advantage of prior knowledge of the stationary and variable objects present in a virtual city and verifies their existence in a real-time scene by matching keypoints between the virtual and real objects. When a stationary or variable object does not exist in the virtual world due to incomplete pre-existing information, the method falls back on machine learning for object detection. Verified objects are then removed from the real-time image with a combined algorithm using contour detection and a class activation map (CAM), which enhances both the efficiency and the accuracy of recognizing moving objects.
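The verification step, matching keypoints between the virtual city and the live frame, can be sketched roughly as follows. The abstract does not specify the descriptor type or thresholds, so the binary (ORB-like) descriptors, the Lowe-style ratio test, and the `min_matches` cutoff below are all illustrative assumptions:

```python
import numpy as np

def match_descriptors(desc_virtual, desc_real, ratio=0.75):
    """Ratio-test matching between two sets of binary keypoint
    descriptors (rows of 0/1 values). Returns (i_virtual, j_real)
    index pairs whose best Hamming-distance match is clearly better
    than the second-best candidate."""
    matches = []
    for i, d in enumerate(desc_virtual):
        dists = np.count_nonzero(desc_real != d, axis=1)  # Hamming distances
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

def is_verified(desc_virtual, desc_real, min_matches=10):
    """A stationary or variable object from the virtual city counts as
    verified (hence removable) once enough keypoints match the frame."""
    return len(match_descriptors(desc_virtual, desc_real)) >= min_matches
```

In practice the descriptors would come from a detector such as ORB run on both the rendered virtual view and the camera frame; objects that fail verification would fall through to the machine-learning detector, as the abstract describes.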
Methodology for generating synthetic labeled datasets for visual container inspection
Nowadays, containerized freight transport is one of the most important transportation systems, and it is undergoing an automation process driven by the success of Deep Learning. However, it suffers from a lack of annotated data with which to incorporate state-of-the-art neural network models into its systems. In this paper we present an innovative methodology to generate a realistic, varied, balanced, and labelled dataset for the visual inspection task of containers in a dock environment. In addition, we validate this methodology on multiple visual tasks recurrently found in the state of the art. We prove that the generated synthetic labelled dataset allows training a deep neural network that can be used in a real-world scenario. Furthermore, using this methodology we provide the first open synthetic labelled dataset, called SeaFront, available at: https://datasets.vicomtech.org/di21-seafront/readme.txt
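Generating such a labelled dataset ultimately means emitting annotations a detector can consume, and COCO-style JSON is the de facto choice. The helper below is a minimal, hypothetical sketch of that last step; the field layout follows the standard COCO schema, while the file names and category ids are made up and not SeaFront's actual tooling:

```python
import json

def make_coco_sample(image_id, file_name, boxes, category_ids,
                     width=1024, height=768):
    """Build a minimal COCO-style image record plus its annotation
    records for one synthetic frame. `boxes` are [x, y, w, h] pixel
    boxes, one per rendered object."""
    image = {"id": image_id, "file_name": file_name,
             "width": width, "height": height}
    annotations = [
        {"id": image_id * 1000 + k,   # globally unique annotation id
         "image_id": image_id,
         "category_id": cat,
         "bbox": list(box),
         "area": box[2] * box[3],
         "iscrowd": 0}
        for k, (box, cat) in enumerate(zip(boxes, category_ids))
    ]
    return image, annotations

# Serializing many such samples yields the dataset's annotation file.
image, anns = make_coco_sample(1, "frame_0001.png", [[10, 20, 30, 40]], [2])
coco_json = json.dumps({"images": [image], "annotations": anns})
```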
Panoptic Segmentation Meets Remote Sensing
Panoptic segmentation combines instance and semantic predictions, allowing
the detection of "things" and "stuff" simultaneously. Effectively approaching
panoptic segmentation in remotely sensed data is promising for many challenging problems, since it allows continuous mapping and specific target
counting. Several difficulties have prevented the growth of this task in remote
sensing: (a) most algorithms are designed for traditional images, (b) image
labelling must encompass "things" and "stuff" classes, and (c) the annotation
format is complex. Thus, aiming to solve these difficulties and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation, (2) propose an annotation conversion software to generate panoptic annotations, (3) propose a novel dataset on urban areas, (4) modify Detectron2 for the task, and (5) evaluate the difficulties of this task in the urban setting. We used an aerial image with a 0.24-meter spatial resolution considering 14 classes. Our pipeline considers three image inputs, and the proposed software uses point shapefiles for creating samples in the COCO format. Our study generated 3,400 samples with 512x512-pixel dimensions. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation and an extensive database for other researchers to use and to deal with other data or related problems requiring a thorough scene understanding.
Comment: 40 pages, 10 figures, submitted to journal
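The reported PQ follows the standard panoptic quality definition: segments are matched at IoU > 0.5, and PQ = sum of matched IoUs / (|TP| + 0.5|FP| + 0.5|FN|). A minimal sketch, with the matching already reduced to a list of candidate IoUs:

```python
def panoptic_quality(candidate_ious, n_pred, n_gt, iou_thresh=0.5):
    """Panoptic quality for one class. `candidate_ious` holds the IoU of
    each predicted segment's best ground-truth overlap; pairs above the
    threshold are true positives, the rest count as FP or FN."""
    tp = [iou for iou in candidate_ious if iou > iou_thresh]
    fp = n_pred - len(tp)   # predictions with no valid match
    fn = n_gt - len(tp)     # ground-truth segments left unmatched
    denom = len(tp) + 0.5 * fp + 0.5 * fn
    return sum(tp) / denom if denom else 0.0
```

For example, two matches at IoU 0.8 and 0.6 out of three predictions and three ground-truth segments give PQ = 1.4 / 3, about 0.467.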
Unsupervised Segmentation in Real-World Images via Spelke Object Inference
Self-supervised, category-agnostic segmentation of real-world images is a
challenging open problem in computer vision. Here, we show how to learn static
grouping priors from motion self-supervision by building on the cognitive
science concept of a Spelke Object: a set of physical stuff that moves
together. We introduce the Excitatory-Inhibitory Segment Extraction Network
(EISEN), which learns to extract pairwise affinity graphs for static scenes
from motion-based training signals. EISEN then produces segments from
affinities using a novel graph propagation and competition network. During
training, objects that undergo correlated motion (such as robot arms and the
objects they move) are decoupled by a bootstrapping process: EISEN explains
away the motion of objects it has already learned to segment. We show that EISEN achieves a substantial improvement in the state of the art for self-supervised image segmentation on challenging synthetic and real-world robotics datasets.
Comment: 25 pages, 10 figures
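EISEN's propagation and competition stage is a learned network, but the underlying idea, turning a pairwise affinity graph into segments, can be illustrated with a toy stand-in that thresholds affinities and groups mutually connected pixels (the threshold and the connected-components grouping are illustrative simplifications, not the paper's method):

```python
import numpy as np

def segments_from_affinity(A, thresh=0.5):
    """Toy stand-in for affinity-based segment extraction: threshold
    the pairwise affinity matrix A (n x n) and flood-fill connected
    components, so elements with mutually high affinity end up in the
    same segment. Returns an integer label per element."""
    n = A.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] >= 0:
            continue
        stack = [seed]
        labels[seed] = current
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] < 0 and A[i, j] > thresh:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```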
An Evaluation of Deep Learning-Based Object Identification
Identification of instances of semantic objects of a particular class, which has been heavily incorporated into people's lives through applications like autonomous driving and security monitoring, is one of the most crucial and challenging areas of computer vision. Recent developments in deep learning networks for detection have improved object detector accuracy. To provide a detailed review of the current state of object detection pipelines, we begin by analyzing the methodologies employed by classical detection models and presenting the benchmark datasets used in this study. We then examine one- and two-stage detectors in detail before concluding with a summary of several object detection approaches. In addition, we survey both established and emerging applications rather than examining only a single branch of object detection. Finally, we look at how to utilize various object detection algorithms to build a system that is both efficient and effective, and we identify a number of emerging trends to guide further study with the most recent algorithms.
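Both one- and two-stage detectors end with the same post-processing step such pipelines rely on: greedy non-maximum suppression over scored boxes. A minimal reference implementation:

```python
def iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop boxes overlapping it above `thresh`, repeat. Returns kept
    indices in score order."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep
```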
INQUIRIES IN INTELLIGENT INFORMATION SYSTEMS: NEW TRAJECTORIES AND PARADIGMS
Rapid digital transformation drives organizations to continually revitalize their business models so they can excel in aggressive global competition. Intelligent Information Systems (IIS) have enabled organizations to achieve many strategic and market leverages. Despite the increasing intelligence competencies offered by IIS, they are still limited in many cognitive functions, and elevating those cognitive competencies would impact organizations' strategic positions.
With the advent of Deep Learning (DL), IoT, and Edge Computing, IIS have witnessed a leap in their intelligence competencies. DL has been applied to many business areas and industries, such as real estate and manufacturing. Moreover, despite the complexity of DL models, much research has been dedicated to applying DL on devices with limited computational resources, such as IoT devices. Applying deep learning to IoT will turn everyday devices into intelligent interactive assistants.
IIS suffer from many challenges that affect their service quality, process quality, and information quality. These challenges affect, in turn, user acceptance in terms of satisfaction, use, and trust. Moreover, Information Systems (IS) research has devoted very little attention to IIS development and to the foreseeable contribution of new paradigms to addressing IIS challenges. Therefore, this research investigates how the employment of new AI paradigms would enhance the overall quality, and consequently the user acceptance, of IIS.
This research employs different AI paradigms to develop two different IIS. The first system uses deep learning, edge computing, and IoT to develop scene-aware ridesharing monitoring; it enhances the efficiency, privacy, and responsiveness of current ridesharing monitoring solutions. The second system aims to enhance the real-estate search process by formulating the search problem as a multi-criteria decision problem. The system also allows users to filter properties based on their degree of damage, where a deep learning network localizes damage in each real-estate image. The system enhances real-estate website service quality by improving flexibility, relevancy, and efficiency.
The research contributes to Information Systems research by developing two Design Science artifacts. Both artifacts add to the IS knowledge base by integrating different components, measurements, and techniques coherently and logically to address important issues in IIS effectively. The research also adds to the IS environment by addressing important business requirements that current methodologies and paradigms do not fulfill. Finally, the research highlights that most IIS overlook important design guidelines due to the lack of relevant evaluation metrics for different business problems.
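The second artifact's multi-criteria formulation is not spelled out in the abstract; a common baseline is a weighted-sum model over normalized criterion scores, sketched below with hypothetical criterion names (`price`, `damage`):

```python
def rank_properties(properties, weights):
    """Hypothetical weighted-sum scoring for a multi-criteria property
    search: each property is a dict of normalized criterion scores in
    [0, 1]; a higher weighted total ranks earlier."""
    def score(p):
        return sum(weights[c] * p[c] for c in weights)
    return sorted(properties, key=score, reverse=True)
```

In the described system, the damage score per listing would come from the deep learning network that localizes damage in each image, while the weights would encode a user's priorities.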
Deep learning & remote sensing: pushing the frontiers in image segmentation
Master's dissertation (Informatics) — Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, Brasília, 2022.
Image segmentation aims to simplify the understanding of digital images. Deep learning-based
methods using convolutional neural networks have been game-changing, allowing the exploration
of different tasks (e.g., semantic, instance, and panoptic segmentation). Semantic segmentation
assigns a class to every pixel in an image, instance segmentation classifies objects at a pixel
level with a unique identifier for each target, and panoptic segmentation combines instance-level predictions with different backgrounds. Remote sensing data largely benefits from these methods, being very suitable for developing new DL algorithms and creating solutions using top-view images. However, some peculiarities prevent remote sensing with orbital and aerial imagery from growing as fast as work on traditional ground-level images (e.g., camera photos): (1) the images are extensive, (2) they present different characteristics (e.g., number of channels and image format), (3) they require a high number of pre-processing and post-processing steps (e.g., extracting patches and classifying large scenes), and (4) most open software for labeling and deep learning applications is not friendly to remote sensing for the aforementioned reasons. This
dissertation aimed to improve all three main categories of image segmentation. Within the instance segmentation domain, we proposed three experiments. First, we enhanced the box-based
instance segmentation approach for classifying large scenes, allowing practical pipelines to be
implemented. Second, we created a bounding-box free method to reach instance segmentation
results by using semantic segmentation models in a scenario with sparse objects. Third, we
improved the previous method for crowded scenes and developed the first study considering
semi-supervised learning using remote sensing and GIS data. Subsequently, in the panoptic
segmentation domain, we presented the first remote sensing panoptic segmentation dataset, containing fourteen classes, and provided software and a methodology for converting GIS data into the panoptic segmentation format. Since our first study considered RGB images, we extended
our approach to multispectral data. Finally, we leveraged the box-free method initially designed
for instance segmentation to the panoptic segmentation task. This dissertation analyzed various
segmentation methods and image types, and the developed solutions enable the exploration of
new tasks (such as panoptic segmentation), the simplification of labeling data (using the proposed semi-supervised learning procedure), and a simplified way to obtain instance and panoptic
predictions using simple semantic segmentation models.
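The "extracting patches" step for large scenes can be sketched as a sliding window; the 512-pixel patch size matches the dissertation's samples, while the stride below is an assumed parameter:

```python
import numpy as np

def extract_patches(image, size=512, stride=256):
    """Slide a window over a large scene and collect fixed-size patches
    together with their top-left offsets, so per-patch predictions can
    later be stitched back into a full-scene map."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patches.append(((y, x), image[y:y + size, x:x + size]))
    return patches
```

An overlapping stride (smaller than the patch size) trades extra computation for smoother predictions at patch borders.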
Artificial intelligence and center pivot irrigation systems: development of strategies and techniques to improve automatic mapping
Doctoral thesis — Universidade de Brasília, Instituto de Ciências Humanas, Departamento de Geografia, Programa de Pós-Graduação em Geografia, 2022.
Irrigation is primarily responsible for increasing crop productivity. Center pivot irrigation
systems (CPIS) are leaders in mechanized irrigation in Brazil, with significant growth in recent
decades and a projected increase of more than 134% in area by 2040. The most used method
for identifying CPIS is based on visual interpretation and manual mapping of circular
features, making the task time-consuming and laborious. In this context, methods based on
Deep Learning (DL) have great potential for the classification of remote sensing images using Convolutional Neural Networks (CNNs). The use of DL has caused a revolution in
image classification, surpassing traditional methods and achieving greater precision and
efficiency, allowing regional and continuous monitoring with low cost and agility. This research
aimed to apply DL techniques using CNN-based algorithms to identify CPIS in remote
sensing images. The present work was divided into three main chapters: (a) identification of
CPIS in Landsat-8/OLI images, using semantic segmentation with three CNN algorithms (U-Net, Deep ResUnet, and SharpMask); (b) CPIS detection using Sentinel-1/SAR multitemporal
image instance segmentation (two polarizations, VV and VH) using the Mask-RCNN
algorithm, with the ResNeXt-101-32x8d backbone; and (c) CPIS detection using Sentinel-2/MSI multitemporal images with different percentages of cloud cover and instance segmentation using Mask-RCNN, with the ResNeXt-101 backbone. The methodological steps differed between the chapters, and all chapters presented high metric values and great CPIS detection capacity.
The classifications using Landsat-8/OLI images and the U-Net, Deep ResUnet, and SharpMask algorithms had Kappa coefficients of 0.96, 0.95, and 0.92, respectively. Classifications using
Sentinel-1/SAR images showed better metrics in the combination of the two VV+VH
polarizations (75%AP, 91%AP50 and 86%AP75). The classification of Sentinel-2/MSI images
with clouds presented metrics in the set of 6 images without clouds (80%AP and 93%AP50)
very close to the values of the set of images with extreme cloud scenario (74%AP and
88%AP50), demonstrating that the use of multitemporal images increases the predictive power
in learning. A significant contribution of the research was the proposition of reconstruction of
images of large areas, using the sliding window algorithm, allowing several overlaps of
classified images and a better estimation of pivot per pixel. The present study made it possible
to establish an adequate methodology for automatic center pivot detection using three different
types of remote sensing images, which are freely available, in addition to a database with CPIS
vectors in Central Brazil
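The sliding-window reconstruction with overlapping classified images can be sketched as accumulation plus per-pixel averaging; the patch size and the probability-map inputs below are illustrative, not the thesis's exact parameters:

```python
import numpy as np

def stitch_predictions(shape, patch_preds, size=512):
    """Average overlapping per-patch probability maps back into a
    full-scene map: accumulate predictions and a visit count per pixel,
    then divide, so each pixel's pivot probability reflects every
    window that covered it. `patch_preds` is a list of
    ((y, x), prediction) pairs as produced by a sliding window."""
    acc = np.zeros(shape, dtype=float)
    count = np.zeros(shape, dtype=float)
    for (y, x), pred in patch_preds:
        acc[y:y + size, x:x + size] += pred
        count[y:y + size, x:x + size] += 1
    # Pixels never covered by any window stay at 0.
    return np.divide(acc, count, out=np.zeros_like(acc), where=count > 0)
```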
Online Audio-Visual Multi-Source Tracking and Separation: A Labeled Random Finite Set Approach
The dissertation proposes an online solution for separating an unknown and time-varying number of moving sources using audio and visual data. The random finite set framework is used for the modeling and fusion of the audio and visual data, enabling an online tracking algorithm to estimate the source positions and identities at each time point. With this information, a set of beamformers can be designed to separate each desired source and suppress the interfering sources.
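The beamformer design is not detailed in this abstract; the simplest illustration of the separation step is a delay-and-sum beamformer that aligns each microphone channel to the tracked source's delay (integer-sample delays assumed here for simplicity) and averages, reinforcing the desired source relative to interferers:

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each microphone signal with its integer-sample delay to the
    tracked source and average the aligned channels. Coherent summation
    reinforces the desired source; incoherent interferers average down."""
    n = min(len(s) - d for s, d in zip(signals, delays))
    return np.mean([s[d:d + n] for s, d in zip(signals, delays)], axis=0)
```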
- …