Shadow removal utilizing multiplicative fusion of texture and colour features for surveillance image
Automated surveillance systems often identify shadows as parts of a moving object, which jeopardizes subsequent image processing tasks such as object identification and tracking. In this thesis, an improved shadow elimination method for an indoor surveillance system is presented. The developed method is a fusion of several image processing techniques. Firstly, the image is segmented using the Statistical Region Merging algorithm to obtain the segmented potential shadow regions. Next, multiple shadow identification features, which include Normalized Cross-Correlation, Local Color Constancy and Hue-Saturation-Value shadow cues, are applied to the images to generate feature maps. These feature maps are used for identifying and removing cast shadows according to the segmented regions. The video dataset used is the Autonomous Agents for On-Scene Networked Incident Management dataset, which covers both indoor and outdoor video scenes. The benchmarking results indicate that the developed method is on par with several commonly used shadow detection methods. The developed method yields a mean score of 85.17% for the video sequence in which the strongest shadow is present and a mean score of 89.93% for the video having the most complex textured background. This research contributes to the development and improvement of a functioning shadow eliminator that is able to cope with image noise and various illumination changes.
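The multiplicative fusion step can be illustrated with a small sketch. This is a toy illustration, not the thesis' implementation: the pixel values, threshold and per-cue confidences below are hypothetical, and real feature maps would be computed per segmented region.

```python
# Illustrative multiplicative fusion of shadow feature maps.
# Each map holds a per-pixel confidence in [0, 1] that the pixel
# belongs to a cast shadow under one cue (e.g. NCC, LCC, HSV).

def fuse_maps(maps):
    """Multiply per-pixel feature maps into a single fused shadow map."""
    fused = [1.0] * len(maps[0])
    for m in maps:
        fused = [f * v for f, v in zip(fused, m)]
    return fused

def classify_shadow(fused, threshold=0.5):
    """Label a pixel as shadow when the fused confidence exceeds the threshold."""
    return [v > threshold for v in fused]

# Three toy feature maps for four pixels (hypothetical values).
ncc = [0.9, 0.2, 0.8, 0.95]
lcc = [0.85, 0.1, 0.9, 0.9]
hsv = [0.8, 0.3, 0.7, 0.9]

fused = fuse_maps([ncc, lcc, hsv])
print(classify_shadow(fused))  # → [True, False, True, True]
```

Multiplicative fusion is conservative by construction: a pixel is kept as shadow only when all cues assign it reasonably high confidence, so a single dissenting feature map suppresses a false detection.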
Improved 3D MR Image Acquisition and Processing in Congenital Heart Disease
Congenital heart disease (CHD) is the most common type of birth defect, affecting about 1% of the population. MRI is an essential tool in the assessment of CHD, including diagnosis, intervention planning and follow-up. Three-dimensional MRI can provide particularly rich visualization and information. However, it is often complicated by long scan times, cardiorespiratory motion, injection of contrast agents, and complex and time-consuming postprocessing. This thesis comprises four pieces of work that attempt to respond to some of these challenges.
The first piece of work aims to enable fast acquisition of 3D time-resolved cardiac imaging during free breathing. Rapid imaging was achieved using an efficient spiral sequence and a sparse parallel imaging reconstruction. The feasibility of this approach was demonstrated on a population of 10 patients with CHD, and areas of improvement were identified.
The second piece of work is an integrated software tool designed to simplify and accelerate the development of machine learning (ML) applications in MRI research. It also exploits the strengths of recently developed ML libraries for efficient MR image reconstruction and processing.
The third piece of work aims to reduce contrast dose in contrast-enhanced MR angiography (MRA). This would reduce risks and costs associated with contrast agents. A deep learning-based contrast enhancement technique was developed and shown to improve image quality in real low-dose MRA in a population of 40 children and adults with CHD.
The fourth and final piece of work aims to simplify the creation of computational models for hemodynamic assessment of the great arteries. A deep learning technique for 3D segmentation of the aorta and the pulmonary arteries was developed and shown to enable accurate calculation of clinically relevant biomarkers in a population of 10 patients with CHD.
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD thesis is to establish this sensor-model coupling.
Simulated cognitive topologies: automatically generating highly contextual maps for complex journeys
As people traverse complex journeys, they engage in a number of information
interactions across spatial scales and levels of abstraction. Journey complexity
is characterised by factors including the number of actions required, and by
variation in the contextual basis of reasoning such as a transition between
different modes of transport. The high-level task of an A to B journey
decomposes into a sequence of lower-level navigational sub-tasks, with the
representation of geographic entities that support navigation during, between
and across sub-tasks, varying relative to the nature of the task and the
character of the geography. For example, transitioning from or to a particular
mode of transport has a direct bearing on the natural level of representational
abstraction that supports the task, as well as on the overall extent of the task’s
region of influence on the traveller's focus. Modern mobile technologies send
data to a device that can, in theory, be context-specific in terms of explicitly
reflecting a traveller's heterogeneous information requirements; however, the
extent to which context is explicitly reflected in the selection and display of
navigational information remains limited in practice, with a rigid,
predetermined scale-based hierarchy of cartographic views remaining the
underlying representational paradigm.
The core subject of the research is the context-dependent selection and display
of navigational information, and while there are many and varied
considerations in developing techniques to address selection and display, the
central challenge can simply be articulated as how to determine the
probability, given the traveller’s current context, that a feature should be in
the current map view. Clearly this central challenge extends to all features in
the spatial extent, and so from a practical perspective, research questions
centre around the initial selection of a subset of features, and around
determining an overall probability distribution over the subset given the
significance of features within the hierarchically ordered sequence of tasks.
In this thesis, research is presented around the use of graph structures as a
practical basis for modeling urban geography to support heterogeneous
selections across viewing scales, and ultimately for displaying highly
context-specific cartographic views. Through an iterative, empirical research
methodology, a formalised approach based on routing networks is presented,
which serves as the basis for modeling, selection and display.
Findings are presented from a series of 7 situated navigation studies that
included research with an existing navigation application as well as
experimental research stimuli. Hypotheses were validated and refined over the
course of the studies, with a focus on journey-specific regions that form around
the navigable network. Empirical data includes sketch maps, textual
descriptions, video and device interactions over the course of complex
navigation exercises. Study findings support the proposed graph architecture,
including subgraph classes that approximate cognitive structures central to
natural comprehension and reasoning. Empirical findings lead to the central
argument of a model based on causal mechanisms, in which relations are
formalised between task, selection and abstraction.
A causal framework for automatically determining map content for a given
journey context is presented, with the approach involving a conceptual shift
from treating geographic features as spatially indexed records, to treating them
as variables with a finite number of possible states. Causal nets serve as the
practical basis of reasoning, with geographic features being represented by
variables in these causal structures. The central challenge of finding the
probability that a variable in a causal net is in a particular state is addressed
through a causal model in which journey context serves as the evidence that
propagates over the net. In this way, complex heterogeneous selections for
interactive multi-scale information spaces are expressed as probability
distributions determined through message propagation.
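The central computation described above, finding the probability that a feature variable is in a particular state given journey context as evidence, can be sketched as a minimal single-variable Bayesian update. The feature, states, priors and likelihoods below are hypothetical; a full causal net would propagate such messages across many linked variables rather than updating one in isolation.

```python
# Minimal sketch: one geographic feature as a variable with discrete
# states, updated by journey-context evidence via Bayes' rule.

def posterior(prior, likelihoods, evidence):
    """P(state | evidence) for one feature variable.

    prior:       {state: P(state)}
    likelihoods: {state: {evidence_value: P(evidence_value | state)}}
    """
    unnorm = {s: prior[s] * likelihoods[s][evidence] for s in prior}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

# Hypothetical feature "metro entrance" with two display states; the
# evidence is the traveller's current mode of transport.
prior = {"shown": 0.3, "hidden": 0.7}
likelihoods = {
    "shown":  {"walking": 0.2, "transit": 0.9},
    "hidden": {"walking": 0.8, "transit": 0.1},
}

p = posterior(prior, likelihoods, "transit")
print(round(p["shown"], 3))  # → 0.794
```

The transit context raises the display probability of the transit-related feature, which is the intuition behind expressing map content selection as a probability distribution conditioned on journey context.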
The thesis concludes with a discussion around the implications of the approach
for the presentation of navigational information, and it is shown how the
framework can support context-specific selection and disambiguation of map
content, demonstrated through the central use case of navigating complex
urban journeys.
Abstraction, Imagery, and Control in Cognitive Architecture.
This dissertation presents a theory describing the components of a cognitive architecture supporting intelligent behavior in spatial tasks. In this theory, an abstract symbolic representation serves as the basis for decisions. As a means to support abstract decision-making, imagery processes are also present. Here, a concrete (highly detailed) representation of the state of the problem is maintained in parallel with the abstract representation. Perceptual and action systems are decomposed into parts that operate between the environment and the concrete representation, and parts that operate between the concrete and abstract representations. Control processes can issue actions as a continuous function of information in the concrete representation, and actions can be simulated (imagined) in terms of it. The agent can then derive useful abstract information by applying perceptual processes to the resulting concrete state.
This theory addresses two challenges in architecture design that arise due to the diversity and complexity of spatial tasks that an intelligent agent must address. The perceptual abstraction problem results from the difficulty of creating a single perception system able to induce appropriate abstract representations in each of the many tasks an agent might encounter, and the irreducibility problem arises because some tasks are resistant to being abstracted at all. Imagery works to mitigate the perceptual abstraction problem by allowing a given perception system to work in more tasks, as perception can be dynamically combined with imagery. Continuous control, and the simulation thereof via imagery, works to mitigate the irreducibility problem. The use of imagery to address these challenges differs from other approaches in AI, where imagery is considered as an alternative to abstract representation, rather than as a means to it.
A detailed implementation of the theory is described, which is an extension of the Soar cognitive architecture. Agents instantiated in this architecture are demonstrated, including agents that use reinforcement learning and imagery to play arcade games, and an agent that performs sampling-based motion planning for a car-like vehicle. The performance of these agents is discussed in the context of the underlying architectural theory. Connections between this work and psychological theories of mental imagery are also discussed.
Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78795/1/swinterm_1.pd
Positioning systems based on light communication for indoor environments
The demand for highly precise indoor positioning systems (IPSs) is growing
rapidly due to its potential in the increasingly popular techniques of the
Internet of Things, smart mobile devices, and artificial intelligence. IPS
has become a promising research domain that is attracting wide attention due to its
benefits in several working scenarios, such as industry, indoor public
locations, and autonomous navigation. Moreover, IPS makes a prominent
contribution to day-to-day activities in organizations such as health care
centers, airports, shopping malls, manufacturing plants, underground locations, etc.,
enabling safe operating environments. In indoor environments, both radio frequency
for safe operating environments. In indoor environments, both radio frequency
(RF) and optical wireless communication (OWC) based technologies could be
adopted for localization. Although the RF-based global positioning system,
such as, Global positioning system offers higher penetration rates with
reduced accuracy (i.e., in the range of a few meters), it does not work well in
indoor environments (and not at all in certain cases such as tunnels, mines,
etc.) due to the very weak signal and no direct access to the satellites. On the
other hand, the light-based system known as a visible light positioning (VLP)
system, as part of the OWC systems, uses the pre-existing light-emitting
diodes (LEDs)-based lighting infrastructure, could be used at low cost and
high accuracy compared with the RF-based systems. VLP is an emerging
technology promising high accuracy, high security, low deployment cost,
shorter time response, and low relative complexity when compared with RFbased
positioning.
However, indoor VLP systems face some concerns, such as multipath
reflection, transmitter tilting, transmitter position and orientation
uncertainty, human shadowing/blocking, and noise, which increase the
positioning error and thereby reduce the positioning accuracy of the system.
It is therefore imperative to capture the characteristics of the different VLP
channels and model them properly for the dual purpose of illumination and
localization. In this thesis, firstly, the impact of transmitter tilting angles and
multipath reflections is studied, and for the first time it is demonstrated that
tilting the transmitter can be beneficial in VLP systems considering both line of
sight (LOS) and non-line of sight transmission paths. With the transmitters
oriented towards the center of the receiving plane, the received power level is
maximized due to the LOS components. It is also shown that the proposed
scheme offers a significant accuracy improvement of up to ~66% compared
with a typical non-tilted transmitter VLP. The effect of tilting the transmitter on
the lighting uniformity is also investigated, and the results show that the
achieved uniformity complies with the European Standard EN 12464-1.
After that, the impact of transmitter position and orientation uncertainty on
the accuracy of the VLP system based on the received signal strength (RSS)
is investigated. Simulation results show that the transmitter uncertainties have
a severe impact on the positioning error, which can be mitigated through the
use of more transmitters. For transmitter position uncertainties smaller than
5 cm, the mean positioning errors are 23.3, 15.1 and 13.2 cm for sets of 4, 9
and 16 transmitters, respectively, while for a transmitter orientation
uncertainty smaller than 5°, the mean positioning errors are 31.9, 20.6 and
17 cm, respectively. The design aspects of an indoor VLP system in which an
artificial neural network (ANN) is used for position estimation over a
multipath channel are then investigated, with the influence of noise
considered as the performance indicator. Three different ANN training
algorithms, namely the Levenberg-Marquardt, Bayesian regularization and
scaled conjugate gradient algorithms, are considered to minimize the
positioning error. The ANN design is optimized with respect to the number of
neurons in the hidden layers, the number of training epochs, and the size of
the training set. It is shown that the ANN with Bayesian regularization
outperforms the traditional RSS technique using non-linear least squares
estimation for all values of signal-to-noise ratio.
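As a point of reference for the RSS baseline, position estimation by non-linear least squares can be sketched as follows. This is a simplified sketch, not the thesis' model: it assumes a noiseless inverse-square channel and plain gradient descent, whereas the thesis uses a full Lambertian LOS model; the LED layout and transmit power are also illustrative assumptions.

```python
# Sketch of RSS-based positioning via non-linear least squares
# (simplified P = P0 / d^2 channel, pure-Python gradient descent).

def rss(tx, p, p0=1.0):
    """Modelled received power at 2D receiver position p from ceiling LED tx."""
    d2 = (tx[0] - p[0]) ** 2 + (tx[1] - p[1]) ** 2 + tx[2] ** 2
    return p0 / d2

def estimate(txs, measured, steps=800, lr=0.1):
    """Minimise the sum of squared range residuals over the receiver position."""
    x, y = 2.0, 2.0  # initial guess: room centre
    for _ in range(steps):
        gx = gy = 0.0
        for tx, m in zip(txs, measured):
            d_meas = (1.0 / m) ** 0.5  # invert P = P0 / d^2 with P0 = 1
            d = ((x - tx[0]) ** 2 + (y - tx[1]) ** 2 + tx[2] ** 2) ** 0.5
            r = d - d_meas             # range residual for this LED
            gx += 2 * r * (x - tx[0]) / d
            gy += 2 * r * (y - tx[1]) / d
        x -= lr * gx
        y -= lr * gy
    return x, y

# Four ceiling LEDs at 3 m height in a 4 m x 4 m room (hypothetical layout).
txs = [(0, 0, 3), (4, 0, 3), (0, 4, 3), (4, 4, 3)]
true_pos = (1.0, 2.5)
measured = [rss(tx, true_pos) for tx in txs]  # noiseless measurements

est = estimate(txs, measured)
print(round(est[0], 2), round(est[1], 2))  # → 1.0 2.5
```

With noiseless measurements this estimator recovers the true position exactly; the ANN-based approaches in the thesis are motivated by how such a least squares baseline degrades under noise, multipath reflections and transmitter uncertainty.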
Furthermore, a novel indoor VLP system is proposed based on support
vector machines and polynomial regression considering two different
multipath environments of an empty room and a furnished room. The results
show that, in an empty room, the positioning accuracy improvements at a
positioning error of 2.5 cm are 36.1, 58.3, and 72.2% for three different
scenarios according to the regions' distribution in the room. For the furnished
room, relative positioning accuracy improvements of 214, 170, and 100% are
observed for positioning errors of 0.1, 0.2, and 0.3 m, respectively. Ultimately,
an indoor VLP system based on convolutional neural networks (CNN) is
proposed and demonstrated experimentally in which LEDs are used as
transmitters and a rolling shutter camera is used as receiver. A detection
algorithm named single shot detector (SSD) is used which relies on CNN (i.e.,
MobileNet or ResNet) for classification as well as position estimation of each
LED in the image. The system is validated using a real-world size test setup
containing eight LED luminaries. The obtained results show that the maximum
average root mean square positioning error achieved is 4.67 and 5.27 cm with
SSD MobileNet and SSD ResNet models, respectively. The validation results
show that the system can process 67 images per second, allowing real-time
positioning.
The smart point cloud
Discrete spatial datasets known as point clouds often lay the groundwork for decision-making applications. For example, we can use such data as a reference for autonomous cars and robot navigation, as a layer for floor-plan creation and building construction, as a digital asset for environment modelling and incident prediction... Applications are numerous, and potentially increasing if we consider point clouds as digital reality assets. Yet this expansion faces technical limitations, mainly from the lack of semantic information within point ensembles. Connecting knowledge sources is still a very manual and time-consuming process suffering from error-prone human interpretation. This highlights a strong need for domain-related data analysis to create coherent and structured information. The thesis addresses automation problems in point cloud processing to create intelligent environments, i.e. virtual copies that can be used and integrated in fully autonomous reasoning services. We tackle point cloud questions associated with knowledge extraction - particularly segmentation and classification - as well as structuration, visualisation and interaction with cognitive decision systems. We propose to connect both point cloud properties and formalized knowledge to rapidly extract pertinent information using domain-centered graphs. The dissertation delivers the concept of a Smart Point Cloud (SPC) Infrastructure, which serves as an interoperable and modular architecture for unified processing. It permits easy integration into existing workflows and multi-domain specialization through device knowledge, analytic knowledge or domain knowledge. Concepts, algorithms, code and materials are given to replicate findings and extend current applications.
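The idea of linking point-cloud segments to formalized domain knowledge through a graph can be sketched as a toy example. The class, segment attributes and classification rules below are hypothetical illustrations, not the SPC infrastructure's actual schema.

```python
# Toy sketch: segments from a point cloud as graph nodes, domain
# knowledge as predicates that attach semantic labels to them.

class SegmentGraph:
    def __init__(self):
        self.nodes = {}  # segment id -> attribute dict
        self.edges = []  # (id_a, id_b, relation)

    def add_segment(self, sid, **attrs):
        self.nodes[sid] = attrs

    def relate(self, a, b, relation):
        self.edges.append((a, b, relation))

    def classify(self, rules):
        """Attach a semantic label when a rule over attributes matches."""
        for attrs in self.nodes.values():
            for label, rule in rules.items():
                if rule(attrs):
                    attrs["label"] = label

g = SegmentGraph()
g.add_segment("s1", normal="vertical", height=2.6)
g.add_segment("s2", normal="horizontal", height=0.0)
g.relate("s1", "s2", "adjacent")

# Domain knowledge expressed as simple predicates over segment attributes.
rules = {
    "wall":  lambda a: a["normal"] == "vertical" and a["height"] > 2.0,
    "floor": lambda a: a["normal"] == "horizontal" and a["height"] < 0.2,
}
g.classify(rules)
print(g.nodes["s1"]["label"], g.nodes["s2"]["label"])  # → wall floor
```

Swapping in a different rule set (device knowledge, analytic knowledge or domain knowledge) specializes the same graph structure to another application domain, which is the modularity the SPC concept aims at.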