Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.
Comment: Computer Vision and Pattern Recognition (CVPR) 201
Improving Deep Learning-based Defect Detection on Window Frames with Image Processing Strategies
Detecting subtle defects in window frames, including dents and scratches, is
vital for upholding product integrity and sustaining a positive brand
perception. Conventional machine vision systems often struggle to identify
these defects in challenging environments like construction sites. In contrast,
modern vision systems leveraging machine and deep learning (DL) are emerging as
potent tools, particularly for cosmetic inspections. However, the promise of DL
is yet to be fully realized. Few manufacturers have established a clear
strategy for AI integration in quality inspection, hindered mainly by issues
like scarce clean datasets and environmental changes that compromise model
accuracy. Addressing these challenges, our study presents an innovative
approach that amplifies defect detection in DL models, even with constrained
data resources. The paper proposes a new defect detection pipeline called
InspectNet (IPT-enhanced UNET) that includes the best combination of image
enhancement and augmentation techniques for pre-processing the dataset and a
Unet model tuned for window frame defect detection and segmentation.
Experiments were carried out using a Spot robot performing window frame
inspections. Sixteen variations of the dataset were constructed using
different image augmentation settings. The results revealed that, on average
across all proposed evaluation measures, Unet outperformed all other
algorithms when IPT-enhanced augmentations were applied. In particular, on
the best dataset, the IPT-enhanced Unet achieved the highest average
Intersection over Union (IoU), reaching a mean IoU (mIoU) of 0.91.
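As a sketch of the headline metric: IoU divides the overlap between a predicted segmentation mask and the ground truth by their union, and mIoU averages this over the evaluation set. A minimal NumPy version (illustrative only, not InspectNet's evaluation code):

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union for two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return np.logical_and(pred, target).sum() / union

def mean_iou(preds, targets):
    """Average IoU over paired (prediction, ground-truth) masks."""
    return float(np.mean([iou(p, t) for p, t in zip(preds, targets)]))

# Toy example: two 2x2 blobs on a 4x4 grid, overlapping in one pixel
a = np.zeros((4, 4)); a[0:2, 0:2] = 1
b = np.zeros((4, 4)); b[1:3, 1:3] = 1
print(iou(a, b))   # 1 pixel of overlap / 7 pixels of union
```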
Design Of Computer Vision Systems For Optimizing The Threat Detection Accuracy
This dissertation considers computer vision (CV) systems in which a central monitoring station receives and analyzes the video streams captured and delivered wirelessly by multiple cameras. It addresses how the bandwidth can be allocated to various cameras by presenting a cross-layer solution that optimizes the overall detection or recognition accuracy. The dissertation presents and develops a real CV system and subsequently provides a detailed experimental analysis of cross-layer optimization. Other unique features of the developed solution include employing the popular HTTP streaming approach, utilizing homogeneous cameras as well as heterogeneous ones with varying capabilities and limitations, and including a new algorithm for estimating the effective medium airtime. The results show that the proposed solution significantly improves the CV accuracy.
Additionally, the dissertation features an improved neural network system for object detection. The proposed system considers inherent video characteristics and employs different motion detection and clustering algorithms to focus on the areas of importance in consecutive frames, allowing the system to dynamically and efficiently distribute the detection task among multiple deployments of object detection neural networks. Our experimental results indicate that our proposed method can enhance the mAP (mean average precision), execution time, and required data transmissions to object detection networks.
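A minimal sketch of the motion-detection step described above, assuming grayscale frames and simple frame differencing; the dissertation's actual detection and clustering algorithms are not specified here, so this is illustrative only:

```python
import numpy as np

def motion_roi(prev, curr, thresh=25):
    """Frame-differencing sketch: return the bounding box
    (top, left, bottom, right) of pixels that changed by more than
    `thresh` between two grayscale frames, or None if nothing moved.
    Such a region of interest could then be routed to an object
    detection network instead of the full frame."""
    diff = np.abs(curr.astype(int) - prev.astype(int)) > thresh
    if not diff.any():
        return None
    rows = np.where(diff.any(axis=1))[0]
    cols = np.where(diff.any(axis=0))[0]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

prev = np.zeros((64, 64), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 30:40] = 200          # a synthetic 'moving object'
print(motion_roi(prev, curr))     # (10, 30, 20, 40)
```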
Finally, as recognizing an activity provides significant automation prospects in CV systems, the dissertation presents an efficient activity-detection recurrent neural network that utilizes fast pose/limbs estimation approaches. By combining object detection with pose estimation, the domain of activity detection is shifted from a volume of RGB (Red, Green, and Blue) pixel values to a time series of relatively small one-dimensional arrays, thereby allowing the activity detection system to take advantage of highly capable neural networks that have been trained on large GPU clusters for thousands of hours. Consequently, capable activity detection systems can be built with considerably fewer training samples and processing hours.
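The shift from RGB volumes to pose time series can be sketched as follows. The joint count (17 COCO-style keypoints) and the per-frame normalization are illustrative assumptions, not the dissertation's exact pipeline:

```python
import numpy as np

def poses_to_series(keypoints):
    """Flatten per-frame pose keypoints, shape (T, 17, 2), into the
    small 1-D arrays the abstract describes: a clip of T frames
    becomes a (T, 34) time series suitable for a recurrent network."""
    kp = np.asarray(keypoints, dtype=np.float32)
    # Normalize each frame to its own bounding box so the series is
    # invariant to where the person stands in the image.
    mins = kp.min(axis=1, keepdims=True)
    span = kp.max(axis=1, keepdims=True) - mins + 1e-6
    return ((kp - mins) / span).reshape(kp.shape[0], -1)

clip = np.random.rand(30, 17, 2)   # 30 frames of synthetic poses
series = poses_to_series(clip)
print(series.shape)                # (30, 34)
```

Each 34-value row replaces an entire RGB frame, which is why the downstream recurrent classifier can be small and cheap to train.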
QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios
The concerns about individuals' security have justified the increasing number of surveillance
cameras deployed both in private and public spaces. However, contrary to popular belief,
these devices are in most cases used solely for recording, instead of feeding intelligent analysis
processes capable of extracting information about the observed individuals. Thus, even though
video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant
details about the subjects that took part in a crime depends on the manual inspection
of recordings. As such, the current goal of the research community is the development of
automated surveillance systems capable of monitoring and identifying subjects in surveillance
scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric
recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at
designing a visual surveillance system capable of acquiring biometric data at a distance (e.g.,
face, iris or gait) without requiring human intervention in the process, as well as devising biometric
recognition methods robust to the degradation factors resulting from the unconstrained
acquisition process.
Regarding the first goal, the analysis of the data acquired by typical surveillance systems
shows that large acquisition distances significantly decrease the resolution of biometric samples,
and thus their discriminability is not sufficient for recognition purposes. In the literature,
diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring
high-resolution imagery at a distance, particularly when using a master-slave configuration. In
the master-slave configuration, the video acquired by a typical surveillance camera is analyzed
for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged
at high-resolution by the PTZ camera. Several methods have already shown that this configuration
can be used for acquiring biometric data at a distance. Nevertheless, these methods
failed at providing effective solutions to the typical challenges of this strategy, restraining its
use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development
of a biometric data acquisition system based on the cooperation of a PTZ camera
with a typical surveillance camera. The first proposal is a camera calibration method capable
of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ
camera. The second proposal is a camera scheduling method for determining - in real-time -
the sequence of acquisitions that maximizes the number of different targets obtained, while
minimizing the cumulative transition time. In order to achieve the first goal of this thesis,
both methods were combined with state-of-the-art approaches of the human monitoring field
to develop a fully automated surveillance system capable of acquiring biometric data at a distance and
without human cooperation, designated as QUIS-CAMPI system.
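The scheduling problem described above is a budgeted routing task: visit as many distinct targets as possible while keeping cumulative transition time low. As a hedged illustration (a simple greedy heuristic in pan/tilt angle space, not the thesis's real-time method), one might write:

```python
import math

def schedule_targets(start, targets, deadline):
    """Greedy PTZ scheduling sketch: repeatedly visit the nearest
    unvisited target in pan/tilt space until the time budget runs
    out.  Transition time is modeled as proportional to angular
    distance (speed = 1 unit/second) -- an illustrative assumption."""
    pos, elapsed, order = start, 0.0, []
    remaining = list(targets)
    while remaining:
        # Pick the target with the shortest transition from the
        # current pose.
        nxt = min(remaining, key=lambda p: math.dist(pos, p))
        cost = math.dist(pos, nxt)
        if elapsed + cost > deadline:
            break
        elapsed += cost
        pos = nxt
        order.append(nxt)
        remaining.remove(nxt)
    return order, elapsed

order, elapsed = schedule_targets((0, 0), [(10, 0), (2, 0), (5, 5)],
                                  deadline=20)
print(order)   # nearest-first visiting order
```

A real scheduler must also account for targets leaving the scene and for acquisition time at each target, which is what makes the thesis's real-time formulation harder than this greedy sketch.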
The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis
of the performance of the state-of-the-art biometric recognition approaches shows that these
approaches attain almost ideal recognition rates in unconstrained data. However, this performance
is incongruous with the recognition rates observed in surveillance scenarios. Taking into
account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising
biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a
distance ranging from 5 to 40 meters and without human intervention in the acquisition process.
This set allows an objective assessment of the performance of state-of-the-art biometric recognition
methods in data that truly encompass the covariates of surveillance scenarios. As such, this set
was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained
by the nine methods specially designed for this competition. In addition, the data acquired by
the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the
development of methods robust to the covariates of surveillance scenarios. The first proposal
regards a method for detecting corrupted features in biometric signatures inferred by a redundancy
analysis algorithm. The second proposal is a caricature-based face recognition approach
capable of enhancing the recognition performance by automatically generating a caricature
from a 2D photo. The experimental evaluation of these methods shows that both approaches
contribute to improving the recognition performance in unconstrained data.
Fast whole-brain imaging of seizures in zebrafish larvae by two-photon light-sheet microscopy
Light-sheet fluorescence microscopy (LSFM) enables real-time whole-brain functional imaging in zebrafish larvae. Conventional one-photon LSFM can, however, induce undesirable visual stimulation due to the use of visible excitation light. The use of two-photon (2P) excitation, employing near-infrared invisible light, allows unbiased investigation of neuronal circuit dynamics. However, owing to the low efficiency of the 2P absorption process, the imaging speed of this technique is typically limited by the signal-to-noise ratio. Here, we describe a 2P LSFM setup designed for non-invasive imaging that quintuples the state-of-the-art volumetric acquisition rate for the larval zebrafish brain (5 Hz) while keeping the laser intensity on the specimen low. We applied our system to the study of pharmacologically induced acute seizures, characterizing the spatio-temporal dynamics of pathological activity and describing for the first time the appearance of caudo-rostral ictal waves (CRIWs).
Comment: Replacement: accepted version of the manuscript, to be published in Biomedical Optics Express. 36 pages, 15 figures
Robust computational intelligence techniques for visual information processing
This Ph.D. thesis is about image processing by computational intelligence techniques. Firstly, a general overview of this book is given, describing the motivation, the hypothesis, the objectives, and the methodology employed; the use and analysis of different mathematical norms is our goal. After that, the state of the art focused on the applications of the image processing proposals is presented. In addition, the fundamentals of the image modalities, with particular attention to magnetic resonance, and the learning techniques used in this research, mainly based on neural networks, are summarized. Lastly, the mathematical framework on which this work is based, p-norms, is defined.
Three parts associated with image processing techniques follow. The first non-introductory part of this book collects the developments concerning image segmentation. Two of them are applications for video surveillance tasks and aim to model the background of a scene using a specific camera. The other work is centered on the medical field, where the goal of segmenting diabetic wounds in a very heterogeneous dataset is addressed.
The second part is focused on the optimization and implementation of new models for curve and surface fitting in two and three dimensions, respectively. The first work presents a parabola fitting algorithm based on the measurement of the distances of the interior and exterior points to the focus and the directrix. The second work changes to an ellipse shape and ensembles the information of multiple fitting methods. Last, the ellipsoid problem is addressed in a similar way to the parabola.
The third part is exclusively dedicated to the super-resolution of magnetic resonance images. In one of these works, an algorithm based on the random shifting technique is developed. Besides, we studied noise removal and resolution enhancement simultaneously. To end, the cost function of deep networks is modified with different combinations of norms in order to improve their training.
Finally, the general conclusions of the research are presented and discussed, as well as possible future research lines that can make use of the results obtained in this Ph.D. thesis.
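The focus/directrix idea behind the parabola fitting can be made concrete: every point on a parabola is equidistant from the focus and the directrix, so the squared difference of those two distances is a natural residual to minimize. A minimal sketch (illustrative only, assuming a horizontal directrix; not the thesis's algorithm):

```python
import math

def parabola_residual(points, focus, directrix_y):
    """Sum of squared differences between each point's distance to
    the focus and its distance to the horizontal line
    y = directrix_y.  On an exact parabola both distances are equal,
    so the residual is zero; a fit would search focus/directrix
    parameters that minimize this quantity."""
    return sum(
        (math.dist(p, focus) - abs(p[1] - directrix_y)) ** 2
        for p in points
    )

# Points on the parabola y = x^2 / 4 (focus (0, 1), directrix y = -1)
pts = [(x, x * x / 4.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(parabola_residual(pts, (0.0, 1.0), -1.0))   # ~0.0
```

A grid or gradient search over the focus coordinates and directrix position would then turn this residual into a fitting procedure; handling interior versus exterior points separately, as the thesis does, refines this basic idea.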
Development and application of inhibitory luminopsins for the treatment of epilepsy
Optogenetics has shown great promise as a direct neuromodulatory tool for halting seizure activity in various animal models of epilepsy. However, light delivery into the brain is still a major practical challenge that needs to be addressed before future clinical translation is feasible. Not only does light delivery into the brain require surgically implanted hardware that can be both invasive and restrictive, but it is also difficult to illuminate large or complicated structures in the brain due to light scatter and attenuation. We have bypassed the challenges of external light delivery by directly coupling a bioluminescent light source (a genetically encoded Renilla luciferase) to an inhibitory opsin (Natronomonas halorhodopsin) as a single fusion protein, which we term an inhibitory luminopsin (iLMO). iLMOs were developed and characterized in vitro and in vivo using intracellular recordings, multielectrode arrays, and behavioral testing. iLMO2 was shown to generate hyperpolarizing outward currents in response to both external light and luciferase substrate, which was sufficient to suppress action potential firing and synchronous bursting activity in vitro. iLMO2 was further shown to suppress single-unit firing rate and local field potentials in the hippocampus of anesthetized and awake animals. Finally, expression of iLMO was scaled up to multiple structures of the basal ganglia to modulate rotational behavior of freely moving animals in a hardware-independent fashion. iLMO2 was further utilized to acutely suppress focal epileptic discharges induced by intracerebral injection of bicuculline and generalized seizures resulting from systemic administration of pentylenetetrazol. Inhibitory luminopsins have enabled the possibility of optogenetic inhibition of neural activity in a non-invasive and hardware-independent fashion. This work increases the versatility, scalability, and practicality of utilizing optogenetic approaches for halting seizure activity in vivo.
Development of optical methods for real-time whole-brain functional imaging of zebrafish neuronal activity
Each one of us has, at least once in his life, smelled the scent of roses, read one canto of Dante’s Commedia or listened to the sound of the sea from a shell. All of this is possible thanks to the astonishing capabilities of an organ, the brain, that allows us to collect and organize perceptions coming from the sensory organs and to produce behavioural responses accordingly. Studying an operating brain in a non-invasive way is extremely difficult in mammals, and particularly in humans. In the last decade, a small teleost fish, the zebrafish (Danio rerio), has been making its way into the field of neuroscience. The brain of a larval zebrafish is made up of 'only' 100,000 neurons and is completely transparent, making it possible to access it optically. Here, taking advantage of the best currently available technology, we devised optical solutions to investigate the dynamics of neuronal activity throughout the entire brain of zebrafish larvae.
A Survey of Smart Classroom Literature
Recently, there has been a substantial amount of research on smart classrooms, encompassing a number of areas, including Information and Communication Technology, Machine Learning, Sensor Networks, Cloud Computing, and Hardware. Smart classroom research has been quickly implemented to enhance education systems, resulting in higher engagement and empowerment of students, educators, and administrators. Despite decades of using emerging technology to improve teaching practices, critics often point out that these methods lack adequate theoretical and technical foundations.
As a result, there have been a number of conflicting reviews on different perspectives of smart classrooms. For a realistic smart classroom approach, a piecemeal implementation is insufficient.
This survey contributes to the current literature by presenting a comprehensive analysis of various disciplines using a standard terminology and taxonomy. This multi-field study reveals new research possibilities and problems that must be tackled in order to integrate interdisciplinary works in a synergic manner. Our analysis shows that the smart classroom is a rapidly developing research area that complements a number of emerging technologies. Moreover, this paper also describes the co-occurrence network of technological keywords using VOSviewer for an in-depth analysis.
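The keyword co-occurrence network mentioned above is built from pair counts: two keywords are linked, with weight equal to the number of papers in which they appear together. A minimal sketch of that counting step (illustrative; VOSviewer performs this internally along with clustering and layout):

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count how often each pair of keywords appears together in the
    same paper; the counts are the edge weights of a co-occurrence
    network like the one VOSviewer visualizes."""
    edges = Counter()
    for kws in keyword_lists:
        # sorted() gives each unordered pair one canonical key
        for a, b in combinations(sorted(set(kws)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical keyword lists from three papers
papers = [
    ["iot", "machine learning", "smart classroom"],
    ["cloud computing", "smart classroom"],
    ["machine learning", "smart classroom"],
]
net = cooccurrence(papers)
print(net[("machine learning", "smart classroom")])   # 2
```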