A distributed camera system for multi-resolution surveillance
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor.
Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database.
Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines whether active zoom cameras should be dispatched to observe a particular target, and this decision is communicated by writing demands into another database table.
We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility for multi-camera systems for intelligent surveillance.
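The repository-mediated communication described in this abstract can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the SQL server, and the table names (`observations`, `demands`), column names, and function names are all hypothetical; the abstract does not give the actual schema or dispatch policy.

```python
import sqlite3

def open_repository():
    """Create an in-memory stand-in for the central SQL repository."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE observations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            camera_id TEXT, target_id INTEGER, x REAL, y REAL, t REAL);
        CREATE TABLE demands (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ptz_id TEXT, target_id INTEGER, x REAL, y REAL,
            status TEXT DEFAULT 'pending');
    """)
    return db

def record_observation(db, camera_id, target_id, x, y, t):
    """Called by a static-camera tracker: store one tracking result."""
    db.execute(
        "INSERT INTO observations (camera_id, target_id, x, y, t) "
        "VALUES (?, ?, ?, ?, ?)",
        (camera_id, target_id, x, y, t))
    db.commit()

def supervisor_step(db, ptz_id):
    """Write a demand for every tracked target that has none yet.

    Uses the latest observation of each target. Returns the number of
    demands written. The dispatch rule is an assumption for illustration.
    """
    rows = db.execute("""
        SELECT o.target_id, o.x, o.y FROM observations o
        WHERE o.id = (SELECT MAX(id) FROM observations
                      WHERE target_id = o.target_id)
          AND o.target_id NOT IN (SELECT target_id FROM demands)
        ORDER BY o.target_id
    """).fetchall()
    for target_id, x, y in rows:
        db.execute(
            "INSERT INTO demands (ptz_id, target_id, x, y) VALUES (?, ?, ?, ?)",
            (ptz_id, target_id, x, y))
    db.commit()
    return len(rows)

def next_demand(db, ptz_id):
    """Called by the PTZ process: claim the oldest pending demand, if any."""
    row = db.execute(
        "SELECT id, target_id, x, y FROM demands "
        "WHERE ptz_id = ? AND status = 'pending' ORDER BY id LIMIT 1",
        (ptz_id,)).fetchone()
    if row is None:
        return None
    db.execute("UPDATE demands SET status = 'active' WHERE id = ?", (row[0],))
    db.commit()
    return {"target_id": row[1], "x": row[2], "y": row[3]}
```

Because the tracker, the supervisor, and the PTZ process only read and write tables, none of them needs to know whether the others are running, which is the asynchronous property the abstract emphasises; the same tables also serve as the archive.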
Reproducible Evaluation of Pan-Tilt-Zoom Tracking
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in
computer vision for many years. However, it is very difficult to assess the
progress that has been made on this topic because there is no standard
evaluation methodology. The difficulty in evaluating PTZ tracking algorithms
arises from their dynamic nature. In contrast to other forms of tracking, PTZ
tracking involves both locating the target in the image and controlling the
motors of the camera to aim it so that the target stays in its field of view.
This type of tracking can only be performed online. In this paper, we propose a
new evaluation framework based on a virtual PTZ camera. With this framework,
tracking scenarios do not change for each experiment and we are able to
replicate online PTZ camera control and behavior including camera positioning
delays, tracker processing delays, and numerical zoom. We tested our evaluation
framework with the Camshift tracker to show its viability and to establish
baseline results.
Comment: This is an extended version of the 2015 ICIP paper "Reproducible
Evaluation of Pan-Tilt-Zoom Tracking"
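To make the two delay sources named in this abstract concrete, the following is a minimal sketch under simplifying assumptions, not the framework's actual implementation. The class `VirtualPTZ`, the function `next_frame_index`, and the constant-speed, pan-only motion model are all hypothetical, and numerical zoom is not modelled.

```python
import math

class VirtualPTZ:
    """Pan-only virtual camera that needs time to reach a commanded angle."""

    def __init__(self, pan_speed_deg_s, pan=0.0):
        self.pan_speed = pan_speed_deg_s  # assumed constant slew rate
        self.pan = pan                    # current pan angle in degrees
        self.target = pan                 # last commanded pan angle

    def command(self, target_pan):
        """Record a new pan command; the camera does not jump there."""
        self.target = target_pan

    def advance(self, dt):
        """Move toward the commanded angle for dt seconds at bounded speed."""
        step = self.pan_speed * dt
        error = self.target - self.pan
        self.pan += max(-step, min(step, error))

def next_frame_index(current_index, processing_time_s, fps):
    """Index of the next frame the tracker sees in a prerecorded scenario.

    Frames that arrive while the tracker is still busy are dropped, so a
    slow tracker observes larger target displacements between updates.
    """
    skipped = math.ceil(processing_time_s * fps)
    return current_index + max(1, skipped)
```

Since the scenario is replayed from the same recording every time, two trackers with different processing times see different subsets of the same frames, which is what makes the comparison repeatable.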
Low and Variable Frame Rate Face Tracking Using an IP PTZ Camera
SUMMARY
In computer vision, object tracking with PTZ cameras has applications in various domains, such as video surveillance, traffic monitoring, people monitoring, and face recognition. However, more accurate, efficient, and reliable tracking is required for routine use in these domains. In this thesis, tracking is applied to the human upper body, including the face. Face tracking determines the location of the face in each frame of a video. It can be used to obtain images of a person's face in different poses. In this work, we propose to track a human face using an IP PTZ camera (a steerable network camera). An IP PTZ camera responds to commands via its built-in web server and allows distributed access over the Internet. Tracking with this type of camera involves many challenges, such as irregular response times to control commands, low and irregular frame rates, large target motions between two frames, occlusions, changes in the field of view, and scale changes.
In our work, we aim to solve the problems of large target motion between two consecutive frames, low frame rate, background changes, and tracking under various scale changes. In addition, the tracking algorithm must account for the camera's irregular response times.
Our solution consists of an initialization phase that models the target (the upper body), an adaptation of the particle filter that uses optical flow to generate samples at each frame (APF-OFS), and camera control. Each component requires different strategies.
During initialization, the camera is assumed to be static. Motion detection by background subtraction is therefore used to detect the person's initial location. Next, to remove false positives, a Bayesian classifier is applied to the detected region to locate skin regions. Face detection based on the Viola and Jones method is then performed on the skin regions. If a face is detected, tracking is started on the person's upper body.
ABSTRACT
Object tracking with PTZ cameras has various applications in different computer vision topics such as video surveillance, traffic monitoring, people monitoring and face recognition. Accurate, efficient, and reliable tracking is required for this task. Here, object tracking is applied to human upper body tracking and face tracking. Face tracking determines the location of the human face for each input image of a video. It can be used to get images of the face of a human target under different poses. We propose to track the human face by means of an Internet Protocol (IP) Pan-Tilt-Zoom (PTZ) camera (i.e. a network-based camera that pans, tilts and zooms). An IP PTZ camera responds to commands via its integrated web server. It allows distributed access from the Internet (access from anywhere, but with undefined delay). Tracking with such a camera involves many challenges such as irregular response times to camera control commands, low and irregular frame rate, large motions of the target between two frames, target occlusion, changing field of view (FOV), various scale changes, etc.
In our work, we want to cope with the problems of large inter-frame motion of targets, low usable frame rate, background changes, and tracking with various scale changes. In addition, the tracking algorithm should handle the camera response time and zooming.
Our solution consists of a system initialization phase, which is the processing performed before camera motion, followed by a tracker based on an Adaptive Particle Filter using Optical Flow based Sampling (APF-OFS) and camera control, which are the processing performed after the camera moves. Each part requires different strategies.
For initialization, when the camera is stationary, motion detection for a static camera is used to detect the initial location of the face of a person entering an area. For motion detection in the FOV of the camera, a background subtraction method is applied. Then, to remove false positives, a Bayesian skin classifier is applied on the detected motion region to discriminate skin regions from non-skin regions. Face detection based on the Viola and Jones face detector can be performed on the detected skin regions independently of face size and position within the image.
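The idea of drawing particle-filter samples around a motion prediction, which the abstract names but does not detail, can be illustrated with a generic sketch. This is not the APF-OFS algorithm itself: the function names, the Gaussian proposal, the fixed `spread`, and the toy likelihood are assumptions, and the `flow` argument stands in for a displacement that would be measured by optical flow.

```python
import math
import random

def flow_guided_step(prev_estimate, flow, likelihood, n_particles=500,
                     spread=2.0, rng=None):
    """One particle-filter update with samples drawn around a flow prediction."""
    rng = rng or random.Random()
    # Predict: shift the previous estimate by the measured image motion, so
    # samples land near the target even after a large inter-frame jump.
    cx, cy = prev_estimate[0] + flow[0], prev_estimate[1] + flow[1]
    particles = [(rng.gauss(cx, spread), rng.gauss(cy, spread))
                 for _ in range(n_particles)]
    # Weight each sample by how well it matches the target appearance.
    weights = [likelihood(p) for p in particles]
    total = sum(weights)
    if total == 0.0:
        return (cx, cy)  # no sample matched; fall back on the prediction
    x = sum(w * p[0] for w, p in zip(weights, particles)) / total
    y = sum(w * p[1] for w, p in zip(weights, particles)) / total
    return (x, y)

def toy_likelihood(true_pos, sigma=1.0):
    """Hypothetical appearance score that peaks at the true target position."""
    def score(p):
        d2 = (p[0] - true_pos[0]) ** 2 + (p[1] - true_pos[1]) ** 2
        return math.exp(-d2 / (2.0 * sigma ** 2))
    return score
```

With a low and irregular frame rate, sampling around the previous position alone would place most particles far from the target; centring the samples on the flow-shifted position is one way to keep the particle count small.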
Plastic Mannequin-Based Robotic Telepresence for Remote Clinical Ward Rounding
Mobile robotic telepresence is a potential solution to the problem of access to quality healthcare delivery in rural areas. Despite the availability of this system in its different forms, the capital and operating costs are unaffordable for people living in rural areas, particularly in emerging economies. In this paper, the authors reduced the cost of a mobile robotic telepresence solution for remote ward rounding using a plastic mannequin and solar photovoltaic technology. An IP camera was fixed in each of the eye sockets of the plastic mannequin. These cameras are connected to a mini-computer embedded in the plastic mannequin. A Wi-Fi module establishes an Internet connection between remote physicians and rural healthcare facilities. In addition, most of these communities are not even connected to the power grid. Therefore, the system is powered by a solar photovoltaic energy source to provide a cheap and reliable power system. Another unique feature of this solution is that it gives the patient a better impression of the physical presence of a physician. This development will increase the adoption of robotic telepresence for remote clinical ward rounding in developing countries.
Cost-Effective Medical Robotic Telepresence Solution using Plastic Mannequin
Robotic telepresence is an Information and Communication Technology (ICT) solution that has a huge potential to address the problem of access to quality healthcare delivery in rural areas. However, the capital and operating costs of available systems are considered to be unaffordable for rural dwellers in emerging economies. In addition, most of these communities are not even connected to the power grid. In this paper, the authors reduced the cost of engaging a robotic telepresence solution for rural medicare by using a plastic mannequin and solar photovoltaic technology. An IP camera was fixed in each of the eye sockets of the plastic mannequin. These cameras are connected to a mini-computer embedded in the plastic mannequin. A Wi-Fi module establishes an Internet connection between remote physicians and rural healthcare facilities. The system is powered by a solar photovoltaic energy source to guarantee power availability. Another unique feature of this solution is that it gives the patient a better impression of the physical presence of a physician. Comparative cost analysis with robotic telepresence available in the market showed that our system is more affordable. This development will increase the adoption of robotic telepresence in rural telemedicine.
QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios
Concerns about individuals' security have justified the increasing number of surveillance
cameras deployed both in private and public spaces. However, contrary to popular belief,
these devices are in most cases used solely for recording, instead of feeding intelligent analysis
processes capable of extracting information about the observed individuals. Thus, even though
video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant
details about the subjects that took part in a crime depends on the manual inspection
of recordings. As such, the current goal of the research community is the development of
automated surveillance systems capable of monitoring and identifying subjects in surveillance
scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric
recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at
designing a visual surveillance system capable of acquiring biometric data at a distance (e.g.,
face, iris or gait) without requiring human intervention in the process, as well as devising biometric
recognition methods robust to the degradation factors resulting from the unconstrained
acquisition process.
Regarding the first goal, the analysis of the data acquired by typical surveillance systems
shows that large acquisition distances significantly decrease the resolution of biometric samples,
and thus their discriminability is not sufficient for recognition purposes. In the literature,
diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring
high-resolution imagery at a distance, particularly when using a master-slave configuration. In
the master-slave configuration, the video acquired by a typical surveillance camera is analyzed
for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged
at high-resolution by the PTZ camera. Several methods have already shown that this configuration
can be used for acquiring biometric data at a distance. Nevertheless, these methods
failed at providing effective solutions to the typical challenges of this strategy, restraining its
use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development
of a biometric data acquisition system based on the cooperation of a PTZ camera
with a typical surveillance camera. The first proposal is a camera calibration method capable
of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ
camera. The second proposal is a camera scheduling method for determining - in real-time -
the sequence of acquisitions that maximizes the number of different targets obtained, while
minimizing the cumulative transition time. In order to achieve the first goal of this thesis,
both methods were combined with state-of-the-art approaches of the human monitoring field
to develop a fully automated surveillance system capable of acquiring biometric data at a distance and
without human cooperation, designated the QUIS-CAMPI system.
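The scheduling objective stated above (observe as many different targets as possible while keeping the cumulative transition time low) can be illustrated with a simple greedy heuristic. This is a sketch for intuition only and is not the thesis's scheduling method: the function names, the constant slew speed, the fixed dwell time, and the per-target leave times are all assumptions.

```python
def transition_time(a, b, speed_deg_s):
    """Time to slew between two (pan, tilt) poses; both axes move at once."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) / speed_deg_s

def greedy_schedule(start, targets, speed_deg_s, dwell_s):
    """Pick, at each step, the reachable target with the shortest slew.

    targets maps a name to ((pan, tilt), leave_time_s), where leave_time_s
    is when the target is assumed to leave the scene.
    Returns (visit order, cumulative transition time).
    """
    pose, now, total = start, 0.0, 0.0
    remaining = dict(targets)
    order = []
    while remaining:
        # Keep only targets the camera can reach before they leave.
        reachable = []
        for name, (p, leave) in remaining.items():
            t = transition_time(pose, p, speed_deg_s)
            if now + t <= leave:
                reachable.append((t, name))
        if not reachable:
            break
        t, name = min(reachable)
        order.append(name)
        total += t
        now += t + dwell_s  # slew, then hold on the target to acquire data
        pose = remaining.pop(name)[0]
    return order, total
```

A greedy choice is fast enough for real-time use but can miss targets: in a scene where a nearby target stays for a long time and a slightly farther one is about to leave, visiting the nearest first may lose the second. That gap is why a dedicated scheduling method is needed.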
The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis
of the performance of the state-of-the-art biometric recognition approaches shows that these
approaches attain almost ideal recognition rates in unconstrained data. However, this performance
is incongruous with the recognition rates observed in surveillance scenarios. Taking into
account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising
biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a
distance ranging from 5 to 40 meters and without human intervention in the acquisition process.
This set allows an objective assessment of the performance of state-of-the-art biometric recognition
methods in data that truly encompass the covariates of surveillance scenarios. As such, this set
was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained
by the nine methods specially designed for this competition. In addition, the data acquired by
the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the
development of methods robust to the covariates of surveillance scenarios. The first proposal
regards a method for detecting corrupted features in biometric signatures inferred by a redundancy
analysis algorithm. The second proposal is a caricature-based face recognition approach
capable of enhancing the recognition performance by automatically generating a caricature
from a 2D photo. The experimental evaluation of these methods shows that both approaches
contribute to improving the recognition performance in unconstrained data.
The growing concern with the security of individuals has justified the growth in the number
of video surveillance cameras installed in both private and public spaces. However, contrary
to what is commonly thought, these devices are, in most cases, used only for recording and are
not connected to any kind of intelligent software capable of inferring, in real time, information
about the observed individuals. Thus, although video surveillance has proved essential in
solving several crimes, its use is still confined to providing videos that must be manually
inspected to extract relevant information about the subjects involved in the crime. As such,
the main challenge currently facing the scientific community is the development of automated
systems capable of monitoring and identifying individuals in video surveillance environments.
The main goal of this thesis is to extend the applicability of biometric recognition systems
to video surveillance environments. More specifically, we aim to 1) design a video surveillance
system that can acquire biometric data at long distances (e.g., images of the face or iris, or
gait videos) without requiring the cooperation of the individuals in the process; and
2) develop biometric recognition methods robust to the degradation factors inherent in the
data acquired by this type of system.
Regarding the first goal, the analysis of the data acquired by typical video surveillance
systems shows that, because of the capture distance, the sampled biometric traits are not
discriminative enough to guarantee acceptable recognition rates. In the literature, several
works advocate the use of Pan Tilt Zoom (PTZ) cameras to acquire high-resolution images at a
distance, particularly the use of these devices in master-slave mode. In the master-slave
configuration, an intelligent analysis module selects regions of interest (e.g., cars, people)
from the video acquired by a video surveillance camera, and the PTZ camera is pointed to acquire
the regions of interest at high resolution. Several methods have already shown that this
configuration can be used to acquire biometric data at a distance; even so, they were not able
to solve some problems related to this strategy, which prevents its use in video surveillance
environments. Accordingly, this thesis proposes two methods to enable the acquisition of
biometric data in video surveillance environments using a PTZ camera assisted by a typical
video surveillance camera. The first is a calibration method capable of accurately mapping the
coordinates of the master camera to the angle of the PTZ (slave) camera without the aid of other
optical devices. The second method determines the order in which a set of subjects will be
observed by the PTZ camera. The proposed method can determine in real time the sequence of
observations that maximizes the number of different subjects observed and simultaneously
minimizes the total transition time between subjects. In order to achieve the first goal of
this thesis, the two proposed methods were combined with the advances achieved in the field of
human monitoring to develop the first fully automated video surveillance system capable of
acquiring biometric data at long distances without requiring the cooperation of the individuals
in the process, designated the QUIS-CAMPI system.
The QUIS-CAMPI system is the starting point for the research related to the second goal of
this thesis. The analysis of the performance of state-of-the-art biometric recognition methods
shows that they achieve almost perfect recognition rates in data acquired without constraints
(e.g., recognition rates above 99% on the LFW dataset). However, this performance is not
corroborated by the results observed in video surveillance environments, which suggests that
current datasets do not truly contain the degradation factors typical of video surveillance
environments. Taking into account the vulnerabilities of current biometric datasets, this
thesis introduces a new biometric dataset (face images and gait videos) acquired by the
QUIS-CAMPI system at a maximum distance of 40 m and without the cooperation of the subjects in
the acquisition process. This set allows an objective evaluation of the performance of
state-of-the-art methods in recognizing individuals in images/videos captured in a real video
surveillance environment. As such, this set was used to promote the first competition on
biometric recognition in uncontrolled environments. This thesis describes the evaluation
protocols used, as well as the results obtained by 9 methods specially designed for this
competition. In addition, the data acquired by the QUIS-CAMPI system were essential for the
development of two methods to increase robustness to the degradation factors observed in video
surveillance environments. The first is a method for detecting corrupted features in biometric
signatures through the analysis of the redundancy between subsets of features. The second is a
face recognition method based on caricatures automatically generated from a single photo of
the subject. The experiments carried out show that both methods reduce the error rates in data
acquired in an uncontrolled manner.