
    Optimisation du suivi de personnes dans un réseau de caméras

    This thesis addresses the problem of improving the performance of the people-tracking process in a new framework called Global Tracker, which evaluates the quality of people trajectories (obtained by a simple tracker) and recovers potential errors from that first stage. The first part of the Global Tracker estimates the quality of the tracking results using a statistical model that analyses the distributions of the target's features (such as its dimensions, speed and direction) to detect potential anomalies. To differentiate real errors from natural phenomena such as optical effects, all interactions between the tracked object and its surroundings (other moving objects and background elements) are analysed. In the second part, a post-tracking method is designed to associate the different tracklets (reliable trajectory segments) corresponding to the same person that were not associated by the first tracking stage. This tracklet-matching process selects the most salient and discriminative appearance features to compute a visual signature adapted to each tracklet. Finally, the Global Tracker is evaluated on several benchmark datasets reproducing a wide variety of real-life situations, performing on par with or better than state-of-the-art trackers.
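As a rough, hedged illustration of the quality-estimation step, the Python sketch below models the distribution of a single target feature (bounding-box height is assumed here) with a plain Gaussian and flags frames that deviate strongly; the thesis's actual statistical model and the interaction analysis are not reproduced.

import numpy as np

def anomaly_scores(feature_values, eps=1e-6):
    """Z-score of each observation against the tracklet's own feature
    distribution (a deliberately simple stand-in for the statistical model)."""
    values = np.asarray(feature_values, dtype=float)
    return np.abs(values - values.mean()) / (values.std() + eps)

# Toy example: bounding-box heights of one tracked person over consecutive frames.
heights = [172, 170, 171, 173, 120, 172, 171]      # frame 4 looks suspicious
suspect_frames = np.where(anomaly_scores(heights) > 2.0)[0]
print(suspect_frames)                              # -> [4]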

    Person re-Identification over distributed spaces and time

    Replicating the human visual system and the cognitive abilities the brain uses to process the information it receives is an area of substantial scientific interest. With the prevalence of video surveillance cameras, part of this scientific drive has gone into providing useful automated counterparts to human operators. A prominent task in visual surveillance is matching people between disjoint camera views, or re-identification. This allows operators to locate people of interest and track people across cameras, and it can be used as a precursory step to multi-camera activity analysis. However, due to the contrasting conditions between camera views and their effects on the appearance of people, re-identification is a non-trivial task. This thesis proposes solutions for reducing the visual ambiguity in observations of people between camera views. It first looks at a method for mitigating the effects of differing lighting conditions between camera views on the appearance of people, building on work that models inter-camera illumination from known pairs of images. A Cumulative Brightness Transfer Function (CBTF) is proposed to estimate the mapping of colour brightness values from limited training samples. Unlike previous methods that use a mean-based representation of a set of training samples, the cumulative nature of the CBTF retains colour information from underrepresented samples in the training set. Additionally, the bi-directionality of the mapping function is explored to maximise re-identification accuracy by ensuring samples are accurately mapped between cameras. Secondly, an extension to the CBTF framework is proposed that addresses the issue of changing lighting conditions within a single camera. As the CBTF requires manually labelled training samples, it is limited to static lighting conditions and is less effective if the lighting changes. This Adaptive CBTF (A-CBTF) differs from previous approaches that either do not consider lighting change over time or rely on camera transition time information to update. By utilising contextual information drawn from the background in each camera view, an estimation of the lighting change within a single camera can be made. This background lighting model allows colour information to be mapped back to the original training conditions and thus removes the need for retraining. Thirdly, a novel reformulation of re-identification as a ranking problem is proposed. Previous methods use a score based on a direct distance measure of set features to form a correct/incorrect match result. Rather than offering an operator a single outcome, the ranking paradigm gives the operator a ranked list of possible matches and lets them make the final decision. By utilising a Support Vector Machine (SVM) ranking method, a weighting on the appearance features can be learned that capitalises on the fact that not all image features are equally important to re-identification. Additionally, an Ensemble-RankSVM is proposed to address scalability issues by separating the training samples into smaller subsets and boosting the trained models. Finally, the thesis looks at a practical application of the ranking paradigm in a real-world setting. The system encompasses both the re-identification stage and the precursory extraction and tracking stages to form an aid for CCTV operators. Segmentation and detection are combined to extract relevant information from the video, while several matching techniques are combined with temporal priors to form a more comprehensive overall matching criterion. The effectiveness of the proposed approaches is tested on datasets obtained from a variety of challenging environments, including offices, apartment buildings, airports and outdoor public spaces.
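A minimal sketch of the cumulative mapping idea behind the CBTF, assuming greyscale uint8 images and nearest-cumulative-frequency matching; the function name and the exact matching rule are illustrative assumptions, not the thesis's formulation.

import numpy as np

def cbtf(src_images, dst_images, bins=256):
    """Accumulate brightness histograms over *all* training samples from each
    camera (rather than averaging per-pair functions), then map every source
    level to the destination level with the closest cumulative frequency."""
    src_hist = np.zeros(bins)
    dst_hist = np.zeros(bins)
    for img in src_images:
        src_hist += np.bincount(img.ravel(), minlength=bins)[:bins]
    for img in dst_images:
        dst_hist += np.bincount(img.ravel(), minlength=bins)[:bins]
    src_cdf = np.cumsum(src_hist) / src_hist.sum()
    dst_cdf = np.cumsum(dst_hist) / dst_hist.sum()
    lut = np.searchsorted(dst_cdf, src_cdf).clip(0, bins - 1)
    return lut.astype(np.uint8)

# Usage (illustrative): remap a probe image from camera A into camera B's
# brightness space before computing appearance descriptors.
# mapped = cbtf(camA_training_images, camB_training_images)[probe_image]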

    Re-identifying people in the crowd

    Developing an automated surveillance system is of great interest for various reasons, including forensic and security applications. In the case of a network of surveillance cameras with non-overlapping fields of view, person detection and tracking alone are insufficient to track a subject of interest across the network. In this case, instances of a person captured in one camera view need to be retrieved among a gallery of different people in other camera views. This vision problem is commonly known as person re-identification (re-id). Cross-view instances of pedestrians exhibit illumination, viewpoint and pose variations, which makes the problem very challenging. Despite recent progress towards improving accuracy, existing systems suffer from low applicability to real-world scenarios. This is mainly caused by the need for large amounts of annotated data from pairwise camera views to be available for training. Given the difficulty of obtaining such data and annotating it, this thesis aims to bring the person re-id problem a step closer to real-world deployment. In the first contribution, the single-shot protocol, where each individual is represented by a pair of images that need to be matched, is considered. Following the extensive annotation of four datasets for six attributes, an evaluation of the most widely used feature extraction schemes is conducted. The results reveal two high-performing descriptors among those evaluated, and show illumination variation to have the most impact on re-id accuracy. Motivated by the wide availability of videos from surveillance cameras and the additional visual and temporal information they provide, video-based person re-id is then investigated, and a supervised system is developed. This is achieved by improving and extending the best performing image-based person descriptor into three dimensions and combining it with distance metric learning. The resulting system achieves state-of-the-art results on two widely used datasets. Given the cost and difficulty of obtaining labelled data from pairwise cameras in a network to train the model, an unsupervised video-based person re-id method is also developed. It is based on a set-based distance measure that leverages rank vectors to estimate the similarity scores between person tracklets. The proposed system outperforms other unsupervised methods by a large margin on two datasets while competing with deep learning methods on another large-scale dataset.
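To convey the flavour of the unsupervised, set-based measure, here is a hedged Python sketch that scores gallery tracklets by averaging per-frame rank vectors; the actual construction and aggregation of rank vectors in the thesis differ in detail.

import numpy as np

def rank_based_scores(probe_frames, gallery):
    """For every probe frame descriptor, rank gallery identities by the
    distance to their nearest frame descriptor; average the rank vectors
    over all probe frames (lower mean rank = more similar)."""
    ids = list(gallery)
    avg_rank = np.zeros(len(ids))
    for f in probe_frames:
        dists = np.array([min(np.linalg.norm(f - g) for g in gallery[i]) for i in ids])
        avg_rank += dists.argsort().argsort()      # rank vector for this frame
    return dict(zip(ids, avg_rank / len(probe_frames)))

# Toy usage with 4-D frame descriptors.
rng = np.random.default_rng(0)
gallery = {"id_a": rng.normal(0, 1, (5, 4)), "id_b": rng.normal(3, 1, (5, 4))}
probe = rng.normal(0, 1, (3, 4))                   # statistically resembles id_a
scores = rank_based_scores(probe, gallery)
print(min(scores, key=scores.get))                 # expected: id_a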

    Smart video surveillance of pedestrians: fixed, aerial, and multi-camera methods

    Crowd analysis from video footage is an active research topic in the field of computer vision. Crowds can be analysed using different approaches, depending on their characteristics. Furthermore, analysis can be performed on footage obtained from different sources: fixed CCTV cameras can be used, as well as cameras mounted on moving vehicles. To begin, a literature review is provided, in which research in the fields of crowd analysis, object and people tracking, occlusion handling, multi-view and sensor fusion, and multi-target tracking is analysed and compared, and the advantages and limitations of each approach are highlighted. Following that, the three contributions of this thesis are presented: in a first study, crowds are classified based on various cues (e.g. density, entropy) so that the best approaches for further behaviour analysis can be selected; then, some of the challenges of individual target tracking from aerial video footage are tackled; finally, a study on the analysis of groups of people from multiple cameras is proposed. The analysis entails the movements of people and objects in the scene. The idea is to track as many people as possible within the crowd, to obtain knowledge from their movements as a group, and to classify different types of scenes. Additional contributions of this thesis are two novel datasets: one to test the proposed aerial video analysis methods, and another, recorded with groups of people performing different actions in front of multiple overlapping cameras, to validate the third study.
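As a hedged illustration of selecting an analysis approach from simple crowd cues, the sketch below computes a density cue and a spatial-occupancy entropy cue from detected positions; the cue definitions and the threshold are assumptions for illustration, not the thesis's measures.

import numpy as np

def crowd_cues(positions, frame_area, bins=8):
    """Density = detections per unit area; entropy of the 2-D occupancy
    histogram indicates how evenly people are spread over the scene."""
    positions = np.asarray(positions, dtype=float)
    density = len(positions) / frame_area
    hist, _, _ = np.histogram2d(positions[:, 0], positions[:, 1], bins=bins)
    p = hist.ravel() / hist.sum()
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return density, entropy

# Illustrative decision: sparse scenes allow per-person tracking, dense ones
# favour holistic crowd analysis.
density, entropy = crowd_cues(np.random.rand(40, 2) * 100, frame_area=100 * 100)
approach = "individual tracking" if density < 0.01 else "holistic crowd analysis"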

    Few-Shot Deep Adversarial Learning for Video-based Person Re-identification

    Video-based person re-identification (re-ID) refers to matching people across camera views from arbitrary unaligned video footage. Existing methods rely on supervision signals to optimise a projected space under which inter-/intra-video distances are maximised/minimised. However, this demands exhaustively labelling people across camera views, rendering these methods unable to scale to large camera networks. Moreover, learning effective video representations with view invariance is not explicitly addressed, and features otherwise exhibit different distributions across views. Thus, matching videos for person re-ID demands flexible models that capture the dynamics of time-series observations and learn view-invariant representations with access to limited labelled training samples. In this paper, we propose a novel few-shot deep learning approach to video-based person re-ID that learns comparable representations which are discriminative and view-invariant. The proposed method is developed on variational recurrent neural networks (VRNNs) and trained adversarially to produce latent variables with temporal dependencies that are highly discriminative yet view-invariant for matching persons. Through extensive experiments conducted on three benchmark datasets, we empirically show the capability of our method to create view-invariant temporal features and the state-of-the-art performance it achieves. Comment: Appearing in IEEE Transactions on Image Processing.
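For orientation only, below is a minimal PyTorch-style sketch of one variational recurrent (VRNN) step of the kind the paper builds on; the adversarial view-discriminator, the person-matching loss and the authors' exact parameterisation are omitted, and all layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class VRNNCell(nn.Module):
    """One VRNN step: a prior p(z_t | h_{t-1}) and a posterior
    q(z_t | x_t, h_{t-1}), both diagonal Gaussians, plus a GRU recurrence
    over the frame feature and the sampled latent."""
    def __init__(self, x_dim, z_dim, h_dim):
        super().__init__()
        self.prior = nn.Linear(h_dim, 2 * z_dim)
        self.post = nn.Linear(x_dim + h_dim, 2 * z_dim)
        self.rnn = nn.GRUCell(x_dim + z_dim, h_dim)

    def forward(self, x_t, h):
        pm, plv = self.prior(h).chunk(2, dim=-1)                      # prior mean / log-var
        qm, qlv = self.post(torch.cat([x_t, h], -1)).chunk(2, dim=-1)
        z = qm + torch.randn_like(qm) * (0.5 * qlv).exp()             # reparameterised sample
        kl = 0.5 * (plv - qlv + (qlv.exp() + (qm - pm) ** 2) / plv.exp() - 1).sum(-1)
        h = self.rnn(torch.cat([x_t, z], -1), h)
        return z, h, kl.mean()

# Illustrative use: iterate over a tracklet's frame features, keep the z_t
# sequence as the temporal representation, and feed it to (i) a matching loss
# and (ii) an adversarial view classifier whose gradient is reversed.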

    Understanding Target Trajectory Behavior: A Dynamic Scene Modeling Approach

    Human behaviour analysis is one of the most active computer vision research fields. As the number of cameras increases, especially in controlled environments such as airports, train stations or museums, the need for automatic systems that can catalogue the information provided by the cameras becomes crucial. In crowded scenes, it is very difficult to distinguish people's behaviour from their gestures because the whole body is rarely visible. Behaviour analysis therefore relies on the evaluation of trajectories, adding high-level reasoning techniques so that this information can be used in several applications, such as video surveillance or traffic analysis. The purpose of this research is the design of a fully automatic human behaviour analysis system operating from a distance. On the one hand, two different multiple-target tracking methods and a novel person re-identification procedure are presented to detect every target of interest in the scene, returning their trajectories as output. On the other hand, a novel behaviour analysis system that includes information about the scene environment is provided. It is based on the idea that every person trying to reach a goal in the scene tends to follow the same path that the majority of people use. A set of extremely fast metrics for detecting abnormal movements is presented, giving the method the capabilities needed for real-time scenarios.
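A hedged sketch of a path-deviation style abnormality score, in the spirit of comparing an observed trajectory against the routes most people follow towards a goal; the concrete metrics proposed in the thesis are not reproduced here.

import numpy as np

def path_deviation(trajectory, common_paths):
    """Mean distance from each observed point to the nearest point of the
    closest 'common' path (learned beforehand from the majority of
    trajectories); large values suggest abnormal movement."""
    traj = np.asarray(trajectory, dtype=float)
    best = np.inf
    for path in common_paths:
        path = np.asarray(path, dtype=float)
        d = np.linalg.norm(traj[:, None, :] - path[None, :, :], axis=-1).min(axis=1)
        best = min(best, d.mean())
    return best

# The score can be thresholded per scene to flag abnormal trajectories fast
# enough for real-time use.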

    A Multi-Resident Number Estimation Method for Smart Homes

    Population aging requires innovative solutions to increase the quality of life and preserve autonomous and independent living at home. A need of particular significance is the identification of behavioral drifts. A relevant behavioral drift concerns sociality: older people tend to isolate themselves. There is therefore a need for methodologies that identify whether, when, and for how long the person is in the company of other people (possibly also estimating their number). The challenge is to address this task in poorly sensorized apartments, with non-intrusive sensors that are typically wireless and can only provide local and simple information. The proposed method addresses technological issues, such as PIR (Passive InfraRed) blind times, topological issues, such as sensor interference due to the inability to separate detection areas, and algorithmic issues. The house is modeled as a graph to constrain transitions between adjacent rooms. Each room is associated with a set of values, one for each identified person; these values decay over time and represent the probability that the person is still in the room. Because the sensors used cannot determine the number of people, the approach is based on a multi-branch inference that, over time, differentiates the movements in the apartment and estimates the number of people. The proposed algorithm has been validated with real data, obtaining an accuracy of 86.8%.
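To make the room-graph and decaying-presence idea concrete, here is a Python sketch of a toy apartment; the adjacency, the decay constant and the threshold are illustrative assumptions rather than the paper's parameters.

import math

ADJACENT = {                                   # the house modelled as a graph of rooms
    "kitchen": {"living"},
    "living": {"kitchen", "bedroom", "bathroom"},
    "bedroom": {"living"},
    "bathroom": {"living"},
}

def decay(presence, dt, tau=120.0):
    """Exponentially decay, per room and per person, the probability that
    the person is still there (the time constant is illustrative)."""
    return {room: {p: v * math.exp(-dt / tau) for p, v in people.items()}
            for room, people in presence.items()}

def explain_motion(presence, fired_room, threshold=0.3):
    """People who could plausibly have triggered a PIR in `fired_room`:
    anyone with enough residual presence there or in an adjacent room.
    An empty result is what would lead a multi-branch inference to
    hypothesise an additional, previously uncounted person."""
    reachable = {fired_room} | ADJACENT[fired_room]
    return [p for r in reachable for p, v in presence.get(r, {}).items()
            if v >= threshold]

# Toy state: person A last seen in the kitchen, person B in the bedroom.
state = {"kitchen": {"A": 0.9}, "bedroom": {"B": 0.8}}
state = decay(state, dt=30)
print(explain_motion(state, "living"))         # both A and B remain plausible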

    Taking the Temperature of Sports Arenas: Automatic Analysis of People
