
    YOLORe-IDNet: An Efficient Multi-Camera System for Person-Tracking

    Full text link
    The growing need for video surveillance in public spaces has created a demand for systems that can track individuals across multiple camera feeds in real time. While existing tracking systems have achieved impressive performance using deep learning models, they often rely on pre-existing images of suspects or historical data. However, this is not always feasible in cases where suspicious individuals are identified in real time and without prior knowledge. We propose a person-tracking system that combines correlation filters and Intersection over Union (IOU) constraints for robust tracking, along with a deep learning model for cross-camera person re-identification (Re-ID) on top of YOLOv5. The proposed system quickly identifies and tracks a suspect in real time across multiple cameras and recovers well after full or partial occlusion, making it suitable for security and surveillance applications. It is computationally efficient and achieves a high F1-score of 79% and an IOU of 59%, comparable to existing state-of-the-art algorithms, as demonstrated in our evaluation on the publicly available OTB-100 dataset. The proposed system offers a robust and efficient solution for the real-time tracking of individuals across multiple camera feeds. Its ability to track targets without prior knowledge or historical data is a significant improvement over existing systems, making it well suited for public safety and surveillance applications.
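The IOU constraint mentioned above is easy to make concrete. As a minimal sketch (the function names and the 0.5 gating threshold are illustrative assumptions, not taken from the paper), a detection is kept for a tracked target only if its box sufficiently overlaps the tracker's predicted box:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def gate(prediction, detections, threshold=0.5):
    """Keep only detections that overlap the tracker's prediction enough."""
    return [d for d in detections if iou(prediction, d) >= threshold]
```

The same `iou` function also yields the overlap score used when evaluating a tracker against ground-truth boxes, as in the OTB-100 results quoted above.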

    Person tracking with non-overlapping multiple cameras

    Get PDF
    Monitoring and tracking of any target in a surveillance system is an important task. When these targets are human, the problem becomes one of person identification and tracking. At present, a large-scale smart video surveillance system is an essential component of any commercial or public campus. Since the field of view (FOV) of a camera is limited, multiple cameras at different locations are needed to monitor a large area. This paper proposes a novel model for tracking a person across multiple non-overlapping cameras. It builds a reference signature of the person at the start of tracking and matches it against signatures subsequently captured by other cameras within the specified area of observation, using a support vector machine (SVM) trained between camera pairs. For the experiments, the wide area re-identification dataset (WARD) and a real-time scenario have been used, with color, shape and texture features for person re-identification.
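The reference-signature idea can be sketched as follows. This toy example uses only a joint colour histogram as the signature and a fixed histogram-intersection threshold as a stand-in for the trained per-camera-pair SVM described above; the function names, bin count and threshold are illustrative assumptions:

```python
def colour_histogram(pixels, bins=4):
    """pixels: list of (r, g, b) values in 0..255 -> normalised joint histogram."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def same_person(reference, candidate, threshold=0.6):
    """Match an incoming signature against the stored reference signature."""
    return similarity(reference, candidate) >= threshold
```

In the paper's setting, the decision function would be the camera-pair SVM applied to colour, shape and texture features rather than this fixed threshold.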

    3D Face Tracking and Texture Fusion in the Wild

    Full text link
    We present a fully automatic approach to real-time 3D face reconstruction from monocular in-the-wild videos. Using cascaded-regressor-based face tracking and 3D Morphable Face Model shape fitting, we obtain a semi-dense 3D face shape. We further use the texture information from multiple frames to build a holistic 3D face representation from the video. Our system captures facial expressions and does not require any person-specific training. We demonstrate the robustness of our approach on the challenging 300 Videos in the Wild (300-VW) dataset. Our real-time fitting framework is available as an open-source library at http://4dface.org.

    PALMAR: Towards Adaptive Multi-inhabitant Activity Recognition in Point-Cloud Technology

    Full text link
    With the advancement of deep neural networks and computer vision-based Human Activity Recognition (HAR), the employment of Point-Cloud Data (PCD) technologies (LiDAR, mmWave) has seen a lot of interest due to their privacy-preserving nature. Given the high promise of accurate PCD technologies, we develop PALMAR, a multiple-inhabitant activity recognition system, by employing efficient signal processing and novel machine learning techniques to track individual persons, towards an adaptive multi-inhabitant tracking and HAR system. More specifically, we propose (i) a voxelized feature representation-based real-time PCD fine-tuning method, (ii) efficient clustering (DBSCAN and BIRCH) with Adaptive Order Hidden Markov Model-based multi-person tracking and crossover-ambiguity reduction techniques, and (iii) a novel adaptive deep learning-based domain adaptation technique to improve the accuracy of HAR in the presence of data scarcity and diversity (device, location and population diversity). We experimentally evaluate our framework and systems using (i) real-time PCD collected by three devices (3D LiDAR and 79 GHz mmWave) from 6 participants, (ii) one publicly available 3D LiDAR activity dataset (28 participants) and (iii) an embedded hardware prototype system, which provided promising HAR performance in the multi-inhabitant (96%) scenario with a 63% improvement in multi-person tracking over the state-of-the-art framework, without losing significant system performance on the edge computing device.
    Comment: Accepted in IEEE International Conference on Computer Communications 202
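The clustering stage that separates point-cloud returns into per-person groups can be illustrated with a minimal DBSCAN implementation. This is a simplified sketch: the `eps` and `min_pts` values are illustrative, and the paper's full pipeline additionally uses BIRCH and HMM-based tracking on top of the clusters:

```python
def dbscan(points, eps=0.5, min_pts=4):
    """points: list of (x, y, z) returns. Returns one label per point; -1 = noise."""
    def neighbours(i):
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1                # start a new cluster from this core point
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point -> border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                seeds.extend(j_nbrs)  # j is a core point: keep expanding
    return labels
```

Each resulting cluster would then be summarised (e.g. by its centroid) and handed to the per-person tracker.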

    Real-time 3D human tracking for mobile robots with multisensors

    Full text link
    © 2017 IEEE. Acquiring the accurate 3-D position of a target person around a robot provides fundamental and valuable information applicable to a wide range of robotic tasks, including home service, navigation and entertainment. This paper presents a real-time robotic 3-D human tracking system which combines a monocular camera with an ultrasonic sensor via an extended Kalman filter (EKF). The proposed system consists of three sub-modules: a monocular camera sensor tracking model, an ultrasonic sensor tracking model and multi-sensor fusion. An improved visual tracking algorithm is presented to provide partial location estimation (2-D). The algorithm is designed to overcome severe occlusions, scale variation and target loss, and to achieve robust re-detection. The scale accuracy is further enhanced by the estimated 3-D information. An ultrasonic sensor array is employed to provide the range from the target person to the robot, and Gaussian Process Regression is used for partial location estimation (2-D). The EKF is adopted to sequentially process multiple heterogeneous measurements arriving in asynchronous order from the vision sensor and the ultrasonic sensor. In the experiments, the proposed tracking system is tested on both a simulation platform and an actual mobile robot in various indoor and outdoor scenes. The experimental results show the superior performance of the 3-D tracking system in terms of both accuracy and robustness.
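The sequential fusion idea — processing camera and ultrasonic measurements in arrival order, each weighted by its own noise level — can be reduced to a scalar Kalman-filter sketch. The paper uses a full EKF over a 3-D state; the 1-D random-walk model and the noise values below are illustrative assumptions only:

```python
def kf_predict(x, p, q=0.01):
    """Random-walk process model: state unchanged, uncertainty grows by q."""
    return x, p + q

def kf_update(x, p, z, r):
    """Fold in one measurement z with variance r; smaller r pulls harder."""
    k = p / (p + r)                      # Kalman gain
    return x + k * (z - x), (1 - k) * p

x, p = 0.0, 1.0                          # initial estimate and variance
R_CAMERA, R_ULTRASONIC = 0.05, 0.2       # assumed measurement variances

# Measurements arrive asynchronously; each is processed as it comes in.
for z, r in [(1.00, R_CAMERA), (1.10, R_ULTRASONIC), (1.02, R_CAMERA)]:
    x, p = kf_predict(x, p)
    x, p = kf_update(x, p, z, r)
```

Because the camera's variance is smaller, its measurements dominate the fused estimate, which is exactly the behaviour the multi-sensor fusion module relies on.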

    MetaSpace II: Object and full-body tracking for interaction and navigation in social VR

    Full text link
    MetaSpace II (MS2) is a social Virtual Reality (VR) system where multiple users can not only see and hear but also interact with each other, grasp and manipulate objects, walk around in space, and get tactile feedback. MS2 allows walking in physical space by tracking each user's skeleton in real time and lets users feel objects by employing passive haptics, i.e., when users touch or manipulate an object in the virtual world, they simultaneously touch or manipulate a corresponding object in the physical world. To enable these elements in VR, MS2 creates a correspondence in spatial layout and object placement by building the virtual world on top of a 3D scan of the real world. Through the association between the real and virtual worlds, users are able to walk freely while wearing a head-mounted device, avoid obstacles like walls and furniture, and interact with people and objects. Most current VR environments are designed for a single-user experience where interactions with virtual objects are mediated by hand-held input devices or hand gestures. Additionally, users are only shown a representation of their hands in VR, floating in front of the camera as seen from a first-person perspective. We believe that representing each user as a full-body avatar controlled by the natural movements of the person in the real world (see Figure 1d) can greatly enhance believability and a user's sense of immersion in VR.
    Comment: 10 pages, 9 figures. Video: http://living.media.mit.edu/projects/metaspace-ii

    xD-Track: Leveraging Multi-Dimensional Information for Passive Wi-Fi Tracking

    Get PDF
    We describe the design and implementation of xD-Track, the first practical Wi-Fi-based device-free localization system that employs simultaneous, joint estimation of time-of-flight, angle-of-arrival, angle-of-departure and Doppler shift to fully characterize the wireless channel between a sender and a receiver. Using this full characterization, xD-Track introduces novel methods to measure and isolate the signal path that reflects off a person of interest, allowing it to localize a human with just a single pair of access points, or a single client-access-point pair. Searching over these multiple dimensions is highly computationally burdensome, so xD-Track introduces novel methods to prune the computation, making our approach suitable for real-time person tracking. We implement xD-Track on the WARP software-defined radio platform and evaluate it in a cluttered office environment. Experiments tracking people moving indoors demonstrate a 230% angle-of-arrival accuracy improvement and a 98% end-to-end tracking accuracy improvement over the state-of-the-art localization scheme SpotFi, adapted for device-free localization. The general platform we propose can easily be extended to other applications, including gesture recognition and Wi-Fi imaging, to significantly improve performance.

    Real time multiple camera person detection and tracking

    Get PDF
    As the amount of video data grows larger every day, so do the efforts to create intelligent systems able to perceive, understand and extrapolate useful information from this data; in particular, object detection and tracking systems have been a widely researched area in the past few years. In the present work we develop a real-time, multiple-camera, multiple-person detection and tracking system prototype, using static, overlapping, fish-eye top-view cameras. The goal is to create a system able to intelligently and automatically extrapolate object trajectories from surveillance footage. To solve these problems we employ different types of techniques, namely a combination of the representational power of deep neural networks, which have been yielding outstanding results in computer vision problems over the last few years, and more classical, already established object tracking algorithms, in order to represent and track the target objects. In particular, we split the problem into two sub-problems: single-camera multiple-object tracking and multiple-camera multiple-object tracking, which we tackle in a modular manner. Our long-term motivation is to deploy this system in a commercial setting, such as shopping areas or airports, so that we can build towards intelligent visual surveillance systems.

    A Generic System for the Automatic Detection, Tracking and Re-identification of Persons in Video Data

    Get PDF
    An important area in computer vision is person-centered video analysis. Applications cover many areas of today's life, such as driver assistance, human-machine interaction, threat assessment in a military context and, specifically, visual surveillance. The basis of this person-centered analysis is person detection and tracking in video data, a precondition for all subsequent analysis or interpretation approaches. Moreover, person re-identification is a substantial component of many applications. Such re-identification of persons is necessary where a long time period or a large spatial area is considered; in these cases, connections are to be established between occurrences of people that are not directly temporally or spatially connected. A typical example is the surveillance of large public spaces like airports, where multiple networked cameras are utilised and a long time period is relevant. Due to the diversity of application areas for person detection, tracking and re-identification, it is desirable to develop a generic system that is as independent as possible of the particular application scenario and thus universally applicable. In this work, such a system for person detection, tracking and re-identification is introduced. This system is generic in several respects. It is independent of the application scenario, meaning that no assumptions about the application environment are made. For instance, it is not assumed that the scene background is known or that other information regarding the scene is available. It is also not assumed that the recording sensor is stationary, which means the system introduced in this work is applicable with a moving camera. Equally, the system is not limited to certain object classes, since no object-class-specific knowledge other than a set of training samples is used.
In addition, the system is largely independent of the sensor used, since only intensity-gradient-based local features are employed. Thus, the overall system is applicable in both the visible and the infrared spectral range, since no features like color or depth are used. The system's generality is accomplished specifically by the exclusive use of the Implicit Shape Model approach and local image features for all three system levels, whereby the levels are closely connected and merge into an integrated approach. For person tracking, an extension of the Implicit Shape Model is introduced which combines bottom-up tracking-by-detection with top-down model-based strategies. This stabilises person detection and enables automatic tracking through short-term occlusion. Likewise, separate steps and heuristics for data association, i.e. the association of object hypotheses over time, and for model update become redundant. During person tracking, an Implicit Shape Model-based identity model is established and used for person re-identification. Through that tight coupling of all levels, from detection to re-identification, the system is applicable under real conditions.
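The Implicit Shape Model's core mechanism — matched local features casting votes for an object centre via codebook offsets, with peaks in the vote space becoming detection hypotheses — can be sketched in a few lines. The codebook entries and coordinates below are toy values, not taken from the thesis:

```python
from collections import defaultdict

# Codebook learned from training samples: feature id -> offsets from the
# feature's position to the object centre (toy values for illustration).
codebook = {0: [(-10, -20)], 1: [(15, -20)], 2: [(0, 25)]}

def vote(features):
    """features: list of (feature_id, x, y) matches found in the image.

    Each match casts one vote per stored offset; the strongest peak in the
    vote space is returned as the detection hypothesis.
    """
    votes = defaultdict(float)
    for fid, x, y in features:
        for dx, dy in codebook.get(fid, []):
            votes[(x + dx, y + dy)] += 1.0
    best = max(votes, key=votes.get)
    return best, dict(votes)
```

A real implementation would vote into a continuous space with kernel density estimation rather than exact pixel bins, and the same voting machinery would be reused for tracking and for building the identity model.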