
    Joint Detection and Tracking in Videos with Identification Features

    Recent works have shown that combining object detection and tracking tasks, in the case of video data, results in higher performance for both tasks, but they strictly require a high frame rate. This assumption is often violated in real-world applications, when models run on embedded devices, often at only a few frames per second. Videos at low frame rate suffer from large object displacements. Here re-identification features may help to match detections of objects with large displacements, but current joint detection and re-identification formulations degrade the detector performance, as these two are contrasting tasks. In real-world applications, having separate detector and re-id models is often not feasible, as both the memory and runtime effectively double. Towards robust long-term tracking applicable to reduced-computational-power devices, we propose the first joint optimization of detection, tracking and re-identification features for videos. Notably, our joint optimization maintains the detector performance, a typical multi-task challenge. At inference time, we leverage detections for tracking (tracking-by-detection) when the objects are visible, detectable and slowly moving in the image. We instead leverage re-identification features to match objects which disappeared (e.g. due to occlusion) for several frames or were not tracked due to fast motion (or low-frame-rate videos). Our proposed method reaches the state of the art on MOT and ranks 1st in the UA-DETRAC'18 tracking challenge among online trackers, and 3rd overall. Comment: Accepted at Image and Vision Computing Journal
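
    The inference-time rule in this abstract (match by detection overlap when the object moves slowly, fall back to re-identification features otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout with "box" and "feat" keys, and both thresholds are hypothetical choices made for illustration, and the greedy single-track matching stands in for whatever assignment scheme the paper actually uses.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def associate(track, detections, iou_thresh=0.3, reid_thresh=0.7):
    """Return the index of the detection matched to the track, or None.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    indices = range(len(detections))
    # Step 1: spatial matching, suitable for visible, slowly moving objects.
    best = max(indices, key=lambda i: iou(track["box"], detections[i]["box"]),
               default=None)
    if best is not None and iou(track["box"], detections[best]["box"]) >= iou_thresh:
        return best
    # Step 2: appearance matching, for large displacements or re-appearance.
    best = max(indices, key=lambda i: cosine(track["feat"], detections[i]["feat"]),
               default=None)
    if best is not None and cosine(track["feat"], detections[best]["feat"]) >= reid_thresh:
        return best
    return None
```

    With a track at (0, 0, 10, 10), a detection at (1, 1, 11, 11) is matched by overlap alone, while a detection far away in the image is matched only if its appearance feature is similar enough.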

    Real-time Vehicle Detection, Tracking and Counting System Based on YOLOv7

    A real-time vehicle detection, tracking, and counting system based on YOLOv7 is an important tool for monitoring traffic flow on highways. Highway traffic management, planning, and prevention rely heavily on real-time traffic monitoring technologies to avoid frequent traffic snarls, moving violations, and fatal car accidents. These systems rely only on data from time-dependent vehicle trajectories used to predict online traffic flow. Three crucial duties include the detection, tracking, and counting of cars on urban roads and highways, as well as the calculation of traffic flow statistics (such as determining the real-time vehicle flow and how many different types of vehicles travel). Important phases in these systems include object detection, tracking, categorizing, and counting. A modified YOLOv7 identification method is presented to address the issues of high missed detection rates of the YOLOv7 algorithm for vehicle detection on urban highways, weak perspective perception of small targets, and insufficient feature extraction. This system aims to provide real-time monitoring of vehicles, enabling insights into traffic patterns and facilitating informed decision-making. In this paper, vehicle detection, tracking, and counting are performed on real-time videos based on the modified YOLOv7 with high accuracy.

    Object Detection using Particle Swarm Optimisation and Kalman Filter to Track Partially occluded Targets

    Motion estimation, object detection, and tracking have been actively pursued by researchers in the field of real-time video processing. In the present work, a new algorithm is proposed to automatically detect objects using a revised local binary pattern (m-LBP). The detected object was tracked and its location estimated using the Kalman filter, whose state covariance matrix was tuned using particle swarm optimisation (PSO). PSO, being a nature-inspired algorithm, is a well-proven optimization technique. This algorithm was applied to the important real-world problem of partially occluded objects in infrared videos. Algorithm validation was performed by realizing a thermal imager, and this novel algorithm was implemented in it to demonstrate that the proposed algorithm is more efficient and produces better results in motion estimation for partially occluded objects. It is also shown that track convergence is 56% faster in the PSO-Kalman algorithm than when tracking with a Kalman-only filter.
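
    For readers unfamiliar with the Kalman filter mentioned in this abstract, the following is a minimal scalar sketch, assuming a random-walk state model. It is not the paper's tracker: the function name and the default noise values are hypothetical, and the PSO step is not implemented. The process noise q and measurement noise r are simply the kind of parameters that an optimiser such as PSO could tune.

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    q: process noise variance (assumed value, a candidate for tuning)
    r: measurement noise variance (assumed value)
    x0, p0: initial state estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is carried over, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

    Feeding the filter a constant measurement shows the estimate converging from the initial guess towards that value; a larger q makes it react faster to changes at the cost of noisier estimates.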

    Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis

    Splitting of sequential data, such as videos and time series, is an essential step in various data analysis tasks, including object tracking and anomaly detection. However, splitting sequential data presents a variety of challenges that can impact the accuracy and reliability of subsequent analyses. This concept article examines the challenges associated with splitting sequential data, including data acquisition, data representation, split ratio selection, setting up quality criteria, and choosing suitable selection strategies. We explore these challenges through two real-world examples: motor test benches and particle tracking in liquids.

    Semantic-Aware Real-Time Correlation Tracking Framework for UAV Videos

    The discriminative correlation filter (DCF) has contributed tremendously to addressing the problem of object tracking, benefitting from its high computational efficiency. However, it has suffered from performance degradation in unmanned aerial vehicle (UAV) tracking. This article presents a novel semantic-aware real-time correlation tracking framework (SARCT) for UAV videos to enhance the performance of DCF trackers without incurring excessive computing cost. Specifically, SARCT first constructs an additional detection module to generate ROI proposals and to filter out any response from target-irrelevant areas. Then, a novel semantic segmentation module based on semantic template generation and semantic coefficient prediction is further introduced to capture semantic information, which can provide a precise ROI mask, thereby effectively suppressing background interference in the ROI proposals. By sharing features and specific network layers for object detection and semantic segmentation, SARCT reduces parameter redundancy to attain sufficient speed for real-time applications. Systematic experiments are conducted on three typical aerial datasets in order to evaluate the performance of the proposed SARCT. The results demonstrate that SARCT is able to improve the accuracy of conventional DCF-based trackers significantly, outperforming state-of-the-art deep trackers.

    Real-Time Facial Emotion Recognition Using Fast R-CNN

    In computer vision and image processing, object detection algorithms are used to detect semantic objects of certain classes in images and videos. Object detector algorithms use deep learning networks to classify detected regions. Unprecedented advancements in Convolutional Neural Networks (CNN) have led to new possibilities and implementations for object detectors. An object detector which uses a deep learning algorithm detects objects through proposed regions, and then classifies the region using a CNN. Object detectors are computationally efficient, unlike a typical CNN, which is computationally complex and expensive. Object detectors are widely used for face detection, recognition, and object tracking. In this thesis, deep learning based object detection algorithms are implemented to classify facially expressed emotions in real-time captured through a webcam. A typical CNN would classify images without specifying regions within an image, which could be considered a limitation towards better understanding the network performance, which depends on different training options. It would also be more difficult to verify whether a network has converged and is able to generalize, which is the ability to classify unseen data, that is, data which was not part of the training set. Fast Region-based Convolutional Neural Network (Fast R-CNN), an object detection algorithm, is used to detect facially expressed emotion in real-time by classifying proposed regions. The Fast R-CNN is trained using a high-quality video database, consisting of 24 actors, facially expressing eight different emotions, obtained from images which were processed from 60 videos per actor. An object detector’s performance is measured using various metrics. Regardless of how an object detector performed with respect to average precision or miss rate, doing well on such metrics would not necessarily mean that the network is correctly classifying regions. This may result from the fact that the network model has been over-trained. In our work we showed that an object detector algorithm such as Fast R-CNN performed surprisingly well in classifying facially expressed emotions in real-time, performing better than a CNN.

    Object Detection During Newborn Resuscitation Activities

    Birth asphyxia is a major newborn mortality problem in low-resource countries. International guidelines provide treatment recommendations; however, the importance and effect of the different treatments are not fully explored. The available data is collected in Tanzania, during newborn resuscitation, for analysis of the resuscitation activities and the response of the newborn. An important step in the analysis is to create activity timelines of the episodes, where activities include ventilation, suction, stimulation etc. Methods: The available recordings are noisy real-world videos with large variations. We propose a two-step process in order to detect activities possibly overlapping in time. The first step is to detect and track the relevant objects, like bag-mask resuscitator, heart rate sensors etc., and the second step is to use this information to recognize the resuscitation activities. The topic of this paper is the first step, and the object detection and tracking are based on convolutional neural networks followed by post processing. Results: The performance of the object detection during activities was 96.97 % (ventilations), 100 % (attaching/removing heart rate sensor) and 75 % (suction) on a test set of 20 videos. The system also estimates the number of health care providers present with a performance of 71.16 %. Conclusion: The proposed object detection and tracking system provides promising results in noisy newborn resuscitation videos. Significance: This is the first step in a thorough analysis of newborn resuscitation episodes, which could provide important insight about the importance and effect of different newborn resuscitation activities. Comment: 8 pages