
    Should every stage of track-by-detection utilize deep learning?

    Neural network-based solutions are becoming increasingly popular because of their ability to solve problems that could not be solved before. This has also led people to apply neural networks to problems where more classical methods would work well. This work investigates whether using neural networks in the track-by-detection paradigm gives a system an advantage over classical methods when tracking pedestrians. Track-by-detection is a two-module system in which the first module extracts detections from an input image. The detections are fed to the second module, which assigns each detection a unique identifier and tracks the identified objects through a sequence of consecutive images. The hypothesis of this work is that both modules of track-by-detection can be replaced with solutions that do not use a neural network. The two modules were studied separately, because object detection can be evaluated without the second module, and the second module can be evaluated with precomputed detections. Object detection was studied as a literature review, and different tracking algorithms were evaluated on the MOTChallenge data set. According to the literature review, object detection cannot reasonably be replaced with classical methods. The tracking results show that tracking can be done well without neural networks. The results of this work therefore show that neural network-based solutions are justified in the first module of track-by-detection. The second module can also be neural network-based, but this requires more resources to get working well. Many of the more classical methods can be swapped easily within track-by-detection to find which method works best for the use case at hand. According to the results, neural network-based trackers do not bring enough benefit for real-time tracking to justify using them over classical methods.
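
    The two-module structure described in this abstract can be summarised in a short sketch. The snippet below is a minimal, illustrative track-by-detection loop, not code from the thesis: detect is a placeholder for any detector (neural or classical), and nearest-centre matching with a fixed distance threshold stands in for whichever classical association method is chosen; tracks that receive no detection are simply dropped.

    from itertools import count

    def centre(box):
        x1, y1, x2, y2 = box
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

    def track(frames, detect, max_dist=50.0):
        """detect(frame) -> list of (x1, y1, x2, y2) boxes; yields {track_id: box} per frame."""
        new_id = count()
        tracks = {}                                  # track id -> last known box
        for frame in frames:
            updated = {}
            for box in detect(frame):
                cx, cy = centre(box)
                # Classical association: pick the closest unclaimed track, if any is near enough.
                best, best_dist = None, max_dist
                for tid, prev in tracks.items():
                    px, py = centre(prev)
                    dist = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
                    if dist < best_dist and tid not in updated:
                        best, best_dist = tid, dist
                updated[best if best is not None else next(new_id)] = box
            tracks = updated                         # tracks with no detection are dropped
            yield dict(tracks)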

    A Survey and Comparison of Low-Cost Sensing Technologies for Road Traffic Monitoring

    This paper reviews low-cost vehicle and pedestrian detection methods and compares their accuracy. The main goal of this survey is to summarize the progress achieved to date and to help identify the sensing technologies that provide high detection accuracy and meet requirements related to cost and ease of installation. Special attention is paid to wireless battery-powered detectors of small dimensions that can be quickly and effortlessly installed alongside traffic lanes (on the side of a road or on a curb) without any additional supporting structures. The comparison of detection methods presented in this paper is based on the results of experiments conducted with a variety of sensors in a wide range of configurations. During the experiments, various sensor sets were analyzed. It was shown that detection accuracy can be significantly improved by fusing data from an appropriately selected set of sensors. The experimental results reveal that accurate vehicle detection can be achieved using sets of passive sensors, whereas active sensors were necessary to obtain satisfactory results for pedestrian detection.
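
    As an illustration of the kind of sensor fusion the survey refers to, the sketch below combines binary presence decisions from several low-cost sensors with a reliability-weighted vote. The fusion rule, sensor names and reliability values are assumptions made for this example, not the fusion method evaluated in the paper.

    import math

    def fuse(decisions, reliabilities, threshold=0.0):
        """decisions: 0/1 per sensor; reliabilities: assumed P(correct) per sensor (0.5-1.0)."""
        score = 0.0
        for d, p in zip(decisions, reliabilities):
            weight = math.log(p / (1.0 - p))         # more reliable sensors get larger weights
            score += weight if d == 1 else -weight
        return 1 if score > threshold else 0

    # Hypothetical example: magnetometer, accelerometer and PIR sensors vote on a vehicle.
    print(fuse([1, 1, 0], [0.9, 0.8, 0.6]))          # prints 1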

    It takes two to tango: cascading off-the-shelf face detectors

    Recent face detection methods have achieved high detection rates in unconstrained environments. However, as they still generate excessive false positives, any method for reducing false positives is highly desirable. This work aims to massively reduce the false positives of existing face detection methods whilst maintaining the true detection rate. In addition, the proposed method aims to sidestep the detector retraining task, which generally requires enormous effort. To this end, we propose a two-stage framework which cascades two off-the-shelf face detectors. Not all face detectors can be cascaded and still achieve good performance. Thus, we study three properties that allow us to determine the best pair of detectors: (1) correlation of true positives; (2) diversity of false positives; and (3) detector runtime. Experimental results on recent large benchmark datasets such as FDDB and WIDER FACE support our findings that the false positives of a face detector can potentially be reduced by 90% whilst still maintaining a high true positive detection rate. In addition, with a slight decrease in true positives, we found a pair of face detectors that achieves significantly lower false positives while being five times faster than the current state-of-the-art detector.
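
    The cascading idea can be sketched in a few lines: a first off-the-shelf detector proposes boxes, and a second detector re-checks each cropped candidate so that only boxes confirmed by both survive. In the sketch below, detector_a, detector_b and the crop padding are placeholders, the image is assumed to be a NumPy array, and boxes are (x1, y1, x2, y2) integers; the detector-pair selection criteria studied in the paper are not reproduced.

    def cascade(image, detector_a, detector_b, pad=0.2):
        """detector(image) -> list of (x1, y1, x2, y2) boxes; returns boxes confirmed by both."""
        confirmed = []
        h, w = image.shape[:2]
        for (x1, y1, x2, y2) in detector_a(image):
            # Expand the crop slightly so the second detector sees some context.
            dx, dy = int((x2 - x1) * pad), int((y2 - y1) * pad)
            crop = image[max(0, y1 - dy):min(h, y2 + dy),
                         max(0, x1 - dx):min(w, x2 + dx)]
            if detector_b(crop):             # any detection inside the crop confirms the candidate
                confirmed.append((x1, y1, x2, y2))
        return confirmed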

    Cyclist Detection, Tracking, and Trajectory Analysis in Urban Traffic Video Data

    The major objective of this thesis is to examine computer vision and machine learning detection methods, tracking algorithms and trajectory analysis for cyclists in traffic video data, and to develop an efficient system for cyclist counting. Due to the growing number of cyclist accidents on urban roads, methods for collecting information on cyclists are of significant importance to the Department of Transportation. The collected information provides insights into solving critical problems related to transportation planning, implementing safety countermeasures, and managing traffic flow efficiently. Intelligent Transportation Systems (ITS) employ automated tools to collect traffic information from traffic video data. In comparison to other road users, such as cars and pedestrians, automated cyclist data collection is a relatively new research area. In this work, a vision-based method for gathering cyclist count data at intersections and road segments is developed. First, we develop a methodology for efficient detection and tracking of cyclists. A combination of classification features and motion-based properties is evaluated to detect cyclists in the test video data. A Convolutional Neural Network (CNN) based detector called You Only Look Once (YOLO) is implemented to increase detection accuracy. In the next step, the detection results are fed into a tracker based on Kernelized Correlation Filters (KCF), which, in cooperation with a bipartite graph matching algorithm, allows multiple cyclists to be tracked concurrently. Then, a trajectory rebuilding method and a trajectory comparison model are applied to refine the accuracy of tracking and counting. The trajectory comparison is performed using a semantic similarity approach. The proposed counting method is the first cyclist counting method able to count cyclists under different movement patterns. The trajectory data obtained can be further utilized for cyclist behavioral modeling and safety analysis.
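
    The bipartite matching step mentioned above is commonly solved with the Hungarian algorithm. The sketch below is an illustration rather than the thesis implementation: it matches tracker-predicted boxes (e.g. from KCF) to new detections using SciPy's linear_sum_assignment on a 1 - IoU cost, and the IoU threshold is an assumed value. The 1 - IoU cost is a standard choice because it rewards assignments with large spatial overlap.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(a, b):
        """Intersection over union of two (x1, y1, x2, y2) boxes."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union if union > 0 else 0.0

    def match(tracked_boxes, detections, min_iou=0.3):
        """Return (track_index, detection_index) pairs chosen by the Hungarian algorithm."""
        if not tracked_boxes or not detections:
            return []
        cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracked_boxes])
        rows, cols = linear_sum_assignment(cost)
        return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= min_iou]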

    Efficient and Accurate Tracking for Face Diarization via Periodical Detection

    Face diarization, i.e. face tracking and clustering within video documents, is useful and important for video indexing and fast browsing, but it is also a difficult and time-consuming task. In this paper, we address the tracking aspect and propose a novel algorithm with two main contributions. First, we propose an approach that leverages a state-of-the-art deformable part-based model (DPM) face detector within a multi-cue discriminant tracking-by-detection framework that relies on automatically learned, long-term, time-interval-sensitive association costs specific to each document type. Secondly, to improve performance, we propose an explicit false alarm removal step at the track level to efficiently filter out wrong detections (and the resulting tracks). Altogether, the method is able to skip frames, i.e. process only 3 to 4 frames per second, thus cutting down computational cost, while performing better than state-of-the-art methods as evaluated on three public benchmarks from different contexts, including a movie and broadcast data.
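
    The periodical-detection idea can be sketched as follows: run the expensive face detector only every few frames and let a cheap per-track tracker carry the boxes in between. In the sketch, detect, tracker_update and the period of 8 frames are placeholders and assumptions for illustration, not the paper's actual components.

    def diarize_tracks(frames, detect, tracker_update, period=8):
        """detect(frame) -> list of boxes; tracker_update(frame, box) -> propagated box."""
        boxes = []
        for i, frame in enumerate(frames):
            if i % period == 0:
                boxes = detect(frame)                # refresh tracks from a full detection pass
            else:
                boxes = [tracker_update(frame, b) for b in boxes]
            yield list(boxes)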

    The XMM Cluster Survey: X-ray analysis methodology

    The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters using all publicly available data in the XMM-Newton Science Archive. Its main aims are to measure cosmological parameters and trace the evolution of X-ray scaling relations. In this paper we describe the data processing methodology applied to the 5,776 XMM observations used to construct the current XCS source catalogue. A total of 3,675 > 4-sigma cluster candidates with > 50 background-subtracted X-ray counts are extracted from a total non-overlapping area suitable for cluster searching of 410 deg^2. Of these, 993 candidates are detected with > 300 background-subtracted X-ray photon counts, and we demonstrate that robust temperature measurements can be obtained down to this count limit. We describe in detail the automated pipelines used to perform the spectral and surface brightness fitting for these candidates, as well as to estimate redshifts from the X-ray data alone. A total of 587 (122) X-ray temperatures to a typical accuracy of < 40 (< 10) per cent have been measured to date. We also present the methodology adopted for determining the selection function of the survey, and show that the extended source detection algorithm is robust to a range of cluster morphologies by inserting mock clusters derived from hydrodynamical simulations into real XMM images. These tests show that a simple isothermal beta-profile is sufficient to capture the essential details of the cluster population detected in the archival XMM observations. The redshift follow-up of the XCS cluster sample is presented in a companion paper, together with a first data release of 503 optically-confirmed clusters. Comment: MNRAS accepted, 45 pages, 38 figures. Our companion paper describing our optical analysis methodology and presenting a first set of confirmed clusters has now been submitted to MNRAS.
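
    For reference, the isothermal beta-profile mentioned above is the standard beta-model for cluster X-ray surface brightness; the expression below is the textbook form, quoted for context rather than copied from the paper, where S_0 is the central surface brightness, r_c the core radius and \beta the slope parameter:

    S(r) = S_0 \left[ 1 + \left( \frac{r}{r_c} \right)^2 \right]^{\,1/2 - 3\beta}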

    1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

    The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi. Comment: MaCVi 2023 was part of WACV 2023. This report (38 pages) discusses the competition as part of MaCVi.

    Region of Interest Generation for Pedestrian Detection using Stereo Vision

    Pedestrian detection is an active research area in the field of computer vision. The sliding window paradigm is usually followed to extract all possible detector windows; however, it is very time consuming. Consequently, stereo vision using a pair of cameras is preferred, as the depth information it provides reduces the search space. Disparity map generation using feature correspondence is an integral part of, and a prior task to, depth estimation. In our work, we apply ORB features to speed up the feature correspondence process. Once the ROI generation phase is over, each extracted detector window is represented by low-level histogram of oriented gradients (HOG) features. Subsequently, a linear Support Vector Machine (SVM) is applied to classify windows as either pedestrian or non-pedestrian. The experimental results reveal that ORB-driven depth estimation is at least seven times faster than the SURF descriptor and ten times faster than the SIFT descriptor.
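
    A rough sketch of this pipeline using standard OpenCV building blocks is given below: ORB keypoints matched across the rectified stereo pair yield sparse disparities (and hence depth), and each candidate window is then described with HOG and scored by a linear SVM. The 64x128 window size and the pre-trained classifier svm (e.g. a scikit-learn LinearSVC) are assumptions made for this example, not details taken from the paper.

    import cv2

    def sparse_disparities(left_gray, right_gray):
        """Match ORB keypoints across a rectified stereo pair; return per-match disparities."""
        orb = cv2.ORB_create()
        kp_l, des_l = orb.detectAndCompute(left_gray, None)
        kp_r, des_r = orb.detectAndCompute(right_gray, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        disparities = []
        for m in matcher.match(des_l, des_r):
            xl = kp_l[m.queryIdx].pt[0]
            xr = kp_r[m.trainIdx].pt[0]
            if xl > xr:                           # positive disparity on rectified images
                disparities.append(xl - xr)       # depth is inversely proportional to disparity
        return disparities

    def classify_window(window_bgr, svm):
        """Describe a candidate window with HOG and score it with a trained linear SVM."""
        hog = cv2.HOGDescriptor()                 # default 64x128 pedestrian window
        gray = cv2.cvtColor(window_bgr, cv2.COLOR_BGR2GRAY)
        patch = cv2.resize(gray, (64, 128))
        features = hog.compute(patch).reshape(1, -1)
        return svm.decision_function(features)[0] > 0   # pedestrian if the score is positive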

    Biologically inspired vision for human-robot interaction

    Human-robot interaction is an interdisciplinary research area that is becoming more and more relevant as robots start to enter our homes, workplaces, schools, etc. In order to navigate safely among us, robots must be able to understand human behavior, to communicate, and to interpret instructions from humans, either by recognizing their speech or by understanding their body movements and gestures. We present a biologically inspired vision system for human-robot interaction which integrates several components: visual saliency, stereo vision, face and hand detection, and gesture recognition. Visual saliency is computed using color, motion and disparity. Both the stereo vision and gesture recognition components are based on keypoints coded by means of cortical V1 simple, complex and end-stopped cells. Hand and face detection is achieved by using a linear SVM classifier. The system was tested on a child-sized robot.
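
    In the spirit of the color/motion/disparity combination described above, the sketch below fuses normalised feature maps into a single saliency map. The equal-weight sum and Gaussian smoothing are assumptions made for illustration; the paper's cortical keypoint machinery is not reproduced here.

    import cv2
    import numpy as np

    def saliency(colour_conspicuity, motion_magnitude, disparity):
        """Each input is a 2-D float map of the same size; returns a [0, 1] saliency map."""
        def normalise(m):
            m = m.astype(np.float32)
            rng = float(m.max() - m.min())
            return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)
        combined = (normalise(colour_conspicuity)
                    + normalise(motion_magnitude)
                    + normalise(disparity)) / 3.0
        return cv2.GaussianBlur(combined, (9, 9), 0)   # smooth into blob-like salient regions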