23 research outputs found

    Transferring a generic pedestrian detector towards specific scenes

    The performance of a generic pedestrian detector may drop significantly when it is applied to a specific scene due to mismatch between the source dataset used to train the detector and samples in the target scene. In this paper, we investigate how to automatically train a scene-specific pedestrian detector starting with a generic detector in video surveillance without further manually labeling any samples under a novel transfer learning framework. It tackles the problem from three aspects. (1) With a graphical representation and through exploring the indegrees from target samples to source samples, the source samples are properly re-weighted. The indegrees detect the boundary between the distributions of the source dataset and the target dataset. The re-weighted source dataset better matches the target scene. (2) It takes the context information from motions, scene structures and scene geometry as the confidence scores of samples from the target scene to guide transfer learning. (3) The confidence scores propagate among samples on a graph according to the underlying visual structures of samples. All these considerations are formulated under a single objective function called Confidence-Encoded SVM. At the test stage, only the appearance-based detector is used without the context cues. The effectiveness of the proposed framework is demonstrated through experiments on two video surveillance datasets. Compared with a generic pedestrian detector, it significantly improves the detection rate by 48% and 36% at one false positive per image on the two datasets respectively.
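
    The indegree-based re-weighting in aspect (1) can be illustrated with a small sketch. It is not the paper's formulation: the k-nearest-neighbour graph, the Euclidean distance, the function name `indegree_weights`, and the normalisation by the largest indegree are all assumptions made for illustration. Each target sample links to its k nearest source samples, and a source sample that receives more links is treated as better matched to the target scene.

```python
import math

def indegree_weights(source, target, k=2):
    """Weight each source sample by its indegree from target samples.

    Assumption (not from the paper): each target sample draws an edge to its
    k nearest source samples under Euclidean distance, and weights are the
    indegrees divided by the largest indegree.
    """
    indegree = [0] * len(source)
    for t in target:
        # Rank source samples by distance to this target sample.
        ranked = sorted(range(len(source)), key=lambda i: math.dist(t, source[i]))
        for i in ranked[:k]:
            indegree[i] += 1
    top = max(indegree) or 1  # avoid division by zero when there are no edges
    return [d / top for d in indegree]

# Toy data: two source samples lie near the target samples, one lies far away.
source = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
target = [(0.0, 0.1), (0.2, 0.0)]
weights = indegree_weights(source, target, k=2)
```

    In this toy example the distant source sample receives no incoming edges, so its weight is zero, while the two source samples close to the target data keep full weight.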

    Transferring a generic pedestrian detector towards specific scenes.

    In recent years, significant progress has been made in learning generic pedestrian detectors from publicly available, manually labeled, large-scale training datasets. However, when a generic pedestrian detector is applied to a specific, previously unseen scene where the testing data (target examples) do not match the training data (source examples) because of variations in viewpoint, resolution, illumination and background, its accuracy may decrease greatly. In this thesis, a new framework is proposed for automatically adapting a pre-trained generic pedestrian detector to a specific traffic scene. The framework has two phases. In the first phase, scene-specific cues in the video surveillance sequence are explored. Using this multi-cue information, both confident positive and confident negative examples from the target scene are selected to re-train the detector iteratively. In the second phase, a new machine learning framework is proposed that incorporates not only example labels but also example confidences. Source and target examples are re-weighted according to their confidence, optimizing the performance of the final classifier. Both methods belong to semi-supervised learning and require very little human intervention. The proposed approaches significantly improve the accuracy of the generic pedestrian detector. Their results are comparable with those of a detector trained using a large number of manually labeled frames from the target scene. Comparison with other existing approaches tackling similar problems shows that the proposed approaches outperform many contemporary methods. The works were published at the IEEE Conference on Computer Vision and Pattern Recognition in 2011 and 2012, respectively.
    Detailed summary in vernacular field only.
    Wang, Meng. Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. Includes bibliographical references (leaves 42-45). Abstracts also in Chinese.
    Chapter 1 --- Introduction: 1.1 Pedestrian Detection (1.1.1 Overview; 1.1.2 Statistical Learning; 1.1.3 Object Representation; 1.1.4 Supervised Statistical Learning in Object Detection); 1.2 Pedestrian Detection in Video Surveillance (1.2.1 Problem Setting; 1.2.2 Challenges; 1.2.3 Motivations and Contributions); 1.3 Related Work; 1.4 Organizations of Chapters
    Chapter 2 --- Label Inferring by Multi-Cues: 2.1 Data Set; 2.2 Method (2.2.1 Confident Positive Examples of Pedestrians; 2.2.2 Confident Negative Examples from the Background; 2.2.3 Confident Negative Examples from Vehicles; 2.2.4 Final Scene Specific Pedestrian Detector); 2.3 Experiment Results
    Chapter 3 --- Transferring a Detector by Confidence Propagation: 3.1 Method (3.1.1 Overview; 3.1.2 Initial Estimation of Confidence Scores; 3.1.3 Re-weighting Source Samples; 3.1.4 Confidence-Encoded SVM); 3.2 Experiments (3.2.1 Datasets; 3.2.2 Parameter Setting; 3.2.3 Results)
    Chapter 4 --- Conclusions and Future Work

    Automated Video Analysis for Maritime Surveillance


    Learning to understand the world in 3D

    3D computer vision is a research topic gathering ever-increasing attention thanks to the increasingly widespread availability of off-the-shelf depth sensors and large-scale 3D datasets. The main purpose of 3D computer vision is to understand the geometry of objects in order to interact with them. Recently, the success of deep neural networks for processing images has fostered a data-driven approach to solving 3D vision problems. Inspired by the potential of this field, in this thesis we address two main problems: (a) how to leverage machine/deep learning techniques to build a robust and effective pipeline to establish correspondences between surfaces, and (b) how to obtain a reliable 3D reconstruction of an object from RGB images sparsely acquired from different points of view by means of deep neural networks. At the heart of many 3D computer vision applications lies surface matching, an effective paradigm aimed at finding correspondences between points belonging to different shapes. To this end, it is essential to first identify the characteristic points of an object and then create an adequate representation of them. We refer to these two steps as keypoint detection and keypoint description, respectively. As the first contribution (a) of this Ph.D. thesis, we propose data-driven solutions to the problems of keypoint detection and description. As a further direction of research, we investigate the problem of 3D object reconstruction from RGB data only (b). While in the past this problem was addressed by SLAM and Structure from Motion (SfM) techniques, the picture has changed radically in recent years with the advent of deep learning. Following this trend, we introduce a novel approach that combines traditional computer vision techniques with deep learning to perform a viewpoint variant 3D object reconstruction from non-overlapping RGB views.

    Robust Face Tracking in Video Sequences

    This work presents a detailed analysis and discussion of a novel face tracking system that uses multiple appearance models along with a tracking-by-detection framework. The system can aid a video-based face recognition system by giving the face locations of specific individuals (Region Of Interest, ROI) for every frame. A face recognition system can use the ROIs provided by the face tracker to accumulate evidence of a person being present in a video, in order to identify a person of interest who is already enrolled in the face recognition system. The primary task of a face tracker is to find the location of a face present in an image by using its location information from the previous frame. The search is done by finding the region that maximizes the likelihood of a face being present in the frame, by comparing the region with a face appearance model. However, during this face search, several external factors inhibit the performance of a face tracker. These external factors are termed tracking nuisances, and usually appear in the form of illumination variation, background clutter, motion blur, partial occlusion, etc. Thus, the main challenge for a face tracker is to find the best region in spite of frequent appearance changes of the face during the tracking process. Since it is not possible to control these nuisances, robust face appearance models are designed and developed so that they are less affected by them and can still track a face successfully in such scenarios. Although a single face appearance model can be used for tracking a face, it cannot tackle all the tracking nuisances. Hence, the proposed method uses multiple face appearance models, so that different appearance models can facilitate tracking in the presence of different tracking nuisances. In addition, the proposed method incorporates the tracking-by-detection methodology by employing a face detector that outputs a bounding box for every frame. The face detector thus aids the face tracker in tackling the tracking nuisances, and it also aids in the re-initialization of the tracker during tracking drift. However, the precision of the tracker can be further improved by generating face candidates around the face tracking output and choosing the best among them. Thus, in the proposed method, face tracking is formulated as finding the face candidate that maximizes the similarity to all the appearance models.
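
    The candidate-selection step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: boxes are hypothetical `(x, y, w, h)` tuples, candidates are random integer shifts of the tracker output, each appearance model is a plain callable returning a similarity score, and the scores are combined by summation. The name `best_candidate` and its parameters are invented for the example.

```python
import random

def best_candidate(tracker_box, models, n_candidates=20, jitter=4, seed=0):
    """Pick the box that maximises total similarity over all appearance models.

    Assumptions (not from the abstract): boxes are (x, y, w, h) tuples,
    candidates are integer shifts of the tracker output, and each model is a
    callable returning a similarity score for a box; scores are summed.
    """
    rng = random.Random(seed)
    x, y, w, h = tracker_box
    candidates = [tracker_box]  # always consider the tracker output itself
    for _ in range(n_candidates):
        dx = rng.randint(-jitter, jitter)
        dy = rng.randint(-jitter, jitter)
        candidates.append((x + dx, y + dy, w, h))
    return max(candidates, key=lambda box: sum(m(box) for m in models))

# Two toy "appearance models" that prefer boxes near slightly different points.
colour_model = lambda box: -abs(box[0] - 12) - abs(box[1] - 10)
texture_model = lambda box: -abs(box[0] - 10) - abs(box[1] - 10)
chosen = best_candidate((10, 10, 32, 32), [colour_model, texture_model])
```

    Summing the scores is one simple way to combine models; a weighted sum or a product would be equally plausible choices, and the abstract does not say which is used.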

    Transferring boosted detectors towards viewpoint and scene adaptiveness

    DOI: 10.1109/TIP.2010.2103951. IEEE Transactions on Image Processing, vol. 20, no. 5, pp. 1388-1400.

    Exploring the relationship between intelligent transport system capability and business agility within the Bus Rapid Transit in South Africa

    More than 65% of South Africans use public transportation to access educational, business, and financial activities. The mobility of individuals and products, particularly in metropolitan areas, suffers from delays, unreliability, lack of safety, and air pollution. At the same time, mobility demand is increasing faster than South Africa's accessible infrastructure. Public transport services are poor in general, but this picture is being transformed by the implementation of the Bus Rapid Transit (BRT) system, a high-quality mass transit system using high-capacity buses along dedicated bus lanes. The BRT system emerged as the leading mode of urban passenger transit in the first decade of the twenty-first century, after a few pioneering applications in the later part of the twentieth century. In addition, the advantages of Intelligent Transport Systems (ITS) motivate both advanced and developing nations, such as South Africa, to invest in these techniques rather than spending enormous sums on expanding the transportation network. Various stakeholders in government, academia and industry are in the process of presenting a shared vision of this new strategy, and first practical steps should be taken towards this objective. Intelligent transport system capacity can provide better and more inclusive public transportation facilities to commuters through enhanced reliability and accessibility; to operators through efficiency gains; and to customers and operators in terms of cost-effectiveness and affordability of service provision. International experience shows that capacities of the ITS can boost transportation profits by as much as 10-15%... D.Phil. (Engineering Management)

    Changing Priorities. 3rd VIBRArch

    In order to ensure a good present and future for people around the planet and to safeguard the care of the planet itself, research in architecture has to release all its potential. Therefore, the aims of the 3rd Valencia International Biennial of Research in Architecture are:
    - To focus on the most relevant needs of humanity and the planet and what architectural research can do to address them.
    - To assess the evolution of architectural research in traditional matters of interest and the current state of these popular and widespread topics.
    - To delve into the current state and findings of architectural research on subjects akin to post-capitalism and frequently related to equal opportunities and the universal right to personal development and happiness.
    - To showcase all kinds of research related to the new and holistic concept of sustainability and to the climate emergency.
    - To place in the spotlight those ongoing works or available proposals developed by architectural researchers in order to combat the effects of the COVID-19 pandemic.
    - To underline the capacity of architectural research to develop resiliency and the ability to adapt itself to changing priorities.
    - To highlight architecture's multidisciplinarity as a melting pot of multiple approaches, points of view and expertise.
    - To open new perspectives for architectural research by promoting the development of multidisciplinary and inter-university networks and research groups.
    For all these reasons, the 3rd Valencia International Biennial of Research in Architecture is open not only to architects, but also to any academic, practitioner, professional or student with a determination to develop research in architecture or neighboring fields.
    Cabrera Fausto, I. (2023). Changing Priorities. 3rd VIBRArch. Editorial Universitat Politècnica de València. https://doi.org/10.4995/VIBRArch2022.2022.1686

    BNAIC 2008:Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference


    Multi-Agent Systems

    A multi-agent system (MAS) is a system composed of multiple interacting intelligent agents. Multi-agent systems can be used to solve problems which are difficult or impossible for an individual agent or monolithic system to solve. Agent systems are open and extensible systems that allow for the deployment of autonomous and proactive software components. Multi-agent systems have been brought up and used in several application domains