
    Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model

    Multi-target multi-camera tracking (MTMCT), i.e., tracking multiple targets across multiple cameras, is a crucial technique for smart city applications. In this paper, we propose an effective and reliable MTMCT framework for vehicles, which consists of a traffic-aware single camera tracking (TSCT) algorithm, a trajectory-based camera link model (CLM) for vehicle re-identification (ReID), and a hierarchical clustering algorithm to obtain the cross-camera vehicle trajectories. First, the TSCT, which jointly considers vehicle appearance, geometric features, and some common traffic scenarios, is proposed to track the vehicles in each camera separately. Second, the trajectory-based CLM is adopted to model the relationship between each pair of adjacently connected cameras and to add spatio-temporal constraints for the subsequent vehicle ReID with temporal attention. Third, the hierarchical clustering algorithm is used to merge the vehicle trajectories among all the cameras to obtain the final MTMCT results. Our proposed MTMCT is evaluated on the CityFlow dataset and achieves a new state-of-the-art performance with an IDF1 of 74.93%. Comment: Accepted by ACM International Conference on Multimedia 202
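
    The paper itself includes no reference code; as a rough illustration of the third stage only, the following Python sketch clusters single-camera tracklets into cross-camera trajectories with agglomerative clustering, gating appearance distances with a hypothetical camera link model. The Tracklet fields, the CLM transition-time table and all thresholds are assumptions, not the authors' implementation.

    # Minimal sketch: merging single-camera tracklets into cross-camera trajectories
    # with agglomerative clustering, gated by a hypothetical camera link model (CLM).
    from dataclasses import dataclass
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    @dataclass
    class Tracklet:
        cam: int                 # camera id
        feat: np.ndarray         # mean ReID appearance embedding (L2-normalised)
        t_in: float              # first timestamp in this camera
        t_out: float             # last timestamp in this camera

    # Hypothetical CLM: allowed transition-time windows between adjacent cameras.
    CLM = {(1, 2): (5.0, 60.0), (2, 1): (5.0, 60.0)}

    def pair_distance(a, b, big=1e3):
        """Appearance distance, set to `big` when the CLM rules the pair out."""
        if a.cam == b.cam:                       # same-camera pairs are never merged here
            return big
        window = CLM.get((a.cam, b.cam))
        gap = b.t_in - a.t_out                   # travel time from a's camera to b's
        if window is None or not (window[0] <= gap <= window[1]):
            return big                           # violates the spatio-temporal constraint
        return float(1.0 - a.feat @ b.feat)      # cosine distance of ReID features

    def cluster_tracklets(tracklets, thr=0.4):
        """Agglomerative clustering over the constrained pairwise distances."""
        n = len(tracklets)
        d = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                d[i, j] = d[j, i] = min(pair_distance(tracklets[i], tracklets[j]),
                                        pair_distance(tracklets[j], tracklets[i]))
        z = linkage(squareform(d), method="average")
        return fcluster(z, t=thr, criterion="distance")  # cluster id per tracklet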

    Applications of a Graph Theoretic Based Clustering Framework in Computer Vision and Pattern Recognition

    Recently, several clustering algorithms have been used to solve a variety of problems from different disciplines. This dissertation aims to address different challenging tasks in computer vision and pattern recognition by casting each problem as a clustering problem. We propose novel approaches to solve multi-target tracking, visual geo-localization and outlier detection problems using a unified underlying clustering framework, i.e., dominant set clustering and its extensions, and present superior results over several state-of-the-art approaches. Comment: doctoral dissertation
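
    As background on the clustering framework named above, the following Python sketch shows the standard replicator-dynamics iteration commonly used to extract dominant sets from a pairwise similarity matrix; the peel-off loop and the support threshold are illustrative assumptions rather than the dissertation's own implementation.

    # Minimal sketch of dominant-set extraction with replicator dynamics.
    # A is a symmetric, non-negative similarity matrix with a zero diagonal.
    import numpy as np

    def dominant_set(A, tol=1e-6, max_iter=1000):
        """Return the characteristic vector of one dominant set of A."""
        n = A.shape[0]
        x = np.full(n, 1.0 / n)                  # start from the barycentre
        for _ in range(max_iter):
            x_new = x * (A @ x)                  # replicator-dynamics update
            s = x_new.sum()
            if s == 0:                           # degenerate similarity matrix
                break
            x_new /= s
            if np.abs(x_new - x).sum() < tol:    # converged
                return x_new
            x = x_new
        return x                                 # support of x = cluster members

    def peel_clusters(A, support_thr=1e-4):
        """Illustrative peeling loop: extract dominant sets one by one."""
        active = np.arange(A.shape[0])
        clusters = []
        while active.size > 1:
            x = dominant_set(A[np.ix_(active, active)])
            members = active[x > support_thr]
            if members.size == 0:
                break
            clusters.append(members.tolist())
            active = np.setdiff1d(active, members)
        return clusters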

    A HMM Classifier with Contextual Observability: Application to Indoor People Tracking

    Tracking people's indoor activities with sensor networks is of high importance in a number of domains, such as ambient assisted living. Home sensors have seen strong development over the last few years, especially due to the emergence of the Internet of Things. A wide range of sensors are today available to be installed at home: video cameras, RGB-D Kinects, binary proximity sensors, thermometers, accelerometers, etc. An important issue in deploying sensors is to make them work in a common reference frame (the extrinsic calibration issue), in order to jointly exploit the data they retrieve. Determining the perception areas that are covered by each sensor is also an issue that is not easy to solve in practice. In this paper we address both the calibration and coverage issues within a common framework, based on Hidden Markov Models (HMMs) and clustering techniques. The proposed solution requires a map of the environment, as well as the ground truth of a tracked moving object/person, both of which are provided by an external system (e.g. a robot that performs telemetric mapping). The objective of the paper is twofold. On the one hand, we propose an extended framework of the classical HMM in order to (a) handle contextual observations and (b) solve general classification problems. On the other, we demonstrate the relevance of the approach by tracking a person with 4 Kinects in an apartment. A sensing floor allows the implicit calibration and mapping during an initial learning phase.
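
    The extended HMM itself is only outlined above; the Python sketch below merely illustrates the idea of an observation model conditioned on a per-step context (here, which sensor set covers each zone) inside a standard forward pass. The coverage map, the emission model and all numbers are assumptions, not the paper's classifier.

    # Minimal sketch: forward filtering for an HMM whose observation likelihood
    # depends on a per-step context (e.g. which sensors cover the current zone).
    import numpy as np

    A = np.array([[0.8, 0.2, 0.0],               # zone-to-zone transition matrix
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])
    pi = np.array([1.0, 0.0, 0.0])               # the person starts in zone 0

    # Hypothetical context: coverage[c][z] = 1 if sensor set c observes zone z.
    coverage = np.array([[1, 1, 0],
                         [0, 1, 1]])

    def emission(obs, context):
        """P(obs | zone, context): a sensor that does not cover a zone is
        uninformative (0.5) for that zone; otherwise a simple detection model."""
        covered = coverage[context] == 1
        return np.where(covered, 0.9 if obs else 0.1, 0.5)

    def forward(observations, contexts):
        """Normalised filtering distribution over zones after all observations."""
        alpha = pi * emission(observations[0], contexts[0])
        alpha /= alpha.sum()
        for obs, ctx in zip(observations[1:], contexts[1:]):
            alpha = emission(obs, ctx) * (A.T @ alpha)
            alpha /= alpha.sum()
        return alpha

    print(forward(observations=[1, 1, 0], contexts=[0, 0, 1]))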

    Automatic visual detection of human behavior: a review from 2000 to 2014

    Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video is a very recent research topic. In this paper, we perform a systematic and recent literature review on this topic, from 2000 to 2014, covering a selection of 193 papers that were searched from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research for designing automatic visual human behavior detection systems. This work is funded by the Portuguese Foundation for Science and Technology (FCT - Fundacao para a Ciencia e a Tecnologia) under research Grant SFRH/BD/84939/2012.

    Inference of Non-Overlapping Camera Network Topology using Statistical Approaches

    This work proposes an unsupervised learning model to infer the topological information of a camera network automatically. The algorithm works with both non-overlapping and overlapping camera fields of view (FOVs). The constructed model detects the entry/exit zones of the moving objects across the camera FOVs using the Data-Spectroscopic method. The probabilistic relationships between each pair of entry/exit zones are learnt to localize the camera network nodes. The certainty of these probabilistic relationships is increased by computer-generating additional Monte Carlo observations of entry/exit points. Our method requires no assumptions, no per-camera processors and no communication among the cameras. The purpose is to determine the relationship between each pair of linked cameras using statistical approaches, which helps to track moving objects based on their present location. The output is a Markov chain model that represents the weighted links between each pair of camera FOVs.
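
    The statistical details are not given in the abstract; as one plausible reading of the zone-linking step, the Python sketch below histograms exit-to-entry time differences between a candidate pair of zones and declares a link when some delay bin is hit by a large fraction of exits. Zone detection (the Data-Spectroscopic step) and the Monte Carlo augmentation are not shown, and all names and thresholds are assumptions.

    # Minimal sketch: testing for a link between an exit zone of camera i and an
    # entry zone of camera j from observed event timestamps (in seconds).
    import numpy as np

    def transition_histogram(exit_times, entry_times, max_lag=120.0, bin_w=5.0):
        """Histogram of (entry - exit) delays that fall inside a plausible window."""
        lags = [ta - te
                for te in exit_times
                for ta in entry_times
                if 0.0 < ta - te <= max_lag]
        bins = np.arange(0.0, max_lag + bin_w, bin_w)
        return np.histogram(lags, bins=bins)

    def linked(hist, n_exits, peak_ratio=0.2):
        """Declare a link if one delay bin captures a large fraction of exits."""
        return hist.max() >= peak_ratio * max(n_exits, 1)

    # Illustrative usage: a genuine ~20 s corridor between the two zones.
    rng = np.random.default_rng(0)
    exits = np.sort(rng.uniform(0, 3600, 200))
    entries = exits + rng.normal(20.0, 3.0, exits.size)     # correlated arrivals
    hist, edges = transition_histogram(exits, entries)
    print(linked(hist, exits.size), edges[hist.argmax()])   # expect a link near 20 s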

    Optimisation of people tracking in a camera network

    This thesis addresses the problem of improving the performance of the people tracking process in a new framework called the Global Tracker, which evaluates the quality of the people trajectories obtained by a simple tracker and recovers the potential errors from that previous stage. The first part of the Global Tracker estimates the quality of the tracking results, based on a statistical model analyzing the distributions of the target's features (such as its dimensions, speed and direction) to detect potential anomalies. To differentiate real errors from natural phenomena, we analyze all the interactions between the tracked object and its surroundings (other objects and background elements). In the second part, a post-tracking method is designed to associate different tracklets (reliable segments of trajectory) corresponding to the same person which were not associated by the first tracking stage. This tracklet matching process selects the most relevant appearance features to compute a visual signature for each tracklet. Finally, the Global Tracker is evaluated on various benchmark datasets reproducing a wide variety of real-life situations, matching or outperforming state-of-the-art trackers.
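
    The tracklet matching stage can be pictured with the simplified Python sketch below: each tracklet receives a visual signature (here a mean colour histogram over its image patches) and temporally compatible tracklets are greedily linked by signature similarity. The feature selection, the anomaly analysis and all thresholds of the actual Global Tracker are not reproduced; every name here is an assumption.

    # Minimal sketch of tracklet association by appearance signature.
    import numpy as np

    def signature(patches, bins=16):
        """Mean per-channel colour histogram over a tracklet's image patches."""
        hists = []
        for p in patches:                          # p: HxWx3 uint8 image crop
            h = [np.histogram(p[..., c], bins=bins, range=(0, 256), density=True)[0]
                 for c in range(3)]
            hists.append(np.concatenate(h))
        sig = np.mean(hists, axis=0)
        return sig / (np.linalg.norm(sig) + 1e-12)

    def link_tracklets(tracklets, sim_thr=0.8, max_gap=5.0):
        """tracklets: list of dicts with 'sig', 't_start', 't_end' (seconds).
        Greedily link each tracklet to its best temporally compatible successor."""
        links = []
        for i, a in enumerate(tracklets):
            best, best_sim = None, sim_thr
            for j, b in enumerate(tracklets):
                gap = b["t_start"] - a["t_end"]
                if i == j or not (0.0 < gap <= max_gap):
                    continue                        # not a plausible continuation
                sim = float(a["sig"] @ b["sig"])    # cosine similarity of signatures
                if sim > best_sim:
                    best, best_sim = j, sim
            if best is not None:
                links.append((i, best))             # pairs of tracklets to merge
        return links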

    Robust real-time tracking in smart camera networks


    Person re-Identification over distributed spaces and time

    Replicating the human visual system and the cognitive abilities that the brain uses to process the information it receives is an area of substantial scientific interest. With the prevalence of video surveillance cameras, a portion of this scientific drive has gone into providing useful automated counterparts to human operators. A prominent task in visual surveillance is that of matching people between disjoint camera views, or re-identification. This allows operators to locate people of interest and to track people across cameras, and can be used as a precursory step to multi-camera activity analysis. However, due to the contrasting conditions between camera views and their effects on the appearance of people, re-identification is a non-trivial task. This thesis proposes solutions for reducing the visual ambiguity in observations of people between camera views.

    The thesis first looks at a method for mitigating the effects on the appearance of people under differing lighting conditions between camera views. It builds on work modelling inter-camera illumination based on known pairs of images. A Cumulative Brightness Transfer Function (CBTF) is proposed to estimate the mapping of colour brightness values based on limited training samples. Unlike previous methods that use a mean-based representation for a set of training samples, the cumulative nature of the CBTF retains colour information from underrepresented samples in the training set. Additionally, the bi-directionality of the mapping function is explored to try to maximise re-identification accuracy by ensuring samples are accurately mapped between cameras.

    Secondly, an extension is proposed to the CBTF framework that addresses the issue of changing lighting conditions within a single camera. As the CBTF requires manually labelled training samples, it is limited to static lighting conditions and is less effective if the lighting changes. This Adaptive CBTF (A-CBTF) differs from previous approaches that either do not consider lighting change over time, or rely on camera transition time information to update. By utilising contextual information drawn from the background in each camera view, an estimation of the lighting change within a single camera can be made. This background lighting model allows the mapping of colour information back to the original training conditions and thus removes the need for retraining.

    Thirdly, a novel reformulation of re-identification as a ranking problem is proposed. Previous methods use a score based on a direct distance measure of set features to form a correct/incorrect match result. Rather than offering an operator a single outcome, the ranking paradigm is to give the operator a ranked list of possible matches and allow them to make the final decision. By utilising a Support Vector Machine (SVM) ranking method, a weighting on the appearance features can be learned that capitalises on the fact that not all image features are equally important to re-identification. Additionally, an Ensemble-RankSVM is proposed to address scalability issues by separating the training samples into smaller subsets and boosting the trained models.

    Finally, the thesis looks at a practical application of the ranking paradigm in a real-world setting. The system encompasses both the re-identification stage and the precursory extraction and tracking stages to form an aid for CCTV operators. Segmentation and detection are combined to extract relevant information from the video, while several matching techniques are combined with temporal priors to form a more comprehensive overall matching criterion. The effectiveness of the proposed approaches is tested on datasets obtained from a variety of challenging environments including offices, apartment buildings, airports and outdoor public spaces.
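
    As a rough reconstruction of the CBTF idea described above (not the thesis' exact implementation), the Python sketch below accumulates brightness histograms over all training images from each camera before normalising, then derives the brightness mapping by matching the two cumulative distributions. The bin count and the inverse-lookup step are assumptions.

    # Approximate sketch of a Cumulative Brightness Transfer Function (CBTF).
    import numpy as np

    def cbtf(train_a, train_b, levels=256):
        """train_a / train_b: lists of greyscale uint8 images of the same people
        seen in camera A and camera B. Returns a lookup table A-level -> B-level."""
        hist_a = np.zeros(levels)
        hist_b = np.zeros(levels)
        for img in train_a:                        # accumulate BEFORE normalising,
            hist_a += np.bincount(img.ravel(), minlength=levels)
        for img in train_b:                        # keeping under-represented samples
            hist_b += np.bincount(img.ravel(), minlength=levels)
        cum_a = np.cumsum(hist_a) / hist_a.sum()
        cum_b = np.cumsum(hist_b) / hist_b.sum()
        # For each brightness level in A, find the B level with the same cumulative mass.
        lut = np.searchsorted(cum_b, cum_a).clip(0, levels - 1)
        return lut.astype(np.uint8)

    def map_to_camera_b(img_a, lut):
        """Transfer a camera-A image into camera-B brightness space."""
        return lut[img_a]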

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging at the intersection of effective visual feature technologies and the human-brain cognition process. Effective visual features are made possible through rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. The present book is intended to collect representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition.

    Activity Analysis: Finding Explanations for Sets of Events

    Automatic activity recognition is the computational process of analysing visual input and reasoning about detections to understand the performed events. In all but the simplest scenarios, an activity involves multiple interleaved events, some related and others independent. The activity in a car park or at a playground would typically include many events. This research assumes the possible events and any constraints between the events can be defined for the given scene. Analysing the activity should thus recognise a complete and consistent set of events; this is referred to as a global explanation of the activity. By seeking a global explanation that satisfies the activity's constraints, infeasible interpretations can be avoided and ambiguous observations may be resolved. An activity's events and any natural constraints are defined using a grammar formalism. Attribute Multiset Grammars (AMG) are chosen because they allow defining hierarchies, as well as attribute rules and constraints. When used for recognition, detectors are employed to gather a set of detections. Parsing the set of detections with the AMG provides a global explanation. To find the best parse tree given a set of detections, a Bayesian network models the probability distribution over the space of possible parse trees. Heuristic and exhaustive search techniques are proposed to find the maximum a posteriori global explanation. The framework is tested on two activities: the activity in a bicycle rack, and the activity around a building entrance. The first case study involves people locking bicycles onto a bicycle rack and picking them up later. The best global explanation for all detections gathered during the day resolves local ambiguities caused by occlusion or clutter. Intensive testing on five full days showed that global analysis achieves higher recognition rates. The second case study tracks people and any objects they are carrying as they enter and exit a building entrance. A complete sequence of a person entering and exiting multiple times is recovered by the global explanation.
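
    The grammar and Bayesian network machinery are beyond a short snippet; the Python sketch below only illustrates the final selection step in miniature: exhaustively searching subsets of candidate events for the highest-scoring explanation that satisfies a hard constraint. The events, scores and constraint are illustrative assumptions, not the thesis' grammar or network.

    # Minimal sketch of choosing a global explanation over candidate events.
    from itertools import combinations

    # Hypothetical candidates: (log-likelihood if the event is included,
    #                           log-likelihood if its detection is clutter).
    events = {
        "drop_bike_1": (-0.2, -3.0),
        "pick_bike_1": (-0.5, -2.0),
        "drop_bike_2": (-2.5, -0.4),   # weak detection
        "pick_bike_2": (-0.3, -2.5),
    }

    def consistent(explanation):
        """Example hard constraint: a pick-up requires the matching drop-off."""
        return all(f"drop_{e[5:]}" in explanation
                   for e in explanation if e.startswith("pick_"))

    def score(explanation):
        """Posterior-like score: each detection is either explained or clutter."""
        return sum(inc if name in explanation else clutter
                   for name, (inc, clutter) in events.items())

    def best_global_explanation():
        best, best_score = set(), float("-inf")
        names = list(events)
        for r in range(len(names) + 1):            # exhaustive search over subsets
            for subset in map(set, combinations(names, r)):
                if consistent(subset) and score(subset) > best_score:
                    best, best_score = subset, score(subset)
        return best, best_score

    print(best_global_explanation())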