
    Comparing feature matching for object categorization in video surveillance

    In this paper we consider an object categorization system using local HMAX features. Two feature matching techniques are compared: the MAX technique, originally proposed in the HMAX framework, and the histogram technique originating from the Bag-of-Words literature. We have found that each of these techniques has its own field of operation. The histogram technique clearly outperforms the MAX technique by 5-15% for small dictionaries of up to 500-1,000 features, favoring this technique for embedded (surveillance) applications. Additionally, we have evaluated the influence of interest point operators in the system. A first experiment analyzes the effect of dictionary creation and shows that random dictionaries outperform dictionaries created from Hessian-Laplace points. Secondly, the effect of operators in the dictionary matching stage has been evaluated. Processing all image points outperforms the point selection of the Hessian-Laplace operator.
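
    To illustrate the two matching strategies, here is a minimal sketch (not the paper's implementation) contrasting them on toy descriptors; the dictionary size, descriptor dimension, and random data are all assumptions.

```python
# Illustrative sketch: MAX vs. histogram matching against a feature dictionary.
import numpy as np

rng = np.random.default_rng(0)
dictionary = rng.normal(size=(500, 128))   # 500 prototype features (a random dictionary)
image_feats = rng.normal(size=(200, 128))  # local descriptors extracted from one image

# Similarity of every image descriptor to every dictionary prototype.
sims = image_feats @ dictionary.T          # shape (200, 500)

# MAX technique (HMAX): each dictionary entry keeps its single best match in the image.
max_response = sims.max(axis=0)            # shape (500,) feature vector

# Histogram technique (Bag-of-Words): each descriptor votes for its nearest
# prototype; the normalized vote counts form the feature vector.
votes = sims.argmax(axis=1)
histogram = np.bincount(votes, minlength=dictionary.shape[0]).astype(float)
histogram /= histogram.sum()

# Either vector would then be fed to a classifier (e.g. an SVM).
```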

    Real-time centre detection of an OLED structure

    The research presented in this paper focuses on real-time image processing for visual servoing, i.e. the positioning of an x-y table using only a camera instead of encoders. A camera image stream combined with real-time image processing determines the position for the next iteration of the table controller. With a frame rate of 1000 fps, a maximum processing time of only 1 millisecond is allowed for each image of 80x80 pixels. This visual servoing task is performed on an OLED (Organic Light Emitting Diode) substrate, as found in displays, with a typical structure size of 100 by 200 µm. The presented algorithm detects the centre of an OLED well with sub-pixel accuracy (1 pixel equals 4 µm; the sub-pixel accuracy is reliable up to ±1 µm) and a computation time of less than 1 millisecond.
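
    The paper's exact detector is not reproduced here; as a hedged sketch, one common way to obtain a sub-pixel centre estimate is an intensity-weighted centroid over the (dark) well region of the 80x80 frame. The function name and the background-suppression threshold are illustrative assumptions.

```python
# Sketch: sub-pixel centre via intensity-weighted centroid (illustrative only).
import numpy as np

def well_centre(frame: np.ndarray) -> tuple[float, float]:
    """Return the (x, y) centre of the well in pixels, with sub-pixel precision."""
    weights = frame.max() - frame.astype(float)   # dark well -> large weight
    weights[weights < 0.5 * weights.max()] = 0.0  # crude background suppression
    ys, xs = np.mgrid[0:frame.shape[0], 0:frame.shape[1]]
    total = weights.sum()
    return (xs * weights).sum() / total, (ys * weights).sum() / total

# At 4 um/pixel, a centre estimate of e.g. (40.25, 39.80) px corresponds to
# (161.0, 159.2) um on the substrate.
```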

    Method for orthorectification of terrestrial radar maps

    The vehicle-based PELICAN radar system is used in the context of mobile mapping. The R-SLAM algorithm allows simultaneous retrieval of the vehicle trajectory and of the map of the environment. As the purpose of PELICAN is to provide a means of gathering spatial information, the impact of distortion caused by the topography is not negligible. This article proposes an orthorectification process to correct panoramic radar images and, consequently, the R-SLAM trajectory and radar map. The a priori knowledge of the area topography is provided by a digital elevation model. By applying the method to data obtained from a path with large variations in altitude, it is shown that the corrected panoramic radar images are contracted by the orthorectification process. The efficiency of the orthorectification process is assessed first by comparing R-SLAM trajectories to a GPS trajectory and second by comparing the positions of ground control points on the radar map with their GPS positions. The RMS positioning error drops from 5.56 m for the raw radar map to 0.75 m for the orthorectified radar map.
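
    A hedged sketch of the core geometric correction (not the full R-SLAM pipeline): assuming the DEM supplies the terrain height under each radar return, the measured slant range is projected onto the ground plane, which is what contracts the panoramic image. The function name and numbers are illustrative.

```python
# Sketch: slant-range to ground-range projection using DEM heights.
import numpy as np

def ground_range(slant_range: np.ndarray, sensor_alt: float,
                 terrain_alt: np.ndarray) -> np.ndarray:
    """Project measured slant ranges (m) onto the ground plane using DEM heights (m)."""
    dz = sensor_alt - terrain_alt                       # radar-to-ground height difference
    return np.sqrt(np.maximum(slant_range**2 - dz**2, 0.0))

# Example: a 100 m slant range to a point 30 m below the radar
# corresponds to ~95.4 m of ground range.
print(ground_range(np.array([100.0]), 30.0, np.array([0.0])))
```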

    Perceptually-guided deep neural networks for ego-action prediction: Object grasping

    We tackle the problem of predicting a grasping action in egocentric video for the assistance of upper-limb amputees. Our work is based on paradigms from neuroscience stating that human gaze expresses intention and anticipates actions. In our scenario, human gaze fixations are recorded by a glasses-worn eye-tracker and then used to predict grasping actions. We have studied two aspects of the problem: which object from a given taxonomy will be grasped, and when is the moment to trigger the grasping action. To recognize objects, we use gaze to guide Convolutional Neural Networks (CNNs) to focus on the object-to-grasp area. However, the acquired sequence of fixations is noisy due to saccades toward distractors and visual fatigue, and gaze is not always reliably directed toward the object of interest. To deal with this challenge, we use video-level annotations indicating the object to be grasped and a weak loss in deep CNNs. To detect the moment when a person will take an object, we take advantage of the predictive power of Long Short-Term Memory (LSTM) networks to analyze gaze and visual dynamics. Results show that our method achieves better performance than other approaches on a real-life dataset. This work was partially supported by the French National Center of Scientific Research with grant Suvipp PEPS CNRS-Idex 215-2016 and the interdisciplinary project CNRS RoBioVis 2017–2019, by the Scientific Council of LaBRI, University of Bordeaux, and by the Spanish Ministry of Economy and Competitiveness under the National Grants TEC2014-53390-P and TEC2014-61729-EXP.
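
    As a minimal sketch of the gaze-guidance idea (the paper's weak-loss CNN and LSTM stages are not reproduced, and all names are illustrative): a fixation point selects the object-to-grasp region that would be fed to the classifier.

```python
# Sketch: crop the region around a gaze fixation to focus a CNN on the object-to-grasp.
import numpy as np

def gaze_crop(frame: np.ndarray, gaze_xy: tuple[int, int],
              size: int = 224) -> np.ndarray:
    """Crop a size x size patch centred on the gaze fixation, clamped to the frame."""
    h, w = frame.shape[:2]
    gx, gy = gaze_xy
    x0 = int(np.clip(gx - size // 2, 0, max(w - size, 0)))
    y0 = int(np.clip(gy - size // 2, 0, max(h - size, 0)))
    return frame[y0:y0 + size, x0:x0 + size]

# The crop would then go to a CNN for object recognition; a sequence of
# per-frame gaze/visual features would feed an LSTM that decides when to
# trigger the grasp.
```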

    Attributed Graph Matching using Local Descriptions

    Final version available at www.springerlink.com. In the pattern recognition context, objects can be represented as graphs with attributed nodes and edges encoding their relations. Consequently, matching attributed graphs plays an important role in object recognition. In this paper, node signature extraction is combined with an optimal assignment method for matching attributed graphs. In particular, we show how local descriptions are used to define a node-to-node cost in an assignment problem solved with the Hungarian method. Moreover, we propose a distance formula to compute the distance between attributed graphs. The experiments demonstrate that the newly presented algorithm is well suited to pattern recognition applications. Compared with well-known methods, our algorithm gives good results for image retrieval.
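
    A sketch of the assignment step under simple assumptions: node signatures are taken to be plain feature vectors and the node-to-node cost is their Euclidean distance; the Hungarian method then yields the optimal node matching, and the matching cost serves as a simple graph distance. The signatures below are toy data.

```python
# Sketch: optimal node assignment between two attributed graphs (Hungarian method).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

sig_a = np.array([[0.1, 1.0], [0.9, 0.2], [0.5, 0.5]])  # node signatures, graph A
sig_b = np.array([[0.8, 0.3], [0.2, 0.9], [0.4, 0.6]])  # node signatures, graph B

cost = cdist(sig_a, sig_b)                 # node-to-node cost matrix
rows, cols = linear_sum_assignment(cost)   # Hungarian / optimal assignment

# A simple graph distance: total cost of the optimal matching.
distance = cost[rows, cols].sum()
print(list(zip(rows, cols)), distance)
```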

    Algorithms for Image Analysis in Traffic Surveillance Systems

    The presence of various surveillance systems in many areas of modern society is indisputable, and video surveillance systems are among the most visible. This thesis mainly describes a novel algorithm for vision-based estimation of parking lot occupancy, together with the closely related topic of pre-processing images captured under harsh conditions. The developed algorithms have their practical application in parking guidance systems, which are becoming ever more popular. One part of this work also contributes to the specific area of computer graphics known as direct volume rendering (DVR).