    DeepMatching: Hierarchical Deformable Dense Matching

    We introduce a novel matching algorithm, called DeepMatching, to compute dense correspondences between images. DeepMatching relies on a hierarchical, multi-layer, correlational architecture designed for matching images, inspired by deep convolutional approaches. The proposed matching algorithm can handle non-rigid deformations and repetitive textures, and efficiently determines dense correspondences in the presence of significant changes between images. We evaluate the performance of DeepMatching against state-of-the-art matching algorithms on the Mikolajczyk (Mikolajczyk et al. 2005), MPI-Sintel (Butler et al. 2012), and KITTI (Geiger et al. 2013) datasets. DeepMatching outperforms the state-of-the-art algorithms and shows excellent results, in particular for repetitive textures. We also propose a method for estimating optical flow, called DeepFlow, by integrating DeepMatching into the large displacement optical flow (LDOF) approach of Brox and Malik (2011). Compared to existing matching algorithms, our matching approach provides additional robustness to large displacements and complex motion. DeepFlow obtains competitive performance on public benchmarks for optical flow estimation.
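    To make the hierarchical correlation idea concrete, the following is a minimal sketch of the bottom level of a DeepMatching-style pipeline: each small patch of the first image is correlated against every same-sized window of the second, producing one correlation map per patch. The patch size, the normalization, and the brute-force NumPy loops are illustrative assumptions, not the authors' implementation; higher levels of the hierarchy would max-pool and recursively aggregate these maps, much as a convolutional network pools features.

        import numpy as np

        def correlation_maps(img1, img2, patch=4):
            """For each non-overlapping patch of img1, compute its normalized
            cross-correlation with every same-sized window of img2. Returns an
            array of shape (ny, nx, H - patch + 1, W - patch + 1)."""
            H, W = img2.shape
            out_h, out_w = H - patch + 1, W - patch + 1
            ny, nx = img1.shape[0] // patch, img1.shape[1] // patch
            maps = np.zeros((ny, nx, out_h, out_w), dtype=np.float32)
            for i in range(ny):
                for j in range(nx):
                    p = img1[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
                    p = (p - p.mean()) / (p.std() + 1e-8)   # zero-mean, unit-variance patch
                    for y in range(out_h):
                        for x in range(out_w):
                            w = img2[y:y+patch, x:x+patch]
                            w = (w - w.mean()) / (w.std() + 1e-8)
                            maps[i, j, y, x] = (p * w).mean()  # normalized correlation
            return maps

    This brute-force version is only practical on tiny images; the point of the hierarchical architecture is precisely to amortize this cost across levels.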

    Object Recognition from very few Training Examples for Enhancing Bicycle Maps

    In recent years, data-driven methods have shown great success in extracting information about the infrastructure of urban areas. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. While large datasets have been published for cars, very little labeled data is available for cyclists, even though the appearance, point of view, and positioning of the relevant objects differ. Unfortunately, labeling data is costly and requires a huge amount of work. In this paper, we therefore address the problem of learning with very few labels. The aim is to recognize particular traffic signs in crowdsourced data in order to collect information of interest to cyclists. We propose a system for object recognition that is trained with only 15 examples per class on average. To achieve this, we combine the advantages of convolutional neural networks and random forests to learn a patch-wise classifier. In the next step, we map the random forest to a neural network and transform the classifier into a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. Finally, we integrate Global Positioning System (GPS) data to localize the predictions on the map. In comparison to Faster R-CNN and other networks for object recognition or algorithms for transfer learning, we considerably reduce the required amount of labeled data. We demonstrate good performance on the recognition of traffic signs for cyclists as well as their localization in maps.
    Comment: Submitted to IV 2018. This research was supported by the German Research Foundation (DFG) within Priority Research Programme 1894 "Volunteered Geographic Information: Interpretation, Visualization and Social Computing".
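    The step of transforming a patch-wise classifier into a fully convolutional network can be sketched generically: a fully connected head trained on fixed-size patches is re-expressed as a convolution whose kernel spans its entire input field, so the same weights then produce a dense score map over a full image in one pass. The layer sizes, the 32-pixel patch, and the ten sign classes below are hypothetical; the paper's actual network is obtained by mapping a random forest to a neural network, which is not reproduced here.

        import torch
        import torch.nn as nn

        patch = 32  # assumed training patch size

        # A small patch-wise classifier: conv features plus a fully connected head.
        features = nn.Sequential(
            nn.Conv2d(3, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        fc = nn.Linear(32 * (patch // 4) ** 2, 10)  # 10 hypothetical sign classes

        # Conversion: the FC layer becomes a convolution covering its input field,
        # so the network accepts full images and emits a dense map of class scores.
        conv_head = nn.Conv2d(32, 10, kernel_size=patch // 4)
        with torch.no_grad():
            conv_head.weight.copy_(fc.weight.view(10, 32, patch // 4, patch // 4))
            conv_head.bias.copy_(fc.bias)

        fcn = nn.Sequential(features, conv_head)
        scores = fcn(torch.randn(1, 3, 256, 256))  # dense scores, shape (1, 10, 57, 57)

    Local maxima in the resulting score map then yield detections, and hence bounding boxes, at roughly the patch scale.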

    Generalized Kernel-based Visual Tracking

    In this work we generalize plain mean shift (MS) trackers and attempt to overcome two limitations of standard MS trackers. It is well known that modeling and maintaining a representation of a target object is an important component of a successful visual tracker. However, little work has been done on building a robust template model for kernel-based MS tracking. In contrast to building a template from a single frame, we train a robust object representation model from a large amount of data. Tracking is viewed as a binary classification problem, and a discriminative classification rule is learned to distinguish between the object and the background. We adopt a support vector machine (SVM) for training. The tracker is then implemented by maximizing the classification score. An iterative optimization scheme very similar to MS is derived for this purpose.
    Comment: 12 pages.
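    As a rough illustration of "maximizing the classification score" with an MS-style iteration, suppose a trained SVM has already been evaluated at every pixel, yielding a score map (for a linear SVM on per-pixel features, score = w·f(pixel) + b). The update below moves the window center to the kernel-weighted centroid of positive scores; the Epanechnikov kernel and the per-pixel linear scoring are assumptions of this sketch, not the paper's exact formulation.

        import numpy as np

        def ms_step(score_map, center, radius):
            """One mean-shift-style update: move the window center to the
            kernel-weighted centroid of positive classifier scores."""
            H, W = score_map.shape
            ys, xs = np.mgrid[0:H, 0:W]
            d2 = ((ys - center[0]) ** 2 + (xs - center[1]) ** 2) / radius ** 2
            k = np.clip(1.0 - d2, 0.0, None)        # Epanechnikov kernel profile
            w = k * np.clip(score_map, 0.0, None)   # emphasize object-like pixels
            if w.sum() < 1e-8:
                return center                       # no support inside the window
            return np.array([(ys * w).sum(), (xs * w).sum()]) / w.sum()

        def track(score_map, center, radius, iters=20, tol=0.5):
            """Iterate until convergence, i.e. a local maximum of the score."""
            center = np.asarray(center, dtype=float)
            for _ in range(iters):
                new = ms_step(score_map, center, radius)
                if np.linalg.norm(new - center) < tol:
                    break
                center = new
            return center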

    The analytic edge - image reconstruction from edge data via the Cauchy Integral

    A novel image reconstruction algorithm from edges (image gradients) follows from the Sokhotski-Plemelj theorem of complex analysis, an elaboration of the standard Cauchy (singular) integral. This algorithm demonstrates the application of singular integral equation methods to image processing, extending the more common use of partial differential equations (e.g., variants of the diffusion or Poisson equations). The Cauchy integral approach has a deep connection to, and sheds light on, the (linear and non-linear) diffusion equation, the retinex algorithm, and energy-based image regularization. It extends the commonly understood local definition of an edge to a global, complex analytic structure - the analytic edge - the contrast-weighted kernel of the Cauchy integral. Superposition of the set of analytic edges provides a "filled-in" image, which is the piecewise analytic image corresponding to the edge (gradient) data supplied. This is a fully parallel operation which avoids the time penalty associated with iterative solutions and is thus compatible with the short time (about 150 milliseconds) biologically available for the brain to construct a perceptual image from edge data. Although this algorithm produces an exact reconstruction of a filled-in image from the gradients of that image, slight modifications of it produce images which correspond to the perceptual reports of human observers presented with a wide range of "visual contrast illusion" images.
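    For reference, the machinery the abstract invokes can be stated directly. With edge (contrast) data f supported on a contour \Gamma, the Cauchy integral and its Sokhotski-Plemelj boundary values are (standard statements, not the paper's notation):

        F(z) = \frac{1}{2\pi i} \oint_{\Gamma} \frac{f(\zeta)}{\zeta - z}\, d\zeta ,
        \qquad
        \lim_{z \to t^{\pm}} F(z) = \pm \tfrac{1}{2} f(t)
            + \frac{1}{2\pi i}\, \mathrm{P.V.} \int_{\Gamma} \frac{f(\zeta)}{\zeta - t}\, d\zeta .

    The jump F^{+}(t) - F^{-}(t) = f(t) across \Gamma is what lets a contrast-weighted contour act as an "analytic edge": superposing the F of all edges yields a piecewise analytic image whose jumps reproduce the supplied gradient data.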

    Weighted Bayesian Gaussian Mixture Model for Roadside LiDAR Object Detection

    Background modeling is widely used in intelligent surveillance systems to detect moving targets by subtracting the static background components. Most roadside LiDAR object detection methods filter out foreground points by comparing new data points to pre-trained background references based on descriptive statistics computed over many frames (e.g., voxel density, number of neighbors, maximum distance). However, these solutions are inefficient under heavy traffic, and parameter values are hard to transfer from one scenario to another. In early studies, the probabilistic background modeling methods widely used in video-based systems were considered unsuitable for roadside LiDAR surveillance systems due to the sparse and unstructured point cloud data. In this paper, the raw LiDAR data are transformed into a structured representation based on the elevation and azimuth value of each LiDAR point. With this high-order tensor representation, we break the barrier and allow efficient high-dimensional multivariate analysis for roadside LiDAR background modeling. The Bayesian nonparametric (BNP) approach integrates the intensity values and 3D measurements to fully exploit both sources of information. The proposed method was compared against two state-of-the-art roadside LiDAR background models, a computer vision benchmark, and deep learning baselines, evaluated at the point, object, and path levels under heavy traffic and challenging weather. The multimodal weighted Bayesian Gaussian mixture model (GMM) can handle dynamic backgrounds with noisy measurements and substantially enhances infrastructure-based LiDAR object detection, enabling a variety of 3D modeling applications for smart cities.
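    A heavily simplified sketch of per-cell mixture background modeling on the structured (elevation, azimuth) grid described above: a fixed-size scikit-learn Gaussian mixture per cell stands in for the paper's weighted Bayesian nonparametric model, and the grid resolution, the (range, intensity) features, the assumed elevation span, and the log-likelihood threshold are all illustrative assumptions.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        N_ELEV, N_AZIM = 32, 360  # assumed angular resolution of the grid

        def cell_index(elev_deg, azim_deg):
            """Map point angles onto the structured (elevation, azimuth) grid;
            assumes elevation spans roughly [-16, 16] degrees."""
            e = np.clip(((elev_deg + 16.0) / 32.0 * N_ELEV).astype(int), 0, N_ELEV - 1)
            a = azim_deg.astype(int) % N_AZIM
            return e, a

        def fit_background(frames, k=3):
            """frames: iterable of (elev, azim, range, intensity) arrays recorded
            without targets. Returns {cell: GMM over (range, intensity)}."""
            cells = {}
            for elev, azim, rng, inten in frames:
                e, a = cell_index(elev, azim)
                for i in range(len(rng)):
                    cells.setdefault((e[i], a[i]), []).append((rng[i], inten[i]))
            return {c: GaussianMixture(n_components=min(k, len(v))).fit(np.array(v))
                    for c, v in cells.items() if len(v) >= 5}

        def is_foreground(model, elev, azim, rng, inten, thresh=-10.0):
            """Flag a point whose (range, intensity) is unlikely under the
            cell's background mixture; the threshold is an assumption."""
            e, a = cell_index(np.array([elev]), np.array([azim]))
            gmm = model.get((e[0], a[0]))
            if gmm is None:
                return True  # no background ever observed in this cell
            return gmm.score_samples([[rng, inten]])[0] < thresh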

    Performance Evaluation of State-of-the-art Filtering Criteria Applied to SIFT Features

    Rather than matching by simply minimizing a dissimilarity measure between descriptors, Lowe, when introducing the SIFT method, suggested a more effective matching strategy based on the ratio between the distances to the nearest and the second-nearest neighbor. It leads to excellent matching accuracy. Unlike these strategies, which rely on a deterministic formalism, some researchers have recently opted for a statistical analysis of the matching process. The cornerstone of this formalism exploits the Markov inequality, and the ratio criterion has been interpreted as an upper bound on the probability that a match does not belong to the background distribution. In this paper, we first examine some of the assumptions and methods used in these works and demonstrate their inconsistencies. We then propose an improvement by providing a tighter bound on that probability. Since the ratio criterion is an upper bound, refining the bound reduces the probability that the established matches come from the background. Experiments on the well-known Oxford-5k and Paris-6k datasets show performance improvements for the image retrieval application.
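    For context, the ratio criterion whose probabilistic interpretation the paper refines takes only a few lines. The sketch below uses brute-force Euclidean distances and Lowe's customary 0.8 threshold; the tighter bound proposed in the paper is not reproduced here.

        import numpy as np

        def ratio_test_matches(desc1, desc2, ratio=0.8):
            """Match rows of desc1 to rows of desc2 (e.g., SIFT descriptors),
            keeping a match only if the nearest neighbor is clearly closer
            than the second nearest."""
            matches = []
            for i, d in enumerate(desc1):
                dists = np.linalg.norm(desc2 - d, axis=1)
                j1, j2 = np.argsort(dists)[:2]       # nearest, second nearest
                if dists[j1] < ratio * dists[j2]:    # Lowe's ratio criterion
                    matches.append((i, j1))
            return matches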

    Sea-Surface Object Detection Based on Electro-Optical Sensors: A Review

    Sea-surface object detection is critical for the navigation safety of autonomous ships. Electro-optical (EO) sensors, such as video cameras, complement on-board radar in detecting small sea-surface obstacles. Traditionally, researchers have used horizon detection, background subtraction, and foreground segmentation techniques to detect sea-surface objects. Recently, deep learning-based object detection technologies have gradually been applied to sea-surface object detection. This article presents a comprehensive overview of sea-surface object detection approaches in which the advantages and drawbacks of each technique are compared, covering four essential aspects: EO sensors and image types, traditional object detection methods, deep learning methods, and the collection of maritime datasets. In particular, sea-surface object detection based on deep learning methods is thoroughly analyzed, and highly influential public datasets are introduced as benchmarks to verify the effectiveness of these approaches.
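    As a tiny illustration of the "background subtraction" family of traditional detectors the review surveys, OpenCV's stock MOG2 subtractor can flag moving sea-surface objects in a video stream; the file name is a placeholder and this is not any specific method from the review.

        import cv2

        cap = cv2.VideoCapture("sea_surface.mp4")   # placeholder input video
        subtractor = cv2.createBackgroundSubtractorMOG2()
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            mask = subtractor.apply(frame)          # foreground = moving pixels
            # Contours of the mask give candidate object regions.
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
        cap.release()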