
    Connectivity-Enforcing Hough Transform for the Robust Extraction of Line Segments

    Global voting schemes based on the Hough transform (HT) have been widely used to robustly detect lines in images. However, since the votes do not take line connectivity into account, these methods do not deal well with cluttered images. In contrast, the so-called local methods enforce connectivity but lack the robustness to deal with challenging situations that occur in many realistic scenarios, e.g., when line segments cross or when long segments are corrupted. In this paper, we address the critical limitations of the HT as a line segment extractor by incorporating connectivity in the voting process. This is done by only accounting for the contributions of edge points lying in increasingly larger neighborhoods and whose position and directional content agree with potential line segments. As a result, our method, which we call STRAIGHT (Segment exTRAction by connectivity-enforcInG HT), extracts the longest connected segments in each location of the image, thus also integrating into the HT voting process the usually separate step of individual segment extraction. The usage of the Hough space mapping and a corresponding hierarchical implementation make our approach computationally feasible. We present experiments that illustrate, with synthetic and real images, how STRAIGHT succeeds in extracting complete segments in several situations where current methods fail. Comment: Submitted for publication.
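
The limitation described in this abstract (global votes ignore connectivity) can be illustrated with a minimal sketch of classical Hough voting. This is not the STRAIGHT method; the function names `hough_votes` and `strongest_line`, the 180-step angle grid, and the unit `rho` bin width are all assumptions made for illustration.

```python
import math
from collections import Counter

def hough_votes(points, n_theta=180, rho_step=1.0):
    """Accumulate votes over (theta index, rho bin) for (x, y) edge points.

    Each point votes for every line x*cos(theta) + y*sin(theta) = rho
    that passes through it, on a discretized theta grid.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes

def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (theta, rho, vote count) of the most-voted accumulator bin."""
    votes = hough_votes(points, n_theta, rho_step)
    (t, r), count = votes.most_common(1)[0]
    return math.pi * t / n_theta, r * rho_step, count

# Two collinear fragments on y = 5 separated by a wide gap (hypothetical data).
fragments = [(x, 5) for x in range(0, 10)] + [(x, 5) for x in range(90, 100)]
theta, rho, count = strongest_line(fragments)
```

Both fragments vote for the same bin, so the accumulator reports a single line supported by all 20 points even though no connected segment spans the gap. That is the behavior a connectivity-aware voting scheme is meant to avoid.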

    On The Continuous Steering of the Scale of Tight Wavelet Frames

    In analogy with steerable wavelets, we present a general construction of adaptable tight wavelet frames, with an emphasis on scaling operations. In particular, the derived wavelets can be "dilated" by a procedure comparable to the operation of steering steerable wavelets. The fundamental aspects of the construction are the same: an admissible collection of Fourier multipliers is used to extend a tight wavelet frame, and the "scale" of the wavelets is adapted by scaling the multipliers. As an application, the proposed wavelets can be used to improve the frequency localization. Importantly, the localized frequency bands specified by this construction can be scaled efficiently using matrix multiplication.

    Advances in Stereo Vision

    Stereopsis is a vision process whose geometrical foundation has been known for a long time, ever since the experiments by Wheatstone in the 19th century. Nevertheless, its inner workings in biological organisms, as well as its emulation by computer systems, have proven elusive, and stereo vision remains a very active and challenging area of research. In this volume we have attempted to present a limited but relevant sample of the work being carried out in stereo vision, covering significant aspects from both the applied and the theoretical standpoints.

    Automated segmentation, detection and fitting of piping elements from terrestrial LIDAR data

    Since the invention of light detection and ranging (LIDAR) in the early 1960s, it has been adopted for use in numerous applications, from topographical mapping with airborne LIDAR platforms to surveying of urban sites with terrestrial LIDAR systems. Static terrestrial LIDAR has become an especially effective tool for surveying, in some cases replacing traditional techniques such as electronic total stations and GPS methods. Current state-of-the-art LIDAR scanners have very fine spatial resolution, generating precise 3D point cloud data with millimeter accuracy. Therefore, LIDAR data can provide 3D details of a scene with an unprecedented level of detail. However, automated exploitation of LIDAR data is challenging, due to the non-uniform spatial sampling of the point clouds as well as to the massive volumes of data, which may range from a few million points to hundreds of millions of points depending on the size and complexity of the scene being scanned. This dissertation focuses on addressing these challenges to automatically exploit large LIDAR point clouds of piping systems in industrial sites, such as chemical plants, oil refineries, and steel mills. A complete processing chain is proposed in this work, using raw LIDAR point clouds as input and generating cylinder parameter estimates for pipe segments as the output, which could then be used to produce computer aided design (CAD) models of pipes. The processing chain consists of three stages: (1) segmentation of LIDAR point clouds, (2) detection and identification of piping elements, and (3) cylinder fitting and parameter estimation. The final output of the cylinder fitting stage gives the estimated orientation, position, and radius of each detected pipe element. A robust octree-based split and merge segmentation algorithm is proposed in this dissertation that can efficiently process LIDAR data. Following octree decomposition of the point cloud, graph theory analysis is used during the splitting process to separate points within each octant into components based on spatial connectivity. A series of connectivity criteria (proximity, orientation, and curvature) are developed for the merging process, which exploits contextual information to effectively merge cylindrical segments into complete pipes and planar segments into complete walls. Furthermore, by conducting surface fitting of segments and analyzing their principal curvatures, the proposed segmentation approach is capable of detecting and identifying the piping segments. A novel cylinder fitting technique is proposed to accurately estimate the cylinder parameters for each detected piping segment from the terrestrial LIDAR point cloud. Specifically, the orientation, radius, and position of each piping element must be robustly estimated in the presence of noise. An original formulation has been developed to estimate the cylinder axis orientation using gradient descent optimization of an angular distance cost function. The cost function is based on the concept that surface normals of points in a cylinder point cloud are perpendicular to the cylinder axis. The key contribution of this algorithm is its capability to accurately estimate the cylinder orientation in the presence of noise without requiring a good initial starting point. After estimation of the cylinder's axis orientation, the radius and position are then estimated in the 2D space formed from the projection of the 3D cylinder point cloud onto the plane perpendicular to the cylinder's axis. With these high quality approximations, a least squares estimation in 3D is made for the final cylinder parameters. Following cylinder fitting, the estimated parameters of each detected piping segment are used to generate a CAD model of the piping system. The algorithms and techniques in this dissertation form a complete processing chain that can automatically exploit large LIDAR point clouds of piping systems and generate CAD models.
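
The axis-orientation idea in this abstract (surface normals of a cylinder are perpendicular to its axis) can be sketched as a small optimization. The abstract does not give the dissertation's angular distance cost function, so the sketch below substitutes an assumed cost, the mean squared dot product between each unit normal and the candidate axis, minimized by gradient descent with renormalization onto the unit sphere. The names `normalize` and `estimate_axis`, the step size, the iteration count, and the starting vector are all hypothetical.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def estimate_axis(normals, start=(1.0, 1.0, 1.0), step=0.1, iters=500):
    """Estimate a cylinder axis direction from unit surface normals.

    Assumed cost: J(a) = mean over i of (n_i . a)^2, which is zero when
    every normal is perpendicular to the unit axis a.
    """
    a = normalize(start)
    count = len(normals)
    for _ in range(iters):
        # Gradient of J with respect to a: 2 * mean((n_i . a) * n_i).
        grad = [0.0, 0.0, 0.0]
        for n in normals:
            dot = sum(p * q for p, q in zip(n, a))
            for k in range(3):
                grad[k] += 2.0 * dot * n[k] / count
        # Descent step, then project back onto the unit sphere.
        a = normalize([a[k] - step * grad[k] for k in range(3)])
    return a

# Noise-free normals of a cylinder aligned with the z axis (synthetic data).
normals = [
    [math.cos(2 * math.pi * i / 36), math.sin(2 * math.pi * i / 36), 0.0]
    for i in range(36)
]
axis = estimate_axis(normals)
```

The sign of the returned axis is arbitrary, since `a` and `-a` give the same cost. A starting vector exactly perpendicular to the true axis would stall this simple scheme, so the sketch makes no claim about the robustness to initialization that the abstract reports for the actual method.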

    City Scene Super-Resolution via Geometric Error Minimization

    Super-resolution techniques are crucial in improving image granularity, particularly in complex urban scenes, where preserving geometric structures is vital for data-informed cultural heritage applications. In this paper, we propose a city scene super-resolution method via geometric error minimization. The geometric-consistent mechanism leverages the Hough Transform to extract regular geometric features in city scenes, enabling the computation of geometric errors between low-resolution and high-resolution images. By minimizing mixed mean square error and geometric align error during the super-resolution process, the proposed method efficiently restores details and geometric regularities. Extensive validations on the SET14, BSD300, Cityscapes and GSV-Cities datasets demonstrate that the proposed method outperforms existing state-of-the-art methods, especially in urban scenes. Comment: 26 pages, 10 figures.

    Mass Displacement Networks

    Despite the large improvements in performance attained by using deep learning in computer vision, one can often further improve results with some additional post-processing that exploits the geometric nature of the underlying task. This commonly involves displacing the posterior distribution of a CNN in a way that makes it more appropriate for the task at hand, e.g. better aligned with local image features, or more compact. In this work we integrate this geometric post-processing within a deep architecture, introducing a differentiable and probabilistically sound counterpart to the common geometric voting technique used for evidence accumulation in vision. We refer to the resulting neural models as Mass Displacement Networks (MDNs), and apply them to human pose estimation in two distinct setups: (a) landmark localization, where we collapse a distribution to a point, allowing for precise localization of body keypoints, and (b) communication across body parts, where we transfer evidence from one part to the other, allowing for a globally consistent pose estimate. We evaluate on large-scale pose estimation benchmarks, such as the MPII Human Pose and COCO datasets, and report systematic improvements when compared to strong baselines. Comment: 12 pages, 4 figures.