
63 research outputs found

Resolution-independent superpixels based on convex constrained meshes without small angles

Author: A Levinshtein
J Shi
JR Shewchuk
M Bergh Van de
O Veksler
P Arbelaez
P Felzenszwalb
R Achanta
RG Gioi von
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/12/2016

University of Liverpool Repository

Crossref

Convex constrained meshes for superpixel segmentations of images.

Author: Forsythe Jeremy
Kurlin Vitaliy
Publication date: 01/01/2017

University of Liverpool Repository

Resolution-Independent Meshes of Superpixels

Author: A Levinshtein
J Forsythe
J Shi
M Bergh Van de
P Arbelaez
P Felzenszwalb
R Achanta
RG Gioi Von
V Kurlin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2019

The over-segmentation into superpixels is an important preprocessing step to smartly compress the input size and speed up higher-level tasks. A superpixel was traditionally considered a small cluster of square pixels that have similar color intensities and are located close to each other. In this discrete model, the boundaries of superpixels often have irregular zigzags consisting of horizontal or vertical edges from a given pixel grid. However, digital images represent a continuous world, hence the following continuous model in the resolution-independent formulation can be more suitable for the reconstruction problem. Instead of uniting squares in a grid, a resolution-independent superpixel is defined as a polygon that has straight edges with any possible slope at subpixel resolution. The harder continuous version of the over-segmentation problem is to split an image into polygons and find a best (say, constant) color for each polygon so that the resulting colored mesh approximates the given image well. Such a mesh of polygons can be rendered at any higher resolution with all edges kept straight. We propose a fast conversion of any traditional superpixels into polygons and guarantee that their straight edges do not intersect. The meshes based on the superpixels SEEDS (Superpixels Extracted via Energy-Driven Sampling) and SLIC (Simple Linear Iterative Clustering) are compared with past meshes based on the Line Segment Detector. The experiments on the Berkeley Segmentation Database confirm that the new superpixels have more compact shapes than pixel-based superpixels.
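The "best constant color" step in the abstract above can be illustrated with a minimal sketch. It assumes that "best" means least squared error, in which case the optimal constant color of a region is the mean color of its pixels. The per-pixel label map stands in for polygon membership, and the function names `mean_colors` and `squared_error` are hypothetical, not taken from the paper.

```python
def mean_colors(image, labels):
    """Return {label: mean color} for an image given as nested lists of color tuples.

    Assumption: each pixel carries a region label; a real polygonal mesh would
    instead assign (fractions of) pixels to polygons.
    """
    sums, counts = {}, {}
    for pixel_row, label_row in zip(image, labels):
        for color, label in zip(pixel_row, label_row):
            if label not in sums:
                sums[label] = [0.0] * len(color)
                counts[label] = 0
            for channel, value in enumerate(color):
                sums[label][channel] += value
            counts[label] += 1
    return {lb: tuple(s / counts[lb] for s in sums[lb]) for lb in sums}


def squared_error(image, labels, colors):
    """Sum of squared differences between the image and its constant-color approximation."""
    total = 0.0
    for pixel_row, label_row in zip(image, labels):
        for color, label in zip(pixel_row, label_row):
            total += sum((v - m) ** 2 for v, m in zip(color, colors[label]))
    return total


image = [[(0, 0, 0), (2, 2, 2)],
         [(10, 20, 30), (10, 20, 30)]]
labels = [[0, 0],
          [1, 1]]
colors = mean_colors(image, labels)
error = squared_error(image, labels, colors)
```

Any other constant color for a region would give a larger squared error, which is why the mean is a natural baseline for comparing meshes.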

arXiv.org e-Print Archive

University of Liverpool Repository

Crossref

A Persistence-Based Approach to Automatic Detection of Line Segments in Images

Author: A Chernov
D Ballard
D Cohen-Steiner
H Edelsbrunner
J Canny
J Forsythe
P Arbelaez
P Kahn
R Grompone von Gioi
R Tarjan
S Kalisnik
V Kurlin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

University of Liverpool Repository

Crossref

KIPPI: KInetic Polygonal Partitioning of Images

Author: Bauchet Jean-Philippe
Lafarge Florent
Publication venue: HAL CCSD
Publication date: 18/06/2018

Recent works showed that floating polygons can be an interesting alternative to traditional superpixels, especially for analyzing scenes with strong geometric signatures, such as man-made environments. Existing algorithms produce homogeneously sized polygons that fail to capture thin geometric structures and over-partition large uniform areas. We propose a kinetic approach that brings more flexibility in polygon shape and size. The key idea consists in progressively extending pre-detected line segments until they meet each other. Our experiments demonstrate that output partitions both contain fewer polygons and better capture geometric structures than those delivered by existing methods. We also show the applicative potential of the method when used as preprocessing in object contouring.

Crossref

INRIA a CCSD electronic archive server

From light rays to 3D models

Author: Donné Simon
Publication venue: Faculty of Engineering and Architecture Ghent University
Publication date: 01/01/2018

Ghent University Academic Bibliography

Monocular 3d Object Recognition

Author: Zhu Menglong
Publication venue: ScholarlyCommons
Publication date: 01/01/2016

Object recognition is one of the fundamental tasks of computer vision. Recent advances in the field enable reliable 2D detections from a single cluttered image. However, many challenges still remain. Object detection needs a timely response for real-world applications. Moreover, we are genuinely interested in estimating the 3D pose and shape of an object or human for the sake of robotic manipulation and human-robot interaction. In this thesis, a suite of solutions to these challenges is presented. First, Active Deformable Part Models (ADPM) is proposed for fast part-based object detection. ADPM dramatically accelerates the detection by dynamically scheduling the part evaluations and efficiently pruning the image locations. Second, we unleash the power of marrying discriminative 2D parts with an explicit 3D geometric representation. Several methods following this scheme are proposed for recovering rich 3D information of both rigid and non-rigid objects from monocular RGB images. (1) The accurate 3D pose of an object instance is recovered from cluttered images using only the CAD model. (2) A globally optimal solution for simultaneous 2D part localization, 3D pose and shape estimation is obtained by optimizing a unified convex objective function. Both appearance and geometric compatibility are jointly maximized. (3) 3D human pose estimation from an image sequence is realized via an Expectation-Maximization algorithm. The 2D joint location uncertainties are marginalized out during inference and 3D pose smoothness is enforced across frames. By bridging the gap between 2D and 3D, our methods provide an end-to-end solution to 3D object recognition from images. We demonstrate a range of interesting applications using only a single image or a monocular video, including autonomous robotic grasping with a single image, 3D object image pop-up and a monocular human MoCap system. We also show empirical state-of-the-art results on a number of benchmarks for 2D detection and 3D pose and shape estimation.

ScholarlyCommons@Penn

Combining Features and Semantics for Low-level Computer Vision

Author: Güney Fatma
Publication venue: Universität Tübingen
Publication date: 01/01/2018

Visual perception of depth and motion plays a significant role in understanding and navigating the environment. Reconstructing outdoor scenes in 3D and estimating the motion from video cameras are of utmost importance for applications like autonomous driving. The corresponding problems in computer vision have witnessed tremendous progress over the last decades, yet some aspects still remain challenging today. Striking examples are reflecting and textureless surfaces or large motions which cannot be easily recovered using traditional local methods. Further challenges include occlusions, large distortions and difficult lighting conditions. In this thesis, we propose to overcome these challenges by modeling non-local interactions leveraging semantics and contextual information. Firstly, for binocular stereo estimation, we propose to regularize over larger areas on the image using object-category specific disparity proposals which we sample using inverse graphics techniques based on a sparse disparity estimate and a semantic segmentation of the image. The disparity proposals encode the fact that objects of certain categories are not arbitrarily shaped but typically exhibit regular structures. We integrate them as non-local regularizer for the challenging object class 'car' into a superpixel-based graphical model and demonstrate its benefits especially in reflective regions. Secondly, for 3D reconstruction, we leverage the fact that the larger the reconstructed area, the more likely objects of similar type and shape will occur in the scene. This is particularly true for outdoor scenes where buildings and vehicles often suffer from missing texture or reflections, but share similarity in 3D shape. We take advantage of this shape similarity by localizing objects using detectors and jointly reconstructing them while learning a volumetric model of their shape. 
This reduces noise while completing missing surfaces, as objects of similar shape benefit from all observations for the respective category. Evaluations with respect to LIDAR ground-truth on a novel challenging suburban dataset show the advantages of modeling structural dependencies between objects. Finally, motivated by the success of deep learning techniques in matching problems, we present a method for learning context-aware features for solving optical flow using discrete optimization. Towards this goal, we present an efficient way of training a context network with a large receptive field size on top of a local network using dilated convolutions on patches. We perform feature matching by comparing each pixel in the reference image to every pixel in the target image, utilizing fast GPU matrix multiplication. The matching cost volume from the network's output forms the data term for discrete MAP inference in a pairwise Markov random field. Extensive evaluations reveal the importance of context for feature matching.

Publikationsserver der Universität Tübingen

A fast approximate skeleton with guarantees for any cloud of points in a Euclidean space

Author: Elkin Yury
Kurlin Vitaliy
Liu Di
Publication date: 17/07/2020

The tree reconstruction problem is to find an embedded straight-line tree that approximates a given cloud of unorganized points in $\mathbb{R}^m$ up to a certain error. A practical solution to this problem will accelerate a discovery of new colloidal products with desired physical properties such as viscosity. We define the Approximate Skeleton of any finite point cloud $C$ in a Euclidean space with theoretical guarantees. The Approximate Skeleton $\mathrm{ASk}(C)$ always belongs to a given offset of $C$, i.e. the maximum distance from $C$ to $\mathrm{ASk}(C)$ can be a given maximum error. The number of vertices in the Approximate Skeleton is close to the minimum number in an optimal tree by factor 2. The new Approximate Skeleton of any unorganized point cloud $C$ is computed in a near linear time in the number of points in $C$. Finally, the Approximate Skeleton outperforms past skeletonization algorithms on the size and accuracy of reconstruction for a large dataset of real micelles and random clouds
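The error guarantee in the abstract above can be checked numerically. The following minimal sketch follows one literal reading of it: the maximum, over all points of the cloud, of the distance to the nearest edge of a straight-line tree. It does not compute the Approximate Skeleton itself. The helper names `point_segment_distance` and `max_distance_to_tree` and the edge-list representation of the tree are assumptions for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment [a, b] in any dimension."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    length_sq = sum(x * x for x in ab)
    if length_sq == 0:
        t = 0.0  # degenerate segment: a == b
    else:
        # Parameter of the orthogonal projection, clamped to the segment.
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / length_sq))
    closest = [ai + t * x for ai, x in zip(a, ab)]
    return math.dist(p, closest)


def max_distance_to_tree(cloud, edges):
    """Largest distance from a cloud point to its nearest tree edge."""
    return max(
        min(point_segment_distance(p, a, b) for a, b in edges)
        for p in cloud
    )


# Hypothetical toy data: an L-shaped tree with two edges and three cloud points.
edges = [((0.0, 0.0), (2.0, 0.0)), ((2.0, 0.0), (2.0, 2.0))]
cloud = [(1.0, 0.5), (3.0, 1.0), (0.0, 0.0)]
max_error = max_distance_to_tree(cloud, edges)
within_offset = max_error <= 1.0
```

This brute-force check costs time proportional to the number of points times the number of edges, so it is only meant for validating small examples, not for reproducing the near-linear running time stated in the abstract.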

arXiv.org e-Print Archive

University of Liverpool Repository