280 research outputs found

    Robust Object Detection with Interleaved Categorization and Segmentation

    This paper presents a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. Our approach treats object categorization and figure-ground segmentation as two interleaved processes that closely collaborate towards a common goal. As shown in our work, the tight coupling between these two processes allows them to benefit from each other and improves the combined performance. The core of our approach is a highly flexible learned representation for object shape that can combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. The resulting approach can detect categorical objects in novel images and automatically infer a probabilistic segmentation from the recognition result. This segmentation is in turn used to improve recognition by allowing the system to focus its efforts on object pixels and to discard misleading influences from the background. Moreover, the information about where in the image a hypothesis draws its support is employed in an MDL-based hypothesis verification stage to resolve ambiguities between overlapping hypotheses and to factor out the effects of partial occlusion. An extensive evaluation on several large data sets shows that the proposed system is applicable to a range of different object categories, including both rigid and articulated objects. In addition, its flexible representation allows it to achieve competitive object detection performance from training sets that are one to two orders of magnitude smaller than those used in comparable systems.
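    The voting scheme behind a probabilistic Generalized Hough Transform can be illustrated with a minimal sketch (the data below is hypothetical; the actual system learns patch appearances and vote weights from training examples): each matched codebook entry casts a weighted vote for the object center, and the detection hypothesis is the accumulator peak.

```python
from collections import defaultdict

def hough_vote(matches, bin_size=10):
    """Accumulate weighted votes for object centers in a coarse grid.

    matches: list of (feature_xy, offset_xy, weight) -- each matched
    codebook entry votes for a center at feature position + stored offset.
    """
    acc = defaultdict(float)
    for (fx, fy), (ox, oy), w in matches:
        cx, cy = fx + ox, fy + oy
        acc[(int(cx // bin_size), int(cy // bin_size))] += w
    # Detection hypothesis = accumulator cell with the highest vote mass.
    return max(acc.items(), key=lambda kv: kv[1])

# Three patches: two vote consistently for a center near (100, 50),
# one votes for a slightly different cell.
matches = [((90, 40), (10, 10), 0.5),
           ((120, 60), (-20, -10), 0.8),
           ((95, 55), (8, -6), 0.4)]
print(hough_vote(matches))  # -> ((10, 5), 1.3)
```

    In the full system each vote would carry a probability learned from training data, and back-projecting the winning votes yields the per-pixel support used for the segmentation and MDL verification stages.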

    A Review of Codebook Models in Patch-Based Visual Object Recognition

    The codebook model-based approach, while ignoring structural aspects of vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to map low-level features into a fixed-length vector in histogram space, to which standard classifiers can be directly applied. The discriminative power of the visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, usually done by cluster analysis. However, clustering retains regions of high density in a distribution, so the resulting codebook need not have discriminant properties; clustering is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook that constructs a discriminant codebook in a one-pass design procedure, slightly outperforming more traditional approaches at drastically reduced computing times. In this review we survey approaches proposed over the last decade, covering their feature detectors, descriptors, codebook construction schemes, choice of classifiers for recognising objects, and the datasets used to evaluate the proposed methods.
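    The histogram mapping described above can be sketched in a few lines (toy codebook and descriptors; a real system would obtain the codebook by clustering thousands of local descriptors, e.g. with k-means):

```python
def quantize(feature, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((f - c) ** 2
                                 for f, c in zip(feature, codebook[i])))

def bag_of_words(features, codebook):
    """Map a variable-size set of local descriptors to a fixed-length,
    normalized histogram that a standard classifier can consume."""
    hist = [0] * len(codebook)
    for f in features:
        hist[quantize(f, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist]

codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # toy 3-word codebook
features = [(0.1, 0.2), (0.9, 1.1), (0.0, 0.9), (0.2, 0.1)]
print(bag_of_words(features, codebook))  # -> [0.5, 0.25, 0.25]
```

    The codebook size trades model complexity against discriminative power, exactly the tension the review discusses: more codewords give finer histograms but costlier quantization.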

    Energy Based Multi-Model Fitting and Matching Problems

    Feature matching and model fitting are fundamental problems in multi-view geometry. They are chicken-and-egg problems: if the models are known it is easier to find matches, and vice versa. Standard multi-view geometry techniques solve feature matching and model fitting sequentially, as two independent problems, after making fairly restrictive assumptions. For example, matching methods rely on the strong discriminative power of feature descriptors, which fails for stereo images with repetitive textures or a wide baseline; model fitting methods assume given feature matches, which are not known a priori. Moreover, when the data supports multiple models, the fitting problem becomes challenging even with known matches, and current methods commonly resort to heuristics. One of the main contributions of this thesis is a joint formulation of the fitting and matching problems. We are the first to introduce an objective function combining matching and multi-model estimation. We also propose an approximation algorithm for the corresponding NP-hard optimization problem using block-coordinate descent with respect to the matching and model fitting variables. For fixed models, our method uses a min-cost-max-flow algorithm to solve a generalization of the linear assignment problem with label costs (a sparsity constraint). The fixed-matching case reduces to a multi-model fitting subproblem, which is interesting in its own right. In contrast to standard heuristic approaches, we introduce global objective functions for multi-model fitting using various forms of regularization (spatial smoothness and sparsity) and propose a graph-cut based optimization algorithm, PEaRL. Experimental results show that the proposed mathematical formulations and optimization algorithms improve the accuracy and robustness of model estimation over the state of the art in computer vision.
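    The block-coordinate descent idea can be illustrated on a toy multi-line fitting problem (a deliberately simplified sketch: it omits the label costs, spatial smoothness, and min-cost-max-flow matching step of the thesis): alternate between assigning points to the current models and refitting each model to its assigned points.

```python
def fit_line(pts):
    """Least-squares fit of y = a*x + b to a list of (x, y) points."""
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def alternate_fit(points, models, iters=10):
    """Block-coordinate descent: fix models -> assign points;
    fix assignment -> refit each model to its inliers."""
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(len(models)),
                      key=lambda k: (y - models[k][0] * x - models[k][1]) ** 2)
                  for x, y in points]
        models = [fit_line([p for p, l in zip(points, labels) if l == k]
                           or points)  # fall back if a model loses all points
                  for k in range(len(models))]
    return models, labels

# Two noisy lines, y ~ 0 and y ~ 5, starting from rough model guesses.
points = [(0, 0.0), (1, 0.1), (2, -0.1), (3, 0.0),
          (0, 5.0), (1, 5.1), (2, 4.9), (3, 5.0)]
models, labels = alternate_fit(points, [(0.0, 1.0), (0.0, 4.0)])
print(labels)  # -> [0, 0, 0, 0, 1, 1, 1, 1]
```

    Each half-step can only decrease the total residual, which is why the alternation converges; the thesis replaces the greedy assignment with a globally regularized one.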

    Human detection, tracking and segmentation from low-level to high-level vision

    The goal of this research is to detect, segment and track a human body, as well as estimate its limb configuration, against cluttered backgrounds. These are fundamental research issues that have attracted intensive attention in the computer vision community because of their wide applications. They also remain among the most challenging research issues, largely due to the ubiquitous visual ambiguities in images and videos; another challenging factor is the ill-posed nature of the problems. Inspired by recent findings in cognitive psychology, we adopt several biologically plausible approaches to attack these challenging problems. This dissertation provides a comprehensive study of human detection, tracking and segmentation that covers research issues ranging from low- to middle- to high-level vision. In low-level vision, we investigate video segmentation, where the main challenge is the non-convex classification problem, and we develop a cascaded multi-layer segmentation framework in which non-convex classification problems are addressed in a split-and-merge paradigm combining the merits of statistical modeling and graph theory. In middle-level vision, we propose a segmentation-based hypothesis-and-test paradigm to achieve joint localization and segmentation that exploits the complementary nature of region-based and edge-based shape priors. In addition, we integrate both priors into a graph-cut framework to improve the segmentation results. In high-level vision, our research has two related parts. First, we propose a hybrid body representation that embraces part-whole shape priors and a part-based spatial prior for integrated pose recognition, localization and segmentation in a given image. Second, we further combine spatial and temporal priors in an integrated online learning and inference framework, where body parts can be detected, localized and segmented simultaneously from a video sequence. Both parts are supported by the preceding low-level and mid-level vision tasks. Experimental results show that the proposed algorithms achieve accurate and robust tracking, localization and segmentation for different walking subjects with significant appearance and motion variability, against cluttered backgrounds.

    Two and three dimensional segmentation of multimodal imagery

    The role of segmentation in image understanding and analysis, computer vision, pattern recognition, remote sensing and medical imaging has grown significantly in recent years due to accelerated advances in the acquisition of image data. This low-level analysis step is critical to numerous applications; its primary goal is to expedite and improve the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for meaningfully segregating 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. Using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition an initial portion of the input image content. Pixels with higher gradient densities are included by dynamically generating segments as the algorithm progresses, yielding an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, together with the initial partition map, is used in a multivariate refinement procedure that fuses groups with similar characteristics to yield the final segmentation. Experimental results, compared against published and state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
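    The refinement step of such a pipeline can be sketched as union-find merging over the initial region map (a hypothetical single-feature similarity test is used here; the method above fuses regions using gradient, texture and intensity jointly):

```python
def merge_regions(means, adjacency, tol=10.0):
    """Fuse adjacent regions whose mean intensities differ by less than
    `tol` -- a one-feature stand-in for the multivariate refinement test.
    Union-find with path compression tracks merged labels."""
    parent = list(range(len(means)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for a, b in adjacency:
        ra, rb = find(a), find(b)
        if ra != rb and abs(means[ra] - means[rb]) < tol:
            parent[rb] = ra  # note: means are not re-averaged in this sketch
    return [find(i) for i in range(len(means))]

# Four initial regions with mean gray levels; 0-1 and 2-3 are similar pairs.
means = [50.0, 55.0, 200.0, 204.0]
adjacency = [(0, 1), (1, 2), (2, 3)]
print(merge_regions(means, adjacency))  # -> [0, 0, 2, 2]
```

    A full implementation would re-estimate each merged region's statistics after every fusion; the sketch keeps the original means to stay short.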

    Using contour information and segmentation for object registration, modeling and retrieval

    This thesis considers different aspects of the use of contour information and of syntactic and semantic image segmentation for object registration, modeling and retrieval, in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts, and shape registration. The thesis also explores the feasibility of contour-based syntactic features for improving the correspondence between the output of bottom-up segmentation and the semantic objects present in the scene, and discusses different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models, or semi-automatic segmentation in selected application scenarios. There are three contributions in this thesis. The first considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach has been analyzed through extensive, rigorous experimentation using the largest available test collections.
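    A simple contour-based shape feature of the kind such work builds on is the centroid-distance signature, matched under circular shifts for start-point invariance (an illustrative sketch, not the specific features proposed in the thesis):

```python
import math

def signature(contour, n=16):
    """Centroid-distance signature: distances from the centroid to n evenly
    sampled contour points, normalized by the maximum for scale invariance."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    step = len(contour) / n
    d = [math.hypot(contour[int(i * step)][0] - cx,
                    contour[int(i * step)][1] - cy) for i in range(n)]
    m = max(d)
    return [v / m for v in d]

def shape_distance(sig_a, sig_b):
    """Smallest L1 distance over all circular shifts of sig_b,
    making the match invariant to the contour's start point."""
    n = len(sig_a)
    return min(sum(abs(sig_a[i] - sig_b[(i + s) % n]) for i in range(n))
               for s in range(n))

# An ellipse matched against a rotated-start copy of itself: distance ~0.
ellipse = [(2 * math.cos(2 * math.pi * t / 64), math.sin(2 * math.pi * t / 64))
           for t in range(64)]
sig = signature(ellipse)
print(shape_distance(sig, sig[8:] + sig[:8]) < 1e-9)  # -> True
```

    Signatures like this suit closed silhouettes (one of the target applications above); open contours and occlusions need the richer matching algorithms the thesis studies.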

    Combination of multiple image segmentations

    The thesis concerns the combination of multiple image segmentations in the domains of contour detection and region-based image segmentation. The goal is to combine multiple segmentations into an improved final result. In the case of region-based segmentation combination, a generalized median concept is proposed to automatically determine the final number of regions. Extensive experiments demonstrate that our combination method outperforms a ground-truth-based training approach. In addition, an experimental investigation of existing segmentation evaluation measures is presented, covering their metric properties and evaluation behavior. This study is intended as a guideline for appropriately choosing evaluation measures.
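    A common computable stand-in for the generalized median is the set median: the input segmentation with the smallest summed distance to all others, under a label-permutation-invariant distance such as a Rand-style pair-counting distance (a sketch under that simplification; a true generalized median is searched over the whole combination space):

```python
from itertools import combinations

def rand_distance(seg_a, seg_b):
    """Fraction of pixel pairs on which two label maps disagree about
    'same region vs different region' (invariant to label permutation)."""
    pairs = list(combinations(range(len(seg_a)), 2))
    disagree = sum((seg_a[i] == seg_a[j]) != (seg_b[i] == seg_b[j])
                   for i, j in pairs)
    return disagree / len(pairs)

def set_median(segmentations):
    """Set-median approximation of the generalized median: pick the
    input segmentation minimizing the summed distance to all others."""
    return min(segmentations,
               key=lambda s: sum(rand_distance(s, t) for t in segmentations))

# Five pixels, three candidate segmentations; the third is an outlier.
segs = [[0, 0, 1, 1, 1],
        [1, 1, 0, 0, 0],   # same partition as the first, just relabeled
        [0, 1, 0, 1, 0]]
print(set_median(segs))  # -> [0, 0, 1, 1, 1]
```

    Note that the relabeled copy has distance zero to the first candidate, which is exactly why a pair-counting distance, rather than per-pixel label agreement, is the right tool here.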

    Man-made Surface Structures from Triangulated Point Clouds

    Photogrammetry aims at reconstructing the shape and dimensions of objects captured with cameras, 3D laser scanners or other spatial acquisition systems. While many acquisition techniques deliver triangulated point clouds with millions of vertices within seconds, the interpretation is usually left to the user. Especially when reconstructing man-made objects, one is interested in the underlying surface structure, which is not inherently present in the data. This includes the geometric shape of the object, e.g. cubical or cylindrical, as well as the corresponding surface parameters, e.g. width, height and radius. Applications are manifold and range from industrial production control to architectural on-site measurements to large-scale city models. The goal of this thesis is to automatically derive such surface structures from triangulated 3D point clouds of man-made objects. Surface structures are defined as a compound of planar or curved geometric primitives; model knowledge about typical primitives and the relations between adjacent pairs of them should benefit the reconstruction. After formulating a parametrized model for man-made surface structures, we develop a reconstruction framework with three processing steps: During a fast pre-segmentation exploiting local surface properties, we divide the given surface mesh into planar regions. Using a model selection scheme based on minimizing the description length, this surface segmentation is free of control parameters and automatically yields an optimal number of segments. A subsequent refinement introduces a set of planar or curved geometric primitives and hierarchically merges adjacent regions based on their joint description length. A global classification and constrained parameter estimation combine the data-driven segmentation with high-level model knowledge.
    To this end, we represent the surface structure with a graphical model and formulate factors based on the likelihood as well as on prior knowledge about parameter distributions and class probabilities. We infer the most probable setting of surface and relation classes with belief propagation and estimate an optimal surface parametrization with constraints induced by inter-regional relations. The process is specifically designed to work on noisy data with outliers and a few exceptional freeform regions not describable with geometric primitives. It yields full 3D surface structures with watertightly connected surface primitives of different types. The performance of the proposed framework is experimentally evaluated on various data sets. On small synthetically generated meshes, we analyze the accuracy of the estimated surface parameters, the sensitivity with respect to various properties of the input data and to model assumptions, as well as the computational complexity. Additionally, we demonstrate the flexibility with respect to different acquisition techniques on real data sets. The proposed method turns out to be accurate, reasonably fast, and relatively insensitive to defects in the data or imprecise model assumptions.
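    The description-length criterion for merging adjacent regions can be sketched with a two-part code: merging pays for one set of primitive parameters instead of two, but may increase the residual coding cost (the constants below are illustrative assumptions, not the exact code lengths derived in the thesis):

```python
import math

def description_length(residuals, n_params, bits_per_param=32.0):
    """Two-part code length for one region: parameter cost plus a data cost
    from the residual variance (Gaussian code length, up to constants)."""
    n = len(residuals)
    var = max(sum(r * r for r in residuals) / n, 1e-12)  # guard zero variance
    return n_params * bits_per_param + 0.5 * n * math.log2(var * 2 * math.pi * math.e)

def should_merge(res_a, res_b, res_joint, n_params=3):
    """Merge two regions if one primitive describing both is cheaper than
    two separate primitives -- the hierarchical merging criterion."""
    separate = (description_length(res_a, n_params)
                + description_length(res_b, n_params))
    return description_length(res_joint, n_params) < separate

# Two nearly coplanar patches: the joint fit is barely worse, so saving
# one parameter set wins. A badly fitting joint plane is rejected.
print(should_merge([0.01] * 50, [0.01] * 50, [0.012] * 100))  # -> True
print(should_merge([0.01] * 50, [0.01] * 50, [0.5] * 100))    # -> False
```

    Because both the segmentation and the merge decisions minimize the same code length, the pipeline needs no hand-tuned thresholds, which is the point of the MDL-based model selection described above.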