11 research outputs found

    Evaluation of Data Representation Models in Object Detection and Recognition Systems

    This article proposes a classification of data representation models used in systems that detect and recognize objects in visual scenes, with a view to solving practical problems. A set of criteria for evaluating data representation models is proposed for the first time. The application domains of the reviewed methods are given.

    Background Filtering for Improving of Object Detection in Images

    Self-Supervised Clustering for Codebook Construction: An Application to Object Localization

    Approaches to object localization based on codebooks do not exploit the dependencies between appearance and geometric information present in training data. This work addresses the problem of computing a codebook tailored to the task of localization by applying regularization based on geometric information. We present a novel method, Regularized Combined Partitional-Agglomerative clustering, which extends the standard CPA method by adding extra knowledge to the clustering process to preserve as much geometric information as needed. Because of the time complexity of the method, we also present a GPU implementation using NVIDIA CUDA technology, which speeds up the process by a factor of over 100x.

    Local features for visual object matching and video scene detection

    Local features are important building blocks for many computer vision algorithms such as visual object alignment, object recognition, and content-based image retrieval. Local features are extracted from an image by a local feature detector, and the detected features are then encoded using a local feature descriptor. The resulting descriptor-based features, such as histograms or binary strings, are used in matching to find similar features between objects in images. In this thesis, we deal with two research problems in the context of local features for object detection: we extend the original local feature detector and descriptor performance benchmarks from the wide baseline setting to intra-class matching, and we propose local features for consumer video scene boundary detection. In intra-class matching, the visual appearance of objects within one semantic class can be very different (e.g., a Harley Davidson and a scooter in the same motorbike class), which makes the task more difficult than wide baseline matching. The performance of different local feature detectors and descriptors is evaluated over three different image databases, and results for more advanced analysis are reported. In the second part of the thesis, we study the use of Bag-of-Words (BoW) in video scene boundary detection. The literature contains several approaches to this task that exploit local features, but to the author's knowledge, none of them is practical for online processing of user videos. We introduce an online BoW-based scene boundary detector using a dynamic codebook, study the optimal parameters for the detector, and compare our method to existing methods. Precision and recall curves are used as the performance metric. The goal of this thesis is to find the best local feature detector and descriptor for intra-class matching and to develop a novel scene boundary detection method for online applications.

    Human Pose Estimation with Implicit Shape Models

    This work presents a new approach for estimating 3D human poses based on monocular camera information only. To this end, the Implicit Shape Model is augmented with new voting strategies that allow 2D anatomical landmarks to be localized in the image. The actual 3D pose estimation is then formulated as a Particle Swarm Optimization (PSO) in which projected 3D pose hypotheses are compared with the generated landmark vote distributions.

    Visual vocabularies for category-level object recognition

    This thesis focuses on the study of visual vocabularies for category-level object recognition. Specifically, we present novel approaches for building visual codebooks. Our aim is not just to obtain more discriminative and more compact visual codebooks, but to bridge the gap between visual features and semantic concepts. A novel approach for obtaining class-representative visual words is presented. It is based on a maximisation procedure, i.e. the Cluster Precision Maximisation (CPM), of a novel cluster precision criterion, and on an adaptive threshold refinement scheme for agglomerative clustering algorithms based on correlation clustering techniques. The objective is to increase the vocabulary compactness while at the same time improving the recognition rate and further increasing the representativeness of the visual words. Moreover, we describe a novel approach, based on clustering aggregation, for building efficient and semantic visual vocabularies. It consists of a novel framework for incorporating neighboring appearances of local descriptors into the vocabulary construction, and a rigorous approach for adding meaningful spatial coherency among the local features into the visual codebooks. We also propose an efficient high-dimensional data clustering algorithm, the Fast Reciprocal Nearest Neighbours (Fast-RNN). Our approach, which is a sped-up version of the standard RNN algorithm, is based on the projection search paradigm. Finally, we release a new database of images called Image Collection of Annotated Real-world Objects (ICARO), which is especially designed for evaluating category-level object recognition systems. An exhaustive comparison of ICARO with other well-known datasets used within the same context is carried out. We also propose a benchmark for both object classification and detection.

    Human Pose Estimation with Implicit Shape Models

    This doctoral thesis presents a new approach for estimating 3D poses of people based solely on monocular image information. To this end, the Implicit Shape Model is extended with new voting strategies that allow anatomical landmarks to be localized in 2D image space. The subsequent 3D pose estimation problem proper is then formulated as a particle swarm optimization based on the generated vote distributions.
