119 research outputs found

    Study of object recognition and identification based on shape and texture analysis

    Get PDF
    The objective of object recognition is to enable computers to recognize image patterns without human intervention. According to its applications, it is mainly divided into two parts: recognition of object categories and detection/identification of objects. My thesis studied the techniques of object feature analysis and identification strategies, which solve the object recognition problem by employing effective and perceptually important object features. The shape information is of particular interest and a review of the shape representation and description is presented, as well as the latest research work on object recognition. In the second chapter of the thesis, a novel content-based approach is proposed for efficient shape classification and retrieval of 2D objects. Two object detection approaches, which are designed according to the characteristics of the shape context and SIFT descriptors, respectively, are analyzed and compared. It is found that the identification strategy constructed on a single type of object feature is only able to recognize the target object under specific conditions which the identifier is adapted to. These identifiers are usually designed to detect the target objects which are rich in the feature type captured by the identifier. In addition, this type of feature often distinguishes the target object from the complex scene. To overcome this constraint, a novel prototyped-based object identification method is presented to detect the target object in the complex scene by employing different types of descriptors to capture the heterogeneous features. All types of descriptors are modified to meet the requirement of the detection strategy’s framework. Thus this new method is able to describe and identify various kinds of objects whose dominant features are quite different. The identification system employs the cosine similarity to evaluate the resemblance between the prototype image and image windows on the complex scene. Then a ‘resemblance map’ is established with values on each patch representing the likelihood of the target object’s presence. The simulation approved that this novel object detection strategy is efficient, robust and of scale and rotation invariance

    A comprehensive review of fruit and vegetable classification techniques

    Get PDF
    Recent advancements in computer vision have enabled wide-ranging applications in every field of life. One such application area is fresh produce classification, but the classification of fruit and vegetable has proven to be a complex problem and needs to be further developed. Fruit and vegetable classification presents significant challenges due to interclass similarities and irregular intraclass characteristics. Selection of appropriate data acquisition sensors and feature representation approach is also crucial due to the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting but the current state-of-the-art has been developed for limited classes and small datasets. The problem is of a multi-dimensional nature and offers significantly hyperdimensional features, which is one of the major challenges with current machine learning approaches. Substantial research has been conducted for the design and analysis of classifiers for hyperdimensional features which require significant computational power to optimise with such features. In recent years numerous machine learning techniques for example, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) have been exploited with many different feature description methods for fruit and vegetable classification in many real-life applications. This paper presents a critical comparison of different state-of-the-art computer vision methods proposed by researchers for classifying fruit and vegetable

    An Evaluation of Popular Copy-Move Forgery Detection Approaches

    Full text link
    A copy-move forgery is created by copying and pasting content within the same image, and potentially post-processing it. In recent years, the detection of copy-move forgeries has become one of the most actively researched topics in blind image forensics. A considerable number of different algorithms have been proposed focusing on different types of postprocessed copies. In this paper, we aim to answer which copy-move forgery detection algorithms and processing steps (e.g., matching, filtering, outlier detection, affine transformation estimation) perform best in various postprocessing scenarios. The focus of our analysis is to evaluate the performance of previously proposed feature sets. We achieve this by casting existing algorithms in a common pipeline. In this paper, we examined the 15 most prominent feature sets. We analyzed the detection performance on a per-image basis and on a per-pixel basis. We created a challenging real-world copy-move dataset, and a software framework for systematic image manipulation. Experiments show, that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA and Zernike features perform very well. These feature sets exhibit the best robustness against various noise sources and downsampling, while reliably identifying the copied regions.Comment: Main paper: 14 pages, supplemental material: 12 pages, main paper appeared in IEEE Transaction on Information Forensics and Securit

    On incorporating inductive biases into deep neural networks

    Get PDF
    A machine learning (ML) algorithm can be interpreted as a system that learns to capture patterns in data distributions. Before the modern \emph{deep learning era}, emulating the human brain, the use of structured representations and strong inductive bias have been prevalent in building ML models, partly due to the expensive computational resources and the limited availability of data. On the contrary, armed with increasingly cheaper hardware and abundant data, deep learning has made unprecedented progress during the past decade, showcasing incredible performance on a diverse set of ML tasks. In contrast to \emph{classical ML} models, the latter seeks to minimize structured representations and inductive bias when learning, implicitly favoring the flexibility of learning over manual intervention. Despite the impressive performance, attention is being drawn towards enhancing the (relatively) weaker areas of deep models such as learning with limited resources, robustness, minimal overhead to realize simple relationships, and ability to generalize the learned representations beyond the training conditions, which were (arguably) the forte of classical ML. Consequently, a recent hybrid trend is surfacing that aims to blend structured representations and substantial inductive bias into deep models, with the hope of improving them. Based on the above motivation, this thesis investigates methods to improve the performance of deep models using inductive bias and structured representations across multiple problem domains. To this end, we inject a priori knowledge into deep models in the form of enhanced feature extraction techniques, geometrical priors, engineered features, and optimization constraints. Especially, we show that by leveraging the prior knowledge about the task in hand and the structure of data, the performance of deep learning models can be significantly elevated. We begin by exploring equivariant representation learning. In general, the real-world observations are prone to fundamental transformations (e.g., translation, rotation), and deep models typically demand expensive data-augmentations and a high number of filters to tackle such variance. In comparison, carefully designed equivariant filters possess this ability by nature. Henceforth, we propose a novel \emph{volumetric convolution} operation that can convolve arbitrary functions in the unit-ball (B3\mathbb{B}^3) while preserving rotational equivariance by projecting the input data onto the Zernike basis. We conduct extensive experiments and show that our formulations can be used to construct significantly cheaper ML models. Next, we study generative modeling of 3D objects and propose a principled approach to synthesize 3D point-clouds in the spectral-domain by obtaining a structured representation of 3D points as functions on the unit sphere (S2\mathbb{S}^2). Using the prior knowledge about the spectral moments and the output data manifold, we design an architecture that can maximally utilize the information in the inputs and generate high-resolution point-clouds with minimal computational overhead. Finally, we propose a framework to build normalizing flows (NF) based on increasing triangular maps and Bernstein-type polynomials. Compared to the existing NF approaches, our framework consists of favorable characteristics for fusing inductive bias within the model i.e., theoretical upper bounds for the approximation error, robustness, higher interpretability, suitability for compactly supported densities, and the ability to employ higher degree polynomials without training instability. Most importantly, we present a constructive universality proof, which permits us to analytically derive the optimal model coefficients for known transformations without training

    Efficient sketch-based 3D character modelling.

    Get PDF
    Sketch-based modelling (SBM) has undergone substantial research over the past two decades. In the early days, researchers aimed at developing techniques useful for modelling of architectural and mechanical models through sketching. With the advancement of technology used in designing visual effects for film, TV and games, the demand for highly realistic 3D character models has skyrocketed. To allow artists to create 3D character models quickly, researchers have proposed several techniques for efficient character modelling from sketched feature curves. Moreover several research groups have developed 3D shape databases to retrieve 3D models from sketched inputs. Unfortunately, the current state of the art in sketch-based organic modelling (3D character modelling) contains a lot of gaps and limitations. To bridge the gaps and improve the current sketch-based modelling techniques, this research aims to develop an approach allowing direct and interactive modelling of 3D characters from sketched feature curves, and also make use of 3D shape databases to guide the artist to create his / her desired models. The research involved finding a fusion between 3D shape retrieval, shape manipulation, and shape reconstruction / generation techniques backed by an extensive literature review, experimentation and results. The outcome of this research involved devising a novel and improved technique for sketch-based modelling, the creation of a software interface that allows the artist to quickly and easily create realistic 3D character models with comparatively less effort and learning. The proposed research work provides the tools to draw 3D shape primitives and manipulate them using simple gestures which leads to a better modelling experience than the existing state of the art SBM systems

    Shape Retrieval Methods for Architectural 3D Models

    Get PDF
    This thesis introduces new methods for content-based retrieval of architecture-related 3D models. We thereby consider two different overall types of architectural 3D models. The first type consists of context objects that are used for detailed design and decoration of 3D building model drafts. This includes e.g. furnishing for interior design or barriers and fences for forming the exterior environment. The second type consists of actual building models. To enable efficient content-based retrieval for both model types that is tailored to the user requirements of the architectural domain, type-specific algorithms must be developed. On the one hand, context objects like furnishing that provide similar functions (e.g. seating furniture) often share a similar shape. Nevertheless they might be considered to belong to different object classes from an architectural point of view (e.g. armchair, elbow chair, swivel chair). The differentiation is due to small geometric details and is sometimes only obvious to an expert from the domain. Building models on the other hand are often distinguished according to the underlying floor- and room plans. Topological floor plan properties for example serve as a starting point for telling apart residential and commercial buildings. The first contribution of this thesis is a new meta descriptor for 3D retrieval that combines different types of local shape descriptors using a supervised learning approach. The approach enables the differentiation of object classes according to small geometric details and at the same time integrates expert knowledge from the field of architecture. We evaluate our approach using a database containing arbitrary 3D models as well as on one that only consists of models from the architectural domain. We then further extend our approach by adding a sophisticated shape descriptor localization strategy. Additionally, we exploit knowledge about the spatial relationship of object components to further enhance the retrieval performance. In the second part of the thesis we introduce attributed room connectivity graphs (RCGs) as a means to characterize a 3D building model according to the structure of its underlying floor plans. We first describe how RCGs are inferred from a given building model and discuss how substructures of this graph can be queried efficiently. We then introduce a new descriptor denoted as Bag-of-Attributed-Subgraphs that transforms attributed graphs into a vector-based representation using subgraph embeddings. We finally evaluate the retrieval performance of this new method on a database consisting of building models with different floor plan types. All methods presented in this thesis are aimed at an as automated as possible workflow for indexing and retrieval such that only minimum human interaction is required. Accordingly, only polygon soups are required as inputs which do not need to be manually repaired or structured. Human effort is only needed for offline groundtruth generation to enable supervised learning and for providing information about the orientation of building models and the unit of measurement used for modeling

    From 3D Point Clouds to Pose-Normalised Depth Maps

    Get PDF
    We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in a relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, (iv) generation of either a pose aligned or a pose normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface and this is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data)

    Biological image analysis

    Get PDF
    In biological research images are extensively used to monitor growth, dynamics and changes in biological specimen, such as cells or plants. Many of these images are used solely for observation or are manually annotated by an expert. In this dissertation we discuss several methods to automate the annotating and analysis of bio-images. Two large clusters of methods have been investigated and developed. A first set of methods focuses on the automatic delineation of relevant objects in bio-images, such as individual cells in microscopic images. Since these methods should be useful for many different applications, e.g. to detect and delineate different objects (cells, plants, leafs, ...) in different types of images (different types of microscopes, regular colour photographs, ...), the methods should be easy to adjust. Therefore we developed a methodology relying on probability theory, where all required parameters can easily be estimated by a biologist, without requiring any knowledge on the techniques used in the actual software. A second cluster of investigated techniques focuses on the analysis of shapes. By defining new features that describe shapes, we are able to automatically classify shapes, retrieve similar shapes from a database and even analyse how an object deforms through time
    • …
    corecore