
    Disconnected Skeleton: Shape at its Absolute Scale

    We present a new skeletal representation along with a matching framework to address the deformable shape recognition problem. The disconnectedness arises as a result of excessive regularization that we use to describe a shape at an attainably coarse scale. Our motivation is to rely on the stable properties of the shape instead of inaccurately measured secondary details. The new representation does not suffer from the common instability problems of traditional connected skeletons, and the matching process gives quite successful results on a diverse database of 2D shapes. An important difference of our approach from the conventional use of the skeleton is that we replace the local coordinate frame with a global Euclidean frame supported by additional mechanisms to handle articulations and local boundary deformations. As a result, we can produce descriptions that are sensitive to any combination of changes in scale, position, orientation and articulation, as well as invariant ones.
    Comment: The work excluding §V and §VI first appeared in ICCV 2005: Aslan, C., Tari, S.: An Axis-Based Representation for Recognition. In ICCV (2005) 1339-1346; Aslan, C.: Disconnected Skeletons for Shape Recognition. Master's thesis, Department of Computer Engineering, Middle East Technical University, May 200

    Extracting 3D parametric curves from 2D images of Helical objects

    Helical objects occur in medicine, biology, cosmetics, nanotechnology, and engineering. Extracting a 3D parametric curve from a 2D image of a helical object has many practical applications, in particular the extraction of metrics such as tortuosity, frequency, and pitch. We present a method that is able to straighten the image object and derive a robust 3D helical curve from peaks in the object boundary. The algorithm has a small number of stable parameters that require little tuning, and the curve is validated against both synthetic and real-world data. The results show that the extracted 3D curve lies within a small Hausdorff distance of the ground truth, and has near-identical tortuosity for helical objects with a circular profile. Parameter insensitivity and robustness against high levels of image noise are demonstrated thoroughly and quantitatively.
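The quantities named in this abstract can be illustrated with a small sketch. It is not the paper's method: the function names, the sampling scheme, and the choice of the arc-chord ratio as the tortuosity measure are all assumptions made for illustration. The code samples a circular helix as a 3D parametric curve, computes its tortuosity, and compares two sampled curves with the symmetric Hausdorff distance.

```python
import math

def helix(radius, pitch, turns, samples=400):
    """Sample a circular helix along the z axis; pitch is the rise per full turn."""
    points = []
    for i in range(samples + 1):
        t = 2 * math.pi * turns * i / samples
        points.append((radius * math.cos(t),
                       radius * math.sin(t),
                       pitch * t / (2 * math.pi)))
    return points

def arc_length(points):
    """Length of the polyline through the sampled points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def tortuosity(points):
    """Arc-chord ratio (an assumed definition): path length over end-to-end distance."""
    return arc_length(points) / math.dist(points[0], points[-1])

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets (brute force)."""
    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))

if __name__ == "__main__":
    truth = helix(radius=1.0, pitch=2.0, turns=3)
    estimate = [(x, y, z + 0.1) for x, y, z in truth]  # hypothetical extracted curve
    print(tortuosity(truth))
    print(hausdorff(truth, estimate))
```

The brute-force Hausdorff computation is quadratic in the number of samples, which is acceptable for a few hundred points per curve.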

    Image Segmentation using Human Visual System Properties with Applications in Image Compression

    In order to represent a digital image, a very large number of bits is required. For example, a 512 × 512 pixel, 256 gray level image requires over two million bits. This large number of bits is a substantial drawback when it is necessary to store or transmit a digital image. Image compression, often referred to as image coding, attempts to reduce the number of bits used to represent an image, while keeping the degradation in the decoded image to a minimum. One approach to image compression is segmentation-based image compression. The image to be compressed is segmented, i.e. the pixels in the image are divided into mutually exclusive spatial regions based on some criteria. Once the image has been segmented, information is extracted describing the shapes and interiors of the image segments. Compression is achieved by efficiently representing the image segments. In this thesis we propose an image segmentation technique which is based on centroid-linkage region growing, and takes advantage of human visual system (HVS) properties. We systematically determine through subjective experiments the parameters for our segmentation algorithm which produce the most visually pleasing segmented images, and demonstrate the effectiveness of our method. We also propose a method for the quantization of segmented images based on HVS contrast sensitivity, and investigate the effect of quantization on segmented images.
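The "over two million bits" figure in this abstract follows from simple arithmetic: 256 gray levels need 8 bits per pixel, and 512 × 512 × 8 = 2,097,152. A minimal sketch of that calculation follows; the function name `raw_bits` is only illustrative.

```python
def raw_bits(width: int, height: int, gray_levels: int) -> int:
    """Bits needed to store an uncompressed image with the given number of gray levels."""
    # Smallest number of bits that can index every gray level (256 levels -> 8 bits).
    bits_per_pixel = (gray_levels - 1).bit_length()
    return width * height * bits_per_pixel

if __name__ == "__main__":
    print(raw_bits(512, 512, 256))  # 2097152
```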

    Shape representation and coding of visual objects in multimedia applications — An overview

    Emerging multimedia applications have created the need for new functionalities in digital communications. Whereas existing compression standards only deal with the audio-visual scene at a frame level, it is now necessary to handle individual objects separately, thus allowing scalable transmission as well as interactive scene recomposition by the receiver. The future MPEG-4 standard aims at providing compression tools addressing these functionalities. Unlike existing frame-based standards, the corresponding coding schemes need to encode shape information explicitly. This paper reviews existing solutions to the problem of shape representation and coding. Region and contour coding techniques are presented and their performance is discussed, considering coding efficiency and rate-distortion control capability, as well as flexibility to application requirements such as progressive transmission, low-delay coding, and error robustness.

    Robust 3D Action Recognition through Sampling Local Appearances and Global Distributions

    3D action recognition has broad applications in human-computer interaction and intelligent surveillance. However, recognizing similar actions remains challenging since previous literature fails to capture motion and shape cues effectively from noisy depth data. In this paper, we propose a novel two-layer Bag-of-Visual-Words (BoVW) model, which suppresses the noise disturbances and jointly encodes both motion and shape cues. First, background clutter is removed by a background modeling method that is designed for depth data. Then, motion and shape cues are jointly used to generate robust and distinctive spatial-temporal interest points (STIPs): motion-based STIPs and shape-based STIPs. In the first layer of our model, a multi-scale 3D local steering kernel (M3DLSK) descriptor is proposed to describe local appearances of cuboids around motion-based STIPs. In the second layer, a spatial-temporal vector (STV) descriptor is proposed to describe the spatial-temporal distributions of shape-based STIPs. Using the Bag-of-Visual-Words (BoVW) model, motion and shape cues are combined to form a fused action representation. Our model performs favorably compared with common STIP detection and description methods. Thorough experiments verify that our model is effective in distinguishing similar actions and robust to background clutter, partial occlusions and pepper noise

    Multiresolution signal decomposition schemes

    [PNA-R9810] Interest in multiresolution techniques for signal processing and analysis is increasing steadily. An important instance of such a technique is the so-called pyramid decomposition scheme. This report proposes a general axiomatic pyramid decomposition scheme for signal analysis and synthesis. This scheme comprises the following ingredients: (i) the pyramid consists of a (finite or infinite) number of levels such that the information content decreases towards higher levels; (ii) each step towards a higher level is constituted by an (information-reducing) analysis operator, whereas each step towards a lower level is modeled by an (information-preserving) synthesis operator. One basic assumption is necessary: synthesis followed by analysis yields the identity operator, meaning that no information is lost by these two consecutive steps. In this report, several examples are described of linear as well as nonlinear (e.g., morphological) pyramid decomposition schemes. Some of these examples are known from the literature (Laplacian pyramid, morphological granulometries, skeleton decomposition) and some of them are new (morphological Haar pyramid, median pyramid). Furthermore, the report makes a distinction between single-scale and multiscale decomposition schemes (i.e. without or with sample reduction).

    [PNA-R9905] In its original form, the wavelet transform is a linear tool. However, it has been increasingly recognized that nonlinear extensions are possible. A major impulse to the development of nonlinea
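The pyramid condition stated in the first abstract (synthesis followed by analysis yields the identity) can be checked on a toy 1-D example. The operators below are an assumption chosen for illustration, in the spirit of the morphological Haar pyramid the report names: analysis takes the minimum of each adjacent pair, and synthesis repeats each sample twice. The input is assumed to have even length.

```python
def analysis(signal):
    """One level up: keep the minimum of each adjacent pair (halves the length)."""
    return [min(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]

def synthesis(coarse):
    """One level down: repeat every sample twice (doubles the length)."""
    return [value for value in coarse for _ in range(2)]

def detail(signal):
    """Residual that the coarse level alone cannot reproduce."""
    approximation = synthesis(analysis(signal))
    return [s - a for s, a in zip(signal, approximation)]

if __name__ == "__main__":
    x = [5, 3, 8, 8, 2, 7, 4, 1]
    coarse = analysis(x)
    # Pyramid condition: analysis after synthesis returns the coarse signal unchanged.
    assert analysis(synthesis(coarse)) == coarse
    # The coarse signal plus the detail signal reconstructs the input exactly.
    restored = [a + d for a, d in zip(synthesis(coarse), detail(x))]
    assert restored == x
    print(coarse, detail(x))
```

The reverse composition, analysis followed by synthesis, is generally not the identity, which is why a detail signal has to be stored at each level for exact reconstruction.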

    Morphological operations in image processing and analysis

    Morphological operations applied in image processing and analysis are becoming increasingly important in today's technology. Morphological operations, which are based on set theory, can extract object features by means of suitable shapes (structuring elements). Morphological filters are combinations of morphological operations that transform an image into a quantitative description of its geometrical structure based on structuring elements. Important applications of morphological operations are shape description, shape recognition, nonlinear filtering, industrial parts inspection, and medical image processing. In this dissertation, basic morphological operations are reviewed, and algorithms and theorems are presented for solving problems in distance transformation, skeletonization, recognition, and nonlinear filtering. A skeletonization algorithm using the maxima-tracking method is introduced to generate a connected skeleton. A modified algorithm is proposed to eliminate non-significant short branches. Back propagation morphology is introduced to reach the roots of morphological filters in only two scans. The definitions and properties of back propagation morphology are discussed. The two-scan distance transformation is proposed to illustrate the advantage of this new definition. The G-spectrum (geometric spectrum), which is based upon the cardinality of a set of non-overlapping segments in an image using morphological operations, is presented as a useful tool not only for shape description but also for shape recognition. The G-spectrum is proven to be translation-, rotation-, and scaling-invariant. The shape likeliness based on the G-spectrum is defined as a measurement in shape recognition. Experimental results are also illustrated. Soft morphological operations, which are found to be less sensitive to additive noise and to small variations, are combinations of order statistic and morphological operations. Soft morphological operations commute with thresholding and obey threshold superposition. This threshold decomposition property allows gray-scale signals to be decomposed into binary signals, which can be processed in parallel by logic gates alone; the binary results can then be combined to produce the equivalent output. Thus the implementation and analysis of function-processing soft morphological operations can be done by focusing only on the case of sets, which not only are much easier to deal with, because their definitions involve only counting points instead of sorting numbers, but also allow logic-gate implementation and a parallel pipelined architecture leading to real-time implementation. In general, soft opening and closing are not idempotent operations, but under some constraints they can be idempotent, and the proof is given. The idempotence property suggests how to choose the structuring element sets and the value of the index such that the soft morphological filters reach the root signals without iterations. Finally, a summary and future research directions are provided.
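The threshold decomposition property described in this abstract can be demonstrated on a 1-D signal. The sketch below uses a plain rank-order (order statistic) filter as a stand-in, not the dissertation's soft morphological operations; the function names, the window shape, and the edge handling are assumptions. It shows that filtering each binary threshold slice by counting alone, then summing the results, gives the same output as sorting the gray-scale values directly.

```python
def rank_filter(signal, k, radius=1):
    """k-th smallest value (1-based) in each sliding window; edges are replicated."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sorted(window)[k - 1])
    return out

def binary_rank_filter(bits, k, radius=1):
    """The same filter on a 0/1 signal, using only counting (no sorting)."""
    n = len(bits)
    size = 2 * radius + 1
    out = []
    for i in range(n):
        ones = sum(bits[min(max(j, 0), n - 1)]
                   for j in range(i - radius, i + radius + 1))
        # The k-th smallest value is 1 exactly when fewer than k values are 0.
        out.append(1 if size - ones < k else 0)
    return out

def via_threshold_decomposition(signal, k, radius=1):
    """Filter every binary threshold slice, then sum the binary outputs."""
    # Assumes non-negative integer samples.
    out = [0] * len(signal)
    for level in range(1, max(signal) + 1):
        bits = [1 if value >= level else 0 for value in signal]
        filtered = binary_rank_filter(bits, k, radius)
        out = [a + b for a, b in zip(out, filtered)]
    return out

if __name__ == "__main__":
    x = [3, 0, 2, 5, 1, 4, 4, 0]
    for k in (1, 2, 3):  # minimum, median, maximum of a 3-sample window
        assert rank_filter(x, k) == via_threshold_decomposition(x, k)
    print(rank_filter(x, 2))
```

Each binary slice is independent of the others, which is what makes the parallel, logic-gate style of implementation mentioned in the abstract possible.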

    A Model-Based Approach for Compression of Fingerprint Images

    We propose a new fingerprint image compression scheme based on the hybrid model of an image. Our scheme uses the essential steps of a typical automated fingerprint identification system (AFIS), such as enhancement, binarization and thinning, to encode fingerprint images. The decoding process is based on reconstructing a hybrid surface by using the gray values on ridges and valleys. In this compression scheme, the ridge skeleton is coded efficiently by using differential chain codes. The valley skeleton is derived from the ridge skeleton, and the gray values along the ridge and valley skeletons are encoded using the discrete cosine transform. The error between the original and the replica is also encoded to increase the quality. One advantage of our approach is that original features such as end points and bifurcation points can be extracted directly from the compressed image, even for a very high compression ratio. Another advantage is that the proposed scheme can be integrated into a typical AFIS easily. The algorithm has been applied to various fingerprint images, and high compression ratios such as 63:1 have been obtained. A comparison to wavelet/scalar quantization (WSQ) has also been made.
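The abstract does not spell out how the ridge skeleton is chain coded, so the following is only a generic sketch of differential chain coding with the 8-direction Freeman code. The direction numbering, the function names, and the sample path are assumptions. A one-pixel-wide path is turned into direction codes, and each code after the first is replaced by its difference from the previous one modulo 8; smooth ridges then produce mostly small values, which compress well under entropy coding.

```python
# Freeman 8-direction steps, indexed 0..7 counter-clockwise starting from east
# (x grows to the right, y grows upward). This numbering is an assumed convention.
STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def chain_code(points):
    """Direction code for each step between consecutive 8-connected pixels."""
    return [STEPS.index((x1 - x0, y1 - y0))
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def differential(code):
    """Keep the first code; replace every later code by its change modulo 8."""
    if not code:
        return []
    return [code[0]] + [(b - a) % 8 for a, b in zip(code, code[1:])]

def undo_differential(diff):
    """Recover the absolute direction codes from the differential codes."""
    code = []
    for i, d in enumerate(diff):
        code.append(d if i == 0 else (code[-1] + d) % 8)
    return code

def decode(start, code):
    """Rebuild the pixel path from a start point and absolute direction codes."""
    points = [start]
    for c in code:
        dx, dy = STEPS[c]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points

if __name__ == "__main__":
    ridge = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (4, 3), (5, 3)]  # hypothetical path
    diff = differential(chain_code(ridge))
    assert decode(ridge[0], undo_differential(diff)) == ridge
    print(diff)
```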