
    Fast algorithm for the 3-D DCT-II

    Recently, many applications for three-dimensional (3-D) image and video compression have been proposed using 3-D discrete cosine transforms (3-D DCTs). Among the different types of DCT, the type-II DCT (DCT-II) is the most widely used. In order to use 3-D DCTs in practical applications, fast 3-D algorithms are essential. Therefore, in this paper, a 3-D vector-radix decimation-in-frequency (3-D VR DIF) algorithm that calculates the 3-D DCT-II directly is introduced. The mathematical analysis and the implementation of the developed algorithm are presented, showing that the algorithm has a regular structure, can be implemented in-place for efficient use of memory, and is faster than the conventional row-column-frame (RCF) approach. Furthermore, a 3-D DCT-II-based video compression application is implemented using the new 3-D algorithm. This leads to a substantial speed improvement for 3-D DCT-II-based compression systems and demonstrates the validity of the developed algorithm.
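
    As a point of reference for the abstract above, a minimal sketch (assuming NumPy and SciPy) of the conventional row-column-frame (RCF) baseline it compares against: the 3-D DCT-II applied separably along each axis of a block. The paper's 3-D VR DIF algorithm computes the same transform directly and is not reproduced here; scipy.fft.dctn would give the same separable result in a single call.

```python
# Sketch of the row-column-frame (RCF) baseline: a separable 3-D DCT-II
# built from 1-D DCT-IIs along each axis. Illustrative only; the paper's
# 3-D VR DIF algorithm computes the transform directly (non-separably).
import numpy as np
from scipy.fft import dct

def dct3_rcf(block: np.ndarray) -> np.ndarray:
    """3-D DCT-II of a cube (e.g. an 8x8x8 video block) via the RCF approach."""
    out = dct(block, type=2, norm="ortho", axis=0)  # along rows
    out = dct(out, type=2, norm="ortho", axis=1)    # then columns
    out = dct(out, type=2, norm="ortho", axis=2)    # then frames
    return out

# Example: transform one 8x8x8 spatio-temporal block.
coeffs = dct3_rcf(np.random.rand(8, 8, 8))
```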

    Digital Signal Processing

    Contains an introduction and reports on fourteen research projects. Funding: National Science Foundation Fellowship; National Science Foundation (Grant ECS84-07285); U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742); Sanders Associates, Inc.; U.S. Air Force - Office of Scientific Research (Contract F19628-85-K-0028); Advanced Television Research Program; Amoco Foundation Fellowship; Hertz Foundation Fellowship.

    Improved Fourier Mellin Invariant for Robust Rotation Estimation with Omni-cameras

    Spectral methods such as the improved Fourier Mellin Invariant (iFMI) transform have proved faster, more robust and more accurate than feature-based methods for image registration. However, iFMI is restricted to cases where the camera moves in 2D space and has not been applied to omni-camera images so far. In this work, we extend the iFMI method and apply a motion model to estimate an omni-camera's pose when it moves in 3D space. This is particularly useful in field robotics applications to get a rapid and comprehensive view of unstructured environments, and to estimate the robot pose robustly. In the experimental section, we compare the extended iFMI method against the ORB and AKAZE feature-based approaches on three datasets showing different types of environments: office, lawn and urban scenery (MPI-omni dataset). The results show that our method improves the accuracy of robot pose estimation two- to four-fold with respect to the feature registration techniques, while offering lower processing times. Furthermore, the iFMI approach gives the best performance against the motion blur typically present in mobile robotics. Comment: 5 pages, 4 figures, 1 table.
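
    To make the registration idea above concrete, a minimal sketch (assuming OpenCV and NumPy) of the classical Fourier-Mellin rotation estimate that iFMI builds on: rotation between two images appears as a shift along the angular axis of the log-polar resampled Fourier magnitude, recovered here by phase correlation. The paper's omni-camera extension and 3-D motion model are not reproduced.

```python
# Classical Fourier-Mellin rotation estimation between two grayscale images.
# Illustrative sketch only; constants such as the log-polar radius are
# arbitrary choices, not taken from the paper.
import cv2
import numpy as np

def log_polar_spectrum(img: np.ndarray) -> np.ndarray:
    """Log-polar resampling of the centred Fourier magnitude spectrum."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img.astype(np.float32))))
    h, w = mag.shape
    return cv2.warpPolar(mag, (w, h), (w / 2.0, h / 2.0),
                         min(h, w) / 2.0,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)

def estimate_rotation_deg(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """In-plane rotation (degrees) between two images of the same scene."""
    lp_a = log_polar_spectrum(img_a).astype(np.float64)
    lp_b = log_polar_spectrum(img_b).astype(np.float64)
    (_, dy), _ = cv2.phaseCorrelate(lp_a, lp_b)
    # The angular axis runs along the rows; one full turn spans the height.
    return 360.0 * dy / lp_a.shape[0]
```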

    Hypercomplex Spectral Signal Representations for the Processing and Analysis of Images

    In the present work, hypercomplex spectral methods for the processing and analysis of images are introduced. The thesis is divided into three main chapters. First, the quaternionic Fourier transform (QFT) for 2D signals is presented and its main properties are investigated. The QFT is closely related to the 2D Fourier transform and to the 2D Hartley transform. Similarities and differences between these three transforms are investigated, with special emphasis on their symmetry properties. The Clifford Fourier transform is presented as an nD generalization of the QFT. Secondly, the concept of the phase of a signal is considered. We distinguish the global, the local and the instantaneous phase of a signal. It is shown how these 1D concepts can be extended to 2D using the QFT. In order to extend the concept of global phase, we introduce the notion of the quaternionic analytic signal of a real signal. Defining quaternionic Gabor filters leads to the definition of the local quaternionic phase. The relation between signal structure and local signal phase, which is well known in 1D, is extended to 2D using the quaternionic phase. In the third part, two applications of the theory are presented. For the image processing tasks of disparity estimation and texture segmentation, there exist approaches based on the (complex) local phase. These methods are extended to the use of the quaternionic phase. In both cases the properties of the complex approaches are preserved, while new features are added by using the quaternionic phase.
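
    For reference, the two-sided quaternionic Fourier transform of a real 2-D signal in the form commonly used in this line of work (the exact convention in the thesis may differ slightly):

```latex
% Two-sided QFT of a real 2-D signal f(x,y); i and j are quaternion units
% that do not commute, so the two exponentials must stay on opposite sides.
F^{q}(u,v) = \int_{\mathbb{R}^{2}}
    e^{-i\,2\pi u x}\, f(x,y)\, e^{-j\,2\pi v y}\; \mathrm{d}x\,\mathrm{d}y
```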

    Digital Signal Processing

    Contains an introduction and reports on fifteen research projects. Funding: U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742); U.S. Navy - Office of Naval Research (Contract N00014-77-C-0266); National Science Foundation (Grant ECS80-07102); National Science Foundation (Grant ECS84-07285); Amoco Foundation Fellowship; Sanders Associates, Inc.; Advanced Television Research Program; M.I.T. Vinton Hayes Fellowship; Hertz Foundation Fellowship.

    Automatic face recognition using stereo images

    Face recognition is an important pattern recognition problem in the study of both natural and artificial learning. Compared to other biometrics, it is non-intrusive, non-invasive and requires no participation from the subjects. As a result, it has many applications, from human-computer interaction to access control and law enforcement to crowd surveillance. In typical optical image based face recognition systems, the systematic variability arising from representing the three-dimensional (3D) shape of a face by a two-dimensional (2D) illumination intensity matrix is treated as random variability. Multiple examples of the face displaying varying pose and expressions are captured in different imaging conditions. The imaging environment, pose and expressions are strictly controlled and the images undergo rigorous normalisation and pre-processing. This may be implemented in a partially or a fully automated system. Although these systems report high classification accuracies (>90%), they lack versatility and tend to fail when deployed outside laboratory conditions. Recently, more sophisticated 3D face recognition systems harnessing the depth information have emerged. These systems usually employ specialist equipment such as laser scanners and structured light projectors. Although more accurate than 2D optical image based recognition, these systems are equally difficult to implement in a non-co-operative environment. Existing face recognition systems, both 2D and 3D, detract from the main advantages of face recognition and fail to fully exploit its non-intrusive capacity. This is either because they rely too much on subject co-operation, which is not always available, or because they cannot cope with noisy data. The main objective of this work was to investigate the role of depth information in face recognition in a noisy environment. A stereo-based system, inspired by human binocular vision, was devised using a pair of manually calibrated digital off-the-shelf cameras in a stereo setup to compute depth information. Depth values extracted from 2D intensity images using stereoscopy are extremely noisy, and as a result this approach to face recognition is rare. This was confirmed by the results of our experimental work. Noise in the set of correspondences, camera calibration and triangulation led to inaccurate depth reconstruction, which in turn led to poor classifier accuracy for both 3D surface matching and 2½D depth maps. Recognition experiments are performed on the Sheffield Dataset, consisting of 692 images of 22 individuals with varying pose, illumination and expressions.
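
    A minimal sketch (assuming OpenCV and NumPy, with placeholder calibration values) of the standard stereo pipeline the thesis builds on: block-matching disparity on a rectified pair, then depth from the pinhole relation Z = f·B/d. The thesis' own correspondence and triangulation stages are not reproduced here.

```python
# Disparity and depth from a rectified, calibrated stereo pair.
# focal_px and baseline_m are hypothetical placeholder values.
import cv2
import numpy as np

def depth_from_stereo(left_gray: np.ndarray, right_gray: np.ndarray,
                      focal_px: float = 1200.0,
                      baseline_m: float = 0.12) -> np.ndarray:
    """Return a depth map in metres from an 8-bit rectified grayscale pair."""
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disparity[disparity <= 0] = np.nan        # no reliable correspondence
    return focal_px * baseline_m / disparity  # Z = f * B / d
```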

    Content based image pose manipulation

    This thesis proposes the application of space-frequency transformations to the domain of pose estimation in images. This idea is explored using the Wavelet Transform, with illustrative applications in pose estimation for face images and images of planar scenes. The approach is based on examining the spatial frequency components in an image to allow the inherent scene symmetry balance to be recovered. For face images with restricted pose variation (looking left or right), an algorithm is proposed to maximise this symmetry in order to transform the image into a fronto-parallel pose. This scheme is further employed to identify the optimal frontal facial pose from a video sequence, in order to automate facial capture processes. These features are an important prerequisite in facial recognition and expression classification systems. The underlying principles of this spatial-frequency approach are examined with respect to images of planar scenes. Using the Continuous Wavelet Transform, full perspective planar transformations are estimated within a featureless framework. Restoring central symmetry to the wavelet-transformed images in an iterative optimisation scheme removes this perspective pose. This advances upon existing spatial approaches that require segmentation and feature matching, and upon frequency-only techniques that are limited to affine transformation recovery. To evaluate the proposed techniques, the pose of a database of subjects portraying varying yaw orientations is estimated and the accuracy is measured against the captured ground truth information. Additionally, full perspective homographies for synthesised and imaged textured planes are estimated. Experimental results are presented for both situations and compare favourably with existing techniques in the literature.
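
    To illustrate the symmetry-balance idea described above, a deliberately simplified sketch (assuming PyWavelets and NumPy): score the left-right imbalance of a wavelet detail subband, with lower scores suggesting a more frontal pose. The thesis itself uses the Continuous Wavelet Transform inside an iterative optimisation, which this toy score does not reproduce.

```python
# Toy left-right symmetry score in the wavelet domain; illustrative only.
import numpy as np
import pywt

def symmetry_score(face_gray: np.ndarray) -> float:
    """Lower values indicate a more left-right symmetric (more frontal) face."""
    _, (cH, cV, cD) = pywt.dwt2(face_gray.astype(np.float32), "haar")
    half = cV.shape[1] // 2
    left = np.abs(cV[:, :half])
    right = np.abs(np.fliplr(cV[:, -half:]))
    # Energy imbalance between mirrored halves of the vertical-detail subband.
    return float(np.mean((left - right) ** 2))
```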