612 research outputs found

    Simultaneous Codeword Optimization (SimCO) for Dictionary Update and Learning

    Get PDF
    We consider the data-driven dictionary learning problem. The goal is to seek an over-complete dictionary from which every training signal can be best approximated by a linear combination of only a few codewords. This task is often achieved by iteratively executing two operations: sparse coding and dictionary update. In the literature, there are two benchmark mechanisms to update a dictionary. The first approach, such as the MOD algorithm, is characterized by searching for the optimal codewords while fixing the sparse coefficients. In the second approach, represented by the K-SVD method, one codeword and the related sparse coefficients are simultaneously updated while all other codewords and coefficients remain unchanged. We propose a novel framework that generalizes the aforementioned two methods. The unique feature of our approach is that one can update an arbitrary set of codewords and the corresponding sparse coefficients simultaneously: when sparse coefficients are fixed, the underlying optimization problem is similar to that in the MOD algorithm; when only one codeword is selected for update, it can be proved that the proposed algorithm is equivalent to the K-SVD method; and more importantly, our method allows us to update all codewords and all sparse coefficients simultaneously, hence the term simultaneous codeword optimization (SimCO). Under the proposed framework, we design two algorithms, namely, primitive and regularized SimCO. We implement these two algorithms based on a simple gradient descent mechanism. Simulations are provided to demonstrate the performance of the proposed algorithms, as compared with two baseline algorithms MOD and K-SVD. Results show that regularized SimCO is particularly appealing in terms of both learning performance and running speed.Comment: 13 page

    Map online system using internet-based image catalogue

    Get PDF
    Digital maps carry along its geodata information such as coordinate that is important in one particular topographic and thematic map. These geodatas are meaningful especially in military field. Since the maps carry along this information, its makes the size of the images is too big. The bigger size, the bigger storage is required to allocate the image file. It also can cause longer loading time. These conditions make it did not suitable to be applied in image catalogue approach via internet environment. With compression techniques, the image size can be reduced and the quality of the image is still guaranteed without much changes. This report is paying attention to one of the image compression technique using wavelet technology. Wavelet technology is much batter than any other image compression technique nowadays. As a result, the compressed images applied to a system called Map Online that used Internet-based Image Catalogue approach. This system allowed user to buy map online. User also can download the maps that had been bought besides using the searching the map. Map searching is based on several meaningful keywords. As a result, this system is expected to be used by Jabatan Ukur dan Pemetaan Malaysia (JUPEM) in order to make the organization vision is implemented

    Compression Technique Using DCT & Fractal Compression: A Survey

    Get PDF
    Steganography differs from digital watermarking because both the information and the very existence of the information are hidden. In the beginning, the fractal image compression method is used to compress the secret image, and then we encrypt this compressed data by DES.The Existing Steganographic approaches are unable to handle the Subterfuge attack i.e, they cannot deal with the opponents not only detects a message ,but also render it useless, or even worse, modify it to opponent favor. The advantage of BCBS is the decoding can be operated without access to the cover image and it also detects if the message has been tampered without using any extra error correction. To improve the imperceptibility of the BCBS, DCT is used in combination to transfer stego-image from spatial domain to the frequency domain. The hiding capacity of the information is improved by introducing Fractal Compression and the security is enhanced using by encrypting stego-image using DES.  Copyright © www.iiste.org Keywords: Steganography, data hiding, fractal image compression, DCT

    AVISARME: Audio Visual Synchronization Algorithm for a Robotic Musician Ensemble

    Get PDF
    This thesis presents a beat detection algorithm which combines both audio and visual inputs to synchronize a robotic musician to its human counterpart. Although there has been considerable work done to create sophisticated methods for audio beat detection, the visual aspect of musicianship has been largely ignored. With advancements in image processing techniques, as well as both computer and imaging technologies, it has recently become feasible to integrate visual inputs into beat detection algorithms. Additionally, the proposed method for audio tempo detection also attempts to solve many issues that are present in current algorithms. Current audio-only algorithms have imperfections, whether they are inaccurate, too computationally expensive, or suffer from terrible resolution. Through further experimental testing on both a popular music database and simulated music signals, the proposed algorithm performed statistically better in both accuracy and robustness than the baseline approaches. Furthermore, the proposed approach is extremely efficient, taking only 45ms to compute on a 2.5s signal, and maintains an extremely high temporal resolution of 0.125 BPM. The visual integration also relies on Full Scene Tracking, allowing it to be utilized for live beat detection for practically all musicians and instruments. Numerous optimization techniques have been implemented, such as pyramidal optimization (PO) and clustering techniques which are presented in this thesis. A Temporal Difference Learning approach to sensor fusion and beat synchronization is also proposed and tested thoroughly. This TD learning algorithm implements a novel policy switching criterion which provides a stable, yet quickly reacting estimation of tempo. The proposed algorithm has been implemented and tested on a robotic drummer to verify the validity of the approach. The results from testing are documented in great detail and compared with previously proposed approaches

    Use of Coherent Point Drift in computer vision applications

    Get PDF
    This thesis presents the novel use of Coherent Point Drift in improving the robustness of a number of computer vision applications. CPD approach includes two methods for registering two images - rigid and non-rigid point set approaches which are based on the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations - or affine transforms - provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time without having to use a-priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled video based approach is presented, which focuses more on the video analysis side, rather than the audio analysis that is known to be prone to errors. CPD is effectively utilised for lip movement detection and a temporal face detection approach is used to minimise false positives if face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transforms (SIFT) are first used to detect keypoints in images being fused. Subsequently this point set is reduced to remove outliers, using RANSAC (RANdom Sample Consensus) and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition in CCTV video footage. CPD is used to effectively remove skew of vehicles detected as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approaching angles. A LESH (Local Energy Shape Histogram) feature based approach is used for vehicle make and model recognition with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to prove that the proposed system demonstrates an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration

    Numerical Issues When Using Wavelets

    Get PDF
    International audienceWavelets and related multiscale representations pervade all areas of signal processing. The recent inclusion of wavelet algorithms in JPEG 2000 – the new still-picture compression standard– testifies to this lasting and significant impact. The reason of the success of the wavelets is due to the fact that wavelet basis represents well a large class of signals, and therefore allows us to detect roughly isotropic elements occurring at all spatial scales and locations. As the noise in the physical sciences is often not Gaussian, the modeling, in the wavelet space, of many kind of noise (Poisson noise, combination of Gaussian and Poisson noise, long-memory 1/f noise, non-stationary noise, ...) has also been a key step for the use of wavelets in scientific, medical, or industrial applications [1]. Extensive wavelet packages exist now, commercial (see for example [2]) or non commercial (see for example [3, 4]), which allows any researcher, doctor, or engineer to analyze his data using wavelets

    WG1N5315 - Response to Call for AIC evaluation methodologies and compression technologies for medical images: LAR Codec

    Get PDF
    This document presents the LAR image codec as a response to Call for AIC evaluation methodologies and compression technologies for medical images.This document describes the IETR response to the specific call for contributions of medical imaging technologies to be considered for AIC. The philosophy behind our coder is not to outperform JPEG2000 in compression; our goal is to propose an open source, royalty free, alternative image coder with integrated services. While keeping the compression performances in the same range as JPEG2000 but with lower complexity, our coder also provides services such as scalability, cryptography, data hiding, lossy to lossless compression, region of interest, free region representation and coding
    corecore