25 research outputs found

    Investigating Polynomial Fitting Schemes for Image Compression

    Get PDF
    Image compression is a means to transmit or store visual data in the most economical way. Though many algorithms have been reported, research is still needed to cope with the continuous demand for more efficient transmission and storage. This research work explores and implements polynomial fitting techniques as a means to perform block-based lossy image compression. In an attempt to investigate non-polynomial models, a region-based scheme is implemented to fit the whole image using bell-shaped functions. The idea is simply to view an image as a 3D geographical map consisting of hills and valleys. However, the scheme suffers from high computational demands and inferiority to many available image compression schemes. Hence, only polynomial models receive further consideration. A first-order polynomial (plane) model is designed to work in a multiplication- and division-free (MDF) environment. The intensity values of each image block are fitted to a plane, and the parameters are then quantized and coded. Blocking artefacts, a common drawback of block-based image compression techniques, are reduced using an MDF line-fitting scheme at block boundaries. It is shown that a compression ratio of 62:1 at 28.8 dB is attainable for the standard image PEPPER, outperforming JPEG both objectively and subjectively in this part of the rate-distortion characteristics. Inter-block prediction can substantially improve the compression performance of the plane model, reaching a compression ratio of 112:1 at 27.9 dB. This improvement, however, slightly increases computational complexity and reduces pipelining capability. Although JPEG2000 is not a block-based scheme, it is encouraging that the proposed prediction scheme compares favourably with JPEG2000, both computationally and qualitatively; however, more experiments are needed for a more concrete comparison. To reduce blocking artefacts, a new postprocessing scheme based on Weber's law is employed. It is reported that images postprocessed using this scheme are subjectively more pleasing, with a marginal increase in PSNR (<0.3 dB). Weber's law is also modified to perform edge detection and quality assessment tasks. These results motivate the exploration of higher-order polynomials, using three parameters to maintain comparable compression performance. To investigate the impact of higher-order polynomials through an approximate asymptotic behaviour, a novel linear mapping scheme is designed. Though computationally demanding, the performance of the higher-order polynomial approximation schemes is comparable to that of the plane model. This clearly demonstrates the powerful approximation capability of the plane model. As such, the proposed linear mapping scheme constitutes a new approach to image modelling and is hence worth future consideration.
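    As a rough illustration of the plane model described above, the sketch below fits a first-order polynomial to an 8x8 block by ordinary least squares; the thesis's multiplication- and division-free formulation, quantization, and coding steps are omitted, and the block size and function names are assumptions made here for clarity.

```python
import numpy as np

def fit_plane_to_block(block):
    """Least-squares fit of a plane z = p0 + p1*x + p2*y to the intensities
    of an image block (illustrative stand-in for the MDF plane fit; the
    three parameters would then be quantized and coded)."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([np.ones(h * w), xs.ravel(), ys.ravel()])
    z = block.ravel().astype(float)
    params, *_ = np.linalg.lstsq(A, z, rcond=None)
    return params

def reconstruct_block(params, h=8, w=8):
    """Rebuild the block approximation from the three plane parameters."""
    ys, xs = np.mgrid[0:h, 0:w]
    p0, p1, p2 = params
    return p0 + p1 * xs + p2 * ys

# Example: a synthetic ramp block is represented by just three parameters.
block = np.tile(np.arange(8), (8, 1)) * 4 + 100
approx = reconstruct_block(fit_plane_to_block(block))
```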

    A Novel Multi-Symbol Curve Fit based CABAC Framework for Hybrid Video Codecs with Improved Coding Efficiency and Throughput

    Get PDF
    Video compression is an essential component of present-day applications and a decisive factor between the success or failure of a business model. There is an ever-increasing demand to transmit a larger number of superior-quality video channels within the available transmission bandwidth. Consumers are increasingly discerning about the quality and performance of video-based products, and there is therefore a strong incentive for continuous improvement in video coding technology if companies are to keep a market edge over their competitors. Even though processor speeds and network bandwidths continue to increase, better video compression results in a more competitive product. This drive to improve video compression technology has led to a revolution in the last decade. In this thesis we address some of these data compression problems in a practical multimedia system that employs hybrid video coding schemes. Typically, real-life video signals show non-stationary statistical behavior, and the statistics of these signals largely depend on the video content and the acquisition process. Hybrid video coding schemes such as H.264/AVC exploit some of these non-stationary characteristics, but certainly not all of them. Moreover, higher-order statistical dependencies at the syntax-element level are mostly neglected in existing video coding schemes. Designing a video coding scheme that takes these typically observed statistical properties into consideration, however, offers room for significant improvements in coding efficiency. In this thesis, a new frequency-domain curve-fitting compression framework is proposed as an extension to the H.264 Context Adaptive Binary Arithmetic Coder (CABAC) that achieves better compression efficiency at reduced complexity. The proposed curve-fitting extension to H.264 CABAC, henceforth called CF-CABAC, is modularly designed to fit conveniently into existing block-based H.264 hybrid video entropy coding algorithms. Traditionally, there have been many proposals in the literature to fuse surface/curve fitting with block-based, region-based, and training-based (VQ, fractals) compression algorithms, primarily to exploit pixel-domain redundancies. Though the compression efficiency of these methods is expectedly better than that of DCT-transform-based compression, their main drawback is the high computational demand, which makes them non-competitive for real-time applications. The curve-fitting techniques proposed so far have operated in the pixel domain, where the video characteristics are highly non-stationary, making curve fitting inefficient in terms of video quality, compression ratio, and complexity. In this thesis, we instead apply curve fitting to quantized frequency-domain coefficients and fuse this technique with H.264 CABAC entropy coding. Based on some predictable characteristics of quantized DCT coefficients, a computationally inexpensive curve-fitting technique is explored that fits into the existing H.264 CABAC framework. Also, due to the lossy nature of video compression and the strong demand for bandwidth and computation resources in a multimedia system, one of the key design issues for video coding is to optimize the trade-off among quality (distortion), compression (rate), and complexity. This thesis also briefly studies the existing rate-distortion (RD) optimization approaches proposed for video coding to explore the best RD performance of a video codec. Further, we propose a graph-based algorithm for rate-distortion optimization of quantized coefficient indices for the proposed CF-CABAC entropy coding.
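    The frequency-domain curve fitting described above is not specified in detail here, so the following is only a hypothetical sketch under assumed conventions: a low-order polynomial is fitted to the zig-zag scanned, quantized DCT coefficient magnitudes of a block, and the curve parameters plus small residuals would then be handed to the entropy coder in place of the raw coefficient values.

```python
import numpy as np

def fit_coefficient_curve(quantized_coeffs, order=2):
    """Fit a low-order polynomial to quantized DCT coefficient magnitudes
    (zig-zag order assumed); returns the curve parameters and the residuals
    that a CABAC-style entropy coder would still need to encode."""
    idx = np.arange(len(quantized_coeffs))
    mags = np.abs(np.asarray(quantized_coeffs, dtype=float))
    params = np.polyfit(idx, mags, order)
    predicted = np.polyval(params, idx)
    residual = mags - np.round(predicted)
    return params, residual

# Example with a typical rapidly decaying run of quantized coefficients.
coeffs = [22, -9, 6, 4, -3, 2, 1, 1, 0, 0, -1, 0, 0, 0, 0, 0]
params, residual = fit_coefficient_curve(coeffs)
```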

    On-board three-dimensional object tracking: Software and hardware solutions

    Full text link
    We describe a real-time system for recognizing and tracking 3D objects such as UAVs, airplanes, and fighters with an optical sensor. Given a 2D image, the system has to perform background subtraction and recognize the relative rotation, scale, and translation of the object in order to sustain a prescribed topology of the fleet. In the thesis, a comparative study of different algorithms and a performance evaluation are carried out based on time and accuracy constraints. For the background subtraction task we evaluate frame differencing, the approximate median filter, and mixture of Gaussians, and propose a classification approach based on neural network methods. For object detection we analyze the performance of invariant moments, the scale-invariant feature transform, and the affine scale-invariant feature transform. Various tracking algorithms are evaluated, such as mean shift with variable and fixed-size windows, the scale-invariant feature transform, Harris, and fast full search based on the fast Fourier transform. We develop an algorithm for calculating relative rotations and scale changes based on Zernike moments. Based on the design criteria, a selection is made for on-board implementation. The candidate techniques have been implemented on the Texas Instruments TMS320DM642 EVM board. It is shown in the thesis that 14 frames per second can be processed, which supports a real-time implementation of the tracking system within reasonable accuracy limits.
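    Frame differencing, the simplest of the background subtraction methods evaluated above, can be sketched as follows; the threshold value and frame sizes are arbitrary choices for illustration, not values taken from the thesis.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark a pixel as foreground when its absolute intensity change
    between consecutive frames exceeds a fixed threshold."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)  # 1 = moving object, 0 = background

# Example on two synthetic grayscale frames: a bright object appears.
prev = np.zeros((120, 160), dtype=np.uint8)
curr = prev.copy()
curr[40:60, 70:90] = 200
mask = frame_difference_mask(prev, curr)
```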

    Advanced techniques for classification of polarimetric synthetic aperture radar data

    Get PDF
    With various remote sensing technologies available to aid Earth observation, radar-based imaging is one gaining major interest due to advances in its imaging techniques in the form of synthetic aperture radar (SAR) and polarimetry. The majority of radar applications focus on monitoring, detecting, and classifying local or global areas of interest to support humans in their efforts of decision-making, analysis, and interpretation of Earth's environment. This thesis focuses on improving the classification performance and process, particularly for land use and land cover classification of polarimetric SAR (PolSAR) data. To achieve this, three contributions are studied, related to superior feature description and advanced machine-learning techniques including classifiers, principles, and data exploitation. First, this thesis investigates the application of color features within PolSAR image classification to provide additional discrimination on top of the conventional scattering information and texture features. The color features are extracted from the visual presentation of fully and partially polarimetric SAR data by generating pseudo-color images. Within the experiments, the obtained results demonstrated that with the addition of the considered color features, the achieved classification performance outperformed results with common PolSAR features alone, and also achieved higher classification accuracies than the traditional combination of PolSAR and texture features. Second, to address the large-scale learning challenge in PolSAR image classification with the utmost efficiency, this thesis introduces the application of an adaptive and data-driven supervised classification topology called the Collective Network of Binary Classifiers (CNBC). This topology incorporates active learning to support human users in the analysis and interpretation of PolSAR data, focusing on collections of images where changes or updates to the existing classifier might be required frequently due to surface, terrain, and object changes as well as variations in capturing time and position. Evaluations demonstrated the capabilities of CNBC over an extensive set of experimental results regarding the adaptive and data-driven classification of single PolSAR images as well as collections of them. The experimental results verified that the evolutionary classification topology, CNBC, provides an efficient solution to the problems of scalability and dynamic adaptability, allowing both the feature space dimensions and the number of terrain classes in PolSAR image collections to vary dynamically. Third, most PolSAR classification problems are undertaken by supervised machine learning, which requires manually labeled ground truth data. To reduce the manual labeling effort, supervised and unsupervised learning approaches are combined into semi-supervised learning to utilize the huge amount of unlabeled data. The application of semi-supervised learning in this thesis is motivated by ill-posed classification tasks related to the small training size problem. Therefore, this thesis investigates how much ground truth is actually necessary for certain classification problems to achieve satisfactory results in supervised and semi-supervised learning scenarios. To address this, two semi-supervised approaches are proposed: unsupervised extension of the training data and ensemble-based self-training.
The evaluations showed that significant speed-ups and improvements in classification performance are achieved. In particular, for a remote sensing application such as PolSAR image classification, it is advantageous to exploit the location-based information in the labeled training data. Each of the developed techniques provides its own stand-alone contribution, from a different viewpoint, to improving land use and land cover classification. The introduction of a new feature for better discrimination is independent of the underlying classification algorithms used. The CNBC topology is applicable to various classification problems regardless of how the underlying data have been acquired, for example in the case of remote sensing data. Moreover, the semi-supervised learning approach tackles the challenge of utilizing unlabeled data. By combining these techniques for superior feature description with advanced machine-learning techniques exploiting classifier topologies and data, further contributions to polarimetric SAR image classification are made. According to the performance evaluations conducted, including visual and numerical assessments, the proposed and investigated techniques showed valuable improvements and are able to aid the analysis and interpretation of PolSAR image data. Due to the generic nature of the developed techniques, their application to other remote sensing data will require only minor adjustments.
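    To make the first contribution concrete, the sketch below builds a standard Pauli pseudo-color composite from the complex PolSAR channels and derives simple per-channel histogram features from it; the actual color spaces and feature descriptors used in the thesis may differ, and the function names and parameters here are illustrative assumptions.

```python
import numpy as np

def pauli_pseudo_color(hh, hv, vv):
    """Pauli RGB composite from complex scattering channels:
    |HH-VV| (double bounce), 2|HV| (volume), |HH+VV| (surface)."""
    r = np.abs(hh - vv)
    g = 2.0 * np.abs(hv)
    b = np.abs(hh + vv)
    rgb = np.stack([r, g, b], axis=-1).astype(float)
    rgb /= rgb.max() + 1e-12          # normalize to [0, 1]
    return rgb

def color_histogram_features(rgb, bins=16):
    """Per-channel histograms as a simple color feature vector to append
    to conventional scattering and texture features."""
    return np.concatenate([
        np.histogram(rgb[..., c], bins=bins, range=(0.0, 1.0), density=True)[0]
        for c in range(3)
    ])
```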

    3D Laser Scanner Development and Analysis

    Get PDF

    Entropy in Image Analysis II

    Get PDF
    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for those applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of these methods and applications, particularly for medical imaging and image security, and the remarkable cross-fertilization among the proposed research areas.
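    The quantity underlying the volume's title can be illustrated with the Shannon entropy of a gray-level histogram; the snippet below is a generic textbook computation, not a method taken from any particular article in the issue.

```python
import numpy as np

def image_entropy(gray_image, levels=256):
    """Shannon entropy H = -sum(p * log2(p)) of the gray-level histogram."""
    hist, _ = np.histogram(gray_image, bins=levels, range=(0, levels))
    p = hist.astype(float) / hist.sum()
    p = p[p > 0]                      # ignore empty bins
    return -np.sum(p * np.log2(p))

# Example: a uniform-noise image approaches the maximum of log2(256) = 8 bits.
img = np.random.randint(0, 256, size=(256, 256))
print(image_entropy(img))
```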

    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Get PDF

    Application of statistical learning theory to plankton image analysis

    Get PDF
    Submitted to the Joint Program in Applied Ocean Science and Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2006. A fundamental problem in limnology and oceanography is the inability to quickly identify and map distributions of plankton. This thesis addresses the problem by applying statistical machine learning to video images collected by an optical sampler, the Video Plankton Recorder (VPR). The research is focused on the development of a real-time automatic plankton recognition system to estimate plankton abundance. The system includes four major components: pattern representation/feature measurement, feature extraction/selection, classification, and abundance estimation. After an extensive study of a traditional learning vector quantization (LVQ) neural network (NN) classifier built on shape-based features and different pattern representation methods, I developed a classification system combining multi-scale co-occurrence matrix features with a support vector machine (SVM) classifier. This new method outperforms the traditional shape-based NN classifier by 12% in classification accuracy. Subsequent plankton abundance estimates are improved in the regions of low relative abundance by more than 50%. Both the NN and SVM classifiers have no rejection metrics, so two rejection metrics were developed in this thesis. One was based on the Euclidean distance in the feature space for the NN classifier; the other used dual-classifier (NN and SVM) voting as output. Using the dual-classification method alone yields almost as good abundance estimation as human labeling on a test bed of real-world data. However, the distance rejection metric for the NN classifier might be more useful when the training samples are not "good", i.e., not representative of the field data. In summary, this thesis advances the current state-of-the-art plankton recognition system by demonstrating that multi-scale texture-based features are more suitable for classifying field-collected images. The system was verified on a very large real-world dataset in a systematic way for the first time. The accomplishments include developing a multi-scale co-occurrence matrix and support vector machine system, a dual-classification system, automatic correction in abundance estimation, and the ability to obtain accurate abundance estimates from real-time automatic classification. The methods developed are generic and are likely to work on a range of other image classification applications. This work was supported by National Science Foundation Grants OCE-9820099 and the Woods Hole Oceanographic Institution academic program.
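    A minimal sketch of the co-occurrence-plus-SVM idea is given below, assuming a plain gray-level co-occurrence matrix with contrast and energy statistics; the thesis's multi-scale construction, feature selection, and training data are not reproduced, and all names and parameter values here are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def glcm_features(gray, levels=8, offsets=((0, 1), (1, 0))):
    """Quantize a grayscale patch, build a co-occurrence matrix per pixel
    offset, and return contrast and energy statistics for each."""
    q = (gray.astype(float) / 256.0 * levels).astype(int).clip(0, levels - 1)
    feats = []
    for dy, dx in offsets:
        a = q[:q.shape[0] - dy, :q.shape[1] - dx].ravel()
        b = q[dy:, dx:].ravel()
        glcm = np.zeros((levels, levels))
        np.add.at(glcm, (a, b), 1)
        glcm /= glcm.sum()
        i, j = np.indices(glcm.shape)
        feats += [np.sum(glcm * (i - j) ** 2),   # contrast
                  np.sum(glcm ** 2)]             # energy
    return np.array(feats)

# Hypothetical pipeline: image patches -> texture features -> SVM labels.
X = np.stack([glcm_features(np.random.randint(0, 256, (64, 64))) for _ in range(20)])
y = np.array([0, 1] * 10)
clf = SVC(kernel="rbf").fit(X, y)
```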

    Adaptive Methods for Robust Document Image Understanding

    Get PDF
    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions, and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency, and the exploitation of all available sources of information. More specifically, we introduce the following original methods: fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution to the document binarization problem from both the computational complexity and threshold selection points of view, layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm, and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
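    For context on the binarization stage mentioned above, the snippet below implements the classic Otsu global threshold, shown only as a familiar baseline for threshold selection rather than the thesis's own, reportedly optimal, binarization method.

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the gray level that maximizes the between-class variance of
    the foreground/background split of the histogram (Otsu's method)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist.astype(float) / hist.sum()
    cum_p = np.cumsum(p)
    cum_mean = np.cumsum(p * np.arange(256))
    global_mean = cum_mean[-1]
    best_t, best_var = 0, 0.0
    for t in range(1, 255):
        w0, w1 = cum_p[t], 1.0 - cum_p[t]
        if w0 == 0.0 or w1 == 0.0:
            continue
        m0 = cum_mean[t] / w0
        m1 = (global_mean - cum_mean[t]) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Example: binarize a synthetic page image.
page = np.random.randint(0, 256, size=(100, 100))
binary = (page > otsu_threshold(page)).astype(np.uint8)
```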

    Object Recognition

    Get PDF
    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving our car in the correct lane. We do these tasks effortlessly in real time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability of visual recognition. Such a capability will allow machines to free humans from boring or dangerous jobs.