    Low-complexity Multidimensional DCT Approximations

    In this paper, we introduce low-complexity multidimensional discrete cosine transform (DCT) approximations. Three-dimensional DCT (3D DCT) approximations are formalized in terms of high-order tensor theory. The formulation is extended to higher dimensions with arbitrary lengths. Several multiplierless 8×8×8 approximate methods are proposed, and the computational complexity is discussed for the general multidimensional case. The complexity of the proposed methods was assessed, showing considerably fewer arithmetic operations than the exact 3D DCT. The proposed approximations were embedded into a 3D DCT-based video coding scheme, and a modified quantization step was introduced. The simulation results showed that the approximate 3D DCT coding methods offer almost identical output visual quality when compared with the exact 3D DCT scheme. The proposed 3D approximations were also employed as a tool for visual tracking. The proposed approximate 3D DCT-based system performs similarly to the original exact 3D DCT-based method. In general, the suggested methods showed competitive performance at a considerably lower computational cost. Comment: 28 pages, 5 figures, 5 tables
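As a point of reference for the abstract above, the exact 3D DCT of a block can be written as three mode products of the block tensor with a 1D DCT matrix, one per axis. The sketch below is a minimal pure-Python illustration of that exact transform only; it is not the paper's method, and the names `dct_matrix`, `mode_product`, `dct3` and `idct3` are hypothetical. A multiplierless approximation would, under this formulation, substitute a low-complexity matrix for the exact one.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (n x n); C applied to x is the 1D DCT of x."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def mode_product(x, m, mode):
    """Multiply the 3D tensor x (nested lists) by matrix m along axis `mode`."""
    dims = [len(x), len(x[0]), len(x[0][0])]
    out_dims = list(dims)
    out_dims[mode] = len(m)
    out = [[[0.0] * out_dims[2] for _ in range(out_dims[1])]
           for _ in range(out_dims[0])]
    for a in range(out_dims[0]):
        for b in range(out_dims[1]):
            for c in range(out_dims[2]):
                idx = [a, b, c]
                k = idx[mode]          # output index along the transformed axis
                total = 0.0
                for i in range(dims[mode]):
                    idx[mode] = i      # sweep the input index along that axis
                    total += m[k][i] * x[idx[0]][idx[1]][idx[2]]
                out[a][b][c] = total
    return out

def dct3(x):
    """Exact separable 3D DCT: one mode product per axis."""
    for mode, n in enumerate((len(x), len(x[0]), len(x[0][0]))):
        x = mode_product(x, dct_matrix(n), mode)
    return x

def idct3(y):
    """Inverse 3D DCT; C is orthonormal, so its inverse is its transpose."""
    for mode, n in enumerate((len(y), len(y[0]), len(y[0][0]))):
        y = mode_product(y, transpose(dct_matrix(n)), mode)
    return y
```

For an 8×8×8 block, each mode product applies the same 8×8 matrix to 64 length-8 fibres, so any saving in the 1D matrix is repeated across all three axes.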

    Research and developments of distributed video coding

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The recently developed Distributed Video Coding (DVC) is well suited to applications such as wireless/wired video sensor networks and mobile cameras, where traditional video coding standards are not feasible because of the constrained computation at the encoder. With DVC, the computational burden is moved from the encoder to the decoder, and compression efficiency is achieved via joint decoding at the decoder. The practical application of DVC is referred to as Wyner-Ziv (WZ) video coding, where side information is available at the decoder to perform joint decoding. This joint decoding inevitably results in a very complex decoder. Much current work on WZ video coding emphasises how to improve coding performance but neglects the large complexity incurred at the decoder, which directly influences the system output. The first stage of this research aims to optimise the decoder in pixel domain WZ video coding (PDWZ) while still achieving similar compression performance. More specifically, four issues are addressed: the input block size, the side information generation, the side information refinement process and the feedback channel. Transform domain WZ video coding (TDWZ) performs distinctly better than PDWZ because it exploits spatial redundancy during encoding. However, since there is no motion estimation at the encoder in WZ video coding, temporal correlation is not exploited at the encoder in any current WZ video coding scheme. In the middle stage of this research, the 3D DCT is adopted in TDWZ to remove redundancy in both the spatial and temporal directions and thus provide even higher coding performance. In the next step of this research, the performance of transform domain Distributed Multiview Video Coding (DMVC) is also investigated.
In particular, three types of transform domain DMVC framework are investigated: transform domain DMVC using TDWZ based on 2D DCT, transform domain DMVC using TDWZ based on 3D DCT, and transform domain residual DMVC using TDWZ based on 3D DCT. One of the important applications of the WZ coding principle is error resilience. There have been several attempts to apply WZ error-resilient coding to current video coding standards, e.g. H.264/AVC or MPEG-2. The final stage of this research is the design of a WZ error-resilient scheme for a wavelet-based video codec. To balance the trade-off between error resilience and bandwidth consumption, the proposed scheme emphasises the protection of the Region of Interest (ROI). Efficient bandwidth utilisation is achieved by combining WZ coding with a sacrifice in the quality of unimportant areas. In summary, this research work contributes several advances in WZ video coding. First, it aims to build an efficient PDWZ with an optimised decoder. Second, it aims to build an advanced TDWZ based on 3D DCT, which is then applied to multiview video coding to realise advanced transform domain DMVC. Finally, it aims to design an efficient error-resilient scheme for a wavelet video codec, with which the trade-off between bandwidth consumption and error resilience can be better balanced.

    Audio-coupled video content understanding of unconstrained video sequences

    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information have not been applied together to solve such complex tasks, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework to study the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of the objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system can be made robust to changes in camera movement, illumination, random object behaviour, etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification so that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines the advantages of both feature-level and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work.
Finally, we propose a decision correction algorithm, which shows that further steps towards effectively combining multi-modal classification information with semantic knowledge generate the best possible results.

    Contribution to the analysis and coding of three-dimensional imaging image arrays

    Three-dimensional (3D) imaging systems are today the main means of observation for a number of specialised applications and, as their technological parameters and network infrastructure evolve, they are expected to become in the near future the main imaging method for an even larger number of everyday applications. The research carried out in this thesis is an advanced study of a specific 3D imaging method called Integral Photography (IP). In the first part of the study, the capabilities of the method were examined and a prototype digital system was developed for capturing Integral Photography images of real objects in the near field of the device using a flatbed scanner; the system is capable of producing images of particularly high resolution compared with the digital systems proposed to date. In the second part of the research, an automatic system was developed, for the first time, for aligning the sensors used with the optical parts of the system; it requires no prior knowledge of the system's characteristics and relies on a number of image analysis and pattern recognition techniques. The research concludes with the development of specialised coding algorithms for IP images, which manage to reduce to a very large degree the inherent redundancy that these images contain.

    Shape variation modelling, analysis and statistical control for assembly system with compliant parts

    The modern competitive market demands frequent changes in product variety, increased production volume and shortened product/process changeover time. These market requirements point towards the development of key enabling technologies (KETs) to shorten the product and process development cycle, improve production quality and reduce time-to-launch. One of the critical prerequisites for developing these KETs is efficient and accurate modelling of product and process dimensional errors. This is especially critical for assembly processes with compliant parts, as used in automotive body, appliance, or wing and fuselage assemblies. Currently, the assembly process is designed under the assumption of ideal (nominal) products and then checked using variation simulation analysis (VSA). However, VSA simulations are oversimplified, as they are unable to accurately model or predict the effects of geometric and dimensional variations of compliant parts, or of variations in key characteristics related to the fixturing and joining processes. This results in product failures and/or reduced quality due to un-modelled interactions in the assembly process. Therefore, modelling and prediction of the geometric shape errors of complex sheet metal parts are of great importance for many industrial applications. Further, because production yield and product quality are determined over a production volume of real parts, not only shape errors but also a shape variation model is required for robust assembly system development. Currently, part shape variation can be measured during production using recently introduced non-contact gauges, which are fast, in-line and can capture entire part surface information. However, current applications of non-contact scanners are limited to single part inspection or reverse engineering and cannot be used for monitoring and statistical process control of shape variation.
Further, product shape variation can be reduced through appropriate assembly fixture design. Current approaches to assembly fixture design seldom consider the shape variation of production parts during the assembly process, which results in poor quality and yield. To address these challenges, this thesis proposes two enablers focused on modelling the shape errors and shape variation of compliant parts, applicable during both the assembly process design phase and the production phase: (i) modelling and characterisation of the shape errors of an individual compliant part, with capabilities to quantify fabrication errors at part level; and (ii) modelling and characterisation of the shape variation of a batch of compliant parts, with capabilities to quantify the shape variation at production level. The first enabler focuses on shape error modelling and characterisation, which includes developing a functional data analysis model for identifying and characterising real part shape errors that can link design (CAD model) with manufacturing (shape errors). A new functional data analysis model, named Geometric Modal Analysis (GMA), is proposed to extract dominant shape error modes from the fabricated part measurement data. This model is used to decompose the shape errors of a 3D sheet metal part into orthogonal shape error modes, which can be used for product and process interactions. Further, the enabler can be used for statistical process control to monitor shape quality; fabrication process mapping and diagnosis; geometric dimensioning and tolerancing simulation with free-form shape errors; or compact storage of shape information. The second enabler aims to model and characterise the shape variation of a batch of compliant parts by extending the GMA approach.
The developed functional model, called Statistical Geometric Modal Analysis (SGMA), represents the statistical shape variation through modal characteristics and quantifies the shape variation of a batch of sheet metal parts by a single or a few composite parts. The composite part(s) represent the major error modes induced by the production process. The SGMA model can further be utilised for assembly fixture optimisation, tolerance analysis and synthesis. Further, these two enablers can be applied to monitoring and reducing shape variation in the assembly process by developing: (a) an efficient statistical process control technique (based on enabler ‘i’) to monitor part shape variation utilising the surface information captured using non-contact scanners; and (b) an efficient assembly fixture layout optimisation technique (based on enabler ‘ii’) to obtain improved quality products considering the shape variation of production parts. Therefore, this thesis proposes the following two applications. The first application focuses on statistical process control of part shape variation using surface data captured by in-process or off-line scanners as Cloud-of-Points (CoPs). The methodology involves obtaining a reduced set of statistically uncorrelated and independent variables from the CoPs (utilising the GMA method), which are then used to develop an integrated single bivariate T²-Q monitoring chart. The joint probability density estimation using a non-parametric Kernel Density Estimator (KDE) has enhanced sensitivity for detecting part shape variation. The control chart supports speedy detection of part shape errors, including global or local shape defects. The second application determines the optimal fixture layout considering a production batch of compliant sheet metal parts. Fixtures control the position and orientation of parts in an assembly process and thus contribute significantly to the process capability that determines production yield and product quality.
A new approach is proposed to improve the probability of joining feasibility index by determining an N-2-1 fixture layout optimised for a production batch. The SGMA method has been utilised for fixture layout optimisation considering a batch of compliant sheet metal parts. All of the methodologies developed have been validated and verified with industrial case studies of an automotive sheet metal door assembly process. Further, they are compared with state-of-the-art methodologies to highlight the broader impact of the research work in meeting increasing market requirements such as improved in-line quality and increased productivity.
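The T²-Q monitoring idea mentioned in the abstract above can be illustrated with the standard definitions of the two statistics. The sketch below is not the thesis's GMA procedure: it assumes, hypothetically, that an orthonormal set of retained shape modes and their in-control variances are already available, and it omits the KDE-based joint control limit. The function name `t2_q` and its parameters are illustrative only.

```python
def t2_q(deviation, modes, variances):
    """Return (T2, Q) for one measured part.

    deviation : list of surface deviations from nominal (one per point)
    modes     : list of orthonormal mode vectors (assumed given, hypothetical)
    variances : in-control variance of the score on each retained mode
    """
    # Project the deviation onto each retained mode to get uncorrelated scores.
    scores = [sum(m * d for m, d in zip(mode, deviation)) for mode in modes]
    # Hotelling-type T2: variance-scaled energy inside the retained modes.
    t2 = sum(s * s / v for s, v in zip(scores, variances))
    # Reconstruct the part of the deviation explained by the retained modes.
    recon = [sum(s * mode[j] for s, mode in zip(scores, modes))
             for j in range(len(deviation))]
    # Q (squared prediction error): energy left outside the retained modes.
    q = sum((d - r) ** 2 for d, r in zip(deviation, recon))
    return t2, q

# Example with two retained modes over three measurement points.
t2, q = t2_q([2.0, 1.0, 0.5], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [4.0, 1.0])
print(t2, q)  # 2.0 0.25
```

A large T² flags unusual variation along the known modes, while a large Q flags a shape error that the retained modes do not describe; a bivariate chart plots the two together for each part.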