46 research outputs found
Minimisation of image watermarking side effects through subjective optimisation
This study investigates the use of structural similarity index (SSIM) on the minimised side effect to image watermarking. For the fast implementation and more compatibility with the standard discrete cosine transform (DCT)-based codecs, watermark insertion is carried out on the DCT coefficients and hence an SSIM model for DCT-based watermarking is developed. For faster implementation, the SSIM index is maximised over independent 4 × 4 non-overlapped blocks, but the disparity between the adjacent blocks reduces the overall image quality. This problem is resolved through optimisation of overlapped blocks, but, the higher image quality is achieved at a cost of high computational complexity. To reduce the computational complexity while preserving the good quality, optimisation of semi-overlapped blocks is introduced. The authors show that while SSIM-based optimisation over overlapped blocks has as high as 64 times the complexity of the 4 × 4 non-overlapped method, with semi-overlapped optimisation the high quality of overlapped method is preserved only at a cost of less than 8 times the non-overlapped method
Recommended from our members
Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.Virtual camera realisation and the proposition of trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple camera and their arrangement constitute a critical component which affect the integrity of visual content acquisition for multi-view video. Currently, linear, convergence, and divergence arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known
camera structures, hence adequately reducing some of the other implementation issues. This thesis explores to use image-based rendering with and without geometry in the implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of depth map (geometry) and use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth map was investigated using region match measures widely known for solving image point correspondence problem. The constructed depth maps have been compare with the ones generated
using the dynamic programming approach. In both the geometry and no geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, construction of 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and computation
of virtual scene from a stereo pair of panoramic images. The quality of these rendered images were assessed through the use of either objective or subjective analysis in Imatest software. Further more, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with
a single centre of projection. This was done using sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in Meshlab software environment, hence providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to resolve strongly the hidden part of any scene objects.
To adequately meet the visibility condition for scene objects and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explore trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential
of several physical cameras of the same model being sparsely arranged on the edge of an efficient trapezoid graph. This is implemented both Matlab and Maya. The quality of the depth maps rendered in Matlab are better in Quality
Intelligent Subgrouping of Multitrack Audio
Subgrouping facilitates the simultaneous manipulation of a number of audio tracks and is a central aspect of mix engineering. However, the decision process of subgrouping is a poorly documented technique. This research sheds light on this ubiquitous but poorly de ned mix practice, provides rules and constraints on how it should be approached as well as demonstrates its bene t to an automatic mixing system. I rst explored the relationship that subgrouping has with perceived mix quality by examining a number of mix projects. This was in order to decipher the actual process of creating subgroups and to see if any of the decisions made were intrinsically linked to mix quality. I found mix quality to be related to the number of subgroups and type of subgroup processing used. This subsequently led me to interviewing distinguished professionals in the audio engineering eld, with the intention of gaining a deeper understanding of the process. The outcome of these interviews and the previous analyses of mix projects allowed me to propose rules that could be used for real life mixing and automatic mixing. Some of the rules I established were used to research and develop a method for the automatic creation of subgroups using machine learning techniques. I also investigated the relationship between music production quality and human emotion. This was to see if music production quality had an emotional e ect on a particular type of listener. The results showed that the emotional impact of mixing only really mattered to those with critical listening skills. This result is important for automatic mixing systems in general, as it would imply that quality only really matters to a minority of people. I concluded my research on subgrouping by conducting an experiment to see if subgrouping would bene t the perceived clarity and quality of a mix. The results of a subjective listening test showed this to be true
Contribution des filtres LPTV et des techniques d'interpolation au tatouage numérique
Les Changements d'Horloge Périodiques (PCC) et les filtres Linéaires Variant Périodiquement dans le Temps (LPTV) sont utilisés dans le domaine des télécommunications multi-utilisateurs. Dans cette thèse, nous montrons que, dans l'ensemble des techniques de tatouage par étalement de spectre, ils peuvent se substituer à la modulation par code pseudo-aléatoire. Les modules de décodage optimal, de resynchronisation, de pré-annulation des interférences et de quantification de la transformée d'étalement s'appliquent également aux PCC et aux filtres LPTV. Pour le modèle de signaux stationnaires blancs gaussiens, ces techniques présentent des performances identiques à l'étalement à Séquence Directe (DS) classique. Cependant, nous montrons que, dans le cas d'un signal corrélé localement, la luminance d'une image naturelle notamment, la périodicité des PCC et des filtres LPTV associée à un parcours d'image de type Peano-Hilbert conduit à de meilleures performances. Les filtres LPTV sont en outre un outil plus puissant qu'une simple modulation DS. Nous les utilisons pour effectuer un masquage spectral simultanément à l'étalement, ainsi qu'un rejet des interférences de l'image dans le domaine spectral. Cette dernière technique possède de très bonnes performances au décodage. Le second axe de cette thèse est l'étude des liens entre interpolation et tatouage numérique. Nous soulignons d'abord le rôle de l'interpolation dans les attaques sur la robustesse du tatouage. Nous construisons ensuite des techniques de tatouage bénéficiant des propriétés perceptuelles de l'interpolation. La première consiste en des masques perceptuels utilisant le bruit d'interpolation. Dans la seconde, un schéma de tatouage informé est construit autour de l'interpolation. Cet algorithme, qu'on peut relier aux techniques de catégorisation aléatoire, utilise des règles d'insertion et de décodage originales, incluant un masquage perceptuel intrinsèque. Outre ces bonnes propriétés perceptuelles, il présente un rejet des interférences de l'hôte et une robustesse à diverses attaques telles que les transformations valumétriques. Son niveau de sécurité est évalué à l'aide d'algorithmes d'attaque pratiques. ABSTRACT : Periodic Clock Changes (PCC) and Linear Periodically Time Varying (LPTV) filters have previously been applied to multi-user telecommunications in the Signal and Communications group of IRIT laboratory. In this thesis, we show that in each digital watermarking scheme involving spread-spectrum, they can be substituted to modulation by a pseudo-noise. The additional steps of optimal decoding, resynchronization, pre-cancellation of interference and quantization of a spread transform apply also to PCCs and LPTV filters. For white Gaussian stationary signals, these techniques offer similar performance as classical Direct Sequence (DS) spreading. However we show that, in the case of locally correlated signals such as image luminance, the periodicity of PCCs and LPTV filters associated to a Peano-Hilbert scan leads to better performance. Moreover, LPTV filters are a more powerful tool than simple DS modulation. We use LPTV filters to conduct spectrum masking simultaneous to spreading, as well as image interference cancellation in the spectral domain. The latter technique offers good decoding performance. The second axis of this thesis is the study of the links between interpolation and digital watermarking.We stress the role of interpolation in attacks on the watermark.We propose then watermarking techniques that benefit from interpolation perceptual properties. The first technique consists in constructing perceptualmasks proportional to an interpolation error. In the second technique, an informed watermarking scheme derives form interpolation. This scheme exhibits good perceptual properties, host-interference rejection and robustness to various attacks such as valumetric transforms. Its security level is assessed by ad hoc practical attack algorithms
Improving time efficiency of feedforward neural network learning
Feedforward neural networks have been widely studied and used in many applications in science and engineering. The training of this type of networks is mainly undertaken using the well-known backpropagation based learning algorithms. One major problem with this type of algorithms is the slow training convergence speed, which hinders their applications. In order to improve the training convergence speed of this type of algorithms, many researchers have developed different improvements and enhancements. However, the slow convergence problem has not been fully addressed. This thesis makes several contributions by proposing new backpropagation learning algorithms based on the terminal attractor concept to improve the existing backpropagation learning algorithms such as the gradient descent and Levenberg-Marquardt algorithms. These new algorithms enable fast convergence both at a distance from and in a close range of the ideal weights. In particular, a new fast convergence mechanism is proposed which is based on the fast terminal attractor concept. Comprehensive simulation studies are undertaken to demonstrate the effectiveness of the proposed backpropagataion algorithms with terminal attractors. Finally, three practical application cases of time series forecasting, character recognition and image interpolation are chosen to show the practicality and usefulness of the proposed learning algorithms with comprehensive comparative studies with existing algorithms
Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics
This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).</p
Schémas de tatouage d'images, schémas de tatouage conjoint à la compression, et schémas de dissimulation de données
In this manuscript we address data-hiding in images and videos. Specifically we address robust watermarking for images, robust watermarking jointly with compression, and finally non robust data-hiding.The first part of the manuscript deals with high-rate robust watermarking. After having briefly recalled the concept of informed watermarking, we study the two major watermarking families : trellis-based watermarking and quantized-based watermarking. We propose, firstly to reduce the computational complexity of the trellis-based watermarking, with a rotation based embedding, and secondly to introduce a trellis-based quantization in a watermarking system based on quantization.The second part of the manuscript addresses the problem of watermarking jointly with a JPEG2000 compression step or an H.264 compression step. The quantization step and the watermarking step are achieved simultaneously, so that these two steps do not fight against each other. Watermarking in JPEG2000 is achieved by using the trellis quantization from the part 2 of the standard. Watermarking in H.264 is performed on the fly, after the quantization stage, choosing the best prediction through the process of rate-distortion optimization. We also propose to integrate a Tardos code to build an application for traitors tracing.The last part of the manuscript describes the different mechanisms of color hiding in a grayscale image. We propose two approaches based on hiding a color palette in its index image. The first approach relies on the optimization of an energetic function to get a decomposition of the color image allowing an easy embedding. The second approach consists in quickly obtaining a color palette of larger size and then in embedding it in a reversible way.Dans ce manuscrit nous abordons l’insertion de données dans les images et les vidéos. Plus particulièrement nous traitons du tatouage robuste dans les images, du tatouage robuste conjointement à la compression et enfin de l’insertion de données (non robuste).La première partie du manuscrit traite du tatouage robuste à haute capacité. Après avoir brièvement rappelé le concept de tatouage informé, nous étudions les deux principales familles de tatouage : le tatouage basé treillis et le tatouage basé quantification. Nous proposons d’une part de réduire la complexité calculatoire du tatouage basé treillis par une approche d’insertion par rotation, ainsi que d’autre part d’introduire une approche par quantification basée treillis au seind’un système de tatouage basé quantification.La deuxième partie du manuscrit aborde la problématique de tatouage conjointement à la phase de compression par JPEG2000 ou par H.264. L’idée consiste à faire en même temps l’étape de quantification et l’étape de tatouage, de sorte que ces deux étapes ne « luttent pas » l’une contre l’autre. Le tatouage au sein de JPEG2000 est effectué en détournant l’utilisation de la quantification basée treillis de la partie 2 du standard. Le tatouage au sein de H.264 est effectué à la volée, après la phase de quantification, en choisissant la meilleure prédiction via le processus d’optimisation débit-distorsion. Nous proposons également d’intégrer un code de Tardos pour construire une application de traçage de traîtres.La dernière partie du manuscrit décrit les différents mécanismes de dissimulation d’une information couleur au sein d’une image en niveaux de gris. Nous proposons deux approches reposant sur la dissimulation d’une palette couleur dans son image d’index. La première approche consiste à modéliser le problème puis à l’optimiser afin d’avoir une bonne décomposition de l’image couleur ainsi qu’une insertion aisée. La seconde approche consiste à obtenir, de manière rapide et sûre, une palette de plus grande dimension puis à l’insérer de manière réversible
A Field Guide to Genetic Programming
xiv, 233 p. : il. ; 23 cm.Libro ElectrónicoA Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authorsIntroduction --
Representation, initialisation and operators in Tree-based GP --
Getting ready to run genetic programming --
Example genetic programming run --
Alternative initialisations and operators in Tree-based GP --
Modular, grammatical and developmental Tree-based GP --
Linear and graph genetic programming --
Probalistic genetic programming --
Multi-objective genetic programming --
Fast and distributed genetic programming --
GP theory and its applications --
Applications --
Troubleshooting GP --
Conclusions.Contents
xi
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide I
Basics
2 Representation, Initialisation and GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation Operators in Tree-based
3 Getting Ready to Run Genetic Programming 19
3.1 Step 1: Terminal Set 19
3.2 Step 2: Function Set 20
3.2.1 Closure 21
3.2.2 Sufficiency 23
3.2.3 Evolving Structures other than Programs 23
3.3 Step 3: Fitness Function 24
3.4 Step 4: GP Parameters 26
3.5 Step 5: Termination and solution designation 27
4 Example Genetic Programming Run
4.1 Preparatory Steps 29
4.2 Step-by-Step Sample Run 31
4.2.1 Initialisation 31
4.2.2 Fitness Evaluation Selection, Crossover and Mutation Termination and Solution Designation Advanced Genetic Programming
5 Alternative Initialisations and Operators in
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques 32
5.5 Tree-based GP 39
6 Modular, Grammatical and Developmental Tree-based GP 47
6.1 Evolving Modular and Hierarchical Structures 47
6.1.1 Automatically Defined Functions 48
6.1.2 Program Architecture and Architecture-Altering 50
6.2 Constraining Structures 51
6.2.1 Enforcing Particular Structures 52
6.2.2 Strongly Typed GP 52
6.2.3 Grammar-based Constraints 53
6.2.4 Constraints and Bias 55
6.3 Developmental Genetic Programming 57
6.4 Strongly Typed Autoconstructive GP with PushGP 59
7 Linear and Graph Genetic Programming 61
7.1 Linear Genetic Programming 61
7.1.1 Motivations 61
7.1.2 Linear GP Representations 62
7.1.3 Linear GP Operators 64
7.2 Graph-Based Genetic Programming 65
7.2.1 Parallel Distributed GP (PDGP) 65
7.2.2 PADO 67
7.2.3 Cartesian GP 67
7.2.4 Evolving Parallel Programs using Indirect Encodings 68
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms 69
8.2 Pure EDA GP 71
8.3 Mixing Grammars and Probabilities 74
9 Multi-objective Genetic Programming 75
9.1 Combining Multiple Objectives into a Scalar Fitness Function 75
9.2 Keeping the Objectives Separate 76
9.2.1 Multi-objective Bloat and Complexity Control 77
9.2.2 Other Objectives 78
9.2.3 Non-Pareto Criteria 80
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80
9.4 Multi-objective Optimisation via Operator Bias 81
10 Fast and Distributed Genetic Programming 83
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness 83
10.2 Reducing Cost of Fitness with Caches 86
10.3 Parallel and Distributed GP are Not Equivalent 88
10.4 Running GP on Parallel Hardware 89
10.4.1 Master–slave GP 89
10.4.2 GP Running on GPUs 90
10.4.3 GP on FPGAs 92
10.4.4 Sub-machine-code GP 93
10.5 Geographically Distributed GP 93
11 GP Theory and its Applications 97
11.1 Mathematical Models 98
11.2 Search Spaces 99
11.3 Bloat 101
11.3.1 Bloat in Theory 101
11.3.2 Bloat Control in Practice 104
III
Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results – the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers – Hyper-heuristics xiii
12.9 Entertainment and Computer Games 127
12.10The Arts 127
12.11Compression 128
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
Tricks of the Trade
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources 145
B TinyGP 151
B.1 Overview of TinyGP 151
B.2 Input Data Files for TinyGP 153
B.3 Source Code 154
B.4 Compiling and Running TinyGP 162
Bibliography 167
Inde