10 research outputs found

    NON-LINEAR AND SPARSE REPRESENTATIONS FOR MULTI-MODAL RECOGNITION

    Get PDF
    In the first part of this dissertation, we address the problem of representing 2D and 3D shapes. In particular, we introduce a novel implicit shape representation based on Support Vector Machine (SVM) theory. Each shape is represented by an analytic decision function obtained by training an SVM, with a Radial Basis Function (RBF) kernel, so that the interior shape points are given higher values. This empowers support vector shape (SVS) with multifold advantages. First, the representation uses a sparse subset of feature points determined by the support vectors, which significantly improves the discriminative power against noise, fragmentation and other artifacts that often come with the data. Second, the use of the RBF kernel provides scale, rotation, and translation invariant features, and allows a shape to be represented accurately regardless of its complexity. Finally, the decision function can be used to select reliable feature points. These features are described using gradients computed from highly consistent decision functions instead of conventional edges. Our experiments on 2D and 3D shapes demonstrate promising results. The availability of inexpensive 3D sensors like Kinect necessitates the design of new representation for this type of data. We present a 3D feature descriptor that represents local topologies within a set of folded concentric rings by distances from local points to a projection plane. This feature, called as Concentric Ring Signature (CORS), possesses similar computational advantages to point signatures yet provides more accurate matches. CORS produces compact and discriminative descriptors, which makes it more robust to noise and occlusions. It is also well-known to computer vision researchers that there is no universal representation that is optimal for all types of data or tasks. Sparsity has proved to be a good criterion for working with natural images. This motivates us to develop efficient sparse and non-linear learning techniques for automatically extracting useful information from visual data. Specifically, we present dictionary learning methods for sparse and redundant representations in a high-dimensional feature space. Using the kernel method, we describe how the well-known dictionary learning approaches such as the method of optimal directions and KSVD can be made non-linear. We analyse their kernel constructions and demonstrate their effectiveness through several experiments on classification problems. It is shown that non-linear dictionary learning approaches can provide significantly better discrimination compared to their linear counterparts and kernel PCA, especially when the data is corrupted by different types of degradations. Visual descriptors are often high dimensional. This results in high computational complexity for sparse learning algorithms. Motivated by this observation, we introduce a novel framework, called sparse embedding (SE), for simultaneous dimensionality reduction and dictionary learning. We formulate an optimization problem for learning a transformation from the original signal domain to a lower-dimensional one in a way that preserves the sparse structure of data. We propose an efficient optimization algorithm and present its non-linear extension based on the kernel methods. One of the key features of our method is that it is computationally efficient as the learning is done in the lower-dimensional space and it discards the irrelevant part of the signal that derails the dictionary learning process. Various experiments show that our method is able to capture the meaningful structure of data and can perform significantly better than many competitive algorithms on signal recovery and object classification tasks. In many practical applications, we are often confronted with the situation where the data that we use to train our models are different from that presented during the testing. In the final part of this dissertation, we present a novel framework for domain adaptation using a sparse and hierarchical network (DASH-N), which makes use of the old data to improve the performance of a system operating on a new domain. Our network jointly learns a hierarchy of features together with transformations that rectify the mismatch between different domains. The building block of DASH-N is the latent sparse representation. It employs a dimensionality reduction step that can prevent the data dimension from increasing too fast as traversing deeper into the hierarchy. Experimental results show that our method consistently outperforms the current state-of-the-art by a significant margin. Moreover, we found that a multi-layer {DASH-N} has an edge over the single-layer DASH-N

    Bayesian Dictionary Learning for Single and Coupled Feature Spaces

    Get PDF
    Over-complete bases offer the flexibility to represent much wider range of signals with more elementary basis atoms than signal dimension. The use of over-complete dictionaries for sparse representation has been a new trend recently and has increasingly become recognized as providing high performance for applications such as denoise, image super-resolution, inpaiting, compression, blind source separation and linear unmixing. This dissertation studies the dictionary learning for single or coupled feature spaces and its application in image restoration tasks. A Bayesian strategy using a beta process prior is applied to solve both problems. Firstly, we illustrate how to generalize the existing beta process dictionary learning method (BP) to learn dictionary for single feature space. The advantage of this approach is that the number of dictionary atoms and their relative importance may be inferred non-parametrically. Next, we propose a new beta process joint dictionary learning method (BP-JDL) for coupled feature spaces, where the learned dictionaries also reflect the relationship between the two spaces. Compared to previous couple feature spaces dictionary learning algorithms, our algorithm not only provides dictionaries that customized to each feature space, but also adds more consistent and accurate mapping between the two feature spaces. This is due to the unique property of the beta process model that the sparse representation can be decomposed to values and dictionary atom indicators. The proposed algorithm is able to learn sparse representations that correspond to the same dictionary atoms with the same sparsity but different values in coupled feature spaces, thus bringing consistent and accurate mapping between coupled feature spaces. Two applications, single image super-resolution and inverse halftoning, are chosen to evaluate the performance of the proposed Bayesian approach. In both cases, the Bayesian approach, either for single feature space or coupled feature spaces, outperforms state-of-the-art methods in comparative domains

    Colour coded

    Get PDF
    This 300 word publication to be published by the Society of Dyers and Colourists (SDC) is a collection of the best papers from a 4-year European project that has considered colour from the perspective of both the arts and sciences.The notion of art and science and the crossovers between the two resulted in application and funding for cross disciplinary research to host a series of training events between 2006 and 2010 Marie Curie Conferences & Training Courses (SCF) Call Identifier: FP6-Mobility-4, Euros 532,363.80 CREATE – Colour Research for European Advanced Technology Employment. The research crossovers between the fields of art, science and technology was also a subject that was initiated through Bristol’s Festival if Ideas events in May 2009. The author coordinated and chaired an event during which the C.P Snow lecture “On Two Cultures’ (1959) was re-presented by Actor Simon Cook and then a lecture made by Raymond Tallis on the notion of the Polymath. The CREATE project has a worldwide impact for researchers, academics and scientists. Between January and October 2009, the site has received 221, 414 visits. The most popular route into the site is via the welcome page. The main groups of visitors originate in the UK (including Northern Ireland), Italy, France, Finland, Norway, Hungary, USA, Finland and Spain. A basic percentage breakdown of the traffic over ten months indicates: USA -15%; UK - 16%; Italy - 13%; France -12%; Hungary - 10%; Spain - 6%; Finland - 9%; Norway - 5%. The remaining approximate 14% of visitors are from other countries including Belgium, The Netherlands and Germany (approx 3%). A discussion group has been initiated by the author as part of the CREATE project to facilitate an ongoing dialogue between artists and scientists. http://createcolour.ning.com/group/artandscience www.create.uwe.ac.uk.Related papers to this research: A report on the CREATE Italian event: Colour in cultural heritage.C. Parraman, A. Rizzi, ‘Developing the CREATE network in Europe’, in Colour in Art, Design and Nature, Edinburgh, 24 October 2008.C. Parraman, “Mixing and describing colour”. CREATE (Training event 1), France, 2008

    Real-time Ultrasound Signals Processing: Denoising and Super-resolution

    Get PDF
    Ultrasound acquisition is widespread in the biomedical field, due to its properties of low cost, portability, and non-invasiveness for the patient. The processing and analysis of US signals, such as images, 2D videos, and volumetric images, allows the physician to monitor the evolution of the patient's disease, and support diagnosis, and treatments (e.g., surgery). US images are affected by speckle noise, generated by the overlap of US waves. Furthermore, low-resolution images are acquired when a high acquisition frequency is applied to accurately characterise the behaviour of anatomical features that quickly change over time. Denoising and super-resolution of US signals are relevant to improve the visual evaluation of the physician and the performance and accuracy of processing methods, such as segmentation and classification. The main requirements for the processing and analysis of US signals are real-time execution, preservation of anatomical features, and reduction of artefacts. In this context, we present a novel framework for the real-time denoising of US 2D images based on deep learning and high-performance computing, which reduces noise while preserving anatomical features in real-time execution. We extend our framework to the denoise of arbitrary US signals, such as 2D videos and 3D images, and we apply denoising algorithms that account for spatio-temporal signal properties into an image-to-image deep learning model. As a building block of this framework, we propose a novel denoising method belonging to the class of low-rank approximations, which learns and predicts the optimal thresholds of the Singular Value Decomposition. While previous denoise work compromises the computational cost and effectiveness of the method, the proposed framework achieves the results of the best denoising algorithms in terms of noise removal, anatomical feature preservation, and geometric and texture properties conservation, in a real-time execution that respects industrial constraints. The framework reduces the artefacts (e.g., blurring) and preserves the spatio-temporal consistency among frames/slices; also, it is general to the denoising algorithm, anatomical district, and noise intensity. Then, we introduce a novel framework for the real-time reconstruction of the non-acquired scan lines through an interpolating method; a deep learning model improves the results of the interpolation to match the target image (i.e., the high-resolution image). We improve the accuracy of the prediction of the reconstructed lines through the design of the network architecture and the loss function. %The design of the deep learning architecture and the loss function allow the network to improve the accuracy of the prediction of the reconstructed lines. In the context of signal approximation, we introduce our kernel-based sampling method for the reconstruction of 2D and 3D signals defined on regular and irregular grids, with an application to US 2D and 3D images. Our method improves previous work in terms of sampling quality, approximation accuracy, and geometry reconstruction with a slightly higher computational cost. For both denoising and super-resolution, we evaluate the compliance with the real-time requirement of US applications in the medical domain and provide a quantitative evaluation of denoising and super-resolution methods on US and synthetic images. Finally, we discuss the role of denoising and super-resolution as pre-processing steps for segmentation and predictive analysis of breast pathologies

    Sparse Methods for Robust and Efficient Visual Recognition

    Get PDF
    Visual recognition has been a subject of extensive research in computer vision. A vast literature exists on feature extraction and learning methods for recognition. However, due to large variations in visual data, robust visual recognition is still an open problem. In recent years, sparse representation-based methods have become popular for visual recognition. By learning a compact dictionary of data and exploiting the notion of sparsity, start-of-the-art results have been obtained on many recognition tasks. However, existing data-driven sparse model techniques may not be optimal for some challenging recognition problems. In this dissertation, we consider some of these recognition tasks and present approaches based on sparse coding for robust and efficient recognition in such cases. First we study the problem of low-resolution face recognition. This is a challenging problem, and methods have been proposed using super-resolution and machine learning based techniques. However, these methods cannot handle variations like illumination changes which can happen at low resolutions, and degrade the performance. We propose a generative approach for classifying low resolution faces, by exploiting 3D face models. Further, we propose a joint sparse coding framework for robust classification at low resolutions. The effectiveness of the method is demonstrated on different face datasets. In the second part, we study a robust feature-level fusion method for multimodal biometric recognition. Although score-level and decision-level fusion methods exist in biometric literature, feature-level fusion is challenging due to different output formats of biometric modalities. In this work, we propose a novel sparse representation-based method for multimodal fusion, and present experimental results for a large multimodal dataset. Robustness to noise and occlusion are demonstrated. In the third part, we consider the problem of domain adaptation, where we want to learn effective classifiers for cases where the test images come from a different distribution than the training data. Typically, due to high cost of human annotation, very few labeled samples are available for images in the test domain. Specifically, we study the problem of adapting sparse dictionary-based classification methods for such cases. We describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both domains in the projected low dimensional space. The proposed method is efficient and performs on par or better than many competing state-of-the-art methods. Lastly, we study an emerging analysis framework of sparse coding for image classification. We show that the analysis sparse coding can give similar performance as the typical synthesis sparse coding methods, while being much faster at sparse encoding. In the end, we conclude the dissertation with discussions and possible future directions

    Reoriented Illustration: Towards the Networked Image

    Get PDF
    In recent years there has emerged an increasing theoretical and contextual impetus from within the discipline of illustration that would seek to define the practice by authorial approaches to the production and distribution of illustrated content. The priority of this investigation is the attempt to imagine a theoretical landscape or environment in which an ‘authorial turn’ within the discipline might emerge and anchor itself to strategies outside of persisting colloquial or industrial notions of illustration practice. Specifically, this paper aims to tie such thinking to existing practices and concepts relevant to the contemporary construction, distribution and exchange of networked images

    2009-2010 Course Catalog

    Get PDF
    2009-2010 Course Catalo

    The Music Sound

    Get PDF
    A guide for music: compositions, events, forms, genres, groups, history, industry, instruments, language, live music, musicians, songs, musicology, techniques, terminology , theory, music video. Music is a human activity which involves structured and audible sounds, which is used for artistic or aesthetic, entertainment, or ceremonial purposes. The traditional or classical European aspects of music often listed are those elements given primacy in European-influenced classical music: melody, harmony, rhythm, tone color/timbre, and form. A more comprehensive list is given by stating the aspects of sound: pitch, timbre, loudness, and duration. Common terms used to discuss particular pieces include melody, which is a succession of notes heard as some sort of unit; chord, which is a simultaneity of notes heard as some sort of unit; chord progression, which is a succession of chords (simultaneity succession); harmony, which is the relationship between two or more pitches; counterpoint, which is the simultaneity and organization of different melodies; and rhythm, which is the organization of the durational aspects of music
    corecore