
    Signal structure: from manifolds to molecules and structured sparsity

    Effective representation methods and proper signal priors are crucial in most signal processing applications. In this thesis we focus on different structured models and design schemes that discover the low-dimensional latent structures that characterise and identify signals. Motivated by the highly non-linear structure of most datasets, we first investigate the geometry of manifolds. Manifolds are low-dimensional, non-linear structures that are naturally employed to describe sets of strongly related signals, such as the images of a 3-D object captured from different viewpoints. However, the use of manifolds in applications is not straightforward because they are usually non-linear and have no analytic form. We propose a way to 'disassemble' a manifold into simpler components by approximating it with affine subspaces. Our objective is to discover a set of low-dimensional affine subspaces that represent manifold data accurately while preserving the manifold's structure. To this end, we employ a greedy technique that iteratively merges manifold samples into groups based on the difference of their local tangents. We use our algorithm to approximate synthetic and real manifolds and demonstrate that it is competitive with state-of-the-art techniques. We then consider structured sparse representations of signals and propose a new sparsity model in which signals are composed of a small number of structured molecules. We define molecules as linear combinations of a small number of atoms in a redundant dictionary. Our multi-level model takes into account the energy distribution of the significant signal components in addition to their support. It makes it possible to define typical visual patterns and recognise them in prototypical or deformed form. We define a new structural difference measure between molecules and their deformed versions, based on their sparse codes, and we create an algorithm for decomposing signals into molecules that can account for different deviations in the internal molecule structure. Our experiments verify the benefits of the new image model in various restoration tasks and confirm that models extending the mere notion of sparsity can be very useful for inverse problems in imaging. Finally, we investigate the problem of learning molecule representations directly in the sparse code domain. We constrain sparse codes to be linear combinations of a few, possibly deformed, molecules and design an algorithm that learns the structure from the codes without transforming them back into the signal domain. To this end, we take advantage of our structural difference measure, which is based on the sparse codes, and devise a scheme that represents the codes with molecules and learns the molecules at the same time. To illustrate the effectiveness of the proposed algorithm, we apply it to various synthetic and real datasets and compare the results with traditional sparse coding and dictionary learning techniques. The experiments verify the superior performance of our scheme in interpreting and correctly recognising the underlying structure. In short, this thesis is concerned with low-dimensional, structured models. Among the various choices, we focus on manifolds and sparse representations, and we propose schemes that enhance their structural properties and highlight their effectiveness in signal representation.
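The molecule model described in the abstract can be illustrated with a toy example. The sketch below is only an assumption-laden illustration, not the thesis algorithm: the dictionary, the molecule coefficients, and the helper names `synthesize` and `combine_molecules` are all hypothetical. It shows a molecule as a sparse combination of atoms (its coefficients carry the energy distribution, its keys the support) and a signal as a weighted sum of a few molecules.

```python
# Hypothetical toy example: all names and values are illustrative assumptions.

def synthesize(dictionary, code):
    """Return sum_k code[k] * dictionary[k] for a sparse code {atom_index: coeff}."""
    length = len(dictionary[0])
    signal = [0.0] * length
    for k, coeff in code.items():
        for i in range(length):
            signal[i] += coeff * dictionary[k][i]
    return signal


def combine_molecules(molecules, weights):
    """Sparse code of a signal built from weighted molecules.

    Each molecule is a sparse code {atom_index: coeff}; weights maps
    molecule_index -> weight.
    """
    code = {}
    for m, w in weights.items():
        for k, coeff in molecules[m].items():
            code[k] = code.get(k, 0.0) + w * coeff
    return code


# Redundant dictionary: 5 atoms in a 4-dimensional signal space (assumed values).
DICTIONARY = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]

# Two molecules, each a combination of two atoms.
MOLECULES = [
    {0: 1.0, 1: 0.5},  # molecule 0 uses atoms 0 and 1
    {2: 1.0, 4: 2.0},  # molecule 1 uses atoms 2 and 4
]

# A signal made of 2 x molecule 0 plus 1 x molecule 1.
code = combine_molecules(MOLECULES, {0: 2.0, 1: 1.0})
signal = synthesize(DICTIONARY, code)
```

Here `code` is the sparse code of the signal over the atoms, so structure can be compared in the code domain without going back to the signal itself, which is the setting the last part of the abstract refers to.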

    Subspace clustering of microarray data based on domain transformation

    Abstract. We propose a mining framework that supports the identification of useful knowledge based on data clustering. Given the recent advances in microarray technologies, we focus on mining gene expression datasets. In particular, because genes are often co-expressed under subsets of experimental conditions, we present a novel subspace clustering algorithm. In contrast to previous approaches, our method is based on the observation that the number of subspace clusters is related to the number of maximal subspace clusters to which any gene pair can belong. By discretizing the gene expression profiles, the similarity between two genes is transformed into a sequence of symbols that represents the maximal subspace cluster for the gene pair. This domain transformation (from genes into gene-gene relations) allows us to make the number of possible subspace clusters dependent on the number of genes. Based on the symbolic representations of genes, we present an efficient subspace clustering algorithm that scales with the number of dimensions. In addition, the running time can be drastically reduced by using an inverted index and pruning non-interesting subspaces. Experimental results indicate that the proposed method efficiently identifies co-expressed gene subspace clusters in a yeast cell cycle dataset.
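One possible reading of the domain transformation described in the abstract is sketched below. This is a minimal illustration under stated assumptions, not the authors' algorithm: the three-symbol alphabet, the thresholds, the `*` placeholder, the sample data, and the function names `discretize` and `pair_pattern` are all hypothetical. Each expression profile is discretized into symbols, and each gene pair is mapped to the sequence of symbols on which the two genes agree, which stands in for the pair's maximal subspace.

```python
from itertools import combinations


def discretize(profile, low=-0.5, high=0.5):
    """Map each value to 'D' (down), 'N' (no change), or 'U' (up).

    The thresholds are assumed values chosen for illustration.
    """
    return "".join(
        "D" if v < low else "U" if v > high else "N" for v in profile
    )


def pair_pattern(sym_a, sym_b):
    """Keep the shared symbol where two genes agree; '*' marks disagreement."""
    return "".join(a if a == b else "*" for a, b in zip(sym_a, sym_b))


# Hypothetical expression values for three genes under four conditions.
expression = {
    "g1": [1.2, -0.9, 0.1, 0.8],
    "g2": [0.9, -1.1, 0.7, 0.6],
    "g3": [-0.8, -0.7, 0.0, 1.5],
}

# Step 1: discretize every profile into a symbol string.
symbols = {gene: discretize(profile) for gene, profile in expression.items()}

# Step 2: transform genes into gene-gene relations, one pattern per pair.
patterns = {
    (a, b): pair_pattern(symbols[a], symbols[b])
    for a, b in combinations(sorted(symbols), 2)
}
```

In this toy data, `patterns[("g1", "g2")]` is `"UD*U"`: the two genes share the same discretized behaviour under conditions 1, 2, and 4. Grouping pairs by identical or compatible patterns, for example through an inverted index from pattern to gene pairs, would be a natural next step, though the abstract does not specify how that index is built.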