22 research outputs found

    Correntropy Maximization via ADMM - Application to Robust Hyperspectral Unmixing

    In hyperspectral images, some spectral bands suffer from a low signal-to-noise ratio due to noisy acquisition and atmospheric effects, thus requiring robust techniques for the unmixing problem. This paper presents a robust supervised spectral unmixing approach for hyperspectral images. The robustness is achieved by writing the unmixing problem as the maximization of the correntropy criterion subject to the most commonly used constraints. Two unmixing problems are derived: the first considers fully-constrained unmixing, with both the non-negativity and sum-to-one constraints, while the second deals with the non-negativity constraint and a sparsity-promoting penalty on the abundances. The corresponding optimization problems are solved efficiently using an alternating direction method of multipliers (ADMM) approach. Experiments on synthetic and real hyperspectral images validate the performance of the proposed algorithms in different scenarios, demonstrating that correntropy-based unmixing is robust to outlier bands.
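    The abstract describes maximizing a correntropy criterion over the abundance vector under non-negativity and sum-to-one constraints. The sketch below illustrates that objective on a toy problem; it uses a generic constrained solver rather than the paper's ADMM scheme, and the function name, kernel bandwidth, and simulated data are illustrative assumptions, not material from the paper.

```python
# Minimal sketch of correntropy-based fully-constrained unmixing.
# The paper solves this with ADMM; here a generic SLSQP solver is used
# purely to illustrate the objective and constraints.
import numpy as np
from scipy.optimize import minimize

def correntropy_unmix(y, M, sigma=0.5):
    """Maximize sum_l exp(-r_l^2 / (2 sigma^2)) with r = y - M a,
    subject to a >= 0 and sum(a) = 1. sigma is an illustrative bandwidth."""
    R = M.shape[1]

    def neg_correntropy(a):
        r = y - M @ a
        return -np.sum(np.exp(-r**2 / (2 * sigma**2)))

    a0 = np.full(R, 1.0 / R)                       # uniform start on the simplex
    cons = {"type": "eq", "fun": lambda a: np.sum(a) - 1.0}
    bounds = [(0.0, None)] * R
    res = minimize(neg_correntropy, a0, method="SLSQP",
                   bounds=bounds, constraints=cons)
    return res.x

# Toy example: 3 endmembers, 50 bands, a few corrupted (outlier) bands.
rng = np.random.default_rng(0)
M = rng.random((50, 3))
a_true = np.array([0.6, 0.3, 0.1])
y = M @ a_true + 0.01 * rng.standard_normal(50)
y[:5] += 2.0                                       # outlier bands
print(correntropy_unmix(y, M).round(3))
```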

    Statistical signal processing of nonstationary tensor-valued data

    Real-world signals, such as the evolution of three-dimensional vector fields over time, can exhibit highly structured probabilistic interactions across their multiple constitutive dimensions. This calls for analysis tools capable of directly capturing the inherent multi-way couplings present in such data. Yet, current analyses typically employ multivariate matrix models and their associated linear algebras, which are agnostic to the global data structure and can only describe local linear pairwise relationships between data entries. To address this issue, this thesis uses the property of linear separability, a notion intrinsic to multi-dimensional data structures called tensors, as a linchpin to consider probabilistic, statistical and spectral separability under one umbrella. This helps to both enhance physical meaning in the analysis and reduce the dimensionality of tensor-valued problems.

    We first introduce a new identifiable probability distribution which appropriately models the interactions between random tensors, whereby linear relationships are considered between tensor fibres as opposed to between individual entries as in standard matrix analysis. Unlike existing models, the proposed tensor probability distribution formulation is shown to yield a unique maximum likelihood estimator which is demonstrated to be statistically efficient. Both matrices and vectors are lower-order tensors, and this gives us a unique opportunity to consider some matrix signal processing models under the more powerful framework of multilinear tensor algebra. By introducing a model for the joint distribution of multiple random tensors, it is also possible to treat random tensor regression analyses and subspace methods within a unified separability framework. Practical utility of the proposed analysis is demonstrated through case studies over synthetic and real-world tensor-valued data, including the evolution over time of global atmospheric temperatures and international interest rates.

    Another overarching theme in this thesis is the nonstationarity inherent to real-world signals, which typically consist of both deterministic and stochastic components. This thesis aims to help bridge the gap between the formal probabilistic theory of stochastic processes and empirical signal processing methods for deterministic signals by providing a spectral model for a class of nonstationary signals, whereby the deterministic and stochastic time-domain signal properties are designated respectively by the first- and second-order moments of the signal in the frequency domain. By virtue of the assumed probabilistic model, novel tests for nonstationarity detection are devised and demonstrated to be effective in low-SNR environments. The proposed spectral analysis framework, which is intrinsically complex-valued, is facilitated by augmented complex algebra in order to fully capture the joint distribution of the real and imaginary parts of complex random variables, using a compact formulation.

    Finally, motivated by the need for signal processing algorithms which naturally cater for the nonstationarity inherent to real-world tensors, the above contributions are employed simultaneously to derive a general statistical signal processing framework for nonstationary tensors. This is achieved by introducing a new augmented complex multilinear algebra which allows for a concise description of the multilinear interactions between the real and imaginary parts of complex tensors. These contributions are further supported by new physically meaningful empirical results on the statistical analysis of nonstationary global atmospheric temperatures.
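    The separability idea above (linear relationships between tensor fibres rather than between individual entries) reduces, in the matrix-variate special case, to a Kronecker-structured covariance. The sketch below estimates such a structure with the standard flip-flop iteration; it illustrates separability on simulated data under assumed names and is not the thesis' own estimator.

```python
# Minimal sketch: ML estimation of a separable (Kronecker) covariance
# via the standard flip-flop iteration, the matrix-variate special case
# of tensor separability.
import numpy as np

def flip_flop(X, n_iter=50):
    """X: array of shape (N, p, q) holding N i.i.d. matrix samples.
    Returns row covariance (p x p) and column covariance (q x q);
    the factors are identifiable only up to a reciprocal scale."""
    N, p, q = X.shape
    A = np.eye(p)   # row covariance
    B = np.eye(q)   # column covariance
    for _ in range(n_iter):
        Binv = np.linalg.inv(B)
        A = sum(Xi @ Binv @ Xi.T for Xi in X) / (N * q)
        Ainv = np.linalg.inv(A)
        B = sum(Xi.T @ Ainv @ Xi for Xi in X) / (N * p)
    return A, B

# Toy example: simulate matrix-normal data with known Kronecker factors.
rng = np.random.default_rng(1)
p, q, N = 4, 3, 500
La = np.tril(rng.standard_normal((p, p))) + p * np.eye(p)
Lb = np.tril(rng.standard_normal((q, q))) + q * np.eye(q)
Z = rng.standard_normal((N, p, q))
X = np.einsum("ij,njk,lk->nil", La, Z, Lb)   # X_n = La @ Z_n @ Lb.T
A_hat, B_hat = flip_flop(X)
```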

    Mathematics and Digital Signal Processing

    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue aims at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used today in robotic systems and industrial process control. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The presented results are of interest to researchers in the field of applied mathematics and to developers of modern digital signal processing systems.
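    As a small illustration of the wavelet de-noising mentioned above, the following sketch decomposes a noisy signal, soft-thresholds the detail coefficients, and reconstructs it. It uses the PyWavelets package; the wavelet, decomposition level, and threshold rule are illustrative choices rather than methods from any particular contribution in the Special Issue.

```python
# Minimal wavelet de-noising sketch: decompose, soft-threshold details,
# reconstruct. Threshold is the universal threshold with a MAD noise estimate.
import numpy as np
import pywt

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sign(np.sin(2 * np.pi * 11 * t))
noisy = clean + 0.3 * rng.standard_normal(t.size)

coeffs = pywt.wavedec(noisy, "db4", level=5)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise level from finest scale
thr = sigma * np.sqrt(2 * np.log(noisy.size))         # universal threshold
coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, "db4")
```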

    Relevant data representation by a Kernel-based framework

    Nowadays, the analysis of large amounts of data has emerged as an issue of great interest in the scientific community, especially in automation, signal processing, pattern recognition, and machine learning. In this sense, the identification, description, classification, visualization, and clustering of events or patterns are important problems for engineering developments and scientific fields such as biology, medicine, economics, artificial vision, artificial intelligence, and industrial production. Nonetheless, it is difficult to interpret the available information due to its complexity and the large number of extracted features. In addition, the analysis of the input data requires the development of methodologies that reveal the relevant behaviors of the studied process, particularly when such signals contain hidden structures varying over a given domain, e.g., space and/or time. When the analyzed signal contains this kind of property, directly applying signal processing and machine learning procedures, without considering a suitable model that deals with both the statistical distribution and the data structure, can lead to unstable performance. In this regard, kernel functions appear as an alternative approach to address the aforementioned issues by providing flexible mathematical tools that enhance data representation in support of signal processing and machine learning systems. Moreover, kernel-based methods are powerful tools for developing better-performing solutions by adapting the kernel to a given problem, instead of learning data relationships from explicit raw vector representations. However, building suitable kernels requires prior user knowledge about the input data, which is not available in most practical cases. Furthermore, directly using the definitions of traditional kernel methods poses a challenging estimation problem that often leads to strong simplifications restricting the kind of representation that can be used on the data.

    In this study, we propose a data representation framework based on kernel methods to automatically learn relevant sample relationships in learning systems. The proposed framework comprises five kernel-based approaches, which aim to compute relevant data representations by adapting them according to both the imposed sample-relationship constraints and the learning scenario (unsupervised or supervised task). First, we develop a kernel-based representation approach that reveals the main input sample relations by including relevant data structures using graph-based sparse constraints. Thus, salient data structures are highlighted with the aim of favoring further unsupervised clustering stages. This approach can be viewed as a graph pruning strategy within a spectral clustering framework which enhances both the local and global data consistencies for a given input similarity matrix. Second, we introduce a kernel-based representation methodology that captures meaningful data relations in terms of their statistical distribution. Thus, an information theoretic learning (ITL) based penalty function is introduced to estimate a kernel-based similarity that maximizes the whole information potential variability. We therefore seek a reproducing kernel Hilbert space (RKHS) that spans the widest information force magnitudes among data points to support further clustering stages.
Third, an entropy-like functional on positive definite matrices based on Renyi's definition is adapted to develop a kernel-based representation approach which considers the statistical distribution and the salient data structures. Thereby, relevant input patterns are highlighted in unsupervised learning tasks. In particular, the introduced approach is tested as a tool to encode relevant local and global input data relationships in dimensionality reduction applications. Fourth, a supervised kernel-based representation is introduced via a metric learning procedure in an RKHS that takes advantage of user prior knowledge, when available, regarding the studied learning task. Such an approach incorporates the proposed ITL-based kernel functional estimation strategy to automatically adapt the relevant representation using both the supervised information and the statistical distribution of the input data. As a result, relevant sample dependencies are highlighted by weighting the input features that most strongly encode the supervised learning task. Finally, a new generalized kernel-based measure is proposed by taking advantage of different RKHSs. In this way, relevant dependencies are highlighted automatically by considering the domain-varying behavior of the input data and the user prior knowledge (supervised information) when available. The proposed measure is an extension of the well-known cross-correntropy function based on Hilbert space embeddings. Throughout the study, the proposed kernel-based framework is applied to biosignal and image data as an alternative to support aided diagnosis systems and image-based object analysis. Indeed, the introduced kernel-based framework improves, in most cases, unsupervised and supervised learning performance, aiding researchers in their quest to process and understand complex data.
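    A widely used functional of the kind named in the third approach above is the matrix-based Renyi entropy of a trace-normalized Gram matrix. The sketch below computes it for an RBF kernel on two toy data sets; the kernel width, the order alpha, and the data are illustrative assumptions, not the thesis' exact functional or settings.

```python
# Minimal sketch: matrix-based Renyi entropy of a trace-normalized RBF Gram matrix.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def rbf_gram(X, sigma=1.0):
    D2 = squareform(pdist(X, "sqeuclidean"))
    return np.exp(-D2 / (2 * sigma**2))

def matrix_renyi_entropy(K, alpha=2.0):
    A = K / np.trace(K)                          # normalize to unit trace
    eigvals = np.linalg.eigvalsh(A)
    eigvals = eigvals[eigvals > 1e-12]
    return np.log2(np.sum(eigvals**alpha)) / (1.0 - alpha)

rng = np.random.default_rng(0)
X_tight = rng.standard_normal((100, 2)) * 0.1    # concentrated samples
X_spread = rng.standard_normal((100, 2)) * 3.0   # dispersed samples
print(matrix_renyi_entropy(rbf_gram(X_tight)))   # lower entropy
print(matrix_renyi_entropy(rbf_gram(X_spread)))  # higher entropy
```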

    Learning understandable classifier models.

    The topic of this dissertation is the automation of the process of extracting understandable patterns and rules from data. An unprecedented amount of data is available to anyone with a computer connected to the Internet. The disciplines of Data Mining and Machine Learning have emerged over the last two decades to face this challenge. This has led to the development of many tools and methods. These tools often produce models that make very accurate predictions about previously unseen data. However, models built by the most accurate methods are usually hard for humans to understand or interpret. As a consequence, they deliver only decisions, without any explanation, and hence do not directly lead to the acquisition of new knowledge. This dissertation contributes to bridging the gap between accurate opaque models and those that are less accurate but more transparent to humans.

    The dissertation first defines the problem of learning from data. It surveys the state-of-the-art methods for supervised learning of both understandable and opaque models from data, as well as unsupervised methods that detect features present in the data. It describes popular methods of rule extraction that rewrite unintelligible models into an understandable form, and discusses the limitations of rule extraction. A novel definition of understandability, which ties computational complexity to learning, is provided to show that rule extraction is an NP-hard problem. Next, it discusses whether one can expect that even an accurate classifier has learned new knowledge. The survey ends with a presentation of two approaches to building understandable classifiers. On the one hand, understandable models must be able to accurately describe relations in the data. On the other hand, a description of the output of a system in terms of its input often requires the introduction of intermediate concepts, called features. It is therefore crucial to develop methods that describe the data with understandable features and are able to use those features to present the relation that describes the data.

    The novel contributions of this dissertation follow the survey. Two families of rule extraction algorithms are considered. First, a method that can work with any opaque classifier is introduced: artificial training patterns are generated in a mathematically sound way and used to train more accurate understandable models. Subsequently, two novel algorithms that require the opaque model to be a neural network are presented. They rely on access to the network's weights and biases to induce rules encoded as Decision Diagrams. Finally, the topic of feature extraction is considered. The impact of imposing non-negativity constraints on the weights of a neural network is studied; it is proved that a three-layer network with non-negative weights can shatter any given set of points, and experiments are conducted to assess the accuracy and interpretability of such networks. Then, a novel path-following algorithm that finds robust sparse encodings of data is presented. In summary, this dissertation contributes to improved understandability of classifiers in several tangible and original ways. It introduces three distinct aspects of achieving this goal: infusion of additional patterns from the underlying pattern distribution into rule learners, the derivation of decision diagrams from neural networks, and achieving sparse coding with neural networks with non-negative weights.
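    As a rough illustration of the first family of contributions (training an understandable model on artificial patterns labeled by an opaque classifier), the sketch below distills a random forest into a shallow decision tree with scikit-learn. The Gaussian perturbation used to generate the patterns and all parameter choices are simplifying assumptions, not the dissertation's generation procedure.

```python
# Minimal sketch: generate artificial patterns, label them with an opaque
# model, and train an understandable surrogate (a shallow decision tree).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
opaque = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X_art = np.vstack([X + 0.05 * X.std(axis=0) * rng.standard_normal(X.shape)
                   for _ in range(5)])           # artificial patterns
y_art = opaque.predict(X_art)                    # labeled by the opaque model

surrogate = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_art, y_art)
print(export_text(surrogate))                    # human-readable rules
fidelity = (surrogate.predict(X) == opaque.predict(X)).mean()
print(f"fidelity to opaque model: {fidelity:.3f}")
```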