16 research outputs found

    High Dimensional Covariance Estimation for Spatio-Temporal Processes

    Full text link
    High dimensional time series and array-valued data are ubiquitous in signal processing, machine learning, and science. Due to the additional (temporal) direction, the total dimensionality of the data is often extremely high, requiring large numbers of training examples to learn the distribution using unstructured techniques. However, due to difficulties in sampling, small population sizes, and/or rapid system changes in time, it is often the case that very few relevant training samples are available, necessitating the imposition of structure on the data if learning is to be done. The mean and covariance are useful tools to describe high dimensional distributions because (via the Gaussian likelihood function) they are a data-efficient way to describe a general multivariate distribution, and allow for simple inference, prediction, and regression via classical techniques. In this work, we develop various forms of multidimensional covariance structure that explicitly exploit the array structure of the data, in a way analogous to the widely used low rank modeling of the mean. This allows dramatic reductions in the number of training samples required, in some cases to a single training sample. Covariance models of this form have been increasing in interest recently, and statistical performance bounds for high dimensional estimation in sample-starved scenarios are of great relevance. This thesis focuses on the high-dimensional covariance estimation problem, exploiting spatio-temporal structure to reduce sample complexity. 
Contributions are made in the following areas: (1) development of a variety of rich Kronecker product-based covariance models that exploit spatio-temporal and other structure, with applications to sample-starved real-data problems; (2) strong performance bounds for high-dimensional estimation of covariances under each model; and (3) a strongly adaptive online method for estimating changing optimal low-dimensional metrics (inverse covariances) for high-dimensional data from a series of similarity labels.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/137082/1/greenewk_1.pd
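The parameter savings from a Kronecker-structured covariance can be illustrated with a small sketch (the dimensions and factor matrices below are illustrative, not taken from the thesis): modeling the covariance of vectorized spatio-temporal data as a Kronecker product of a spatial factor and a temporal factor collapses the parameter count from quadratic in the product dimension to quadratic in each factor dimension.

```python
import numpy as np

p, q = 4, 6  # spatial and temporal dimensions (illustrative sizes)

# Hypothetical factor covariances: a spatial factor A (p x p) and an
# AR(1)-like temporal factor B (q x q), both positive definite.
A = np.eye(p) + 0.3 * np.ones((p, p))
B = 0.5 ** np.abs(np.subtract.outer(np.arange(q), np.arange(q)))

# Kronecker-structured covariance of the vectorized p*q-dimensional data.
Sigma = np.kron(A, B)

# Free parameters: unstructured covariance vs. Kronecker-structured model.
unstructured = (p * q) * (p * q + 1) // 2
structured = p * (p + 1) // 2 + q * (q + 1) // 2
print(unstructured, structured)  # 300 vs. 31
```

Because only the two small factors must be estimated, far fewer training samples are needed than for a full 24 x 24 covariance, which is the sample-complexity reduction the abstract refers to.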

    Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

    Get PDF
In many machine learning applications, measurements are sometimes incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy, or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples, and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification, and water-network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance on low-quality datasets.
Instituto Argentino de Radioastronomía
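One of the decomposition-based imputation ideas reviewed in work of this kind can be sketched with a minimal example (synthetic data and an iterative SVD "hard-impute" loop chosen here for illustration; the article covers a broader family of methods): a low-rank decomposition is alternately fitted to the current matrix and used to refill the missing entries.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic rank-2 data matrix with roughly 30% of entries missing.
U, V = rng.normal(size=(30, 2)), rng.normal(size=(2, 20))
X = U @ V
mask = rng.random(X.shape) > 0.3          # True where an entry is observed

# Iterative hard-impute: alternate a rank-2 SVD fit with re-imposing the data.
filled = np.where(mask, X, 0.0)           # start by zero-filling missing entries
for _ in range(100):
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    low_rank = (u[:, :2] * s[:2]) @ vt[:2]    # best rank-2 approximation
    filled = np.where(mask, X, low_rank)      # keep observed entries fixed

print(np.abs(filled - X)[~mask].mean())   # mean error on the missing entries
```

When the underlying data really are low rank and enough entries are observed, the refilled entries converge toward the true values, which is the data-completion behavior the abstract describes.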

    Probabilistic modeling of tensorial data for enhancing spatial resolution in magnetic resonance imaging.

    Get PDF
Medical imaging uses the principles of Magnetic Resonance Imaging (MRI) to non-invasively measure the properties of this motion. When applied to the human brain, it provides unique information about tissue connectivity, making MRI one of the key technologies in a large-scale, ongoing scientific effort to map the human connectome. Consequently, building mathematical models that infer biologically meaningful parameters from such data is a timely and important research topic. MRI and diffusion MRI (dMRI) have been used in applications spanning signal processing, computer vision, and neuroscience. Although current clinical protocols allow fast acquisitions over a varying number of slices in several planes, in many cases the spatial resolution is not high enough for clinical diagnosis. The main problem arises from hardware limitations of the acquisition scanners. MRI and dMRI therefore face a difficult trade-off between good spatial resolution and signal-to-noise ratio (SNR), which leads to data acquisitions with low spatial resolution. This becomes a serious problem for clinical analysis for two main reasons. First, low spatial resolution in visual data reduces the quality of important medical procedures such as disease diagnosis, segmentation (of tissue, nerves, and bone), construction of anatomical atlases, detailed fiber reconstruction (tractography), and brain conductivity modeling. Second, obtaining high-resolution images requires long acquisition times, yet current clinical protocols do not allow prolonged exposure (MRI and dMRI) of human subjects.

    Efficient Design, Training, and Deployment of Artificial Neural Networks

    Get PDF
Over the last decade, artificial neural networks, especially deep neural networks, have emerged as the main modeling tool in Machine Learning, allowing us to tackle an increasing number of real-world problems in various fields, most notably computer vision, natural language processing, and biomedical and financial analysis. The success of deep neural networks can be attributed to many factors, namely the increasing amount of available data, the development of dedicated hardware, the advancement of optimization techniques, and especially the invention of novel neural network architectures. Nowadays, state-of-the-art neural networks that achieve the best performance in any field are usually formed by several layers comprising millions, or even billions, of parameters. Despite their spectacular performance, optimizing a single state-of-the-art neural network often requires a tremendous amount of computation, which can take several days on high-end hardware. More importantly, it took years of experimentation for the community to gradually discover effective neural network architectures, moving from AlexNet and VGGNet to ResNet and then DenseNet. In addition to the expensive and time-consuming experimentation process, deep neural networks, which require powerful processors to operate during the deployment phase, cannot easily be deployed to mobile or embedded devices. For these reasons, improving the design, training, and deployment of deep neural networks has become an important area of research in the Machine Learning field. This thesis makes several contributions in this research area, which can be grouped into two main categories. The first category consists of research works that focus on designing neural network architectures that are efficient not only in terms of accuracy but also in computational complexity.
In the first contribution under this category, computational efficiency is addressed at the filter level through a handcrafted design for convolutional neural networks (CNNs), which are the basis of most deep neural networks. More specifically, a multilinear convolution filter is proposed to replace the linear convolution filter, a fundamental element of a convolutional neural network. The new filter design not only better captures the multidimensional structures inherent in CNNs but also requires far fewer parameters to be estimated. While using efficient algebraic transforms and approximation techniques to tackle the design problem can significantly reduce the memory and computational footprint of neural network models, this approach requires a lot of trial and error. In addition, the simple neuron model used in most neural networks today, which only performs a linear transformation followed by a nonlinear activation, cannot effectively mimic the diverse activities of biological neurons. For this reason, the second and third contributions transition from a handcrafted, manual design approach to an algorithmic approach in which the type of transformation performed by each neuron, as well as the topology of the neural network, is optimized in a systematic and completely data-dependent manner. As a result, the algorithms proposed in the second and third contributions can design highly accurate and compact neural networks while requiring minimal human effort or intervention in the design process. Although significant progress has been made in reducing the runtime complexity of neural network models on embedded devices, most such models have been demonstrated on powerful embedded devices, which are costly in applications that require large-scale deployment, such as surveillance systems. In these scenarios, complete on-device processing solutions can be infeasible.
Instead, hybrid solutions, where some preprocessing steps are conducted on the client side while the heavy computation takes place on the server side, are more practical. The second category of contributions made in this thesis focuses on efficient learning methodologies for hybrid solutions that take into account both the signal acquisition and inference steps. More concretely, the first contribution under this category is the formulation of the Multilinear Compressive Learning framework, in which multidimensional signals are compressively acquired and inference is made from the compressed signals, bypassing the signal reconstruction step. In the second contribution, the relationships between the input signal resolution, the compression rate, and the learning performance of Multilinear Compressive Learning systems are systematically analyzed, leading to the discovery of a surrogate performance indicator that can be used to approximately rank the learning performance of different sensor configurations without conducting the entire optimization process. Nowadays, many communication protocols support adaptive data transmission to maximize data throughput and minimize energy consumption depending on the network's strength. The last contribution of this thesis extends the Multilinear Compressive Learning framework with an adaptive compression capability, which makes it possible to exploit the adaptive-rate transmission feature of existing communication protocols to maximize the informational-content throughput of the whole system. Finally, all methodological contributions of this thesis are accompanied by extensive empirical analyses demonstrating their performance and computational advantages over existing methods in different computer vision applications such as object recognition, face verification, human activity classification, and visual information retrieval.
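The compressive-acquisition step of such a framework can be sketched in a few lines (the signal size, compression dimensions, and random projection matrices below are illustrative assumptions, not the thesis's actual sensing design): separate small projection matrices are applied along each mode of a multidimensional signal, so the sensor outputs a compressed tensor rather than the full signal.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(32, 32, 3))              # hypothetical 32x32x3 input signal
P1 = rng.normal(size=(8, 32)) / np.sqrt(32)   # compress mode 1: 32 -> 8
P2 = rng.normal(size=(8, 32)) / np.sqrt(32)   # compress mode 2: 32 -> 8

# Separable (multilinear) acquisition: mode-wise projections of X.
Z = np.einsum('ia,jb,abc->ijc', P1, P2, X)
print(Z.shape)  # (8, 8, 3): a 16x reduction from 3072 to 192 measurements
```

Inference is then performed directly on `Z`, bypassing reconstruction of `X`; the small per-mode matrices are also what make an adaptive compression rate practical, since each can be resized independently.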

    Machine Learning Methods with Noisy, Incomplete or Small Datasets

    Get PDF
In many machine learning applications, available datasets are sometimes incomplete, noisy, or affected by artifacts. In supervised scenarios, the label information may be of low quality, which includes unbalanced training sets, noisy labels, and other problems. Moreover, in practice it is very common that the available data samples are not enough to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, to contribute to the dissemination of new ideas for solving this challenging problem and to provide clear examples of application in real scenarios.

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
LIPIcs, Volume 274, ESA 2023, Complete Volume

    A Tensor Decomposition Based Multiway Structured Sparse SAR Imaging Algorithm with Kronecker Constraint

    No full text
This paper investigates a structured sparse SAR imaging algorithm for the point scattering model based on tensor decomposition. Several SAR imaging schemes have been developed to improve imaging quality. In a typical SAR target scenario, the scatterer distribution usually exhibits structured sparsity, and without thoroughly considering this feature the existing schemes still have certain drawbacks. The classic matching pursuit algorithms can obtain clearer imaging results, but at the cost of extreme complexity and huge computational resource consumption. This paper therefore puts forward a tensor-based SAR imaging algorithm based on multiway structured sparsity, which makes full use of the above geometrical feature of the scatterer distribution. The spotlight SAR observation signal is formulated as a Tucker model with a Kronecker constraint, and a sparse reconstruction algorithm is then introduced that exploits the structured sparsity of the scene. The proposed tensor-based SAR imaging model is able to take advantage of the Kronecker information in each mode, which ensures robust signal reconstruction. Both the algorithm complexity analysis and numerical simulations show that the proposed method requires less computation than existing sparsity-driven SAR imaging algorithms. Imaging results on practical measured data also indicate that, under multiway structured sparsity, the proposed algorithm is superior to the reference methods even in severely noisy environments.
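The computational advantage of a Kronecker-constrained observation model can be sketched as follows (the sensing matrices `A` and `B`, their sizes, and the toy scene are illustrative assumptions, not the paper's actual SAR geometry): when the measurement operator factors as a Kronecker product across the two modes, the large matrix-vector product reduces to two small per-mode multiplications.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 16, 16          # measurements per mode (illustrative)
p, q = 32, 32          # scene size per mode
A = rng.normal(size=(m, p)) / np.sqrt(p)   # hypothetical mode-1 sensing matrix
B = rng.normal(size=(n, q)) / np.sqrt(q)   # hypothetical mode-2 sensing matrix

S = np.zeros((p, q))
S[10:13, 20] = 1.0     # structured-sparse scene: scatterers along one column

# Kronecker identity (row-major vec): kron(A, B) @ vec(S) == vec(A @ S @ B.T),
# so the pq x pq-scale product never has to be formed explicitly.
y_full = np.kron(A, B) @ S.ravel()
y_fast = (A @ S @ B.T).ravel()
print(np.allclose(y_full, y_fast))  # True
```

Sparse-reconstruction iterations built on `y_fast`-style mode-wise products are what give such tensor-based algorithms their lower computational cost relative to methods that work with the full Kronecker matrix.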

    SIS 2017. Statistics and Data Science: new challenges, new generations

    Get PDF
The 2017 SIS Conference aims to highlight the crucial role of Statistics in Data Science. In this new domain of 'meaning' extracted from data, the ever-increasing amount of data produced and made available in databases has brought new challenges. These involve different fields: statistics, machine learning, information and computer science, optimization, and pattern recognition. Together, these fields make a considerable contribution to the analysis of 'Big Data', open data, and relational and complex data, both structured and unstructured. The aim is to collect contributions from the different domains of Statistics on high-dimensional data quality validation, sampling and extraction, dimensionality reduction, pattern selection, data modelling, hypothesis testing, and confirming conclusions drawn from the data.