65 research outputs found

    Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting

    Get PDF
    We introduce tensor displays: a family of compressive light field displays comprising all architectures employing a stack of time-multiplexed, light-attenuating layers illuminated by uniform or directional backlighting (i.e., any low-resolution light field emitter). We show that the light field emitted by an N-layer, M-frame tensor display can be represented by an Nth-order, rank-M tensor. Using this representation we introduce a unified optimization framework, based on nonnegative tensor factorization (NTF), encompassing all tensor display architectures. This framework is the first to allow joint multilayer, multiframe light field decompositions, significantly reducing artifacts observed with prior multilayer-only and multiframe-only decompositions; it is also the first optimization method for designs combining multiple layers with directional backlighting. We verify the benefits and limitations of tensor displays by constructing a prototype using modified LCD panels and a custom integral imaging backlight. Our efficient, GPU-based NTF implementation enables interactive applications. Through simulations and experiments we show that tensor displays reveal practical architectures with greater depths of field, wider fields of view, and thinner form factors, compared to prior automultiscopic displays.

    Funding: United States. Defense Advanced Research Projects Agency (DARPA SCENICC program); National Science Foundation (U.S.) (NSF Grant IIS-1116452); United States. Defense Advanced Research Projects Agency (DARPA MOSAIC program); United States. Defense Advanced Research Projects Agency (DARPA Young Faculty Award); Alfred P. Sloan Foundation (Fellowship)
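
    For intuition, the image formation model implied by the abstract can be written compactly. The following is a sketch in standard light field notation; the ray parameterization (x, v), layer offsets d_n, and transmittance symbols f_n^(m) are notational assumptions for illustration, not necessarily the paper's exact symbols:

        % A ray (x, v) intersects attenuating layer n at position x + d_n v.
        % With N layers and M time-multiplexed frames shown faster than the
        % eye can resolve, the perceived light field is the temporal average
        % of the per-frame products of layer transmittances:
        \[
          \tilde{L}(x, v) = \frac{1}{M} \sum_{m=1}^{M} \prod_{n=1}^{N}
            f_n^{(m)}\!\bigl(x + d_n v\bigr),
          \qquad 0 \le f_n^{(m)} \le 1 .
        \]
        % NTF then fits the nonnegative layer patterns to a target light
        % field L as a constrained least-squares problem:
        \[
          \min_{\{f_n^{(m)}\}} \bigl\| L - \tilde{L} \bigr\|_2^2
          \quad \text{s.t.} \quad 0 \le f_n^{(m)} \le 1 .
        \]

    The M-frame average of N-fold products is a sum of M rank-one, Nth-order terms, which is the sense in which the emitted light field is an Nth-order, rank-M tensor in the layer patterns.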

    Single channel overlapped-speech detection and separation of spontaneous conversations

    Get PDF
    PhD Thesis. In this thesis, spontaneous conversation containing both speech mixture and speech dialogue is considered. The speech mixture refers to speakers speaking simultaneously (i.e., overlapped speech). The speech dialogue refers to periods in which only one speaker is actively speaking while the other is silent. The input conversation is first processed by overlapped-speech detection, which segregates it into two output signals, in dialogue and mixture formats. The dialogue is processed by speaker diarization, whose outputs are the individual speech of each speaker. The mixture is processed by speech separation, whose outputs are independent separated speech signals, one per speaker. When the separation input contains only the mixture, a blind speech separation approach is used; when the separation is assisted by the outputs of the speaker diarization, it is informed speech separation. The research presents a novel overlapped-speech detection algorithm and two novel speech separation algorithms. The proposed overlapped-speech detection algorithm estimates the switching instants of the input. An optimization loop is adapted to adopt the best encapsulated audio features and to avoid the worst; the optimization depends on principles of pattern recognition and k-means clustering. For 300 simulated conversations, the average False-Alarm Error is 1.9%, the average Missed-Speech Error is 0.4%, and the average Overlap-Speaker Error is 1%. These errors approximately equal those reported for the best recent, reliable speaker diarization corpora. The proposed blind speech separation algorithm consists of four sequential techniques: filter-bank analysis, Non-negative Matrix Factorization (NMF), speaker clustering, and filter-bank synthesis. Instead of the usually required speaker segmentation, an effective standard framing is contributed. The average objective test scores (SAR, SDR, and SIR) over 51 simulated conversations are 5.06 dB, 4.87 dB, and 12.47 dB, respectively. For the proposed informed speech separation algorithm, the outputs of the speaker diarization form a generated database. The database assists the speech separation by creating virtual targeted speech and mixtures. The contributed virtual signals are trained to facilitate the separation by homogenising them with the NMF-matrix elements of the real mixture, and a contributed masking step optimizes the resulting speech. The average SAR, SDR, and SIR over 341 simulated conversations are 9.55 dB, 1.12 dB, and 2.97 dB, respectively. According to the objective tests, the two speech separation algorithms are in the mid-range of well-known NMF-based audio and speech separation methods.
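
    As a rough illustration of the blind pipeline's middle stages (not the thesis's exact algorithm: an STFT stands in for its filter-bank analysis/synthesis, and k-means on the NMF basis spectra stands in for its speaker clustering; the component count and frame size are assumptions), a minimal magnitude-spectrogram NMF separation might look like:

        # Minimal sketch of NMF-based two-speaker separation on a magnitude
        # spectrogram. Illustrative stand-in for the thesis pipeline only.
        import numpy as np
        from scipy.signal import stft, istft
        from sklearn.decomposition import NMF
        from sklearn.cluster import KMeans

        def separate_two_speakers(mixture, fs, n_components=20, nperseg=512):
            # Analysis stage (here: STFT in place of a filter bank).
            _, _, Z = stft(mixture, fs=fs, nperseg=nperseg)
            V = np.abs(Z)                    # nonnegative magnitude spectrogram

            # Factorize V ~= W @ H: W holds spectral bases, H their activations.
            nmf = NMF(n_components=n_components, init="random",
                      random_state=0, max_iter=500)
            W = nmf.fit_transform(V)         # (freq, components)
            H = nmf.components_              # (components, frames)

            # Cluster normalized basis spectra into two speaker groups.
            Wn = W / (W.sum(axis=0, keepdims=True) + 1e-12)
            labels = KMeans(n_clusters=2, n_init=10,
                            random_state=0).fit_predict(Wn.T)

            sources = []
            for k in (0, 1):
                Vk = W[:, labels == k] @ H[labels == k, :]
                mask = Vk / (W @ H + 1e-12)  # soft (Wiener-like) mask
                _, xk = istft(mask * Z, fs=fs, nperseg=nperseg)  # synthesis
                sources.append(xk)
            return sources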

    Learning latent variable models : efficient algorithms and applications

    Get PDF
    Learning latent variable models is a fundamental machine learning problem, and the models belonging to this class - which include topic models, hidden Markov models, mixture models and many others - have a variety of real-world applications, like text mining, clustering and time series analysis. For many practitioners, the decades-old Expectation Maximization method (EM) is still the tool of choice, despite its known proneness to local minima and long running times. To overcome these issues, algorithms based on the spectral method of moments have recently been proposed. These techniques recover the parameters of a latent variable model by solving - typically via tensor decomposition - a system of non-linear equations relating the low-order moments of the observable data to the parameters of the model to be learned. Moment-based algorithms are in general faster than EM, as they require a single pass over the data, and have provable guarantees of learning accuracy in polynomial time. Nevertheless, methods of moments have room for improvement: their ability to deal with real-world data is often limited by a lack of robustness to input perturbations. Also, almost no theory studies their behavior when some of the model assumptions are violated by the input data. Extending the theory of methods of moments to learn latent variable models and providing meaningful applications to real-world contexts is the focus of this thesis. Assuming data to be generated by a certain latent variable model, the standard approach of methods of moments consists of two steps: first, finding the equations that relate the moments of the observable data to the model parameters and, second, solving these equations to retrieve estimators of the parameters of the model. In Part I of this thesis we will focus on both steps, providing and analyzing novel and improved model-specific moment estimators and techniques to solve the moment equations. In both cases we will introduce theoretical results, providing guarantees on the behavior of the proposed methods, and we will perform experimental comparisons with existing algorithms. In Part II, we will analyze the behavior of methods of moments when the data violates some of the model assumptions made by the user. First, we will observe that in this context most of the theoretical infrastructure underlying methods of moments is no longer valid, and consequently we will develop a theoretical foundation for methods of moments in the misspecified setting, deriving efficient methods guaranteed to provide meaningful results even when some of the model assumptions are violated. Throughout the thesis, we will apply the developed theoretical results to challenging real-world applications, focusing on two main domains: topic modeling and healthcare analytics. We will extend the existing theory of methods of moments to learn models that are traditionally used for topic modeling – like the single-topic model and Latent Dirichlet Allocation – providing improved learning techniques and comparing them with existing methods, which they outperform in terms of speed and learning accuracy.
    Furthermore, we will propose applications of latent variable models to the analysis of electronic healthcare records, which, similarly to text mining, are very likely to become massive datasets; we will propose a method to discover recurrent phenotypes in populations of patients and to cluster them into groups with similar clinical profiles - a task where the efficiency properties of methods of moments will constitute a competitive advantage over traditional approaches.
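
    To make the two-step recipe concrete, the moment equations take the following textbook form for a simple K-component mixture (the generic formulation from the spectral methods-of-moments literature, shown for illustration; the thesis's model-specific estimators may differ). With mixing weights w_k and component means \mu_k:

        \[
          M_1 = \sum_{k=1}^{K} w_k \,\mu_k, \qquad
          M_2 = \sum_{k=1}^{K} w_k \,\mu_k \otimes \mu_k, \qquad
          M_3 = \sum_{k=1}^{K} w_k \,\mu_k \otimes \mu_k \otimes \mu_k .
        \]

    Estimating M_2 and M_3 in a single pass over the data and then decomposing the (suitably whitened) third-order tensor recovers the pairs (w_k, \mu_k), which is the sense in which the equations are solved "typically via tensor decomposition".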

    Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

    Full text link
    Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design computer agents with intelligent capabilities such as understanding, reasoning, and learning through integrating multiple communicative modalities, including linguistic, acoustic, visual, tactile, and physiological messages. With the recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in application domains such as healthcare and robotics, multimodal machine learning has brought unique computational and theoretical challenges to the machine learning community, given the heterogeneity of data sources and the interconnections often found between modalities. However, the breadth of progress in multimodal research has made it difficult to identify the common themes and open questions in the field. By synthesizing a broad range of application domains and theoretical frameworks from both historical and recent perspectives, this paper is designed to provide an overview of the computational and theoretical foundations of multimodal machine learning. We start by defining two key principles of modality heterogeneity and interconnections that have driven subsequent innovations, and propose a taxonomy of six core technical challenges - representation, alignment, reasoning, generation, transference, and quantification - covering historical and recent trends. Recent technical achievements are presented through the lens of this taxonomy, allowing researchers to understand the similarities and differences across new approaches. We end by motivating several open problems for future research as identified by our taxonomy.

    Learning, Inference, and Unmixing of Weak, Structured Signals in Noise

    Full text link
    In this thesis, we study two methods that can be used to learn, infer, and unmix weak, structured signals in noise: the Dynamic Mode Decomposition algorithm and the sparse Principal Component Analysis problem. Both problems take as input samples of a multivariate signal that is corrupted by noise, and produce a set of structured signals. We present performance guarantees for each algorithm and validate our findings with numerical simulations. First, we study the Dynamic Mode Decomposition (DMD) algorithm. We demonstrate that DMD can be used to solve the source separation problem. That is, we apply DMD to a data matrix whose rows are linearly independent, additive mixtures of latent time series. We show that when the latent time series are uncorrelated at a lag of one time-step, then the recovered dynamic modes will approximate the columns of the mixing matrix. That is, DMD unmixes linearly mixed sources that have a particular correlation structure. We next broaden our analysis beyond the noise-free, fully observed data setting. We study the DMD algorithm with a truncated-SVD denoising step, and present recovery guarantees for both the noisy-data and missing-data settings. We also present some preliminary characterizations of DMD performed directly on noisy data. We end with some complementary perspectives on DMD, including an optimization-based formulation. Second, we study the sparse Principal Component Analysis (PCA) problem. We demonstrate that the sparse inference problem can be viewed in a variable selection framework and analyze the performance of various decision statistics. A major contribution of this work is the introduction of False Discovery Rate (FDR) control for the principal component estimation problem, made possible by the sparse structure. We derive lower bounds on the size of detectable coordinates of the principal component vectors, and utilize these lower bounds to derive lower bounds on the worst-case risk.

    PhD, Electrical Engineering: Systems. University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155061/1/prasadan_1.pd
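
    A compact sketch of the exact-DMD computation with truncated-SVD denoising, applied as an unmixing step (the standard exact-DMD formulation; the rank r, the AR(1) latent sources, and the mixing matrix in the demo are illustrative assumptions, not the thesis's code):

        # Exact DMD with truncated-SVD denoising, used to unmix latent
        # time series from linearly mixed observations.
        import numpy as np

        def dmd(X, r):
            """X: (n, T) snapshot matrix. Returns eigenvalues and DMD modes."""
            X1, X2 = X[:, :-1], X[:, 1:]            # time-shifted snapshot pairs
            U, s, Vh = np.linalg.svd(X1, full_matrices=False)
            U, s, Vh = U[:, :r], s[:r], Vh[:r, :]   # truncated SVD (denoising)
            Atilde = U.conj().T @ X2 @ Vh.conj().T / s   # r x r reduced operator
            eigvals, W = np.linalg.eig(Atilde)
            Phi = X2 @ Vh.conj().T @ np.diag(1.0 / s) @ W  # exact DMD modes
            return eigvals, Phi

        # Toy unmixing demo: rows of the data are mixtures of two latent
        # AR(1) series, which are (in expectation) uncorrelated with each
        # other at lag one -- the correlation structure described above.
        rng = np.random.default_rng(0)
        T = 2000
        S = np.zeros((2, T))
        for k, rho in enumerate((0.9, -0.5)):       # distinct AR(1) poles
            for t in range(1, T):
                S[k, t] = rho * S[k, t - 1] + rng.standard_normal()
        A = np.array([[1.0, 0.4], [0.3, 1.0], [0.8, 0.2]])  # mixing matrix
        eigvals, Phi = dmd(A @ S, r=2)
        # eigvals approximate the AR(1) poles, and each column of Phi
        # aligns, up to scale, with a column of the mixing matrix A.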

    Finding human genetic variation in whole genome expression data with applications for “missing” heritability: The GWCoGAPS algorithm, the PatternMarkers statistic, and the ProjectoR package

    Get PDF
    Starting from a single fertilized egg, the compendium of human cells is generated via stochastic perturbations of earlier generations. Concurrently, canalization of developmental pathways limits the type and degree of variation to ensure viability; thus, it is unsurprising that deviations early in life have been linked to late-manifesting diseases. Human pluripotent stem cells (hPSCs) are a highly robust and uniquely human experimental system in which to model the sources and consequences of this variability. Further, variation in hPSCs’ transcriptomes has been directly linked to both genomic background and biases in differentiation efficiency. Taking advantage of this link between genomic background and developmental phenotypes, we developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole-genome Bayesian non-negative matrix factorization (NMF), to find conserved transcriptional signatures representative of the functional effect of human genetic variation. Time course RNA-seq data obtained from three human embryonic stem cell (ESC) and three human induced pluripotent stem cell (iPSC) lines in three different experimental conditions were analyzed. GWCoGAPS distinguished shared developmental trajectories from unique transcriptional signatures of each of the cell lines. Further analysis of these “identity” signatures found they were predictive of lineage biases during neuronal differentiation. Additionally, lineage biases were consistent with early differences in morphogenetic phenotypes within monolayer culture, thus linking transcriptional genomic signatures to stable, quantifiable cellular features. To test whether the cell line signatures were genome specific, we next developed the projectoR algorithm to assess a given signature’s robustness in independent data sets. By using the identity signatures as inputs to projectoR, we were able to identify samples from the same donor genome in datasets from multiple tissues and across technical platforms, including RNA-seq results from post-mortem brain, microarrayed embryoid bodies, and publicly available datasets. The identification of signatures that define the functional rather than physical background of an individual’s genome has the potential to profoundly influence our view of human variation and disease.
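
    The projection idea can be sketched as follows (an assumed regression-based projection of new samples onto fixed NMF patterns, shown for illustration; the projectoR package's exact procedure may differ in detail):

        # Sketch: project new expression data onto previously learned NMF
        # patterns (gene weights), one nonnegative regression per sample.
        # Assumption: projection = NNLS onto the fixed gene-by-pattern
        # amplitude matrix; projectoR's actual method may differ.
        import numpy as np
        from scipy.optimize import nnls

        def project_onto_patterns(A, Y):
            """A: (genes, patterns) learned amplitudes; Y: (genes, samples)
            new data. Returns (patterns, samples) pattern weights."""
            P = np.zeros((A.shape[1], Y.shape[1]))
            for j in range(Y.shape[1]):
                P[:, j], _ = nnls(A, Y[:, j])  # nonnegative least squares
            return P

        # Samples from the same donor genome should then show similar
        # columns of P for the "identity" signatures, across tissues
        # and technical platforms.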

    Mathematics and Digital Signal Processing

    Get PDF
    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar, and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue aims at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used today in robotic systems and industrial process control. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The stated results are of interest to researchers in the field of applied mathematics and to developers of modern digital signal processing systems.

    Artificial Intelligence for Multimedia Signal Processing

    Get PDF
    Artificial intelligence technologies are being actively applied to broadcasting and multimedia processing. A great deal of research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and over the past two to three years these efforts have aimed at improving image, video, speech, and other data compression efficiency in areas related to MPEG media processing technology. Additionally, technologies for media creation, processing, editing, and scenario creation are very important areas of research in multimedia processing and engineering. This book collects topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing, including computer vision, speech/sound/text processing, and content analysis/information mining.

    Remote Sensing Data Compression

    Get PDF
    A huge amount of data is acquired nowadays by different remote sensing systems installed on satellites, aircraft, and UAVs. The acquired data then have to be transferred to image processing centres, stored, and/or delivered to customers. In restricted scenarios, data compression is strongly desired or necessary. A wide diversity of coding methods can be used, depending on the requirements and their priority. In addition, the types and properties of images differ a lot; thus, practical implementation aspects have to be taken into account. The Special Issue paper collection taken as the basis of this book touches on all of the aforementioned items to some degree, giving the reader an opportunity to learn about recent developments and research directions in the field of image compression. In particular, lossless and near-lossless compression of multi- and hyperspectral images remains a current topic, since such images constitute extremely large data arrays with rich information that can be retrieved from them for various applications. Another important aspect is the impact of lossless compression on image classification and segmentation, where a reasonable compromise between the characteristics of compression and the final tasks of data processing has to be achieved. The problems of data transmission from UAV-based acquisition platforms, as well as the use of FPGAs and neural networks, have become very important. Finally, attempts to apply compressive sensing approaches in remote sensing image processing with positive outcomes are observed. We hope that readers will find our book useful and interesting.