Binary Independent Component Analysis with OR Mixtures

Authors: Nguyen Huy, Zheng Rong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/12/2010

Independent component analysis (ICA) is a computational method for separating a multivariate signal into subcomponents, assuming the mutual statistical independence of the non-Gaussian source signals. The classical ICA framework usually assumes linear combinations of independent sources over the field of real numbers R. In this paper, we investigate binary ICA for OR mixtures (bICA), which can find applications in many domains including medical diagnosis, multi-cluster assignment, Internet tomography and network resource management. We prove that bICA is uniquely identifiable under the disjunctive generation model, and propose a deterministic iterative algorithm to determine the distribution of the latent random variables and the mixing matrix. The inverse problem of inferring the values of the latent variables is also considered, along with noisy measurements. We conduct an extensive simulation study to verify the effectiveness of the proposed algorithm and present examples of real-world applications where bICA can be applied. Comment: Manuscript submitted to IEEE Transactions on Signal Processing
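
The disjunctive (OR) generation model named in the abstract can be illustrated with a minimal sketch. This is not the paper's inference algorithm; it only simulates the forward model. The names `or_mix` and `prob_zero`, the mixing matrix `G`, and the activation probabilities `p` are hypothetical values chosen for illustration.

```python
import numpy as np

def or_mix(sources: np.ndarray, mixing: np.ndarray) -> np.ndarray:
    """Boolean OR mixture: x[t, i] = OR over j of (mixing[i, j] AND sources[t, j])."""
    # The integer product counts the active sources connected to each observation;
    # any positive count means the OR evaluates to 1.
    counts = sources.astype(int) @ mixing.T.astype(int)
    return (counts > 0).astype(int)

def prob_zero(mixing: np.ndarray, p: np.ndarray) -> np.ndarray:
    """P(x_i = 0) for independent Bernoulli(p_j) sources under the OR model."""
    # An observation is 0 only if every source connected to it is 0.
    return np.prod(np.where(mixing == 1, 1.0 - p, 1.0), axis=1)

# Hypothetical example: 3 latent sources, 2 observed variables (assumed values).
rng = np.random.default_rng(0)
p = np.array([0.2, 0.5, 0.1])
G = np.array([[1, 0, 1],
              [0, 1, 1]])

Y = (rng.random((20000, 3)) < p).astype(int)  # latent binary sources
X = or_mix(Y, G)                              # observed OR mixtures

empirical_zero = (X == 0).mean(axis=0)
theoretical_zero = prob_zero(G, p)
```

Under independence, the empirical frequency of zeros in each observed column should be close to the product of `1 - p[j]` over the sources connected to that observation, which is the kind of relation an identifiability argument can exploit.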

arXiv.org e-Print Archive

Crossref

Independent Component Analysis for Binary Data

Author: Barin Pacela Vitória
Publication venue: Helsingfors universitet
Publication date: 01/01/2021

Independent Component Analysis (ICA) aims to separate the observed signals into their underlying independent components responsible for generating the observations. Most research in ICA has focused on continuous signals, while the methodology for binary and discrete signals is less developed. Yet, binary observations are equally present in various fields and applications, such as causal discovery, signal processing, and bioinformatics. In the last decade, Boolean OR and XOR mixtures have been shown to be identifiable by ICA, but such models suffer from limited expressivity, calling for new methods to solve the problem. In this thesis, "Independent Component Analysis for Binary Data", we estimate the mixing matrix of ICA from binary observations and an additionally observed auxiliary variable by employing a linear model inspired by the Identifiable Variational Autoencoder (iVAE), which exploits the non-stationarity of the data. The model is optimized with a gradient-based algorithm that uses second-order optimization with limited memory, resulting in a training time in the order of seconds for the particular study cases. We investigate which conditions can lead to the reconstruction of the mixing matrix, concluding that the method is able to identify the mixing matrix when the number of observed variables is greater than the number of sources. In such cases, the linear binary iVAE can reconstruct the mixing matrix up to order and scale indeterminacies, which are considered in the evaluation with the Mean Cosine Similarity Score. Furthermore, the model can reconstruct the mixing matrix even under a limited sample size. Therefore, this work demonstrates the potential for applications in real-world data and also offers a possibility to study and formalize identifiability in future work. 
In summary, the most important contributions of this thesis are the empirical study of the conditions that enable the mixing matrix reconstruction using the binary iVAE, and the empirical results on the performance and efficiency of the model. The latter was achieved through a new combination of existing methods, including modifications and simplifications of a linear binary iVAE model and the optimization of such a model under limited computational resources.

Helsingin yliopiston digitaalinen arkisto

Infinite Divisibility of Information

Author: Li Cheuk Ting
Publication date: 13/08/2020

We study an information analogue of infinitely divisible probability distributions, where the i.i.d. sum is replaced by the joint distribution of an i.i.d. sequence. A random variable $X$ is called informationally infinitely divisible if, for any $n\ge1$, there exists an i.i.d. sequence of random variables $Z_{1},\ldots,Z_{n}$ that contains the same information as $X$, i.e., there exists an injective function $f$ such that $X=f(Z_{1},\ldots,Z_{n})$. While no discrete random variable is informationally infinitely divisible, we show that any discrete random variable $X$ has a bounded multiplicative gap to infinite divisibility; that is, if we remove the injectivity requirement on $f$, then there exist i.i.d. $Z_{1},\ldots,Z_{n}$ and $f$ satisfying $X=f(Z_{1},\ldots,Z_{n})$, and the entropy satisfies $H(X)/n\le H(Z_{1})\le1.59H(X)/n+2.43$. We also study a new class of discrete probability distributions, called spectral infinitely divisible distributions, where we can remove the multiplicative gap $1.59$. Furthermore, we study the case where $X=(Y_{1},\ldots,Y_{m})$ is itself an i.i.d. sequence, $m\ge2$, for which the multiplicative gap $1.59$ can be replaced by $1+5\sqrt{(\log m)/m}$. This means that as $m$ increases, $(Y_{1},\ldots,Y_{m})$ becomes closer to being spectral infinitely divisible in a uniform manner. This can be regarded as an information analogue of Kolmogorov's uniform theorem. Applications of our result include independent component analysis, distributed storage with a secrecy constraint, and distributed random number generation. Comment: 22 pages

arXiv.org e-Print Archive

Codage réseau pour des applications multimédias avancées (Network coding for advanced multimedia applications)

Author: Nemoianu Irina-Delia
Publication venue: HAL CCSD
Publication date: 20/06/2013

Network coding is a paradigm that allows an efficient use of the capacity of communication networks. It maximizes the throughput in a multi-hop multicast communication and reduces the delay. In this thesis, we focus on the integration of the network coding framework into multimedia applications, and in particular into advanced systems that provide enhanced video services to users. Our contributions concern several instances of advanced multimedia communications: an efficient framework for transmission of a live stream making joint use of network coding and multiple description coding; a novel transmission strategy for lossy wireless networks that guarantees a trade-off between loss resilience and short delay, based on a rate-distortion optimized scheduling of the video frames, which we also extended to the case of interactive multi-view streaming; a distributed social caching system that, using network coding in conjunction with knowledge of the users' preferences in terms of views, is able to select a replication scheme that provides high video quality by accessing only other members of the social group, without incurring the access cost associated with a connection to a central server and without exchanging large tables of metadata to keep track of the replicated parts; and, finally, a study on using blind source separation techniques to reduce the overhead incurred by network coding schemes based on error-detecting techniques such as parity coding and message digest generation. All our contributions are aimed at using network coding to enhance the quality of video transmission in terms of perceived distortion and delay.

Thèses en Ligne

thèses en ligne de ParisTech

On streaming approximation algorithms for constraint satisfaction problems

Author: Singer Noah G.
Publication date: 13/04/2023

In this thesis, we explore streaming algorithms for approximating constraint satisfaction problems (CSPs). The setup is roughly the following: A computer has limited memory space, sees a long "stream" of local constraints on a set of variables, and tries to estimate how many of the constraints may be simultaneously satisfied. The past ten years have seen a number of works in this area, and this thesis includes both expository material and novel contributions. Throughout, we emphasize connections to the broader theories of CSPs, approximability, and streaming models, and highlight interesting open problems. The first part of our thesis is expository: We present aspects of previous works that completely characterize the approximability of specific CSPs like Max-Cut and Max-Dicut with $\sqrt{n}$-space streaming algorithms (on $n$-variable instances), while characterizing the approximability of all CSPs in $\sqrt{n}$ space in the special case of "composable" (i.e., sketching) algorithms, and of a particular subclass of CSPs with linear-space streaming algorithms. In the second part of the thesis, we present two of our own joint works. We begin with a work with Madhu Sudan and Santhoshini Velusamy in which we prove linear-space streaming approximation-resistance for all ordering CSPs (OCSPs), which are "CSP-like" problems maximizing over sets of permutations. Next, we present joint work with Joanna Boyland, Michael Hwang, Tarun Prasad, and Santhoshini Velusamy in which we investigate the $\sqrt{n}$-space streaming approximability of symmetric Boolean CSPs with negations. We give explicit $\sqrt{n}$-space sketching approximability ratios for several families of CSPs, including Max-$k$AND; develop simpler optimal sketching approximation algorithms for threshold predicates; and show that previous lower bounds fail to characterize the $\sqrt{n}$-space streaming approximability of Max-$3$AND. Comment: Harvard College senior thesis; 119 pages plus references; abstract shortened for arXiv; formatted with Dissertate template (feel free to copy!); exposits papers arXiv:2105.01782 (APPROX 2021) and arXiv:2112.06319 (APPROX 2022)
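
The streaming setup described in the abstract can be illustrated on Max-Cut with a minimal sketch. It is not one of the thesis's algorithms: it shows only the well-known trivial baseline, in which a one-pass counter of the number of edges $m$ brackets the Max-Cut value between $m/2$ and $m$, alongside a brute-force reference for tiny instances. The function names and the example graph are hypothetical.

```python
from itertools import product

def max_cut_value(n, edges):
    """Exact Max-Cut value by brute force over all 2^n assignments (tiny n only)."""
    best = 0
    for assignment in product((0, 1), repeat=n):
        # An edge constraint is satisfied when its endpoints land on opposite sides.
        cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
        best = max(best, cut)
    return best

def trivial_stream_bounds(edge_stream):
    """One pass, one counter: returns (m / 2, m), which brackets the Max-Cut value."""
    m = 0
    for _ in edge_stream:
        m += 1
    # A uniformly random assignment cuts each edge with probability 1/2,
    # so some assignment cuts at least m / 2 edges; no assignment cuts more than m.
    return m / 2, m

# Hypothetical example: a 5-cycle, streamed edge by edge.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
low, high = trivial_stream_bounds(iter(cycle_edges))
exact = max_cut_value(5, cycle_edges)
```

The counter stores a single integer, far less than the $\sqrt{n}$ or linear space budgets discussed above; the questions surveyed in the thesis concern how much better than this factor-of-two guarantee a streaming algorithm can do within a given space budget.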

arXiv.org e-Print Archive