4,376 research outputs found

    Capacity of Continuous Channels with Memory via Directed Information Neural Estimator

    Full text link
    Calculating the capacity (with or without feedback) of channels with memory and continuous alphabets is a challenging task. It requires optimizing the directed information (DI) rate over all channel input distributions. The objective is a multi-letter expression whose analytic solution is known only for a few specific cases. When no analytic solution is available or the channel model is unknown, there is no unified framework for calculating or even approximating capacity. This work proposes a novel capacity estimation algorithm that treats the channel as a "black-box", both with and without feedback. The algorithm has two main ingredients: (i) a neural distribution transformer (NDT) model that shapes a noise variable into the channel input distribution, from which we can sample, and (ii) the DI neural estimator (DINE) that estimates the communication rate of the current NDT model. These models are trained by an alternating maximization procedure that both estimates the channel capacity and obtains an NDT for the optimal input distribution. The method is demonstrated on the moving average additive Gaussian noise channel, where it is shown that both the capacity and feedback capacity are estimated without knowledge of the channel transition kernel. The proposed estimation framework opens the door to a myriad of capacity approximation results for continuous alphabet channels that were inaccessible until now.
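    The alternating-maximization loop is easy to picture in code. Below is a minimal PyTorch sketch of the idea for a memoryless AWGN channel, substituting a MINE-style Donsker-Varadhan mutual-information bound for the paper's directed-information estimator (which also handles memory and feedback); the module names, layer sizes, and power-constraint projection are our illustrative assumptions, not the authors' implementation.

        import math
        import torch
        import torch.nn as nn

        class NDT(nn.Module):
            # neural distribution transformer: shapes Gaussian noise into a channel input
            def __init__(self, dim=1):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, dim))
            def forward(self, z):
                return self.net(z)

        class Critic(nn.Module):
            # T(x, y) inside the Donsker-Varadhan lower bound on the rate
            def __init__(self, dim=1):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))
            def forward(self, x, y):
                return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

        def dv_bound(critic, x, y):
            # I(X;Y) >= E[T(x,y)] - log E[exp(T(x,y'))], with y' shuffled across the batch
            joint = critic(x, y).mean()
            shuffled = y[torch.randperm(y.shape[0])]
            marginal = torch.logsumexp(critic(x, shuffled), dim=0) - math.log(y.shape[0])
            return joint - marginal

        ndt, critic = NDT(), Critic()
        opt_ndt = torch.optim.Adam(ndt.parameters(), lr=1e-3)
        opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-3)
        power = 1.0  # average input-power constraint

        for step in range(5000):
            z = torch.randn(512, 1)
            x = ndt(z)
            x = x * torch.sqrt(power / x.pow(2).mean())  # project onto the power constraint
            y = x + torch.randn_like(x)                  # black-box channel: unit-variance AWGN
            # estimator step: tighten the bound for the current input distribution
            critic_loss = -dv_bound(critic, x.detach(), y.detach())
            opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()
            # NDT step: reshape the input distribution to raise the estimated rate
            rate = dv_bound(critic, x, x + torch.randn_like(x))
            opt_ndt.zero_grad(); (-rate).backward(); opt_ndt.step()

        # For this channel the estimate should approach 0.5 * log(1 + power) ≈ 0.347 nats.

    The same two-player structure carries over to the paper's setting, where the DV critic is replaced by the recurrent DINE estimator of the DI rate.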

    Minimum-Information LQG Control - Part I: Memoryless Controllers

    Full text link
    With the increased demand for power efficiency in feedback-control systems, communication is becoming a limiting factor, raising the need to trade off the external cost such systems incur against the capacity of the controller's communication channels. With a proper design of the channels, this translates into a sequential rate-distortion problem, where we minimize the rate of information required for the controller's operation under a constraint on its external cost. Memoryless controllers are of particular interest, both for the simplicity and frugality of their implementation and as a basis for studying more complex controllers. In this paper we present the optimality principle for memoryless linear controllers that utilize minimal information rates to achieve a guaranteed external-cost level. We also study the interesting and useful phenomenology of the optimal controller, such as the principled reduction of its order.
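    To fix ideas, the trade-off can be written schematically as a constrained information-minimization problem (our paraphrase of the setup; the paper's exact formulation, horizon, and constraints may differ). For a linear plant $x_{t+1} = A x_t + B u_t + w_t$ with Gaussian noise, a memoryless controller $p(u_t \mid x_t)$ solves

        \min_{p(u_t \mid x_t)} \; I(x_t;\, u_t)
        \qquad \text{subject to} \qquad
        \mathbb{E}\!\left[\, x_t^\top Q\, x_t + u_t^\top R\, u_t \,\right] \le b,

    i.e., it seeks the smallest information rate through the controller's channel that still guarantees the external-cost level $b$.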

    TREET: TRansfer Entropy Estimation via Transformer

    Full text link
    Transfer entropy (TE) is an information-theoretic measure that reveals the directional flow of information between processes, providing valuable insights for a wide range of real-world applications. This work proposes Transfer Entropy Estimation via Transformers (TREET), a novel transformer-based approach for estimating the TE of stationary processes. The proposed approach applies the Donsker-Varadhan (DV) representation to the TE and leverages the attention mechanism for the task of neural estimation. We present a detailed theoretical and empirical study of TREET, comparing it to existing methods. To increase its applicability, we design an estimated-TE optimization scheme motivated by the functional representation lemma. We then take advantage of the joint optimization scheme to optimize the capacity of communication channels with memory, a canonical optimization problem in information theory, and show the memory capabilities of our estimator. Finally, we apply TREET to real-world feature analysis. Our work, combined with state-of-the-art deep learning methods, opens a new door for communication problems that have yet to be solved. (Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
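    For reference, the two standard ingredients named above can be stated compactly (the memory-window notation $k$ is ours). The transfer entropy from $X$ to $Y$ with memory $k$, and the Donsker-Varadhan representation that the neural estimator maximizes over a parametric family of functions $T$, are

        T_{X \to Y} = I\big(X_{t-k}^{t-1};\, Y_t \,\big|\, Y_{t-k}^{t-1}\big),
        \qquad
        D_{\mathrm{KL}}(P \,\|\, Q) = \sup_{T}\; \mathbb{E}_P[T] - \log \mathbb{E}_Q\big[e^{T}\big].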

    Data-Driven Neural Polar Codes for Unknown Channels With and Without Memory

    Full text link
    In this work, a novel data-driven methodology for designing polar codes for channels with and without memory is proposed. The methodology is suitable for the case where the channel is given as a "black-box" and the designer has access to the channel for generating observations of its inputs and outputs, but does not have access to the explicit channel model. The proposed method leverages the structure of the successive cancellation (SC) decoder to devise a neural SC (NSC) decoder. The NSC decoder uses neural networks (NNs) to replace the core elements of the original SC decoder: the check-node, the bit-node and the soft decision. Along with the NSC, we devise an additional NN that embeds the channel outputs into the input space of the SC decoder. The proposed method is supported by theoretical guarantees that include the consistency of the NSC. Moreover, the computational complexity of the NSC does not grow with the channel memory size. This is its main advantage over the successive cancellation trellis (SCT) decoder for finite-state channels (FSCs), whose complexity is $O(|\mathcal{S}|^3 N \log N)$, where $|\mathcal{S}|$ denotes the number of channel states. We demonstrate the performance of the proposed algorithms on memoryless channels and on channels with memory. The empirical results are compared with the optimal polar decoder, given by the SC and SCT decoders. We further show that our algorithms are applicable even in cases where the SC and SCT decoders are not.
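    For context, the three core SC-decoder elements that the NSC replaces with neural networks have simple closed forms in the log-likelihood-ratio (LLR) domain. A minimal NumPy rendering of the standard textbook versions (not the paper's code) is:

        import numpy as np

        def check_node(l1, l2):
            # f(l1, l2): LLR of the first bit, computed before it is decided
            return 2.0 * np.arctanh(np.tanh(l1 / 2.0) * np.tanh(l2 / 2.0))

        def bit_node(l1, l2, u1):
            # g(l1, l2, u1): LLR of the second bit given the decision u1 in {0, 1}
            return l2 + (1.0 - 2.0 * u1) * l1

        def soft_decision(llr):
            # hard threshold on the LLR; frozen positions are forced to 0 elsewhere
            return (llr < 0).astype(np.int64)

    The NSC swaps each of these maps for a trained NN, plus the embedding NN on the channel outputs, so the same decoding recursion runs without an explicit channel model.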

    Universal Estimation of Directed Information

    Full text link
    Four estimators of the directed information rate between a pair of jointly stationary ergodic finite-alphabet processes are proposed, based on universal probability assignments. The first one is a Shannon-McMillan-Breiman type estimator, similar to those used by Verdú (2005) and Cai, Kulkarni, and Verdú (2006) for estimation of other information measures. We show the almost-sure and $L_1$ convergence properties of the estimator for any underlying universal probability assignment. The other three estimators map universal probability assignments to different functionals, each exhibiting relative merits such as smoothness, nonnegativity, and boundedness. We establish the consistency of these estimators in the almost-sure and $L_1$ senses, and derive near-optimal rates of convergence in the minimax sense under mild conditions. These estimators carry over directly to estimating other information measures of stationary ergodic finite-alphabet processes, such as entropy rate and mutual information rate, with near-optimal performance, and provide alternatives to classical approaches in the existing literature. Guided by these theoretical results, the proposed estimators are implemented using the context-tree weighting algorithm as the universal probability assignment. Experiments on synthetic and real data are presented, demonstrating the potential of the proposed schemes in practice and the utility of directed information estimation in detecting and measuring causal influence and delay. (Comment: 23 pages, 10 figures, to appear in IEEE Transactions on Information Theory.)
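    For reference, the quantity being estimated is Massey's directed information, stated here in its standard form; the first, Shannon-McMillan-Breiman type estimator plugs a universal probability assignment $Q$ into its causally conditioned form, roughly as follows (our shorthand):

        I(X^n \to Y^n) = \sum_{i=1}^{n} I\big(X^i;\, Y_i \,\big|\, Y^{i-1}\big),
        \qquad
        \hat{I}(x^n, y^n) = \frac{1}{n} \log \frac{Q(y^n \,\|\, x^n)}{Q(y^n)},
        \quad
        Q(y^n \,\|\, x^n) = \prod_{i=1}^{n} Q\big(y_i \mid y^{i-1}, x^i\big).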

    An information theoretic learning framework based on Rényi's α-entropy for brain effective connectivity estimation

    Get PDF
    The interactions among neural populations distributed across different brain regions are at the core of cognitive and perceptual processing. Therefore, the ability to study the flow of information within networks of connected neural assemblies is of fundamental importance for understanding such processes. In that regard, brain connectivity measures constitute a valuable tool in neuroscience. They allow assessing functional interactions among brain regions through directed or non-directed statistical dependencies estimated from neural time series. Transfer entropy (TE) is one such measure: an effective connectivity estimation approach based on information-theoretic concepts and statistical causality premises. It has gained increasing attention in the literature because it can capture purely nonlinear directed interactions and is model-free; that is, it does not require an initial hypothesis about the interactions present in the data. These properties make it an especially convenient tool for exploratory analyses. However, like any information-theoretic quantity, TE is defined in terms of probability distributions that in practice must be estimated from data, a challenging task whose outcome can significantly affect the resulting TE values. TE also lacks a standard spectral representation, so it cannot reveal the local frequency-band characteristics of the interactions it detects.
    (Thesis contents: 1. Preliminaries; 2. Kernel-based Rényi's transfer entropy; 3. Kernel-based Rényi's phase transfer entropy; 4. Kernel-based Rényi's phase transfer entropy for the estimation of directed phase-amplitude interactions; 5. Final remarks; appendices on kernel methods and Rényi entropy estimation, the surface Laplacian, permutation testing, kernel-based relevance analysis, Cao's criterion, and neural mass model equations.)
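    The kernel route around explicit density estimation can be sketched briefly. The following minimal NumPy example computes the matrix-based Rényi α-entropy from the eigenvalues of a normalized Gram matrix, which is, to our understanding, the kind of estimator the thesis builds its kernel TE measures on; the kernel choice, bandwidth, and variable names are illustrative assumptions.

        import numpy as np

        def gram_matrix(x, sigma=1.0):
            # Gaussian (RBF) kernel Gram matrix of the n x d sample array x
            sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
            return np.exp(-sq_dists / (2.0 * sigma ** 2))

        def matrix_renyi_entropy(K, alpha=2.0):
            # Normalize the Gram matrix so its eigenvalues sum to one, then
            # apply the Renyi functional to the spectrum (entropy in bits).
            A = K / np.trace(K)
            eigvals = np.linalg.eigvalsh(A)
            eigvals = eigvals[eigvals > 1e-12]  # discard numerical zeros
            return np.log2((eigvals ** alpha).sum()) / (1.0 - alpha)

        x = np.random.randn(200, 2)
        print(matrix_renyi_entropy(gram_matrix(x)))

    Because the entropy is read off the kernel spectrum, no intermediate probability-density estimate is needed, which is precisely the estimation difficulty the abstract highlights.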