400 research outputs found

    The hardware implementation of an artificial neural network using stochastic pulse rate encoding principles

    In this thesis the development of a hardware artificial neuron device and artificial neural network using stochastic pulse rate encoding principles is considered. After a review of neural network architectures and algorithmic approaches suitable for hardware implementation, a critical review of hardware techniques which have been considered in analogue and digital systems is presented. New results are presented demonstrating the potential of two learning schemes which adapt by the use of a single reinforcement signal. The techniques for computation using stochastic pulse rate encoding are presented and extended with novel circuits relevant to the hardware implementation of an artificial neural network. The generation of random numbers is the key to the encoding of data into the stochastic pulse rate domain. The formation of random numbers and multiple random bit sequences from a single PRBS generator has been investigated. Two techniques, Simulated Annealing and Genetic Algorithms, have been applied successfully to the problem of optimising the configuration of a PRBS random number generator for the formation of multiple random bit sequences and hence random numbers. A complete hardware design for an artificial neuron using stochastic pulse rate encoded signals has been described, designed, simulated, fabricated and tested before configuration of the device into a network to perform simple test problems. The implementation has shown that the processing elements of the artificial neuron are small and simple, but that there can be a significant overhead for the encoding of information into the stochastic pulse rate domain. The stochastic artificial neuron has the capability of on-line weight adaption. The implementation of reinforcement schemes using the stochastic neuron as a basic element is discussed.
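As a rough software illustration of the stochastic pulse rate idea described in this abstract, the sketch below encodes a value in [0, 1] as a random bit stream whose fraction of 1s approximates that value, and multiplies two independent streams with a bitwise AND. It is not the thesis's circuit: the function names (`encode`, `decode`, `multiply`) are hypothetical, and Python's software random number generator stands in for the PRBS hardware generator.

```python
import random


def encode(value, length, rng):
    """Encode a value in [0, 1] as a bit stream whose density of 1s approximates it."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(stream):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(stream) / len(stream)


def multiply(stream_a, stream_b):
    """Multiply two independent stochastic streams with a bitwise AND.

    For independent streams, P(a AND b) = P(a) * P(b).
    """
    return [a & b for a, b in zip(stream_a, stream_b)]


# Assumption: a seeded software RNG replaces the hardware PRBS generator.
rng = random.Random(0)
n_bits = 20000
stream_a = encode(0.5, n_bits, rng)
stream_b = encode(0.4, n_bits, rng)
product = decode(multiply(stream_a, stream_b))
print(product)  # close to 0.5 * 0.4, up to sampling noise
```

The processing element is a single AND per bit, which matches the abstract's point that the arithmetic is cheap while generating the random streams carries the overhead.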

    A Deterministic Annealing Framework for Global Optimization of Delay-Constrained Communication and Control Strategies

    This dissertation is concerned with the problem of global optimization of delay constrained communication and control strategies. Specifically, the objective is to obtain optimal encoder and decoder functions that map between the source space and the channel space, to minimize a given cost functional. The cost surfaces associated with these problems are highly complex and riddled with local minima, rendering gradient descent based methods ineffective. This thesis proposes and develops a powerful non-convex optimization method based on the concept of deterministic annealing (DA), which is derived from information theoretic principles with analogies to statistical physics, and was successfully employed in several problems including vector quantization, classification and regression. DA has several useful properties including reduced sensitivity to initialization and strong potential to avoid poor local minima. DA-based optimization methods are developed here for the following fundamental communication problems: the Wyner-Ziv setting where only a decoder has access to side information, the distributed setting where independent encoders transmit over independent channels to a central decoder, and the analog multiple descriptions setting, which is an extension of the well known source coding problem of multiple descriptions. Comparative numerical results are presented, which show strict superiority of the proposed method over gradient descent based optimization methods as well as prior approaches in literature. Detailed analysis of the highly non-trivial structure of obtained mappings is provided. The thesis further studies the related problem of global optimization of controller mappings in decentralized stochastic control problems, including Witsenhausen's celebrated 1968 counter-example. It is well-known that most decentralized control problems do not admit closed-form solutions and require numerical optimization. An optimization method is developed, based on DA, for a class of decentralized stochastic control problems. Comparative numerical results are presented for two test problems that show strict superiority of the proposed method over prior approaches in literature, and the structure of the obtained controller functions is analyzed.
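The abstract mentions that deterministic annealing has been applied to vector quantization. The sketch below shows that simpler, well-known use on 1-D data, not the dissertation's encoder/decoder or controller optimization. Soft assignments proportional to exp(-d/T) are computed at a temperature T that is lowered gradually, and each codevector is moved to the assignment-weighted mean. The function name, the cooling schedule, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def deterministic_annealing_vq(data, k, t_start=10.0, t_end=0.01,
                               cooling=0.8, inner_iters=20, seed=0):
    """Cluster 1-D data into k codevectors by annealing soft assignments."""
    rng = random.Random(seed)
    mean = sum(data) / len(data)
    # All codevectors start near the global mean; they separate as T drops.
    codes = [mean + 1e-3 * rng.uniform(-1, 1) for _ in range(k)]
    t = t_start
    while t > t_end:
        for _ in range(inner_iters):
            num = [0.0] * k
            den = [0.0] * k
            for x in data:
                dist = [(x - c) ** 2 for c in codes]
                d_min = min(dist)  # subtract the minimum for numerical stability
                w = [math.exp(-(d - d_min) / t) for d in dist]
                z = sum(w)
                for j in range(k):
                    p = w[j] / z  # soft assignment of x to codevector j
                    num[j] += p * x
                    den[j] += p
            codes = [num[j] / den[j] if den[j] > 0 else codes[j]
                     for j in range(k)]
        # A small perturbation lets coincident codevectors split apart.
        codes = [c + 1e-3 * rng.uniform(-1, 1) for c in codes]
        t *= cooling
    return sorted(codes)


# Hypothetical test data: two well-separated clusters.
data_rng = random.Random(1)
data = ([data_rng.gauss(-2.0, 0.3) for _ in range(100)]
        + [data_rng.gauss(3.0, 0.3) for _ in range(100)])
codes = deterministic_annealing_vq(data, k=2)
print(codes)
```

Starting every codevector at the same point and letting the temperature schedule drive the split is what reduces the dependence on initialization that the abstract highlights.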

    A Decision-Theoretic Approach to Resource Allocation in Wireless Multimedia Networks

    The allocation of scarce spectral resources to support as many user applications as possible while maintaining reasonable quality of service is a fundamental problem in wireless communication. We argue that the problem is best formulated in terms of decision theory. We propose a scheme that takes decision-theoretic concerns (like preferences) into account and discuss the difficulties and subtleties involved in applying standard techniques from the theory of Markov Decision Processes (MDPs) in constructing an algorithm that is decision-theoretically optimal. As an example of the proposed framework, we construct such an algorithm under some simplifying assumptions. Additionally, we present analysis and simulation results that show that our algorithm meets its design goals. Finally, we investigate how far from optimal one well-known heuristic is. The main contribution of our results is in providing insight and guidance for the design of near-optimal admission-control policies. Comment: To appear, Dial M for Mobility, 200

    Multiple Description Quantization via Gram-Schmidt Orthogonalization

    The multiple description (MD) problem has received considerable attention as a model of information transmission over unreliable channels. A general framework for designing efficient multiple description quantization schemes is proposed in this paper. We provide a systematic treatment of the El Gamal-Cover (EGC) achievable MD rate-distortion region, and show that any point in the EGC region can be achieved via a successive quantization scheme along with quantization splitting. For the quadratic Gaussian case, the proposed scheme has an intrinsic connection with the Gram-Schmidt orthogonalization, which implies that the whole Gaussian MD rate-distortion region is achievable with a sequential dithered lattice-based quantization scheme as the dimension of the (optimal) lattice quantizers becomes large. Moreover, this scheme is shown to be universal for all i.i.d. smooth sources, with performance no worse than that for an i.i.d. Gaussian source with the same variance, and asymptotically optimal at high resolution. A class of low-complexity MD scalar quantizers in the proposed general framework is also constructed and illustrated geometrically; the performance is analyzed in the high resolution regime and exhibits a noticeable improvement over the existing MD scalar quantization schemes. Comment: 48 pages; submitted to IEEE Transactions on Information Theor

    Healing failures and improving generalization in deep generative modelling

    Deep generative modeling is a crucial and rapidly developing area of machine learning, with numerous potential applications, including data generation, anomaly detection, data compression, and more. Despite the significant empirical success of many generative models, some limitations still need to be addressed to improve their performance in certain cases. This thesis focuses on understanding the limitations of generative modeling in common scenarios and proposes corresponding techniques to alleviate these limitations and improve performance in practical generative modeling applications. Specifically, the thesis is divided into two sub-topics: one focusing on the training and the other on the generalization of generative models. A brief introduction to each sub-topic is provided below. Generative models are typically trained by optimizing their fit to the data distribution. This is achieved by minimizing a statistical divergence between the model and data distributions. However, there are cases where these divergences fail to accurately capture the differences between the model and data distributions, resulting in poor performance of the trained model. In the first part of the thesis, we discuss two situations where the classic divergences are ineffective for training the models: 1. KL divergence fails to train implicit models for manifold modeling tasks. 2. Fisher divergence cannot distinguish the mixture proportions when modeling a multi-modal target distribution. For both failure modes, we investigate the theoretical reasons underlying the failures of KL and Fisher divergences in modelling certain types of data distributions. We propose techniques that address the limitations of these divergences, enabling more reliable estimation of the underlying data distributions.
While the generalization of classification or regression models has been extensively studied in machine learning, the generalization of generative models is a relatively under-explored area. In the second part of this thesis, we aim to address this gap by investigating the generalization properties of generative models. Specifically, we investigate two generalization scenarios: 1. In-distribution (ID) generalization of probabilistic models, where the test data and the training data are from the same distribution. 2. Out-of-distribution (OOD) generalization of probabilistic models, where the test data and the training data can come from different distributions. In the context of ID generalization, our emphasis rests on the Variational Auto-Encoder (VAE) model, and for OOD generalization, we primarily explore autoregressive models. By studying the generalization properties of the models, we demonstrate how to design new models or training criteria that improve the performance of practical applications, such as lossless compression and OOD detection. The findings of this thesis shed light on the intricate challenges faced by generative models in both training and generalization scenarios. Our investigations into the inefficacies of classic divergences like KL and Fisher highlight the importance of tailoring modeling techniques to the specific characteristics of data distributions. Additionally, by delving into the generalization aspects of generative models, this work pioneers insights into the ID and OOD scenarios, a domain not extensively covered in current literature. Collectively, the insights and techniques presented in this thesis provide valuable contributions to the community, fostering an environment for the development of more robust and reliable generative models. It's our hope that these take-home messages will serve as a foundation for future research and applications in the realm of deep generative modeling

    Speech coding at medium bit rates using analysis by synthesis techniques

    Speech coding at medium bit rates using analysis by synthesis technique

    Auto-Encoding Variational Neural Machine Translation

    We present a deep generative model of bilingual sentence pairs for machine translation. The model generates source and target sentences jointly from a shared latent representation and is parameterised by neural networks. We perform efficient training using amortised variational inference and reparameterised gradients. Additionally, we discuss the statistical implications of joint modelling and propose an efficient approximation to maximum a posteriori decoding for fast test-time predictions. We demonstrate the effectiveness of our model in three machine translation scenarios: in-domain training, mixed-domain training, and learning from a mix of gold-standard and synthetic data. Our experiments show consistently that our joint formulation outperforms conditional modelling (i.e. standard neural machine translation) in all such scenarios

    Managing performance vs. accuracy trade-offs with loop perforation

    Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. Loop perforation provides a general technique to trade accuracy for performance by transforming loops to execute a subset of their iterations. A criticality testing phase filters out critical loops (whose perforation produces unacceptable behavior) to identify tunable loops (whose perforation produces more efficient and still acceptably accurate computations). A perforation space exploration algorithm perforates combinations of tunable loops to find Pareto-optimal perforation policies. Our results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%
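A hand-written illustration of the loop perforation idea from this abstract follows: the perforated loop executes only every `stride`-th iteration and so trades accuracy for less work. This is only a sketch of the concept applied manually to one loop. The function names and the test signal are hypothetical, and the criticality testing and perforation space exploration phases described in the abstract are not modelled.

```python
import math


def mean_exact(values):
    """Reference loop: visits every element."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def mean_perforated(values, stride):
    """Perforated loop: executes only every `stride`-th iteration."""
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


# Hypothetical smooth signal; perforation suits loops whose iterations
# carry redundant information like this one.
values = [math.sin(0.01 * i) ** 2 for i in range(100000)]
exact = mean_exact(values)
approx = mean_perforated(values, stride=4)
relative_error = abs(approx - exact) / exact
print(exact, approx, relative_error)
```

With `stride=4` the loop body runs a quarter as often. Whether the resulting error is acceptable depends on the computation, which is why the abstract separates critical loops from tunable ones.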