25,013 research outputs found

    Set Cross Entropy: Likelihood-based Permutation Invariant Loss Function for Probability Distributions

    We propose a permutation-invariant loss function designed for neural networks that reconstruct a set of elements without regard to the order within its vector representation. Unlike popular approaches for encoding and decoding a set, our work relies neither on a carefully engineered network topology nor on any additional sequential algorithm. The proposed method, Set Cross Entropy, has a natural information-theoretic interpretation and is related to the metrics defined for sets. We evaluate the proposed approach in two object reconstruction tasks and a rule learning task. Comment: The source code will be available at https://github.com/guicho271828/perminv . (comment for the revision: the result table was not correctly updated)
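    A likelihood-based permutation-invariant set loss can be illustrated roughly as follows. This is a minimal sketch, not the paper's exact formulation; the Bernoulli-likelihood matching of target elements to predicted element-wise probabilities is an assumption for the example:

```python
import math

def set_cross_entropy(targets, preds, eps=1e-12):
    # For each target element, sum the likelihoods assigned to it by
    # every predicted element, then average the negative log-likelihoods.
    # Summing over all predictions makes the loss independent of the
    # order in which the set elements are emitted.
    total = 0.0
    for t in targets:
        s = 0.0
        for p in preds:
            lik = 1.0
            for ti, pi in zip(t, p):
                lik *= pi if ti else 1.0 - pi  # Bernoulli likelihood
            s += lik
        total += math.log(max(s, eps))
    return -total / len(targets)
```

    Shuffling the predicted elements leaves the loss value unchanged, which is the permutation invariance the abstract refers to.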

    Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation

    We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous grouping and sequential grouping. Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network. In the second stage, the frame-level separated spectra are sequentially grouped to different speakers by a clustering network. The proposed deep CASA approach optimizes frame-level separation and speaker tracking in turn, and produces excellent results for both objectives. Experimental results on the benchmark WSJ0-2mix database show that the new approach achieves state-of-the-art results with a modest model size. Comment: 10 pages, 5 figures
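    The first stage relies on permutation-invariant training (PIT). A minimal sketch of a frame-level PIT objective, here using plain squared error over hypothetical magnitude spectra rather than the paper's actual training criterion:

```python
from itertools import permutations

def pit_loss(est, ref):
    # Try every assignment of estimated spectra to reference speakers
    # and keep the assignment with the smallest total squared error,
    # so the network is never penalized for emitting speakers in a
    # different order than the labels.
    best = float("inf")
    for perm in permutations(range(len(ref))):
        err = sum(
            sum((e - r) ** 2 for e, r in zip(est[i], ref[j]))
            for i, j in enumerate(perm)
        )
        best = min(best, err)
    return best
```

    The second-stage clustering network then resolves which frame-level output belongs to which speaker across time.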

    Maximum Rank and Asymptotic Rank of Finite Dynamical Systems

    A finite dynamical system is a system of multivariate functions over a finite alphabet used to model a network of interacting entities. The main feature of a finite dynamical system is its interaction graph, which indicates which local functions depend on which variables; the interaction graph is a qualitative representation of the interactions amongst entities in the network. The rank of a finite dynamical system is the cardinality of its image; the periodic rank is the number of its periodic points. In this paper, we determine the maximum rank and the maximum periodic rank of a finite dynamical system with a given interaction graph over any non-Boolean alphabet. We also obtain a similar result for Boolean finite dynamical systems (also known as Boolean networks) whose interaction graphs are contained in a given digraph. We then prove that the average rank is relatively close to the maximum when the alphabet is large. The results mentioned above only deal with the parallel update schedule. We finally determine the maximum rank over all block-sequential update schedules and the supremum periodic rank over all complete update schedules.
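    For small systems, the rank can be computed directly by enumerating one parallel update step. A brute-force sketch for a Boolean network (the two local functions below are made-up examples):

```python
from itertools import product

def rank(local_fns, alphabet, n):
    # Rank = cardinality of the image of one parallel update:
    # apply every local function simultaneously to each state.
    images = {
        tuple(f(state) for f in local_fns)
        for state in product(alphabet, repeat=n)
    }
    return len(images)
```

    With local functions AND and OR on two Boolean variables, the state (1,0) and (0,1) collapse to the same image point, so the rank is 3 out of 4 possible states.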

    Seq2Slate: Re-ranking and Slate Optimization with RNNs

    Ranking is a central task in machine learning and information retrieval. In this task, it is especially important to present the user with a slate of items that is appealing as a whole. This in turn requires taking into account interactions between items, since intuitively, placing an item on the slate affects the decision of which other items should be placed alongside it. In this work, we propose a sequence-to-sequence model for ranking called seq2slate. At each step, the model predicts the next `best' item to place on the slate given the items already selected. The sequential nature of the model allows complex dependencies between the items to be captured directly in a flexible and scalable way. We show how to learn the model end-to-end from weak supervision in the form of easily obtained click-through data. We further demonstrate the usefulness of our approach in experiments on standard ranking benchmarks as well as in a real-world recommendation system.
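    Decoding in such a model amounts to greedy sequential selection: repeatedly score the remaining items conditioned on the slate built so far. In this sketch, a hand-written diversity-penalty score stands in for the learned network, which is purely an illustrative assumption:

```python
def greedy_slate(items, score, k):
    # Build a slate of up to k items, each step picking the item with
    # the highest score given the items already selected.
    slate, remaining = [], list(items)
    while remaining and len(slate) < k:
        best = max(remaining, key=lambda item: score(item, slate))
        slate.append(best)
        remaining.remove(best)
    return slate
```

    Because the score sees the partial slate, an item's value can drop once a similar item is already selected, capturing the interactions the abstract describes.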

    A Fast Image Encryption Scheme based on Chaotic Standard Map

    In recent years, a variety of effective chaos-based image encryption schemes have been proposed. The typical structure of these schemes performs the permutation and the diffusion stages alternately. The confusion and diffusion effects are contributed solely by the permutation and the diffusion stage, respectively. As a result, more overall rounds than necessary are required to achieve a certain level of security. In this paper, we propose introducing a certain diffusion effect in the confusion stage through simple sequential add-and-shift operations. The purpose is to reduce the workload of the time-consuming diffusion part so that fewer overall rounds, and hence a shorter encryption time, are needed. Simulation results show that at a similar performance level, the proposed cryptosystem needs less than one-third of the encryption time of an existing cryptosystem. An effective acceleration of the encryption speed is thus achieved. Comment: 16 pages, 7 figures
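    The add-and-shift idea can be sketched as follows; the rotation amount and byte-wise mixing below are illustrative assumptions, not the paper's exact parameters:

```python
def permute_with_diffusion(pixels, perm, seed=0):
    # Confusion stage with built-in diffusion: each relocated pixel is
    # added to the previous cipher byte, then cyclically shifted, so a
    # single plaintext change propagates through the rest of the output.
    out, prev = [], seed
    for i in perm:
        v = (pixels[i] + prev) % 256            # sequential add
        v = ((v << 3) | (v >> 5)) & 0xFF        # cyclic left shift by 3
        out.append(v)
        prev = v
    return out

def invert(cipher, perm, seed=0):
    # Undo the shift, subtract the previous cipher byte, un-permute.
    plain = [0] * len(cipher)
    prev = seed
    for i, c in zip(perm, cipher):
        u = ((c >> 3) | (c << 5)) & 0xFF        # cyclic right shift by 3
        plain[i] = (u - prev) % 256
        prev = c
    return plain
```

    The inverse works because each cipher byte depends only on the current plaintext byte and the previous cipher byte, both available to the decryptor.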

    Introducing a Probabilistic Structure on Sequential Dynamical Systems, Simulation and Reduction of Probabilistic Sequential Networks

    A probabilistic structure on sequential dynamical systems is introduced here; the new model is called a Probabilistic Sequential Network (PSN). The morphisms of Probabilistic Sequential Networks are defined using two algebraic conditions. It is proved that two homomorphic Probabilistic Sequential Networks have the same equilibrium (steady state) probabilities if the morphism is either an epimorphism or a monomorphism. Additionally, we prove that the set of PSNs with their morphisms forms the category PSN, which has the category of sequential dynamical systems, SDS, as a full subcategory. Several examples of morphisms, subsystems and simulations are given. Comment: 14 pages

    Finding the Minimal DFA of Very Large Finite State Automata with an Application to Token Passing Networks

    Finite state automata (FSA) are ubiquitous in computer science. Two of the most important algorithms for FSA processing are the conversion of a non-deterministic finite automaton (NFA) to a deterministic finite automaton (DFA), and then the production of the unique minimal DFA for the original NFA. We exhibit a parallel disk-based algorithm that uses a cluster of 29 commodity computers to produce an intermediate DFA with almost two billion states and then continues by producing the corresponding unique minimal DFA with fewer than 800,000 states. The largest previous such computation in the literature was carried out on a 512-processor CM-5 supercomputer in 1996. That computation produced an intermediate DFA with 525,000 states and an unreported number of states for the corresponding minimal DFA. The work is used to provide strong experimental evidence for a conjecture on a series of token passing networks. The conjecture concerns stack sortable permutations for a finite stack and a 3-buffer. The origins of this problem lie in the work on restricted permutations begun by Knuth and Tarjan in the late 1960s. The parallel disk-based computation is also compared with both a single-threaded and a multi-threaded RAM-based implementation using a 16-core, 128 GB large shared memory computer. Comment: 14 pages, 4 figures
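    The NFA-to-DFA step is the classical subset construction; a tiny in-memory sketch follows (the disk-based parallel version described in the paper is, of course, far more involved):

```python
def nfa_to_dfa(trans, start, accept, symbols):
    # trans maps (nfa_state, symbol) -> set of successor NFA states.
    # Each DFA state is a frozenset of NFA states; we explore reachable
    # subsets breadth-first from the start state's singleton set.
    start_set = frozenset([start])
    dfa, accepting = {}, set()
    frontier, seen = [start_set], {start_set}
    while frontier:
        cur = frontier.pop()
        if cur & accept:
            accepting.add(cur)
        for sym in symbols:
            nxt = frozenset().union(*(trans.get((q, sym), set()) for q in cur))
            dfa[(cur, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return dfa, start_set, accepting
```

    The blow-up the paper wrestles with comes from `seen` growing toward the power set of NFA states, which is why a disk-based, distributed representation becomes necessary at billions of states.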

    Simulation of Probabilistic Sequential Systems

    In this paper we introduce the idea of probability into the definition of Sequential Dynamical Systems, obtaining a new concept, the Probabilistic Sequential System. The introduction of a probabilistic structure on Sequential Dynamical Systems is an important and interesting problem. The notion of homomorphism in our new model is a natural extension of the homomorphism of sequential dynamical systems introduced and developed by Laubenbacher and Pareigis in several papers. Our model makes it possible to describe the dynamics of the systems using Markov chains and all the advantages of stochastic theory. The notion of simulation is introduced using the concept of homomorphism, as usual. Several examples of homomorphisms, subsystems and simulations are given.
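    The Markov-chain view means the long-run behaviour of such a system is captured by a stationary distribution. A sketch via power iteration, using a made-up two-state transition matrix:

```python
def stationary(P, iters=200):
    # Power iteration: repeatedly push a uniform start distribution
    # through the row-stochastic transition matrix until it settles.
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi
```

    For an irreducible, aperiodic chain this converges to the unique steady-state probabilities, the quantities preserved by the epi- and monomorphisms discussed in the related PSN paper above.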

    Convergence Analysis of Distributed Stochastic Gradient Descent with Shuffling

    When using stochastic gradient descent to solve large-scale machine learning problems, a common data processing practice is to shuffle the training data, partition the data across multiple machines if needed, and then perform several epochs of training on the re-shuffled (either locally or globally) data. This procedure means the instances used to compute the gradients are no longer independently sampled from the training data set. Does the distributed SGD method still have desirable convergence properties in this practical situation? In this paper, we answer this question. First, we give a mathematical formulation for the practical data processing procedure in distributed machine learning, which we call data partition with global/local shuffling. We observe that global shuffling is equivalent to without-replacement sampling if the shuffling operations are independent. We prove that SGD with global shuffling has convergence guarantees in both the convex and non-convex cases. An interesting finding is that non-convex tasks such as deep learning are better suited to shuffling than convex tasks. Second, we conduct the convergence analysis for SGD with local shuffling. The convergence rate for local shuffling is slower than that for global shuffling, since it loses some information when there is no communication between partitioned data. Finally, we consider the situation when the permutation after shuffling is not uniformly distributed (insufficient shuffling), and discuss the condition under which this insufficiency does not affect the convergence rate. Our theoretical results provide important insights into large-scale machine learning, especially the selection of data processing methods needed to achieve faster convergence and good speedup. Our theoretical findings are verified by extensive experiments on logistic regression and deep neural networks.
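    Global shuffling corresponds to without-replacement sampling within each epoch. A minimal single-machine sketch on a toy least-squares problem (the model and data are made up for illustration):

```python
import random

def sgd_shuffled(data, lr=0.05, epochs=200, seed=0):
    # Each epoch visits every example exactly once in a fresh random
    # order: without-replacement sampling, i.e. global shuffling.
    rng = random.Random(seed)
    w = 0.0
    idx = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            x, y = data[i]
            w -= lr * 2 * (w * x - y) * x  # gradient of (w*x - y)**2
    return w
```

    Local shuffling would instead reshuffle only within each machine's fixed partition, which is the setting the paper shows converges more slowly.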

    Multi-Issue Social Learning

    We consider social learning where agents can only observe part of the population (modeled as neighbors on an undirected graph), face many decision problems, and the arrival order of the agents is unknown. The central question we pose is whether there is a natural observability graph that prevents the information cascade phenomenon. We introduce the `celebrities graph' and prove that it indeed allows for proper information aggregation in large populations, even when the order in which agents decide is random and even when different issues are decided in different orders. Comment: Accepted to the Mathematical Social Sciences journal