
    Stochastic Answer Networks for Machine Reading Comprehension

    We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. Compared to previous work such as ReasoNet, which used reinforcement learning to determine the number of steps, the unique feature is the use of a kind of stochastic prediction dropout on the answer module (final layer) of the neural network during training. We show that this simple trick improves robustness and achieves results competitive with the state of the art on the Stanford Question Answering Dataset (SQuAD), the Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension Dataset (MS MARCO).
    Comment: 11 pages, 5 figures, Accepted to ACL 201
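    The core trick, stochastic prediction dropout, can be sketched as follows: at each reasoning step the answer module emits a prediction, and during training whole steps are randomly dropped before the per-step predictions are averaged. This is a minimal illustrative sketch, not the authors' implementation; all names and the drop probability are assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def stochastic_answer(step_predictions, drop_prob=0.4, training=True):
        """Average per-step answer distributions, randomly dropping whole
        steps during training (stochastic prediction dropout).

        step_predictions: array of shape (T, num_classes), one softmax
        distribution per reasoning step.
        """
        preds = np.asarray(step_predictions, dtype=float)
        if training:
            keep = rng.random(len(preds)) >= drop_prob
            if not keep.any():                    # never drop every step
                keep[rng.integers(len(preds))] = True
            preds = preds[keep]
        return preds.mean(axis=0)                 # averaged answer distribution

    steps = np.array([[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]])
    print(stochastic_answer(steps, training=False))  # -> [0.7 0.3]
    ```

    At inference time no steps are dropped, so the model averages over all reasoning steps, which is what gives the ensemble-like robustness the abstract describes.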

    A Neural Model of How The Brain Represents and Compares Numbers

    Many psychophysical experiments have shown that the representation of numbers and numerical quantities in humans and animals is related to number magnitude. A neural network model is proposed to quantitatively simulate error rates in quantification and numerical comparison tasks, and reaction times for number priming and numerical assessment and comparison tasks. Transient responses to inputs are integrated before they activate an ordered spatial map that selectively responds to the number of events in a sequence. The dynamics of numerical comparison are encoded in activity pattern changes within this spatial map. Such changes cause a "directional comparison wave" whose properties mimic data about numerical comparison. These model mechanisms are variants of neural mechanisms that have elsewhere been used to explain data about motion perception, attention shifts, and target tracking. Thus, the present model suggests how numerical representations may have emerged as specializations of more primitive mechanisms in the cortical Where processing stream.
    National Science Foundation (IRI-97-20333); Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Institutes of Health (1-R29-DC02952-01)

    Multilingual Language Processing From Bytes

    We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label], where start positions, lengths, and labels are separate entries in our vocabulary. Because we operate directly on Unicode bytes rather than language-specific words or characters, we can analyze text in many languages with a single model. Due to the small vocabulary size, these multilingual models are very compact, but produce results similar to or better than the state-of-the-art in Part-of-Speech tagging and Named Entity Recognition using only the provided training datasets (no external data sources). Our models learn "from scratch" in that they do not rely on any elements of the standard Natural Language Processing pipeline (including tokenization), and thus can run in standalone fashion on raw text.
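    The [start, length, label] output format is defined over byte offsets into the UTF-8 encoding of the text, which is what makes the representation language-independent. A small sketch of the conversion from character-level annotations to byte-level spans (the helper and its signature are illustrative, not from the paper):

    ```python
    def byte_spans(text, annotations):
        """Convert character-level annotations (char_start, char_end, label)
        into byte-level (start, length, label) triples over the UTF-8
        encoding of `text`, matching the BTS span format.
        """
        spans = []
        for cs, ce, label in annotations:
            start = len(text[:cs].encode("utf-8"))     # byte offset of span start
            length = len(text[cs:ce].encode("utf-8"))  # span length in bytes
            spans.append((start, length, label))
        return spans

    text = "Łukasz lives in Kraków"
    print(byte_spans(text, [(0, 6, "PER"), (16, 22, "LOC")]))
    # -> [(0, 7, 'PER'), (17, 7, 'LOC')]
    ```

    Note that multi-byte characters such as "Ł" and "ó" shift the byte offsets relative to character offsets, which is why the model's vocabulary indexes bytes rather than characters.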

    Storing cycles in Hopfield-type networks with pseudoinverse learning rule: admissibility and network topology

    Cyclic patterns of neuronal activity are ubiquitous in animal nervous systems, and partially responsible for generating and controlling rhythmic movements such as locomotion, respiration, swallowing and so on. Clarifying the role of network connectivity in generating cyclic patterns is fundamental for understanding the generation of rhythmic movements. In this paper, the storage of binary cycles in neural networks is investigated. We call a cycle Σ admissible if a connectivity matrix satisfying the cycle's transition conditions exists, and construct it using the pseudoinverse learning rule. Our main focus is on the structural features of admissible cycles and the corresponding network topology. We show that Σ is admissible if and only if its discrete Fourier transform contains exactly r = rank(Σ) nonzero columns. Based on the decomposition of the rows of Σ into loops, where a loop is the set of all cyclic permutations of a row, cycles are classified as simple cycles, separable or inseparable composite cycles. Simple cycles contain rows from one loop only, and the network topology is a feedforward chain with feedback to one neuron if the loop-vectors in Σ are cyclic permutations of each other. Composite cycles contain rows from at least two disjoint loops, and the neurons corresponding to the rows in Σ from the same loop are identified with a cluster. Networks constructed from separable composite cycles decompose into completely isolated clusters. For inseparable composite cycles at least two clusters are connected, and the cluster-connectivity is related to the intersections of the spaces spanned by the loop-vectors of the clusters. Simulations showing successfully retrieved cycles in continuous-time Hopfield-type networks and in networks of spiking neurons are presented.
    Comment: 48 pages, 3 figures
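    The pseudoinverse construction can be sketched in a few lines: collect the cycle's states as columns of a matrix X, their successors as columns of Y, and take W = Y X⁺ so that each state maps to the next under the sign nonlinearity. This is a generic pseudoinverse-rule sketch under assumed ±1 states, not the paper's exact notation or admissibility analysis.

    ```python
    import numpy as np

    # Store a binary cycle with the pseudoinverse rule: find W such that
    # sign(W @ sigma_t) = sigma_{t+1} for every step of the cycle.
    Sigma = np.array([[ 1,  1, -1],
                      [-1,  1,  1],
                      [ 1, -1,  1]], dtype=float)  # rows = states of a 3-step cycle

    X = Sigma.T                        # columns = current states
    Y = np.roll(Sigma, -1, axis=0).T   # columns = successor states
    W = Y @ np.linalg.pinv(X)          # connectivity matrix (pseudoinverse rule)

    # Verify the cycle is retrieved step by step
    state = Sigma[0]
    for t in range(3):
        state = np.sign(W @ state)
        assert np.array_equal(state, Sigma[(t + 1) % 3])
    print("cycle stored")
    ```

    When the states are linearly independent, as here, W satisfies the transition conditions exactly; the paper's admissibility criterion characterizes when such a W exists in general.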

    A Neural Model of How the Brain Represents and Compares Multi-Digit Numbers: Spatial and Categorical Processes

    Both animals and humans are capable of representing and comparing numerical quantities, but only humans seem to have evolved multi-digit place-value number systems. This article develops a neural model, called the Spatial Number Network, or SpaN model, which predicts how these shared numerical capabilities are computed using a spatial representation of number quantities in the Where cortical processing stream, notably the Inferior Parietal Cortex. Multi-digit numerical representations that obey a place-value principle are proposed to arise through learned interactions between categorical language representations in the What cortical processing stream and the Where spatial representation. It is proposed that learned semantic categories that symbolize separate digits, as well as place markers like "tens," "hundreds," "thousands," etc., are associated through learning with the corresponding spatial locations of the Where representation, leading to a place-value number system as an emergent property of What-Where information fusion. The model quantitatively simulates error rates in quantification and numerical comparison tasks, and reaction times for number priming and numerical assessment and comparison tasks. In the Where cortical process, it is proposed that transient responses to inputs are integrated before they activate an ordered spatial map that selectively responds to the number of events in a sequence. Neural mechanisms are defined which give rise to an ordered spatial numerical map, with ordering and Weber law characteristics as emergent properties. The dynamics of numerical comparison are encoded in activity pattern changes within this spatial map. Such changes cause a "directional comparison wave" whose properties mimic data about numerical comparison. These model mechanisms are variants of neural mechanisms that have elsewhere been used to explain data about motion perception, attention shifts, and target tracking. 
    Thus, the present model suggests how numerical representations may have emerged as specializations of more primitive mechanisms in the cortical Where processing stream. The model's What-Where interactions can explain human psychophysical data, such as error rates and reaction times, about multi-digit (base 10) numerical stimuli, and describe how such a competence can develop through learning. The SpaN model and its explanatory range are compared with other models of numerical representation.
    Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Science Foundation (IRI-97-20333)

    Neural network modeling of memory deterioration in Alzheimer's disease

    The clinical course of Alzheimer's disease (AD) is generally characterized by progressive gradual deterioration, although large clinical variability exists. Motivated by recent quantitative reports of synaptic changes in AD, we use a neural network model to investigate how the interplay between synaptic deletion and compensation determines the pattern of memory deterioration, a clinical hallmark of AD. Within the model, we show that the deterioration of memory retrieval due to synaptic deletion can be much delayed by multiplying all the remaining synaptic weights by a common factor, which keeps the average input to each neuron at the same level. This parallels the experimental observation that the total synaptic area per unit volume (TSA) is initially preserved when synaptic deletion occurs. By using different dependencies of the compensatory factor on the amount of synaptic deletion, one can define various compensation strategies, which can account for the observed variation in the severity and progression rate of AD.
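    The deletion-with-compensation mechanism can be illustrated directly: randomly zero out a fraction d of the synapses, then scale the survivors by 1/(1 - d) so the expected total input per neuron is preserved. This is a sketch of the interplay described in the abstract under assumed random deletion, not the authors' exact model or compensation strategies.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def delete_and_compensate(W, deletion_fraction):
        """Randomly delete a fraction of synapses, then multiply the
        survivors by a compensation factor c = 1 / (1 - d), which keeps
        the expected total input to each neuron at its original level.
        """
        mask = rng.random(W.shape) >= deletion_fraction  # surviving synapses
        c = 1.0 / (1.0 - deletion_fraction)              # full compensation
        return W * mask * c

    W = rng.standard_normal((100, 100))
    W_damaged = delete_and_compensate(W, deletion_fraction=0.3)

    # Total synaptic "area" (sum of |weights|) is preserved in expectation,
    # mirroring the TSA observation cited in the abstract.
    ratio = np.abs(W_damaged).sum() / np.abs(W).sum()
    print(round(ratio, 2))  # close to 1.0
    ```

    Weaker or delayed compensation strategies correspond to choosing c below 1/(1 - d), which is how the model accounts for the variability in disease progression.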

    Supervised Learning in Spiking Neural Networks for Precise Temporal Encoding

    Precise spike timing as a means to encode information in neural networks is biologically supported, and is advantageous over frequency-based codes in that it processes input features on a much shorter time-scale. For these reasons, much recent attention has been focused on the development of supervised learning rules for spiking neural networks that utilise a temporal coding scheme. However, despite significant progress in this area, rules that both have a theoretical basis and can be considered biologically relevant are still lacking. Here we examine the general conditions under which synaptic plasticity most effectively takes place to support the supervised learning of a precise temporal code. As part of our analysis we examine two spike-based learning methods: one relies on an instantaneous error signal to modify synaptic weights in a network (INST rule), and the other on a filtered error signal for smoother synaptic weight modifications (FILT rule). We test the accuracy of the solutions provided by each rule with respect to their temporal encoding precision, and then measure the maximum number of input patterns they can learn to memorise using the precise timings of individual spikes, as an indication of their storage capacity. Our results demonstrate the high performance of FILT in most cases, underpinned by the rule's error-filtering mechanism, which is predicted to provide smooth convergence towards a desired solution during learning. We also find FILT to be most efficient at performing input pattern memorisations, most noticeably when patterns are identified using spikes with sub-millisecond temporal precision. 
    In comparison with existing work, we determine the performance of FILT to be consistent with that of the highly efficient E-learning Chronotron, but with the distinct advantage that FILT is also implementable as an online method for increased biological realism.
    Comment: 26 pages, 10 figures, this version is published in PLoS ONE and incorporates reviewer comments
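    The INST/FILT distinction can be sketched as a single loop: INST applies the raw per-timestep error to the weight update, while FILT first passes the error through an exponential low-pass filter, yielding the smoother updates the abstract credits for its convergence. This is an illustrative sketch of the error-filtering idea only; the shapes, parameters, and names are assumptions, not the paper's equations.

    ```python
    import numpy as np

    def filt_update(errors, inputs, eta=0.1, tau=10.0, dt=1.0):
        """Accumulate weight changes driven by an exponentially filtered
        error signal (FILT-style); as tau -> 0 the filtered error
        approaches the instantaneous error, recovering an INST-style rule.

        errors: (T,) target-minus-actual spike error per time step
        inputs: (T, n_syn) presynaptic activity traces per time step
        """
        decay = np.exp(-dt / tau)
        filtered = 0.0
        dw = np.zeros(inputs.shape[1])
        for e_t, x_t in zip(errors, inputs):
            filtered = decay * filtered + e_t   # low-pass filter the error
            dw += eta * filtered * x_t          # smoother weight updates
        return dw

    # A single error event keeps influencing later updates via the filter.
    dw = filt_update(np.array([1.0, 0.0, 0.0]), np.ones((3, 2)))
    print(dw.shape)  # (2,)
    ```

    The filter's memory is what lets an error at one spike time credit synaptic inputs arriving slightly before or after it, which is the mechanism behind FILT's sub-millisecond precision results.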