    A neural blackboard architecture of sentence structure

    We present a neural architecture for sentence representation. Sentences are represented in terms of word representations as constituents. A word representation consists of a neural assembly distributed over the brain. Sentence representation does not result from associations between neural word assemblies. Instead, word assemblies are embedded in a neural architecture in which the structural (thematic) relations between words can be represented. Arbitrary thematic relations between arguments and verbs can be represented. Arguments can consist of nouns and phrases, as in sentences with relative clauses. A number of sentences can be stored simultaneously in this architecture. We simulate how probe questions about thematic relations can be answered. We discuss how differences in sentence complexity, such as the difference between subject-extracted and object-extracted relative clauses and the difference between right-branching and center-embedded structures, can be related to the underlying neural dynamics of the model. Finally, we illustrate how memory capacity for sentence representation can be related to the nature of reverberating neural activity, which is used to store information temporarily in this architecture.
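
    As an illustration of the kind of binding the abstract describes, the following is a minimal, purely symbolic sketch in Python of a blackboard that binds word representations to thematic roles and answers probe questions. It is not the authors' neural implementation, and the names Clause, Blackboard, bind, and probe are hypothetical.

```python
# A symbolic-level sketch of blackboard binding: word assemblies are stood in
# for by string labels, and thematic roles by dictionary keys. The actual model
# uses neural assemblies and reverberating activity; this only illustrates the
# bookkeeping the architecture performs.

class Clause:
    """One sentence structure on the blackboard: a verb plus its thematic roles."""
    def __init__(self):
        self.bindings = {}                 # role -> word (or phrase) representation

    def bind(self, role, word):
        self.bindings[role] = word

class Blackboard:
    def __init__(self):
        self.clauses = []                  # several sentences can be stored at once

    def new_clause(self):
        clause = Clause()
        self.clauses.append(clause)
        return clause

    def probe(self, verb, role):
        """Answer a probe question such as: who is the agent of 'chases'?"""
        for clause in self.clauses:
            if clause.bindings.get("verb") == verb:
                return clause.bindings.get(role)
        return None

board = Blackboard()
clause = board.new_clause()
clause.bind("verb", "chases")
clause.bind("agent", "the dog")            # arguments may be nouns or whole phrases
clause.bind("theme", "the cat that the boy sees")

print(board.probe("chases", "agent"))      # -> "the dog"
```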

    The role of recurrent networks in neural architectures of grounded cognition: learning of control

    Recurrent networks have been used as neural models of language processing, with mixed results. Here, we discuss the role of recurrent networks in a neural architecture of grounded cognition. In particular, we discuss how the control of binding in this architecture can be learned. We trained a simple recurrent network (SRN) and a feedforward network (FFN) for this task. The results show that information from the architecture is needed as input for these networks to learn control of binding. Thus, both control systems are recurrent. We found that the recurrent system consisting of the architecture and an SRN or an FFN as a "core" can learn basic (but recursive) sentence structures. Problems with control of binding arise when the system with the SRN is tested on a number of new sentence structures. In contrast, control of binding for these structures succeeds with the FFN. Yet, for some structures with (unlimited) embeddings, difficulties arise due to dynamical binding conflicts in the architecture itself. In closing, we discuss potential future developments of the architecture presented here.
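
    A minimal sketch of the recurrence that distinguishes an SRN from a feedforward controller is given below in NumPy. The layer sizes, weights, and step function are illustrative assumptions, not the trained control networks from the study, and no learning rule is shown.

```python
# An Elman-style simple recurrent network (SRN) step in NumPy. The new hidden
# state depends on the previous one, which is the recurrence a feedforward
# control network (FFN) lacks on its own.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 8, 16, 8                        # toy layer sizes (assumed)
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))    # recurrent (context) weights
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))

def step(x, h_prev):
    """One time step: hidden state carries information across the sentence."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev)
    y = W_hy @ h                                     # control signal, e.g. which binding to open
    return y, h

# Feed a toy word sequence, one one-hot vector per word.
h = np.zeros(n_hid)
for t in range(5):
    x = np.eye(n_in)[t % n_in]
    y, h = step(x, h)
```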

    GATology for Linguistics: What Syntactic Dependencies It Knows

    Graph Attention Network (GAT) is a graph neural network that offers one strategy for modeling and representing explicit syntactic knowledge, and it can work with pre-trained models, such as BERT, in downstream tasks. There is still little investigation of how GAT learns syntactic knowledge from the perspective of model structure, and, as a strategy for modeling explicit syntactic knowledge, the combination of GAT and BERT has not yet been applied and discussed in Machine Translation (MT) scenarios. We design a dependency relation prediction task to study how GAT learns syntactic knowledge of three languages as a function of the number of attention heads and layers. We also use a paired t-test and F1-score to clarify the differences in syntactic dependency prediction between GAT and BERT fine-tuned on the MT task (MT-B). The experiments show that better performance can be achieved by appropriately increasing the number of attention heads with two GAT layers; with more than two layers, learning suffers. Moreover, GAT is more competitive than MT-B in training speed and syntactic dependency prediction, which suggests a better incorporation of explicit syntactic knowledge and the possibility of combining GAT and BERT in MT tasks.
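
    For reference, one attention head of a standard GAT layer can be computed as in the NumPy sketch below over a toy dependency graph. This only illustrates the mechanism whose heads and layers the paper varies; the graph, feature sizes, and weights are made up for the example.

```python
# One attention head of a graph attention (GAT) layer over a toy dependency
# graph. Attention logits follow e_ij = LeakyReLU(a^T [Wh_i || Wh_j]), masked
# so that a word only attends along dependency edges (plus self-loops).
import numpy as np

rng = np.random.default_rng(1)

n_words, d_in, d_out = 4, 6, 5
H = rng.normal(size=(n_words, d_in))           # word features (e.g., embeddings)
A = np.array([[1, 1, 0, 0],                    # adjacency of a toy dependency
              [1, 1, 1, 1],                    # tree, with self-loops on the
              [0, 1, 1, 0],                    # diagonal
              [0, 1, 0, 1]])

W = rng.normal(scale=0.3, size=(d_in, d_out))  # shared linear projection
a = rng.normal(scale=0.3, size=(2 * d_out,))   # attention vector

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

Wh = H @ W
e = leaky_relu((Wh @ a[:d_out])[:, None] + (Wh @ a[d_out:])[None, :])
e = np.where(A > 0, e, -1e9)                   # attend only along dependency edges
alpha = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)
H_out = alpha @ Wh                             # updated word representations
# A multi-head layer concatenates (or averages) several such heads; the paper
# varies the number of heads and the number of stacked layers.
```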

    Syntactic Knowledge via Graph Attention with BERT in Machine Translation

    Although the Transformer model can effectively acquire context features via a self-attention mechanism, deeper syntactic knowledge is still not effectively modeled. To alleviate this problem, we propose Syntactic knowledge via Graph attention with BERT (SGB) in Machine Translation (MT) scenarios. Graph Attention Network (GAT) and BERT jointly represent syntactic dependency features as explicit knowledge of the source language to enrich source-language representations and guide target-language generation. Our experiments use gold syntax-annotated sentences and a Quality Estimation (QE) model to interpret improvements in translation quality in terms of syntactic knowledge, without being limited to BLEU scores. Experiments show that the proposed SGB engines improve translation quality across the three MT tasks without sacrificing BLEU scores. We investigate which source-sentence lengths benefit the most and which dependencies are better identified by the SGB engines. We also find that GAT's learning of specific dependency relations is reflected in the translation quality of sentences containing such relations, and that syntax on the graph leads to new modeling of syntactic aspects of source sentences in the middle and bottom layers of BERT.
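
    The abstract states that GAT and BERT jointly represent syntactic dependencies to enrich source-language representations, but does not spell out the fusion. The sketch below uses a generic gated combination of the two streams purely as an illustration; H_bert, H_gat, and the gate weights are assumptions rather than the actual SGB design.

```python
# A generic gated fusion of BERT token states with graph-attention states over
# the source dependency tree. This stands in for "jointly represent"; the
# actual SGB architecture may combine the two streams differently.
import numpy as np

rng = np.random.default_rng(2)

seq_len, d_model = 7, 16
H_bert = rng.normal(size=(seq_len, d_model))   # contextual token representations
H_gat = rng.normal(size=(seq_len, d_model))    # syntax-aware representations from a GAT

W_g = rng.normal(scale=0.1, size=(2 * d_model, d_model))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Per-dimension mixing weights decide how much syntax to inject at each token.
gate = sigmoid(np.concatenate([H_bert, H_gat], axis=1) @ W_g)
H_src = gate * H_bert + (1.0 - gate) * H_gat   # syntax-enriched source representations
```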

    Population density equations for stochastic processes with memory kernels

    We present a method for solving population density equations (PDEs), a mean-field technique describing homogeneous populations of uncoupled neurons, where the populations can be subject to non-Markov noise for arbitrary distributions of jump sizes. The method combines recent developments in two different disciplines that traditionally have had limited interaction: computational neuroscience and the theory of random networks. The method uses a geometric binning scheme, based on the method of characteristics, to capture the deterministic neurodynamics of the population, cleanly separating the deterministic and stochastic processes. We can independently vary the choice of the deterministic model and the model for the stochastic process, leading to a highly modular numerical solution strategy. We demonstrate this by replacing the master equation implicit in many formulations of the PDE formalism with a generalization, the generalized Montroll-Weiss equation (a recent result from random network theory), which describes a random walker subject to transitions realized by a non-Markovian process. We demonstrate the method for leaky and quadratic integrate-and-fire neurons subject to spike trains with Poisson and gamma-distributed interspike intervals. We are able to accurately model the jump responses of both models to both excitatory and inhibitory input, under the assumption that all inputs are generated by one renewal process.
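
    A small sketch of the geometric binning idea for the leaky integrate-and-fire case: between synaptic jumps the membrane potential decays deterministically, so placing bin edges one time step apart along a characteristic turns the deterministic evolution into a simple shift of probability mass, leaving the (possibly non-Markov) jump process to be handled separately. The parameter values, bin count, and boundary handling below are illustrative assumptions, not the paper's numerical settings.

```python
# Geometric binning via the method of characteristics for a leaky
# integrate-and-fire neuron without drift: dV/dt = -V/tau between jumps, so the
# characteristic through V0 is V(t) = V0 * exp(-t/tau). Edges spaced one time
# step apart along a characteristic make the deterministic update a shift by
# one bin. All values are illustrative; reset/boundary handling is omitted.
import numpy as np

tau = 10e-3                        # membrane time constant (s), assumed
dt = 1e-4                          # time step between bin edges (s), assumed
V_th = 1.0                         # rescaled threshold
n_bins = 50

# Each edge decays onto the next one after exactly one time step.
edges = V_th * np.exp(-np.arange(n_bins + 1) * dt / tau)

# Deterministic part of the evolution: shift the probability-mass array by one
# bin per time step. The stochastic jump process (Poisson- or gamma-distributed
# interspike intervals) is treated separately, which is what makes the scheme
# modular.
mass = np.zeros(n_bins)
mass[0] = 1.0                      # all mass starts in the bin just below threshold
mass = np.roll(mass, 1)            # one step of pure decay (wrap-around ignored here)
```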