
893 research outputs found

An interactive web-interface for visualizing the inner workings of the question answering LSTM

Author: Loginova Ekaterina
Neumann Günter
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop

Author: Alishahi Afra
Chrupała Grzegorz
Linzen Tal
Publication date: 05/04/2019

The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.

arXiv.org e-Print Archive

Tilburg University Repository

Recommended from our members

Jointly Learning to Label Sentences and Tokens

Author: Rei Marek
Sogaard Anders
Publication venue: 'Organisation for Economic Co-Operation and Development (OECD)'
Publication date: 01/01/2019

Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size. Methods for directly supervising language composition can allow us to guide the models based on existing knowledge, regularizing them towards more robust and interpretable representations. In this paper, we investigate how objectives at different granularities can be used to learn better language representations and we propose an architecture for jointly learning to label sentences and tokens. The predictions at each level are combined using an attention mechanism, with token-level labels also acting as explicit supervision for composing sentence-level representations. Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling.

Apollo (Cambridge)

Jointly Learning to Label Sentences and Tokens

Author: Rei Marek
Sogaard Anders
Publication venue: THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE
Publication date: 31/10/2018

arXiv.org e-Print Archive

Copenhagen University Research Information System

Spiral - Imperial College Digital Repository

Apollo (Cambridge)

Multilingual Name Entity Recognition and Intent Classification Employing Deep Learning Architectures

Author: Chatzisavvas Konstantinos Ch.
Paflioti Antonia
Rizou Sofia
Sarigiannidis George
Theofilatos Angelos
Vakali Athena
Publication venue: 'Elsevier BV'
Publication date: 04/11/2022

Named Entity Recognition and Intent Classification are among the most important subfields of Natural Language Processing. Recent research has led to the development of faster, more sophisticated and efficient models to tackle the problems posed by those two tasks. In this work we explore the effectiveness of two separate families of Deep Learning networks for those tasks: Bidirectional Long Short-Term Memory networks and Transformer-based networks. The models were trained and tested on the ATIS benchmark dataset for both English and Greek. The purpose of this paper is to present a comparative study of the two groups of networks for both languages and showcase the results of our experiments. The models, being the current state-of-the-art, yielded impressive results and achieved high performance. Comment: 24 pages, 5 figures, 11 tables, dataset availabl

arXiv.org e-Print Archive

The emergence of number and syntax units in LSTM language models

Author: Baroni Marco
Dehaene Stanislas
Desbordes Theo
Hupkes Dieuwke
Kruszewski German
Lakretz Yair
Publication date: 01/01/2019

Recent work has shown that LSTMs trained on a generic language modeling objective capture syntax-sensitive generalizations such as long-distance number agreement. However, we have no mechanistic understanding of how they accomplish this remarkable feat. Some have conjectured it depends on heuristics that do not truly take hierarchical structure into account. We present here a detailed study of the inner mechanics of number tracking in LSTMs at the single neuron level. We discover that long-distance number information is largely managed by two 'number units'. Importantly, the behaviour of these units is partially controlled by other units independently shown to track syntactic structure. We conclude that LSTMs are, to some extent, implementing genuinely syntactic processing mechanisms, paving the way to a more general understanding of grammatical encoding in LSTMs. Comment: To appear in Proceedings of NAACL, Minneapolis, MN, 201

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE