    Adaptation and contextualization of deep neural network models

    The ability of Deep Neural Networks (DNNs) to provide very high accuracy in classification and recognition problems makes them the major tool for developments in such problems. It is, however, known that DNNs are currently used in a 'black box' manner, lacking transparency and interpretability in their decision-making process. Moreover, DNNs should use prior information on data classes, or object categories, so as to classify new data, or objects, efficiently without forgetting their previous knowledge. In this paper, we propose a novel class of systems that are able to adapt and contextualize the structure of trained DNNs, providing ways to handle the above-mentioned problems. A hierarchical and distributed system memory is generated and used for this purpose. The main memory is composed of the trained DNN architecture for classification/prediction, i.e., its structure and weights, as well as of an extracted, equivalent Clustered Representation Set (CRS) generated by the DNN during training at its final hidden layer (the one before the output). The latter includes centroids, 'points of attraction', which link the extracted representation to a specific area in the existing system memory. Drift detection, occurring, for example, in personalized data analysis, can be accomplished by comparing the distances of new data from the centroids, taking the intra-cluster distances into account. Moreover, using the generated CRS, the system is able to contextualize its decision-making process when new data become available. A new public medical database on Parkinson's disease is used as a testbed to illustrate the capabilities of the proposed architecture.
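    A minimal sketch of the centroid-based drift check described above, assuming Euclidean distance, one centroid per class, and a hypothetical `tolerance` multiplier on the mean intra-cluster distance (none of these choices are specified by the abstract, so treat them as illustrative assumptions):

```python
import numpy as np

def build_crs(features, labels):
    """Build a minimal Clustered Representation Set from final-hidden-layer
    features: one centroid per class plus that cluster's mean intra-cluster
    distance. (Hypothetical helper; the paper's CRS may be richer.)"""
    crs = {}
    for c in np.unique(labels):
        cluster = features[labels == c]
        centroid = cluster.mean(axis=0)
        intra = np.linalg.norm(cluster - centroid, axis=1).mean()
        crs[c] = (centroid, intra)
    return crs

def detect_drift(crs, new_features, tolerance=2.0):
    """Flag each new sample whose distance to its nearest centroid exceeds
    `tolerance` times that cluster's mean intra-cluster distance."""
    flags = []
    for x in new_features:
        dists = {c: np.linalg.norm(x - centroid)
                 for c, (centroid, _) in crs.items()}
        nearest = min(dists, key=dists.get)
        _, intra = crs[nearest]
        flags.append(dists[nearest] > tolerance * intra)
    return np.array(flags)
```

    Here `features` would be the penultimate-layer activations of the trained DNN on its training data, and `new_features` the activations of incoming (e.g., personalized) data.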

    Contextualized Word Representations for Reading Comprehension

    Reading a document and extracting an answer to a question about its content has attracted substantial attention recently. While most work has focused on the interaction between the question and the document, in this work we evaluate the importance of context when the question and document are processed independently. We take a standard neural architecture for this task and show that by providing rich contextualized word representations from a large pre-trained language model, as well as allowing the model to choose between context-dependent and context-independent word representations, we can obtain dramatic improvements and reach performance comparable to state of the art on the competitive SQuAD dataset.
    Comment: 6 pages, 1 figure, NAACL 2018
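    A minimal sketch of letting a model choose between context-dependent and context-independent word representations, assuming a learned sigmoid gate over the concatenation of two equal-dimension vectors (an illustrative mechanism, not necessarily the paper's exact architecture):

```python
import torch
import torch.nn as nn

class GatedWordRepresentation(nn.Module):
    """Mix a static (context-independent) embedding with a contextualized
    one via a learned per-dimension gate, so the model can lean on either."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, static_emb, contextual_emb):
        # g in (0, 1): values near 1 favor the contextualized vector,
        # values near 0 favor the static one.
        g = torch.sigmoid(self.gate(torch.cat([static_emb, contextual_emb], dim=-1)))
        return g * contextual_emb + (1 - g) * static_emb
```

    In practice the contextualized vector would come from a pre-trained language model (projected to the same dimension as the static embedding) and the mixed output would feed the downstream reading-comprehension architecture.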

    Introduction to the special issue on deep learning approaches for machine translation

    Deep learning is revolutionizing speech and natural language technologies, since it offers an effective way to train systems and obtain significant improvements. The main advantage of deep learning is that, given the right architecture, the system automatically learns features from data without the need to design them explicitly. This machine learning perspective is conceptually changing how speech and natural language technologies are addressed. In the case of Machine Translation (MT), deep learning was first introduced into standard statistical systems. By now, end-to-end neural MT systems have reached competitive results. This introductory paper for the special issue addresses how deep learning has been gradually introduced into MT. The introduction covers all topics contained in the papers included in this special issue, which are: integration of deep learning into statistical MT; development of end-to-end neural MT systems; and introduction of deep learning into interactive MT and MT evaluation. Finally, this introduction sketches some research directions that MT is taking, guided by deep learning.
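    For concreteness, a minimal end-to-end encoder-decoder in the spirit of early neural MT, assuming a GRU encoder and decoder with teacher forcing; this is a sketch only, as the systems surveyed in the special issue add attention, beam search, and subword vocabularies:

```python
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Minimal end-to-end neural MT model: encode the source sentence with
    a GRU, then decode target tokens conditioned on its final state."""
    def __init__(self, src_vocab, tgt_vocab, dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.src_emb(src_ids))       # h: (1, batch, dim)
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)
        return self.out(dec_out)                         # per-token logits
```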