7,690 research outputs found
What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models
Despite the remarkable evolution of deep neural networks in natural language
processing (NLP), their interpretability remains a challenge. Previous work
largely focused on what these models learn at the representation level. We
break this analysis down further and study individual dimensions (neurons) in
the vector representation learned by end-to-end neural models in NLP tasks. We
propose two methods: Linguistic Correlation Analysis, based on a supervised
method to extract the most relevant neurons with respect to an extrinsic task,
and Cross-model Correlation Analysis, an unsupervised method to extract salient
neurons w.r.t. the model itself. We evaluate the effectiveness of our
techniques by ablating the identified neurons and reevaluating the network's
performance for two tasks: neural machine translation (NMT) and neural language
modeling (NLM). We further present a comprehensive analysis of neurons with the
aim to address the following questions: i) how localized or distributed are
different linguistic properties in the models? ii) are certain neurons
exclusive to some properties and not others? iii) is the information more or
less distributed in NMT vs. NLM? and iv) how important are the neurons
identified through the linguistic correlation method to the overall task? Our
code is publicly available as part of the NeuroX toolkit (Dalvi et al. 2019).Comment: AAA 2019, pages 10, AAAI Conference on Artificial Intelligence (AAAI
2019
Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English
The necessity of using a fixed-size word vocabulary in order to control the
model complexity in state-of-the-art neural machine translation (NMT) systems
is an important bottleneck on performance, especially for morphologically rich
languages. Conventional methods that aim to overcome this problem by using
sub-word or character-level representations solely rely on statistics and
disregard the linguistic properties of words, which leads to interruptions in
the word structure and causes semantic and syntactic losses. In this paper, we
propose a new vocabulary reduction method for NMT, which can reduce the
vocabulary of a given input corpus at any rate while also considering the
morphological properties of the language. Our method is based on unsupervised
morphology learning and can be, in principle, used for pre-processing any
language pair. We also present an alternative word segmentation method based on
supervised morphological analysis, which aids us in measuring the accuracy of
our model. We evaluate our method in Turkish-to-English NMT task where the
input language is morphologically rich and agglutinative. We analyze different
representation methods in terms of translation accuracy as well as the semantic
and syntactic properties of the generated output. Our method obtains a
significant improvement of 2.3 BLEU points over the conventional vocabulary
reduction technique, showing that it can provide better accuracy in open
vocabulary translation of morphologically rich languages.Comment: The 20th Annual Conference of the European Association for Machine
Translation (EAMT), Research Paper, 12 page
PathologyGAN: Learning deep representations of cancer tissue
We apply Generative Adversarial Networks (GANs) to the domain of digital
pathology. Current machine learning research for digital pathology focuses on
diagnosis, but we suggest a different approach and advocate that generative
models could drive forward the understanding of morphological characteristics
of cancer tissue. In this paper, we develop a framework which allows GANs to
capture key tissue features and uses these characteristics to give structure to
its latent space. To this end, we trained our model on 249K H&E breast cancer
tissue images, extracted from 576 TMA images of patients from the Netherlands
Cancer Institute (NKI) and Vancouver General Hospital (VGH) cohorts. We show
that our model generates high quality images, with a Frechet Inception Distance
(FID) of 16.65. We further assess the quality of the images with cancer tissue
characteristics (e.g. count of cancer, lymphocytes, or stromal cells), using
quantitative information to calculate the FID and showing consistent
performance of 9.86. Additionally, the latent space of our model shows an
interpretable structure and allows semantic vector operations that translate
into tissue feature transformations. Furthermore, ratings from two expert
pathologists found no significant difference between our generated tissue
images from real ones. The code, generated images, and pretrained model are
available at https://github.com/AdalbertoCq/Pathology-GANComment: MIDL 2020 final versio
One-Shot Neural Cross-Lingual Transfer for Paradigm Completion
We present a novel cross-lingual transfer method for paradigm completion, the
task of mapping a lemma to its inflected forms, using a neural encoder-decoder
model, the state of the art for the monolingual task. We use labeled data from
a high-resource language to increase performance on a low-resource language. In
experiments on 21 language pairs from four different language families, we
obtain up to 58% higher accuracy than without transfer and show that even
zero-shot and one-shot learning are possible. We further find that the degree
of language relatedness strongly influences the ability to transfer
morphological knowledge.Comment: Accepted at ACL 201
A Learning Framework for Morphological Operators using Counter-Harmonic Mean
We present a novel framework for learning morphological operators using
counter-harmonic mean. It combines concepts from morphology and convolutional
neural networks. A thorough experimental validation analyzes basic
morphological operators dilation and erosion, opening and closing, as well as
the much more complex top-hat transform, for which we report a real-world
application from the steel industry. Using online learning and stochastic
gradient descent, our system learns both the structuring element and the
composition of operators. It scales well to large datasets and online settings.Comment: Submitted to ISMM'1
Deep learning in computational microscopy
We propose to use deep convolutional neural networks (DCNNs) to perform 2D and 3D computational imaging. Specifically, we investigate three different applications. We first try to solve the 3D inverse scattering problem based on learning a huge number of training target and speckle pairs. We also demonstrate a new DCNN architecture to perform Fourier ptychographic Microscopy (FPM) reconstruction, which achieves high-resolution phase recovery with considerably less data than standard FPM. Finally, we employ DCNN models that can predict focused 2D fluorescent microscopic images from blurred images captured at overfocused or underfocused planes.Published versio
- …