74,921 research outputs found
Forgetting Exceptions is Harmful in Language Learning
We show that in language learning, contrary to received wisdom, keeping
exceptional training instances in memory can be beneficial for generalization
accuracy. We investigate this phenomenon empirically on a selection of
benchmark natural language processing tasks: grapheme-to-phoneme conversion,
part-of-speech tagging, prepositional-phrase attachment, and base noun phrase
chunking. In a first series of experiments we combine memory-based learning
with training set editing techniques, in which instances are edited based on
their typicality and class prediction strength. Results show that editing
exceptional instances (with low typicality or low class prediction strength)
tends to harm generalization accuracy. In a second series of experiments we
compare memory-based learning and decision-tree learning methods on the same
selection of tasks, and find that decision-tree learning often performs worse
than memory-based learning. Moreover, the decrease in performance can be linked
to the degree of abstraction from exceptions (i.e., pruning or eagerness). We
provide explanations for both results in terms of the properties of the natural
language processing tasks and the learning algorithms.Comment: 31 pages, 7 figures, 10 tables. uses 11pt, fullname, a4wide tex
styles. Pre-print version of article to appear in Machine Learning 11:1-3,
Special Issue on Natural Language Learning. Figures on page 22 slightly
compressed to avoid page overloa
Do not forget: Full memory in memory-based learning of word pronunciation
Memory-based learning, keeping full memory of learning material, appears a
viable approach to learning NLP tasks, and is often superior in generalisation
accuracy to eager learning approaches that abstract from learning material.
Here we investigate three partial memory-based learning approaches which remove
from memory specific task instance types estimated to be exceptional. The three
approaches each implement one heuristic function for estimating exceptionality
of instance types: (i) typicality, (ii) class prediction strength, and (iii)
friendly-neighbourhood size. Experiments are performed with the memory-based
learning algorithm IB1-IG trained on English word pronunciation. We find that
removing instance types with low prediction strength (ii) is the only tested
method which does not seriously harm generalisation accuracy. We conclude that
keeping full memory of types rather than tokens, and excluding minority
ambiguities appear to be the only performance-preserving optimisations of
memory-based learning.Comment: uses conll98, epsf, and ipamacs (WSU IPA
Don't Blame Distributional Semantics if it can't do Entailment
Distributional semantics has had enormous empirical success in Computational
Linguistics and Cognitive Science in modeling various semantic phenomena, such
as semantic similarity, and distributional models are widely used in
state-of-the-art Natural Language Processing systems. However, the theoretical
status of distributional semantics within a broader theory of language and
cognition is still unclear: What does distributional semantics model? Can it
be, on its own, a fully adequate model of the meanings of linguistic
expressions? The standard answer is that distributional semantics is not fully
adequate in this regard, because it falls short on some of the central aspects
of formal semantic approaches: truth conditions, entailment, reference, and
certain aspects of compositionality. We argue that this standard answer rests
on a misconception: These aspects do not belong in a theory of expression
meaning, they are instead aspects of speaker meaning, i.e., communicative
intentions in a particular context. In a slogan: words do not refer, speakers
do. Clearing this up enables us to argue that distributional semantics on its
own is an adequate model of expression meaning. Our proposal sheds light on the
role of distributional semantics in a broader theory of language and cognition,
its relationship to formal semantics, and its place in computational models.Comment: To appear in Proceedings of the 13th International Conference on
Computational Semantics (IWCS 2019), Gothenburg, Swede
On the Feasibility of Transfer-learning Code Smells using Deep Learning
Context: A substantial amount of work has been done to detect smells in
source code using metrics-based and heuristics-based methods. Machine learning
methods have been recently applied to detect source code smells; however, the
current practices are considered far from mature. Objective: First, explore the
feasibility of applying deep learning models to detect smells without extensive
feature engineering, just by feeding the source code in tokenized form. Second,
investigate the possibility of applying transfer-learning in the context of
deep learning models for smell detection. Method: We use existing metric-based
state-of-the-art methods for detecting three implementation smells and one
design smell in C# code. Using these results as the annotated gold standard, we
train smell detection models on three different deep learning architectures.
These architectures use Convolution Neural Networks (CNNs) of one or two
dimensions, or Recurrent Neural Networks (RNNs) as their principal hidden
layers. For the first objective of our study, we perform training and
evaluation on C# samples, whereas for the second objective, we train the models
from C# code and evaluate the models over Java code samples. We perform the
experiments with various combinations of hyper-parameters for each model.
Results: We find it feasible to detect smells using deep learning methods. Our
comparative experiments find that there is no clearly superior method between
CNN-1D and CNN-2D. We also observe that performance of the deep learning models
is smell-specific. Our transfer-learning experiments show that
transfer-learning is definitely feasible for implementation smells with
performance comparable to that of direct-learning. This work opens up a new
paradigm to detect code smells by transfer-learning especially for the
programming languages where the comprehensive code smell detection tools are
not available
- β¦