Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category
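As a concrete illustration of the second family of approaches (testing whether interpretable knowledge can be decoded from intermediate representations), a minimal probing-classifier sketch might look as follows; the hidden states, labels, and dimensions are random placeholders, not drawn from any model discussed at the workshop.

```python
# Minimal sketch of a "probing" (diagnostic) classifier: a simple model is
# trained to predict a linguistic property from a network's hidden states.
# High probe accuracy suggests the property is decodable from that layer.
# The hidden states and labels below are random placeholders, not real data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 300))  # e.g. one 300-d vector per token
pos_tags = rng.integers(0, 5, size=1000)      # e.g. 5 coarse part-of-speech labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, pos_tags, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))  # ~chance here, by construction
```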
Learning Semantic Correspondences in Technical Documentation
We consider the problem of translating high-level textual descriptions to
formal representations in technical documentation as part of an effort to model
the meaning of such documentation. We focus specifically on the problem of
learning translational correspondences between text descriptions and grounded
representations in the target documentation, such as formal representation of
functions or code templates. Our approach exploits the parallel nature of such
documentation, i.e., the tight coupling between high-level text and the low-level
representations we aim to learn. Data is collected by mining technical
documents for such parallel text-representation pairs, which we use to train a
simple semantic parsing model. We report new baseline results on sixteen novel
datasets, including the standard library documentation for nine popular
programming languages across seven natural languages, and a small collection of
Unix utility manuals.
Comment: accepted to ACL-201
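To make the idea of mining parallel text-representation pairs concrete, here is a hedged sketch that pairs function signatures (a formal representation) with the first line of their docstrings (the high-level description) from one Python standard-library module; the paper's actual pipeline, datasets, and semantic parsing model are not reproduced here.

```python
# Hedged sketch of mining parallel (description, signature) pairs from
# standard-library documentation, in the spirit of the approach described
# above; a real pipeline would cover many languages and train a parser.
import inspect
import os.path

pairs = []
for name, obj in vars(os.path).items():
    if inspect.isfunction(obj) and obj.__doc__:
        try:
            signature = f"{name}{inspect.signature(obj)}"  # formal representation
        except ValueError:
            continue
        description = obj.__doc__.strip().splitlines()[0]  # high-level text
        pairs.append((description, signature))

for description, signature in pairs[:3]:
    print(signature, "<->", description)
```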
Biases in human behavior
The paper shows that biases in individuals' decision-making may result from the process of mental editing by which subjects produce a "representation" of the decision problem. During this process, individuals make systematic use of default classifications in order to reduce the short-term memory load and the complexity of symbolic manipulation. The result is the construction of an imperfect mental representation of the problem that nevertheless has the advantage of being simple and of yielding "satisficing" decisions. The imperfection originates in a trade-off between the simplicity of the representation of a strategy and its efficiency. To obtain simplicity, the strategy's rules have to be memorized and represented with some degree of abstraction, which allows their number to be drastically reduced. Raising the level of abstraction with which a strategy's rule is represented means extending the domain of validity of the rule beyond the field in which the rule has been tested, and may therefore unintentionally include domains in which the rule is inefficient. The emergence of errors in the mental representation of a problem may thus be the "natural" effect of the categorization and identification of the building blocks of a strategy. The biases may be persistent and give rise to lock-in effects, in which individuals remain trapped in sub-optimal strategies, as is shown by experimental results on the stability of sub-optimal strategies in games like Target The Two. To understand why sub-optimal strategies that embody errors are locally stable, i.e. cannot be improved by small changes in the rules, we consider Kauffman's NK model because, among other properties, it shows that if there are interdependencies among the rules of a system, then the system admits many sub-optimal solutions that are locally stable, i.e. cannot be improved by simple mutations. But the fitness function in the NK model is a random one, while in our context it is more reasonable to define the fitness of a strategy as the efficiency of the program. If we introduce this kind of fitness, the stability properties of the NK model no longer hold: the paper shows that while the elementary statements of a strategy are interdependent, it is possible to achieve an optimal configuration of the strategy via mutations, and in consequence the sub-optimal solutions are not locally stable under mutations. The paper therefore provides a different explanation of the existence and stability of sub-optimal strategies, based on the difficulty of redefining the sub-problems that constitute the building blocks of the problem's representation
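As a toy illustration of the NK-model behaviour the paper argues against, the sketch below hill-climbs a random-fitness NK landscape by single-bit mutations and typically stalls at a locally stable, sub-optimal configuration; all parameters are arbitrary, and this is not the paper's own efficiency-based fitness.

```python
# Toy illustration of Kauffman's NK model: with interdependencies (K > 0) and
# a random fitness function, single-bit "mutations" of a strategy often stall
# at a locally stable, sub-optimal configuration. Parameters are arbitrary.
import itertools, random

N, K = 8, 3
random.seed(1)
# Random fitness contribution of each locus, given itself and K neighbours.
tables = [{bits: random.random() for bits in itertools.product((0, 1), repeat=K + 1)}
          for _ in range(N)]

def fitness(s):
    return sum(tables[i][tuple(s[(i + j) % N] for j in range(K + 1))]
               for i in range(N)) / N

s = [random.randint(0, 1) for _ in range(N)]
improved = True
while improved:                      # hill climbing by single-bit mutations
    improved = False
    for i in range(N):
        t = s.copy(); t[i] ^= 1
        if fitness(t) > fitness(s):
            s, improved = t, True

best = max(fitness(list(bits)) for bits in itertools.product((0, 1), repeat=N))
print(f"local optimum: {fitness(s):.3f}  global optimum: {best:.3f}")
```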
Building Program Vector Representations for Deep Learning
Deep learning has made significant breakthroughs in various fields of
artificial intelligence. Advantages of deep learning include the ability to
capture highly complicated features and the limited need for human feature
engineering. However, it is still virtually impossible to use deep learning to
analyze programs, since deep architectures cannot be trained effectively with
pure backpropagation. In this pioneering paper, we propose the "coding
criterion" for building program vector representations, which are a
prerequisite for deep learning on programs. Our representation learning
approach directly makes deep learning a reality in this new field. We evaluate
the learned vector representations both qualitatively and quantitatively, and
conclude from the experiments that the coding criterion is successful in
building program representations. To evaluate whether deep learning is
beneficial for program analysis, we feed the representations to deep neural
networks, and achieve higher accuracy in the program classification task than
"shallow" methods such as logistic regression and support vector machines.
This result confirms the feasibility of applying deep learning to program
analysis and gives preliminary evidence of its success in this new field. We
believe deep learning will become an outstanding technique for program
analysis in the near future.
Comment: This paper was submitted to ICSE'1
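A simplified sketch of what a coding criterion for program representations could look like, assuming (hypothetically) that each AST node type has a vector and that a parent's vector should be reconstructable from its children's; the node types, shapes, and composition function are illustrative, and the paper's exact formulation may differ.

```python
# Simplified sketch of a "coding criterion" for program vector representations:
# each AST node type gets a vector, and a parent's vector should be predictable
# from its children's vectors; the distance below would be minimized during
# training. Node types, shapes, and the exact formulation are illustrative.
import numpy as np

rng = np.random.default_rng(0)
dim = 30
node_types = ["If", "Compare", "Assign", "Call", "Name", "Constant"]
embed = {t: rng.normal(scale=0.1, size=dim) for t in node_types}
W = rng.normal(scale=0.1, size=(dim, dim))  # shared composition weights
b = np.zeros(dim)

def coding_loss(parent, children):
    """Reconstruction error: how well the children 'explain' the parent."""
    predicted = np.tanh(W @ np.mean([embed[c] for c in children], axis=0) + b)
    return float(np.sum((embed[parent] - predicted) ** 2))

# e.g. an `if x == 0: y = 1` fragment: If -> (Compare, Assign)
print("loss before training:", coding_loss("If", ["Compare", "Assign"]))
```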
Calculating and understanding the value of any type of match evidence when there are potential testing errors
It is well known that Bayes' theorem (with likelihood ratios) can be used to calculate the impact of evidence, such as a "match" of some feature of a person. Typically the feature of interest is the DNA profile, but the method applies in principle to any feature of a person or object, including not just DNA, fingerprints, or footprints, but also more basic features such as skin colour, height, hair colour or even name. Notwithstanding concerns about the extensiveness of databases of such features, a serious challenge to the use of Bayes in such legal contexts is that its standard formulaic representations are not readily understandable to non-statisticians. Attempts to get round this problem usually involve representations based around some variation of an event tree. While this approach works well in explaining the most trivial instance of Bayes' theorem (involving a single hypothesis and a single piece of evidence), it does not scale up to realistic situations. In particular, even with a single piece of match evidence, if we wish to incorporate the possibility that there are potential errors (both false positives and false negatives) introduced at any stage in the investigative process, matters become very complex. As a result we have observed expert witnesses (in different areas of speciality) routinely ignore the possibility of errors when presenting their evidence. To counter this, we produce what we believe is the first full probabilistic solution of the simple case of generic match evidence incorporating both classes of testing errors. Unfortunately, the resultant event tree solution is too complex for intuitive comprehension. And, crucially, the event tree also fails to represent the causal information that underpins the argument. In contrast, we also present a simple-to-construct graphical Bayesian Network (BN) solution that automatically performs the calculations and may also be intuitively simpler to understand. Although there have been multiple previous applications of BNs for analysing forensic evidence, including very detailed models for the DNA matching problem, these models have not widely penetrated the expert witness community. Nor have they addressed the basic generic match problem incorporating the two types of testing error. Hence we believe our basic BN solution provides an important mechanism for convincing experts, and eventually the legal community, that it is possible to rigorously analyse and communicate the full impact of match evidence on a case, in the presence of possible error
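A worked numeric sketch of the calculation the abstract describes, using Bayes' theorem with illustrative (not case-specific) prior, random-match, and error probabilities:

```python
# Worked numeric sketch of the generic match-evidence calculation with testing
# errors, using Bayes' theorem with illustrative (not case-specific) numbers.
prior = 1 / 10_000        # P(suspect is the source) before the evidence
rmp = 1 / 1_000_000       # random match probability (innocent true match)
fpr = 0.001               # P(test reports "match" | profiles truly differ)
fnr = 0.01                # P(test reports "no match" | profiles truly match)

# Likelihood of a reported match under each hypothesis:
p_match_if_source = 1 - fnr
p_match_if_not = rmp * (1 - fnr) + (1 - rmp) * fpr

likelihood_ratio = p_match_if_source / p_match_if_not
posterior = (prior * p_match_if_source) / (
    prior * p_match_if_source + (1 - prior) * p_match_if_not)

print(f"LR = {likelihood_ratio:,.0f}, posterior = {posterior:.3f}")
# With fpr = 0 the LR would be about 1e6; a 1-in-1000 false positive rate
# collapses it by orders of magnitude, which is the point the abstract makes.
```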
Automatic case acquisition from texts for process-oriented case-based reasoning
This paper introduces a method for the automatic acquisition of a rich case
representation from free text for process-oriented case-based reasoning. Case
engineering is among the most complicated and costly tasks in implementing a
case-based reasoning system. This is especially so for process-oriented
case-based reasoning, where more expressive case representations are generally
used and, in our opinion, actually required for satisfactory case adaptation.
In this context, the ability to acquire cases automatically from procedural
texts is a major step forward in order to reason on processes. We therefore
detail a methodology that makes case acquisition from processes described as
free text possible, with special attention given to assembly instruction texts.
This methodology extends the techniques we used to extract actions from cooking
recipes. We argue that techniques taken from natural language processing are
required for this task, and that they give satisfactory results. An evaluation
based on our implemented prototype extracting workflows from recipe texts is
provided.
Comment: In press, publication expected in 201
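As a hedged illustration of the acquisition task (not the authors' method), the sketch below turns each imperative sentence of a recipe into an (action, arguments) workflow step with a naive heuristic; a real system would rely on proper natural language processing techniques as the abstract argues.

```python
# Hedged sketch of turning procedural text into a simple workflow: each
# imperative sentence becomes an (action, arguments) step. This heuristic
# version (first word = verb) is only illustrative.
import re

recipe = ("Preheat the oven to 180C. Mix the flour and the sugar. "
          "Pour the batter into a pan. Bake for 30 minutes.")

workflow = []
for sentence in re.split(r"\.\s*", recipe):
    if not sentence:
        continue
    action, _, arguments = sentence.partition(" ")  # imperative verb first
    workflow.append((action.lower(), arguments))

for step, (action, arguments) in enumerate(workflow, 1):
    print(f"step {step}: {action} <- {arguments}")
```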
- …