An attentive neural architecture for joint segmentation and parsing and its application to real estate ads
In processing human-produced text using natural language processing (NLP)
techniques, two fundamental subtasks that arise are (i) segmentation of the
plain text into meaningful subunits (e.g., entities), and (ii) dependency
parsing, to establish relations between subunits. In this paper, we develop a
relatively simple and effective neural joint model that performs both
segmentation and dependency parsing together, instead of one after the other as
in most state-of-the-art works. We will focus in particular on the real estate
ad setting, aiming to convert an ad to a structured description, which we name
property tree, comprising the tasks of (1) identifying important entities of a
property (e.g., rooms) from classifieds and (2) structuring them into a tree
format. In this work, we propose a new joint model that is able to tackle the
two tasks simultaneously and construct the property tree by (i) avoiding the
error propagation that would arise from performing the subtasks one after the
other in a pipelined fashion, and (ii) exploiting the interactions between the
subtasks.
For this purpose, we perform an extensive comparative study of the pipeline
methods and the newly proposed joint model, reporting an improvement of over
three percentage points in the overall edge F1 score of the property tree.
Also, we propose attention methods to encourage our model to focus on salient
tokens during the construction of the property tree. Thus we experimentally
demonstrate the usefulness of attentive neural architectures for the proposed
joint model, showcasing a further improvement of two percentage points in edge
F1 score for our application.
Comment: Preprint - Accepted for publication in Expert Systems with Applications
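To make the joint setup concrete, here is a minimal sketch of one plausible architecture: a shared BiLSTM encoder feeding two heads, a BIO tagger for entity segmentation and a bilinear scorer for tree edges. All module names and hyperparameters below are illustrative assumptions, not the authors' actual model.

```python
# Hypothetical sketch of a joint segmentation + dependency-parsing model:
# a shared encoder with two heads, so the subtasks interact through
# shared representations instead of running as a pipeline.
import torch
import torch.nn as nn

class JointSegmentParse(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_bio_tags=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        # Segmentation head: per-token BIO tags marking entity spans.
        self.seg_head = nn.Linear(2 * hidden, n_bio_tags)
        # Parsing head: bilinear score for every (head, dependent) pair.
        self.arc_bilinear = nn.Bilinear(2 * hidden, 2 * hidden, 1)

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))   # (B, T, 2H)
        seg_logits = self.seg_head(h)                # (B, T, n_bio_tags)
        B, T, D = h.shape
        # Score all head-dependent pairs; training would sum the losses
        # of both heads so the subtasks regularize each other.
        heads = h.unsqueeze(2).expand(B, T, T, D).reshape(-1, D)
        deps = h.unsqueeze(1).expand(B, T, T, D).reshape(-1, D)
        arc_scores = self.arc_bilinear(heads, deps).view(B, T, T)
        return seg_logits, arc_scores

model = JointSegmentParse(vocab_size=1000)
seg, arcs = model(torch.randint(0, 1000, (2, 12)))
print(seg.shape, arcs.shape)  # (2, 12, 3) and (2, 12, 12)
```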
A Syntactic Neural Model for General-Purpose Code Generation
We consider the problem of parsing natural language descriptions into source
code written in a general-purpose programming language like Python. Existing
data-driven methods treat this problem as a language generation task without
considering the underlying syntax of the target programming language. Informed
by previous work in semantic parsing, in this paper we propose a novel neural
architecture powered by a grammar model to explicitly capture the target syntax
as prior knowledge. Experiments show this to be an effective way to scale up to
generation of complex programs from natural language descriptions, achieving
state-of-the-art results that well outperform previous code generation and
semantic parsing approaches.
Comment: To appear in ACL 2017
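As an illustration of syntax-guided generation, the sketch below always expands the left-most open nonterminal of a toy grammar with one of its production rules, so every output is well-formed by construction. The tiny grammar and the random rule choice are stand-ins for the paper's learned model, which would score candidate productions neurally.

```python
# Minimal sketch of grammar-constrained generation: outputs are built
# by applying production rules, never by emitting free-form tokens.
import random

GRAMMAR = {
    "expr": [["expr", "+", "term"], ["term"]],
    "term": [["NAME"], ["NUMBER"]],
}

def generate(symbol="expr", depth=0, max_depth=3):
    if symbol not in GRAMMAR:                 # terminal token
        return [symbol]
    # A neural model would score the candidate productions here;
    # we just sample, forcing the shortest rule at the depth limit.
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        rules = [min(rules, key=len)]
    out = []
    for sym in random.choice(rules):
        out.extend(generate(sym, depth + 1, max_depth))
    return out

print(" ".join(generate()))  # e.g. "NAME + NUMBER"
```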
Parsing with CYK over Distributed Representations
Syntactic parsing is a key task in natural language processing. This task has
been dominated by symbolic, grammar-based parsers. Neural networks, with their
distributed representations, are challenging these methods. In this article we
show that existing symbolic parsing algorithms can cross the border and be
entirely formulated over distributed representations. To this end we introduce
a version of the traditional Cocke-Younger-Kasami (CYK) algorithm, called
D-CYK, which is entirely defined over distributed representations. Our D-CYK
uses matrix multiplication on real number matrices of size independent of the
length of the input string. These operations are compatible with traditional
neural networks. Experiments show that our D-CYK approximates the original CYK
algorithm. By showing that CYK can be entirely performed on distributed
representations, we open the way to the definition of recurrent layers of
CYK-informed neural networks.
Comment: The algorithm has been greatly improved. Experiments have been redesigned.
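For reference, the classical CYK recognizer that D-CYK reformulates looks as follows: each chart cell holds the set of nonterminals deriving a span, whereas D-CYK would encode cells as real-valued matrices updated by matrix multiplication. The CNF grammar and example sentence are illustrative.

```python
# Classical CYK recognition over a grammar in Chomsky normal form.
CNF_RULES = {                        # binary rules A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cyk(words):
    n = len(words)
    # chart[i][j] = nonterminals deriving the span words[i:j+1]
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        chart[i][i] = set(LEXICON.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                # split point
                for B in chart[i][k]:
                    for C in chart[k + 1][j]:
                        chart[i][j] |= CNF_RULES.get((B, C), set())
    return "S" in chart[0][n - 1]

print(cyk("the dog saw the cat".split()))  # True
```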