Polyglot Semantic Parsing in APIs
Traditional approaches to semantic parsing (SP) work by training individual
models for each available parallel dataset of text-meaning pairs. In this
paper, we explore the idea of polyglot semantic translation, or learning
semantic parsing models that are trained on multiple datasets and natural
languages. In particular, we focus on translating text to code signature
representations using the software component datasets of Richardson and Kuhn
(2017a,b). The advantage of such models is that they can be used for parsing a
wide variety of input natural languages and output programming languages, or
mixed input languages, using a single unified model. To facilitate modeling of
this type, we develop a novel graph-based decoding framework that achieves
state-of-the-art performance on the above datasets, and apply this method to
two other benchmark SP tasks. Comment: accepted for NAACL-2018 (camera ready version)
Lexicrunch : an expert system for word morphology
Natural language programs typically store words like pig and
pigs as independent entries in their dictionaries, thus neglecting
the obvious morphological relationship between them. Lexicrunch
tries to induce such relationships from examples of root forms of
words and the corresponding inflected forms.
The program collates the examples into classes according to
the difference between the inflected form and its root -- e.g. the
classes for the plural noun inflection in English might include
"root forms to which an -s is added" (pig, apple, etc.) and "root
forms which take -es" (fox, box, etc.). It then characterizes
each class using a modified version of Quinlan's ID3 procedure.
The resulting rule will be along the lines of, "If a noun
ends in -x, form its plural by adding -es; otherwise, add -s."
The program then needs to store only root forms in its dictionary;
it can reconstruct plurals on demand by applying its rule. It
thereby eliminates redundancy and compacts the lexicon.
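The collate-then-generalize procedure described above can be illustrated with a minimal Python sketch. It is not Lexicrunch's implementation: it assumes pure suffixation, and it replaces the modified ID3 procedure with a simple majority vote on the final letter of each root. All function names (`suffix_difference`, `collate`, `induce_rule`, `inflect`) are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict


def suffix_difference(root, inflected):
    """Return the suffix that turns root into inflected (suffixation only)."""
    if not inflected.startswith(root):
        raise ValueError(f"{inflected!r} is not {root!r} plus a suffix")
    return inflected[len(root):]


def collate(examples):
    """Group root forms into classes keyed by the suffix they take."""
    classes = defaultdict(list)
    for root, inflected in examples:
        classes[suffix_difference(root, inflected)].append(root)
    return dict(classes)


def induce_rule(classes):
    """Stand-in for ID3: pick a default suffix, plus per-final-letter exceptions.

    This only inspects the last letter of each root, which is an assumption
    made for brevity, not a feature of the original system.
    """
    by_final = defaultdict(Counter)
    overall = Counter()
    for suffix, roots in classes.items():
        for root in roots:
            by_final[root[-1]][suffix] += 1
            overall[suffix] += 1
    default = overall.most_common(1)[0][0]
    exceptions = {}
    for letter, counts in by_final.items():
        best = counts.most_common(1)[0][0]
        if best != default:
            exceptions[letter] = best
    return default, exceptions


def inflect(root, rule):
    """Reconstruct an inflected form on demand from a stored root."""
    default, exceptions = rule
    return root + exceptions.get(root[-1], default)


# Toy training data using the plural examples from the abstract.
EXAMPLES = [
    ("pig", "pigs"),
    ("apple", "apples"),
    ("dog", "dogs"),
    ("fox", "foxes"),
    ("box", "boxes"),
]
CLASSES = collate(EXAMPLES)
RULE = induce_rule(CLASSES)
```

With this toy data the induced rule reads "add -es after a final x; otherwise add -s", so only root forms need to be stored and `inflect("tax", RULE)` rebuilds the plural when it is needed.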
Lexicrunch's formalism for representing morphological rules was
influenced by the Two-level model of Koskenniemi.
The program was tested on the past tense inflection in
English, the first person singular present indicative of Finnish,
and the past participle in French. It appeared to pick up most of
the regularities in the data successfully. However, a meta-level
extension to the program is indicated to enable it to capture
regularities across its rules.