Specifying Logic Programs in Controlled Natural Language
Writing specifications for computer programs is not easy since one has to
take into account the disparate conceptual worlds of the application domain and
of software development. To bridge this conceptual gap we propose controlled
natural language as a declarative and application-specific specification
language. Controlled natural language is a subset of natural language that can
be accurately and efficiently processed by a computer, but is expressive enough
to allow natural usage by non-specialists. Specifications in controlled natural
language are automatically translated into Prolog clauses, hence become formal
and executable. The translation uses a definite clause grammar (DCG) enhanced
by feature structures. Inter-text references of the specification, e.g.
anaphora, are resolved with the help of discourse representation theory (DRT).
The generated Prolog clauses are added to a knowledge base. We have implemented
a prototypical specification system that successfully processes the
specification of a simple automated teller machine.

Comment: 16 pages, compressed, uuencoded Postscript, published in Proceedings CLNLP 95, COMPULOGNET/ELSNET/EAGLES Workshop on Computational Logic for Natural Language Processing, Edinburgh, April 3-5, 1995
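The abstract above describes translating controlled natural language into executable Prolog clauses via a definite clause grammar. A minimal sketch of that idea, not the authors' system: here a single hypothetical sentence pattern ("Every X is a Y.") is mapped to the corresponding Horn clause, rendered as a string. A real DCG with feature structures covers a far richer controlled fragment.

```python
# Sketch only: one controlled-natural-language pattern translated into a
# Prolog-style clause. The pattern and clause shape are illustrative
# assumptions, not the paper's actual grammar.
import re

def translate(sentence: str) -> str:
    """Map 'Every customer is a person.' to 'person(X) :- customer(X).'"""
    m = re.fullmatch(r"Every (\w+) is an? (\w+)\.", sentence)
    if m is None:
        raise ValueError("sentence is outside the controlled fragment")
    subj, pred = m.groups()
    # Universal statement becomes a Horn clause over a shared variable X.
    return f"{pred}(X) :- {subj}(X)."
```

In a full system such clauses would be asserted into the knowledge base and become directly executable by the Prolog engine.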
Structural variation in generated health reports
We present a natural language generator that produces a range of medical reports on the clinical histories of
cancer patients, and discuss the problem of conceptual restatement in generating various textual views of the
same conceptual content. We focus on two features of our system: the demand for 'loose paraphrases' between
the various reports on a given patient, with a high degree of semantic overlap but some necessary amount of distinctive content; and the requirement for paraphrasing primarily at the discourse level.
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.

Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
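The survey above notes that paraphrasing can be seen as bidirectional textual entailment. A toy sketch of that relationship: the entailment check here is a deliberately crude stand-in (bag-of-words containment); real recognition methods use learned models, but the bidirectional composition is the point being illustrated.

```python
# Sketch only: paraphrase defined as entailment in both directions.
# The entails() heuristic is an assumption for illustration, not a
# method from the survey.
def entails(text: str, hypothesis: str) -> bool:
    """Toy proxy: every word of the hypothesis occurs in the text."""
    return set(hypothesis.lower().split()) <= set(text.lower().split())

def is_paraphrase(a: str, b: str) -> bool:
    # Paraphrase = a entails b AND b entails a.
    return entails(a, b) and entails(b, a)
```

Under this toy definition, a sentence entails any strict subset of its words but is a paraphrase only of word-for-word rearrangements, which mirrors how one-directional entailment is weaker than paraphrase.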
Types of middle voice in Indonesian language (Tipe-tipe diatesis medial dalam Bahasa Indonesia)
As the national language, Indonesian has often been used as the object of linguistic study by both local and foreign linguists. This study is concerned with the types and morphological structure of verbs in the Indonesian middle voice. Data were gathered through interviews with speakers of Indonesian and supplemented with examples from the daily newspaper Bali Post. Based on the analysis, the middle voice in Indonesian can be distinguished into lexical, morphological, and periphrastic middles. The lexical middle is constructed only by zero intransitive verbs, and the action performed by the ACTOR refers back to the ACTOR. The morphological middle results from the affixes (ber-) and (ber-/-an) attached to verb and noun bases. The periphrastic middle can be derived from the morphological middle; the affixes commonly used to produce it are the transitive affixes (meN-), (meN-/-kan), and (meN-/-i), whose surface forms vary with the initial phoneme of the base. Semantically, the middle voice in Indonesian is classified into ten types following Kemmer: (1) grooming or body action middle, (2) change in body posture middle, (3) non-translational motion middle, (4) translational motion middle, (5) indirect middle, (6) emotion middle, (7) cognitive middle, (8) spontaneous middle, (9) reciprocal situation middle, and (10) emotive speech action middle.
Exploiting Lexical Conceptual Structure for paraphrase generation
Lexical Conceptual Structure (LCS) represents verbs as semantic structures built from a limited number of semantic predicates. This paper explores how LCS can be used to explain the regularities underlying lexical and syntactic paraphrases, such as verb alternation, compound word decomposition, and lexical derivation. We propose a paraphrase generation model that transforms the LCSs of verbs, and then conduct an empirical experiment taking the paraphrasing of Japanese light-verb constructions as an example. Experimental results confirm that the syntactic and semantic properties of verbs encoded in LCS are useful for semantically constraining the syntactic transformations in paraphrase generation.
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Following the major success of neural language models (LMs) such as BERT or
GPT-2 on a variety of language understanding tasks, recent work focused on
injecting (structured) knowledge from external resources into these models.
While on the one hand, joint pretraining (i.e., training from scratch, adding
objectives based on external knowledge to the primary LM objective) may be
prohibitively computationally expensive, post-hoc fine-tuning on external
knowledge, on the other hand, may lead to the catastrophic forgetting of
distributional knowledge. In this work, we investigate models for complementing
the distributional knowledge of BERT with conceptual knowledge from ConceptNet
and its corresponding Open Mind Common Sense (OMCS) corpus, respectively, using
adapter training. While overall results on the GLUE benchmark paint an
inconclusive picture, a deeper analysis reveals that our adapter-based models
substantially outperform BERT (up to 15-20 performance points) on inference
tasks that require the type of conceptual knowledge explicitly present in
ConceptNet and OMCS.
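The adapter training mentioned above typically inserts small bottleneck modules into an otherwise frozen pretrained transformer. A minimal sketch of such a bottleneck adapter, with illustrative dimensions and initialisation (not the paper's exact configuration): a down-projection, a nonlinearity, an up-projection, and a residual connection that preserves the pretrained representation when the adapter's output is near zero.

```python
# Sketch only: a bottleneck adapter layer as used for post-hoc knowledge
# injection. Sizes, ReLU, and zero up-projection init are assumptions
# chosen so the adapter starts as an identity map.
import numpy as np

def adapter(h, W_down, W_up):
    """h: (d,) hidden state; bottleneck of size m << d."""
    z = np.maximum(0.0, h @ W_down)  # down-project + ReLU
    return h + z @ W_up              # up-project + residual connection

d, m = 8, 2
rng = np.random.default_rng(0)
h = rng.normal(size=d)
W_down = rng.normal(size=(d, m)) * 0.01
W_up = np.zeros((m, d))  # zero init: adapter output is exactly h at start
out = adapter(h, W_down, W_up)
```

Because only W_down and W_up would be trained while the transformer weights stay frozen, this design sidesteps both the cost of joint pretraining and the catastrophic forgetting risk of full fine-tuning that the abstract contrasts.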