Semanticizing syntactic patterns in NLP processing using SPARQL-DL queries
Some recent work on natural language semantic parsing uses syntax and semantics together through different combination models. In our work we use SPARQL-DL as an interface between syntactic information produced by the Stanford statistical parser (namely part-of-speech tagged text and typed dependency representations) and semantic information obtained from the FrameNet database. We use SPARQL-DL queries to check for the presence of syntactic patterns within a sentence and to identify their roles as frame elements. We chose SPARQL-DL because it is a common reference language for semantic applications and is highly expressive, which lets rules be generalized by exploiting the inference capabilities of the underlying reasoner.
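The idea of checking a sentence for a syntactic pattern and reading frame-element roles off the match can be illustrated with a minimal sketch. This is plain Python standing in for a query engine, not SPARQL-DL and not the authors' rules: the triple layout `(relation, head, dependent)`, the `?variable` convention, the function names, and the example pattern for the FrameNet frame Commerce_buy (frame elements Buyer and Goods) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding of a typed dependency: (relation, head, dependent).
Triple = Tuple[str, str, str]


def unify(pattern: Triple, fact: Triple, env: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Try to match one pattern triple against one dependency triple.

    Terms starting with '?' are variables; returns the extended bindings,
    or None if the triple does not fit the bindings made so far.
    """
    new_env = dict(env)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new_env.setdefault(p, f) != f:
                return None  # variable already bound to a different word
        elif p != f:
            return None  # constant term does not match
    return new_env


def match(patterns: List[Triple], facts: List[Triple]) -> List[Dict[str, str]]:
    """Return every set of variable bindings that satisfies all pattern triples."""
    envs: List[Dict[str, str]] = [{}]
    for pattern in patterns:
        next_envs = []
        for env in envs:
            for fact in facts:
                extended = unify(pattern, fact, env)
                if extended is not None:
                    next_envs.append(extended)
        envs = next_envs
    return envs


# Typed dependencies for "John bought a car" (written by hand for the example).
sentence = [("nsubj", "bought", "John"), ("dobj", "bought", "car"), ("det", "car", "a")]

# Illustrative pattern: the subject fills Buyer, the direct object fills Goods.
commerce_buy = [("nsubj", "bought", "?buyer"), ("dobj", "bought", "?goods")]

bindings = match(commerce_buy, sentence)
```

Here `bindings` holds one match, binding `?buyer` to "John" and `?goods` to "car". A real SPARQL-DL setup would additionally let the reasoner generalize such a pattern, for example over all verbs that evoke the same frame, which this sketch does not attempt.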
Tool comparison of semantic parsers
Natural Language Processing (NLP) is vital for integrating artificial intelligence systems into human lives, a long-standing goal of researchers in the field. While NLP addresses an array of problems, this paper focuses specifically on semantic parsing. Semantic parsers have been a considerable target for improvement within the scientific community, and demand for parsers that achieve high accuracy has increased. Many approaches have been developed for this purpose, and in this paper a deep analysis is performed to compare the performance of semantic parsing systems. The comparison provides a view of how semantic parsers from different eras compare on a set of shared metrics.
Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT). In this paper, we present an error analysis of a new cross-lingual task: the 5W task, a sentence-level understanding task which seeks to return the English 5W's (Who, What, When, Where and Why) corresponding to a Chinese sentence. We analyze systems that we developed, identifying specific problems in language processing and MT that cause errors. The best cross-lingual 5W system was still 19% worse than the best monolingual 5W system, which shows that MT significantly degrades sentence-level understanding. Neither source-language nor target-language analysis was able to circumvent problems in MT, although each approach had advantages relative to the other. A detailed error analysis across multiple systems suggests directions for future research on the problem
Unsupervised Semantic Frame Induction using Triclustering
We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction. We cast the frame induction problem as a triclustering problem, a generalization of clustering to triadic data. Our replicable benchmarks demonstrate that the proposed graph-based approach, Triframes, shows state-of-the-art results on this task on a FrameNet-derived dataset and performs on par with competitive methods on a verb class clustering task. Comment: 8 pages, 1 figure, 4 tables, accepted at ACL 201
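To make the notion of triclustering over dependency triples concrete, here is a toy sketch. It is not the Triframes algorithm: as a deliberately simple stand-in, it links two (subject, verb, object) triples when they agree in at least two of the three positions and takes connected components of that graph as clusters. The linking rule, the function name `tricluster`, and the sample triples are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, verb, object)


def tricluster(triples: List[Triple]) -> List[Tuple[Set[str], ...]]:
    """Group triples into (subjects, verbs, objects) clusters.

    Toy rule (an assumption): two triples are connected when they share
    at least two of their three elements; clusters are the connected
    components, found with a small union-find.
    """
    parent = list(range(len(triples)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(triples)), 2):
        shared = sum(a == b for a, b in zip(triples[i], triples[j]))
        if shared >= 2:
            parent[find(i)] = find(j)

    groups: Dict[int, List[Triple]] = {}
    for i, triple in enumerate(triples):
        groups.setdefault(find(i), []).append(triple)

    # Transpose each group into three sets: subjects, verbs, objects.
    return [tuple(set(column) for column in zip(*group)) for group in groups.values()]


triples = [
    ("chef", "cook", "meal"),
    ("chef", "prepare", "meal"),
    ("mother", "cook", "meal"),
    ("company", "buy", "startup"),
]
clusters = tricluster(triples)
```

With these four triples the first three end up in one cluster, whose verb set {"cook", "prepare"} plays the role of an induced frame with its subject and object sets as slots, while the fourth triple stays on its own.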
Enhancing factoid question answering using frame semantic-based approaches
FrameNet is used to enhance the performance of semantic QA systems. FrameNet is a linguistic resource that encapsulates Frame Semantics and provides scenario-based generalizations over lexical items that share similar semantic backgrounds. Doctor of Philosophy
Event-Based Modelling in Question Answering
In natural language processing, question answering systems have gained considerably in importance over the last decade. Robust tools such as statistical syntactic parsers and named entity recognizers, in particular, have made it possible to extract linguistically structured information from unannotated text corpora. In addition, the Text REtrieval Conference (TREC) annually defines benchmarks for general, domain-independent question answering scenarios. As a rule, question answering systems work well only if they implement robust methods for the different question types that occur in a question set. One characteristic question type is the so-called event question. Although events have been the subject of intensive research in theoretical linguistics, especially in sentence semantics, since the middle of the last century, they have so far remained largely unexplored with respect to question answering systems. This diploma thesis therefore addresses that problem. The aim of this work is, first, a characterization of event structure in question answering systems, to be developed by taking into account theoretical linguistics as well as an analysis of the TREC 2005 question set. Second, an event-based answer extraction method is to be designed and implemented that builds on the results of this analysis. Information from various linguistic levels is to be integrated, in a data-driven way, into a uniform model. Special linguistic resources, such as WordNet and subcategorization lexicons, will play a central role. Furthermore, an event structure is presented that allows events to be matched regardless of whether they are evoked by full verbs or by nominalizations.
With the implementation of an event-based answer extraction module, the thesis ultimately also aims to answer the question of whether explicit event modelling can improve the performance of a question answering system.
Frame-semantic parsing
Frame semantics is a linguistic theory that has been instantiated for English in the FrameNet lexicon. We solve the problem of frame-semantic parsing using a two-stage statistical model that takes lexical targets (i.e., content words and phrases) in their sentential contexts and predicts frame-semantic structures. Given a target in context, the first stage disambiguates it to a semantic frame. This model uses latent variables and semi-supervised learning to improve frame disambiguation for targets unseen at training time. The second stage finds the target's locally expressed semantic arguments. At inference time, a fast exact dual decomposition algorithm collectively predicts all the arguments of a frame at once in order to respect declaratively stated linguistic constraints, resulting in qualitatively better structures than naïve local predictors. Both components are feature-based and discriminatively trained on a small set of annotated frame-semantic parses. On the SemEval 2007 benchmark data set, the approach, along with a heuristic identifier of frame-evoking targets, outperforms the prior state of the art by significant margins. Additionally, we present experiments on the much larger FrameNet 1.5 data set. We have released our frame-semantic parser as open-source software. United States. Defense Advanced Research Projects Agency (DARPA grant NBCH-1080004); National Science Foundation (U.S.) (NSF grant IIS-0836431); National Science Foundation (U.S.) (NSF grant IIS-0915187); Qatar National Research Fund (NPRP 08-485-1-083
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources. Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201