    What conceptual graph workbenches need for natural language processing

    An important capability of the conceptual graph knowledge engineering tools now under development will be the transformation of natural language texts into graphs (conceptual parsing) and its reverse, the production of text from graphs (conceptual generation). Are the existing basic designs adequate for these tasks? Experience developing the BEELINE system's natural language capabilities suggests that good entry/editing tools, a generous but not unlimited storage capacity, and efficient, bidirectional lexical access techniques are needed to supply data structures at both the linguistic and conceptual knowledge levels. An active formalism capable of supporting declarative and procedural programs containing both linguistic and knowledge-level terms is also important. If these requirements are satisfied, future text-readers can be included as part of a conceptual knowledge workbench without unexpected problems.
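
    A minimal sketch of one requirement named above, bidirectional lexical access: the same lexicon must be traversable from word forms to concept types (conceptual parsing) and from concept types back to word forms (conceptual generation). The Python class and names below are purely illustrative and are not taken from the BEELINE system.

        # Illustrative only: a bidirectional lexical index supporting both
        # parsing (word -> concepts) and generation (concept -> words).
        from collections import defaultdict

        class BidirectionalLexicon:
            def __init__(self):
                self.word_to_concepts = defaultdict(set)   # parsing direction
                self.concept_to_words = defaultdict(set)   # generation direction

            def add(self, word, concept):
                self.word_to_concepts[word].add(concept)
                self.concept_to_words[concept].add(word)

            def concepts_for(self, word):     # conceptual parsing lookup
                return self.word_to_concepts.get(word, set())

            def words_for(self, concept):     # conceptual generation lookup
                return self.concept_to_words.get(concept, set())

        lex = BidirectionalLexicon()
        lex.add("cat", "Animal:Cat")
        lex.add("feline", "Animal:Cat")
        print(lex.words_for("Animal:Cat"))    # {'cat', 'feline'}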

    Social Web Communities

    Blogs, Wikis, and Social Bookmark Tools have rapidly emerged on the Web. The reasons for their immediate success are that people are happy to share information, and that these tools provide an infrastructure for doing so without requiring any specific skills. At the moment, there exists no foundational research for these systems, and they provide only very simple structures for organising knowledge. Individual users create their own structures, but these currently cannot be exploited for knowledge sharing. The objective of the seminar was to provide theoretical foundations for upcoming Web 2.0 applications and to investigate further applications that go beyond bookmark- and file-sharing. The main research question can be summarized as follows: how will current and emerging resource sharing systems support users in leveraging more knowledge and power from the information they share on Web 2.0 applications? Research areas such as the Semantic Web, Machine Learning, Information Retrieval, Information Extraction, Social Network Analysis, Natural Language Processing, Library and Information Sciences, and Hypermedia Systems have been working on these questions for a while. In the workshop, researchers from these areas came together to assess the state of the art and to set up a road map describing the next steps towards the next generation of social software.

    Natural language generation as neural sequence learning and beyond

    Natural Language Generation (NLG) is the task of generating natural language (e.g., English sentences) from machine-readable input. In the past few years, deep neural networks have received great attention from the natural language processing community due to impressive performance across different tasks. This thesis addresses NLG problems with deep neural networks from two different modelling views. Under the first view, natural language sentences are modelled as sequences of words, which greatly simplifies their representation and allows us to apply classic sequence modelling neural networks (i.e., recurrent neural networks) to various NLG tasks. Under the second view, natural language sentences are modelled as dependency trees, which are more expressive and allow us to capture linguistic generalisations, leading to neural models that operate on tree structures. Specifically, this thesis develops several novel neural models for natural language generation. Contrary to many existing models which aim to generate a single sentence, we propose a novel hierarchical recurrent neural network architecture to represent and generate multiple sentences. Beyond the hierarchical recurrent structure, we also propose a means to model context dynamically during generation. We apply this model to the task of Chinese poetry generation and show that it outperforms competitive poetry generation systems. Neural natural language generation models usually work well when there is a lot of training data. When the training data is not sufficient, prior knowledge for the task at hand becomes very important. To this end, we propose a deep reinforcement learning framework to inject prior knowledge into neural NLG models and apply it to sentence simplification. Experimental results show promising performance using our reinforcement learning framework. Both poetry generation and sentence simplification are tackled with models following the sequence learning view, where sentences are treated as word sequences. In this thesis, we also explore how to generate natural language sentences as tree structures. We propose a neural model which combines the advantages of syntactic structure and recurrent neural networks. More concretely, our model defines the probability of a sentence by estimating the generation probability of its dependency tree. At each time step, a node is generated based on the representation of the generated subtree. We show experimentally that this model achieves good performance in language modelling and can also generate dependency trees.
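
    The dependency-tree view described above can be illustrated with a toy factorisation: the probability of a sentence is the product of per-node probabilities, each conditioned on the subtree generated so far. The sketch below is not the thesis model; node_logprob is a placeholder for a learned neural scorer (e.g., a recurrent network over the generated nodes).

        # Illustrative sketch: score a sentence via its dependency tree,
        # generating each node conditioned on the subtree built so far.
        import math
        from dataclasses import dataclass, field

        @dataclass
        class DepNode:
            word: str
            children: list = field(default_factory=list)

        def node_logprob(word, generated_so_far):
            # Placeholder: a real model would encode generated_so_far with a
            # recurrent network and return log P(word | subtree representation).
            return math.log(1.0 / (1 + len(generated_so_far)))

        def tree_logprob(node, generated=None):
            generated = [] if generated is None else generated
            lp = node_logprob(node.word, generated)
            generated.append(node.word)
            for child in node.children:   # children are generated given the growing subtree
                lp += tree_logprob(child, generated)
            return lp

        root = DepNode("likes", [DepNode("She"), DepNode("tea", [DepNode("green")])])
        print(tree_logprob(root))   # log-probability of the whole dependency tree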

    DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text

    Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying solely on their internal knowledge, especially when answering questions that require less commonly known information. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge. Nonetheless, recent approaches have primarily emphasized retrieval from unstructured text corpora, owing to its seamless integration into prompts. When using structured data such as knowledge graphs, most methods simplify it into natural text, neglecting the underlying structures. Moreover, a significant gap in the current landscape is the absence of a realistic benchmark for evaluating the effectiveness of grounding LLMs on heterogeneous knowledge sources (e.g., knowledge base and text). To fill this gap, we have curated a comprehensive dataset that poses two unique challenges: (1) two-hop multi-source questions that require retrieving information from both open-domain structured and unstructured knowledge sources, where retrieving information from structured knowledge sources is a critical component in correctly answering the questions; (2) the generation of symbolic queries (e.g., SPARQL for Wikidata), which adds another layer of challenge. Our dataset is created using a combination of automatic generation through predefined reasoning chains and human annotation. We also introduce a novel approach that leverages multiple retrieval tools, including text passage retrieval and symbolic language-assisted retrieval. Our model outperforms previous approaches by a significant margin, demonstrating its effectiveness in addressing the above-mentioned reasoning challenges.
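
    As an illustration of symbolic language-assisted retrieval over Wikidata (not the DIVKNOWQA system itself), the sketch below issues a SPARQL query for the first, structured hop and would hand the intermediate answer to a text retriever for the second hop; text_retrieve is a hypothetical stand-in, and the use of the SPARQLWrapper package is an assumption.

        # Illustrative sketch: SPARQL-based retrieval for the structured hop of a
        # two-hop, multi-source question.
        from SPARQLWrapper import SPARQLWrapper, JSON

        def wikidata_lookup(sparql_query):
            endpoint = SPARQLWrapper("https://query.wikidata.org/sparql")
            endpoint.setQuery(sparql_query)
            endpoint.setReturnFormat(JSON)
            return endpoint.query().convert()["results"]["bindings"]

        # Hop 1 (structured): country of citizenship (P27) of Douglas Adams (Q42).
        query = """
        SELECT ?countryLabel WHERE {
          wd:Q42 wdt:P27 ?country .
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }
        """
        bindings = wikidata_lookup(query)
        country = bindings[0]["countryLabel"]["value"] if bindings else None
        print(country)

        # Hop 2 (unstructured): pass the intermediate answer to a passage retriever.
        # passages = text_retrieve(f"current population of {country}")   # hypothetical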

    A Study Towards Spanish Abstract Meaning Representation

    Taking into account the increasing attention that researchers in Natural Language Understanding (NLU) and Natural Language Generation (NLG) are paying to Computational Semantics, we analyze the feasibility of annotating Spanish Abstract Meaning Representations. The Abstract Meaning Representation (AMR) project aims to create a large-scale sembank of simple structures that represent the unified, complete semantic information contained in English sentences. Although AMR is not intended to be an interlingua, one of its key features is the ability to focus on events rather than on word forms, for instance by abstracting away from morpho-syntactic idiosyncrasies. In this thesis, we investigate the requirements for annotating Spanish AMRs and put forward a proposal for doing so, based on the premise that many of these idiosyncrasies mark differences between languages. To our knowledge, this is the first work towards the development of Abstract Meaning Representation for Spanish.
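
    A brief illustration of the abstraction at issue: the classic AMR for "The boy wants to eat" would, under the same event-centred analysis, also serve the Spanish "El niño quiere comer", since morpho-syntactic details such as the infinitive marking are abstracted away. The snippet below uses the penman Python package to read the graph; both the package and the example are illustrative assumptions, not part of the thesis.

        # Illustrative only: one AMR shared across an English sentence and its
        # Spanish counterpart, decoded from PENMAN notation into triples.
        import penman

        amr = """
        (w / want-01
           :ARG0 (b / boy)
           :ARG1 (e / eat-01
                    :ARG0 b))
        """
        graph = penman.decode(amr)
        for source, role, target in graph.triples:
            print(source, role, target)
        # ('w', ':instance', 'want-01'), ('w', ':ARG0', 'b'), ...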

    Semantic Structure based Query Graph Prediction for Question Answering over Knowledge Graph

    Building query graphs from questions is an important step in complex question answering over knowledge graphs (Complex KGQA). In general, a question can be correctly answered if its query graph is built correctly and the right answer is then retrieved by issuing the query graph against the KG. Therefore, this paper focuses on query graph generation from natural language questions. Existing approaches for query graph generation ignore the semantic structure of a question, resulting in a large number of noisy query graph candidates that undermine prediction accuracy. In this paper, we define six semantic structures from common questions in KGQA, develop a novel Structure-BERT to predict the semantic structure of a question, and then rank the remaining candidates with a BERT-based ranking model. Extensive experiments on two popular benchmarks, MetaQA and WebQuestionsSP, demonstrate the effectiveness of our method compared to the state of the art.
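
    The two-stage idea can be sketched as follows: predict a coarse semantic structure for the question, discard candidates whose structure does not match, and rank the remainder. In this sketch, predict_structure and score_candidate are placeholders for the paper's Structure-BERT classifier and BERT-based ranker, and the structure labels are illustrative, not the six defined in the paper.

        # Minimal sketch of structure-based pruning followed by candidate ranking.
        def predict_structure(question):
            # Placeholder for Structure-BERT: returns a semantic-structure label.
            return "chain-2"

        def score_candidate(question, query_graph):
            # Placeholder for a BERT-based ranker: higher score = better match.
            return -len(query_graph["edges"])

        def select_query_graph(question, candidates):
            structure = predict_structure(question)
            filtered = [c for c in candidates if c["structure"] == structure]  # prune noisy candidates
            return max(filtered, key=lambda c: score_candidate(question, c)) if filtered else None

        candidates = [
            {"structure": "chain-2", "edges": ["directed_by", "place_of_birth"]},
            {"structure": "branch", "edges": ["directed_by", "starred_in", "place_of_birth"]},
        ]
        print(select_query_graph("Where was the director of Inception born?", candidates))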

    A uniform computational model for natural language parsing and generation

    In the area of natural language processing, there has been a strong tendency in recent years towards reversible natural language grammars, i.e., the use of one and the same grammar for grammatical analysis (parsing) and grammatical synthesis (generation) in a natural language system. The idea of representing grammatical knowledge only once and of using it for performing both tasks seems quite plausible, and there are many arguments based on practical and psychological considerations for adopting such a view (in section 2.1 we discuss the most important arguments in more detail). Nevertheless, in almost all large natural language systems in which parsing and generation are considered in similar depth, different algorithms are used, even when the same grammar is used. At present, the first attempts are being made at uniform architectures based on the paradigm of natural language processing as deduction (they are described and discussed in detail in section 2.3). Here, grammatical processing is performed by means of the same underlying deduction mechanism, which can be parameterized for the specific task at hand. Natural language processing based on a uniform deduction process has a formal elegance and results in more compact systems. There is one further advantage that is of both theoretical and practical relevance: a uniform architecture offers the possibility of viewing parsing and generation as strongly interleaved tasks. Interleaving parsing and generation is important if we assume that natural language understanding and production are not performed in isolation but rather can work together to obtain a flexible use of language. In particular, this means (a) the use of one mode of operation for monitoring the other, and (b) the direct use of structures resulting from one direction in the other. For example, during generation, integrated parsing can be used to monitor the generation process and to trigger some kind of revision, e.g., to reduce the risk of misunderstandings. Research on monitoring and revision strategies is a very active area in cognitive science; however, there currently exists no algorithmic model of such behaviour. A uniform architecture can be an important step in that direction. Unfortunately, the currently proposed uniform architectures are very inefficient, and it is as yet unclear how an efficiency-oriented uniform model could be achieved. An obvious problem is that different input structures are involved in each direction (a string for parsing and a semantic expression for generation), which causes a different traversal of the search space defined by the grammar. Even if this problem were solved, it is not obvious how a uniform model could efficiently re-use partial results computed in one direction in the other direction, so as to obtain a practical interleaved approach to parsing and generation.
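
    The "one grammar, one deduction mechanism" idea can be made concrete with a toy sketch: each rule pairs a surface form with a semantic term, and a single search procedure is parameterized by which side of the pairing is given (the string for parsing, the semantic expression for generation). This is purely illustrative and is not the dissertation's architecture.

        # Toy reversible grammar: the same rules drive parsing and generation.
        RULES = [
            ("mary", "mary"),
            ("sleeps", "sleep"),
            ("laughs", "laugh"),
        ]

        def derive(known, known_index):
            # known_index 0 = parsing (word given, return semantics),
            # known_index 1 = generation (semantics given, return word).
            return [rule[1 - known_index] for rule in RULES if rule[known_index] == known]

        def parse(words):
            # "mary sleeps" -> ('sleep', 'mary'): predicate applied to its argument.
            subject, verb = derive(words[0], 0), derive(words[1], 0)
            return [(pred, arg) for arg in subject for pred in verb]

        def generate(sem):
            pred, arg = sem
            subjects, verbs = derive(arg, 1), derive(pred, 1)
            return [f"{s} {v}" for s in subjects for v in verbs]

        print(parse(["mary", "sleeps"]))     # [('sleep', 'mary')]
        print(generate(("sleep", "mary")))   # ['mary sleeps']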
