322 research outputs found
Towards a generation-based semantic web authoring tool
Widespread use of Semantic Web technologies requires interfaces through which knowledge can be viewed and edited without a deep understanding of Description Logic and formalisms like OWL and RDF. Several groups are pursuing approaches based on Controlled Natural Languages (CNLs), so that editing can be performed by typing in sentences which are automatically interpreted as statements in OWL. We suggest here a variant of this approach which relies entirely on Natural Language Generation (NLG), and propose requirements for a system that can reliably generate transparent realisations of statements in Description Logic.
A Linearization Framework for Dependency and Constituent Trees
[Abstract]: Parsing is a core natural language processing problem in which, given a raw input sentence, a model automatically produces a structured output that represents its syntactic structure. The most common formalisms in this field are constituent and dependency parsing. Although the two formalisms differ, they share limitations, in particular the limited speed of the models that produce the desired representation, and the lack of a common representation that would allow any end-to-end neural system to predict them. Transforming both parsing tasks into a sequence labeling task solves both of these problems. Several tree linearizations have been proposed in recent years; however, there is no common suite that facilitates their use under an integrated framework. In this work, we develop such a system. On the one hand, the system is able to: (i) encode syntactic trees according to the desired syntactic formalism and linearization function, and (ii) decode linearized trees into their original representation. On the other hand, (iii) we also train several neural sequence labeling systems to perform parsing from those labels, and we compare the results. Bachelor's thesis (UDC.FIC). Computer Engineering. Academic year 2021/202
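The encode/decode pair the abstract describes can be sketched minimally for the dependency case, using a relative-head-offset encoding, one common family in the sequence-labeling parsing literature. The exact encodings used in the thesis may differ; this is an illustrative assumption:

```python
# Sketch: a dependency tree as one label per token, combining the
# relative offset to the head with the dependency relation.

def encode(heads, rels):
    """heads[i] is the 1-based head of token i+1 (0 = artificial root)."""
    labels = []
    for i, (h, rel) in enumerate(zip(heads, rels), start=1):
        offset = h - i                      # head position relative to the token
        labels.append(f"{offset}@{rel}")
    return labels

def decode(labels):
    """Recover (heads, rels) from the label sequence."""
    heads, rels = [], []
    for i, lab in enumerate(labels, start=1):
        off, rel = lab.split("@")
        heads.append(i + int(off))
        rels.append(rel)
    return heads, rels

# "She sleeps": 'sleeps' is the root, 'She' depends on it.
heads, rels = [2, 0], ["nsubj", "root"]
labels = encode(heads, rels)                # ['1@nsubj', '-2@root']
assert decode(labels) == (heads, rels)      # lossless round trip
```

Because each token receives exactly one label, any off-the-shelf sequence labeler (e.g. a BiLSTM or transformer tagger) can be trained on these labels, which is what makes the shared representation fast and model-agnostic.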
Formal Linguistic Models and Knowledge Processing. A Structuralist Approach to Rule-Based Ontology Learning and Population
2013 - 2014. The main aim of this research is to propose a structuralist approach to knowledge processing by means of ontology learning and population, starting from unstructured and structured texts. The suggested method combines distributional semantic approaches and NL formalization theories in order to develop a framework that relies upon deep linguistic analysis... [edited by author] XIII n.s.
IS THIS ARTIFICIAL INTELLIGENCE?
Artificial Intelligence (AI) has become one of the most frequently used terms in technical jargon (and often in not-so-technical jargon). Recent advancements in the field of AI have certainly contributed to the AI hype, as have numerous applications and results of using AI technology in practice. Still, just like any other hype, the AI hype has its controversies. This paper critically examines developments in the field of AI from multiple perspectives – research, technological, social and pragmatic. Part of the controversy of the AI hype stems from the fact that people use the term AI differently, often without a deep understanding of the wider context in which AI as a field has been developing since its inception in the mid-1950s.
Transformer Neural Networks for Automated Story Generation
Over the last two decades, Artificial Intelligence (AI) has proven its use in tasks such as image recognition, natural language processing, and automated driving. As described by Moore's law, computational power has increased rapidly over recent decades (Moore, 1965), making it possible to use techniques that were previously computationally expensive. Among these techniques, Deep Learning (DL) changed the field of AI and outperformed other models in many areas, some of which are mentioned above. However, natural language generation, especially for creative tasks, requires artificially intelligent models to have not only a precise understanding of the given input but also the ability to be creative, fluent, and coherent within a context. One such task is automated story generation, which has been an open research area since the early days of artificial intelligence. This study investigates whether the transformer network can outperform the state-of-the-art model for automated story generation. A large dataset was gathered from Reddit's WRITING PROMPTS subforum and processed by the transformer network in order to compare perplexity and two human evaluation metrics between the transformer network and the state-of-the-art model. It was found that the transformer network cannot outperform the state-of-the-art model, and even though it generated viable and novel stories, it did not pay much attention to the prompts of the generated stories. The results also implied that a better automated evaluation metric is needed to assess the performance of story generation models.
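Perplexity, the automatic metric the study reports alongside the human evaluations, is the exponential of the average negative log-likelihood per token. A minimal sketch, with made-up token probabilities for illustration:

```python
# Corpus-level perplexity from per-token model probabilities.
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Probabilities the model assigned to each gold token (invented values).
probs = [0.25, 0.5, 0.125, 0.25]
print(perplexity(probs))   # -> 4.0
```

Lower is better: a perplexity of 4.0 means the model is, on average, as uncertain as if it were choosing uniformly among four tokens at each step.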
ANNOTATED DISJUNCT FOR MACHINE TRANSLATION
Most information found on the Internet is available in English. However, most people in the world are not English speakers. Hence, it would be of great advantage to have a reliable Machine Translation tool for those people. There are many approaches to developing Machine Translation (MT) systems, among them direct, rule-based/transfer, interlingua, and statistical approaches. This thesis focuses on developing an MT system for less-resourced languages, i.e. languages that do not have an available grammar formalism, parser, or corpus, such as some languages in South East Asia. The nonexistence of bilingual corpora motivates us to use direct or transfer approaches. Moreover, the unavailability of a grammar formalism and parser for the target languages motivates us to develop a hybrid between the direct and transfer approaches, referred to as a hybrid transfer approach. This approach uses the Annotated Disjunct (ADJ) method. This method, based on the Link Grammar (LG) formalism, can theoretically handle one-to-one, many-to-one, and many-to-many word translations. It consists of a transfer rules module that maps source words in a source sentence (SS) into target words in the correct positions in a target sentence (TS). The developed transfer rules are demonstrated on English → Indonesian translation tasks. An experimental evaluation is conducted to measure the performance of the developed system against available English-Indonesian MT systems. The developed ADJ-based MT system translated simple, compound, and complex English sentences in the present, present continuous, present perfect, past, past perfect, and future tenses with better precision than the other systems, with an accuracy of 71.17% on the Subjective Sentence Error Rate metric.
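The flavor of a transfer rule that maps source words into the correct target positions can be shown with a toy English → Indonesian example: dictionary lookup plus one reordering rule (Indonesian places adjectives after nouns and often drops articles). The lexicon and the single rule below are invented for illustration; the thesis's ADJ method over Link Grammar is far richer:

```python
# Toy direct/transfer-style step: lexical lookup, then an ADJ+NOUN
# reordering rule. Lexicon entries are illustrative, not from the thesis.
LEXICON = {
    "she":   ("dia", "PRON"),
    "reads": ("membaca", "VERB"),
    "a":     (None, "DET"),       # article usually dropped in Indonesian
    "red":   ("merah", "ADJ"),
    "book":  ("buku", "NOUN"),
}

def translate(sentence):
    """Word-for-word lookup, then swap ADJ NOUN -> NOUN ADJ."""
    words = [LEXICON[w] for w in sentence.lower().split()]
    words = [(t, pos) for t, pos in words if t is not None]
    out, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and words[i][1] == "ADJ" and words[i + 1][1] == "NOUN":
            out += [words[i + 1][0], words[i][0]]   # noun before adjective
            i += 2
        else:
            out.append(words[i][0])
            i += 1
    return " ".join(out)

print(translate("she reads a red book"))   # -> dia membaca buku merah
```

A real transfer module would key such rules on the disjuncts (allowed link combinations) of each word rather than on bare part-of-speech tags, which is what lets the ADJ method handle many-to-one and many-to-many mappings.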