93 research outputs found

A New Multilingual Authoring Tool of Semistructured Legal Documents

Author: Aguayo Maldonado Andres
Bautista Zambrana María Rosario
Caro Herrero José Luis
Corpas Pastor Gloria
Guevara Plaza Antonio
Trujillo Antonio Jesús
Publication venue: 'Malaga University'
Publication date: 25/09/2017

Current approaches to multilingual document management rely on human translation, machine translation (MT), and computer-assisted translation (CAT) to produce versions of a single document in several languages. However, recent advances in natural language generation (NLG) indicate that language-independent systems can be implemented to produce documents in several languages, independently of any source language, more efficiently and cost-effectively. In this article we present GenTur, a writing-aid tool for producing tourism contracts in several languages. Special attention is paid to two basic elements of its implementation: on the one hand, the xgtling interlingua used for the discourse representation of the contracts and, on the other, the development of an architecture that allows this interlingua to generate tourism contracts by means of the GT-Mth generation algorithm.

Crossref

Portal de Revistas OJS

Up-cycling Data for Natural Language Generation

Author: Grover Claire
Isard Amy
Oberlander Jon
Publication date: 12/05/2018

Edinburgh Research Explorer

A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture

Author: Jeong Cheonsu
Publication date: 03/09/2023

This study presents a method for implementing generative AI services by utilizing the Large Language Model (LLM) application architecture. With recent advancements in generative AI technology, LLMs have gained prominence across various domains. In this context, the research addresses the challenge of information scarcity and proposes specific remedies by harnessing LLM capabilities. The investigation delves into strategies for mitigating the issue of inadequate data, offering tailored solutions. The study delves into the efficacy of employing fine-tuning techniques and direct document integration to alleviate data insufficiency. A significant contribution of this work is the development of a Retrieval-Augmented Generation (RAG) model, which tackles the aforementioned challenges. The RAG model is carefully designed to enhance information storage and retrieval processes, ensuring improved content generation. The research elucidates the key phases of the information storage and retrieval methodology underpinned by the RAG model. A comprehensive analysis of these steps is undertaken, emphasizing their significance in addressing the scarcity of data. The study highlights the efficacy of the proposed method, showcasing its applicability through illustrative instances. By implementing the RAG model for information storage and retrieval, the research not only contributes to a deeper comprehension of generative AI technology but also facilitates its practical usability within enterprises utilizing LLMs. This work holds substantial value in advancing the field of generative AI, offering insights into enhancing data-driven content generation and fostering active utilization of LLM-based services within corporate settings

arXiv.org e-Print Archive

Natural Language Interfaces to Data

Author: Efthymiou Vasilis
Lei Chuan
Quamar Abdul
Özcan Fatma
Publication venue: 'Now Publishers'
Publication date: 26/12/2022

Recent advances in NLU and NLP have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences and the generated SQL queries has evolved over time. More recently, there has also been a lot of focus on using conversational interfaces for data analytics, empowering a line of non-technical users with quick insights into the data. There are three main challenges in natural language querying (NLQ): (1) identifying the entities involved in the user utterance, (2) connecting the different entities in a meaningful way over the underlying data source to interpret user intents, and (3) generating a structured query in the form of SQL or SPARQL. There are two main approaches for interpreting a user's NLQ. Rule-based systems make use of semantic indices, ontologies, and KGs to identify the entities in the query, understand the intended relationships between those entities, and utilize grammars to generate the target queries. With the advances in deep learning (DL)-based language models, there have been many text-to-SQL approaches that try to interpret the query holistically using DL models. Hybrid approaches that utilize both rule-based techniques and DL models are also emerging, combining the strengths of both approaches. Conversational interfaces are the next natural step beyond one-shot NLQ, exploiting query context between multiple turns of conversation for disambiguation. In this article, we review the background technologies that are used in natural language interfaces, and survey the different approaches to NLQ. We also describe conversational interfaces for data analytics and discuss several benchmarks used for NLQ research and evaluation.
Comment: The full version of this manuscript, as published by Foundations and Trends in Databases, is available at http://dx.doi.org/10.1561/190000007

arXiv.org e-Print Archive

PaLM: Scaling Language Modeling with Pathways

Author: Agrawal Shivani
Austin Jacob
Barham Paul
Barnes Parker
Bosma Maarten
Bradbury James
Catasta Michele
Child Rewon
Chowdhery Aakanksha
Chung Hyung Won
Dai Andrew M.
Dean Jeff
Dev Sunipa
Devlin Jacob
Diaz Mark
Dohan David
Du Nan
Duke Toju
Eck Douglas
Fedus Liam
Fiedel Noah
Firat Orhan
Garcia Xavier
Gehrmann Sebastian
Ghemawat Sanjay
Gur-Ari Guy
Hutchinson Ben
Ippolito Daphne
Isard Michael
Lee Katherine
Levskaya Anselm
Lewkowycz Aitor
Lim Hyeontaek
Luan David
Maynez Joshua
Meier-Hellstern Kathy
Michalewski Henryk
Mishra Gaurav
Misra Vedant
Moreira Erica
Narang Sharan
Omernick Mark
Pellat Marie
Petrov Slav
Pillai Thanumalayan Sankaranarayana
Polozov Oleksandr
Pope Reiner
Prabhakaran Vinodkumar
Rao Abhishek
Reif Emily
Roberts Adam
Robinson Kevin
Saeta Brennan
Schuh Parker
Sepassi Ryan
Shazeer Noam
Shi Kensen
Spiridonov Alexander
Sutton Charles
Tay Yi
Tsvyashchenko Sasha
Wang Xuezhi
Wei Jason
Yin Pengcheng
Zhou Denny
Zhou Zongwei
Zoph Barret
Publication date: 19/04/2022

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.

arXiv.org e-Print Archive

Adapting the use of attributes to the task environment in joint action: results and a model

Author: Bard Ellen
Guhe Markus
Publication date: 01/06/2008

Edinburgh Research Explorer

Surface Realisation from Knowledge-Bases

Author: Gardent Claire
Gyawali Bikash
Publication venue: HAL CCSD
Publication date: 22/06/2014

We present a simple, data-driven approach to generation from knowledge bases (KB). A key feature of this approach is that grammar induction is driven by the extended domain of locality principle of TAG (Tree Adjoining Grammar), and that it takes into account both syntactic and semantic information. The resulting extracted TAG includes a unification-based semantics and can be used by an existing surface realiser to generate sentences from KB data. Experimental evaluation on the KBGen data shows that our model outperforms a data-driven generate-and-rank approach based on an automatically induced probabilistic grammar, and is comparable with a handcrafted symbolic approach.

INRIA a CCSD electronic archive server

Adaptive hypertext and hypermedia : workshop : proceedings, 3rd, Sonthofen, Germany, July 14, 2001 and Aarhus, Denmark, August 15, 2001

Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2001

This paper presents two empirical usability studies based on techniques from Human-Computer Interaction (HCI) and software engineering, which were used to elicit requirements for the design of a hypertext generation system. Here we discuss the findings of these studies, which were used to motivate the choice of adaptivity techniques. The results showed dependencies between the different ways of adapting the explanation content and the document length and formatting. The system's architecture therefore had to be modified to cope with this requirement. The system also had to be made adaptable, in addition to being adaptive, in order to satisfy the elicited users' preferences.

Pure OAI Repository
