93 research outputs found

    A New Multilingual Authoring Tool of Semistructured Legal Documents

    Get PDF
     Los enfoques actuales de gestión de la documentación multilingüe hacen uso de la traducción humana, la traducción automática (TA) y la traducción asistida por ordenador (TAO) para producir versiones de un solo documento en variosidiomas. Sin embargo, losrecientes avances en generación de lenguaje natural (GLN) indican que es posible implementarsistemas independientes del lenguaje a fin de producir documentos en variosidiomas, independientes de una lengua origen, de forma más eficiente y rentable. En este artículo presentamos GenTur —una herramienta de ayuda a la redacción para producir contratosturísticos en variosidiomas. Se prestará especial atención a dos elementos básicos de su implementación: por un lado, la interlengua xgtling usada para la representación discursiva de los contratos, y por otro lado, el desarrollo de una arquitectura que permita a la citada interlengua generar contratosturísticos por medio del algoritmo de generación GT-Mth

    A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture

    Full text link
    This study presents a method for implementing generative AI services by utilizing the Large Language Model (LLM) application architecture. With recent advancements in generative AI technology, LLMs have gained prominence across various domains. In this context, the research addresses the challenge of information scarcity and proposes specific remedies by harnessing LLM capabilities. The investigation delves into strategies for mitigating the issue of inadequate data, offering tailored solutions. The study delves into the efficacy of employing fine-tuning techniques and direct document integration to alleviate data insufficiency. A significant contribution of this work is the development of a Retrieval-Augmented Generation (RAG) model, which tackles the aforementioned challenges. The RAG model is carefully designed to enhance information storage and retrieval processes, ensuring improved content generation. The research elucidates the key phases of the information storage and retrieval methodology underpinned by the RAG model. A comprehensive analysis of these steps is undertaken, emphasizing their significance in addressing the scarcity of data. The study highlights the efficacy of the proposed method, showcasing its applicability through illustrative instances. By implementing the RAG model for information storage and retrieval, the research not only contributes to a deeper comprehension of generative AI technology but also facilitates its practical usability within enterprises utilizing LLMs. This work holds substantial value in advancing the field of generative AI, offering insights into enhancing data-driven content generation and fostering active utilization of LLM-based services within corporate settings

    Natural Language Interfaces to Data

    Full text link
    Recent advances in NLU and NLP have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences as well as the generated SQL queries has evolved over time. More recently, there has also been a lot of focus on using conversational interfaces for data analytics, empowering a line of non-technical users with quick insights into the data. There are three main challenges in natural language querying (NLQ): (1) identifying the entities involved in the user utterance, (2) connecting the different entities in a meaningful way over the underlying data source to interpret user intents, and (3) generating a structured query in the form of SQL or SPARQL. There are two main approaches for interpreting a user's NLQ. Rule-based systems make use of semantic indices, ontologies, and KGs to identify the entities in the query, understand the intended relationships between those entities, and utilize grammars to generate the target queries. With the advances in deep learning (DL)-based language models, there have been many text-to-SQL approaches that try to interpret the query holistically using DL models. Hybrid approaches that utilize both rule-based techniques as well as DL models are also emerging by combining the strengths of both approaches. Conversational interfaces are the next natural step to one-shot NLQ by exploiting query context between multiple turns of conversation for disambiguation. In this article, we review the background technologies that are used in natural language interfaces, and survey the different approaches to NLQ. We also describe conversational interfaces for data analytics and discuss several benchmarks used for NLQ research and evaluation.Comment: The full version of this manuscript, as published by Foundations and Trends in Databases, is available at http://dx.doi.org/10.1561/190000007

    PaLM: Scaling Language Modeling with Pathways

    Full text link
    Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies

    Surface Realisation from Knowledge-Bases

    Get PDF
    International audienceWe present a simple, data-driven approach to generation from knowledge bases (KB). A key feature of this approach is that grammar induction is driven by the extended domain of locality principle of TAG (Tree Adjoining Grammar); and that it takes into account both syntactic and semantic information. The resulting extracted TAG includes a unification based semantics and can be used by an existing surface realiser to generate sentences from KB data. Experimental evaluation on the KBGen data shows that our model outperforms a data-driven generate-and-rank approach based on an automatically induced probabilistic grammar; and is comparable with a handcrafted symbolic approach

    Adaptive hypertext and hypermedia : workshop : proceedings, 3rd, Sonthofen, Germany, July 14, 2001 and Aarhus, Denmark, August 15, 2001

    Get PDF
    This paper presents two empirical usability studies based on techniques from Human-Computer Interaction (HeI) and software engineering, which were used to elicit requirements for the design of a hypertext generation system. Here we will discuss the findings of these studies, which were used to motivate the choice of adaptivity techniques. The results showed dependencies between different ways to adapt the explanation content and the document length and formatting. Therefore, the system's architecture had to be modified to cope with this requirement. In addition, the system had to be made adaptable, in addition to being adaptive, in order to satisfy the elicited users' preferences

    Adaptive hypertext and hypermedia : workshop : proceedings, 3rd, Sonthofen, Germany, July 14, 2001 and Aarhus, Denmark, August 15, 2001

    Get PDF
    This paper presents two empirical usability studies based on techniques from Human-Computer Interaction (HeI) and software engineering, which were used to elicit requirements for the design of a hypertext generation system. Here we will discuss the findings of these studies, which were used to motivate the choice of adaptivity techniques. The results showed dependencies between different ways to adapt the explanation content and the document length and formatting. Therefore, the system's architecture had to be modified to cope with this requirement. In addition, the system had to be made adaptable, in addition to being adaptive, in order to satisfy the elicited users' preferences
    corecore