38 research outputs found

    Automatic natural language generation applied to alternative and augmentative communication for online video content services using SimpleNLG for Spanish

    We present our work on building the Spanish version of SimpleNLG by adapting it and writing new code to satisfy Spanish linguistic requirements. Beyond this adaptation, we have also built an enhanced library that needs only the main words as input and conducts the generation process on its own. The adaptation uses aLexiS, a complete and reliable lexicon with morphology that we created. The enhanced version, in turn, uses Elsa, created from the pictogram domain, which also contains the syntactic and semantic information needed to conduct the generation process automatically. Both the adaptation and its enhanced version can be integrated into a range of applications, including web applications, to provide them with natural language generation functionality. We provide a use case of the system focused on Augmentative and Alternative Communication and online video content services.
    Funding: Xunta de Galicia | Ref. GRC2014/046; Xunta de Galicia | Ref. ED341D R2016/012; Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-

    A System for Automatic English Text Expansion

    This work was supported in part by the Mineco, Spain, under Grant TEC2016-76465-C2-2-R, in part by the Xunta de Galicia, Spain, under Grant GRC-2018/53 and Grant ED341D R2016/012, and in part by the University of Vigo Travel Grant to visit the CLAN Research Group, University of Aberdeen, U.K. Peer reviewed.

    Automatic generation of textual descriptions in data-to-text systems using a fuzzy temporal ontology: Application in air quality index data series

    In this paper we present a model based on computational intelligence and natural language generation for the automatic generation of textual summaries from numerical data series, aiming to provide insights that help users understand the relevant information hidden in the data. Our model includes a fuzzy temporal ontology with temporal references, which addresses the problem of managing imprecise temporal knowledge, a relevant issue in data series. We fully describe a real use case in the field of environmental information systems, providing linguistic descriptions of the air quality index (AQI), a very well-known indicator provided by all meteorological agencies worldwide. We consider two different sources of real AQI data provided by the official Galician (NW Spain) Meteorology Agency: (i) the AQI distribution across the stations of the meteorological observation network and (ii) time series that describe the state and evolution of the AQI at each meteorological station. Both application models were evaluated following the current standards and good practices for manual human expert evaluation in the Natural Language Generation field. The assessment results from two expert meteorologists were very satisfactory, which empirically confirms that the proposed textual descriptions fit this type of data and service in both content and layout.
    This research was funded by the Spanish Ministry for Science, Innovation and Universities (grants TIN2017-84796-C2-1-R, PID2020-112623GB-I00, and PDC2021-121072-C21) and the Galician Ministry of Education, University and Professional Training, Spain (grants ED431C2018/29 and ED431G2019/04). All grants were co-funded by the European Regional Development Fund (ERDF/FEDER program).

    A system for automatic English text expansion

    We present an automatic text expansion system for generating English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, “automatic” means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design has been modular and adaptable to other languages, and this adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution in its own right. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations), using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English with that in Spanish, using an ad-hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.
    Funding: Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-R; Xunta de Galicia | Ref. GRC-2018/53; Xunta de Galicia | Ref. ED341D R2016/012; University of Aberdee

    The DipInfo-UniTo system for SRST 2018


    Automated Semantic Analysis, Legal Assessment, and Summarization of Standard Form Contracts

    Consumers are confronted with standard form contracts on a daily basis, for example, when shopping online, registering for online platforms, or opening bank accounts. With an expected revenue of more than 343 billion euros in 2020, e-commerce is an ever more important branch of the European economy. Accepting standard form contracts is often a prerequisite for accessing products or services, and consumers frequently do so without reading, let alone understanding, them. Consumer protection organizations can advise and represent consumers in such situations of power imbalance. However, with increasing demand, limited budgets, and ever more complex regulations, they struggle to provide the necessary support. This thesis investigates techniques for the automated semantic analysis, legal assessment, and summarization of standard form contracts in German and English, which can be used to support consumers and those who protect them. We focus on Terms and Conditions from the fast-growing market of European e-commerce, but also show that the developed techniques can in part be applied to other types of standard form contracts. We elicited requirements from consumers and consumer advocates to understand their needs, identified the most relevant clause topics, and analyzed the processes in consumer protection organizations concerning the handling of standard form contracts. Based on these insights, a pipeline for the automated semantic analysis, legal assessment, and summarization of standard form contracts was developed. The components of this pipeline can automatically identify and extract standard form contracts from the internet and hierarchically structure them into their individual clauses. Clause topics can be automatically identified, and relevant information can be extracted. Clauses can then be legally assessed, either using a knowledge base we constructed or through binary classification by a transformer model. This information is then used to create summaries that are tailored to the needs of the different user groups. For each step of the pipeline, different approaches were developed and compared, from classical rule-based systems to deep learning techniques. Each approach was evaluated on German and English corpora containing more than 10,000 clauses, which were annotated as part of this thesis. The developed pipeline was prototypically implemented as part of a web-based tool to support consumer advocates in analyzing and assessing standard form contracts. The implementation was evaluated with experts from two German consumer protection organizations using questionnaires and task-based evaluations. The results of the evaluation show that our system can identify over 50 different types of clauses, which cover more than 90% of the clauses typically occurring in Terms and Conditions from online shops, with an accuracy of 0.80 to 0.84. The system can also automatically extract 21 relevant data points from these clauses with a precision of 0.91 and a recall of 0.86. On a corpus of more than 200 German clauses, the system was also able to assess the legality of clauses with an accuracy of 0.90. The expert evaluation has shown that the system is indeed able to support consumer advocates in their daily work by reducing the time they need to analyze and assess clauses in standard form contracts.