
    Ontology and Natural Language Processing

    At present, the convergence of several areas of knowledge has led to the design and implementation of ICT systems that integrate heterogeneous tools such as artificial intelligence (AI), statistics and databases, among others. In computing, ontologies belong to the field of AI and refer to formal representations of an area of knowledge or domain. Ontological engineering is the discipline concerned with the study and construction of tools that accelerate the creation of ontologies from natural language. In this paper, we propose a knowledge management model based on the clinical histories of patients in Panama, built on information extraction, natural language processing and the development of a domain ontology.
    Keywords: knowledge, information extraction, ontology, automatic population of ontologies, natural language processing
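The pipeline the abstract describes — information extraction over clinical text feeding the automatic population of a domain ontology — can be sketched minimally as follows. The class names, regular-expression patterns and note text below are hypothetical stand-ins; a real system would use a full NLP pipeline rather than regular expressions.

```python
import re

# Toy domain ontology: each class maps to a set of extracted instances.
ontology = {"Patient": set(), "Diagnosis": set(), "Treatment": set()}

# Hypothetical extraction patterns over clinical-note text.
PATTERNS = {
    "Diagnosis": re.compile(r"diagnosed with (\w+)"),
    "Treatment": re.compile(r"treated with (\w+)"),
}

def populate(ontology, text):
    """Information-extraction step: add each matched entity
    as an instance of its ontology class."""
    for cls, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            ontology[cls].add(match)
    return ontology

note = "Patient A was diagnosed with asthma and treated with salbutamol."
populate(ontology, note)
print(sorted(ontology["Diagnosis"]))  # ['asthma']
```

In a full system the flat sets would be replaced by an OWL/RDF store, but the extract-then-assert loop is the core of automatic ontology population.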

    Evaluating machine translation in a low-resource language combination: Spanish-Galician

    This paper reports the results of a study designed to assess the perceived adequacy of three different types of machine translation systems within the context of a minoritized language combination (Spanish-Galician). To perform this evaluation, a mixed design with three different metrics (BLEU, a survey and error analysis) is used to extract quantitative and qualitative data about two marketing letters from the energy industry translated with a rule-based system (RBMT), a phrase-based system (PBMT) and a neural system (NMT). Results show that in the case of low-resource languages, rule-based and phrase-based machine translation systems still play an important role.
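BLEU, one of the three metrics the study combines, scores a candidate translation by its n-gram overlap with a reference. A minimal sentence-level implementation (clipped n-gram precisions combined by geometric mean, with a brevity penalty) might look like the sketch below; the example sentences are invented, not from the paper's corpus.

```python
from collections import Counter
from math import exp, log

def modified_precision(cand, ref, n):
    """Clipped n-gram precision: each candidate n-gram count is
    capped by its count in the reference."""
    c_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    r_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    clipped = sum(min(c, r_ngrams[g]) for g, c in c_ngrams.items())
    return clipped / max(sum(c_ngrams.values()), 1)

def bleu(cand, ref, max_n=2):
    """Sentence-level BLEU: geometric mean of n-gram precisions
    up to max_n, times a brevity penalty for short candidates."""
    precisions = [modified_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(cand) > len(ref) else exp(1 - len(ref) / len(cand))
    return bp * exp(sum(log(p) for p in precisions) / max_n)

ref = "the cat is on the mat".split()
cand = "the cat is on mat".split()
print(round(bleu(cand, ref), 3))  # precisions 5/5 and 3/4, BP = e^(-0.2): 0.709
```

Production evaluations use corpus-level BLEU up to 4-grams with smoothing, but the clipping and brevity-penalty logic is the same.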

    Automatic generation of single-document abstractive summaries using semantic and discourse analysis

    The web is a giant resource of data and information about security, health, education and other matters of great utility to people, but producing a synthesis or abstract of one or many documents is expensive labour, which may be impossible to do manually given the huge amount of data. Abstract generation is a challenging task because it involves analysing and comprehending text written in unstructured, context-dependent natural language, and it must describe a synthesis of events or knowledge in a simple form that reads naturally to any reader. Summarization approaches are categorized as extractive or abstractive. In the extractive approach, summaries are generated by selecting the most salient sentences from the source text. Abstractive summaries are created by regenerating the content extracted from the source text: phrases are reformulated through fusion, compression or suppression of terms, yielding paraphrased sentences or even sentences that did not appear in the original text. This type of summary has a greater probability of achieving the coherence and smoothness of one generated by a human being. The present work implements a method that integrates syntactic, semantic (AMR annotation) and discursive (RST) information into a conceptual graph, which is then summarized using a new measure of concept similarity over WordNet. To find the most relevant concepts we apply PageRank, weighted with the discursive information obtained by applying the O'Donnell method. With the most important concepts and the semantic-role information obtained from PropBank, a natural language generation step was implemented with the SimpleNLG tool. We report the results of applying this method to the corpus of the Document Understanding Conference 2002, evaluated with the ROUGE metric, which is widely used in automatic summarization.
Our method reaches an F1 score of 24% on the ROUGE-1 metric for the single-document abstractive summarization task, showing that these techniques are workable and pointing to profitable configurations and useful tools for the task.
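The concept-ranking step can be illustrated with a plain power-iteration PageRank over a small concept graph. The graph below is invented for illustration and carries none of the paper's RST-based weighting, which would bias the scores toward discursively prominent concepts.

```python
def pagerank(graph, damping=0.85, iters=50):
    """Plain power-iteration PageRank over an adjacency-list graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n, out in graph.items():
            targets = out if out else nodes  # dangling nodes spread mass evenly
            for m in targets:
                new[m] += damping * rank[n] / len(targets)
        rank = new
    return rank

# Invented concept graph: edges link concepts that co-occur in the text.
concepts = {"storm": ["damage"], "damage": ["storm", "repair"], "repair": ["damage"]}
rank = pagerank(concepts)
print(max(rank, key=rank.get))  # the most connected concept: damage
```

The most important concepts under this ranking would then feed the SimpleNLG realization step described in the abstract.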

    AMIC: Affective multimedia analytics with inclusive and natural communication

    Traditionally, textual content has been the main source for information extraction and indexing; technologies capable of extracting information from the audio and video of multimedia documents joined later. The other major axis of analysis is the emotional and affective aspect intrinsic to human communication. This information on emotions, stances, preferences, figurative language, irony, sarcasm, etc. is fundamental and irreplaceable for a complete understanding of the content of conversations, speeches, debates, discussions, etc. The objective of this project is to advance, develop and improve speech and language technologies, as well as image and video technologies, for the analysis of multimedia content, adding to this analysis the extraction of affective-emotional information. As additional steps forward, we will advance methodologies for presenting the information to the user, working on technologies for language simplification, automatic report and summary generation, emotional speech synthesis, and natural and inclusive interaction.

    A lexicon-filtering approach for author profiling

    This paper studies the influence of a general Spanish lexicon and a domain-specific lexicon on a text classification problem. Specifically, we address the impact of the choice of lexicons for user modelling. To do so, we identify gender and profession as demographic traits, and political ideology as a psychographic trait, from a set of tweets. We experimented with machine learning and supervised learning methods to create a prediction model with which we evaluated our specific lexicon. Our results show that the choice and/or construction of lexicons to support the resolution of this task can follow a given strategy, characterised by the domain of the lexicon and the type of words it contains.
    This work has been partially supported by projects Big Hug (P20 00956, PAIDI 2020) and WeLee (1380939, FEDER Andalucía 2014-2020), both funded by the Andalusian Regional Government; projects CONSENSO (PID2021-122263OB-C21), MODERATES (TED2021-130145B-I00) and Social-TOX (PDC2022-133146-C21), funded by Plan Nacional I+D+i from the Spanish Government; and project PRECOM (SUBV-00016), funded by the Ministry of Consumer Affairs of the Spanish Government.
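A minimal sketch of how the lexicon choice feeds a profiling classifier is shown below, reduced here to a single lexicon-coverage feature. Both lexicons and the tweet are invented examples, not the paper's actual resources, and a real model would combine many such features in a supervised learner.

```python
# Hypothetical lexicons; the paper contrasts a general Spanish lexicon
# with a domain-specific one built for the profiling task.
GENERAL_LEXICON = {"casa", "trabajo", "familia", "tiempo"}
DOMAIN_LEXICON = {"votar", "izquierda", "derecha", "gobierno"}

def lexicon_coverage(tokens, lexicon):
    """Fraction of tweet tokens covered by the lexicon — a minimal
    feature a supervised classifier could consume."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in lexicon)
    return hits / len(tokens)

tweet = "hay que votar contra el gobierno".split()
print(round(lexicon_coverage(tweet, DOMAIN_LEXICON), 3))  # 2 of 6 tokens: 0.333
```

Comparing the same feature computed against the general and the domain lexicon is one way to quantify the impact of lexicon choice that the abstract investigates.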

    Natural Language Generation: A Review of the State of the Art

    Language is one of the highest cognitive skills developed by human beings and, therefore, one of the most complex tasks to face from a computational perspective. Human-computer communication implies two different degrees of difficulty depending on the nature of that communication. If the language used is oriented towards the domain of the machine, there is no place for ambiguity, since it is restricted by rules. However, when the communication is in natural language, its flexibility and ambiguity become unavoidable. Computational Linguistics techniques are mandatory for machines to process human language. Among them, the area of Natural Language Generation (NLG) aims at the automatic development of techniques to produce human utterances, in text and speech. This paper presents a deep survey of this research area, describing the phases into which NLG systems are usually decomposed and the techniques applied in each, and taking into account different points of view on theories, methodologies, architectures, techniques and evaluation approaches, thus providing a review of the current situation and possible future research in the field.
    This research has been funded by the Generalitat Valenciana through the project DIIM2.0: Desarrollo de técnicas Inteligentes e Interactivas de Minería y generación de información sobre la web 2.0 (PROMETEOII/2014/001). It has also been partially funded by the European Commission through the SAM project (FP7-611312); by the Spanish Ministry of Economy and Competitiveness through the projects "Análisis de Tendencias Mediante Técnicas de Opinión Semántica" (TIN2012-38536-C03-03) and "Técnicas de Deconstrucción en la Tecnología del Lenguaje Humano" (TIN2012-31224); and by the Universidad de Alicante through the project "Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario" (GRE13-15).

    Neuroscience and education: can we go from basic research to its application? A possible frame of reference from research on dyslexia

    Neuroscience has the potential to transform education because it provides novel methods for understanding human learning and cognitive development. It therefore offers a deeper understanding of causal mechanisms in learning and an empirical approach to evaluating the efficacy of different pedagogies. However, this will be a long-term enterprise and there will be few immediate pay-offs. Here I set out one possible framework for linking basic research in neuroscience to pedagogical questions in education. I suggest that the developing field of educational neuroscience must first study how sensory systems build cognitive systems over developmental time. I focus on one cognitive system, language, the efficient functioning of which is critical for reading acquisition. Small initial differences in sensory function, for example auditory function, have the potential to cause large differences in linguistic performance over the learning trajectory. The tools offered by neuroscience can enable better understanding, in fine-grained detail, of the causal developmental mechanisms linking audition, phonological development and literacy development. Following this basic research, neuroscience can then inform education and pedagogy by exploring the effects of different learning contexts and pedagogies on these neural mechanisms.

    Polarity analysis of reviews based on the omission of asymmetric sentences

    In this paper, we present a novel approach to the polarity analysis of product reviews which detects and removes sentences whose polarity is opposite to that of the entire document (asymmetric sentences) as a preliminary step to identifying positive and negative reviews. We postulate that asymmetric sentences are morpho-syntactically more complex than symmetric ones (sentences with the same polarity as that of the entire document) and that it is possible to improve the detection of the polarity orientation of reviews by removing asymmetric sentences from the text. To validate this hypothesis, we measured the syntactic complexity of both types of sentences in a multi-domain corpus of product reviews and contrasted three relevant data configurations based on the inclusion and omission of asymmetric sentences from the reviews.
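The omission step can be sketched as a filter over per-sentence polarity scores. The scores and review text below are invented; in the paper's setting the per-sentence polarities would come from a separate classifier, and the filtered review would then be classified as positive or negative.

```python
def filter_asymmetric(sentences, doc_polarity):
    """Keep only sentences whose polarity sign matches the document's
    overall polarity; asymmetric sentences are omitted before
    classifying the review."""
    return [s for s, p in sentences if p * doc_polarity > 0]

# Invented review: (sentence, polarity score in [-1, 1]) pairs.
review = [
    ("The battery lasts for days.", +0.8),
    ("Shipping was slow, though.", -0.4),  # asymmetric in a positive review
    ("Overall a great purchase.", +0.9),
]
kept = filter_asymmetric(review, doc_polarity=+1)
print(len(kept))  # the asymmetric sentence is dropped: 2
```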