13 research outputs found

    Diccionarios bilingües y aprendizaje de lengua extranjera: hechos y opiniones (Bilingual dictionaries and foreign language learning: facts and opinions)

    Despite the predominantly communicative approaches to foreign language (FL) learning, learners still feel the need for almost immediate access to the meaning or form of foreign words. With this premise in mind, we conducted the present study on the bilingual dictionary with a twofold goal. First, we compare the effect on vocabulary learning of a task that requires the dictionary with that of three other tasks in which it is not used. Then, the results of the dictionary task are compared with the participants' opinions about the dictionary. Results suggest that dictionary use is not as effective as expected. Yet a positive attitude towards this tool prevails among the best performers.

    The representation of migrants in Spanish judicial decisions: using corpus data to refute hate speech

    © 2022. The authors. This document is made available under the CC-BY 4.0 license (http://creativecommons.org/licenses/by/4.0/). This document is the submitted version of a published work that appeared in final form in Corpora.
    The phenomenon of immigration and its depiction in media texts have been examined extensively within the field of corpus-based discourse analysis (Gabrielatos and Baker, 2008; Baker et al., 2013; Blinder and Allen, 2016). This research examines how immigration is reflected in a corpus of 600 judicial decisions issued by Spanish courts in 2016 and 2017. The analysis was motivated by the rise in recent years of extreme right-wing parties in Europe, which dehumanise immigrants and portray them as a threat to the welfare state. On a first approach, the results appear to dissociate immigration and crime: a considerable percentage of the keywords obtained (c. 20%) revolves around three major topoi, namely family, territory/access, and legal punishment, and the top-ranking lexicon shows no evidence of any major offences or crimes. The study of the collocate networks of the keywords (KWs) within the category legal punishment confirms this initial perception. In fact, out of 21 collocates, only the word delito (crime) itself collocates with terms referring to typified crimes such as violencia (violence). In parallel, the data were triangulated using the text-classification software UMTextStats (García-Díaz et al., 2018). The results of this second analysis confirm our initial observations.

    Developing and analyzing a Spanish corpus for forensic purposes

    This paper presents the methods for developing a database of Spanish writing that can be used for forensic linguistic research, including our data collection procedures. Specifically, the main instrument used for data collection has been translated into Spanish and adapted from Chaski (2001). It consists of ten tasks in which the subjects are asked to write formal and informal texts on different topics. To date, 93 undergraduates from Spanish universities have participated in the study, as have prisoners convicted of gender-based abuse. A twofold analysis has been performed, approaching the collected data from a semantic and a morphosyntactic perspective. For the semantic analysis, psycholinguistic categories have been used, many of them taken from the LIWC dictionary (Pennebaker et al., 2001). To obtain a more comprehensive depiction of the linguistic data, further ad hoc categories have been created from the corpus itself, using a double-check method for their validation so as to ensure inter-rater reliability. For the morphosyntactic analysis, the natural language processing tool ALIAS TATTLER is being developed for Spanish. Results show that it is possible to differentiate non-abusers from abusers with high accuracy on the basis of linguistic features.

    Resumen de FinancES 2023: Análisis de Sentimiento Dirigido en Español sobre Finanzas (Overview of FinancES 2023: Targeted Sentiment Analysis in Spanish on Finance)

    This paper presents the FinancES 2023 shared task, organized in the IberLEF 2023 workshop, within the framework of the 39th International Conference of the Spanish Society for Natural Language Processing (SEPLN 2023). The aim of this task is to extend the challenge of sentiment analysis in Spanish to the financial domain, in order to extract the sentiment that a piece of financial information can have for several actors, including the main economic target (i.e., the specific company or asset where the economic fact applies), other companies (i.e., the entities producing the goods and services that others consume) and consumers (i.e., households/individuals). Specifically, two tasks are proposed and evaluated separately. The first is to identify the main target of a financial news item and to determine the sentiment polarity towards that target. The second is to assess the sentiment towards both other companies and consumers. The ranking includes results for 10 different teams proposing novel approaches, mostly based on Transformers and generative language models.
    This work is part of the research project AIInFunds (PDC2021-121112-I00) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. This work is also part of the research project LaTe4PSP (PID2019-107652RB-I00) funded by MCIN/AEI/10.13039/501100011033 and the research project LaTe4PoliticES (PID2022-138099OB-I00) funded by MCIN/AEI/10.13039/501100011033 and the European Fund for Regional Development (FEDER), a way to make Europe.

    KBS4FIA: Sistema inteligente basado en conocimiento para análisis de información financiera (KBS4FIA: a knowledge-based intelligent system for financial information analysis)

    Decision making takes place in an environment of uncertainty. Therefore, it is necessary to have information that is as accurate and complete as possible in order to minimize the risk inherent to the decision-making process. In the financial domain, the situation becomes even more critical due to the intrinsic complexity of the analytical tasks within this field. The main aim of the KBS4FIA project is to automate the processes associated with financial analysis by leveraging technological advances in natural language processing, ontology learning and population, ontology evolution, opinion mining, the Semantic Web and Linked Data. This project is being developed by the TECNOMOD research group at the University of Murcia and has been funded by the Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund (ERDF) through the Spanish National Plan for Scientific and Technical Research and Innovation Aimed at the Challenges of Society.
    This project has been funded by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER / ERDF) through project KBS4FIA (TIN2016-76323-R).

    Transcription, indexing and automatic analysis of judicial declarations from phonetic representations and techniques of forensic linguistics

    Recent technological advances have made it possible to improve the search for information in the judicial files of the Ministry of Justice associated with a trial. However, when technicians and judicial experts examine evidence in multimedia files, such as videos or audio fragments, they must manually search the document to locate the fragment at issue, which is a tedious and time-consuming task. To ease this task, we propose a system that allows automatic transcription and indexing of multimedia content, based on deep-learning technologies, in noisy environments and with multiple speakers. The system also makes it possible to apply forensic linguistics techniques to the data, helping experts analyse witness statements so that evidence on their veracity is provided.
    This project has been funded by the Instituto de Fomento de la Región de Murcia with ERDF (FEDER) funds under the project with reference 2018.08.ID+I.0025.

    Estudio de los rasgos lingüísticos de la mentira en el medio escrito: un análisis contrastivo inglés-español = Featuring deception in written language: a contrastive study of English and Spanish

    The main aim of this PhD thesis is to analyse the linguistic cues to deception in written language in both English and Spanish by means of a contrastive analysis of the two languages. For this purpose, several automatic classification experiments have been performed on two ad hoc corpora, one per language, in order to test whether the texts could be successfully classified on the basis of their truth value. In the first set of experiments, a machine learning technique has been applied to the data and compared to a Bag-of-Words model, obtaining a maximum success rate of 78.5% for English and 84.5% for Spanish. The second experiment involved two statistical techniques, namely discriminant function analysis and binary logistic regression, and the classification results proved equally satisfactory. In addition, the results confirm the leading role in written deception of parameters such as text length, self-references, insight and exclusive words.
    Keywords: computational linguistics, deception detection, contrastive analysis, automatic classification

    Detecting deception in written language

    Deception in language has been studied from the perspective of several disciplines, the most recent being opinion mining. Within this framework, the present study attempts to explore deception cues in written Spanish, which, to the best of our knowledge, have not been investigated yet. For this purpose, we have developed a framework based on a Support Vector Machine (SVM) classifier in order to detect deception in an ad hoc opinion corpus. We have used the psycholinguistic categories defined in LIWC (Pennebaker, Francis and Booth, 2001), through its four broad dimensions, to train the classifier. The findings reveal that truthful and deceptive texts in Spanish are indeed separable, with the first two dimensions, linguistic and psychological processes, being the most relevant for this purpose.
    This work has been funded by the Ministry of Science and Innovation (Ministerio de Ciencia e Innovación) through the SeCloud project (TIN2010-18650). In addition, Ángela Almela is supported by the Fundación Séneca (12406/FPI/09).