
    Named Entity Recognition in multilingual handwritten texts

    In our work we present a single deep-learning-based model for the automatic transcription and Named Entity Recognition of handwritten texts. This model leverages the generalization capabilities of recognition systems, combining artificial neural networks and character n-gram models. The evaluation of this system is discussed and, as a consequence, a new evaluation metric is proposed. To improve the results with respect to this metric, different error correction strategies are assessed. Villanova Aparisi, D. (2021). Named Entity Recognition in multilingual handwritten texts. Universitat Politècnica de València. http://hdl.handle.net/10251/174942 (TFG)

    Recognition of handwritten music scores

    The recognition of handwritten music scores remains an open problem. Existing approaches can only deal with very simple handwritten scores, mainly because of the variability in handwriting style and in the composition of groups of music notes (i.e. compound music notes). In this work I first separate the isolated symbols (e.g. half-notes, quarter-notes, clefs, sharps) from the compound music notes and then study each group separately. The isolated symbols are recognized with symbol recognition methods, and the compounds with a primitive hierarchy and syntactic rules. The method has been tested on several handwritten music scores from the CVC-MUSCIMA database and compared with commercial Optical Music Recognition software. Given that my method is learning-free, the obtained results are promising.

    Digital Preservation, Archival Science and Methodological Foundations for Digital Libraries

    Digital libraries, whether commercial, public or personal, lie at the heart of the information society. Yet, research into their long‐term viability and the meaningful accessibility of their contents remains in its infancy. In general, as we have pointed out elsewhere, ‘after more than twenty years of research in digital curation and preservation the actual theories, methods and technologies that can either foster or ensure digital longevity remain startlingly limited.’ Research led by DigitalPreservationEurope (DPE) and the Digital Preservation Cluster of DELOS has allowed us to refine the key research challenges – theoretical, methodological and technological – that need attention by researchers in digital libraries during the coming five to ten years, if we are to ensure that the materials held in our emerging digital libraries are to remain sustainable, authentic, accessible and understandable over time. Building on this work and taking the theoretical framework of archival science as bedrock, this paper investigates digital preservation and its foundational role if digital libraries are to have long‐term viability at the centre of the global information society.

    Debugging Inputs

    When a program fails to process an input, it need not be the program code that is at fault. It can also be that the input data is faulty, for instance as a result of data corruption. To get the data processed, one then has to debug the input data, that is, (1) identify which parts of the input data prevent processing, and (2) recover as much of the (valuable) input data as possible. In this paper, we present a general-purpose algorithm called ddmax that addresses these problems automatically. Through experiments, ddmax maximizes the subset of the input that can still be processed by the program, thus recovering and repairing as much data as possible; the difference between the original failing input and the “maximized” passing input includes all input fragments that could not be processed. To the best of our knowledge, ddmax is the first approach that fixes faults in the input data without requiring program analysis. In our evaluation, ddmax repaired about 69% of input files and recovered about 78% of data within one minute per input.
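
    The idea of maximizing a passing subset of a failing input can be illustrated with a small sketch. The abstract does not give the algorithm's steps, so what follows is a simplified, delta-debugging-style approximation and not the authors' implementation. The names `ddmax_sketch`, `split`, and `passes` are hypothetical, and the granularity schedule (start with two chunks, double on failure) is an assumption.

```python
from typing import Callable, List, Sequence, Set


def split(items: List[int], n: int) -> List[List[int]]:
    """Split items into n contiguous, near-equal, non-empty chunks (needs n <= len(items))."""
    chunks, start = [], 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        chunks.append(items[start:end])
        start = end
    return chunks


def ddmax_sketch(data: Sequence, passes: Callable[[list], bool]) -> list:
    """Grow a passing subset of `data` by experiment (assumes the empty input passes)."""
    everything: Set[int] = set(range(len(data)))
    kept: Set[int] = set()  # indices known to pass together
    n = 2                   # current number of chunks (assumed schedule)

    def build(indices: Set[int]) -> list:
        return [data[i] for i in sorted(indices)]

    while True:
        missing = sorted(everything - kept)
        if not missing:
            return build(kept)
        n = min(n, len(missing))
        chunks = split(missing, n)
        grown, next_n = None, n

        # Case 1: try the full input minus a single chunk (largest possible jump).
        if len(chunks) > 1:
            for chunk in chunks:
                candidate = everything - set(chunk)
                if passes(build(candidate)):
                    grown, next_n = candidate, 2
                    break

        # Case 2: try adding a single chunk to what already passes.
        if grown is None:
            for chunk in chunks:
                candidate = kept | set(chunk)
                if passes(build(candidate)):
                    grown, next_n = candidate, max(n - 1, 2)
                    break

        if grown is not None:
            kept, n = grown, next_n
        elif n < len(missing):
            n = min(len(missing), 2 * n)  # no progress: use finer chunks
        else:
            return build(kept)  # no single remaining element can be added


# Toy "program" that only processes digit characters.
recovered = "".join(ddmax_sketch(list("12x45y7"), lambda cs: all(c.isdigit() for c in cs)))
```

    In this toy run the two non-digit characters play the role of corrupted fragments: they are exactly the difference between the original input and `recovered`. The program under test is used only as a pass/fail oracle, which matches the abstract's point that no program analysis is required.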

    The use of data-mining for the automatic formation of tactics

    This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques.
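
    As a rough illustration of the pattern-finding step only, the sketch below counts contiguous sequences of proof steps that recur across a corpus. It is an assumption-laden toy and not the paper's method: the proof-step names, the function `frequent_step_sequences`, and the choice of fixed-length contiguous sequences with a support threshold are all hypothetical, and the genetic-programming stage is not covered.

```python
from collections import Counter
from typing import Dict, List, Tuple


def frequent_step_sequences(
    proofs: List[List[str]], length: int = 2, min_support: int = 2
) -> Dict[Tuple[str, ...], int]:
    """Return step sequences of the given length that occur in at least min_support proofs."""
    support: Counter = Counter()
    for steps in proofs:
        # Use a set so each sequence is counted at most once per proof.
        seen = {tuple(steps[i:i + length]) for i in range(len(steps) - length + 1)}
        support.update(seen)
    return {seq: count for seq, count in support.items() if count >= min_support}


# Hypothetical corpus: each proof is a list of applied proof-step names.
corpus = [
    ["intro", "simp", "auto"],
    ["intro", "simp", "rewrite"],
    ["cases", "auto"],
]
patterns = frequent_step_sequences(corpus)
```

    Here `patterns` contains only the pair `("intro", "simp")`, which appears in two of the three proofs; such recurring sequences are the kind of candidate a later search stage could refine.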

    Business Ontology for Evaluating Corporate Social Responsibility

    This paper presents a software solution developed to automatically classify companies according to their level of social responsibility. The application is based on ontologies and on intelligent agents. In order to obtain the data needed to evaluate companies, we developed a web crawling module that analyzes the company’s website and the documents that are available online, such as the social responsibility report, mission statement, employment structure, etc. Based on a predefined CSR ontology, the web crawling module extracts the terms that are linked to corporate social responsibility. By taking into account the extracted qualitative data, an intelligent agent, previously trained on a set of companies, computes the qualitative values, which are then included in the classification model based on neural networks. The proposed ontology takes into consideration the guidelines proposed by the “ISO 26000 Standard for Social Responsibility”. With this model, and given the positive relationship between Corporate Social Responsibility and financial performance, an overall perspective on each company’s activity can be built, which is useful not only to the company’s creditors, auditors and stockholders, but also to its consumers.
    Keywords: corporate social responsibility, ISO 26000 Standard for Social Responsibility, ontology, web crawling, intelligent agent, corporate performance, POS tagging, opinion mining, sentiment analysis