8 research outputs found

    The Need for Good Old Fashioned AI and Law

    Crowdsourcing for image metadata : a comparison between game-generated tags and professional descriptors

    One way to address the challenge of creating metadata for digitized image collections is to rely on user-created index terms, typically by harvesting tags from the collaborative information services known as folksonomies or by allowing the users to tag directly in the catalog. An alternative method, only recently applied in cultural heritage institutions, is Human Computation Games, a crowdsourcing tool that relies on user agreement to create valid tags. This study contributes to the research by investigating tags (at various degrees of validation) generated by a Human Computation Game and comparing them to descriptors assigned to the same images by professional indexers. The analysis is done by classifying tags and descriptors by term-category, as well as by measuring overlap between the tags and the descriptors on both the syntactic level (matching on terms) and the semantic level (matching on meaning). The findings show that validated tags tend to describe ‘artifacts/objects’ and that game-generated tags typically represent what is in the picture, rather than what it is about. Descriptors also primarily belonged to this term-category but included a substantial number of ‘Proper nouns’, mainly named locations. Tags generated by the game but not validated by player agreement had a higher frequency of ‘subjective/narrative’ tags, but also more errors. The exact (character-for-character) overlap, i.e. the number of common terms compared to the entire pool of tags and descriptors, was slightly less than 5% for all types of tags. Extending the analysis to include fuzzy (word-stem) matching more than doubled the overlap. The semantic overlap was established with thesaurus relations between a sample of tags and descriptors, and adopting this more inclusive view of overlap increased the percentage of tags that were matched to descriptors. More than half of the validated tags had some thesaurus relation to a descriptor added by a professional indexer. Approximately 60% of the thesaurus relations between descriptors and valid tags were either ‘same’ or ‘equivalent’, roughly 20% were associative, and 20% were hierarchical. For the hierarchical relations, it was found that tags typically describe images at a less specific level than descriptors. Joint Master Degree in Digital Library Learning (DILL)
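
    The exact and fuzzy (word-stem) overlap measures described in the abstract above can be illustrated with a minimal Python sketch. It assumes that "overlap" means the number of terms common to both sets divided by the pooled set of distinct tags and descriptors, which is one reading of the abstract. The function names `crude_stem` and `overlap_ratio`, the toy stemmer, and the sample terms are hypothetical and are not taken from the study.

```python
def crude_stem(term: str) -> str:
    """Toy stand-in for a real stemmer (an assumption, not the study's tool):
    lowercases the term and strips a trailing "ing" or "s"."""
    t = term.lower().strip()
    for suffix in ("ing", "s"):
        # Keep at least three characters so short words are left alone.
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[: -len(suffix)]
    return t


def overlap_ratio(tags, descriptors, normalize=lambda s: s):
    """Share of terms common to both sets, relative to the pooled set of
    distinct terms. With the default identity `normalize`, matching is
    exact (character-for-character)."""
    tag_set = {normalize(t) for t in tags}
    desc_set = {normalize(d) for d in descriptors}
    pool = tag_set | desc_set
    if not pool:
        return 0.0
    return len(tag_set & desc_set) / len(pool)


# Hypothetical sample data for one image.
tags = ["boat", "houses", "harbour"]
descriptors = ["boats", "house", "harbour", "Bergen"]

exact = overlap_ratio(tags, descriptors)              # only "harbour" matches
fuzzy = overlap_ratio(tags, descriptors, crude_stem)  # stems also match
print(f"exact: {exact:.2f}, fuzzy: {fuzzy:.2f}")
```

    On this toy input, stem matching raises the measured overlap because singular and plural forms collapse to the same term. A real replication would presumably substitute an established stemmer for `crude_stem`.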

    Generic semantics-based task-oriented dialogue system framework for human-machine interaction in industrial scenarios

    285 p. In Industry 5.0, workers and their well-being are crucial to the production process. In this context, task-oriented dialogue systems allow operators to delegate the simplest tasks to industrial systems while they work on more complex ones. In addition, the possibility of interacting naturally with these systems reduces the cognitive load of using them and fosters user acceptance. However, most existing solutions do not allow natural communication, and current techniques for building such systems need large amounts of training data, which are scarce in this type of scenario. As a result, task-oriented dialogue systems in the industrial domain are highly specific, which limits their capacity to be modified or reused in other scenarios, tasks that entail considerable effort in terms of time and cost. Given these challenges, this thesis combines Semantic Web Technologies with Natural Language Processing techniques to develop KIDE4I, a semantic task-oriented dialogue system for industrial environments that enables natural communication between humans and industrial systems. The modules of KIDE4I are designed to be generic so that they can be easily adapted to new use cases. The modular ontology TODO is the core of KIDE4I; it models the domain and the dialogue process and stores the generated traces. KIDE4I has been implemented and adapted for use in four industrial use cases, demonstrating that the adaptation process is not complex and benefits from the use of resources

    CONCAT - Connotation Analysis of Thesauri Based on the Interpretation of Context Meaning

    Knowledge acquisition constitutes the bottleneck for the creation of legal expert systems. Legal language must be formalised to such a degree that it can be processed automatically. We deal with this problem by supporting the process of creating a selective thesaurus for a legal information system, which can be seen as a prerequisite for further knowledge processing. This selectivity is obtained by means of connotation analysis of the individual descriptors, which makes it possible to detect hidden word meanings and to distinguish between precise legal terms and words with fuzzy meaning. Within the prototype system CONCAT we applied both a statistical and a connectionist approach to connotation analysis and performed a comparative evaluation of the achieved results. 1 Introduction Advanced use of information technology in the legal field requires formalisation of the legal data (e.g. statutes, treaties, court decisions or literature). Two main approaches are concerned with this ..

    Proceedings of the Eighth Italian Conference on Computational Linguistics CLiC-it 2021

    The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the 2020 edition, which was held in fully virtual mode due to the health emergency related to Covid-19, CLiC-it 2021 was the first opportunity for the Italian Computational Linguistics research community to meet in person after more than one year of full or partial lockdown.

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018 : 10-12 December 2018, Torino

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    Aspects of Record Linkage

    This thesis is an exploration of the subject of historical record linkage. The general goal of historical record linkage is to discover relations between historical entities in a database, for any specific definition of relation, entity and database. Although this task originates from historical research, multiple disciplines are involved. Increasing volumes of data necessitate the use of automated or semi-automated linkage procedures, which is in the domain of computer science. Linkage methodologies depend heavily on the nature of the data itself, often requiring analysis based on onomastics (i.e., the study of person names) or general linguistics. To understand the dynamics of natural language one could be tempted to look at the source of language, i.e., humans, either on the individual cognitive level or as group behaviour. This further increases the multidisciplinarity of the subject by including cognitive psychology. Every discipline addresses a subset of problem aspects, all of which can contribute either to practical solutions for linkage problems or to further insights into the subject matter. Algorithms and the Foundations of Software technolog

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.