831 research outputs found

    Improving Cross-Lingual Transfer Learning for Event Detection

    The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there remain a considerable number of marginalized communities that are unable to benefit from these technological advancements, largely because of the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning.

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to develop largely without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Essays on Corporate Disclosure of Value Creation

    Information on a firm’s business model helps investors understand an entity’s resource requirements, priorities for action, and prospects (FASB, 2001, pp. 14-15; IASB, 2010, p. 12). Disclosures of strategy and business model (SBM) are therefore considered a central element of effective annual report commentary (Guillaume, 2018; IIRC, 2011). By applying natural language processing techniques, I explore what SBM disclosures look like when management are pressed to say something, analyse determinants of cross-sectional variation in SBM reporting properties, and assess whether and how managers respond to regulatory interventions seeking to promote SBM annual report commentary. This dissertation contains three main chapters. Chapter 2 presents a systematic review of the academic literature on non-financial reporting and the emerging literature on SBM reporting. Here, I also introduce my institutional setting. Chapter 3 and Chapter 4 form the empirical sections of this thesis. In Chapter 3, I construct the first large-sample corpus of SBM annual report commentary and provide the first systematic analysis of the properties of such disclosures. My topic modelling analysis rejects the hypothesis that such disclosure is merely padding; instead, I find that themes align with popular strategy frameworks and that management tailor the mix of SBM topics to reflect their unique approach to value creation. However, SBM commentary is less specific, less precise about time horizon (short- and long-term), and less balanced (more positive) in tone relative to general management commentary. My findings suggest that symbolic compliance and legitimisation characterise the typical annual report discussion of SBM. Further analysis identifies proprietary cost considerations and obfuscation incentives as key determinants of symbolic reporting.
    In Chapter 4, I seek evidence on how managers respond to regulatory mandates by adapting the properties of disclosure and investigate whether the form of the mandate matters. Using a differences-in-differences research design, my results suggest a modest incremental response by treatment firms to the introduction of a comply-or-explain provision to provide disclosure on strategy and business model. In contrast, I find a substantial response to enacting the same requirements in law. My analysis provides clear and consistent evidence that treatment firms incrementally increase the volume of SBM disclosure, improve coverage across a broad range of topics, and provide commentary with greater focus on the long term. My results point to substantial changes in SBM reporting properties following regulatory mandates, but the form of the mandate does matter. Overall, this dissertation contributes to the accounting literature by examining how firms discuss a topic central to economic decision making in annual reports and how firms respond to different forms of disclosure mandate. Furthermore, the results of my analysis are likely to be of value for regulators and policymakers currently reviewing or considering mandating disclosure requirements. By examining how companies adapt their reporting to different types of regulations, this study provides an empirical basis for recalibrating SBM disclosure mandates, thereby enhancing the information set of capital market participants and promoting stakeholder engagement in a landscape increasingly shaped by non-financial information.

    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and the promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the societal and attendant economic hopes and fears, utopias and dystopias that are associated with the current and future development of artificial intelligence.

    Evaluating automated and hybrid neural disambiguation for African historical named entities

    Documents detailing South African history contain ambiguous names. Ambiguous names may be due to people having the same name or the same person being referred to by multiple different names. Thus, when searching for or attempting to extract information about a particular person, the name used may affect the results. This problem may be alleviated by using a Named Entity Disambiguation (NED) system to disambiguate names by linking them to a knowledge base. In recent years, transformer-based language models have led to improvements in NED systems. Furthermore, multilingual language models have shown the ability to learn concepts across languages, reducing the amount of training data required in low-resource languages. Thus, a multilingual language model-based NED system was developed to disambiguate people's names within a historical South African context using documents written in English and isiZulu from the Five Hundred Year Archive (FHYA). The multilingual language model-based system substantially improved on a probability-based baseline and achieved a micro F1-score of 0.726. At the same time, the entity linking component was able to link 81.9% of the mentions to the correct entity. However, the system's performance on documents written in isiZulu was significantly lower than on the documents written in English. Thus, the system was augmented with handcrafted rules to improve its performance. The addition of handcrafted rules resulted in a small but significant improvement in performance when compared to the unaugmented NED system.

    Predicate Matrix: an interoperable lexical knowledge base for predicates

    The Predicate Matrix is a new lexical-semantic resource that results from integrating multiple knowledge sources, including FrameNet, VerbNet, PropBank and WordNet. The Predicate Matrix provides an extensive and robust lexicon that improves interoperability among these semantic resources. Its creation is based on the integration of Semlink with new mappings obtained by automatic methods that link semantic knowledge at the lexical and role levels. We have also extended the Predicate Matrix to cover nominal predicates (English, Spanish) and predicates in other languages (Spanish, Catalan and Basque). As a result, the Predicate Matrix provides a multilingual lexicon that enables interoperable semantic analysis across multiple languages.

    20th SC@RUG 2023 proceedings 2022-2023

    ALBERTI, a Multilingual Domain Specific Language Model for Poetry Analysis

    The computational analysis of poetry is limited by the scarcity of tools to automatically analyze and scan poems. In multilingual settings, the problem is exacerbated because scansion and rhyme systems exist only for individual languages, making comparative studies very challenging and time-consuming. In this work, we present ALBERTI, the first multilingual pre-trained large language model for poetry. Through domain-specific pre-training (DSP), we further trained multilingual BERT on a corpus of over 12 million verses from 12 languages. We evaluated its performance on two structural poetry tasks: Spanish stanza type classification, and metrical pattern prediction for Spanish, English and German. In both cases, ALBERTI outperforms multilingual BERT and other transformer-based models of similar sizes, and even achieves state-of-the-art results for German when compared to rule-based systems, demonstrating the feasibility and effectiveness of DSP in the poetry domain. Comment: Accepted for publication at SEPLN 2023: 39th International Conference of the Spanish Society for Natural Language Processing.

    Computer Vision and Architectural History at Eye Level: Mixed Methods for Linking Research in the Humanities and in Information Technology

    Information on the history of architecture is embedded in our daily surroundings, in vernacular and heritage buildings and in physical objects, photographs and plans. Historians study these tangible and intangible artefacts and the communities that built and used them. Valuable insights are thus gained into the past and the present, and they also provide a foundation for designing the future. Given that our understanding of the past is limited by the inadequate availability of data, the article demonstrates that advanced computer tools can help gain more and better-linked data from the past. Computer vision can make a decisive contribution to the identification of image content in historical photographs. This application is particularly interesting for architectural history, where visual sources play an essential role in understanding the built environment of the past, yet a lack of reliable metadata often hinders the use of materials. Automated recognition contributes to making a variety of image sources usable for research.