3,896 research outputs found

    Improving Cross-Lingual Transfer Learning for Event Detection

    Get PDF
    The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning

    Essays on Corporate Disclosure of Value Creation

    Get PDF
    Information on a firm’s business model helps investors understand an entity’s resource requirements, priorities for action, and prospects (FASB, 2001, pp. 14-15; IASB, 2010, p. 12). Disclosures of strategy and business model (SBM) are therefore considered a central element of effective annual report commentary (Guillaume, 2018; IIRC, 2011). By applying natural language processing techniques, I explore what SBM disclosures look like when management are pressed to say something, analyse determinants of cross-sectional variation in SBM reporting properties, and assess whether and how managers respond to regulatory interventions seeking to promote SBM annual report commentary. This dissertation contains three main chapters. Chapter 2 presents a systematic review of the academic literature on non-financial reporting and the emerging literature on SBM reporting. Here, I also introduce my institutional setting. Chapter 3 and Chapter 4 form the empirical sections of this thesis. In Chapter 3, I construct the first large sample corpus of SBM annual report commentary and provide the first systematic analysis of the properties of such disclosures. My topic modelling analysis rejects the hypothesis that such disclosure is merely padding; instead finding themes align with popular strategy frameworks and management tailor the mix of SBM topics to reflect their unique approach to value creation. However, SBM commentary is less specific, less precise about time horizon (short- and long-term), and less balanced (more positive) in tone relative to general management commentary. My findings suggest symbolic compliance and legitimisation characterize the typical annual report discussion of SBM. Further analysis identifies proprietary cost considerations and obfuscation incentives as key determinants of symbolic reporting. In Chapter 4, I seek evidence on how managers respond to regulatory mandates by adapting the properties of disclosure and investigate whether the form of the mandate matters. Using a differences-in-differences research design, my results suggest a modest incremental response by treatment firms to the introduction of a comply or explain provision to provide disclosure on strategy and business model. In contrast, I find a substantial response to enacting the same requirements in law. My analysis provides clear and consistent evidence that treatment firms incrementally increase the volume of SBM disclosure, improve coverage across a broad range of topics as well as providing commentary with greater focus on the long term. My results point to substantial changes in SBM reporting properties following regulatory mandates, but the form of the mandate does matter. Overall, this dissertation contributes to the accounting literature by examining how firms discuss a central topic to economic decision making in annual reports and how firms respond to different forms of disclosure mandate. Furthermore, the results of my analysis are likely to be of value for regulators and policymakers currently reviewing or considering mandating disclosure requirements. By examining how companies adapt their reporting to different types of regulations, this study provides an empirical basis for recalibrating SBM disclosure mandates, thereby enhancing the information set of capital market participants and promoting stakeholder engagement in a landscape increasingly shaped by non-financial information

    Cross-lingual AMR Aligner: Paying Attention to Cross-Attention

    Full text link
    This paper introduces a novel aligner for Abstract Meaning Representation (AMR) graphs that can scale cross-lingually, and is thus capable of aligning units and spans in sentences of different languages. Our approach leverages modern Transformer-based parsers, which inherently encode alignment information in their cross-attention weights, allowing us to extract this information during parsing. This eliminates the need for English-specific rules or the Expectation Maximization (EM) algorithm that have been used in previous approaches. In addition, we propose a guided supervised method using alignment to further enhance the performance of our aligner. We achieve state-of-the-art results in the benchmarks for AMR alignment and demonstrate our aligner's ability to obtain them across multiple languages. Our code will be available at \href{https://www.github.com/Babelscape/AMR-alignment}{github.com/Babelscape/AMR-alignment}.Comment: ACL 2023. Please cite authors correctly using both lastnames ("Mart\'inez Lorenzo", "Huguet Cabot"

    OHH HE LIKES THE GIRLS: A GENEALOGY OF THE “TRANNY CHASER”

    Get PDF
    Research presented in this project examines how the social construction of sexuality affects cisgender (cis) men\u27s attraction to transgender women. While mainstream discourse roots gender normative males\u27 attraction to transgender women in heterosexuality, this project demonstrates how cis-trans pairings emerged from homosexuality in the twentieth century. This project traces the way sexologists\u27 elaboration of the differences between sex, gender, and sexuality helped to distinguish transfeminine people from trans-attracted gender normative males using Foucauldian genealogy. Further, this project examines how researchers have adapted nineteenth-century frameworks of same-sex desires as sexual fetishes to construct gender-conforming “healthy” desires aimed at transsexual women by using the elaboration of these categories in the science of transsexualism. By doing so, this project illustrates how researchers deemphasized the body of trans people and elevated their gender to ensure a white middle-class cis-normative society

    A Theistic Critique of Secular Moral Nonnaturalism

    Get PDF
    This dissertation is an exercise in Theistic moral apologetics. It will be developing both a critique of secular nonnaturalist moral theory (moral Platonism) at the level of metaethics, as well as a positive form of the moral argument for the existence of God that follows from this critique. The critique will focus on the work of five prominent metaethical theorists of secular moral non-naturalism: David Enoch, Eric Wielenberg, Russ Shafer-Landau, Michael Huemer, and Christopher Kulp. Each of these thinkers will be critically examined. Following this critique, the positive moral argument for the existence of God will be developed, combining a cumulative, abductive argument that follows from filling in the content of a succinct apagogic argument. The cumulative abductive argument and the apagogic argument together, with a transcendental and modal component, will be presented to make the case that Theism is the best explanation for the kind of moral, rational beings we are and the kind of universe in which we live, a rational intelligible universe

    Comparing the production of a formula with the development of L2 competence

    Get PDF
    This pilot study investigates the production of a formula with the development of L2 competence over proficiency levels of a spoken learner corpus. The results show that the formula in beginner production data is likely being recalled holistically from learners’ phonological memory rather than generated online, identifiable by virtue of its fluent production in absence of any other surface structure evidence of the formula’s syntactic properties. As learners’ L2 competence increases, the formula becomes sensitive to modifications which show structural conformity at each proficiency level. The transparency between the formula’s modification and learners’ corresponding L2 surface structure realisations suggest that it is the independent development of L2 competence which integrates the formula into compositional language, and ultimately drives the SLA process forward

    On Internal Merge

    Get PDF

    A Paradigm Gap in Turkish

    Get PDF
    In this paper, we argue that Turkish has a gap in the third person plural cell of the person-number agreement paradigm of desiderative constructions formed with the -AsI suffix. We provide evidence for this claim from a corpus search and an acceptability judgment experiment. The corpus search shows that the third person plural suffix is virtually unattested with -AsI desideratives and the results of the experiment show that the third person plural suffix significantly reduces the acceptability of -AsI desideratives. In order to account for the observation that third person plural desideratives are unacceptable for most speakers, we argue that both negative evidence and competition accounts contribute to the existence and persistence of the gap. We discuss that competition accounts are supported by the presence of two competing forms whereas negative evidence accounts are supported by the anomalous relative frequency distribution in the paradigm of desideratives

    Discontinuous grammar as a foreign language

    Get PDF
    [Abstract] In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show a competitive performance, these text-to-parse transducers are still lagging behind classic techniques in terms of accuracy, coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture for improving their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.Xunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2020/11We acknowledge the European Research Council (ERC), which has funded this research under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and the Horizon Europe research and innovation programme (SALSA, grant agreement No 101100615), ERDF/ MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de InvestigaciĂłn de Galicia ‘‘CITIC”, funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014–2020 Program), by grant ED431G 2019/01. Funding for open access charge: Universidade da Coruña/CISUG

    On regular copying languages

    Get PDF
    This paper proposes a formal model of regular languages enriched with unbounded copying. We augment finite-state machinery with the ability to recognize copied strings by adding an unbounded memory buffer with a restricted form of first-in-first-out storage. The newly introduced computational device, finite-state buffered machines (FS-BMs), characterizes the class of regular languages and languages de-rived from them through a primitive copying operation. We name this language class regular copying languages (RCLs). We prove a pumping lemma and examine the closure properties of this language class. As suggested by previous literature (Gazdar and Pullum 1985, p.278), regular copying languages should approach the correct characteriza-tion of natural language word sets
    • 

    corecore