3,896 research outputs found
Improving Cross-Lingual Transfer Learning for Event Detection
The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning
Essays on Corporate Disclosure of Value Creation
Information on a firm's business model helps investors understand an entity's resource requirements, priorities for action, and prospects (FASB, 2001, pp. 14-15; IASB, 2010, p. 12). Disclosures of strategy and business model (SBM) are therefore considered a central element of effective annual report commentary (Guillaume, 2018; IIRC, 2011). By applying natural language processing techniques, I explore what SBM disclosures look like when management are pressed to say something, analyse determinants of cross-sectional variation in SBM reporting properties, and assess whether and how managers respond to regulatory interventions seeking to promote SBM annual report commentary. This dissertation contains three main chapters. Chapter 2 presents a systematic review of the academic literature on non-financial reporting and the emerging literature on SBM reporting. Here, I also introduce my institutional setting. Chapter 3 and Chapter 4 form the empirical sections of this thesis. In Chapter 3, I construct the first large sample corpus of SBM annual report commentary and provide the first systematic analysis of the properties of such disclosures. My topic modelling analysis rejects the hypothesis that such disclosure is merely padding; instead, I find that themes align with popular strategy frameworks and that management tailor the mix of SBM topics to reflect their unique approach to value creation. However, SBM commentary is less specific, less precise about time horizon (short- and long-term), and less balanced (more positive) in tone relative to general management commentary. My findings suggest symbolic compliance and legitimisation characterize the typical annual report discussion of SBM. Further analysis identifies proprietary cost considerations and obfuscation incentives as key determinants of symbolic reporting.
In Chapter 4, I seek evidence on how managers respond to regulatory mandates by adapting the properties of disclosure and investigate whether the form of the mandate matters. Using a differences-in-differences research design, my results suggest a modest incremental response by treatment firms to the introduction of a comply or explain provision to provide disclosure on strategy and business model. In contrast, I find a substantial response to enacting the same requirements in law. My analysis provides clear and consistent evidence that treatment firms incrementally increase the volume of SBM disclosure, improve coverage across a broad range of topics as well as providing commentary with greater focus on the long term. My results point to substantial changes in SBM reporting properties following regulatory mandates, but the form of the mandate does matter. Overall, this dissertation contributes to the accounting literature by examining how firms discuss a central topic to economic decision making in annual reports and how firms respond to different forms of disclosure mandate. Furthermore, the results of my analysis are likely to be of value for regulators and policymakers currently reviewing or considering mandating disclosure requirements. By examining how companies adapt their reporting to different types of regulations, this study provides an empirical basis for recalibrating SBM disclosure mandates, thereby enhancing the information set of capital market participants and promoting stakeholder engagement in a landscape increasingly shaped by non-financial information
Cross-lingual AMR Aligner: Paying Attention to Cross-Attention
This paper introduces a novel aligner for Abstract Meaning Representation
(AMR) graphs that can scale cross-lingually, and is thus capable of aligning
units and spans in sentences of different languages. Our approach leverages
modern Transformer-based parsers, which inherently encode alignment information
in their cross-attention weights, allowing us to extract this information
during parsing. This eliminates the need for English-specific rules or the
Expectation Maximization (EM) algorithm that have been used in previous
approaches. In addition, we propose a guided supervised method using alignment
to further enhance the performance of our aligner. We achieve state-of-the-art
results in the benchmarks for AMR alignment and demonstrate our aligner's
ability to obtain them across multiple languages. Our code will be available at
https://www.github.com/Babelscape/AMR-alignment.
Comment: ACL 2023. Please cite authors correctly using both last names
("Martínez Lorenzo", "Huguet Cabot")
OHH HE LIKES THE GIRLS: A GENEALOGY OF THE “TRANNY CHASER”
Research presented in this project examines how the social construction of sexuality affects cisgender (cis) men's attraction to transgender women. While mainstream discourse roots gender normative males' attraction to transgender women in heterosexuality, this project demonstrates how cis-trans pairings emerged from homosexuality in the twentieth century. This project traces the way sexologists' elaboration of the differences between sex, gender, and sexuality helped to distinguish transfeminine people from trans-attracted gender normative males using Foucauldian genealogy. Further, this project examines how researchers have adapted nineteenth-century frameworks of same-sex desires as sexual fetishes to construct gender-conforming “healthy” desires aimed at transsexual women by using the elaboration of these categories in the science of transsexualism. By doing so, this project illustrates how researchers deemphasized the body of trans people and elevated their gender to ensure a white middle-class cis-normative society
A Theistic Critique of Secular Moral Nonnaturalism
This dissertation is an exercise in Theistic moral apologetics. It will be developing both a critique of secular nonnaturalist moral theory (moral Platonism) at the level of metaethics, as well as a positive form of the moral argument for the existence of God that follows from this critique. The critique will focus on the work of five prominent metaethical theorists of secular moral non-naturalism: David Enoch, Eric Wielenberg, Russ Shafer-Landau, Michael Huemer, and Christopher Kulp. Each of these thinkers will be critically examined. Following this critique, the positive moral argument for the existence of God will be developed, combining a cumulative, abductive argument that follows from filling in the content of a succinct apagogic argument. The cumulative abductive argument and the apagogic argument together, with a transcendental and modal component, will be presented to make the case that Theism is the best explanation for the kind of moral, rational beings we are and the kind of universe in which we live, a rational intelligible universe
Comparing the production of a formula with the development of L2 competence
This pilot study investigates the production of a formula with the development of L2 competence over proficiency levels of a spoken learner corpus. The results show that the formula
in beginner production data is likely being recalled holistically from learners' phonological
memory rather than generated online, identifiable by virtue of its fluent production in absence
of any other surface structure evidence of the formula's syntactic properties. As learners' L2
competence increases, the formula becomes sensitive to modifications which show structural
conformity at each proficiency level. The transparency between the formula's modification
and learners' corresponding L2 surface structure realisations suggests that it is the independent
development of L2 competence which integrates the formula into compositional language,
and ultimately drives the SLA process forward
A Paradigm Gap in Turkish
In this paper, we argue that Turkish has a gap in the third person plural cell of the person-number agreement paradigm of desiderative constructions formed with the -AsI suffix. We provide evidence for this claim from a corpus search and an acceptability judgment experiment. The corpus search shows that the third person plural suffix is virtually unattested with -AsI desideratives and the results of the experiment show that the third person plural suffix significantly reduces the acceptability of -AsI desideratives. In order to account for the observation that third person plural desideratives are unacceptable for most speakers, we argue that both negative evidence and competition accounts contribute to the existence and persistence of the gap. We discuss that competition accounts are supported by the presence of two competing forms whereas negative evidence accounts are supported by the anomalous relative frequency distribution in the paradigm of desideratives
Discontinuous grammar as a foreign language
In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show a competitive performance, these text-to-parse transducers are still lagging behind classic techniques in terms of accuracy, coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture for improving their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.
We acknowledge the European Research Council (ERC), which has funded this research under the European Union's Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and the Horizon Europe research and innovation programme (SALSA, grant agreement No 101100615), ERDF/MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de Investigación de Galicia "CITIC", funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014–2020 Program), by grant ED431G 2019/01. Funding for open access charge: Universidade da Coruña/CISUG
On regular copying languages
This paper proposes a formal model of regular languages enriched with unbounded copying. We augment finite-state machinery with the ability to recognize copied strings by adding an unbounded memory buffer with a restricted form of first-in-first-out storage. The newly introduced computational device, finite-state buffered machines (FS-BMs), characterizes the class of regular languages and languages derived from them through a primitive copying operation. We name this language class regular copying languages (RCLs). We prove a pumping lemma and examine the closure properties of this language class. As suggested by previous literature (Gazdar and Pullum 1985, p.278), regular copying languages should approach the correct characterization of natural language word sets