1,023 research outputs found

    CAUSAL RELATIONS IN ENGLISH NEWS MAGAZINE DISCOURSE: JOURNALISTS’ AGE PERSPECTIVE

    This paper shows how journalists’ age influences the linguistic representation of causal relations in English news magazine articles. Treating cause in a broad sense that covers adverbials and clauses of reason, concession, purpose and result, the study finds that causal relations are scarce in the texts of young reporters. By contrast, middle-aged authors’ articles show a 17 per cent higher frequency of adverbials and clauses of reason, and older journalists’ texts show a 12 per cent rise in concessive clauses expressing temporal, comparative, alternative, conditional and generalizing concessive relations. To account for these findings, I apply Talmy’s (1985) force dynamics theory, which views cause as an interaction of entities with respect to force and energy in which one entity acts on another. On this theory, middle-aged journalists verbalise causal relations grounded in what I call the energy transfer model, in which one moving entity causes another to move, and the energy loss model, in which the inactivity of one entity is due to blocking by the other entity. In older authors’ articles, causal relations are represented by concessive clauses introduced by a range of conjunctions that specify the concessive meaning: temporal, comparative, alternative, conditional or generalizing.

    Aspects of a primacy of frame model of translation

    Frame Semantics and Construction Grammar are two highly interdependent cognitive linguistic theories which have been used in various ways to analyse and model translation. However, a unified model of how frames and constructions (are) operate(d on) and interact in translation, i.e. a translational perspective on and of frames and constructions, has not yet been fully developed. The model proposed in this paper is intended to narrow this gap. In drafting this model, I establish the principle of maximum frame comparability. I furthermore analyse factors which may lead to an override of this principle. From these analyses, I deduce research questions whose investigation can benefit both translation studies and the theoretical frameworks of Frame Semantics and Construction Grammar.

    Broad-coverage automatic event analysis of general-domain Estonian texts

    Owing to large-scale digitisation and the shift from traditional to digital written communication, vast amounts of natural language text are becoming machine-readable. Machine-readability has the potential to ease the human effort of searching and organising large text collections, enabling applications such as automatic text summarisation and question answering. However, current tools for automatic text analysis do not reach the level of text understanding needed to make these applications generic. It is hypothesised that automatic analysis of events in texts brings us closer to this goal, as many texts can be interpreted as stories or narratives that are decomposable into events. This thesis explores event analysis as a broad-coverage, general-domain automatic language analysis problem in Estonian, starting from time-oriented event analysis and moving towards generic event analysis. We adapt the TimeML framework to Estonian, and create an automatic temporal expression tagger and a news corpus manually annotated for temporal semantics (event mentions, temporal expressions and temporal relations); we analyse the consistency of human annotation of event mentions and temporal relations; and, finally, we provide a preliminary study of event coreference resolution in Estonian news. The work also suggests how future research can improve Estonian event and temporal semantic annotation, and the language resources developed here will allow future experimentation with end-user applications (such as automatic answering of temporal questions) and provide a basis for developing automatic semantic analysis tools.

    Strategies to Address Data Sparseness in Implicit Semantic Role Labeling

    Natural language texts frequently contain predicates whose complete understanding requires access to other parts of the discourse. Human readers can retrieve such information across sentence boundaries and infer the implicit piece of information. This capability enables us to understand complicated texts without needing to repeat the same information in every single sentence. However, for computational systems, resolving such information is problematic because computational approaches traditionally rely on sentence-level processing and rarely take the extra-sentential context into account. In this dissertation, we investigate this omission phenomenon, known as implicit semantic role labeling (ISRL). Implicit semantic role labeling involves identifying predicate arguments that are not locally realized but are resolvable from the context. For example, in “What’s the matter, Walters?” asked Baynes sharply., the ADDRESSEE of the predicate ask, Walters, is not mentioned as one of its syntactic arguments, but can be recovered from the previous sentence. In this thesis, we try to improve methods for the automatic processing of such predicate instances in order to improve natural language processing applications. Our main contribution is to introduce approaches that address the data sparseness problem of the task. We improve automatic identification of implicit roles by increasing the amount of training data without needing to annotate new instances. For this purpose, we propose two approaches. In the first, we use crowdsourcing to annotate instances of implicit semantic roles and show that, with an appropriate task design, reliable annotation of implicit semantic roles can be obtained from non-experts without presenting them with precise linguistic definitions of the roles. In the second, we combine seemingly incompatible corpora to address the data sparseness of ISRL by applying a domain adaptation technique. We show that out-of-domain data from a different genre can be successfully used to improve a baseline implicit semantic role labeling model when used with an appropriate domain adaptation technique. The results also show that the improvement occurs regardless of the predicate’s part of speech; that is, identification of implicit roles relies more on semantic features than on syntactic ones. Therefore, annotating instances of nominal predicates, for instance, can help to improve identification of verbal predicates’ implicit roles as well. Our findings also show that the variety of the additional data is more important than its size. That is, adding a large amount of data does not necessarily lead to a better model.
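The abstract does not name the domain adaptation technique it applies. As a rough illustration of the general idea only, the sketch below uses feature augmentation, one well-known way to combine in-domain and out-of-domain training data. Every name in it (`augment`, the feature names, the domain labels "fiction" and "news") is a hypothetical choice for this example, not something taken from the dissertation.

```python
from typing import Dict


def augment(features: Dict[str, float], domain: str) -> Dict[str, float]:
    """Copy every feature into a shared 'general' version and a domain-specific version.

    A classifier trained on the augmented features can learn weights that are
    shared across corpora (general:*) and weights that apply to one corpus only.
    """
    augmented: Dict[str, float] = {}
    for name, value in features.items():
        augmented[f"general:{name}"] = value
        augmented[f"{domain}:{name}"] = value
    return augmented


# Hypothetical toy instances: one from an in-domain corpus, one from another genre.
in_domain = augment({"frame=Questioning": 1.0, "dist_sentences": 1.0}, "fiction")
out_of_domain = augment({"frame=Questioning": 1.0, "dist_sentences": 2.0}, "news")

# Only the general copies are shared, so only they transfer evidence across corpora.
shared = sorted(set(in_domain) & set(out_of_domain))
print(shared)
```

The augmented dictionaries can be passed to any linear classifier that accepts sparse named features; the design choice is that the learner, not the annotator, decides which features behave alike in both genres.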

    Frame semantics for the field of climate change: discovering frames based on Chinese and English terms

    Most existing Mandarin Chinese specialised dictionaries of environmental terms are paper dictionaries, compiled and revised more than ten years ago, and contain mainly noun terms. Terminological information is restricted to the knowledge conveyed by the term and its English equivalent(s). For readers who want to learn about the semantic or syntactic properties of terms, or to see how terms are used in real contexts in specialised texts, the information provided in existing dictionaries is insufficient. In this research, we compiled an online Mandarin Chinese terminological resource describing Chinese verb terms in the field of climate change. This resource makes up for some of the deficiencies of existing Chinese environmental dictionaries by revealing the meaning(s) of a term through its actantial structure(s) and by showing, through annotated contexts, the semantic and syntactic properties of the term as well as its practical usage in specialised texts. This resource better meets the needs of its audience.
The theoretical basis underpinning this research is Frame Semantics (Fillmore, 1976, 1977, 1982, 1985; Fillmore & Atkins, 1992) and the FrameNet built on it. The main objective of this research is to discover and define Chinese semantic frames in the field of climate change, and to establish relations between the frames defined. The Chinese semantic frames are discovered with the help of the methodology of the multilingual environmental dictionary DiCoEnviro (and its accompanying resource Framed DiCoEnviro) (L’Homme, 2018; L’Homme et al., 2020). To make this methodology applicable to Chinese, a Sino-Tibetan language, we modified and adapted it to suit the description of Chinese terms and the definition of Chinese semantic frames. Some of the changes and adaptations are based on the Chinese FrameNet (CFN) (Liu & You, 2015). To discover Chinese semantic frames, a monolingual Mandarin (Chinese) Climate Change Corpus (MCCC) was first compiled. This corpus contains 224 authentic Chinese specialised texts in the field of climate change, totalling 1,228,333 Chinese characters, or 547,592 Chinese words. Candidate terms were then automatically extracted from the MCCC using the corpus management and analysis software Sketch Engine. After manual analysis and validation, we determined which candidate terms are true terms. Subsequently, the actantial structure of each term was written by analysing the contexts in which the term occurs. Next, each sense of a polysemous term was placed in a separate entry, and 16-20 contexts were selected for each entry. Each context was then annotated on three layers: semantic structure, syntactic function and syntactic group. After this, the terms were classified according to the scenarios they evoke.
Terms that depict the same scene or situation in the field of climate change, have a similar actantial structure, and share the majority of circumstants are categorised into one semantic frame (criteria based on the DiCoEnviro project (L’Homme, 2018; L’Homme et al., 2020)). After the Chinese semantic frames were identified, each frame was defined. Finally, the discovered Chinese frames were linked according to the eight types of frame relations proposed by Ruppenhofer et al. (2016). To be displayed online, term entries and semantic frames were encoded in XML files. Guided by this research methodology, we eventually discovered and defined 23 Chinese semantic frames. The end result of this research is a frame-based Mandarin Chinese terminological resource specialised in the field of climate change. This terminological resource consists of two parts. The first part is the description of a total of 39 Chinese verb terms. With each meaning of a polysemous verb term placed in a separate entry, there are 59 entries in total (each entry contains the actantial structure and annotated contexts). A total of 1,027 contexts were annotated. The second part of this resource presents the 23 Chinese semantic frames identified, as well as the relations between frames.
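The three grouping criteria stated in the abstract (same scene, similar actantial structure, majority of circumstants shared) can be sketched as a small grouping routine. This is a minimal illustration under stated assumptions, not the project's actual procedure, which relies on manual analysis: "similar" actantial structure is simplified to identical actant sets, "majority" is read as more than half of the larger circumstant set, and the terms, scene labels and role names are invented English placeholders rather than entries from the resource.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class TermEntry:
    term: str                     # hypothetical placeholder term
    scene: str                    # label for the scene or situation the term depicts
    actants: FrozenSet[str]       # obligatory participants (actantial structure)
    circumstants: FrozenSet[str]  # optional participants


def same_frame(a: TermEntry, b: TermEntry) -> bool:
    """Apply the three criteria under the simplifying assumptions given above."""
    if a.scene != b.scene:
        return False
    if a.actants != b.actants:  # assumption: "similar" is treated as "identical"
        return False
    larger = max(len(a.circumstants), len(b.circumstants))
    shared = len(a.circumstants & b.circumstants)
    # assumption: "majority" means more than half of the larger set
    return larger == 0 or shared * 2 > larger


def group_into_frames(entries: List[TermEntry]) -> List[List[TermEntry]]:
    """Greedily place each entry in the first frame whose first member it matches."""
    frames: List[List[TermEntry]] = []
    for entry in entries:
        for frame in frames:
            if same_frame(frame[0], entry):
                frame.append(entry)
                break
        else:
            frames.append([entry])
    return frames


entries = [
    TermEntry("warm", "temperature_change", frozenset({"Cause", "Patient"}),
              frozenset({"Degree", "Time", "Location"})),
    TermEntry("heat", "temperature_change", frozenset({"Cause", "Patient"}),
              frozenset({"Degree", "Time"})),
    TermEntry("emit", "emission", frozenset({"Source", "Emission"}),
              frozenset({"Time"})),
]
frames = group_into_frames(entries)
frame_terms = [[entry.term for entry in frame] for frame in frames]
print(frame_terms)
```

Comparing each new entry only against the first member of a frame keeps the sketch short; a fuller version would compare against every member or against an explicit frame definition.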

    Proceedings

    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 268 pages. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891