2,877 research outputs found

    Machine Learning meets Data-Driven Journalism: Boosting International Understanding and Transparency in News Coverage

    Migration crisis, climate change or tax havens: Global challenges need global solutions. But agreeing on a joint approach is difficult without a common ground for discussion. Public spheres are highly segmented because news is mainly produced and received on a national level. Gaining a global view on international debates about important issues is hindered by the enormous quantity of news and by language barriers. Media analysis usually focuses only on qualitative research. In this position statement, we argue that it is imperative to pool methods from machine learning, journalism studies and statistics to help bridge the segmented data of the international public sphere, using the Transatlantic Trade and Investment Partnership (TTIP) as a case study. Comment: presented at 2016 ICML Workshop on #Data4Good: Machine Learning in Social Good Applications, New York, N

    An investigation of challenges in machine translation of literary texts : the case of the English–Chinese language pair

    In the absence of a focus on literary text translation in studies of machine translation (MT), this study aims at investigating some challenges of this application of the technology. First, the most commonly used types of MT are reviewed in chronological order of their development, and, for the purpose of identifying challenges for MT in literary text translation, the challenges human translators face in literary text translation are linked to corresponding aspects of MT. In investigating the research questions of the challenges that MT systems face in literary text translation, and whether equivalence can be established by MT in literary text translation, a qualitative method is used. Areas such as the challenges for MT in the establishment of corpora, achieving equivalence, and realisation of creativity in literary texts are examined in order to reveal some of the potential contributing factors to the difficulties faced in literary text translation by MT. Through text analysis on chosen sample literary texts on three online MT platforms (Google Translate, DeepL and Youdao Translate), all based on highly advanced neural machine translation engines, this study offers a pragmatic view on some challenging areas in literary text translation using these widely acclaimed online platforms, and offers insights on potential research opportunities in studies of literary text translation using MT

    Translating Technical Texts


    “Measuring Silences” in the Translation of Awa Thiam's La Parole aux Négresses

    An overlooked, yet significant text in the genealogy of intersectionality and Black feminist theory is Awa Thiam’s 1978 text La Parole aux Négresses. This paper examines the ways that the English translation, Speak Out, Black Sisters: Feminism and Oppression in Black Africa, though widening the audience for Thiam’s work, engages in various practices of erasure that undermine Thiam’s academic authority, theoretical contributions, activist insights, and ultimately, her own voice. Namely, I contend that these practices, which scholars have linked to receptions and English translations of Black Francophone texts in particular, include de-formalization, domestication, de-philosophizing, untracing, and invisibilisation. I seek not just to focus on the “negative” aspect of these silences, but also to enact a partial restitution of Thiam’s insights from the original French text. Further, re-engaging with her text, contributions, and insights calls for more reflexivity around the politics of translation, English language hegemony, and recognition of African feminist scholarship

    Computer-Aided Biomimetics : Semi-Open Relation Extraction from scientific biological texts

    Engineering inspired by biology – recently termed biom* – has led to various groundbreaking technological developments. Example areas of application include aerospace engineering and robotics. However, biom* is not always successful and only sporadically applied in industry. The reason is that a systematic approach to biom* remains at large, despite the existence of a plethora of methods and design tools. In recent years computational tools have been proposed as well, which can potentially support a systematic integration of relevant biological knowledge during biom*. However, these so-called Computer-Aided Biom* (CAB) tools have not been able to fill all the gaps in the biom* process. This thesis investigates why existing CAB tools fail, proposes a novel approach – based on Information Extraction – and develops a proof-of-concept for a CAB tool that does enable a systematic approach to biom*. Key contributions include: 1) a disquisition of existing tools guides the selection of a strategy for systematic CAB, 2) a dataset of 1,500 manually-annotated sentences, 3) a novel Information Extraction approach that combines the outputs from a supervised Relation Extraction system and an existing Open Information Extraction system. The implemented exploratory approach indicates that it is possible to extract a focused selection of relations from scientific texts with reasonable accuracy, without imposing limitations on the types of information extracted. Furthermore, the tool developed in this thesis is shown to i) speed up a trade-off analysis by domain-experts, and ii) also improve the access to biology information for nonexperts

    From holism to compositionality: memes and the evolution of segmentation, syntax, and signification in music and language

    Steven Mithen argues that language evolved from an antecedent he terms “Hmmmmm, [meaning it was] Holistic, manipulative, multi-modal, musical and mimetic”. Owing to certain innate and learned factors, a capacity for segmentation and cross-stream mapping in early Homo sapiens broke the continuous line of Hmmmmm, creating discrete replicated units which, with the initial support of Hmmmmm, eventually became the semantically freighted words of modern language. That which remained after what was a bifurcation of Hmmmmm arguably survived as music, existing as a sound stream segmented into discrete units, although one without the explicit and relatively fixed semantic content of language. All three types of utterance – the parent Hmmmmm, language, and music – are amenable to a memetic interpretation which applies Universal Darwinism to what are understood as language and musical memes. On the basis of Peter Carruthers’ distinction between ‘cognitivism’ and ‘communicativism’ in language, and William Calvin’s theories of cortical information encoding, a framework is hypothesized for the semantic and syntactic associations between, on the one hand, the sonic patterns of language memes (‘lexemes’) and of musical memes (‘musemes’) and, on the other hand, ‘mentalese’ conceptual structures, in Chomsky’s ‘Logical Form’ (LF)