9 research outputs found

    "Where the data are coming from?" Ethics, crowdsourcing and traceability for Big Data in Human Language Technology

    National audience. Based on experience gained from observing corpus development in HLT, the authors warn the Big Data community about some recent uses of human computation. For instance, the growing use in the HLT community of crowdsourcing methods, and especially of paid microworking crowdsourcing platforms, raises many ethical, economic and legal concerns. The authors also want to foster certain practices, especially concerning traceability, implemented in the form of a charter, the Ethics and Big Data Charter.

    Evaluating Corpora Documentation with regards to the Ethics and Big Data Charter

    International audience. The authors have written the Ethics and Big Data Charter in collaboration with various agencies, private bodies and associations. The Charter aims to describe any large or complex resource, and in particular language resources, from a legal and ethical viewpoint, and to ensure the transparency of the process of creating and distributing such resources. In this article we analyse the documentation coverage of the most frequently mentioned language resources with regard to the Charter, in order to show the benefit it offers.

    A sequence to sequence transformer data logic experiment

    International audience. In this paper we present experiments to evaluate how a T5 model behaves with regard to input data fidelity. The rationale behind these experiments is to evaluate whether a sequence-to-sequence transformer can be constrained to generate the specifics of a financial report, and more generally whether, and to what extent, it can faithfully reproduce a semantic logic.

    The Financial Document Causality Detection Shared Task (FinCausal 2020)

    We present the FinCausal 2020 Shared Task on Causality Detection in Financial Documents and the associated FinCausal dataset, and discuss the participating systems and results. Two sub-tasks are proposed: a binary classification task (Task 1) and a relation extraction task (Task 2). A total of 16 teams submitted runs across the two tasks, and 13 of them contributed a system description paper. The shared task is associated with the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), held at the 28th International Conference on Computational Linguistics (COLING'2020), Barcelona, Spain, on September 12, 2020.