117 research outputs found

    Overview of the author identification task at PAN 2014

    The author identification task at PAN-2014 focuses on author verification. As in PAN-2013, we are given a set of documents by the same author along with exactly one document of questioned authorship, and the task is to determine whether the known and the questioned documents are by the same author or not. In comparison to PAN-2013, a significantly larger corpus was built, comprising hundreds of documents in four natural languages (Dutch, English, Greek, and Spanish) and four genres (essays, reviews, novels, opinion articles). In addition, more suitable performance measures are used, focusing on the accuracy and the confidence of the predictions as well as the ability of the submitted methods to leave some problems unanswered in case of great uncertainty. To this end, we adopt the c@1 measure, originally proposed for the question answering task. We received 13 software submissions, which were evaluated in the TIRA framework. Analytical evaluation results are presented, where one language-independent approach serves as a challenging baseline. Moreover, we continue the successful practice of the PAN labs of examining meta-models based on the combination of all submitted systems. Last but not least, we provide statistical significance tests to demonstrate the important differences between the submitted approaches.
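    The c@1 measure adopted above (Peñas and Rodrigo, 2011) rewards systems for abstaining on uncertain problems rather than guessing. A minimal sketch of how it is computed (function and variable names are illustrative, not from the lab's own scripts):

    ```python
    def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
        """c@1 = (n_correct + n_unanswered * n_correct / n_total) / n_total.

        Each unanswered problem is credited at the system's observed
        accuracy (n_correct / n_total) instead of counting as wrong,
        so abstaining on hard cases can raise the score.
        """
        return (n_correct + n_unanswered * n_correct / n_total) / n_total

    # Answering all 100 problems with 70 correct: c@1 equals plain accuracy.
    everything_answered = c_at_1(70, 0, 100)
    # Same 70 correct, but the 30 hardest left unanswered:
    # (70 + 30 * 0.70) / 100 = 0.91
    hard_cases_skipped = c_at_1(70, 30, 100)
    ```

    A verification method that answers at random gains nothing from abstention, which is what makes c@1 a fairer measure than accuracy for this task.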

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 1: Check-Worthiness. The task asks to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. We offered the task in both English and Arabic, based on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign. A total of 30 teams registered to participate in the Lab, and seven teams actually submitted systems for Task 1. The most successful approaches used by the participants relied on recurrent and multi-layer neural networks, as well as on combinations of distributional representations, on matching claims' vocabulary against lexicons, and on measures of syntactic dependency. The best systems achieved mean average precision of 0.18 and 0.15 on the English and on the Arabic test datasets, respectively. This leaves large room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in check-worthiness estimation.
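    Mean average precision, the measure reported above, is the mean over all debates/speeches of the average precision of each ranked sentence list. A minimal sketch of per-list average precision (sentence ids and the helper name are illustrative):

    ```python
    def average_precision(ranked: list, relevant: set) -> float:
        """AP of one ranked list: average of precision values at the
        rank of each relevant (check-worthy) item retrieved."""
        hits, score = 0, 0.0
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                hits += 1
                score += hits / rank  # precision at this rank
        return score / len(relevant) if relevant else 0.0

    # One debate: check-worthy sentences s3 (found at rank 1) and s2 (rank 4)
    # AP = (1/1 + 2/4) / 2 = 0.75; MAP averages this over all debates.
    ap = average_precision(["s3", "s1", "s7", "s2"], {"s3", "s2"})
    ```

    Because AP weights early ranks heavily, a MAP of 0.18 indicates that check-worthy sentences were usually ranked well below the top of the list.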

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 2: Factuality

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 2: Factuality. The task asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. In terms of data, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and five of them actually submitted runs. The most successful approaches used by the participants relied on the automatic retrieval of evidence from the Web. Similarities and other relationships between the claim and the retrieved documents were used as input to classifiers in order to make a decision. The best-performing official submissions achieved mean absolute error of 0.705 and 0.658 for the English and for the Arabic test sets, respectively. This leaves plenty of room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in fact-checking.
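    The mean absolute error reported above treats the three factuality labels as ordinal, so a "half-true" prediction for a "false" claim costs less than a "true" one. A minimal sketch, assuming a 0/1/2 integer encoding (the encoding here is illustrative; the lab's scoring scripts define the official one):

    ```python
    # Illustrative ordinal encoding of the three factuality labels.
    LABELS = {"true": 0, "half-true": 1, "false": 2}

    def mae(gold: list, pred: list) -> float:
        """Mean absolute error between gold and predicted label codes."""
        return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)

    gold = [LABELS[x] for x in ["true", "false", "half-true"]]
    pred = [LABELS[x] for x in ["true", "half-true", "half-true"]]
    # Per-claim errors: 0, 1, 0 -> MAE = 1/3
    error = mae(gold, pred)
    ```

    Under this encoding, always predicting "half-true" guarantees an MAE of at most 1.0, which puts the reported scores of 0.705 and 0.658 in perspective.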

    Overview of the CLEF-2018 checkthat! lab on automatic identification and verification of political claims

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. In its starting year, the lab featured two tasks. Task 1 asked to predict which (potential) claims in a political debate should be prioritized for fact-checking; in particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. Task 2 asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. We offered both tasks in English and in Arabic. In terms of data, for both tasks, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and nine of them actually submitted runs. The evaluation results show that the most successful approaches used various neural networks (esp. for Task 1) and evidence retrieval from the Web (esp. for Task 2). We release all datasets, the evaluation scripts, and the submissions by the participants, which should enable further research in both check-worthiness estimation and automatic claim verification.

    Direct Interrogation of Viral Peptides Presented by the Class I HLA of HIV-Infected T Cells

    Identification of CD8+ cytotoxic T lymphocyte (CTL) epitopes has traditionally relied upon testing of overlapping peptide libraries for their reactivity with T cells in vitro. Here, we pursued deep ligand sequencing (DLS) as an alternative method of directly identifying those ligands that are epitopes presented to CTLs by the class I human leukocyte antigens (HLA) of infected cells. Soluble class I HLA-A*11:01 (sHLA) was gathered from HIV-1 NL4-3-infected human CD4+ SUP-T1 cells. HLA-A*11:01 harvested from infected cells was immunoaffinity purified and acid boiled to release heavy and light chains from peptide ligands, which were then recovered by size-exclusion filtration. The ligands were first fractionated by high-pH high-pressure liquid chromatography and then subjected to separation by nano-liquid chromatography (nano-LC)–mass spectrometry (MS) at low pH. Approximately 10 million ions were selected for sequencing by tandem mass spectrometry (MS/MS). HLA-A*11:01 ligand sequences were determined with PEAKS software and confirmed by comparison to spectra generated from synthetic peptides. DLS identified 42 viral ligands presented by HLA-A*11:01, and 37 of these were previously undetected. These data demonstrate that (i) HIV-1 Gag and Nef are extensively sampled, (ii) ligand length variants are prevalent, particularly within Gag and Nef hot spots where ligand sequences overlap, (iii) noncanonical ligands are T cell reactive, and (iv) HIV-1 ligands are derived from de novo synthesis rather than endocytic sampling. Next-generation immunotherapies must factor into their strategies for enhancing T cell immunity both these nascent HIV-1 ligand length variants and the finding that CTL-reactive epitopes may be absent during infection of CD4+ T cells.

    A highly-resolved food web for insect seed predators in a species-rich tropical forest

    The top-down and indirect effects of insects on plant communities depend on patterns of host use, which are often poorly documented, particularly in species-rich tropical forests. At Barro Colorado Island, Panama, we compiled the first food web quantifying trophic interactions between the majority of co-occurring woody plant species and their internally feeding insect seed predators. Our study is based on more than 200,000 fruits representing 478 plant species, associated with 369 insect species. Insect host-specificity was remarkably high: only 20% of seed predator species were associated with more than one plant species, while each tree species experienced seed predation from a median of two insect species. Phylogeny, but not plant traits, explained patterns of seed predator attack. These data suggest that seed predators are unlikely to mediate indirect interactions such as apparent competition between plant species, but are consistent with their proposed contribution to maintaining plant diversity via the Janzen-Connell mechanism.