316 research outputs found

    Measuring the Correctness of Double-Keying: Error Classification and Quality Control in a Large Corpus of TEI-Annotated Historical Text

    Among mass digitization methods, double-keying is considered to be the one with the lowest error rate. This method requires two independent transcriptions of a text by two different operators. It is particularly well suited to historical texts, which often exhibit deficiencies like poor master copies or other difficulties such as spelling variation or complex text structures. Providers of data entry services using the double-keying method generally advertise very high accuracy rates (around 99.95% to 99.98%). These advertised percentages are generally estimated on the basis of small samples, and little if anything is said about either the actual amount of text or the text genres which have been proofread, about error types, proofreaders, etc. In order to obtain significant data on this problem, it is necessary to analyze a large amount of text representing a balanced sample of different text types, to distinguish the structural XML/TEI level from the typographical level, and to differentiate between various types of errors which may originate from different sources and may not be equally severe. This paper presents an extensive and complex approach to the analysis and correction of double-keying errors which has been applied by the DFG-funded project "Deutsches Textarchiv" (German Text Archive, hereafter DTA) in order to evaluate and preferably to increase the transcription and annotation accuracy of double-keyed DTA texts. Statistical analyses of the results gained from proofreading a large quantity of text are presented, which verify the common accuracy rates for the double-keying method.

    The DTA “Base Format”: A TEI Subset for the Compilation of a Large Reference Corpus of Printed Text from Multiple Sources

    In this article we describe the DTA “Base Format” (DTABf), a strict subset of the TEI P5 tag set. The purpose of the DTABf is to provide a balance between expressiveness and precision as well as an interoperable annotation scheme for a large variety of text types of historical corpora of printed text from multiple sources. The DTABf has been developed on the basis of a large amount of historical text data in the core corpus of the project Deutsches Textarchiv (DTA) and text collections from 15 cooperating projects with a current total of 210 million tokens. The DTABf is a “living” TEI format which is continuously adjusted when new text candidates for the DTA containing new structural phenomena are encountered. We also focus on other aspects of the DTABf including consistency, interoperability with other TEI dialects, HTML and other presentations of the TEI texts, and conversion into other formats, as well as linguistic analysis. We include some examples of best practices to illustrate how external corpora can be losslessly converted into the DTABf, thus enabling third parties to use the DTABf in their specific projects. The DTABf is comprehensively documented, and several software tools are available for working with it, making it a widely used format for the encoding of historical printed German text.

    A succinate/SUCNR1-brush cell defense program in the tracheal epithelium

    Host-derived succinate accumulates in the airways during bacterial infection. Here, we show that luminal succinate activates murine tracheal brush (tuft) cells through a signaling cascade involving the succinate receptor 1 (SUCNR1), phospholipase Cβ2, and the cation channel transient receptor potential channel subfamily M member 5 (TRPM5). Stimulated brush cells then trigger a long-range Ca²⁺ wave spreading radially over the tracheal epithelium through a sequential signaling process. First, brush cells release acetylcholine, which excites nearby cells via muscarinic acetylcholine receptors. From there, the Ca²⁺ wave propagates through gap junction signaling, reaching also distant ciliated and secretory cells. These effector cells translate activation into enhanced ciliary activity and Cl⁻ secretion, which are synergistic in boosting mucociliary clearance, the major innate defense mechanism of the airways. Our data establish tracheal brush cells as a central hub in triggering a global epithelial defense program in response to a danger-associated metabolite.

    ART Suppresses Plasma HIV-1 RNA to a Stable Set Point Predicted by Pretherapy Viremia

    Current antiretroviral therapy is effective in suppressing but not eliminating HIV-1 infection. Understanding the source of viral persistence is essential for developing strategies to eradicate HIV-1 infection. We therefore investigated the level of plasma HIV-1 RNA in patients with viremia suppressed to less than 50–75 copies/ml on standard protease inhibitor- or non-nucleoside reverse transcriptase inhibitor-containing antiretroviral therapy using a new, real-time PCR-based assay for HIV-1 RNA with a limit of detection of one copy of HIV-1 RNA. Single copy assay results revealed that >80% of patients on initial antiretroviral therapy for 60 wk had persistent viremia of one copy/ml or more with an overall median of 3.1 copies/ml. The level of viremia correlated with pretherapy plasma HIV-1 RNA but not with the specific treatment regimen. Longitudinal studies revealed no significant decline in the level of viremia between 60 and 110 wk of suppressive antiretroviral therapy. These data suggest that the persistent viremia on current antiretroviral therapy is derived, at least in part, from long-lived cells that are infected prior to initiation of therapy.