17 research outputs found

    Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin

    In this paper we describe a dataset of German and Latin ground truth (GT) for historical OCR in the form of printed text line images paired with their transcription. This dataset, called GT4HistOCR, consists of 313,173 line pairs covering a wide period of printing dates, from 15th-century incunabula to 19th-century books printed in Fraktur types, and is openly available under a CC-BY 4.0 license. The special form of GT as line image/transcription pairs makes it directly usable to train state-of-the-art recognition models for OCR software employing recurrent neural networks in LSTM architecture, such as Tesseract 4 or OCRopus. We also provide some pretrained OCRopus models for subcorpora of our dataset yielding between 95% (early printings) and 98% (19th century Fraktur printings) character accuracy rates on unseen test cases, a Perl script to harmonize GT produced by different transcription rules, and give hints on how to construct GT for OCR purposes, which has requirements that may differ from linguistically motivated transcriptions. Comment: Submitted to JLCL Volume 33 (2018), Issue 1: Special Issue on Automatic Text and Layout Recognition

    Plasma DNA methylation: a potential biomarker for stratification of liver fibrosis in non-alcoholic fatty liver disease

    Objective: Liver biopsy is currently the most reliable way of evaluating liver fibrosis in patients with non-alcoholic fatty liver disease (NAFLD). Its inherent risks limit its widespread use. Differential liver DNA methylation of peroxisome proliferator-activated receptor gamma (PPARγ) gene promoter has recently been shown to stratify patients in terms of fibrosis severity but requires access to liver tissue. The aim of this study was to assess whether DNA methylation of circulating DNA could be detected in human plasma and potentially used to stratify liver fibrosis severity in patients with NAFLD. Design: Patients with biopsy-proven NAFLD and age-matched controls were recruited from the liver and gastroenterology clinics at the Newcastle upon Tyne Hospitals NHS Foundation Trust. Plasma cell-free circulating DNA methylation of PPARγ was quantitatively assessed by pyrosequencing. Liver DNA methylation was quantitatively assessed by pyrosequencing NAFLD explant tissue, subjected to laser capture microdissection (LCM). Patients with alcoholic liver disease (ALD) were also subjected to plasma DNA and LCM pyrosequencing. Results: 26 patients with biopsy-proven NAFLD were included. Quantitative plasma DNA methylation of PPARγ stratified patients into mild (Kleiner 1–2) and severe (Kleiner 3–4) fibrosis (CpG1: 63% vs 86%, p0.05). Hypermethylation at the PPARγ promoter of plasma DNA correlated with changes in hepatocellular rather than myofibroblast DNA methylation. Similar results were demonstrated in patients with ALD cirrhosis. Conclusions: Differential DNA methylation at the PPARγ promoter can be detected within the pool of cell-free DNA of human plasma. With further validation, plasma DNA methylation of PPARγ could potentially be used to non-invasively stratify liver fibrosis severity in patients with NAFLD. Plasma DNA methylation signatures reflect the molecular pathology associated with fibrotic liver disease.

    Multiorgan MRI findings after hospitalisation with COVID-19 in the UK (C-MORE): a prospective, multicentre, observational cohort study

    Introduction: The multiorgan impact of moderate to severe coronavirus infections in the post-acute phase is still poorly understood. We aimed to evaluate the excess burden of multiorgan abnormalities after hospitalisation with COVID-19, evaluate their determinants, and explore associations with patient-related outcome measures. Methods: In a prospective, UK-wide, multicentre MRI follow-up study (C-MORE), adults (aged ≥18 years) discharged from hospital following COVID-19 who were included in Tier 2 of the Post-hospitalisation COVID-19 study (PHOSP-COVID) and contemporary controls with no evidence of previous COVID-19 (SARS-CoV-2 nucleocapsid antibody negative) underwent multiorgan MRI (lungs, heart, brain, liver, and kidneys) with quantitative and qualitative assessment of images and clinical adjudication when relevant. Individuals with end-stage renal failure or contraindications to MRI were excluded. Participants also underwent detailed recording of symptoms, and physiological and biochemical tests. The primary outcome was the excess burden of multiorgan abnormalities (two or more organs) relative to controls, with further adjustments for potential confounders. The C-MORE study is ongoing and is registered with ClinicalTrials.gov, NCT04510025. Findings: Of 2710 participants in Tier 2 of PHOSP-COVID, 531 were recruited across 13 UK-wide C-MORE sites. After exclusions, 259 C-MORE patients (mean age 57 years [SD 12]; 158 [61%] male and 101 [39%] female) who were discharged from hospital with PCR-confirmed or clinically diagnosed COVID-19 between March 1, 2020, and Nov 1, 2021, and 52 non-COVID-19 controls from the community (mean age 49 years [SD 14]; 30 [58%] male and 22 [42%] female) were included in the analysis. Patients were assessed at a median of 5·0 months (IQR 4·2–6·3) after hospital discharge. Compared with non-COVID-19 controls, patients were older, living with more obesity, and had more comorbidities. 
    Multiorgan abnormalities on MRI were more frequent in patients than in controls (157 [61%] of 259 vs 14 [27%] of 52; p<0·0001) and independently associated with COVID-19 status (odds ratio [OR] 2·9 [95% CI 1·5–5·8]; adjusted p=0·0023) after adjusting for relevant confounders. Compared with controls, patients were more likely to have MRI evidence of lung abnormalities (p=0·0001; parenchymal abnormalities), brain abnormalities (p<0·0001; more white matter hyperintensities and regional brain volume reduction), and kidney abnormalities (p=0·014; lower medullary T1 and loss of corticomedullary differentiation), whereas cardiac and liver MRI abnormalities were similar between patients and controls. Patients with multiorgan abnormalities were older (difference in mean age 7 years [95% CI 4–10]; mean age of 59·8 years [SD 11·7] with multiorgan abnormalities vs mean age of 52·8 years [11·9] without multiorgan abnormalities; p<0·0001), more likely to have three or more comorbidities (OR 2·47 [1·32–4·82]; adjusted p=0·0059), and more likely to have a more severe acute infection (acute CRP >5 mg/L, OR 3·55 [1·23–11·88]; adjusted p=0·025) than those without multiorgan abnormalities. Presence of lung MRI abnormalities was associated with a two-fold higher risk of chest tightness, and multiorgan MRI abnormalities were associated with severe and very severe persistent physical and mental health impairment (PHOSP-COVID symptom clusters) after hospitalisation. Interpretation: After hospitalisation for COVID-19, people are at risk of multiorgan abnormalities in the medium term. Our findings emphasise the need for proactive multidisciplinary care pathways, with the potential for imaging to guide surveillance frequency and therapeutic stratification.

    Argument clauses and correlative es in German -- deriving discourse properties in a unification analysis

    We present an analysis of finite argument clauses in German with the goal of clarifying the conditions that control the presence/absence of an additional correlative es in the Mittelfeld. The syntactic analysis relies on the assumption that both the clause and the pronominal es contribute to the same argument slot of the matrix verb, unifying their f-structure contribution under the same grammatical function. The discourse effects triggered by es follow from the behaviour expected from a (semantically) anaphoric element -- its presence either indicates that the state of affairs it refers to has already been discussed; or else, it causes presupposition accommodation. The strict exclusion of an es along with a topicalized finite clause can be reduced to a violation of generalized binding principles

    EAGLE MOAO system conceptual design and related technologies

    EAGLE is the multi-object, spatially resolved near-IR spectrograph instrument concept for the E-ELT, relying on distributed adaptive optics, so-called Multi-Object Adaptive Optics (MOAO). This paper presents the results of a phase A study. Using 84×84 actuator deformable mirrors, the performed analysis demonstrates that 6 laser guide stars (on an outer ring of 7.2' diameter) and up to 5 natural guide stars of magnitude R 2 resolution element at H band whatever the target direction in the centred 5' science field for median seeing conditions. In terms of sky coverage, the probability of finding the 5 natural guide stars is close to 90% at galactic latitudes |b| ~ 60 deg. Several MOAO demonstration activities are also ongoing.