1,509 research outputs found

    Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition

    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    [EN] Existing transcripts for historic manuscripts are a very valuable resource for training models useful for automatic recognition, aided transcription, and/or indexing of the remaining untranscribed parts of these collections. However, these existing transcripts generally exhibit two main problems which hinder their convenience: a) the text of the transcripts is seldom aligned with manuscript lines, and b) the text often deviates significantly from what can be seen in the manuscript, either because the writing style has been modernized or abbreviations have been expanded, or both. This work presents an analysis of these problems and discusses possible solutions for minimizing the human effort needed to adapt existing transcripts in order to render them usable. The empirical results presented show the large performance gain that can be obtained by adequately adapting the transcripts, thus motivating future development of the proposed solutions.
    We are very grateful to Carlos Lechner and Celio Hernández, who helped in the creation of the ground truth of the Alcaraz dataset. This work has been partially supported by the European Union (EU) Horizon 2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), EU project HIMANIS (JPICH programme, Spanish grant Ref: PCIN-2015-068) and MINECO/FEDER, UE under project TIN2015-70924-C2-1-R.
    Villegas, M.; Toselli, AH.; Romero Gómez, V.; Vidal, E. (2016). Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition. IEEE. https://doi.org/10.1109/ICFHR.2016.22

    A Set of Benchmarks for Handwritten Text Recognition on Historical Documents

    [EN] Handwritten Text Recognition is an important requirement in order to make visible the contents of the myriads of historical documents residing in public and private archives and libraries worldwide. Automatic Handwritten Text Recognition (HTR) is a challenging problem that requires a careful combination of several advanced Pattern Recognition techniques, including but not limited to Image Processing, Document Image Analysis, Feature Extraction, Neural Network approaches and Language Modeling. The progress of this kind of system is strongly bound by the availability of adequate benchmarking datasets, software tools and reproducible results achieved using the corresponding tools and datasets. Based on English and German historical documents proposed in recent open competitions at ICDAR and ICFHR conferences between 2014 and 2017, this paper introduces four HTR benchmarks in order of increasing complexity from several points of view. For each benchmark, a specific system is proposed which improves on the results published so far under comparable conditions. Therefore, this paper establishes new state-of-the-art baseline systems and results which aim at becoming new challenges that would hopefully drive further improvement of HTR technologies. Both the datasets and the software tools used to implement the baseline systems are made freely accessible for research purposes. © 2019 Elsevier Ltd. All rights reserved.
    This work has been partially supported through the European Union's H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), as well as by the BBVA Foundation through the 2017-2018 and 2018-2019 Digital Humanities research grants "Carabela" and "HisClima - Dos Siglos de Datos Climáticos", and by EU JPICH project "HOME - History Of Medieval Europe" (Spanish PEICTI Ref. PC12018-093122).
    Sánchez Peiró, JA.; Romero, V.; Toselli, AH.; Villegas, M.; Vidal, E. (2019). A Set of Benchmarks for Handwritten Text Recognition on Historical Documents. Pattern Recognition. 94:122-134. https://doi.org/10.1016/j.patcog.2019.05.025

    A comprehensive dataset of annotated brain metastasis MR images with clinical and radiomic data.

    Brain metastasis (BM) is one of the main complications of many cancers, and the most frequent malignancy of the central nervous system. Imaging studies of BMs are routinely used for diagnosis of disease, treatment planning and follow-up. Artificial Intelligence (AI) has great potential to provide automated tools to assist in the management of disease. However, AI methods require large datasets for training and validation, and to date there has been just one publicly available imaging dataset of 156 BMs. This paper publishes 637 high-resolution imaging studies of 75 patients harboring 260 BM lesions, and their respective clinical data. It also includes semi-automatic segmentations of 593 BMs, including pre- and post-treatment T1-weighted cases, and a set of morphological and radiomic features for the cases segmented. This data-sharing initiative is expected to enable research into and performance evaluation of automatic BM detection, lesion segmentation, disease status evaluation and treatment planning methods for BMs, as well as the development and validation of predictive and prognostic tools with clinical applicability.

    Role of age and comorbidities in mortality of patients with infective endocarditis

    [Purpose]: The aim of this study was to analyse the characteristics of patients with infective endocarditis (IE) in three age groups and to assess the ability of age and the Charlson Comorbidity Index (CCI) to predict mortality. [Methods]: Prospective cohort study of all patients with IE included in the GAMES Spanish database between 2008 and 2015. Patients were stratified into three age groups: <65 years, 65 to 80 years, and ≥80 years. The area under the receiver-operating characteristic (AUROC) curve was calculated to quantify the diagnostic accuracy of the CCI to predict mortality risk. [Results]: A total of 3120 patients with IE (1327 <65 years; 1291 65-80 years; 502 ≥80 years) were enrolled. Fever and heart failure were the most common presentations of IE, with no differences among age groups. The proportion of patients ≥80 years who underwent surgery was significantly lower than in the other age groups (14.3%, 65 years; 20.5%, 65-79 years; 31.3%, ≥80 years). In-hospital mortality was lower in the <65-year group (20.3%, <65 years; 30.1%, 65-79 years; 34.7%, ≥80 years; p < 0.001), as was 1-year mortality (3.2%, <65 years; 5.5%, 65-80 years; 7.6%, ≥80 years; p = 0.003). Independent predictors of mortality were age ≥80 years (hazard ratio [HR]: 2.78; 95% confidence interval [CI]: 2.32–3.34), CCI ≥3 (HR: 1.62; 95% CI: 1.39–1.88), and non-performed surgery (HR: 1.64; 95% CI: 11.16–1.58). When the three age groups were compared, the AUROC for CCI was significantly larger for patients aged <65 years (p < 0.001) for both in-hospital and 1-year mortality. [Conclusion]: There were no differences in the clinical presentation of IE between the groups. Age ≥80 years, high comorbidity (measured by CCI), and non-performance of surgery were independent predictors of mortality in patients with IE. CCI could help to identify those patients with IE and surgical indication who present a lower risk of in-hospital and 1-year mortality after surgery, especially in the <65-year group.
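    As an illustration of the AUROC measure named in the abstract above, the following is a minimal sketch that computes it as the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and the toy CCI values and outcomes are hypothetical and are not taken from the GAMES database; the study's own statistical procedure is not described in the abstract.

```python
from itertools import product


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: numeric predictor values (here, a comorbidity index).
    labels: 1 for the event (death), 0 for no event.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each (positive, negative) pair scores 1 if correctly ranked, 0.5 if tied.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
cci = [0, 1, 2, 3, 4, 5, 1, 6]
died = [0, 0, 0, 1, 0, 1, 0, 1]
print(f"AUROC of CCI for mortality: {auroc(cci, died):.3f}")
```

    A value of 0.5 corresponds to a predictor no better than chance and 1.0 to perfect separation. The pairwise form is quadratic in the number of patients, so a rank-based implementation would be preferable for large cohorts.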

    Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV

    Many measurements and searches for physics beyond the standard model at the LHC rely on the efficient identification of heavy-flavour jets, i.e. jets originating from bottom or charm quarks. In this paper, the discriminating variables and the algorithms used for heavy-flavour jet identification during the first years of operation of the CMS experiment in proton-proton collisions at a centre-of-mass energy of 13 TeV are presented. Heavy-flavour jet identification algorithms have been improved compared to those used previously at centre-of-mass energies of 7 and 8 TeV. For jets with transverse momenta in the range expected in simulated $\mathrm{t}\overline{\mathrm{t}}$ events, these new developments result in an efficiency of 68% for the correct identification of a b jet for a probability of 1% of misidentifying a light-flavour jet. The improvement in relative efficiency at this misidentification probability is about 15%, compared to previous CMS algorithms. In addition, for the first time algorithms have been developed to identify jets containing two b hadrons in Lorentz-boosted event topologies, as well as to tag c jets. The large data sample recorded in 2016 at a centre-of-mass energy of 13 TeV has also allowed the development of new methods to measure the efficiency and misidentification probability of heavy-flavour jet identification algorithms. The heavy-flavour jet identification efficiency is measured with a precision of a few per cent at moderate jet transverse momenta (between 30 and 300 GeV) and about 5% at the highest jet transverse momenta (between 500 and 1000 GeV).
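    The abstract above quotes an efficiency at a fixed misidentification probability. As a rough sketch of what such a working point means, the code below picks the loosest cut on a per-jet discriminator score that keeps the light-flavour misidentification rate at or below a target, then reports the fraction of b jets passing that cut. The function `working_point` and all scores are hypothetical toy values; this is not the CMS tagging software or its calibration procedure.

```python
def working_point(b_scores, light_scores, target_misid=0.01):
    """Return (threshold, b_efficiency, misid_rate).

    A jet is tagged when its score is strictly above the threshold.
    The threshold is the lowest light-jet score for which the fraction
    of tagged light jets does not exceed target_misid.
    """
    if not b_scores or not light_scores:
        raise ValueError("both samples must be non-empty")
    for thr in sorted(set(light_scores)):
        misid = sum(s > thr for s in light_scores) / len(light_scores)
        if misid <= target_misid:
            eff = sum(s > thr for s in b_scores) / len(b_scores)
            return thr, eff, misid
    # Not reached: at the largest light score the misid rate is zero.
    raise RuntimeError("no threshold found")


# Hypothetical discriminator scores, for illustration only.
light_scores = [i / 200 for i in range(100)]  # 0.000 ... 0.495
b_scores = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.3, 0.99]

thr, eff, misid = working_point(b_scores, light_scores)
print(f"threshold={thr:.3f}  b efficiency={eff:.2f}  misid={misid:.3f}")
```

    Tightening `target_misid` raises the threshold and lowers the b-jet efficiency, which is the trade-off that makes efficiencies comparable only at a stated misidentification probability.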

    Evidence for the Higgs boson decay to a bottom quark–antiquark pair


    Transforming scholarship in the archives through handwritten text recognition:Transkribus as a case study

    Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point of the advanced use of digitised heritage content. The paper aims to discuss these issues. - Design/methodology/approach: This paper adopts a case study approach, using the development and delivery of the one openly available HTR platform for manuscript material. - Findings: Transkribus has demonstrated that HTR is now a useable technology that can be employed in conjunction with mass digitisation to generate accurate transcripts of archival material. Use cases are demonstrated, and a cooperative model is suggested as a way to ensure sustainability and scaling of the platform. However, funding and resourcing issues are identified. - Research limitations/implications: The paper presents results from projects: further user studies could be undertaken involving interviews, surveys, etc. - Practical implications: Only HTR provided via Transkribus is covered: however, this is the only publicly available platform for HTR on individual collections of historical documents at time of writing and it represents the current state-of-the-art in this field. - Social implications: The increased access to information contained within historical texts has the potential to be transformational for both institutions and individuals. - Originality/value: This is the first published overview of how HTR is used by a wide archival studies community, reporting and showcasing current application of handwriting technology in the cultural heritage sector.

    Children’s and adolescents’ rising animal-source food intakes in 1990–2018 were impacted by age, region, parental education and urbanicity

    Animal-source foods (ASF) provide nutrition for children and adolescents’ physical and cognitive development. Here, we use data from the Global Dietary Database and Bayesian hierarchical models to quantify global, regional and national ASF intakes between 1990 and 2018 by age group across 185 countries, representing 93% of the world’s child population. Mean ASF intake was 1.9 servings per day, with 16% of children consuming at least three daily servings. Intake was similar between boys and girls, but higher among urban children with educated parents. Consumption varied by age from 0.6 at <1 year to 2.5 servings per day at 15–19 years. Between 1990 and 2018, mean ASF intake increased by 0.5 servings per week, with increases in all regions except sub-Saharan Africa. In 2018, total ASF consumption was highest in Russia, Brazil, Mexico and Turkey, and lowest in Uganda, India, Kenya and Bangladesh. These findings can inform policy to address malnutrition through targeted ASF consumption programmes.

    Incident type 2 diabetes attributable to suboptimal diet in 184 countries

    The global burden of diet-attributable type 2 diabetes (T2D) is not well established. This risk assessment model estimated T2D incidence among adults attributable to direct and body weight-mediated effects of 11 dietary factors in 184 countries in 1990 and 2018. In 2018, suboptimal intake of these dietary factors was estimated to account for 14.1 million (95% uncertainty interval (UI), 13.8–14.4 million) incident T2D cases, representing 70.3% (68.8–71.8%) of new cases globally. The largest T2D burdens were attributable to insufficient whole-grain intake (26.1% (25.0–27.1%)), excess refined rice and wheat intake (24.6% (22.3–27.2%)) and excess processed meat intake (20.3% (18.3–23.5%)). Across regions, the highest proportional burdens were in central and eastern Europe and central Asia (85.6% (83.4–87.7%)) and Latin America and the Caribbean (81.8% (80.1–83.4%)), and the lowest proportional burdens were in South Asia (55.4% (52.1–60.7%)). Proportions of diet-attributable T2D were generally larger in men than in women and were inversely correlated with age. Diet-attributable T2D was generally larger among urban versus rural residents and higher versus lower educated individuals, except in high-income countries, central and eastern Europe and central Asia, where burdens were larger in rural residents and in lower educated individuals. Compared with 1990, global diet-attributable T2D increased by 2.6 absolute percentage points (8.6 million more cases) in 2018, with variation in these trends by world region and dietary factor. These findings inform nutritional priorities and clinical and public health planning to improve dietary quality and reduce T2D globally.