744 research outputs found

    Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification

    Full text link
    We present an exhaustive investigation of recent Deep Learning architectures, algorithms, and strategies for the task of document image classification to finally reduce the error by more than half. Existing approaches, such as the DeepDocClassifier, apply standard Convolutional Network architectures with transfer learning from the object recognition domain. The contribution of the paper is threefold: First, it investigates recently introduced very deep neural network architectures (GoogLeNet, VGG, ResNet) using transfer learning (from real images). Second, it proposes transfer learning from a huge set of document images, i.e. 400,000 documents. Third, it analyzes the impact of the amount of training data (document images) and other parameters to the classification abilities. We use two datasets, the Tobacco-3482 and the large-scale RVL-CDIP dataset. We achieve an accuracy of 91.13% for the Tobacco-3482 dataset while earlier approaches reach only 77.6%. Thus, a relative error reduction of more than 60% is achieved. For the large dataset RVL-CDIP, an accuracy of 90.97% is achieved, corresponding to a relative error reduction of 11.5%

    Feature Extraction Methods for Character Recognition

    Get PDF
    Not Include

    Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters

    Get PDF
    OBJECTIVES: The secondary use of medical data contained in electronic medical records, such as hospital discharge letters, is a valuable resource for the improvement of clinical care (e.g. in terms of medication safety) or for research purposes. However, the automated processing and analysis of medical free text still poses a huge challenge to available natural language processing (NLP) systems. The aim of this study was to implement a knowledge-based best of breed approach, combining a terminology server with integrated ontology, a NLP pipeline and a rules engine. METHODS: We tested the performance of this approach in a use case. The clinical event of interest was the particular drug-disease interaction "proton-pump inhibitor [PPI] use and osteoporosis". Cases were to be identified based on free text digital discharge letters as source of information. Automated detection was validated against a gold standard. RESULTS: Precision of recognition of osteoporosis was 94.19%, and recall was 97.45%. PPIs were detected with 100% precision and 97.97% recall. The F-score for the detection of the given drug-disease-interaction was 96,13%. CONCLUSION: We could show that our approach of combining a NLP pipeline, a terminology server, and a rules engine for the purpose of automated detection of clinical events such as drug-disease interactions from free text digital hospital discharge letters was effective. There is huge potential for the implementation in clinical and research contexts, as this approach enables analyses of very high numbers of medical free text documents within a short time period

    Parametric classification in domains of characters, numerals, punctuation, typefaces and image qualities

    Get PDF
    This thesis contributes to the Optical Font Recognition problem (OFR), by developing a classifier system to differentiate ten typefaces using a single English character ‘e’. First, features which need to be used in the classifier system are carefully selected after a thorough typographical study of global font features and previous related experiments. These features have been modeled by multivariate normal laws in order to use parameter estimation in learning. Then, the classifier system is built up on six independent schemes, each performing typeface classification using a different method. The results have shown a remarkable performance in the field of font recognition. Finally, the classifiers have been implemented on Lowercase characters, Uppercase characters, Digits, Punctuation and also on Degraded Images

    2D Grammar Extension of the CMP Mathematical Formulae On-line Recognition System

    Get PDF
    Projecte realitzat en col.laboració amb Czech Technical University in PragueIn the last years, the recognition of handwritten mathematical formulae has recieved an increasing amount of attention in pattern recognition research. However, the diversity of approaches to the problem and the lack of a commercially viable system indicate that there is still much research to be done in this area. In this thesis, I will describe the previous work on a system for on-line handwritten mathematical formulae recognition based on the structural construction paradigm and two-dimensional grammars. In general, this approach can be successfully used in the anaylysis of inputs composed of objects that exhibit rich structural relations. An important benefit of the structural construction is in not treating symbols segmentation and structural anaylsis as two separate processes which allows the system to perform segmentation in the context of the whole formula structure, helping to solve arising ambiguities more reliably. We explore the opening provided by the polynomial complexity parsing algorithm and extend the grammar by many new grammar production rules which made the system useful for formulae met in the real world. We propose several grammar extensions to support a wide range of real mathematical formulae, as well as new features implemented in the application. Our current approach can recognize functions, limits, derivatives, binomial coefficients, complex numbers and more
    corecore