431 research outputs found

    Automatic text segmentation and text recognition for video indexing

    Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in the videos, since it enables content-based browsing. We present our methods for automatic segmentation of text in digital videos. The output is passed directly to a standard OCR software package in order to translate the segmented text into ASCII. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. In particular, the inter-frame dependencies of the characters provide new possibilities for refining the segmentation. A straightforward indexing and retrieval scheme is then introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for indexing and retrieving relevant video sequences in a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher-level semantics in videos.
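    The abstract does not spell out the indexing and retrieval scheme, so the following is only a minimal sketch of one plausible approach: an inverted index from recognized words to frame numbers. The function names `build_index` and `search`, the word tokenization, and the sample OCR output are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import defaultdict


def build_index(ocr_results):
    """Map each lower-cased word to the sorted frame numbers where OCR found it.

    ocr_results: iterable of (frame_number, recognized_text) pairs (hypothetical format).
    """
    index = defaultdict(set)
    for frame_no, text in ocr_results:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(frame_no)
    return {word: sorted(frames) for word, frames in index.items()}


def search(index, query):
    """Return the sorted frame numbers whose text contains every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    frame_sets = [set(index.get(word, ())) for word in words]
    return sorted(set.intersection(*frame_sets))


# Hypothetical OCR output: text recognized in four video frames.
ocr_results = [
    (120, "Evening News"),
    (121, "Evening News"),
    (480, "Weather Report"),
    (900, "Sports news"),
]
index = build_index(ocr_results)
print(search(index, "evening news"))
```

    Because on-screen text usually persists over many consecutive frames, the frame numbers returned for a query tend to form runs that can be merged into video sequences.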

    Use of the Smartphone Camera to Monitor Adherence to Inhaled Therapy

    Self-management strategies can lead to improved health outcomes, fewer unscheduled treatments, and improved disease control. Adherence to inhaled controller medication is essential to achieve good clinical outcomes in patients with chronic respiratory diseases. However, adherence assessments are difficult to trust, because patients often self-report high adherence rates and such reports are considered unreliable. This thesis aims to enable reliable adherence measurement by developing a mobile application module that objectively verifies inhaler usage from image snapshots of the inhaler's dose counter. To achieve this, a mobile application module featuring pre- and post-processing techniques and a default machine learning framework was built to detect inhalers and dose counter numbers. In addition, to improve the app's text recognition on one of the worst-performing inhalers, a machine learning model was trained on an inhaler image dataset. Some of the features developed during this project were incorporated into the current version of InspirerMundi, a medication management mobile application planned to be made available on the PlayStore by the end of 2021. The proposed approach was validated on a series of different inhaler image datasets. Tests with the default machine learning configuration showed correct detection of dose counters for 70% of inhaler registration events, and for 93% with three inhalers commonly used in Portugal. The trained model had an average accuracy of 88% in recognizing the digits on the dose counter of one of the worst-performing inhaler models. These results show the potential of mobile and embedded capabilities to provide additional evidence of inhaler adherence. Such systems can help bridge the gap between patients and healthcare professionals. By empowering patients with disease self-management and drug adherence tools, and by providing additional relevant data, these systems pave the way for informed disease management decisions.

    Neural text line extraction in historical documents: a two-stage clustering approach

    Making accessible the valuable cultural heritage hidden in countless scanned historical documents is the motivation for the presented dissertation. The fully automatic text line extraction methodology developed here combines state-of-the-art machine learning techniques with modern image processing methods. It demonstrates its quality by outperforming several other approaches on a couple of benchmark datasets. The method is already being used by a wide audience of researchers from different disciplines and thus contributes its (small) part to the aforementioned goal.

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (the Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use.

    Extraction of Text from Images and Videos

    Ph.D., Doctor of Philosophy

    The generalization of the R-transform for invariant pattern representation

    The beneficial properties of the Radon transform make it a useful intermediate representation for extracting invariant features from pattern images for the purpose of indexing and matching. This paper revisits the use of Radon images by taking a generic view of a popular Radon-based transform and pattern descriptor, the R-transform and the R-signature. It introduces a class of transforms and descriptors that spatially describe patterns in all directions and at different levels, while maintaining the beneficial properties of the conventional R-transform and R-signature. The domain of this class, which is delimited by the existence of singularities and by the effects of sampling/quantization and additive noise, is examined. Moreover, the ability of the generic R-transform to encode the dominant directions of a pattern is discussed, which adds to the robustness of the generic R-signature to additive noise. Experiments on grayscale and binary noisy datasets confirm the stability of dominant direction encoding by the generic R-transform and the superiority of the generic R-signature over existing invariant pattern descriptors.
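    For orientation, the definitions behind these terms can be written out as follows. The Radon transform and the conventional R-transform are given as they are commonly stated in the literature; the exponent form of the generic transform and the symbol $m$ are an assumption about the class the abstract refers to, not something the abstract itself states.

```latex
% Radon transform of a pattern f(x, y): integral of f along the line
% at angle \theta and signed distance \rho from the origin.
\mathcal{R}_f(\theta, \rho) = \iint f(x, y)\,
    \delta(\rho - x\cos\theta - y\sin\theta)\, dx\, dy

% Conventional R-transform: integrate the squared Radon image over \rho.
R_f(\theta) = \int_{-\infty}^{\infty} \mathcal{R}_f^{2}(\theta, \rho)\, d\rho

% Assumed generic form: replace the exponent 2 by a parameter m.
R_f^{(m)}(\theta) = \int_{-\infty}^{\infty} \mathcal{R}_f^{m}(\theta, \rho)\, d\rho
```

    Under this assumed form, $m = 1$ gives $\int \mathcal{R}_f(\theta, \rho)\, d\rho = \iint f(x, y)\, dx\, dy$ for every $\theta$, a constant that carries no directional information. This is one plausible reading of why the usable range of the parameter is restricted.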

    Human-Centric Machine Vision

    Recently, algorithms for processing visual information have evolved greatly, providing efficient and effective solutions that cope with the variability and complexity of real-world environments. These achievements have led to the development of Machine Vision systems that go beyond typical industrial applications, where the environments are controlled and the tasks are very specific, towards innovative solutions that address people's everyday needs. Human-Centric Machine Vision can help to solve problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human-machine interfaces. In such applications it is necessary to handle changing, unpredictable and complex situations, and to take the presence of humans into account.

    Digital scaling of binary images

    Thesis (M.S.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1979. By Robert A. Ulichney. Microfiche copy available in Archives and Engineering. Includes bibliographical references.