1,525 research outputs found

    Local blur estimation based on toggle mapping

    A local blur estimation method is proposed, based on the difference between the gradient and the residue of the toggle mapping. This method is able to compare the quality of images with different content and does not require a contour detection step. Qualitative results are shown in the context of the LINX project. Then, quantitative results are given on the DIQA database, outperforming the combination of classical blur detection methods reported in the literature.
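As a rough illustration of the quantities named in this abstract, the sketch below computes a morphological gradient and a toggle-mapping residue on a 1-D signal. It assumes the common definition of toggle mapping (each sample moves to whichever of its dilation or erosion is closer) and takes the residue as the absolute change the mapping introduces. The function names, the window radius, and the use of 1-D lists are assumptions for illustration; the paper's actual blur measure is not given in the abstract.

```python
def dilate(signal, radius=1):
    """Flat morphological dilation: sliding-window maximum."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def erode(signal, radius=1):
    """Flat morphological erosion: sliding-window minimum."""
    n = len(signal)
    return [min(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def toggle_mapping(signal, radius=1):
    """Replace each sample by its dilation or erosion, whichever is closer.

    Samples that are equidistant from both are left unchanged.
    """
    out = []
    for f, hi, lo in zip(signal, dilate(signal, radius), erode(signal, radius)):
        if hi - f < f - lo:
            out.append(hi)
        elif f - lo < hi - f:
            out.append(lo)
        else:
            out.append(f)
    return out


def gradient_and_residue(signal, radius=1):
    """Return (morphological gradient, toggle-mapping residue) per sample."""
    gradient = [hi - lo for hi, lo in zip(dilate(signal, radius), erode(signal, radius))]
    residue = [abs(t - f) for t, f in zip(toggle_mapping(signal, radius), signal)]
    return gradient, residue


sharp_edge = [0, 0, 0, 10, 10, 10]
blurred_edge = [0, 0, 2, 5, 8, 10, 10]
sharp_gradient, sharp_residue = gradient_and_residue(sharp_edge)
blurred_gradient, blurred_residue = gradient_and_residue(blurred_edge)
```

On the ideal step edge the residue is zero everywhere while the gradient is large, whereas the smoothed edge leaves a nonzero residue. This is the intuition behind comparing the two quantities, under the assumptions stated above.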

    A Book Reader Design for Persons with Visual Impairment and Blindness

    The objective of this dissertation is to provide a new design approach to a fully automated book reader for individuals with visual impairment and blindness that is portable and cost effective. This approach relies on the geometry of the design setup and provides the mathematical foundation for integrating, in a unique way, a 3-D space surface map from a low-resolution time of flight (ToF) device with a high-resolution image as a means to enhance the reading accuracy of images warped by the page curvature of bound books and magazines. The merits of this low-cost but effective automated book reader design include: (1) a seamless registration process of the two imaging modalities so that the low-resolution (160 x 120 pixels) height map, acquired by an Argos3D-P100 camera, accurately covers the entire book spread as captured by the high-resolution image (3072 x 2304 pixels) of a Canon G6 camera; (2) a mathematical framework for overcoming the difficulties associated with the curvature of open bound books, a process referred to as the dewarping of the book spread images; and (3) an image correction performance comparison between the uniform and full height maps to determine which map provides the highest Optical Character Recognition (OCR) reading accuracy possible. The design concept could also be applied to address the challenging process of book digitization. This method is dependent on the geometry of the book reader setup for acquiring a 3-D map that yields high reading accuracy once appropriately fused with the high-resolution image. The experiments were performed on a dataset consisting of 200 pages with their corresponding computed and co-registered height maps, which are made available to the research community (cate-book3dmaps.fiu.edu). Improvements to the character reading accuracy due to the correction steps were quantified by passing the corrected images to an OCR engine and tabulating the number of misrecognized characters.
Furthermore, the resilience of the book reader was tested by introducing a rotational misalignment to the book spreads and comparing the OCR accuracy to that obtained with the standard alignment. The standard alignment yielded an average reading accuracy of 95.55% with the uniform height map (i.e., the height values of the central row of the 3-D map are replicated to approximate all other rows), and 96.11% with the full height maps (i.e., each row has its own height values as obtained from the 3-D camera). When the rotational misalignments were taken into account, the average accuracies were 90.63% and 94.75% for the same respective height maps, indicating that the full height map method is more resilient to potential misalignments.
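The difference between the two height maps described above can be sketched in a few lines. This is a minimal illustration, not the dissertation's implementation: the height map is assumed to be a list of rows, the function name `uniform_height_map` is hypothetical, and the central row is assumed to be the one at index `len(rows) // 2`.

```python
def uniform_height_map(full_map):
    """Approximate every row of a height map by a copy of its central row.

    `full_map` is a list of rows (each a list of height values), standing in
    for the 3-D camera's per-row measurements.
    """
    if not full_map:
        raise ValueError("height map must contain at least one row")
    central_row = full_map[len(full_map) // 2]  # assumed choice of central row
    return [list(central_row) for _ in full_map]


# Toy 3 x 3 map; the abstract reports a 160 x 120 map from the ToF camera.
full_map = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
uniform_map = uniform_height_map(full_map)
```

The full height map keeps `full_map` unchanged, so any row-to-row variation (for example, from a rotated book spread) is preserved, while the uniform map discards it.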

    Use of the Smartphone Camera to Monitor Adherence to Inhaled Therapy

    Self-management strategies can lead to improved health outcomes, fewer unscheduled treatments, and improved disease control. Compliance with inhaled control drugs is essential to achieve good clinical outcomes in patients with chronic respiratory diseases. However, compliance assessments are difficult to make trustworthy, because patients often self-report high compliance rates and such reports are considered unreliable. This thesis aims to enable reliable adherence measurement by developing a mobile application module that objectively verifies inhaler usage from image snapshots of the inhaler's dose counter. To achieve this, a mobile application module featuring pre- and post-processing techniques and a default machine learning framework was built to detect inhalers and dose counter numbers. In addition, to improve the app's text recognition on one of the worst-performing inhalers, a machine learning model was trained on an inhaler image dataset. Some of the features developed during this project were incorporated into the current version of InspirerMundi, a medication management mobile application planned to be made available on the PlayStore by the end of 2021. The proposed approach was validated on a series of different inhaler image datasets. Tests with the default machine learning configuration showed correct detection of dosage counters for 70% of inhaler registration events and 93% for three commonly used inhalers in Portugal. The trained model, in turn, had an average accuracy of 88% in recognizing the digits on the dose counter of one of the worst-performing inhaler models. These results show the potential of mobile and embedded capabilities to provide additional evidence of inhaler compliance. These systems can help bridge the gap between patients and healthcare professionals.
By empowering patients with disease self-management and drug adherence tools and providing additional relevant data, these systems pave the way for informed disease management decisions.

    Video summarisation: A conceptual framework and survey of the state of the art

    This is the post-print (final draft post-refereeing) version of the article. Copyright © 2007 Elsevier Inc. Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user based information that is unobtrusively sourced, in order to overcome longstanding challenges such as the semantic gap and providing video summaries that have greater relevance to individual users.

    Automatic optical inspection for detecting keycaps misplacement using Tesseract optical character recognition

    This research study aims to develop automatic optical inspection (AOI) for detecting keycap misplacement on keyboards. The AOI hardware was designed using an industrial camera with an additional mechanical jig and lighting system. Optical character recognition (OCR) using the Tesseract OCR engine is the proposed method to detect keycap misplacement. Captured images were cropped using regions of interest (ROIs) predefined during setup. The cropped ROIs were then processed to obtain binary images, and Tesseract processed these binary images to recognize the text on the keycaps. Keycap misplacement could be identified by comparing the predicted text with the actual text on the golden sample. Experiments on 25 defective and 25 non-defective samples provided a classification accuracy of 97.34%, a precision of 100%, and a recall of 90.70%. Meanwhile, the character error rate (CER) measured on a total of 57 characters was 10.53%. This outcome has implications for developing AOI for various keyboard products. In addition, the precision of 100% signifies that, in these experiments, every sample the proposed method flagged as defective was in fact defective. Such outcomes are critical in industrial applications to prevent defective products from circulating in the market.
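The character error rate mentioned above is conventionally the edit distance between the recognized text and the reference text, divided by the reference length. The sketch below shows that conventional definition in plain Python; it is an assumption that the study used this exact definition, and the function names and sample strings are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                          # deletion
                current[j - 1] + 1,                       # insertion
                previous[j - 1] + (ref_char != hyp_char)  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical keycap label from the golden sample versus an OCR prediction.
golden_text = "Enter"
predicted_text = "Entcr"
cer = character_error_rate(golden_text, predicted_text)
is_misplaced = predicted_text != golden_text
```

Under this definition, the reported 10.53% over 57 characters is consistent with 6 character errors (6 / 57 ≈ 0.1053), although the abstract does not state the error count.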

    Kentucky's PRISM-Based Automated Ramp Screening System Evaluation

    In 2010, Kentucky implemented a Performance Registration Information Systems and Management (PRISM) based automated ramp screening system (PARSS) at the Boone County inspection station on southbound I-71. The purpose of the PARSS is to identify and screen every vehicle that enters the Boone County inspection station. The system provides automated screening of trucks based on the license plate number and the USDOT number displayed on the vehicle. If it is determined that the vehicle should be stopped for inspection, that decision is communicated to the truck driver via the existing directional arrows that direct drivers to the static scale for inspection. A thorough evaluation was conducted to assess the performance of the system (i.e., does it do what it was intended to do?), the value of the system in identifying vehicles for inspection with PRISM or CVISN-related issues, and the potential for more widespread deployment of this type of screening system. The evaluation also included a side-by-side comparison of the two automated license plate reader (ALPR) systems.