
    Recognition of characters in document images using morphological operation

    In this paper, we deal with the problem of rectifying document images captured by digital cameras. Improvements in the resolution of digital camera sensors have enabled more and more applications for non-contact text capture, which is widely used for data entry from paper sources such as documents, sales receipts, and other printed records. Computerizing printed text is crucial so that it can be electronically searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, and text mining. Unfortunately, perspective distortion in the resulting image makes it hard for traditional optical character recognition (OCR) systems to properly identify the contents of the captured text. In this work we propose a new technique: a system that provides full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the form. Optical character recognition (OCR) is the mechanical or electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. OCR software detects and extracts each character in the text of a scanned image and, using the ASCII (American Standard Code for Information Interchange) character set, converts it into a computer-recognizable character. Once each character has been converted, the whole document is saved as an editable text document; accuracy rates can reach 99.5 per cent, although this level is not always achieved. The basic idea of OCR is to classify optical patterns, often contained in a digital image, corresponding to alphanumeric or other characters
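    The abstract above describes two stages: a morphological operation on the binarized image, then mapping each recognized character to its ASCII code. A minimal, hypothetical sketch of both (the glyph image and structuring-element size are illustrative, not the paper's actual pipeline):

```python
# Hypothetical sketch: binary morphological dilation to thicken glyph
# strokes, then converting recognized characters to ASCII code points.

def dilate(img, k=1):
    """Binary dilation with a (2k+1)x(2k+1) square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                img[yy][xx]
                for yy in range(max(0, y - k), min(h, y + k + 1))
                for xx in range(max(0, x - k), min(w, x + k + 1))
            ))
    return out

glyph = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],   # a single foreground pixel
    [0, 0, 0, 0, 0],
]
thick = dilate(glyph)              # the stroke grows by one pixel per side

# After classification, each character becomes its ASCII code point.
codes = [ord(c) for c in "OCR"]    # [79, 67, 82]
```

    Erosion is the dual operation (replace `any` with `all`); opening and closing combine the two to remove speckle noise before recognition.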

    Text Detection and Pose Estimation for a Reading Robot


    Textension: Digitally Augmenting Document Spaces in Analog Texts

    In this paper, we present a system that automatically adds visualizations and natural language processing applications to analog texts, using any web-based device with a camera. After taking a picture of a particular page or set of pages from a book or uploading an existing image, our system builds an interactive digital object that automatically inserts modular elements in a digital space. Leveraging the findings of previous studies, our framework augments the reading of analog texts with digital tools, making it possible to work with texts in both a digital and analog environment

    Extending the Page Segmentation Algorithms of the OCRopus Documentation Layout Analysis System

    With the advent of more powerful personal computers, inexpensive memory, and digital cameras, curators around the world are working towards preserving historical documents on computers. Since many of the organizations for which they work have limited funds, there is world-wide interest in a low-cost solution to obtaining these digital records in a computer-readable form. An open source layout analysis system called OCRopus is being developed for such a purpose. In its original state, though, it could not process documents that contained information other than text. Segmenting the page into text and non-text regions is the first step in analyzing a mixed-content document, but this capability did not exist in OCRopus. Therefore, the goal of this thesis was to add it so that OCRopus could process a full spectrum of documents. By default, the RAST page segmentation algorithm processed text-only documents at a target resolution of 300 DPI. In a separate module, the Voronoi algorithm divided the page into regions but did not classify them as text or non-text; additionally, it tended to oversegment non-text regions and was tuned to a resolution of 300 DPI. Therefore, the RAST algorithm was improved to recognize non-text regions, and the Voronoi algorithm was extended to classify text and non-text regions and merge non-text regions appropriately. Finally, both algorithms were modified to perform at a range of resolutions. Testing on a set of documents of different types showed an improvement of 15-40% for the RAST algorithm, giving it an average segmentation accuracy of about 80%. Partially due to the representation of the ground truth, the Voronoi algorithm did not perform as well as the improved RAST algorithm, averaging around 70% overall. Depending on the layout of the historical documents to be digitized, though, either algorithm could be sufficiently accurate to be utilized
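    Both algorithms above were tuned to a 300 DPI baseline and then extended to a range of resolutions. A common way to do that is to rescale every pixel-valued threshold by the ratio of the new resolution to the baseline; the sketch below illustrates the idea with hypothetical parameter names, not OCRopus's actual identifiers:

```python
# Hedged sketch: rescaling pixel-unit segmentation thresholds from the
# 300 DPI baseline the algorithms were originally tuned for.
# Parameter names are illustrative only.

BASE_DPI = 300

def scale_params(params, dpi):
    """Linearly rescale pixel-valued thresholds for a new resolution."""
    factor = dpi / BASE_DPI
    return {name: value * factor for name, value in params.items()}

baseline = {"min_gap_px": 10, "max_line_height_px": 60}
at_150 = scale_params(baseline, 150)
# {'min_gap_px': 5.0, 'max_line_height_px': 30.0}
```

    Linear scaling works for length-like parameters; area-like thresholds would scale with the square of the factor.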

    Hybrid Single and Dual Pattern Structured Light Illumination

    Structured Light Illumination (SLI) is a widely used 3D shape measurement technique in non-contact surface scanning. Multi-pattern SLI methods reconstruct the 3-D surface with high accuracy but are sensitive to object motion during pattern projection, and the scanning process is relatively slow. To reduce this sensitivity, single-pattern techniques such as Composite Pattern (CP) and Modified Composite Pattern (MCP) have been developed to achieve a high-speed scanning process. However, most single-pattern techniques exhibit significant banding artifacts and sacrifice accuracy. We focus on developing SLI techniques that achieve both high speed and high accuracy while tolerating relative motion. We first present a novel Two-Pattern Full Lateral Resolution (2PFLR) SLI method utilizing an MCP pattern for non-ambiguous phase followed by a single sinusoidal pattern for high accuracy. The surface phase modulates the single sinusoidal pattern, which is demodulated using a quadrature demodulation technique and then unwrapped by the MCP phase result. A single sinusoidal pattern reconstruction inherently has banding error. To effectively de-band the surface, we propose the Projector Space De-banding (PSDb) algorithm. We use projector space because the banding error is aligned with the projector coordinates, allowing more accurate estimation of the banding error. The 2PFLR system only allows relative motion within the FOV of the scanner; to extend the application of SLI, we present research on a Relative Motion 3-D (RM3D) scanner which utilizes a single-pattern technique. The pattern in the RM3D system is designed based on MCP but has white-space areas to capture the surface texture, and a constellation correlation filter method is used to estimate the scanner's trajectory and then align the 3-D surfaces reconstructed from each frame into a point cloud of the whole object surface
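    The quadrature demodulation step mentioned above can be sketched in one dimension: multiply the observed intensity by in-phase and quadrature carriers, low-pass (here, averaging over whole carrier periods), and recover the wrapped phase with an arctangent. The carrier frequency and sample count below are illustrative, not the thesis's actual scanner parameters:

```python
import math

# Hedged sketch of quadrature demodulation of a single sinusoidal pattern.
# For s[k] = cos(2*pi*f*k/n + phi), summing over whole periods gives
# I ~ (n/2) cos(phi) and Q ~ -(n/2) sin(phi), so phi = atan2(-Q, I).

def demodulate_phase(signal, freq, n):
    i_sum = sum(s * math.cos(2 * math.pi * freq * k / n)
                for k, s in enumerate(signal))
    q_sum = sum(s * math.sin(2 * math.pi * freq * k / n)
                for k, s in enumerate(signal))
    return math.atan2(-q_sum, i_sum)

n, freq, phi = 256, 8, 0.7   # phi models the surface-induced phase shift
observed = [math.cos(2 * math.pi * freq * k / n + phi) for k in range(n)]
recovered = demodulate_phase(observed, freq, n)   # ~0.7 (wrapped phase)
```

    The recovered phase is wrapped to (-pi, pi]; in 2PFLR the coarse MCP phase supplies the integer fringe order needed to unwrap it.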

    Satellite and UAV Platforms, Remote Sensing for Geographic Information Systems

    The present book contains ten articles illustrating the different possible uses of integrating UAV and satellite remotely sensed data in Geographic Information Systems to model and predict changes in both the natural and the human environment. It illustrates the powerful instruments provided by modern geo-statistical methods, modeling, and visualization techniques. These methods are applied to Arctic, tropical, and mid-latitude environments; agriculture, forest, wetland, and aquatic environments; and further engineering-related problems. The present Special Issue gives a balanced view of the current state of the field of geoinformatics

    Machine Learning-based Detection of Compensatory Balance Responses and Environmental Fall Risks Using Wearable Sensors

    Falls are the leading cause of fatal and non-fatal injuries among seniors worldwide, with serious and costly consequences. Compensatory balance responses (CBRs) are reactions to recover stability following a loss of balance, potentially resulting in a fall if sufficient recovery mechanisms are not activated. While the performance of CBRs is a demonstrated risk factor for falls in seniors, the frequency, type, and underlying cause of these incidents in everyday life have not been well investigated. This study was motivated by the lack of research on fall risk assessment methods that can be used for continuous, long-term mobility monitoring of the geriatric population during activities of daily living and in their dwellings. Wearable sensor systems (WSS) offer a promising approach for continuous real-time detection of gait and balance behavior to assess the risk of falling during activities of daily living. To detect CBRs, we record movement signals (e.g. acceleration) and the activity patterns of four muscles involved in maintaining balance, using wearable inertial measurement units (IMUs) and surface electromyography (sEMG) sensors. To develop more robust detection methods, we investigate machine learning approaches (e.g. support vector machines, neural networks) and successfully detect lateral CBRs during normal gait with accuracies of 92.4% and 98.1% using sEMG and IMU signals, respectively. Moreover, to detect environmental fall-related hazards that are associated with CBRs and affect the balance control behavior of seniors, we employ an egocentric mobile vision system mounted on participants' chests. Two algorithms (Gabor Barcodes and Convolutional Neural Networks) are developed. Our vision-based method detects 17 different classes of environmental risk factors (e.g. stairs, ramps, curbs) with 88.5% accuracy. To the best of the authors' knowledge, this study is the first to develop and evaluate an automated vision-based method for fall hazard detection
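    Classifiers such as the SVMs mentioned above are typically fed windowed statistics of the raw sensor stream rather than raw samples. A minimal sketch of that front end, with an illustrative window length and feature set rather than the study's exact choices:

```python
import math

# Hedged sketch: per-window mean and standard deviation of an
# acceleration trace, the kind of features a CBR classifier consumes.

def window_features(accel, win=4):
    """Mean and standard deviation per non-overlapping window."""
    feats = []
    for start in range(0, len(accel) - win + 1, win):
        w = accel[start:start + win]
        mean = sum(w) / win
        var = sum((a - mean) ** 2 for a in w) / win
        feats.append((mean, math.sqrt(var)))
    return feats

# Quiet stance followed by a sudden corrective movement (synthetic data).
accel = [0.0, 0.1, -0.1, 0.0, 2.0, -2.0, 2.0, -2.0]
feats = window_features(accel)
# The second window's much larger deviation is the kind of signature a
# trained classifier flags as a candidate compensatory balance response.
```

    Real pipelines add more features (signal magnitude area, frequency-domain energy) and overlapping windows, but the windowing structure is the same.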

    Autonomous Close Formation Flight of Small UAVs Using Vision-Based Localization

    As Unmanned Aerial Vehicles (UAVs) are integrated into the national airspace to comply with the 2012 Federal Aviation Administration Reauthorization Act, new civilian uses for robotic aircraft will come about in addition to the more obvious military applications. One particular area of interest for UAV development is the autonomous cooperative control of multiple UAVs. In this thesis, a decentralized leader-follower control strategy is designed, implemented, and tested from the follower’s perspective using vision-based localization. The tasks of localization and control were carried out with separate processing hardware dedicated to each task. First, software was written to estimate the relative state of a lead UAV in real-time from video captured by a camera on-board the following UAV. The software, written using OpenCV computer vision libraries and executed on an embedded single-board computer, uses the Efficient Perspective-n-Point algorithm to compute the 3-D pose from a set of 2-D image points. High-intensity, red, light emitting diodes (LEDs) were affixed to specific locations on the lead aircraft’s airframe to simplify the task of extracting the 2-D image points from video. Next, the following vehicle was controlled by modifying a commercially available, open source, waypoint-guided autopilot to navigate using the relative state vector provided by the vision software. A custom Hardware-In-Loop (HIL) simulation station was set up and used to derive the required localization update rate for various flight patterns and levels of atmospheric turbulence. HIL simulation showed that it should be possible to maintain formation, with a vehicle separation of 50 ± 6 feet and localization estimates updated at 10 Hz, for a range of flight conditions. Finally, the system was implemented in low-cost remote-controlled aircraft and flight tested to demonstrate formation convergence to 65.5 ± 15 feet of separation
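    Perspective-n-Point, used above, inverts the pinhole projection: known 3-D LED positions on the lead airframe map to 2-D image detections through the camera intrinsics, and EPnP solves for the pose that explains the observed pixels. A sketch of that forward model, with an illustrative LED layout and focal length rather than the thesis's actual values:

```python
# Hedged sketch of the pinhole forward model that Perspective-n-Point
# inverts. Points are in the camera frame; intrinsics are illustrative.

def project(points_3d, fx, fy, cx, cy):
    """Project camera-frame 3-D points (x, y, z) to pixel coordinates."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points_3d]

# Four LEDs (coordinates in feet) roughly 50 ft ahead of the follower.
leds = [(-2.0, 0.0, 50.0), (2.0, 0.0, 50.0),
        (0.0, -1.0, 50.0), (0.0, 1.0, 50.0)]
pixels = project(leds, fx=800.0, fy=800.0, cx=320.0, cy=240.0)
# EPnP takes these 2-D detections plus the known LED model and recovers
# the relative pose, i.e. the inverse of this mapping.
```

    In practice this inverse is computed with a library call such as OpenCV's solvePnP, fed the LED model points, the detected image points, and the calibrated camera matrix.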