2,177 research outputs found

    Vision-based single-stroke character recognition for wearable computing

    Get PDF
    Wearable computing offers advantages but making data entry easy remains a challenge. A new approach for data entry using a head-mounted digital camera to record characters drawn by hand gestures or by a pointer is introduced. In addition, each character is drawn as a single, isolated stroke using a Graffiti-like alphabet. As such, the resulting character recognition system has potential for application in mobile communication and computing devices

    SymbolDesign: A User-centered Method to Design Pen-based Interfaces and Extend the Functionality of Pointer Input Devices

    Full text link
    A method called "SymbolDesign" is proposed that can be used to design user-centered interfaces for pen-based input devices. It can also extend the functionality of pointer input devices such as the traditional computer mouse or the Camera Mouse, a camera-based computer interface. Users can create their own interfaces by choosing single-stroke movement patterns that are convenient to draw with the selected input device and by mapping them to a desired set of commands. A pattern could be the trace of a moving finger detected with the Camera Mouse or a symbol drawn with an optical pen. The core of the SymbolDesign system is a dynamically created classifier, in the current implementation an artificial neural network. The architecture of the neural network automatically adjusts according to the complexity of the classification task. In experiments, subjects used the SymbolDesign method to design and test the interfaces they created, for example, to browse the web. The experiments demonstrated good recognition accuracy and responsiveness of the user interfaces. The method provided an easily-designed and easily-used computer input mechanism for people without physical limitations, and, with some modifications, has the potential to become a computer access tool for people with severe paralysis.National Science Foundation (IIS-0093367, IIS-0308213, IIS-0329009, EIA-0202067

    Application on character recognition system on road sign for visually impaired: case study approach and future

    Get PDF
    Many visually impaired people worldwide are unable to travel safely and autonomously because they are physically unable to perceive effective visual information during their daily lives. In this research, we study how to extract the character information of the road sign and transmit it to the visually impaired effectively, so they can understand easier. Experimental method is to apply the Maximally Stable External Region and Stroke Width Transform method in Phase I so that the visually impaired person can recognize the letters on the road signs. It is to convey text information to the disabled. The result of Phase I using samples of simple road signs was to extract the sign information after dividing the exact character area, but the accuracy was not good for the Hangul (Korean characters) information. The initial experimental results in the Phase II succeeded in transmitting the text information on Phase I to the visually impaired. In the future, it will be required to develop a wearable character recognition system that can be attached to the visually impaired. In order to perform this task, we need to develop and verify a miniaturized and wearable character recognition system. In this paper, we examined the method of recognizing road sign characters on the road and presented a possibility that may be applicable to our final development

    Interaction Methods for Smart Glasses : A Survey

    Get PDF
    Since the launch of Google Glass in 2014, smart glasses have mainly been designed to support micro-interactions. The ultimate goal for them to become an augmented reality interface has not yet been attained due to an encumbrance of controls. Augmented reality involves superimposing interactive computer graphics images onto physical objects in the real world. This survey reviews current research issues in the area of human-computer interaction for smart glasses. The survey first studies the smart glasses available in the market and afterwards investigates the interaction methods proposed in the wide body of literature. The interaction methods can be classified into hand-held, touch, and touchless input. This paper mainly focuses on the touch and touchless input. Touch input can be further divided into on-device and on-body, while touchless input can be classified into hands-free and freehand. Next, we summarize the existing research efforts and trends, in which touch and touchless input are evaluated by a total of eight interaction goals. Finally, we discuss several key design challenges and the possibility of multi-modal input for smart glasses.Peer reviewe

    Cascaded Segmentation-Detection Networks for Word-Level Text Spotting

    Full text link
    We introduce an algorithm for word-level text spotting that is able to accurately and reliably determine the bounding regions of individual words of text "in the wild". Our system is formed by the cascade of two convolutional neural networks. The first network is fully convolutional and is in charge of detecting areas containing text. This results in a very reliable but possibly inaccurate segmentation of the input image. The second network (inspired by the popular YOLO architecture) analyzes each segment produced in the first stage, and predicts oriented rectangular regions containing individual words. No post-processing (e.g. text line grouping) is necessary. With execution time of 450 ms for a 1000-by-560 image on a Titan X GPU, our system achieves the highest score to date among published algorithms on the ICDAR 2015 Incidental Scene Text dataset benchmark.Comment: 7 pages, 8 figure

    Development of a text reading system on video images

    Get PDF
    Since the early days of computer science researchers sought to devise a machine which could automatically read text to help people with visual impairments. The problem of extracting and recognising text on document images has been largely resolved, but reading text from images of natural scenes remains a challenge. Scene text can present uneven lighting, complex backgrounds or perspective and lens distortion; it usually appears as short sentences or isolated words and shows a very diverse set of typefaces. However, video sequences of natural scenes provide a temporal redundancy that can be exploited to compensate for some of these deficiencies. Here we present a complete end-to-end, real-time scene text reading system on video images based on perspective aware text tracking. The main contribution of this work is a system that automatically detects, recognises and tracks text in videos of natural scenes in real-time. The focus of our method is on large text found in outdoor environments, such as shop signs, street names and billboards. We introduce novel efficient techniques for text detection, text aggregation and text perspective estimation. Furthermore, we propose using a set of Unscented Kalman Filters (UKF) to maintain each text regionÂżs identity and to continuously track the homography transformation of the text into a fronto-parallel view, thereby being resilient to erratic camera motion and wide baseline changes in orientation. The orientation of each text line is estimated using a method that relies on the geometry of the characters themselves to estimate a rectifying homography. This is done irrespective of the view of the text over a large range of orientations. We also demonstrate a wearable head-mounted device for text reading that encases a camera for image acquisition and a pair of headphones for synthesized speech output. Our system is designed for continuous and unsupervised operation over long periods of time. It is completely automatic and features quick failure recovery and interactive text reading. It is also highly parallelised in order to maximize the usage of available processing power and to achieve real-time operation. We show comparative results that improve the current state-of-the-art when correcting perspective deformation of scene text. The end-to-end system performance is demonstrated on sequences recorded in outdoor scenarios. Finally, we also release a dataset of text tracking videos along with the annotated ground-truth of text regions
    • …
    corecore