11 research outputs found

    The CONTENT4ALL project

    Low-cost portable text recognition and speech synthesis with generic software, l

    The final publication is available at link.springer.com. Blind persons or people with reduced eyesight could benefit from a portable system that can interpret textual information in the surrounding environment and speak directly to the user. The need for such a system was surveyed with a questionnaire, and a prototype system was built using generic, inexpensive components that are readily available. The system architecture is component-based so that every module can be replaced with another generic module. Even though the system partly misrecognizes text in a versatile environment, an evaluation with five actual users suggested that the system can provide genuine additional value in coping with everyday issues outdoors. Peer reviewed
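
    The abstract describes a component-based pipeline in which every module is replaceable. As a minimal illustration of that design, the sketch below wires capture, OCR, and speech synthesis together as injected callables; pytesseract and pyttsx3 are assumed stand-ins for the unnamed recognition and synthesis components, and the input file name is hypothetical.

```python
# Minimal sketch of a component-based text-to-speech pipeline: each stage is
# a plain callable, so any module can be swapped for another generic one.
from PIL import Image
import pytesseract   # assumed OCR component
import pyttsx3       # assumed speech-synthesis component

def capture(path: str) -> Image.Image:
    # Stand-in for a digital-camera grab: load a still image from disk.
    return Image.open(path)

def recognize(image: Image.Image) -> str:
    # Generic OCR module; replaceable by anything mapping image -> text.
    return pytesseract.image_to_string(image)

def speak(text: str) -> None:
    # Generic speech module; replaceable by any text -> audio backend.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def run_pipeline(path: str, ocr=recognize, tts=speak) -> None:
    # Modules are injected, so each stage can be replaced independently.
    text = ocr(capture(path)).strip()
    if text:
        tts(text)

if __name__ == "__main__":
    run_pipeline("street_sign.jpg")  # hypothetical input image
```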

    Image-Based Mobile Service: Automatic Text Extraction and Translation

    We present a new mobile service for the translation of text from images taken by consumer-grade cell-phone cameras. Such a capability represents a new paradigm for users, where a simple image provides the basis for a service. The ubiquity and ease of use of cell-phone cameras enable acquisition and transmission of images anywhere and at any time a user wishes, delivering rapid and accurate translation over the phone’s MMS and SMS facilities. Target text is extracted completely automatically, requiring no bounding-box delineation or related user intervention. The service uses localization, binarization, text deskewing, and optical character recognition (OCR) in its analysis. Once the text is translated, an SMS message is sent to the user with the result. Further novelties are that no software installation is required on the handset, that any service provider or camera phone can be used, and that the entire service is implemented on the server side.
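
    The analysis chain named above (localization, binarization, deskewing, OCR) can be sketched as a simple server-side function. In the sketch below, binarization and OCR are concrete (PIL and pytesseract are assumed stand-ins for the paper's unnamed components), while localization, deskewing, and translation are reduced to labeled stubs, and the MMS/SMS transport is omitted.

```python
# Sketch of the server-side analysis chain for an image-to-translation service.
from PIL import Image
import pytesseract

def binarize(img: Image.Image, threshold: int = 128) -> Image.Image:
    # Global threshold on the grayscale image; a production system would
    # likely use an adaptive method instead.
    return img.convert("L").point(lambda p: 255 if p > threshold else 0)

def localize(img: Image.Image) -> Image.Image:
    return img  # stub: crop to detected text regions

def deskew(img: Image.Image) -> Image.Image:
    return img  # stub: rotate to correct text skew

def translate(text: str) -> str:
    return text  # hypothetical hook for any machine-translation backend

def handle_mms_image(path: str) -> str:
    img = deskew(binarize(localize(Image.open(path))))
    return translate(pytesseract.image_to_string(img).strip())

# The resulting string would then be sent back to the user as an SMS message.
```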

    INTEGRATED AUGMENTED REALITY SIGN BOARD MOBILE TRANSLATION SYSTEM

    Malaysia continuously receives students, businessmen and tourists from all parts of the world. More often than not, these visitors have little or no knowledge of the Malay language, and they often miss important messages conveyed through sign boards, information boards and other written media because they are not versed in Malay. Hence, an app has been proposed and developed that combines a highly accurate optical character recognition engine, Tesseract, with basic augmented reality concepts to provide translations of Malay text to foreigners. This app will prove useful because it provides translation with almost no user input, requiring only that users point their devices’ cameras at the sign-board text they intend to translate.
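
    As a rough sketch of the overlay idea described above: Tesseract can report per-word bounding boxes, and a translated string can be drawn over each box. The glossary, file names, and confidence threshold below are hypothetical, and pytesseract plus OpenCV stand in for the app's actual OCR and rendering layers.

```python
# Sketch: OCR a sign-board photo and draw word-level translations over it.
import cv2
import pytesseract

def malay_to_english(word: str) -> str:
    # Hypothetical glossary standing in for the app's translation step.
    glossary = {"keluar": "exit", "masuk": "enter", "dilarang": "forbidden"}
    return glossary.get(word.lower(), word)

frame = cv2.imread("signboard.jpg")  # hypothetical camera frame
data = pytesseract.image_to_data(frame, output_type=pytesseract.Output.DICT)

for i, word in enumerate(data["text"]):
    if word.strip() and float(data["conf"][i]) > 60:  # confident words only
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 255, 255), -1)
        cv2.putText(frame, malay_to_english(word), (x, y + h),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 0), 2)

cv2.imwrite("translated_overlay.jpg", frame)
```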

    Cell Phones as Imaging Sensors

    Camera phones are ubiquitous, and consumers have been adopting them faster than any other technology in modern history. When connected to a network, though, they are capable of more than just picture taking: suddenly, they gain access to the power of the cloud. We exploit this capability by providing a series of image-based personal advisory services. These are designed to work with any handset over any cellular carrier using commonly available Multimedia Messaging Service (MMS) and Short Message Service (SMS) features. Targeted at the unsophisticated consumer, these applications must be quick and easy to use, not requiring download capabilities or preplanning. Thus, all application processing occurs in the back-end system (i.e., as a cloud service) and not on the handset itself. Presenting an image to an advisory service in the cloud, a user receives information that can be acted upon immediately. Two of our examples involve color assessment, selecting cosmetics and home décor paint palettes; the third provides the ability to extract text from a scene. In the case of the color imaging applications, we have shown that our service rivals the advice quality of experts. The result of this capability is a new paradigm for mobile interactions: image-based information services exploiting the ubiquity of camera phones.
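
    A minimal sketch of the back-end-only design follows: the handset contributes nothing but a picture, and all processing happens server-side. HTTP is assumed here in place of the MMS transport described in the paper, and Flask, PIL, and pytesseract are hypothetical stand-ins for the actual service stack.

```python
# Sketch of a cloud-side advisory endpoint: receive an image, return text.
from flask import Flask, request
from PIL import Image
import pytesseract

app = Flask(__name__)

@app.route("/advise", methods=["POST"])
def advise():
    # The uploaded image is the entire query; nothing runs on the phone.
    img = Image.open(request.files["image"].stream)
    return {"text": pytesseract.image_to_string(img).strip()}

if __name__ == "__main__":
    app.run()
```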

    Scene Text Extraction using Convolutional Neural Network with Amended MSER

    Content in text format helps communicate relevant and specific information to users precisely. An approach for extracting text from natural scene images is introduced which employs an amended Maximally Stable Extremal Region detector (a-MSER) together with a deep learning framework, the You Only Look Once (YOLOv2) network. The proposed system, a-MSER with Scene Text Extraction using Modified YOLOv2 Network (STEMYN), performs remarkably well when evaluated on three publicly available datasets. The a-MSER method, a variation of MSER, is used to identify regions of interest; the algorithm handles intensity changes between text and background very effectively. The drawback of the original YOLOv2, a poor detection rate for small objects, is overcome by employing a 1 × 1 convolutional layer and enlarging the feature map from 13 × 13 to 26 × 26. Focal loss is applied to improve upon YOLOv2's existing cross-entropy classification loss. A repeated convolution layer in the deep part of the original YOLOv2 is removed to reduce network complexity, as it does not improve performance. Experimental results demonstrate that the proposed method is effective in identifying text in natural scene images.
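
    Two of the ingredients named above are standard enough to sketch: MSER region proposals and the focal-loss weighting. The sketch below uses plain OpenCV MSER, since the amended a-MSER variant is not specified in the abstract, and a simplified binary focal loss with a single alpha for both classes.

```python
# Sketch: MSER text-region proposals plus a simplified binary focal loss.
import cv2
import numpy as np

def mser_proposals(image_path: str):
    # Plain OpenCV MSER; the paper's a-MSER amends this detector.
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)  # candidate regions as (x, y, w, h)
    return boxes

def focal_loss(p: np.ndarray, y: np.ndarray,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    # FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t); relative to plain
    # cross-entropy, the (1 - p_t)**gamma factor down-weights easy examples.
    # (Simplified: one alpha shared by both classes.)
    p_t = np.where(y == 1, p, 1.0 - p)
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t + 1e-9)))

# Confident correct predictions contribute almost nothing to the loss:
print(focal_loss(np.array([0.95, 0.6]), np.array([1, 1])))
```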

    Computer-assisted acquisition of information for visually impaired

    The study examines various uses of computer technology in the acquisition of information by visually impaired people. For this study, 29 visually impaired persons took part in a survey about their experiences concerning acquisition of information and use of computers, especially with a screen magnification program, a speech synthesizer and a braille display. According to the responses, the evolution of computer technology offers visually impaired people an important means of coping with everyday activities and interacting with the environment. Nevertheless, the functionality of assistive technology needs further development to become more usable and versatile. Since the survey emphasized the challenges of independently observing the environment, the study led to the development of a portable text vision system called Tekstinäkö. Contrary to typical stand-alone applications, the Tekstinäkö system was constructed by combining devices and programs that are readily available on the consumer market. As the system operates, pictures taken by a digital camera are instantly transmitted to a text recognition program on a laptop computer, which reads the recognized text aloud using a speech synthesizer. Visually impaired test users described even the uncertain interpretations of environmental text given by the Tekstinäkö system as at least a welcome addition that complements their perception of the surroundings. It became clear that even modest development work can bring new, useful and valuable methods into the everyday life of disabled people. The unconventional production process of the system also proved efficient. The achieved results and the proposed working model offer one suggestion for giving due attention to the easily overlooked needs of people with special abilities. ACM Computing Classification System (1998): K.4.2 Social Issues: Assistive technologies for persons with disabilities; I.4.9 Image processing and computer vision: Applications

    Towards Automatic Sign Translation

    Signs are everywhere in our lives. They make our lives easier when we are familiar with them, but sometimes they also pose problems; for example, a tourist might not be able to understand signs in a foreign country. In this paper, we present our efforts towards automatic sign translation. We discuss methods for automatic sign detection and describe sign translation using example-based machine translation technology. We take a user-centered approach in developing an automatic sign translation system: the approach takes advantage of human intelligence in selecting an area of interest and a domain for translation when needed. A user can determine which sign is to be translated if multiple signs have been detected within the image. The selected part of the image is then processed, recognized, and translated. We have developed a prototype system that can recognize Chinese signs captured by a video camera, a common gadget for tourists, and translate them into English text or a voice stream.
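
    Example-based machine translation, as referenced above, reuses the translations of stored example pairs that closely match the new input. The toy sketch below illustrates only the matching step, using difflib over a tiny hypothetical example bank; the real system's corpus, matching metric, and recombination logic are not described in the abstract.

```python
# Toy sketch of example-based translation: match a recognized sign against
# stored example pairs and reuse the closest example's translation.
import difflib

# Hypothetical example bank: recognized Chinese sign text -> English.
EXAMPLES = {
    "出口": "exit",
    "禁止吸烟": "no smoking",
    "售票处": "ticket office",
}

def ebmt_translate(recognized: str) -> str:
    match = difflib.get_close_matches(recognized, list(EXAMPLES),
                                      n=1, cutoff=0.5)
    return EXAMPLES[match[0]] if match else recognized  # fall back to input

print(ebmt_translate("禁止吸烟"))  # -> "no smoking"
```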