11 research outputs found

    The CONTENT4ALL project

    Low-cost portable text recognition and speech synthesis with generic software, l

    The final publication is available at link.springer.com. Blind persons or people with reduced eyesight could benefit from a portable system that can interpret textual information in the surrounding environment and speak directly to the user. The need for such a system was surveyed with a questionnaire, and a prototype system was built using generic, inexpensive components that are readily available. The system architecture is component-based so that every module can be replaced with another generic module. Even though the system partly misrecognizes text in a versatile environment, an evaluation with five actual users suggested that the system can provide genuine additional value in coping with everyday issues outdoors. Peer reviewed
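
    The abstract describes a component-based pipeline in which every module is replaceable. As a minimal illustration of that design, the sketch below wires capture, OCR, and speech synthesis together as injected callables; pytesseract and pyttsx3 are assumed stand-ins for the unnamed recognition and synthesis components, and the input file name is hypothetical.

```python
# Minimal sketch of a component-based text-to-speech pipeline: each stage is
# a plain callable, so any module can be swapped for another generic one.
from PIL import Image
import pytesseract   # assumed OCR component
import pyttsx3       # assumed speech-synthesis component

def capture(path: str) -> Image.Image:
    # Stand-in for a digital-camera grab: load a still image from disk.
    return Image.open(path)

def recognize(image: Image.Image) -> str:
    # Generic OCR module; replaceable by anything mapping image -> text.
    return pytesseract.image_to_string(image)

def speak(text: str) -> None:
    # Generic speech module; replaceable by any text -> audio backend.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def run_pipeline(path: str, ocr=recognize, tts=speak) -> None:
    # Modules are injected, so each stage can be replaced independently.
    text = ocr(capture(path)).strip()
    if text:
        tts(text)

if __name__ == "__main__":
    run_pipeline("street_sign.jpg")  # hypothetical input image
```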

    Image-Based Mobile Service: Automatic Text Extraction and Translation

    We present a new mobile service for the translation of text from images taken by consumer-grade cell-phone cameras. Such a capability represents a new paradigm for users, where a simple image provides the basis for a service. The ubiquity and ease of use of cell-phone cameras enable acquisition and transmission of images anywhere and at any time a user wishes, delivering rapid and accurate translation over the phone’s MMS and SMS facilities. Target text is extracted completely automatically, requiring no bounding-box delineation or related user intervention. The service uses localization, binarization, text deskewing, and optical character recognition (OCR) in its analysis. Once the text is translated, an SMS message is sent to the user with the result. Further novelties are that no software installation is required on the handset, that any service provider or camera phone can be used, and that the entire service is implemented on the server side.
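
    The analysis chain named above (localization, binarization, deskewing, OCR) can be sketched as a simple server-side function. In the sketch below, binarization and OCR are concrete (PIL and pytesseract are assumed stand-ins for the paper's unnamed components), while localization, deskewing, and translation are reduced to labeled stubs, and the MMS/SMS transport is omitted.

```python
# Sketch of the server-side analysis chain for an image-to-translation service.
from PIL import Image
import pytesseract

def binarize(img: Image.Image, threshold: int = 128) -> Image.Image:
    # Global threshold on the grayscale image; a production system would
    # likely use an adaptive method instead.
    return img.convert("L").point(lambda p: 255 if p > threshold else 0)

def localize(img: Image.Image) -> Image.Image:
    return img  # stub: crop to detected text regions

def deskew(img: Image.Image) -> Image.Image:
    return img  # stub: rotate to correct text skew

def translate(text: str) -> str:
    return text  # hypothetical hook for any machine-translation backend

def handle_mms_image(path: str) -> str:
    img = deskew(binarize(localize(Image.open(path))))
    return translate(pytesseract.image_to_string(img).strip())

# The resulting string would then be sent back to the user as an SMS message.
```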

    INTEGRATED AUGMENTED REALITY SIGN BOARD MOBILE TRANSLATION SYSTEM

    Malaysia continuously receives students, businessmen and tourists from all parts of the world. More often than not, these visitors have little or no knowledge of the Malay language, and they often miss important messages conveyed through sign boards, information boards and other written media because they are not versed in Malay. Hence, an app has been proposed and developed that combines a highly accurate optical character recognition engine, Tesseract, with basic augmented reality concepts to provide translations of Malay text to foreigners. This app will prove useful because it provides translation with almost no user input, requiring only that users point their devices’ cameras at the sign-board text they intend to translate.
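
    As a rough sketch of the overlay idea described above: Tesseract can report per-word bounding boxes, and a translated string can be drawn over each box. The glossary, file names, and confidence threshold below are hypothetical, and pytesseract plus OpenCV stand in for the app's actual OCR and rendering layers.

```python
# Sketch: OCR a sign-board photo and draw word-level translations over it.
import cv2
import pytesseract

def malay_to_english(word: str) -> str:
    # Hypothetical glossary standing in for the app's translation step.
    glossary = {"keluar": "exit", "masuk": "enter", "dilarang": "forbidden"}
    return glossary.get(word.lower(), word)

frame = cv2.imread("signboard.jpg")  # hypothetical camera frame
data = pytesseract.image_to_data(frame, output_type=pytesseract.Output.DICT)

for i, word in enumerate(data["text"]):
    if word.strip() and float(data["conf"][i]) > 60:  # confident words only
        x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 255, 255), -1)
        cv2.putText(frame, malay_to_english(word), (x, y + h),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 0), 2)

cv2.imwrite("translated_overlay.jpg", frame)
```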

    Cell Phones as Imaging Sensors

    Camera phones are ubiquitous, and consumers have been adopting them faster than any other technology in modern history. When connected to a network, though, they are capable of more than just picture taking: suddenly, they gain access to the power of the cloud. We exploit this capability by providing a series of image-based personal advisory services. These are designed to work with any handset over any cellular carrier using commonly available Multimedia Messaging Service (MMS) and Short Message Service (SMS) features. Targeted at the unsophisticated consumer, these applications must be quick and easy to use, not requiring download capabilities or preplanning. Thus, all application processing occurs in the back-end system (i.e., as a cloud service) and not on the handset itself. Presenting an image to an advisory service in the cloud, a user receives information that can be acted upon immediately. Two of our examples involve color assessment, selecting cosmetics and home décor paint palettes; the third provides the ability to extract text from a scene. In the case of the color imaging applications, we have shown that our service rivals the advice quality of experts. The result of this capability is a new paradigm for mobile interactions: image-based information services exploiting the ubiquity of camera phones.
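
    A minimal sketch of the back-end-only design follows: the handset contributes nothing but a picture, and all processing happens server-side. HTTP is assumed here in place of the MMS transport described in the paper, and Flask, PIL, and pytesseract are hypothetical stand-ins for the actual service stack.

```python
# Sketch of a cloud-side advisory endpoint: receive an image, return text.
from flask import Flask, request
from PIL import Image
import pytesseract

app = Flask(__name__)

@app.route("/advise", methods=["POST"])
def advise():
    # The uploaded image is the entire query; nothing runs on the phone.
    img = Image.open(request.files["image"].stream)
    return {"text": pytesseract.image_to_string(img).strip()}

if __name__ == "__main__":
    app.run()
```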

    Scene Text Extraction using Convolutional Neural Network with Amended MSER

    Content in text format helps communicate relevant and specific information to users precisely. An approach for extracting text from natural scene images is introduced which employs an amended Maximally Stable Extremal Region detector (a-MSER) together with a deep learning framework, the You Only Look Once (YOLOv2) network. The proposed system, a-MSER with Scene Text Extraction using Modified YOLOv2 Network (STEMYN), performs remarkably well when evaluated on three publicly available datasets. The a-MSER method, a variation of MSER, is used to identify regions of interest; the algorithm handles intensity changes between text and background very effectively. The drawback of the original YOLOv2, a poor detection rate for small objects, is overcome by employing a 1 × 1 convolutional layer and enlarging the feature map from 13 × 13 to 26 × 26. Focal loss is applied to improve upon YOLOv2's existing cross-entropy classification loss. A repeated convolution layer in the deep part of the original YOLOv2 is removed to reduce network complexity, as it does not improve performance. Experimental results demonstrate that the proposed method is effective in identifying text in natural scene images.
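
    Two of the ingredients named above are standard enough to sketch: MSER region proposals and the focal-loss weighting. The sketch below uses plain OpenCV MSER, since the amended a-MSER variant is not specified in the abstract, and a simplified binary focal loss with a single alpha for both classes.

```python
# Sketch: MSER text-region proposals plus a simplified binary focal loss.
import cv2
import numpy as np

def mser_proposals(image_path: str):
    # Plain OpenCV MSER; the paper's a-MSER amends this detector.
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)  # candidate regions as (x, y, w, h)
    return boxes

def focal_loss(p: np.ndarray, y: np.ndarray,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    # FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t); relative to plain
    # cross-entropy, the (1 - p_t)**gamma factor down-weights easy examples.
    # (Simplified: one alpha shared by both classes.)
    p_t = np.where(y == 1, p, 1.0 - p)
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t + 1e-9)))

# Confident correct predictions contribute almost nothing to the loss:
print(focal_loss(np.array([0.95, 0.6]), np.array([1, 1])))
```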

    Computer-assisted acquisition of information for visually impaired

    The study examines various uses of computer technology in the acquisition of information by visually impaired people. For this study, 29 visually impaired persons took part in a survey about their experiences concerning acquisition of information and use of computers, especially with a screen magnification program, a speech synthesizer and a braille display. According to the responses, the evolution of computer technology offers visually impaired people an important means of coping with everyday activities and interacting with the environment. Nevertheless, the functionality of assistive technology needs further development to become more usable and versatile. Since the survey emphasized the challenges of independently observing the environment, the study led to the development of a portable text vision system called Tekstinäkö. Contrary to typical stand-alone applications, the Tekstinäkö system was constructed by combining devices and programs that are readily available on the consumer market. As the system operates, pictures taken by a digital camera are instantly transmitted to a text recognition program on a laptop computer, which reads the recognized text aloud using a speech synthesizer. Visually impaired test users described even the uncertain interpretations of environmental text given by the Tekstinäkö system as at least a welcome addition that complements their perception of the surroundings. It became clear that even modest development work can bring new, useful and valuable methods into the everyday life of disabled people. The unconventional production process of the system also proved efficient. The achieved results and the proposed working model offer one suggestion for giving due attention to the easily overlooked needs of people with special abilities. ACM Computing Classification System (1998): K.4.2 Social Issues: Assistive technologies for persons with disabilities; I.4.9 Image processing and computer vision: Applications

    Towards Automatic Sign Translation

    Signs are everywhere in our lives. They make our lives easier when we are familiar with them, but sometimes they also pose problems; for example, a tourist might not be able to understand signs in a foreign country. In this paper, we present our efforts towards automatic sign translation. We discuss methods for automatic sign detection and describe sign translation using example-based machine translation technology. We take a user-centered approach in developing an automatic sign translation system: the approach takes advantage of human intelligence in selecting an area of interest and a domain for translation when needed. A user can determine which sign is to be translated if multiple signs have been detected within the image. The selected part of the image is then processed, recognized, and translated. We have developed a prototype system that can recognize Chinese signs captured by a video camera, a common gadget for tourists, and translate them into English text or a voice stream.
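
    Example-based machine translation, as referenced above, reuses the translations of stored example pairs that closely match the new input. The toy sketch below illustrates only the matching step, using difflib over a tiny hypothetical example bank; the real system's corpus, matching metric, and recombination logic are not described in the abstract.

```python
# Toy sketch of example-based translation: match a recognized sign against
# stored example pairs and reuse the closest example's translation.
import difflib

# Hypothetical example bank: recognized Chinese sign text -> English.
EXAMPLES = {
    "出口": "exit",
    "禁止吸烟": "no smoking",
    "售票处": "ticket office",
}

def ebmt_translate(recognized: str) -> str:
    match = difflib.get_close_matches(recognized, list(EXAMPLES),
                                      n=1, cutoff=0.5)
    return EXAMPLES[match[0]] if match else recognized  # fall back to input

print(ebmt_translate("禁止吸烟"))  # -> "no smoking"
```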