3 research outputs found

Automated Data Digitization System for Vehicle Registration Certificates Using Google Cloud Vision API

Authors: Intakosum Sarun; Kongkla Prateep; Sirisathitkul Yaowarat; Thammarak Karanrat
Publication venue: 'Ital Publication'
Publication date: 01/07/2022
This study aims to develop an automated data digitization system for the Thai vehicle registration certificate. It is the first system developed as a web service Application Programming Interface (API), which is essential for any enterprise to increase its business value. Currently, this system is available on “www.carjaidee.com”. The system involves four steps: 1) in the image acquisition step, an embedded frame aligns the document so that it can be recognised correctly; 2) in the pre-processing step, sharpening and brightness filtering techniques are applied to enhance image quality; 3) in the recognition step, the Google Cloud Vision API is prompted to perform the recognition; 4) in the post-processing step, a domain-specific dictionary is developed to improve the accuracy rate. The experiment used 92 images, and accuracy was measured by counting the correct words and terms in the output. The findings suggest that the proposed method, which had an average accuracy of 93.28%, was significantly more accurate than the original method using only the Google Cloud Vision API. However, the system is limited because the dictionaries cannot automatically recognise a new word. In the future, we will explore solutions to this problem using natural language processing techniques. DOI: 10.28991/CEJ-2022-08-07-09

Civil Engineering Journal (C.E.J)
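
The post-processing step in the abstract above (step 4) relies on a domain-specific dictionary to correct recognition output. A minimal sketch of how such a lookup might work is given here, assuming fuzzy string matching with Python's standard difflib module. The study does not state its matching method, and the term list, function names, and cutoff value are hypothetical.

```python
import difflib

# Hypothetical domain dictionary; the study's actual word list is not given.
DOMAIN_TERMS = ["Toyota", "Honda", "Isuzu", "Mitsubishi", "Nissan"]


def correct_token(token, vocabulary=DOMAIN_TERMS, cutoff=0.8):
    """Return the closest dictionary term, or the token unchanged if none is close enough."""
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(ocr_text):
    """Apply dictionary correction to each whitespace-separated token of OCR output."""
    return " ".join(correct_token(token) for token in ocr_text.split())


if __name__ == "__main__":
    # A misrecognised brand name is mapped to its dictionary entry;
    # tokens with no close match (such as the year) pass through unchanged.
    print(correct_text("Toyoto 2015"))
```

A token that has no close dictionary entry is returned as is, which mirrors the limitation noted in the abstract: a fixed dictionary cannot recognise a new word on its own.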

Accessibility-as-a-service: an open-source reading assistive tool for education

Authors: Cao Qi; Chinsean Sum; Wong Dennis; Yam Shaun; Yau Peter C.Y.
Publication venue: Academy Publisher
Publication date: 01/08/2023
As technology evolves, more and more articles and materials are readily available on the internet for the world to use. This project proposes and demonstrates the implementation of an application to further increase the accessibility of web pages through the use of image recognition techniques, object detection, and optical character recognition (OCR). The proposed application allows users to input URLs; it processes each web page in under a minute and outputs a modified web page with translated words detected from images.

Enlighten

Information Extraction for Thai Documents

Authors: Smith Dan J.; Sukhahuta Rattasit
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2000
An increasing amount of electronically available information is stored in Asian language documents, which makes Information Retrieval (IR) and Information Extraction (IE) for these languages important for a large number of users. Analysis and extraction of information in these languages present several interesting problems not seen in Western European languages; these are interesting in their own right and for the insights they can give into more general IR and IE techniques. We describe these problems and our system for Thai language IE. One of the main concerns when working with Thai natural language is that the structure of the language itself is highly ambiguous. The analyser therefore requires more sophisticated techniques and large amounts of domain knowledge to cope with these ambiguities. We describe our approach to a natural language analysis system that performs preprocessing for the Thai language, and the extraction module that retrieves specific information according to predefined concept definitions.

University of East Anglia digital repository