Conversion of Cadastral Survey Information into LandXML Files using Machine Learning

Author: De La Rosa Oscar Garrido
Publication date: 01/10/2019

Although new cadastral surveys can readily be produced in the industry-standard LandXML format, a vast amount of pre-existing information is stored only as image files. Automating the back-capture of this information would improve a process that is labour intensive and prone to human error. This project proposes a workflow to automate this process for Victorian cadastral survey information. Specific algorithms and outcomes are examined using a simplified sample cadastral plan. The literature review reveals that similar documentation processes have been undertaken in other fields, such as music (Calvo-Zaragoza et al., 2018). In the cadastral context, only true-to-scale cadastral maps have been digitised, not surveyors’ sketches or field records (Ignjatić et al., 2018). A simple plan containing a closed parcel and two instrument points was created for developing and testing the workflow. The tasks required to extract the information needed for the LandXML files were analysed. A pipeline, dubbed Double Filter Capture, was designed to perform the data extraction in a machine learning environment. It consists of two main workflows that handle the graphical information and the text elements separately, by means of Computer Vision and Optical Character Recognition algorithms, respectively. An implementation of the actions in the pipeline was trialled, and the barriers encountered are discussed. Several Machine Learning algorithms were used for the required tasks, such as line detection, corner detection, image rotation, text detection and text extraction. The project gives some idea of the possibilities and limitations that a larger-scale automated back-capture would face when dealing with records of significantly greater complexity.
It also points the way to further research required to refine the extraction process outlined here, for example by including elements omitted in this project, such as occupation and other auxiliary information and hand-written records. This project demonstrates that automated, accurate data extraction from an image file is possible; however, an extensive investment would be required in the programming stage, given the complexity and inconsistencies of existing plans that require back-capture.

University of Southern Queensland ePrints

Text line extraction in graphical documents using background and foreground information

Author: C.L. Tan
F. Hones
G. Nagy
H. Goto
H.C. Park
Josep Lladós
L. O’Gorman
L.A. Fletcher
N. Otsu
Partha Pratim Roy
U. Pal
Umapada Pal
Y. Li
Publication venue: 'Springer Science and Business Media LLC'

Crossref