44 research outputs found

    Machine Learning for Data Linkage

    Data linkage traditionally uses deterministic and probabilistic methods. Alternatively, machine learning methods can be applied as classification algorithms, using the data to inform decisions. This project compared the quality, in terms of precision and recall, of traditional methods with that of selected machine learning methods when applied to a standard linkage problem. Two supervised methods, gradient boosted trees (GBT) and a multilayer perceptron classifier (MLPC), and one unsupervised method, maximum entropy classification (MEC), were implemented. The England and Wales 2021 Census to Census Coverage Survey (CCS) linkage was used as a gold-standard (GS) linked dataset to provide training samples for the supervised methods as well as testing samples for all methods. The F1 score (the harmonic mean of precision and recall) was used to compare the performance of the models and to determine the optimal parameters and thresholds. The Splink implementation of Fellegi-Sunter with Expectation Maximisation was used as a baseline for comparison. The methods, trained on a sample of the GS, were used to link Census and CCS data. All methods performed well, with MEC achieving the highest precision (99.79%) but the lowest recall (96.36%). The MLPC model achieved the highest F1 score (98.94%). To understand the implications of not retraining supervised models for each dataset, the models were also used to link Census to a health dataset. The supervised models were not retrained using the health data; instead, the optimised GS models were applied. On this linkage, MEC had the lowest precision (96.51%) but the highest recall (98.48%) and the highest F1 score (97.49%). With F1 scores of 96.99% and 96.14% respectively, the GBT and MLPC supervised models were not far behind in performance, despite not being trained using health data. We have shown that machine learning methods can be used effectively for data linkage problems. Unsurprisingly, supervised models perform best when trained on and applied to the same data. Further research into generic training may allow us to use both supervised and unsupervised machine learning models for future data linkage.
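
    The abstract uses the F1 score both to compare models and to pick thresholds. As a rough illustration only, the Python sketch below computes F1 from precision and recall and selects, from a list of candidate thresholds, the one that maximises F1 on labelled record pairs. The function names, the toy scores and the candidate thresholds are assumptions made for this example; they are not taken from the project.

```python
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Treat pairs scoring at or above the threshold as predicted links."""
    tp = fp = fn = 0
    for score, is_match in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_match:
            tp += 1          # correctly linked pair
        elif predicted and not is_match:
            fp += 1          # false link
        elif not predicted and is_match:
            fn += 1          # missed link
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(
    scores: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> float:
    """Return the candidate threshold with the highest F1 (hypothetical helper)."""
    return max(
        candidates,
        key=lambda t: f1_score(*precision_recall(scores, labels, t)),
    )


# Toy data: model scores for four candidate pairs and their gold-standard labels.
toy_scores = [0.9, 0.8, 0.4, 0.2]
toy_labels = [1, 1, 0, 1]
chosen = best_threshold(toy_scores, toy_labels, [0.1, 0.5, 0.85])
```

    Applied to the rounded health-linkage figures reported for MEC (precision 0.9651, recall 0.9848), `f1_score` gives roughly 0.975, in line with the quoted 97.49%.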

    Sensor Integration in a Low Cost Land Mobile Mapping System

    Mobile mapping is a multidisciplinary technique that requires several pieces of dedicated equipment, calibration procedures that are as rigorous as possible, time synchronization of all acquired data, and software for data processing and extraction of additional information. To decrease the cost and complexity of Mobile Mapping Systems (MMS), the use of less expensive sensors and the simplification of calibration and data acquisition procedures are mandatory. This article discusses the use of MMS technology, focusing on the main aspects that must be addressed to guarantee proper data acquisition and describing how those aspects were handled in a terrestrial MMS developed at the University of Porto. In this case the main aim was to implement a low-cost system while maintaining good quality standards for the acquired georeferenced information. The results discussed here show that this goal has been achieved.

    Load tests on roof trusses made with bolted thin-walled sections for greenhouses

    No full text

    Structural design of steel-framed buildings for agriculture and industry

    No full text