Development of a Framework for Ontology Population Using Web Scraping in Mechatronics

Abstract

One of the major challenges in engineering contexts is the efficient collection, management, and sharing of data. Semantic technologies and ontologies are powerful tools for addressing this problem, although some tasks, such as ontology population, usually demand high maintenance effort. This thesis proposes a framework to automate the collection of data from sparse web resources and its insertion into an ontology. First, a product ontology is created by combining several reference vocabularies, namely GoodRelations, the Basic Formal Ontology, the ECLASS standard, and an information model. Then, this study introduces a general procedure for developing a web scraping agent to collect data from the web. Subsequently, an algorithm based on lexical similarity measures is presented to map the collected data to the concepts of the ontology. Lastly, the collected data is inserted into the ontology. To validate the proposed solution, this thesis implements these steps to collect information about microcontrollers from three different websites. Finally, the thesis evaluates the use case results, draws conclusions, and suggests promising directions for future research.
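
The mapping step can be illustrated with a minimal sketch. It is not the algorithm developed in the thesis: it assumes a single character-based similarity measure (Python's standard `difflib.SequenceMatcher`), a hypothetical acceptance threshold of 0.8, and invented concept labels. The function names `normalize`, `similarity`, and `map_to_concept` are likewise illustrative only.

```python
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lowercase a label and turn underscores/hyphens into single spaces."""
    cleaned = label.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1] between two normalized labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_to_concept(scraped_label, concept_labels, threshold=0.8):
    """Return (best_concept, score), or (None, score) if no concept
    reaches the threshold. The threshold value is an assumption."""
    best_label, best_score = None, 0.0
    for concept in concept_labels:
        score = similarity(scraped_label, concept)
        if score > best_score:
            best_label, best_score = concept, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Hypothetical ontology concept labels for a microcontroller product class.
CONCEPTS = ["flash memory", "operating voltage", "clock frequency"]

if __name__ == "__main__":
    for scraped in ["Flash_Memory", "Operating Voltge", "Warranty"]:
        print(scraped, "->", map_to_concept(scraped, CONCEPTS))
```

Returning `None` below the threshold leaves unmatched attributes for manual review rather than forcing them onto the nearest concept; a real pipeline would likely combine several similarity measures and tune the threshold on labeled data.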
