Improving the matching of registered unemployed to job offers through machine learning algorithms

Authors: Paula Isabel Moura Meireles Cruz
Publication date: 24 February 2017
Abstract

Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.

The labour market suffers from a double-sided asymmetric information problem, characterized by a mutual lack of trust between employers and unemployed people. As a result, public employment services (PES) facilitate too few job matches and seem to be caught in a low-end equilibrium. In order to act as a reliable third party, PES need to build a good and solid reputation among their main clients by offering better and less time-consuming pre-selection services. The use of machine-learning, data-driven relevancy algorithms that calculate the viability of a specific candidate for a particular job opening is becoming increasingly popular in this field.

Based on the Portuguese PES databases (CVs, vacancies, pre-selection and matching results), complemented by relevant external data published by Statistics Portugal and the European Classification of Skills/Competences, Qualifications and Occupations (ESCO), the current thesis evaluates the potential application of models such as Random Forests, Gradient Boosting, Support Vector Machines, Neural Network Ensembles and other tree-based ensembles to the job matching activities carried out by the Portuguese PES, in order to understand the extent to which these activities can be improved through the adoption of automated processes. The obtained results seem promising and point to the possible use of robust algorithms such as Random Forests in the pre-selection of suitable candidates. Their advantages include accuracy and the capacity to handle large datasets with thousands of variables, including badly unbalanced datasets, extensive missing values and many-valued categorical variables.

Available Versions

Repositório da Universidade Nova de Lisboa

oai:run.unl.pt:10362/20197

Last time updated on 11/05/2018