Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence

Due to the existence of a double-sided asymmetric information problem on the labour market
characterized by a mutual lack of trust between employers and unemployed people, not enough job matches
are facilitated by public employment services (PES), which seem to be caught in a low-end equilibrium.
In order to act as a reliable third party, PES need to build a good and solid reputation among their main
clients by offering better and less time-consuming pre-selection services. The use of machine-learning,
data-driven relevancy algorithms that calculate the viability of a specific candidate for a particular job
opening is becoming increasingly popular in this field. Based on the Portuguese PES databases (CVs,
vacancies, pre-selection and matching results), complemented by relevant external data published by
Statistics Portugal and the European Classification of Skills/Competences, Qualifications and
Occupations (ESCO), the current thesis evaluates the potential application of models such as Random
Forests, Gradient Boosting, Support Vector Machines, Neural Network Ensembles and other tree-based
ensembles to the job matching activities that are carried out by the Portuguese PES, in order to
understand the extent to which the latter can be improved through the adoption of automated
processes. The results obtained seem promising and point to the possible use of robust algorithms such
as Random Forests in the pre-selection of suitable candidates, owing to their advantages in terms of
accuracy and their capacity to handle large datasets with thousands of variables, including badly
unbalanced ones, as well as extensive missing values and many-valued categorical
variables.