Career Path Prediction using Resume Representation Learning and Skill-based Matching
The impact of person-job fit on job satisfaction and performance is widely
acknowledged, which highlights the importance of providing workers with next
steps at the right time in their career. This task of predicting the next step
in a career is known as career path prediction, and has diverse applications
such as turnover prevention and internal job mobility. Existing methods for
career path prediction rely on large amounts of private career history data to
model the interactions between job titles and companies. We propose leveraging
the unexplored textual descriptions that are part of work experience sections
in resumes. We introduce a structured dataset of 2,164 anonymized career
histories, annotated with ESCO occupation labels. Based on this dataset, we
present a novel representation learning approach, CareerBERT, specifically
designed for work history data. We develop a skill-based model and a text-based
model for career path prediction, which achieve 35.24% and 39.61% recall@10
respectively on our dataset. Finally, we show that both approaches are
complementary, as a hybrid approach achieves the strongest result with 43.01%
recall@10.
Comment: Accepted to the 3rd Workshop on Recommender Systems for Human
Resources (RecSys in HR 2023) as part of RecSys 2023
Extreme Multi-Label Skill Extraction Training using Large Language Models
Online job ads serve as a valuable source of information for skill
requirements, playing a crucial role in labor market analysis and e-recruitment
processes. Since such ads are typically formatted in free text, natural
language processing (NLP) technologies are required to automatically process
them. We specifically focus on the task of detecting skills (mentioned
literally, or implicitly described) and linking them to a large skill ontology,
making it a challenging case of extreme multi-label classification (XMLC).
Given that no sizable labeled (training) dataset is available for this
specific XMLC task, we propose techniques to leverage general Large
Language Models (LLMs). We describe a cost-effective approach to generate an
accurate, fully synthetic labeled dataset for skill extraction, and present a
contrastive learning strategy that proves effective in the task. Our results
across three skill extraction benchmarks show a consistent increase of 15 to
25 percentage points in R-Precision@5 compared to previously published results
that relied solely on distant supervision through literal matches.
Comment: Accepted to the International Workshop on AI for Human Resources and
Public Employment Services (AI4HR&PES) as part of ECML-PKDD 2023
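The R-Precision@5 metric reported above scores each job ad by the precision of the top min(R, k) ranked skills, where R is the number of gold skills for that ad. A minimal sketch under this common definition (the paper's exact variant may differ; the toy data are hypothetical):

```python
def r_precision_at_k(ranked_skills, gold_skills, k=5):
    """R-Precision@k for one job ad: precision over the top min(R, k)
    ranked skills, with R the number of gold-standard skills."""
    r = min(len(gold_skills), k)
    top = ranked_skills[:r]
    return sum(1 for skill in top if skill in gold_skills) / r

# Hypothetical ad with two gold skills (R = 2), so only the top 2
# predictions are scored even though k = 5.
gold = {"python", "sql"}
ranked = ["python", "java", "sql", "excel"]
print(r_precision_at_k(ranked, gold))  # top-2 is python, java -> 0.5
```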