
    Leveraging Knowledge Graphs for Orphan Entity Allocation in Resume Processing

    Processing and analyzing unstructured data, particularly resumes, poses significant challenges in talent acquisition and recruitment. This research presents a novel approach to orphan entity allocation in resume processing using knowledge graphs. Our pipeline integrates association mining, concept extraction, external knowledge linking, named entity recognition, and knowledge graph construction. By leveraging these techniques, we aim to automate and improve the efficiency of job screening by successfully bucketing orphan entities within resumes. This enables more effective matching between candidates and job positions, streamlines resume screening, and improves the accuracy of candidate-job matching. Extensive experimentation and evaluation highlight the approach's effectiveness and resilience, ensuring that alternative measures can be relied upon for seamless processing and orphan entity allocation in case of any component failure. Our results highlight the capability of knowledge graphs to generate valuable insights through intelligent information extraction and representation, specifically for categorizing orphan entities.
    Comment: In Proceedings of the 2023 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)
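The abstract above does not specify the allocation algorithm, but the core idea of bucketing an unrecognized ("orphan") entity via a knowledge graph can be sketched as follows. The graph, bucket names, and context terms here are all hypothetical toy data; the overlap heuristic is one simple possibility, not the paper's method.

```python
# Illustrative sketch: allocate an "orphan" resume entity to a skill bucket
# by comparing the terms seen around it with the terms linked to each bucket
# in a small hand-built knowledge graph. All names and data are hypothetical.

KNOWLEDGE_GRAPH = {
    "programming": {"python", "java", "compiler", "debugging"},
    "databases": {"sql", "postgres", "query", "indexing"},
    "management": {"agile", "scrum", "roadmap", "stakeholder"},
}

def allocate_orphan(orphan_context):
    """Pick the bucket whose linked terms overlap most with the orphan's context."""
    def overlap(bucket):
        return len(KNOWLEDGE_GRAPH[bucket] & orphan_context)
    return max(KNOWLEDGE_GRAPH, key=overlap)

# An unrecognized entity that co-occurred with database-related words:
bucket = allocate_orphan({"query", "indexing", "replication"})
```

A production pipeline would replace the set-overlap heuristic with the paper's combination of association mining and external knowledge linking.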

    Break Down Resumes into Sections to Extract Data and Perform Text Analysis using Python

    The objective of AI-based resume screening is to automate the screening process, and the extraction of text, keywords, and named entities is critical to it. This paper discusses segmenting resumes into sections in order to extract data and perform text analysis. The raw CV file is imported, and the resume data is cleaned to remove extra spaces, punctuation, and stop words. Regular expressions are used to extract names from resumes. We also use the spaCy library, widely regarded as one of the most accurate natural language processing libraries; it includes pre-trained models for entity recognition, parsing, and tagging. The experimental method uses resume data sourced from Kaggle and an external source (MTIS)
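The cleaning and regex-based name extraction described above can be sketched with the standard library alone. The stop-word list and the two-capitalized-words name pattern are simplifying assumptions; the paper's actual pipeline uses spaCy's pre-trained models for the heavier lifting.

```python
import re
import string

# Hypothetical stop-word list; a real pipeline would use spaCy's or NLTK's.
STOP_WORDS = {"and", "the", "a", "an", "of", "in", "to", "with"}

def clean_resume_text(raw):
    """Strip punctuation, lower-case, collapse whitespace, drop stop words."""
    text = raw.translate(str.maketrans("", "", string.punctuation)).lower()
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

def extract_name(raw):
    """Naive regex: two capitalized words at the start of the resume."""
    match = re.match(r"\s*([A-Z][a-z]+ [A-Z][a-z]+)", raw)
    return match.group(1) if match else None

resume = "Jane Doe\nExperienced engineer, skilled in Python and SQL."
name = extract_name(resume)
cleaned = clean_resume_text(resume)
```

Such a regex is brittle (it fails on hyphenated or non-Latin names), which is exactly why the paper supplements it with spaCy's statistical entity recognizer.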

    Distilling Large Language Models using Skill-Occupation Graph Context for HR-Related Tasks

    Numerous HR applications are centered around resumes and job descriptions. While they can benefit from advancements in NLP, particularly large language models, their real-world adoption faces challenges due to the absence of comprehensive benchmarks for various HR tasks and a lack of smaller models with competitive capabilities. In this paper, we aim to bridge this gap by introducing the Resume-Job Description Benchmark (RJDB). We meticulously craft this benchmark to cater to a wide array of HR tasks, including matching and explaining resumes to job descriptions, extracting skills and experiences from resumes, and editing resumes. To create this benchmark, we propose to distill domain-specific knowledge from a large language model (LLM). We rely on a curated skill-occupation graph to ensure diversity and to provide context for LLM generation. Our benchmark includes over 50 thousand triples of job descriptions, matched resumes, and unmatched resumes. Using RJDB, we train multiple smaller student models. Our experiments reveal that the student models achieve performance near or better than the teacher model (GPT-4), affirming the effectiveness of the benchmark. Additionally, we explore the utility of RJDB on out-of-distribution data for skill extraction and resume-job description matching, in zero-shot and weakly supervised settings. We release our datasets and code to foster further research and industry applications
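The triple structure the abstract describes (job description, matched resume, unmatched resume) lends itself naturally to contrastive training data. A minimal sketch, with hypothetical field names and toy data not taken from RJDB:

```python
from dataclasses import dataclass

# One benchmark record as described in the abstract: a job description paired
# with one matched and one unmatched resume. Field names are assumptions.

@dataclass
class RJDBTriple:
    job_description: str
    matched_resume: str
    unmatched_resume: str

def to_classification_pairs(triple):
    """Flatten a triple into labeled (job, resume, label) training pairs."""
    return [
        (triple.job_description, triple.matched_resume, 1),
        (triple.job_description, triple.unmatched_resume, 0),
    ]

triple = RJDBTriple(
    job_description="Backend engineer, Python and SQL required.",
    matched_resume="Five years building Python services with PostgreSQL.",
    unmatched_resume="Pastry chef with a passion for sourdough.",
)
pairs = to_classification_pairs(triple)
```

Flattening each triple this way is one plausible route to the resume-job matching task; the released code would define the authoritative format.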

    Assessing Bias Removal from Word Embeddings

    As machine learning becomes more influential in everyday life, we must begin addressing potential shortcomings. A current problem area is word embeddings: frameworks that transform words into numbers, allowing the algorithmic analysis of language. Without a method for filtering implicit human bias from the documents used to create these embeddings, they contain and propagate stereotypes. Previous work has shown that one commonly used and distributed word embedding model, trained on articles from Google News, encoded gender bias in its occupation associations (Bolukbasi et al., 2016). While unsurprising, the use of biased data in machine learning models only serves to amplify the problem. Although attempts have been made to remove or reduce these biases, a true solution has yet to be found. Hiring models, tools trained to identify well-fitting job candidates, show the impact of gender stereotypes on occupations. Companies like Amazon have abandoned these systems due to flawed decision-making, even after years of development. I investigated whether the word embedding adjustment technique from Bolukbasi et al. (2016) made a difference in the results of an emulated hiring model. After collecting and cleaning resumes and job postings, I created a model that predicted whether candidates were a good fit for a job based on a training set of resumes from those already hired. To assess differences, I built the same model with different word vectors, including the original and adjusted word2vec embeddings. Results were expected to show some form of bias in classification. I conclude with potential improvements and additional work being done.
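The embedding adjustment referenced above (the hard-debiasing step of Bolukbasi et al., 2016) removes a word vector's component along a learned gender direction. A minimal sketch with toy three-dimensional vectors, not real word2vec embeddings:

```python
# Hard-debiasing sketch: subtract a vector's projection onto a (unit-length)
# "gender direction". The vectors below are hand-made toy examples.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def debias(vector, gender_direction):
    """Remove the component of `vector` along the unit gender direction."""
    scale = dot(vector, gender_direction)
    return [a - scale * g for a, g in zip(vector, gender_direction)]

gender = [1.0, 0.0, 0.0]       # toy unit vector for the he-she difference
engineer = [0.4, 0.7, 0.2]     # toy occupation vector with a gender component
engineer_debiased = debias(engineer, gender)
```

After debiasing, the occupation vector is orthogonal to the gender direction, so a downstream hiring model sees no gender signal along that axis (the paper shows this alone is not a complete fix).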

    Automatic Job Skill Taxonomy Generation For Recruitment Systems

    The goal of this thesis is to optimize job recommendation systems by automatically extracting skills from job descriptions. With rapid development in technology, new skills are continuously required. This makes skill tagging of job descriptions a difficult problem, since a simple keyword match against an already generated skill list is not suitable; a way of automatically populating the skill list is needed to improve job search engines. This thesis focuses on solving this problem with the help of natural language processing and neural networks. Automatic detection of skills in an unstructured job description dataset is a complex problem, as it requires robustness to the ambiguity of natural language and adaptation to words not seen in the historical data. This thesis solves the problem by using recurrent neural network models to capture the context of skill words. Based on the captured context, the new system can predict whether a word in the given text is a skill. Neural network models such as Long Short-Term Memory (LSTM) and Bi-directional LSTM are used to capture long-term dependencies in the sentence and identify skills present in the job descriptions. Various natural language processing techniques were utilized to improve the quality of the input features to the model. Using the context before and after the skill words yielded the best results in identifying skills from textual data. This approach can be applied to capture skills data from job ads, and it can be extended in the future to extract skill features from resume data to improve job recommendation results
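The before/after context features the thesis credits for its best results can be sketched without the neural model itself. The sentence and window size below are hypothetical; a real system would encode these windows and feed them to an LSTM or Bi-LSTM tagger.

```python
# Illustrative sketch: extract the before/after context windows around a
# token, the input representation the thesis uses for skill tagging.

def context_windows(tokens, index, size=2):
    """Return (before, after) context words around tokens[index]."""
    before = tokens[max(0, index - size):index]
    after = tokens[index + 1:index + 1 + size]
    return before, after

tokens = "experience with python and sql required".split()
before, after = context_windows(tokens, 2)  # context around "python"
```

Feeding both windows is what makes the tagger bidirectional in spirit: "with ... and sql" is strong evidence that the middle token is a skill even if the word itself was never seen in training.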

    Human Resources Recommender system based on discrete variables

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
    Natural Language Processing and Understanding has become one of the most exciting and challenging fields in Artificial Intelligence and Machine Learning. In a rapidly changing business environment, having data transformed in a way that makes it easy to interpret is the greatest competitive advantage a company can have. The purpose of this dissertation is therefore to implement a recommender system for a company's Human Resources department that aids the decision-making process of filling a specific job position with the right candidate. The recommender system will be fed with applicants, each represented by their skills, and will produce a subset of the most suitable candidates for a given job position. This work uses StarSpace, a novel neural embedding model that represents entities in a common vector space and performs similarity measures among them
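Once candidates and job positions share one embedding space, as StarSpace provides, recommendation reduces to ranking candidates by similarity to the job vector. A minimal sketch with hand-made toy embeddings standing in for trained StarSpace vectors:

```python
import math

# Rank candidates by cosine similarity to a job vector in a shared space.
# The three-dimensional embeddings below are toy stand-ins, not StarSpace output.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

job = [0.9, 0.1, 0.0]              # embedded job position
candidates = {
    "alice": [0.8, 0.2, 0.1],      # skill profile close to the job
    "bob": [0.1, 0.9, 0.3],        # skill profile far from the job
}
ranked = sorted(candidates, key=lambda c: cosine(candidates[c], job), reverse=True)
```

Taking the top-k of `ranked` yields the "subset of most suitable candidates" the dissertation describes; StarSpace's contribution is learning embeddings in which this cosine ranking is meaningful.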