19 research outputs found

    Data-driven Job Search Engine Using Skills and Company Attribute Filters

    Full text link
    According to a report online, more than 200 million unique users search for jobs online every month. This incredibly large and fast growing demand has enticed software giants such as Google and Facebook to enter this space, which was previously dominated by companies such as LinkedIn, Indeed and CareerBuilder. Recently, Google released their "AI-powered Jobs Search Engine", "Google For Jobs" while Facebook released "Facebook Jobs" within their platform. These current job search engines and platforms allow users to search for jobs based on general narrow filters such as job title, date posted, experience level, company and salary. However, they have severely limited filters relating to skill sets such as C++, Python, and Java and company related attributes such as employee size, revenue, technographics and micro-industries. These specialized filters can help applicants and companies connect at a very personalized, relevant and deeper level. In this paper we present a framework that provides an end-to-end "Data-driven Jobs Search Engine". In addition, users can also receive potential contacts of recruiters and senior positions for connection and networking opportunities. The high level implementation of the framework is described as follows: 1) Collect job postings data in the United States, 2) Extract meaningful tokens from the postings data using ETL pipelines, 3) Normalize the data set to link company names to their specific company websites, 4) Extract and ranking the skill sets, 5) Link the company names and websites to their respective company level attributes with the EVERSTRING Company API, 6) Run user-specific search queries on the database to identify relevant job postings and 7) Rank the job search results. This framework offers a highly customizable and highly targeted search experience for end users.Comment: 8 pages, 10 figures, ICDM 201

    Keyword Detection in Text Summarization

    Get PDF
    Summarization is the process of reducing a text document in order to create a summary that retains the most important points of the original document. As the problem of information overload has grown, and as the quantity of data has increased, so has interest in automatic summarization. Extractive summary works on the given text to extract sentences that best convey the message hidden in the text. Most extractive summarization techniques revolve around the concept of indexing keywords and extracting sentences that have more keywords than the rest. Keyword extraction usually is done by extracting important words having a higher frequency than others, with stress on important. However the current techniques to handle this importance include a stop list which might include words that are critically important to the text. In this thesis, I present a work in progress to define an algorithm to extract truly significant keywords which might have lost its significance if subjected to the current keyword extraction algorithms

    A Scalable Feature Selection and Opinion Miner Using Whale Optimization Algorithm

    Get PDF
    Due to the fast-growing volume of text documents and reviews in recent years, current analyzing techniques are not competent enough to meet the users' needs. Using feature selection techniques not only support to understand data better but also lead to higher speed and also accuracy. In this article, the Whale Optimization algorithm is considered and applied to the search for the optimum subset of features. As known, F-measure is a metric based on precision and recall that is very popular in comparing classifiers. For the evaluation and comparison of the experimental results, PART, random tree, random forest, and RBF network classification algorithms have been applied to the different number of features. Experimental results show that the random forest has the best accuracy on 500 features. Keywords: Feature selection, Whale Optimization algorithm, Selecting optimal, Classification algorith

    Data Extract: Mining Context from the Web for Dataset Extraction

    Full text link

    Learning to extract folktale keywords

    Get PDF
    Manually assigned keywords provide a valuable means for accessing large document collections. They can serve as a shallow document summary and enable more efficient retrieval and aggregation of information. In this paper we investigate keywords in the context of the Dutch Folktale Database, a large collection of stories including fairy tales, jokes and urban legends. We carry out a quantitative and qualitative analysis of the keywords in the collection. Up to 80% of the assigned keywords (or a minor variation) appear in the text itself. Human annotators show moderate to substantial agreement in their judgment of keywords. Finally, we evaluate a learning to rank approach to extract and rank keyword candidates. We conclude that this is a promising approach to automate this time intensive task
    corecore