JobIQ: recommending study pathways based on career choices
Modern job markets often require an intricate combination of multi-disciplinary skills or specialist and technical knowledge, even for entry-level positions. Such requirements place increased pressure on higher education graduates entering the job market. This paper presents our JobIQ recommendation system, which helps prospective students choose educational programs or electives based on their career preferences. While existing recommendation solutions focus on internal institutional data, such as previous student experiences, JobIQ considers external data, recommending educational programs that best cover the knowledge and skills required by selected job roles. To deliver such recommendations, we create and compare skill profiles from job advertisements and educational subjects, aggregating them into skill profiles of job roles and educational programs. Using these skill profiles, we build formal models and algorithms for program recommendations. Finally, we suggest other recommendation and benchmarking approaches, helping curriculum developers assess the job readiness of program graduates. The video presenting the JobIQ system is available online∗.
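As a hedged illustration of the kind of skill-coverage comparison the abstract describes, the Python sketch below ranks hypothetical programs by how much of a job role's weighted skill profile they cover. The skill names, weights, and the coverage_score function are invented for illustration and are not the formal model used by JobIQ.

```python
# Hypothetical sketch: score how well a program's skill profile covers a job role's profile.
# Skill profiles are modelled here as dicts of skill -> weight; names and weights are invented.

def coverage_score(job_profile: dict, program_profile: dict) -> float:
    """Weighted fraction of the job role's skill demand covered by the program."""
    total = sum(job_profile.values())
    covered = sum(min(w, program_profile.get(skill, 0.0))
                  for skill, w in job_profile.items())
    return covered / total if total else 0.0

job_role = {"python": 0.5, "machine learning": 0.3, "sql": 0.2}
programs = {
    "BSc Data Science": {"python": 0.6, "machine learning": 0.3, "statistics": 0.1},
    "BSc Software Eng": {"python": 0.4, "java": 0.4, "sql": 0.2},
}

# Recommend programs ranked by how well they cover the selected job role.
ranked = sorted(programs, key=lambda p: coverage_score(job_role, programs[p]), reverse=True)
for name in ranked:
    print(f"{name}: {coverage_score(job_role, programs[name]):.2f}")
```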
Bootstrap confidence intervals for mean average precision
Due to the unconstrained nature of language, search engines (such as the Google search engine) are developed and compared by obtaining a document set, a sample set of queries, and the associated relevance judgments for the queries on the document set. The de facto standard function used to measure the accuracy of each search engine on the test data is mean Average Precision (AP). It is common practice to report mean AP scores and the results of paired significance tests against baseline search engines, but the confidence in the mean AP score is never reported. In this article, we investigate the utility of bootstrap confidence intervals for mean AP. We find that our standardised logit bootstrap confidence intervals are very accurate for all levels of confidence and sample sizes examined.
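For context, the sketch below shows a plain percentile bootstrap confidence interval for mean AP over a set of per-query scores. It is a generic illustration only, not the standardised logit bootstrap interval proposed in the article, and the AP scores are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-query AP scores for one system (values simulated for illustration).
ap_scores = rng.beta(2, 5, size=50)

def bootstrap_ci_mean(scores, n_boot=10000, alpha=0.05, rng=rng):
    """Percentile bootstrap confidence interval for the mean of per-query scores."""
    n = len(scores)
    boot_means = np.array([
        rng.choice(scores, size=n, replace=True).mean() for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

low, high = bootstrap_ci_mean(ap_scores)
print(f"mean AP = {ap_scores.mean():.3f}, 95% bootstrap CI = [{low:.3f}, {high:.3f}]")
```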
Fast approximate text document clustering using compressive sampling
Document clustering involves repetitive scanning of a document set; therefore, as the size of the set increases, the time required for the clustering task increases and the task may even become infeasible due to computational constraints. Compressive sampling is a feature sampling technique that allows us to perfectly reconstruct a vector from a small number of samples, provided that the vector is sparse in some known domain. In this article, we apply the theory behind compressive sampling to the document clustering problem using k-means clustering. We provide a method of computing high-accuracy clusters in a fraction of the time it would take to cluster the documents directly, using the Discrete Fourier Transform and the Discrete Cosine Transform. We provide empirical results showing that compressive sampling yields a 14-fold increase in speed with little reduction in accuracy on 7,095 documents, and we also provide a very accurate clustering of a 231,219-document set with a 20-fold increase in speed compared to performing k-means clustering on the full document set. This shows that compressive clustering is a very useful tool for quickly computing approximate clusters.
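The following sketch illustrates the general idea on a toy corpus: take the cosine (DCT) spectrum of TF-IDF document vectors, keep a small random sample of coefficients, and run k-means in the compressed space. The corpus, the number of sampled coefficients, and the uniform sampling scheme are assumptions for illustration and may differ from the article's method.

```python
import numpy as np
from scipy.fft import dct
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors worry about market volatility",
]

# Sparse TF-IDF document vectors (densified here only because the toy corpus is tiny).
X = TfidfVectorizer().fit_transform(docs).toarray()

# Take the DCT of each document vector and keep a small random sample of coefficients;
# clustering is then performed in this much lower-dimensional compressed space.
rng = np.random.default_rng(0)
n_samples = 3  # number of sampled spectral coefficients (far fewer than vocabulary size)
spectrum = dct(X, axis=1, norm="ortho")
idx = rng.choice(spectrum.shape[1], size=n_samples, replace=False)
X_compressed = spectrum[:, idx]

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_compressed)
print(labels)
```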
Examining document model residuals to provide feedback during information retrieval evaluation
Evaluation of document models for text-based information retrieval is crucial for developing document models that are appropriate for specific domains. Unfortunately, current document model evaluation methods for text retrieval provide no feedback except for an evaluation score; to improve a model, we must resort to trial and error. In this article, we examine how we can provide feedback in the document model evaluation process by providing a method of computing relevance score residuals and document model residuals for a given document-query set. Document model residuals give us an indication of where the document model is accurate and where it is not. We derive a simple method of computing the document model residuals using ridge regression. We also provide an analysis of the residuals of two document models, and show how the correlation of document statistics with the residuals can be used to provide statistically significant improvements to the precision of the model.
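A rough sketch of the residual idea, under assumed synthetic data: fit a ridge regression from document model scores to relevance judgments, treat the regression residuals as an indication of where the model errs, and correlate a document statistic with those residuals. The data, the single-feature setup, and the choice of statistic are illustrative assumptions, not the article's derivation.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical data: one retrieval score per (document, query) pair from some document
# model, plus the corresponding binary relevance judgments (all values simulated).
model_scores = rng.normal(size=(200, 1))
relevance = (model_scores[:, 0] + rng.normal(scale=0.8, size=200) > 0).astype(float)

# Fit a ridge regression from model scores to relevance and inspect the residuals:
# large residuals indicate (document, query) pairs the document model handles poorly.
ridge = Ridge(alpha=1.0).fit(model_scores, relevance)
residuals = relevance - ridge.predict(model_scores)

# Correlate a document statistic (here a fake document length) with the residuals to
# see whether it explains part of the model's error.
doc_length = rng.integers(50, 2000, size=200)
print(np.corrcoef(doc_length, residuals)[0, 1])
```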
Uncertainty in Rank-Biased Precision
Information retrieval metrics that provide uncertainty intervals when faced with unjudged documents, such as Rank-Biased Precision (RBP), give us an indication of the upper and lower bounds of the system score. Unfortunately, this uncertainty is disregarded when examining the mean over a set of queries. In this article, we examine the distribution of the uncertainty per query and averaged over all queries, under the assumption that each unjudged document has the same probability of being relevant. We also derive equations for the mean, variance, and distribution of Mean RBP uncertainty. Finally, the impact of our assumption is assessed using simulation. We find that by removing the assumption of equal probability of relevance, we obtain a scaled form of the previously defined mean and standard deviation for the distribution of Mean RBP uncertainty.
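The sketch below computes the standard RBP lower bound and residual uncertainty for a single ranking containing unjudged documents, using the usual RBP definition with persistence p. The toy ranking is invented, and the per-query and mean uncertainty distributions derived in the article are not reproduced here.

```python
import numpy as np

def rbp_with_uncertainty(judgments, p=0.8):
    """Rank-Biased Precision with lower/upper bounds from unjudged documents.

    judgments: sequence where 1 = relevant, 0 = not relevant, None = unjudged.
    """
    # RBP weight at rank i (1-based) is (1 - p) * p**(i - 1).
    weights = (1 - p) * p ** np.arange(len(judgments))
    lower = sum(w for w, j in zip(weights, judgments) if j == 1)
    residual = sum(w for w, j in zip(weights, judgments) if j is None)
    residual += p ** len(judgments)  # unexamined tail beyond the ranking
    return lower, lower + residual

# Toy ranking with two unjudged documents (values invented for illustration).
ranking = [1, None, 0, 1, None, 0, 0, 1, 0, 0]
lo, hi = rbp_with_uncertainty(ranking)
print(f"RBP in [{lo:.3f}, {hi:.3f}], uncertainty = {hi - lo:.3f}")
```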
Confidence intervals for information retrieval evaluation
Information retrieval results are currently limited to the publication in which they appear. Significance tests are used to remove the dependence of the evaluation on the query sample, but the findings cannot be transferred to other systems not involved in the test. Confidence intervals for the population parameters provide query-independent results and give insight into how each system is expected to behave when queried. Confidence intervals also allow the reader to compare results across articles, because they provide the possible location of a system's population parameter. Unfortunately, we can only construct confidence intervals for population parameters if we have knowledge of the evaluation score distribution for each system. In this article, we investigate the distribution of Average Precision for a set of systems and examine whether we can construct confidence intervals for the population mean Average Precision with a given level of confidence. We found that by standardising the scores, the system score distribution and the system score sample mean distribution were approximately Normal for all systems, allowing us to construct accurate confidence intervals for the population mean Average Precision.
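As a minimal illustration of the approach, the sketch below standardises simulated per-query AP scores across a set of systems and then builds a Normal-theory confidence interval for one system's mean. The data are simulated, and the exact standardisation used in the article may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical AP scores: rows are systems, columns are queries (values simulated).
ap = rng.beta(2, 5, size=(20, 50))

# Standardise each query's scores across systems, so every query contributes on a
# comparable scale (the general idea behind score standardisation in IR evaluation).
standardised = (ap - ap.mean(axis=0)) / ap.std(axis=0, ddof=1)

# Normal-theory 95% confidence interval for one system's population mean score.
system = standardised[0]
mean = system.mean()
half_width = stats.t.ppf(0.975, df=len(system) - 1) * system.std(ddof=1) / np.sqrt(len(system))
print(f"mean = {mean:.3f}, 95% CI = [{mean - half_width:.3f}, {mean + half_width:.3f}]")
```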
Multiresolution Web link analysis using generalized link relations
Web link analysis methods such as PageRank, HITS, and SALSA have focused on obtaining the global popularity or authority of the set of Web pages in question. Although global popularity is useful for general queries, we find that it is less useful for queries about which the global population has less knowledge. By examining the many different communities that appear within a Web page graph, we are able to compute popularity or authority with respect to a specific community. Multiresolution popularity lists allow us to observe the popularity of Web pages with respect to communities at different resolutions within the Web, and have been shown to have high potential when compared against PageRank. In this paper, we generalize the multiresolution popularity analysis to use any form of Web page link relations. We provide results for both the PageRank relations and the In-degree relations. By utilizing the multiresolution popularity lists, we achieve a 13 percent and 25 percent improvement in mean average precision over In-degree and PageRank, respectively.
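A small sketch of the underlying intuition, using an invented toy graph: the same link relations (PageRank and in-degree) can be computed globally and again restricted to a community subgraph, giving popularity at a coarser resolution. This is only the intuition; it is not the multiresolution popularity list construction described in the paper.

```python
import networkx as nx

# Toy Web graph; node labels and the community split are invented for illustration.
G = nx.DiGraph([
    ("a", "b"), ("b", "c"), ("c", "a"), ("d", "c"),
    ("d", "e"), ("e", "f"), ("f", "d"), ("c", "f"),
])
communities = [{"a", "b", "c"}, {"d", "e", "f"}]

# Global popularity: PageRank and in-degree over the whole graph.
print("global PageRank:", nx.pagerank(G))
print("global in-degree:", dict(G.in_degree()))

# Community-level popularity: the same relations restricted to one community's subgraph,
# giving a coarser "resolution" of the popularity analysis.
for community in communities:
    sub = G.subgraph(community)
    print(sorted(community), nx.pagerank(sub))
```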
A blended metric for multi-label optimisation and evaluation
In multi-label classification, a large number of evaluation metrics exist, for example Hamming loss, exact match, and Jaccard similarity – but there are many more. In fact, there remains an apparent uncertainty in the multi-label literature about which metrics should be considered, and when and how to optimise them. This has given rise to a proliferation of metrics, with some papers carrying out empirical evaluations under 10 or more different metrics in order to analyse method performance. We argue that further understanding of the underlying mechanisms is necessary. In this paper we tackle the challenge of providing a clearer view of evaluation strategies. We present a blended loss function that allows us to evaluate under the properties of several major loss functions with a single parameterisation. Furthermore, we demonstrate the successful use of this metric as a surrogate loss for other metrics. We offer experimental investigation and theoretical backing to demonstrate that optimising this surrogate loss offers better results for several different metrics than optimising those metrics directly. This simplifies, and provides insight into, the task of evaluating multi-label prediction methodologies.
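As a hedged illustration of what a single-parameter blend can look like, the sketch below linearly interpolates between Hamming loss and exact-match (0/1) loss. This linear blend is an assumption for illustration only and is not the parameterisation proposed in the paper.

```python
import numpy as np

def blended_loss(y_true, y_pred, alpha=0.5):
    """Illustrative blend between Hamming loss (alpha=0) and exact-match loss (alpha=1).

    This linear interpolation is an assumption for illustration, not the paper's metric.
    """
    hamming = np.mean(y_true != y_pred, axis=1)  # per-example Hamming loss
    exact = (hamming > 0).astype(float)          # per-example 0/1 (exact match) loss
    return np.mean((1 - alpha) * hamming + alpha * exact)

# Toy multi-label ground truth and predictions (three examples, three labels).
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [0, 1, 0]])
for alpha in (0.0, 0.5, 1.0):
    print(alpha, blended_loss(y_true, y_pred, alpha))
```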
By the power of Grayskull: small sample statistical power in information retrieval evaluation
Information retrieval evaluation is typically performed using a sample of queries, and a statistical hypothesis test is used to make inferences about the systems' accuracy on the population of queries. Research has shown that the t test is one of a set of tests that provides the greatest statistical power while maintaining acceptable type I error rates when evaluating with a large sample of queries. In this article, we investigate the effect of using a small query sample, where the hypothesis tests may not satisfy Central Limit Theorem conditions, on the control of the type I error rate and the change in the type II error rate of a given set of hypothesis tests. We found that all tests performed similarly in the unpaired case. We also found that the bootstrap test provided greater power in the paired case, but violated the desired type I error rate for the smallest sample size (5 queries).
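The sketch below runs a paired t test and a simple paired bootstrap test on an invented five-query sample, the small-sample setting discussed in the abstract. The scores and the particular bootstrap formulation are illustrative assumptions, not the article's experimental setup.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical paired per-query scores for two systems on a small sample of 5 queries.
system_a = np.array([0.31, 0.45, 0.28, 0.52, 0.39])
system_b = np.array([0.35, 0.47, 0.30, 0.58, 0.41])
diff = system_b - system_a

# Paired t test on the per-query differences.
_, t_p = stats.ttest_rel(system_b, system_a)

# Paired bootstrap test: resample the differences, shifted to the null of zero mean,
# and count how often a resampled mean is at least as extreme as the observed one.
n_boot = 10000
shifted = diff - diff.mean()
boot_means = np.array([rng.choice(shifted, size=len(diff), replace=True).mean()
                       for _ in range(n_boot)])
boot_p = np.mean(np.abs(boot_means) >= abs(diff.mean()))

print(f"paired t test p = {t_p:.3f}, paired bootstrap p = {boot_p:.3f}")
```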
Approximate document outlier detection using random spectral projection
Outlier detection is an important process for text document collections, but as a collection grows, the detection process becomes a computationally expensive task. Random projection has been shown to provide a good, fast approximation of sparse data, such as document vectors, for outlier detection. Random samples of the Fourier and cosine spectra have been shown to provide good approximations of sparse data when performing document clustering. In this article, we investigate the utility of these random Fourier and cosine spectral projections for document outlier detection. We show that random samples of the Fourier spectrum provide better accuracy for outlier detection and require less storage when compared with random projection. We also show that random samples of the cosine spectrum provide similar accuracy and computational time when compared with random projection, but require much less storage.
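A minimal sketch of the general pipeline, assuming a toy corpus: take the cosine (DCT) spectrum of TF-IDF vectors, keep a random sample of coefficients, and score outliers by distance from the centroid in the compressed space. The centroid-distance detector and the corpus are illustrative stand-ins, not the article's specific method.

```python
import numpy as np
from scipy.fft import dct
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "interest rates and bond yields rose again",
    "central bank signals further rate rises",
    "markets react to the latest rate decision",
    "recipe for a rich chocolate layer cake",   # the intended outlier
]

# TF-IDF vectors, transformed to the cosine (DCT) spectrum, then a small random
# sample of spectral coefficients is kept as the compressed representation.
X = TfidfVectorizer().fit_transform(docs).toarray()
rng = np.random.default_rng(0)
spectrum = dct(X, axis=1, norm="ortho")
idx = rng.choice(spectrum.shape[1], size=4, replace=False)
X_compressed = spectrum[:, idx]

# A simple outlier score: distance of each compressed document from the centroid.
centroid = X_compressed.mean(axis=0)
scores = np.linalg.norm(X_compressed - centroid, axis=1)
print(np.argsort(scores)[::-1])  # documents ranked from most to least outlying
```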