Predicting outcome for collaborative featured article nomination in Wikipedia
In Wikipedia, good articles are wanted. While Wikipedia relies on collaborative effort from online volunteers for quality checking, the process of selecting top-quality articles is time-consuming. At present, the duty of decision making is shouldered by only a couple of administrators. Aiming to assist in the quality-checking cycles so as to cope with the exponential growth of online contributions to Wikipedia, this work studies the task of predicting the outcome of featured article (FA) nominations. We analyze FA candidate (FAC) sessions collected over a period of 3.5 years and examine the extent to which consensus has been practised in this process. We explore the use of interaction features between FAC reviewers to learn SVM classifiers that predict the nomination outcome. We find that calibrating the individual user's polarity of opinions as features improves the prediction accuracy significantly.
Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation
Precision-recall (PR) curves and the areas under them are widely used to
summarize machine learning results, especially for data sets exhibiting class
skew. They are often used analogously to ROC curves and the area under ROC
curves. It is known that PR curves vary as class skew changes. What was not
recognized before this paper is that there is a region of PR space that is
completely unachievable, and the size of this region depends only on the skew.
This paper precisely characterizes the size of that region and discusses its
implications for empirical evaluation methodology in machine learning.
Comment: ICML 2012, fixed citations to use correct tech report number
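
The abstract does not state the formula for the unachievable region, so the following is only a sketch derived from the standard definitions of precision and recall, not the paper's own code. It assumes a skew `skew` (fraction of positive examples) strictly between 0 and 1. At recall `r`, the worst case is that every negative is also predicted positive, which gives a precision floor of `skew * r / (skew * r + 1 - skew)`; integrating that floor over recall gives the area lying below every achievable PR curve. The function names `min_precision`, `unachievable_area`, and `unachievable_area_numeric` are hypothetical and chosen for illustration.

```python
import math


def min_precision(recall, skew):
    """Lowest precision possible at `recall` when a fraction `skew` of examples is positive.

    Derived by assuming all negatives are predicted positive (the worst case).
    """
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    return skew * recall / (skew * recall + (1.0 - skew))


def unachievable_area(skew):
    """Closed-form integral of min_precision over recall in [0, 1]."""
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


def unachievable_area_numeric(skew, steps=100_000):
    """Midpoint-rule check of the closed form above."""
    h = 1.0 / steps
    return h * sum(min_precision((i + 0.5) * h, skew) for i in range(steps))


if __name__ == "__main__":
    for skew in (0.01, 0.1, 0.5):
        print(skew, unachievable_area(skew), unachievable_area_numeric(skew))
```

Under these assumptions the area depends on the skew alone, consistent with the abstract's claim, and it shrinks toward zero as positives become rarer. One practical reading is that areas under PR curves measured on data sets with different skews are not directly comparable without accounting for this floor.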