845 research outputs found

    Effectiveness of learning to rank for finding user similarity in social media

    Inside the Black Box of Search Algorithms

    A behind-the-scenes look at the algorithms that rank results in Bloomberg Law, Fastcase, Lexis Advance, and Westlaw

    Scalable Text and Link Analysis with Mixed-Topic Link Models

    Many data sets contain rich information about objects, as well as pairwise relations between them. For instance, in networks of websites, scientific papers, and other documents, each node has content consisting of a collection of words, as well as hyperlinks or citations to other nodes. In order to perform inference on such data sets, and make predictions and recommendations, it is useful to have models that are able to capture the processes which generate the text at each node and the links between them. In this paper, we combine classic ideas in topic modeling with a variant of the mixed-membership block model recently developed in the statistical physics community. The resulting model has the advantage that its parameters, including the mixture of topics of each document and the resulting overlapping communities, can be inferred with a simple and scalable expectation-maximization algorithm. We test our model on three data sets, performing unsupervised topic classification and link prediction. For both tasks, our model outperforms several existing state-of-the-art methods, achieving higher accuracy with significantly less computation, analyzing a data set with 1.3 million words and 44 thousand links in a few minutes. Comment: 11 pages, 4 figures

    Automated free text marking with Paperless School

    The Paperless School automarking system utilises a number of novel approaches to address the challenge of providing both summative and formative assessments with little or no human intervention. The Paperless School system is designed primarily for day-to-day, low-stakes testing of essay and short-text student inputs. It intentionally sacrifices some degree of accuracy to achieve ease of setup, but nevertheless provides an accurate view of the abilities of each student by averaging marks over a number of essays. The system is designed to function as a back-end service to a Learning Management System (LMS), thus facilitating the marking of large numbers of texts. This should enable considerable teacher resources to be freed up for other teaching tasks. In this paper we will discuss some of the issues involved in bringing computational linguistics to bear in the educational context. We will cover:
    • how Bloom's Taxonomy (the pedagogical model underlying most formal grading schemes) can be represented in software;
    • an overview of the steps required to derive a grade that will sufficiently closely predict the grade a human marker would give;
    • extending the system to include formative assessment, via intelligent comment banks.