Inside the Black Box of Search Algorithms
A behind-the-scenes look at the algorithms that rank results in Bloomberg Law, Fastcase, Lexis Advance, and Westlaw
Scalable Text and Link Analysis with Mixed-Topic Link Models
Many data sets contain rich information about objects, as well as pairwise
relations between them. For instance, in networks of websites, scientific
papers, and other documents, each node has content consisting of a collection
of words, as well as hyperlinks or citations to other nodes. In order to
perform inference on such data sets, and make predictions and recommendations,
it is useful to have models that are able to capture the processes which
generate the text at each node and the links between them. In this paper, we
combine classic ideas in topic modeling with a variant of the mixed-membership
block model recently developed in the statistical physics community. The
resulting model has the advantage that its parameters, including the mixture of
topics of each document and the resulting overlapping communities, can be
inferred with a simple and scalable expectation-maximization algorithm. We test
our model on three data sets, performing unsupervised topic classification and
link prediction. For both tasks, our model outperforms several existing
state-of-the-art methods, achieving higher accuracy with significantly less
computation, analyzing a data set with 1.3 million words and 44 thousand links
in a few minutes.
Comment: 11 pages, 4 figures
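The abstract says that the per-document topic mixtures are inferred by expectation-maximization. As a rough illustration of that kind of update, here is a minimal sketch of EM for plain probabilistic latent semantic analysis (PLSA), a classic topic model. It is not the authors' mixed-topic link model: it covers only the text side and ignores links entirely. The function name `plsa_em`, its parameters, and the small smoothing constant are assumptions made for this example.

```python
import random


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit a PLSA topic model by EM (illustrative sketch, text only).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (theta, beta), where theta[d][k] = P(topic k | document d)
    and beta[k][w] = P(word w | topic k).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    eps = 1e-12  # assumed smoothing constant; avoids division by zero

    def normalize(row):
        total = sum(row) + eps * len(row)
        return [(x + eps) / total for x in row]

    # Random positive initialization breaks the symmetry between topics.
    theta = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    beta = [normalize([rng.random() + 0.1 for _ in range(n_words)])
            for _ in range(n_topics)]

    for _ in range(n_iters):
        theta_acc = [[0.0] * n_topics for _ in range(n_docs)]
        beta_acc = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                c = counts[d][w]
                if c == 0:
                    continue
                # E-step: posterior over topics for this (document, word) pair.
                joint = [theta[d][k] * beta[k][w] for k in range(n_topics)]
                z = sum(joint)
                for k in range(n_topics):
                    r = c * joint[k] / z
                    theta_acc[d][k] += r
                    beta_acc[k][w] += r
        # M-step: renormalize the expected counts into probabilities.
        theta = [normalize(row) for row in theta_acc]
        beta = [normalize(row) for row in beta_acc]
    return theta, beta


if __name__ == "__main__":
    # Toy corpus: two documents use words 0-1, two use words 2-3.
    toy = [[5, 4, 0, 0], [4, 5, 0, 0], [0, 0, 5, 4], [0, 0, 4, 5]]
    theta, beta = plsa_em(toy, n_topics=2)
    for d, row in enumerate(theta):
        print(d, [round(p, 3) for p in row])
```

Each pass costs time proportional to the number of nonzero document-word counts times the number of topics, which is the general reason EM updates of this kind scale to large corpora. A model that also generates links, as the abstract describes, would need additional terms in both steps.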
Automated free text marking with Paperless School
The Paperless School automarking system utilises a number of novel approaches to address the challenge of providing both summative and formative assessments with little or no human intervention.
The Paperless School system is designed primarily for day-to-day, low stakes testing of essay and short-text student inputs. It intentionally sacrifices some degree of accuracy to achieve ease of set up, but nevertheless provides an accurate view of the abilities of each student by averaging marks over a number of essays.
The system is designed to function as a back-end service to a Learning Management System (LMS), thus facilitating the marking of large numbers of texts. This should enable considerable teacher resources to be freed up for other teaching tasks.
In this paper we discuss some of the issues involved in bringing computational linguistics to bear in the educational context. We cover:
• how Bloom's Taxonomy (the pedagogical model underlying most formal grading schemes) can be represented in software;
• the steps required to derive a grade that closely predicts the grade a human marker would give;
• how the system can be extended to include formative assessment via intelligent comment banks.
- …