3,411 research outputs found

Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

Author: Buttery P
Caines A
Craighead H
Yannakoudakis H
Publication venue: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Publication date: 01/01/2020

We address the task of automatically grading the language proficiency of spontaneous speech based on textual features from automatic speech recognition transcripts. Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language identification (L1). We encode the transcriptions with both bi-directional recurrent neural networks and with bi-directional representations from transformers, compare against a feature-rich baseline, and analyse performance at different proficiency levels and with transcriptions of varying error rates. Our best performance comes from a transformer encoder with L1 prediction as an auxiliary task. We discuss areas for improvement and potential applications for text-only speech scoring. Cambridge Assessment

Crossref

Apollo (Cambridge)

King's Research Portal

Skills embeddings: A neural approach to multicomponent representations of students and tasks

Author: Buttery P
Caines A
Elliott M
Moore R
Rice A
Zaidi A
Publication venue: EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining
Publication date: 01/01/2019

Educational systems use models of student skill to inform decision-making processes. Defining such a model manually is challenging due to the large number of relevant factors. We introduce an alternative approach by learning multidimensional representations (embeddings) from student activity data. Such embeddings are fixed-length real vectors with three desirable characteristics: co-location of similar students and items in a vector space; magnitude that increases with skill; and the ability to represent the absence of a skill. Based on the Multicomponent Latent Trait Model, we use a neural network with complementary trainable weights to learn these embeddings by backpropagation in an unsupervised manner. We evaluate using synthetic student activity data that provides a ground-truth of student skills in order to understand the impact of number of students, question items and knowledge components in the domain. We find that our data-mined parameter values can recreate the synthetic datasets up to the accuracy of the model that generated them, for domains containing up to 10 simultaneously active knowledge components, which can be effectively mined using relatively small quantities of data (1000 students, 100 items). We describe a procedure to estimate the number of components in a domain, and propose a component-masking logic mechanism that improves performance on high-dimensional datasets. Cambridge Assessment

Apollo (Cambridge)

Strangeness production in jets from p+p \sqrt{s} = 200 GeV collisions

Author: Albino S
Anthony R Timmins
Anulli F (BABAR Collaboration)
Bassetto A
Caines H (STAR Collaboration)
Dokshitzer Y
Salam G
Sjostrand T
The STAR Collaboration
Publication venue: IOP Publishing
Publication date: 17/07/2010

Measurements of strangeness production in jets help illuminate the QCD mechanisms in fragmentation. Furthermore, they provide a crucial baseline for heavy-ion studies where modifications in jet chemistry have recently been predicted. We present new results on strange particle production in jets from p+p \sqrt{s} = 200 GeV collisions measured by the STAR experiment. The momentum distributions of the \Lambda, \bar{\Lambda} and K0Short particles are obtained using various jet finding algorithms, and then compared to various models. Strange particle ratios in jets are obtained and compared to values obtained from the inclusive spectra. Finally, we show jets tagged with leading strange baryons and mesons, in order to investigate whether gluon or quark jets can be isolated in this way. Comment: 5 pages, 4 figures, Winter Workshop on Nuclear Dynamics 2010, Jamaica

arXiv.org e-Print Archive

Crossref

Marketing of giftwares in the United States.

Author: Caines Elinor A.
Publication venue: Boston University
Publication date: 01/01/1954

Thesis ()--Boston University

Boston University Institutional Repository (OpenBU)

Accurate modelling of language learning tasks and students using representations of grammatical proficiency

Author: Buttery P
Caines A
Davis C
Moore R
Rice A
Zaidi AH
Publication venue: EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining
Publication date: 01/01/2019

Adaptive learning systems aim to learn the relationship between curriculum content and students in order to optimise a student’s learning process. One form of such a system is content recommendation in which the system attempts to predict the most suitable content to next present to the student. In order to develop such a system, we must learn reliable representations of the curriculum content and the student. We consider this in the context of foreign language learning and present a novel neural network architecture to learn such representations. We also show that by incorporating grammatical error distributions as a feature in our neural architecture, we can substantially improve the quality of our representations. Different types of grammatical error are automatically detected in essays submitted by students to an online learning platform. We evaluate our model and representations by predicting student scores and grammatical error distributions on unseen language tasks. We also discuss further uses for our model beyond content recommendation such as inferring student knowledge components for a given domain and optimising spacing and repetition of content for efficient long-term retention. Cambridge Assessment

Apollo (Cambridge)

On saturation of charged hadron production in pp collisions at LHC

Author: Abelev B I (STAR Collaboration)
Bentvelsen S
Caines H (STAR Collaboration)
Dremin I M
I Zborovský
M V Tokarev
Matveev V A
NA61 Collaboration
Polyakov A M
The ATLAS Collaboration
Publication venue: IOP Publishing
Publication date: 09/03/2010

First results on charged hadron transverse momentum spectra in pp collisions obtained by the CMS Collaboration at the LHC were analyzed in the z-scaling approach. The first LHC data confirm z-scaling. The saturation regime of the scaling function psi(z) observed in pp and antiproton-proton interactions at lower energy sqrt s = 19-1960 GeV is verified. The saturation of psi(z) for charged hadrons is found down to z=0.05 at the highest energy sqrt s = 2360 GeV reached so far at colliders. A microscopic scenario of hadron production is discussed in connection with the search for new signatures of phase transitions in hadron matter. Constituent energy loss and its dependencies on the transverse momentum of charged hadrons and collision energy are estimated. The beam energy scan at LHC in the saturation region is suggested. Comment: LaTeX, 6 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Impact of ASR performance on free speaking language assessment

Author: Caines AP
Gales MJF
Knill KM
Kyriakopoulos K
Malinin A
Ragni A
Wang Y
Publication venue: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication date: 01/01/2018

In free speaking tests candidates respond in spontaneous speech to prompts. This form of test allows the spoken language proficiency of a non-native speaker of English to be assessed more fully than read-aloud tests. As the candidate's responses are unscripted, transcription by automatic speech recognition (ASR) is essential for automated assessment. ASR will never be 100% accurate so any assessment system must seek to minimise and mitigate ASR errors. This paper considers the impact of ASR errors on the performance of free speaking test auto-marking systems. Firstly rich linguistically related features, based on part-of-speech tags from statistical parse trees, are investigated for assessment. Then, the impact of ASR errors on how well the system can detect whether a learner's answer is relevant to the question asked is evaluated. Finally, the impact that these errors may have on the ability of the system to provide detailed feedback to the learner is analysed. In particular, pronunciation and grammatical errors are considered as these are important in helping a learner to make progress. As feedback resulting from an ASR error would be highly confusing, an approach to mitigate this problem using confidence scores is also analysed.

Crossref

Apollo (Cambridge)

White Rose Research Online

Is soft physics entropy driven?

Author: A. Tounsi
B.B. Back
D. Adamova
E. Andersen
E. Fermi
F. Antinori
H. Caines
H. Satz
J. Adams
L.D. Landau
M. Gazdicki
M.A. Lisa
O. Barannik
R. Hagedorn
S. Adler
Publication venue: Springer Science and Business Media LLC
Publication date: 06/09/2006

The soft physics, pT < 2 GeV/c, observables at both RHIC and the SPS have now been mapped out in quite specific detail. From these results there is mounting evidence that this regime is primarily driven by the multiplicity per unit rapidity, dNch/deta. This suggests that the entropy of the system alone is the underlying driving force for many of the global observables measured in heavy-ion collisions. That this is the case and there is an apparent independence on collision energy is surprising. I present the evidence for this multiplicity scaling and use it to make some extremely naive predictions for the soft sector results at the LHC. Comment: Proceedings of Hot Quarks 2006. 8 figures, 6 pages

arXiv.org e-Print Archive

Crossref

CAMsterdam at SemEval-2019 task 6: Neural and graph-based feature extraction for the identification of offensive tweets

Author: Aglionby G
Buttery P
Caines A
Davis C
Mishra P
Rei M
Shutova E
Yannakoudakis H
Publication venue: NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop
Publication date: 01/01/2019

We describe the CAMsterdam team entry to the SemEval-2019 Shared Task 6 on offensive language identification in Twitter data. Our proposed model learns to extract textual features using a multi-layer recurrent network, and then performs text classification using gradient-boosted decision trees (GBDT). A self-attention architecture enables the model to focus on the most relevant areas in the text. We additionally learn globally optimised embeddings for hashtags using node2vec, which are given as additional tweet features to the GBDT classifier. Our best model obtains 78.79% macro F1-score on detecting offensive language (subtask A), 66.32% on categorising offence types (targeted/untargeted; subtask B), and 55.36% on identifying the target of offence (subtask C).

Crossref

Spiral - Imperial College Digital Repository

Apollo (Cambridge)