14 research outputs found

Cross-Lingual Lexico-Semantic Transfer in Language Learning

Author: Kochmar E
Shutova E
Publication venue: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
Publication date: 01/01/2016

Lexico-semantic knowledge of our native language provides an initial foundation for second language learning. In this paper, we investigate whether and to what extent the lexico-semantic models of the native language (L1) are transferred to the second language (L2). Specifically, we focus on the problem of lexical choice and investigate it in the context of three typologically diverse languages: Russian, Spanish and English. We show that a statistical semantic model learned from L1 data improves automatic error detection in L2 for the speakers of the respective L1. Finally, we investigate whether the semantic model learned from a particular L1 is portable to other, typologically related languages. Ekaterina Kochmar’s research is supported by Cambridge English Language Assessment via the ALTA Institute. Ekaterina Shutova’s research is supported by the Leverhulme Trust Early Career Fellowship.

Crossref

Apollo (Cambridge)

Comparative judgments are more consistent than binary classification for labelling word complexity

Author: Blackwell A
Gooding S
Kochmar E
Sarkar A
Publication venue: LAW 2019 - 13th Linguistic Annotation Workshop, Proceedings of the Workshop
Publication date: 01/01/2019

© 2019 Association for Computational Linguistics. Lexical simplification systems replace complex words with simple ones based on a model of which words are complex in context. We explore how users can help train complex word identification models through labelling more efficiently and reliably. We show that using an interface where annotators make comparative rather than binary judgments leads to more reliable and consistent labels, and explore whether comparative judgments may provide a faster way for collecting labels.

Crossref

Apollo (Cambridge)

Detecting learner errors in the choice of content words using compositional distributional semantics

Author: Briscoe T
Kochmar E
Publication venue: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers
Publication date: 23/08/2014

We describe a novel approach to error detection in adjective-noun combinations. We present and release a new dataset of annotated errors where the examples are extracted from learner texts and annotated with error types. We show how compositional distributional semantic approaches can be applied to discriminate between correct and incorrect word combinations from learner data. Finally, we show how the output of the compositional distributional semantic models can be used as features in a classifier yielding good precision and accuracy. We are grateful to Cambridge English Language Assessment and Cambridge University Press for supporting this research and for granting us access to the CLC for research purposes.

Apollo (Cambridge)

‘Calling on the classical phone’: a distributional model of adjective-noun errors in learners’ English

Author: Herbelot A
Kochmar E
Publication venue: Proceedings of COLING 2016
Publication date: 01/01/2016

In this paper we discuss three key points related to error detection (ED) in learners’ English. We focus on content word ED as one of the most challenging tasks in this area, illustrating our claims on adjective–noun (AN) combinations. In particular, we (1) investigate the role of context in accurately capturing semantic anomalies and implement a system based on distributional topic coherence, which achieves state-of-the-art accuracy on a standard test set; (2) thoroughly investigate our system’s performance across individual adjective classes, concluding that a class-dependent approach is beneficial to the task; (3) discuss the data size bottleneck in this area, and highlight the challenges of automatic error generation for content words. Ekaterina Kochmar’s research is supported by Cambridge English Language Assessment via the ALTA Institute. Aurélie Herbelot’s contribution to this paper was similarly supported by ALTA.

Apollo (Cambridge)

Archive ouverte UNIGE

Classification of Twitter accounts into automated agents and human users

Author: Crowcroft J
Gilani Z
Kochmar E
Publication venue: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017
Publication date: 31/07/2017

© 2017 Association for Computing Machinery. Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. The massive human user base and business-supportive operating model of social networks (such as Twitter) facilitate the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge the accuracy of the procedure we calculate agreement among human annotators as well as with a bot detection research tool. We then apply a Random Forests classifier that achieves an accuracy close to human agreement. Finally, as a concluding step we perform tests to measure the efficacy of our results.

OPUS

Apollo (Cambridge)

Grammatical error correction using hybrid systems and type filtering

Author: Andersen ØE
Felice M
Kochmar E
Yannakoudakis H
Yuan Z
Publication venue: CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings of the Shared Task
Publication date: 01/01/2014

This paper describes our submission to the CoNLL 2014 shared task on grammatical error correction using a hybrid approach, which includes both a rule-based and an SMT system augmented by a large web-based language model. Furthermore, we demonstrate that correction type estimation can be used to remove unnecessary corrections, improving precision without harming recall. Our best hybrid system achieves state-of-the-art results, ranking first on the original test set and second on the test set with alternative annotations. [We would like to thank] Cambridge English Language Assessment, a division of Cambridge Assessment, for supporting this research.

CiteSeerX

Crossref

Apollo (Cambridge)

Capturing anomalies in the choice of content words in compositional distributional semantic space

Author: Briscoe T
Kochmar E
Publication venue: Proceedings of Recent Advances in Natural Language Processing
Publication date: 07/09/2013

In this work, we present a new task for testing compositional distributional semantic models. Recently, there has been a spate of research into how distributional representations of individual words can be combined to represent the meaning of phrases. Vecchi et al. (2011) have shown that some compositional models, including the additive and multiplicative models of Mitchell and Lapata (2008; 2010) and the linear map-based model of Baroni and Zamparelli (2010), can be applied to detect semantically anomalous adjective-noun combinations. We extend their experiments and apply these models to the combinations extracted from texts written by learners of English. Our work contributes to the field of compositional distributional semantics by introducing a new test paradigm for semantic models and shows how these models can be used for error detection in language learners' content word combinations. We are grateful to Cambridge ESOL, a division of Cambridge Assessment, and Cambridge University Press for supporting this research and for granting us access to the CLC for research purposes.

Apollo (Cambridge)

SYNDROMES OF BEHAVIORAL AND SPEECH DISORDERS ASSOCIATED WITH BENIGN EPILEPTIFORM DISCHARGES OF CHILDHOOD ON ELECTROENCEPHALOGRAM

Author: A. V. Polyakov
E. A. Tupikina
I. A. Sadekov
I. V. Sadekova
T. V. Termenzhi
V. Yu. Kochmar
Publication venue: Publishing House ABV Press
Publication date: 01/04/2017

Objective: to assess the role and significance of benign epileptiform discharges of childhood (BEDC) on electroencephalogram (EEG) in the development of speech and behavioral disorders in children. Materials and methods: 90 children aged 3–7 years were included in the study: 30 of them were healthy, 30 had attention deficit hyperactivity disorder (ADHD), and 30 had expressive language disorder (ELD). We analyzed the role of persistent epileptiform activity (BEDC type) on EEG, as well as frontal intermittent rhythmic delta activity, in the development of some neuropsychiatric disorders and speech disorders in children. Results: We propose distinguishing a special variant of ADHD, epileptiform disintegration of behavior; we also propose strategies for its therapeutic correction. Conclusion: Detection of epileptiform activity (BEDC type) on EEG in children with ELD is a predictor of the development of cognitive disorders and requires therapeutic correction, which should be aimed at stimulating brain maturation. Detection of frontal intermittent rhythmic delta activity in children with ELD requires neuroimaging followed by determination of a treatment strategy.

Directory of Open Access Journals