7,322 research outputs found

A Lite Romanian BERT: ALR-BERT

Authors: Nicolae Dragoş Constantin; Tufiş Dan; Yadav Rohan Kumar
Publication venue: MDPI AG
Publication date: 01/01/2022

Large-scale pre-trained language representations and their promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing model size in order to outperform the best previously obtained results. However, at some point, increasing the number of parameters may bring a model to its saturation point because of the limited capacity of GPUs/TPUs. In addition, such models are mostly available in English or as a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we call “A Lite Romanian BERT (ALR-BERT)”. Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as open source together with the downstream task.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

NORA - Norwegian Open Research Archives

Agder University Research Archive

Data Selection for Compact Adapted SMT Models

Authors: Besacier Laurent; Mirkin Shachar
Publication venue: HAL CCSD
Publication date: 01/10/2014

Data selection is a common technique for adapting statistical translation models to a specific domain, and has been shown both to improve translation quality and to reduce model size. Selection relies on some in-domain data, from the same domain as the texts expected to be translated. Selecting the sentence pairs that are most similar to the in-domain data from a pool of parallel texts has been shown to be effective; yet this approach risks limited coverage when necessary n-grams that do appear in the pool are less similar to the in-domain data available in advance. Some methods select additional data based on the actual text that needs to be translated. While useful, this is not always a practical scenario. In this work we describe an extensive exploration of data selection techniques over Arabic-to-French datasets, and propose methods to address both similarity and coverage considerations while maintaining a limited model size.

Hal - Université Grenoble Alpes

Using Twitter to learn about the autism community

Authors: Arandjelovic Ognjen; Beykikhoshk Adham; Caelli Terry; Phung Dinh; Venkatesh Svetha
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Considering the rising socio-economic burden of autism spectrum disorder (ASD), timely and evidence-driven public policy decision making and communication of the latest guidelines pertaining to the treatment and management of the disorder are crucial. Yet evidence suggests that policy makers and medical practitioners do not always have a good understanding of the practices and relevant beliefs of ASD-afflicted individuals' carers, who often follow questionable recommendations and adopt advice poorly supported by scientific data. The key goal of the present work is to explore the idea that Twitter, as a highly popular platform for information exchange, could be used as a data-mining source to learn about the population affected by ASD: their behaviour, concerns, needs, etc. To this end, using a large data set of over 11 million harvested tweets as the basis for our investigation, we describe a series of experiments which examine a range of linguistic and semantic aspects of messages posted by individuals interested in ASD. Our findings, the first of their nature in the published scientific literature, strongly motivate additional research on this topic and present a methodological basis for further work. Comment: Social Network Analysis and Mining, 201

arXiv.org e-Print Archive

Deakin Research Online

University of St. Andrews - Pure

Automated detection of age-related macular degeneration in color fundus photography: a systematic review

Authors: Cameron James; Dhillon Baljean; Fleming Alan; MacGillivray Thomas; Megaw Roly; Pead Emma; Trucco Emanuele
Publication venue: Elsevier BV
Publication date: 18/02/2019

Edinburgh Research Explorer

University of Dundee Online Publications

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

Authors: Caciularu Avi; Geva Mor; Goldberg Yoav; Wang Kevin Ro
Publication date: 20/04/2022

Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process by reverse-engineering the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models. We view the token representation as a changing distribution over the vocabulary, and the output from each FFN layer as an additive update to that distribution. Then, we analyze the FFN updates in the vocabulary space, showing that each update can be decomposed into sub-updates corresponding to single FFN parameter vectors, each promoting concepts that are often human-interpretable. We then leverage these findings for controlling LM predictions, where we reduce the toxicity of GPT2 by almost 50%, and for improving computation efficiency with a simple early exit rule, saving 20% of computation on average.
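The decomposition described in this abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' code: it assumes a standard two-matrix FFN with a ReLU activation and no bias terms, uses random weights, and introduces hypothetical names (`K`, `V`, `E`, `coeffs`) purely for illustration. It shows that the FFN output equals the sum of one sub-update per parameter vector, and that each sub-update can be scored against the vocabulary through an output embedding matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 8, 16, 20        # toy sizes, chosen only for illustration

K = rng.normal(size=(d_ff, d_model))    # first FFN matrix (assumed layout: one row per hidden unit)
V = rng.normal(size=(d_ff, d_model))    # second FFN matrix (one parameter vector per row)
E = rng.normal(size=(vocab, d_model))   # hypothetical output embedding matrix

x = rng.normal(size=d_model)            # token representation entering the layer

coeffs = np.maximum(K @ x, 0.0)         # ReLU activations: one coefficient per parameter vector
ffn_out = coeffs @ V                    # full FFN output (bias terms omitted in this sketch)

# Each sub-update is a single parameter vector scaled by its coefficient.
sub_updates = coeffs[:, None] * V       # shape (d_ff, d_model)

# Score every sub-update against every vocabulary item.
vocab_scores = sub_updates @ E.T        # shape (d_ff, vocab)

# Vocabulary items most promoted by the most strongly activated parameter vector.
top_unit = int(np.argmax(coeffs))
promoted = np.argsort(vocab_scores[top_unit])[::-1][:3]
print("most active unit:", top_unit, "promotes vocabulary ids:", promoted)
```

Because the projection onto the vocabulary is linear, the scores of the updated representation are the original scores plus the sum of the per-vector scores, which is what makes the update additive in this simplified setting.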

arXiv.org e-Print Archive

Decentralized Large-Scale Natural Language Processing Using Gossip Learning

Author: Alkathiri Abdul Aziz
Publication date: 23/06/2020

Pure OAI Repository

“Fallen through the cracks”: Teachers’ perceptions of barriers faced by struggling literacy learners in secondary school

Author: Merga Margaret K.
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2020

© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. Struggling literacy learners are typically low achievers with poor engagement in literacy learning, and the gap between struggling and capable students widens as children move through the years of schooling. Literacy research and interventions for struggling literacy learners typically focus on the primary school years. The 2019 Supporting Struggling Secondary Literacy Learners mixed-methods project collected qualitative data on teacher perceptions of the barriers experienced by their struggling literacy learners in Australian mainstream secondary English classrooms. Recurring barriers included literacy skill gaps and English as an additional language status, absenteeism, home factors, student attitudes and engagement, school and systems factors, and learning difficulties and disabilities influencing learning. This project found high agreement with diverse individual and group level barriers, and diverse learner barriers were negatively associated with perceived adequacy of time to meet the needs of struggling literacy learners.

Research Online @ ECU

Dynamic Data Selection for Neural Machine Translation

Authors: Bisazza A.; Monz C.; van der Wees M.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2017

Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT). With the recent increase in popularity of neural machine translation (NMT), we explore in this paper to what extent and how NMT can also benefit from data selection. While state-of-the-art data selection (Axelrod et al., 2011) consistently performs well for PBMT, we show that gains are substantially lower for NMT. Next, we introduce dynamic data selection for NMT, a method in which we vary the selected subset of training data between different training epochs. Our experiments show that the best results are achieved when applying a technique we call gradual fine-tuning, with improvements up to +2.6 BLEU over the original data selection approach and up to +3.1 BLEU over a general baseline. Comment: Accepted at EMNLP201
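The idea of varying the selected training subset between epochs can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: it assumes each sentence pair already carries a relevance score (higher means closer to the target domain) and that the subset shrinks geometrically toward the top-ranked pairs from one epoch to the next. The function name `gradual_subsets` and the parameters `start_fraction` and `shrink` are hypothetical.

```python
def gradual_subsets(scored_pairs, start_fraction=1.0, shrink=0.5, epochs=4):
    """Return one training subset per epoch, each a smaller top-ranked slice.

    scored_pairs: list of (sentence_pair, relevance_score) tuples; the scoring
    method is assumed to exist already and is not part of this sketch.
    """
    # Rank once by relevance, most relevant first.
    ranked = sorted(scored_pairs, key=lambda pair: pair[1], reverse=True)
    subsets = []
    fraction = start_fraction
    for _ in range(epochs):
        keep = max(1, int(len(ranked) * fraction))   # never select an empty subset
        subsets.append([sentence for sentence, _ in ranked[:keep]])
        fraction *= shrink                           # assumed geometric shrinking schedule
    return subsets


# Toy usage: eight pairs with made-up scores.
pairs = [(f"s{i}", float(i)) for i in range(8)]
for epoch, subset in enumerate(gradual_subsets(pairs), start=1):
    print(epoch, len(subset), subset)
```

With this schedule, later epochs train only on the pairs ranked most relevant, so each subset is a prefix of the previous one. Other schedules, such as resampling a fresh subset each epoch, would fit the same interface.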

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE