35 research outputs found

Contextualized Diachronic Word Representations

Author: Jawahar Ganesh
Seddah Djamé
Publication venue: HAL CCSD
Publication date: 01/01/2019

Diachronic word embeddings play a key role in capturing interesting patterns about how language evolves over time. Most of the existing work focuses on studying corpora spanning several decades, which is understandably still not a possibility when working on social media-based user-generated content. In this work, we address the problem of studying semantic changes in a large Twitter corpus collected over five years, a much shorter period than what is usually the norm in diachronic studies. We devise a novel attentional model, based on Bernoulli word embeddings, that are conditioned on contextual extra-linguistic (social) features such as network, spatial and socioeconomic variables, which are associated with Twitter users, as well as topic-based features. We posit that these social features provide an inductive bias that helps our model to overcome the narrow time-span regime problem. Our extensive experiments reveal that our proposed model is able to capture subtle semantic shifts without being biased towards frequency cues and also works well when certain contextual features are absent. Our model fits the data better than current state-of-the-art dynamic word embedding models and therefore is a promising tool to study diachronic semantic changes over small time periods.

Crossref

INRIA a CCSD electronic archive server

What does BERT learn about the structure of language?

Author: Jawahar Ganesh
Sagot Benoît
Seddah Djamé
Publication venue: HAL CCSD
Publication date: 01/01/2019

BERT is a recent language representation model that has performed surprisingly well in diverse language understanding benchmarks. This result indicates the possibility that BERT networks capture structural information about language. In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. We first show that BERT's phrasal representation captures phrase-level information in the lower layers. We also show that BERT's intermediate layers encode a rich hierarchy of linguistic information, with surface features at the bottom, syntactic features in the middle and semantic features at the top. BERT turns out to require deeper layers when long-distance dependency information is required, e.g. to track subject-verb agreement. Finally, we show that BERT representations capture linguistic information in a compositional way that mimics classical, tree-like structures.

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

LLM Performance Predictors are good initializers for Architecture Search

Author: Abdul-Mageed Muhammad
Ding Dujian
Jawahar Ganesh
Lakshmanan Laks V. S.
Publication date: 25/10/2023

Large language models (LLMs) have become an integral component in solving a wide range of NLP tasks. In this work, we explore a novel use case of using LLMs to build performance predictors (PP): models that, given a specific deep neural network architecture, predict its performance on a downstream task. We design PP prompts for LLMs consisting of: (i) role: description of the role assigned to the LLM, (ii) instructions: set of instructions to be followed by the LLM to carry out performance prediction, (iii) hyperparameters: a definition of each architecture-specific hyperparameter and (iv) demonstrations: sample architectures along with their efficiency metrics and 'training from scratch' performance. For machine translation (MT) tasks, we discover that GPT-4 with our PP prompts (LLM-PP) can predict the performance of an architecture with a mean absolute error matching the SOTA and a marginal degradation in rank correlation coefficient compared to SOTA performance predictors. Further, we show that the predictions from LLM-PP can be distilled to a small regression model (LLM-Distill-PP). LLM-Distill-PP models surprisingly retain most of the performance of LLM-PP and can be a cost-effective alternative for heavy use cases of performance estimation. Specifically, for neural architecture search (NAS), we propose a Hybrid-Search algorithm for NAS (HS-NAS), which uses LLM-Distill-PP for the initial part of the search, resorting to the baseline predictor for the rest of the search. We show that HS-NAS performs very similarly to SOTA NAS across benchmarks, reduces search hours by roughly 50%, and in some cases, improves latency, GFLOPs, and model size.
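The four prompt components named in this abstract (role, instructions, hyperparameters, demonstrations) could be assembled roughly as in the sketch below. This is only an illustration of the listed structure, not the authors' prompt: the function name `build_pp_prompt`, the section labels, the wording, and every value in the example call are hypothetical placeholders.

```python
def build_pp_prompt(role, instructions, hyperparameters, demonstrations, query_arch):
    """Join the four components in the order the abstract lists them.

    All section labels and formatting here are assumptions for illustration.
    """
    lines = [f"Role: {role}", "", "Instructions:"]
    lines += [f"- {step}" for step in instructions]
    lines += ["", "Hyperparameters:"]
    lines += [f"- {name}: {desc}" for name, desc in hyperparameters.items()]
    lines += ["", "Demonstrations:"]
    for arch, metrics in demonstrations:
        lines.append(f"- architecture={arch} metrics={metrics}")
    # The architecture whose performance the LLM is asked to predict.
    lines += ["", f"Predict the performance of: {query_arch}"]
    return "\n".join(lines)


# Placeholder inputs only; none of these values come from the paper.
prompt = build_pp_prompt(
    role="You are a performance estimator for translation models.",
    instructions=["Read the demonstrations.", "Output a single number."],
    hyperparameters={"encoder_layers": "number of encoder layers"},
    demonstrations=[({"encoder_layers": 6}, {"gflops": 1.0, "score": 0.0})],
    query_arch={"encoder_layers": 4},
)
print(prompt)
```

The resulting string would then be sent to whichever LLM is used; that call is left out because the abstract does not describe it.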

arXiv.org e-Print Archive

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora

Author: Goldberg Yoav
Gonen Hila
Jawahar Ganesh
Seddah Djamé
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science. This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large. However, these methods often require extensive filtering of the vocabulary to perform well, and, as we show in this work, result in unstable, and hence less reliable, results. We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word. The method is simple, interpretable and stable. We demonstrate its effectiveness in 9 different setups, considering different corpus splitting criteria (age, gender and profession of tweet authors, time of tweet) and different languages (English, French and Hebrew).
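One way to read "considers the neighbors of each word" is to compare a word's nearest neighbors in each corpus's own embedding space, with no alignment step. The sketch below assumes that reading and scores a word by the size of the overlap between its top-k neighbor sets; the abstract does not state the exact scoring, and the function names and the two-dimensional toy vectors are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbors(word, space, k):
    """The k words most similar to `word` within a single embedding space."""
    others = [w for w in space if w != word]
    others.sort(key=lambda w: cosine(space[word], space[w]), reverse=True)
    return set(others[:k])


def neighbor_overlap(word, space_a, space_b, k=2):
    """Count neighbors shared across the two spaces (lower suggests more usage change)."""
    shared = set(space_a) & set(space_b)  # compare only words present in both corpora
    a = {w: v for w, v in space_a.items() if w in shared}
    b = {w: v for w, v in space_b.items() if w in shared}
    return len(top_k_neighbors(word, a, k) & top_k_neighbors(word, b, k))


# Toy vectors (made up): "apple" sits near fruit words in A and near device words in B.
space_a = {"apple": (1.0, 0.1), "fruit": (0.9, 0.2), "pear": (1.0, 0.0),
           "banana": (0.95, 0.15), "phone": (0.1, 1.0), "laptop": (0.0, 1.0)}
space_b = {"apple": (0.1, 0.9), "fruit": (1.0, 0.1), "pear": (0.9, 0.0),
           "banana": (1.0, 0.05), "phone": (0.0, 1.0), "laptop": (0.2, 1.0)}

apple_score = neighbor_overlap("apple", space_a, space_b)
banana_score = neighbor_overlap("banana", space_a, space_b)
print(apple_score, banana_score)
```

Because each neighbor set is computed inside its own space, the two spaces never need to share a coordinate system, which is the property the abstract contrasts with alignment-based methods.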

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation

Author: Abdul-Mageed Muhammad
Awadallah Ahmed Hassan
Bubeck Sebastien
Gao Jianfeng
Jawahar Ganesh
Kim Young Jin
Lakshmanan Laks V. S.
Liu Xiaodong
Mukherjee Subhabrata
Publication date: 07/06/2023

Mixture-of-Expert (MoE) models have obtained state-of-the-art performance in Neural Machine Translation (NMT) tasks. Existing works in MoE mostly consider a homogeneous design where the same number of experts of the same size are placed uniformly throughout the network. Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design. To this end, we develop AutoMoE, a framework for designing heterogeneous MoEs under computational constraints. AutoMoE leverages Neural Architecture Search (NAS) to obtain efficient sparse MoE sub-transformers with 4x inference speedup (CPU) and FLOPs reduction over manually designed Transformers, with parity in BLEU score over dense Transformer and within 1 BLEU point of MoE SwitchTransformer, on aggregate over benchmark datasets for NMT. A heterogeneous search space with dense and sparsely activated Transformer modules (e.g., how many experts? where to place them? what should be their sizes?) allows for adaptive compute, where different amounts of computation are used for different tokens in the input. Adaptivity comes naturally from routing decisions which send tokens to experts of different sizes. AutoMoE code, data, and trained models are available at https://aka.ms/AutoMoE.

Comment: ACL 2023 Findings

arXiv.org e-Print Archive

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Author: Abdul-Mageed Muhammad
de Rosa Gustavo Henrique
Dey Debadeepta
Jawahar Ganesh
Lakshmanan Laks V. S.
Mendes Caio Cesar Teodoro
Mukherjee Subhabrata
Shah Shital
Publication date: 06/10/2022

Autocomplete is a task where the user inputs a piece of text, termed the prompt, on which the model conditions to generate a semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been quite effective. In this work, we study the more challenging setting consisting of low frequency user prompt patterns (or broad prompts, e.g., a prompt about the 93rd Academy Awards) and demonstrate the effectiveness of character-based language models. We study this problem under memory-constrained settings (e.g., edge devices and smartphones), where character-based representation is effective in reducing the overall model size (in terms of parameters). We use the WikiText-103 benchmark to simulate broad prompts and demonstrate that character models rival word models in exact match accuracy for the autocomplete task, when controlled for the model size. For instance, we show that a 20M parameter character model performs similarly to an 80M parameter word model in the vanilla setting. We further propose novel methods to improve character models by incorporating inductive bias in the form of compositional information and representation transfer from large word models.

arXiv.org e-Print Archive

ELMoLex: Connecting ELMo and Lexicon features for Dependency Parsing

Author: Fethi Amal
Jawahar Ganesh
Martin Louis
Muller Benjamin
Sagot Benoît
Seddah Djamé
Villemonte de La Clergerie Éric
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2018

In this paper, we present the details of the neural dependency parser and the neural tagger submitted by our team 'ParisNLP' to the CoNLL 2018 Shared Task on parsing from raw text to Universal Dependencies. We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations; we utilize disambiguated, embedded, morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature set. Henceforth, we call our system 'ELMoLex'. In addition to incorporating character embeddings, ELMoLex leverages pre-trained word vectors, ELMo and morphosyntactic features (whenever available) to correctly handle rare or unknown words, which are prevalent in languages with complex morphology. ELMoLex ranked 11th by Labeled Attachment Score metric (70.64%), Morphology-aware LAS metric (55.74%) and ranked 9th by Bilexical dependency metric (60.70%). In an extrinsic evaluation setup, ELMoLex ranked 7th for Event Extraction, Negation Resolution tasks and 11th for Opinion Analysis task by F1 score.

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Prevalence and architecture of de novo mutations in developmental disorders.

Author: Ahmed Munaza
Aitken Stuart
Akawi Nadia
Alvi Mohsan
Ambridge Kirsty
Anjum Uruj
Archer Hayley
Armstrong Ruth
Awada Jana
Balasubramanian Meena
Banka Siddharth
Baralle Diana
Barnicoat Angela
Barrett Daniel M.
Barrett Jeffrey C.
Batstone Paul
Baty David
Bayzetinova Tanya
Bennett Chris
Berg Jonathan
Bernhard Birgitta
Bevan A. Paul
Bitner-Glindzicz Maria
Blair Edward
Blyth Moira
Bohanna David
Bourdon Louise
Bourn David
Bradley Lisa
Brady Angela
Brent Simon
Brewer Carole
Brunstrom Kate
Bunyan David J.
Burn John
Canham Natalie
Castle Bruce
Chandler Kate
Chatzimichali Elena
Cilliers Deirdre
Clarke Angus
Clasper Susan
Clayton Stephen
Clayton-Smith Jill
Clowes Virginia
Coates Andrea
Cole Trevor
Colgiu Irina
Collins Amanda
Collinson Morag N.
Connell Fiona
Cooper Nicola
Cox Helen
Cresswell Lara
Cross Gareth
Crow Yanick
D'Alessandro Mariella
Dabir Tabib
Davidson Rosemarie
Davies Sally
de Vries Dylan
Dean John
Deshpande Charu
Devlin Gemma
Dixit Abhijit
Dobbie Angus
Donaldson Alan
Donnai Dian
Donnelly Carina
Donnelly Deirdre
Douglas Angela
Douzgou Sofia
Duncan Alexis
Eason Jacqueline
Ellard Sian
Ellis Ian
Elmslie Frances
Evans Karenza
Everest Sarah
Fendick Tina
Firth Helen V.
Fisher Richard
Fitzgerald Tomas W.
FitzPatrick David R.
Flinter Frances
Foulds Nicola
Fry Andrew
Fryer Alan
Gardiner Carol
Gaunt Lorraine
Ghali Neeti
Gibbons Richard
Gill Harinder
Goodship Judith
Goudie David
Gray Emma
Green Andrew
Greene Philip
Greenhalgh Lynn
Gribble Susan
Harrison Lucy
Harrison Rachel
Harrison Victoria
Hawkins Rose
He Liu
Hellens Stephen
Henderson Alex
Hewitt Sarah
Hildyard Lucy
Hobson Emma
Holden Simon
Holder Muriel
Holder Susan
Hollingsworth Georgina
Homfray Tessa
Humphreys Mervyn
Hurles Matthew E.
Hurst Jane
Hutton Ben
Ingram Stuart
Irving Melita
Islam Lily
Jackson Andrew
Jarvis Joanna
Jenkins Lucy
Johnson Diana
Jones Elizabeth
Jones Philip
Jones Wendy D.
Josifova Dragana
Joss Shelagh
Kaemba Beckie
Kaplanis Joanna
Kazembe Sandra
Kelsell Rosemary
Kerr Bronwyn
King Daniel
Kingston Helen
Kini Usha
Kinning Esther
Kirby Gail
Kirk Claire
Kivuva Emma
Kraus Alison
Krishnappa Netravathi
Kumar Dhavendra
Kumar V. K. Ajith
Lachlan Katherine
Lam Wayne
Lampe Anne
Langman Caroline
Lees Melissa
Lim Derek
Longman Cheryl
Lowther Gordon
Lynch Sally A.
Magee Alex
Maher Eddy
Male Alison
Mansour Sahar
Marks Karen
Martin Katherine
Mason Laura E.
Maye Una
McCann Emma
McConnell Vivienne
McEntagart Meriel
McGowan Ruth
McKay Kirsten
McKee Shane
McMullan Dominic J.
McNerlan Susan
McRae Jeremy F.
McWilliam Catherine
Mehta Sarju
Metcalfe Kay
Middleton Anna
Miedzybrodzka Zosia
Miles Emma
Mohammed Shehla
Montgomery Tara
Moore David
Morgan Sian
Morton Jenny
Mugalaasi Hood
Murday Victoria
Murphy Helen
Naik Swati
Nellåker Chris
Nemeth Andrea
Nevitt Louise
Newbury-Ecob Ruth
Norman Andrew
O'Shea Rosie
Ogilvie Caroline
Ong Kai-Ren
Park Soo-Mi
Parker Michael
Parker Michael J.
Patel Chirag
Paterson Joan
Payne Stewart
Perrett Daniel
Phipps Julie
Pilz Daniela T.
Pollard Martin
Pottinger Caroline
Poulton Joanna
Pratt Norman
Prescott Katrina
Price Sue
Pridham Abigail
Prigmore Elena
Procter Annie
Purnell Hellen
Quarrell Oliver
Ragge Nicola
Rahbari Raheleh
Rajan Diana
Randall Josh
Rankin Julia
Raymond Lucy
Rice Debbie
Robert Leema
Roberts Eileen
Roberts Gillian
Roberts Jonathan
Roberts Paul
Ross Alison
Rosser Elisabeth
Saggar Anand
Samant Shalaka
Sampson Julian
Sandford Richard
Sarkar Ajoy
Schweiger Susann
Scott Richard
Scurr Ingrid
Selby Ann
Seller Anneke
Sequeira Cheryl
Shannon Nora
Sharif Saba
Shaw-Smith Charles
Shearing Emma
Shears Debbie
Sheridan Eamonn
Sifrim Alejandro
Simonic Ingrid
Singh Tarjinder
Singzon Roldan
Skitt Zara
Smith Audrey
Smith Kath
Smithson Sarah
Sneddon Linda
Splitt Miranda
Squires Miranda
Stewart Fiona
Stewart Helen
Straub Volker
Suri Mohnish
Sutton Vivienne
Swaminathan Ganesh Jawahar
Sweeney Elizabeth
Tatton-Brown Kate
Taylor Cat
Taylor Rohan
Tein Mark
Temple I. Karen
Thomson Jenny
Tischkowitz Marc
Tivey Adrian R.
Tomkins Susan
Torokwa Audrey
Treacy Becky
Turner Claire
Turnpenny Peter
Tysoe Carolyn
Vandersteen Anthony
Varghese Vinod
Vasudevan Pradeep
Vijayarangakannan Parthiban
Vogt Julie
Wakeling Emma
Wallwark Sarah
Waters Jonathon
Weber Astrid
Wellesley Diana
Whiteford Margo
Widaa Sara
Wilcox Sarah
Wilkinson Emily
Williams Denise
Williams Nicola
Wilson Louise
Woods Geoff
Wragg Christopher
Wright Caroline F.
Wright Michael
Yates Laura
Yau Michael
Publication venue: Nature
Publication date: 24/01/2017

The genomes of individuals with severe, undiagnosed developmental disorders are enriched in damaging de novo mutations (DNMs) in developmentally important genes. Here we have sequenced the exomes of 4,293 families containing individuals with developmental disorders, and meta-analysed these data with data from another 3,287 individuals with similar disorders. We show that the most important factors influencing the diagnostic yield of DNMs are the sex of the affected individual, the relatedness of their parents, whether close relatives are affected and the parental ages. We identified 94 genes enriched in damaging DNMs, including 14 that previously lacked compelling evidence of involvement in developmental disorders. We have also characterized the phenotypic diversity among these disorders. We estimate that 42% of our cohort carry pathogenic DNMs in coding sequences; approximately half of these DNMs disrupt gene function and the remainder result in altered protein function. We estimate that developmental disorders caused by DNMs have an average prevalence of 1 in 213 to 1 in 448 births, depending on parental age. Given current global demographics, this equates to almost 400,000 children born per year

Southampton (e-Prints Soton)

Online Research @ Cardiff

UCL Discovery

Edinburgh Research Explorer

Open Research Exeter

Enlighten

The University of Manchester - Institutional Repository

Apollo (Cambridge)

White Rose Research Online

St George's Online Research Archive

Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing.

Author: A Kiezun
A McKenna
A Ramu
Alan Fryer
Alejandro Sifrim
Allan Daly
Ami Ketley
Anna Wilsdon
Anne Katrin Lampe
Anne-Karin Kahlert
Ashok Kumar Manickara
B Thienpont
Bernard Keavney
Bernard Thienpont
Brigitte Stiller
C Moore
Carmel Moore
Caroline F Wright
Catherine Cosgrove
CE Schulkey
CF Wright
Chris Thornborough
D Blazek
D Szklarczyk
David J Roberts
David R FitzPatrick
Denise Williams
Diana Rajan
Dorin Manase
Elena Prigmore
Emma Hobson
Felix Berger
Frances Bu'Lock
Ganesh Jawahar Swaminathan
H Li
H Li
H-H Chen
Hans-Heiner Kramer
Hashim Abdul-Khaliq
Helen Cox
Helen V Firth
HK Gill
Hugh Watkins
I Iossifov
Ingo Daehnert
Irina-Gabriela Colgiu
J David Brook
J de Ligt
J Fielitz
J Homsy
Jacoba J Louw
Jamie Bentham
Jeffrey C Barrett
Jennifer Sambrook
Jeremy McRae
Jeroen Breckpot
JIE Hoffman
JM Giroud
John Danesh
Judith Goodship
K Breuer
K Liang
K Tatton-Brown
Karen P McCarthy
Katherine Lachlan
Kay Metcalfe
KE Samocha
Kerry Setchfield
KG Ingram
Kirstin Hoff
Koenraad Devriendt
L Bordoli
Leema Robert
Marc Gewillig
Marc-Phillip Hitz
Martin O Pollard
Matthew E Hurles
ME Pierpont
Michael J Parker
Michael Wright
N Akawi
N Miyake
N Øyen
Natalie Canham
Okan Toka
Piers E F Daubeney
PN Robinson
R Shaheen
RC Bauer
Riyadh Mahdi Abu-Sulaiman
Ruth Newbury-Ecob
S Zaidi
Sabine Klaassen
Saeed H Al Turki
SE Polo
Seema Mital
Seham Osman Omer
Shoumo Bhattacharya
Siddharth Banka
Soo-Mi Park
Tarjinder Singh
Tessa Homfray
Thomas Pickardt
Tomas W Fitzgerald
Ulrike M M Bauer
Willem H Ouwehand
X He
Y Jia
Y Li
Publication venue: Nat Genet
Publication date: 24/06/2016

Congenital heart defects (CHDs) have a neonatal incidence of 0.8-1% (refs. 1,2). Despite abundant examples of monogenic CHD in humans and mice, CHD has a low absolute sibling recurrence risk (∼2.7%), suggesting a considerable role for de novo mutations (DNMs) and/or incomplete penetrance. De novo protein-truncating variants (PTVs) have been shown to be enriched among the 10% of 'syndromic' patients with extra-cardiac manifestations. We exome sequenced 1,891 probands, including both syndromic CHD (S-CHD, n = 610) and nonsyndromic CHD (NS-CHD, n = 1,281). In S-CHD, we confirmed a significant enrichment of de novo PTVs but not inherited PTVs in known CHD-associated genes, consistent with recent findings. Conversely, in NS-CHD we observed significant enrichment of PTVs inherited from unaffected parents in CHD-associated genes. We identified three genome-wide significant S-CHD disorders caused by DNMs in CHD4, CDK13 and PRKD1. Our study finds evidence for distinct genetic architectures underlying the low sibling recurrence risk in S-CHD and NS-CHD

Southampton (e-Prints Soton)

Edinburgh Research Explorer

Spiral - Imperial College Digital Repository

The University of Manchester - Institutional Repository

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Oxford University Research Archive

Open Research Exeter

Apollo (Cambridge)

MDC Repository