Syntactic Topic Models

Author: Blei David M.
Boyd-Graber Jordan
Publication date: 01/01/2008

The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency-parsed corpora in which sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.
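As a rough illustration of how a word's topic can be made "likely under both its document and syntactic context", the sketch below multiplies a document-level topic distribution by a parent-conditioned topic distribution and renormalizes. The function name, variable names, and numbers are hypothetical, and this shows only the combination step under that assumption, not the model's inference algorithm.

```python
def combine_topic_weights(doc_topics, syntax_topics):
    """Multiply two topic distributions pointwise and renormalize to sum to 1."""
    if len(doc_topics) != len(syntax_topics):
        raise ValueError("distributions must cover the same topics")
    product = [d * s for d, s in zip(doc_topics, syntax_topics)]
    total = sum(product)
    if total == 0:
        raise ValueError("no topic has support under both distributions")
    return [p / total for p in product]


# Hypothetical numbers for three topics (illustration only).
theta_doc = [0.7, 0.2, 0.1]   # document-level topic proportions
pi_parent = [0.1, 0.1, 0.8]   # topic preferences implied by the parent in the parse tree
combined = combine_topic_weights(theta_doc, pi_parent)
```

A topic scores highly in `combined` only when both sources give it non-negligible weight, which is the sense in which the two kinds of context constrain each other.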

arXiv.org e-Print Archive

CiteSeerX

Semantic diversity: A measure of contextual variation in word meaning based on latent semantic analysis

Author: AM Woollams
British National Corpus Consortium
C Metzler
CD Piercey
D Kieras
DA Cruse
DE Klein
E Jefferies
EM Saffran
F Corbett
G Kellas
GA Miller
H Head
H Rubenstein
J Altarriba
J Morton
J Rodd
JD Bransford
JE Jastrzembski
JL Elman
JL McClelland
JM Rodd
JS Adelman
JT Giles
K Lund
KA Noonan
M Bedny
M Coltheart
MA Lambon Ralph
Matthew A. Lambon Ralph
MF St. John
MJ Yap
MN Jones
MW Harm
P Hoffman
Paul Hoffman
PJ Schwanenflugel
R Borowsky
RC Galbraith
S Zeno
SA McDonald
Timothy T. Rogers
TK Landauer
TL Griffiths
TT Rogers
Y Hino
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2013

Crossref

Edinburgh Research Explorer

The University of Manchester - Institutional Repository

Analysing Lexical Semantic Change with Contextualised Word Representations

Author: Del Tredici Marco
Fernández Raquel
Giulianelli Mario
Publication date: 01/01/2020

This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. We propose a novel method that exploits the BERT neural language model to obtain representations of word usages, clusters these representations into usage types, and measures change over time with three proposed metrics. We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements. Our extensive qualitative analysis demonstrates that our method captures a variety of synchronic and diachronic linguistic phenomena. We expect our work to inspire further research in this direction. Comment: To appear in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL-2020)
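The last step of the pipeline described above, measuring change once usages have been clustered into usage types, can be sketched as follows. The abstract does not name the three metrics, so the Jensen-Shannon divergence used here is an assumption chosen as one plausible way to compare usage-type distributions across periods. The cluster labels are hypothetical stand-ins for the output of clustering contextualised representations, which is not shown.

```python
import math
from collections import Counter


def usage_distribution(labels, n_types):
    """Relative frequency of each usage type (cluster id 0..n_types-1)."""
    if not labels:
        raise ValueError("need at least one usage")
    counts = Counter(labels)
    return [counts.get(k, 0) / len(labels) for k in range(n_types)]


def jensen_shannon_divergence(p, q):
    """Base-2 JSD between two distributions; 0 means identical, 1 means disjoint."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Hypothetical cluster labels for one word's usages in two time periods.
early_labels = [0, 0, 0, 1]
late_labels = [0, 2, 2, 2]
early = usage_distribution(early_labels, n_types=3)
late = usage_distribution(late_labels, n_types=3)
shift = jensen_shannon_divergence(early, late)
```

A larger `shift` indicates that the word's usages are spread over the usage types differently in the two periods.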

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

From Frequency to Meaning: Vector Space Models of Semantics

Author: Pantel Patrick
Turney Peter D.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2010

Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
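To make the first of the three matrix types concrete, the following minimal sketch builds a raw-count term-document matrix and compares documents by the cosine of their column vectors. The toy documents, whitespace tokenization, and function names are assumptions for illustration; the sketch omits weighting and smoothing steps that a practical system would normally add.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix) with one row per term and one column per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    matrix = [[c[word] for c in counts] for word in vocab]
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical).
docs = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as markets closed",
]
vocab, matrix = term_document_matrix(docs)
columns = [list(col) for col in zip(*matrix)]   # one vector per document
sim_related = cosine(columns[0], columns[1])    # documents sharing several terms
sim_unrelated = cosine(columns[0], columns[2])  # documents sharing no terms
```

Reading the same matrix by rows instead of columns gives one vector per term, which is the starting point for comparing words by the documents they occur in.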

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

Crossref

Learning Language from a Large (Unannotated) Corpus

Author: Goertzel Ben
Vepstas Linas
Publication date: 14/01/2014

A novel approach to the fully automated, unsupervised extraction of dependency grammars and associated syntax-to-semantic-relationship mappings from large text corpora is described. The suggested approach builds on the authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well as on a number of prior papers and approaches from the statistical language learning literature. If successful, this approach would enable the mining of all the information needed to power a natural language comprehension and generation system, directly from a large, unannotated corpus. Comment: 29 pages, 5 figures, research proposal

arXiv.org e-Print Archive

CiteSeerX