
    Hierarchies over Vector Space: Orienting Word and Graph Embeddings

    Word and graph embeddings are widely used in deep learning applications. We present a data structure that captures inherent hierarchical properties from an unordered flat embedding space, in particular a sense of direction between pairs of entities. Inspired by the notion of distributional generality, our algorithm constructs an arborescence (a directed rooted tree) by inserting nodes in descending order of entity power (e.g., word frequency), pointing each entity to the closest more powerful node as its parent. We evaluate the resulting tree structures on three tasks: hypernym relation discovery, least-common-ancestor (LCA) discovery among words, and Wikipedia page link recovery. We achieve average scores of 8.98% and 2.70% for hypernym and LCA discovery across five languages, and 62.76% accuracy on directed Wiki-page link recovery, all substantially above baselines. Finally, we investigate the effect of insertion order, the power/similarity trade-off, and various power sources to optimize parent selection.
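
    As a minimal sketch of the insertion procedure the abstract describes, the following assumes cosine similarity over the embedding vectors and uses word frequency as the power source; the function name and inputs are illustrative, not taken from the paper.

```python
import numpy as np

def build_arborescence(embeddings: dict, power: dict) -> dict:
    """Attach each entity to the most similar already-inserted (hence
    more powerful) entity, visiting entities in descending power."""
    parent, inserted = {}, []
    for entity in sorted(power, key=power.get, reverse=True):
        vec = embeddings[entity]
        if inserted:
            # Closest more powerful node under cosine similarity.
            parent[entity] = max(
                inserted,
                key=lambda p: np.dot(vec, embeddings[p])
                / (np.linalg.norm(vec) * np.linalg.norm(embeddings[p])),
            )
        else:
            parent[entity] = None  # most powerful entity becomes the root
        inserted.append(entity)
    return parent
```

    Because every node points to an earlier, more powerful node, the result is a directed rooted tree by construction; the naive scan over inserted nodes makes this sketch O(n^2) in the number of entities.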

    Using distributional similarity to organise biomedical terminology

    We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
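
    The abstract does not name its similarity measures, but a common choice in this line of work is cosine similarity over syntactic-context count vectors; the sketch below, with hypothetical GENIA-style terms and dependency contexts, illustrates the general recipe.

```python
from collections import Counter
import math

def cosine(ctx_a: Counter, ctx_b: Counter) -> float:
    """Cosine similarity between two context-count vectors, where each
    vector maps a syntactic context (e.g., a dependency relation plus
    head word from a parse) to its co-occurrence count with a term."""
    dot = sum(ctx_a[c] * ctx_b[c] for c in ctx_a.keys() & ctx_b.keys())
    norm_a = math.sqrt(sum(v * v for v in ctx_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ctx_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical context vectors for two GENIA-style terms:
il2 = Counter({("obj", "activate"): 5, ("mod", "gene"): 3})
il4 = Counter({("obj", "activate"): 4, ("mod", "receptor"): 2})
print(cosine(il2, il4))
```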

    A distributional model of semantic context effects in lexical processing

    One of the most robust findings of experimental psycholinguistics is that the context in which a word is presented influences the effort involved in processing that word. We present a novel model of contextual facilitation based on word co-occurrence probability distributions, and empirically validate the model through simulation of three representative types of context manipulation: single-word priming, multiple priming, and contextual constraint. In our simulations the effects of semantic context are modeled using general-purpose techniques and representations from multivariate statistics, augmented with simple assumptions reflecting the inherently incremental nature of speech understanding. The contribution of our study is to show that special-purpose mechanisms are not necessary in order to capture the general pattern of the experimental results, and that a range of semantic context effects can be subsumed under the same principled account.
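
    As one plausible reading of such a model (an assumption on our part, not the paper's exact formulation), single-word priming can be simulated by comparing the co-occurrence probability distributions of prime and target:

```python
import numpy as np

def cooc_distribution(counts: np.ndarray) -> np.ndarray:
    """Turn a word's raw co-occurrence counts into a probability
    distribution over context words (with light smoothing)."""
    smoothed = counts + 1e-6
    return smoothed / smoothed.sum()

def facilitation(prime_counts: np.ndarray, target_counts: np.ndarray) -> float:
    """Illustrative facilitation score: the more the target's
    co-occurrence distribution overlaps the prime's, the less effortful
    the target should be. Cosine overlap here stands in for whatever
    statistic the actual model uses."""
    p = cooc_distribution(prime_counts)
    t = cooc_distribution(target_counts)
    return float(p @ t / (np.linalg.norm(p) * np.linalg.norm(t)))
```

    Multiple priming and contextual constraint would then correspond to incrementally mixing several primes' distributions before scoring the target, in line with the incremental assumptions the abstract mentions.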

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
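
    To make the first of those three matrix classes concrete, here is a toy term-document matrix with cosine similarity between document columns; a real system would add a weighting scheme such as tf-idf, which this sketch omits.

```python
import numpy as np

# Toy corpus; terms become rows, documents become columns.
docs = ["the cat sat on the mat", "the dog sat on the log", "stocks fell sharply"]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document matrix of raw counts.
X = np.array([[d.split().count(t) for d in docs] for t in vocab], dtype=float)

def doc_sim(i: int, j: int) -> float:
    """Cosine similarity between document columns i and j."""
    a, b = X[:, i], X[:, j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(doc_sim(0, 1))  # high: many shared terms
print(doc_sim(0, 2))  # zero: disjoint vocabulary
```

    Word-context and pair-pattern matrices follow the same pattern with different rows and columns: words against context features, and word pairs against the textual patterns that connect them.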

    Distributional semantics beyond words: Supervised learning of analogy and paraphrase

    There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this approach is that it works with both relational similarity (analogy) and compositional similarity (paraphrase). However, past work required hand-coding the combination function for different tasks. The main contribution of this paper is that combination functions are generated by supervised learning. We achieve state-of-the-art results in measuring relational similarity between word pairs (SAT analogies and SemEval 2012 Task 2) and measuring compositional similarity between noun-modifier phrases and unigrams (multiple-choice paraphrase questions).
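
    A hedged sketch of the general idea (the feature design and classifier here are illustrative, not the paper's exact setup): represent a pair of tuples by the pairwise similarities of their component words, then let a learned model combine those features instead of a hand-coded function.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pairwise_features(tuple_a, tuple_b, embed) -> np.ndarray:
    """Feature vector of pairwise cosine similarities between the
    component words of two tuples."""
    feats = []
    for wa in tuple_a:
        for wb in tuple_b:
            va, vb = embed[wa], embed[wb]
            feats.append(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))
    return np.array(feats)

# Supervised combination: instead of hand-coding how the pairwise
# similarities are merged, fit a classifier on labeled tuple pairs.
# X_train rows would be pairwise_features(...) vectors and y_train
# would mark analogous / paraphrase pairs (hypothetical data).
clf = LogisticRegression()
# clf.fit(X_train, y_train)
# clf.predict_proba(pairwise_features(a, b, embed)[None, :])
```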