Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models
Lexical ambiguity presents a profound and enduring challenge to the language
sciences. For decades, researchers have grappled with the problem of how
language users learn, represent, and process words with more than one meaning.
Our work offers new insight into psychological understanding of lexical
ambiguity through a series of simulations that capitalise on recent advances in
contextual language models. These models have no grounded understanding of the
meanings of words at all; they simply learn to predict words based on the
surrounding context provided by other words. Yet, our analyses show that their
representations capture fine-grained meaningful distinctions between
unambiguous, homonymous, and polysemous words that align with lexicographic
classifications and psychological theorising. These findings provide
quantitative support for modern psychological conceptualisations of lexical
ambiguity and raise new challenges for understanding the way that contextual
information shapes the meanings of words across different timescales.
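The core analysis described above can be sketched numerically. A minimal illustration, using synthetic vectors as stand-ins for the token embeddings a contextualised model would produce for an ambiguous word such as "bank": if the model separates senses, similarity among tokens used in the same sense should exceed similarity across senses. The sense directions, noise scale, and `cosine` helper are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
dim = 16

# Two synthetic "sense" directions for a homonym like "bank"
# (river bank vs. financial bank), plus small contextual noise
# standing in for variation across usage contexts.
sense_river = rng.normal(size=dim)
sense_money = rng.normal(size=dim)

river_tokens = [sense_river + 0.1 * rng.normal(size=dim) for _ in range(5)]
money_tokens = [sense_money + 0.1 * rng.normal(size=dim) for _ in range(5)]

# Average pairwise similarity within one sense vs. across senses.
within = np.mean([cosine(a, b) for i, a in enumerate(river_tokens)
                  for b in river_tokens[i + 1:]])
across = np.mean([cosine(a, b) for a in river_tokens for b in money_tokens])

print(f"within-sense similarity: {within:.2f}")
print(f"across-sense similarity: {across:.2f}")
```

A real study of this kind would replace the synthetic vectors with hidden states extracted from a contextual language model, but the within/across comparison is the same.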
How sketches work: a cognitive theory for improved system design
Evidence is presented that in the early stages of design or composition the
mental processes used by artists for visual invention require a different type of
support from those used for visualising a nearly complete object. Most research
into machine visualisation has as its goal the production of realistic images which
simulate the light pattern presented to the retina by real objects. In contrast, sketch
attributes preserve the results of cognitive processing which can be used
interactively to amplify visual thought. The traditional attributes of sketches
include many types of indeterminacy which may reflect the artist's need to be
"vague".
Drawing on contemporary theories of visual cognition and neuroscience, this
study discusses in detail the evidence for the following functions which are better
served by rough sketches than by the very realistic imagery favoured in machine
visualising systems.
1. Sketches are intermediate representational types which facilitate the
mental translation between descriptive and depictive modes of representing visual
thought.
2. Sketch attributes exploit automatic processes of perceptual retrieval and
object recognition to improve the availability of tacit knowledge for visual
invention.
3. Sketches are percept-image hybrids. The incomplete physical attributes
of sketches elicit and stabilise a stream of super-imposed mental images which
amplify inventive thought.
4. By segregating and isolating meaningful components of visual
experience, sketches may assist the user to attend selectively to a limited part of a
visual task, freeing otherwise over-loaded cognitive resources for visual thought.
5. Sequences of sketches and sketching acts support the short term episodic
memory for cognitive actions. This assists creativity, providing voluntary control
over highly practised mental processes which can otherwise become stereotyped.
An attempt is made to unite the five hypothetical functions. Drawing on the
Baddeley and Hitch model of working memory, it is speculated that the five
functions may be related to a limited capacity monitoring mechanism which makes
tacit visual knowledge explicitly available for conscious control and manipulation.
It is suggested that the resources available to the human brain for imagining nonexistent
objects are a cultural adaptation of visual mechanisms which evolved in
early hominids for responding to confusing or incomplete stimuli from immediately
present objects and events. Sketches are cultural inventions which artificially
mimic aspects of such stimuli in order to capture these shared resources for the
different purpose of imagining objects which do not yet exist.
Finally, the implications of the theory for the design of improved machine systems
are discussed. The untidy attributes of traditional sketches are revealed to include
cultural inventions which serve subtle cognitive functions. However, traditional
media have many shortcomings which it should be possible to correct with new
technology. Existing machine systems for sketching tend to imitate, non-selectively,
the media-bound properties of sketches without regard to the functions they serve.
This may prove to be a mistake. It is concluded that new system designs are needed
in which meaningfully structured data and specialised imagery amplify, without
interference or replacement, the impressive but limited creative resources of the
visual brain.
Training dynamics of neural language models
Why do artificial neural networks model language so well? We claim that in order to answer this question and understand the biases that lead to such high-performing language models---and all models that handle language---we must analyze the training process. For decades, linguists have used the tools of developmental linguistics to study human bias towards linguistic structure. Similarly, we wish to consider a neural network's training dynamics, i.e., the analysis of training in practice and the study of why our optimization methods work when applied. This framing shows us how structural patterns and linguistic properties are gradually built up, revealing more about why LSTM models learn so effectively on language data.
To explore these questions, we might be tempted to appropriate methods from developmental linguistics, but we do not wish to make cognitive claims, so we avoid analogizing between human and artificial language learners. We instead use mathematical tools designed for investigating language model training dynamics. These tools can take advantage of crucial differences between child development and model training: we have access to activations, weights, and gradients in a learning model, and can manipulate learning behavior directly or by perturbing inputs. While most research in training dynamics has focused on vision tasks, language offers direct annotation of its well-documented and intuitive latent hierarchical structures (e.g., syntax and semantics) and is therefore an ideal domain for exploring the effect of training dynamics on the representation of such structure.
Focusing on LSTM models, we investigate the natural sparsity of gradients and activations, finding that word representations are focused on just a few neurons late in training. Similarity analysis reveals how word embeddings learned for different tasks are highly similar at the beginning of training, but gradually become task-specific. Using synthetic data and measuring feature interactions, we also discover that hierarchical representations in LSTMs may be a result of their learning strategy: they tend to build new trees out of familiar phrases, mingling the meanings of constituents so that they depend on each other. These discoveries constitute just a few possible explanations for how LSTMs learn generalized language representations, with further theories on more architectures to be uncovered by the growing field of NLP training dynamics.
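The sparsity finding above can be made concrete with a simple concentration metric. A hedged sketch, using synthetic vectors rather than the thesis's actual measurements: `topk_mass` (a hypothetical helper) reports what fraction of a word representation's total magnitude sits in its k largest-magnitude neurons, so a diffuse early-training vector scores low and a late-training vector dominated by a few neurons scores high.

```python
import numpy as np

def topk_mass(vec, k):
    # Fraction of total absolute magnitude carried by the k
    # largest-magnitude neurons of a word representation.
    mags = np.abs(vec)
    top = np.sort(mags)[-k:]
    return float(top.sum() / mags.sum())

rng = np.random.default_rng(1)
dim, k = 512, 8

# Hypothetical snapshots of one word's representation:
# early in training (diffuse Gaussian noise) vs. late in
# training (a few dominant neurons plus small background noise).
early = rng.normal(size=dim)
late = np.zeros(dim)
late[rng.choice(dim, size=k, replace=False)] = rng.normal(loc=5.0, size=k)
late += 0.05 * rng.normal(size=dim)

print(f"early top-{k} mass: {topk_mass(early, k):.2f}")
print(f"late  top-{k} mass: {topk_mass(late, k):.2f}")
```

Applied to real checkpoints, the same metric tracked over training steps would show when representations begin concentrating on a few neurons.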
Sociolinguistically Driven Approaches for Just Natural Language Processing
Natural language processing (NLP) systems are now ubiquitous. Yet the benefits of these language technologies do not accrue evenly to all users, and indeed they can be harmful; NLP systems reproduce stereotypes, prevent speakers of non-standard language varieties from participating fully in public discourse, and re-inscribe historical patterns of linguistic stigmatization and discrimination. How harms arise in NLP systems, and who is harmed by them, can only be understood at the intersection of work on NLP, fairness and justice in machine learning, and the relationships between language and social justice. In this thesis, we propose to address two questions at this intersection: i) How can we conceptualize harms arising from NLP systems?, and ii) How can we quantify such harms?
We propose the following contributions. First, we contribute a model to collect the first large dataset of African American Language (AAL)-like social media text. We use the dataset to quantify the performance of two types of NLP systems, identifying disparities in model performance between Mainstream U.S. English (MUSE)- and AAL-like text. Turning to the landscape of bias in NLP more broadly, we then provide a critical survey of the emerging literature on bias in NLP and identify its limitations. Drawing on work across sociology, sociolinguistics, linguistic anthropology, social psychology, and education, we provide an account of the relationships between language and injustice, propose a taxonomy of harms arising from NLP systems grounded in those relationships, and propose a set of guiding research questions for work on bias in NLP. Finally, we adapt the measurement modeling framework from the quantitative social sciences to effectively evaluate approaches for quantifying bias in NLP systems. We conclude with a discussion of recent work on bias through the lens of style in NLP, raising a set of normative questions for future work.
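The disparity quantification described above reduces, in its simplest form, to comparing a model's performance per language variety. A minimal sketch with toy data (the labels and predictions are illustrative inventions, not the thesis's dataset, and the real evaluation uses a richer measurement-modeling framework):

```python
from collections import defaultdict

def per_group_accuracy(examples):
    # examples: (group, gold_label, predicted_label) triples.
    # Returns accuracy per group so disparities can be compared.
    correct, total = defaultdict(int), defaultdict(int)
    for group, gold, pred in examples:
        total[group] += 1
        correct[group] += int(gold == pred)
    return {g: correct[g] / total[g] for g in total}

# Toy predictions from a hypothetical classifier on MUSE- and
# AAL-like text (illustrative only, not real data).
examples = [
    ("MUSE", "pos", "pos"), ("MUSE", "neg", "neg"),
    ("MUSE", "pos", "pos"), ("MUSE", "neg", "pos"),
    ("AAL", "pos", "neg"), ("AAL", "neg", "neg"),
    ("AAL", "pos", "pos"), ("AAL", "neg", "pos"),
]

acc = per_group_accuracy(examples)
gap = acc["MUSE"] - acc["AAL"]
print(acc, f"gap: {gap:.2f}")
```

A nonzero gap on its own does not establish harm; the thesis's point is that interpreting such measurements requires grounding in how language varieties are socially situated.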
Material Symbols
What is the relation between the material, conventional symbol structures that we encounter in the spoken and written word, and human thought? A common assumption, that structures a wide variety of otherwise competing views, is that the way in which these material, conventional symbol-structures do their work is by being translated into some kind of content-matching inner code. One alternative to this view is the tempting but thoroughly elusive idea that we somehow think in some natural language (such as English). In the present treatment I explore a third option, which I shall call the “complementarity” view of language. According to this third view the actual symbol structures of a given language add cognitive value by complementing (without being replicated by) the more basic modes of operation and representation endemic to the biological brain. The “cognitive bonus” that language brings is, on this model, not to be cashed out either via the ultimately mysterious notion of “thinking in a given natural language” or via some process of exhaustive translation into another inner code. Instead, we should try to think in terms of a kind of coordination dynamics in which the forms and structures of a language qua material symbol system play a key and irreducible role. Understanding language as a complementary cognitive resource is, I argue, an important part of the much larger project (sometimes glossed in terms of the “extended mind”) of understanding human cognition as essentially and multiply hybrid: as involving a complex interplay between internal biological resources and external non-biological resources.