2 research outputs found

    Fluent dreaming for language models

    Full text link
    Feature visualization, also known as "dreaming", offers insights into vision models by optimizing the inputs to maximize a neuron's activation or another internal component. However, dreaming has not been successfully applied to language models because the input space is discrete. We extend Greedy Coordinate Gradient, a method from the language model adversarial attack literature, to design the Evolutionary Prompt Optimization (EPO) algorithm. EPO optimizes the input prompt to simultaneously maximize the Pareto frontier between a chosen internal feature and prompt fluency, enabling fluent dreaming for language models. We demonstrate dreaming with neurons, output logits, and arbitrary directions in activation space. We measure the fluency of the resulting prompts and compare language model dreaming with max-activating dataset examples. Critically, fluent dreaming allows automatically exploring the behavior of model internals in reaction to mildly out-of-distribution prompts. Code for running EPO is available at https://github.com/Confirm-Solutions/dreamy. A companion page demonstrating code usage is at https://confirmlabs.org/posts/dreamy.html. Comment: 11 pages, 6 figures, 4 tables.
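
    The abstract's central idea is selecting prompts on the Pareto frontier between feature activation and fluency. Below is a minimal sketch of that selection step, assuming each candidate prompt has already been scored on both objectives (e.g. fluency as negative per-token cross-entropy). Names and details are illustrative, not the dreamy package's API.

```python
from typing import List, Tuple

def pareto_frontier(scores: List[Tuple[float, float]]) -> List[int]:
    """Return indices of candidates not dominated on (activation, fluency).

    A candidate is dominated if another candidate is at least as good on
    both objectives and strictly better on at least one.
    """
    frontier = []
    for i, (act_i, flu_i) in enumerate(scores):
        dominated = any(
            act_j >= act_i and flu_j >= flu_i and (act_j > act_i or flu_j > flu_i)
            for j, (act_j, flu_j) in enumerate(scores)
            if j != i
        )
        if not dominated:
            frontier.append(i)
    return frontier

# Example: four (feature activation, fluency) pairs. The frontier keeps the
# trade-off curve and drops the candidate that is strictly worse on both.
scores = [(3.0, -1.2), (2.5, -0.8), (1.0, -2.0), (2.9, -0.9)]
print(pareto_frontier(scores))  # -> [0, 1, 3]
```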

    Distributed Negative Sampling for Word Embeddings

    No full text
    Word2Vec recently popularized dense vector word representations as fixed-length features for machine learning algorithms and is in widespread use today. In this paper we investigate one of its core components, Negative Sampling, and propose efficient distributed algorithms that allow us to scale to vocabulary sizes of more than 1 billion unique words and corpus sizes of more than 1 trillion words.
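
    For context on the component being scaled: in standard word2vec, negatives are drawn from the unigram distribution raised to the 3/4 power. The sketch below shows that baseline (single-machine) sampler; the distributed algorithms in the paper build on this, and all names here are illustrative.

```python
import numpy as np

def make_sampler(counts: np.ndarray, power: float = 0.75, seed: int = 0):
    """Build a negative sampler over word ids from raw unigram counts."""
    probs = counts.astype(np.float64) ** power  # smoothed unigram distribution
    probs /= probs.sum()
    rng = np.random.default_rng(seed)

    def sample_negatives(k: int) -> np.ndarray:
        # Draw k negative word ids; frequent words appear more often,
        # but less than proportionally to raw frequency due to the 3/4 power.
        return rng.choice(len(counts), size=k, p=probs)

    return sample_negatives

# Example: a 5-word vocabulary with skewed counts.
sampler = make_sampler(np.array([100, 50, 10, 5, 1]))
print(sampler(8))  # array of 8 negative word ids
```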