44 research outputs found

    Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model

    Latent diffusion models (LDMs) exhibit an impressive ability to produce realistic images, yet the inner workings of these models remain mysterious. Even when trained purely on images without explicit depth information, they typically output coherent pictures of 3D scenes. In this work, we investigate a basic interpretability question: does an LDM create and use an internal representation of simple scene geometry? Using linear probes, we find evidence that the internal activations of the LDM encode linear representations of both 3D depth data and a salient-object / background distinction. These representations appear surprisingly early in the denoising process, well before a human can easily make sense of the noisy images. Intervention experiments further indicate these representations play a causal role in image synthesis, and may be used for simple high-level editing of an LDM's output. Comment: 17 pages, 13 figures
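    The linear-probe idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes activations have already been flattened into a matrix with one row per example, and it uses ordinary least squares as a stand-in for whatever probe training the paper actually uses. The function names `fit_linear_probe` and `apply_linear_probe` are hypothetical.

```python
import numpy as np

def fit_linear_probe(activations, targets):
    """Fit a linear map (with bias) from activations to targets by least squares.

    activations: (n_examples, n_features) array of internal activations (assumed shape)
    targets:     (n_examples,) or (n_examples, n_outputs) array, e.g. depth values
    Returns a weight array whose last row is the bias term.
    """
    # Append a column of ones so the bias is learned together with the weights.
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return weights

def apply_linear_probe(weights, activations):
    """Predict targets from activations using weights from fit_linear_probe."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    return X @ weights
```

    If a probe of this kind predicts a property such as depth well on held-out examples, that is evidence the property is linearly decodable from the activations; it does not by itself show that the model uses the representation, which is why the abstract also refers to intervention experiments.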

    Revealing individual and collective pasts : visualizations of online social archives

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 118-125). As mediated communication becomes an increasingly central part of everyday life, people have started going online to conduct business, to get emotional support, to find communities of interest, and to look for potential romantic partners. Most of these social activities take place primarily through the exchange of conversational texts that, over time, accrue into vast archives. As valuable as these collections of documents may be for our comprehension of the online social world, they are usually cumbersome, impenetrable records of the past. This thesis posits that history visualization, the visualization of people's past presence and activities in mediated environments, helps users make better sense of the online social spaces they inhabit and the relationships they maintain. Here, a progressive series of experimental visualizations explores different ways in which history may enhance social legibility. The projects visualize the history of people's activities in four different environments: a graphical chat room, a wiki site, Usenet newsgroups, and email. History and the persistent nature of online communication are the common threads connecting these projects. Evaluation of these tools shows that history visualizations can be utilized in a variety of ways, ranging from aids for quicker impression formation and mirrors for self-reflection, to catalysts for storytelling and artifacts for posterity. By Fernanda Bertini Viégas. Ph.D.

    Collections : adapting the display of personal objects for different audiences

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000. Includes bibliographical references (leaves 64-68). Although current networked systems and online applications provide new opportunities for displaying and sharing personal information, they do not account for the underlying social contexts that frame such interactions. Existing categorization and management mechanisms for digital content have been designed to focus on the data they handle without much regard for the social circumstances within which their content is shared. As we share large collections of personal information over mediated environments, our tools need to account for the social scenarios that surround our interactions. This thesis presents Collections: an application for the management of digital pictures according to their intended audiences. The goal is to create a graphical interface that supports the creation of fairly complex privacy decisions concerning the display of digital photographs. Simple graphics are used to enable the collector to create a wide range of audience arrangements for her digital photographs. The system allows users to express their preferences in sharing their personal pictures over a disembodied environment such as the Web. The system also introduces an original approach to the presentation interface of photographic collections on the Web: a viewing application that takes into account the viewing history of the photographs and the integration of text comments to images. By Fernanda Bertini Viégas. S.M.

    Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

    We introduce Inference-Time Intervention (ITI), a technique designed to enhance the truthfulness of large language models (LLMs). ITI operates by shifting model activations during inference, following a set of directions across a limited number of attention heads. This intervention significantly improves the performance of LLaMA models on the TruthfulQA benchmark. On an instruction-finetuned LLaMA called Alpaca, ITI improves its truthfulness from 32.5% to 65.1%. We identify a tradeoff between truthfulness and helpfulness and demonstrate how to balance it by tuning the intervention strength. ITI is minimally invasive and computationally inexpensive. Moreover, the technique is data efficient: while approaches like RLHF require extensive annotations, ITI locates truthful directions using only a few hundred examples. Our findings suggest that LLMs may have an internal representation of the likelihood of something being true, even as they produce falsehoods on the surface. Comment: code: https://github.com/likenneth/honest_llam
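    The mechanism described in this abstract, shifting activations along fixed directions in a small set of attention heads, can be sketched roughly as follows. This is an illustrative sketch only, not the released implementation: the array layout, the function name `shift_head_activations`, and the normalisation of each direction to unit length are assumptions made for the example, and the abstract does not say how the directions or the set of heads are chosen.

```python
import numpy as np

def shift_head_activations(acts, directions, selected_heads, alpha):
    """Return a copy of acts with the selected heads shifted along their directions.

    acts:           (n_heads, head_dim) head activations at one token position (assumed layout)
    directions:     (n_heads, head_dim) one direction per head (hypothetical input)
    selected_heads: iterable of head indices to intervene on
    alpha:          intervention strength; larger values shift activations further
    """
    shifted = acts.copy()
    for h in selected_heads:
        # Normalise so alpha has the same meaning for every head (an assumption).
        unit = directions[h] / np.linalg.norm(directions[h])
        shifted[h] = acts[h] + alpha * unit
    return shifted
```

    Here `alpha` plays the role of the intervention strength that the abstract says is tuned to balance truthfulness against helpfulness; heads outside `selected_heads` are left unchanged, which reflects the description of the method as minimally invasive.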

    Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task

    Language models show a surprising range of capabilities, but the source of their apparent competence is unclear. Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see? We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello. Although the network has no a priori knowledge of the game or its rules, we uncover evidence of an emergent nonlinear internal representation of the board state. Interventional experiments indicate this representation can be used to control the output of the network and create "latent saliency maps" that can help explain predictions in human terms. Comment: ICLR 2023 oral (notable-top-5%): https://openreview.net/forum?id=DeG07_TcZvT; code: https://github.com/likenneth/othello_worl