13 research outputs found

Towards Automated Circuit Discovery for Mechanistic Interpretability

Author: Conmy Arthur
Garriga-Alonso Adrià
Heimersheim Stefan
Lynch Aengus
Mavor-Parker Augustine N.
Publication date: 28/04/2023

Recent work in mechanistic interpretability has reverse-engineered nontrivial behaviors of transformer models. These contributions required considerable effort and researcher intuition, which makes it difficult to apply the same methods to understand the complex behavior that current models display. At their core, however, the workflows behind these discoveries are surprisingly similar. Researchers create a data set and metric that elicit the desired model behavior, subdivide the network into appropriate abstract units, replace activations of those units to identify which are involved in the behavior, and then interpret the functions that these units implement. By varying the data set, metric, and units under investigation, researchers can understand the functionality of each neural network region and the circuits they compose. This work proposes a novel algorithm, Automatic Circuit DisCovery (ACDC), to automate the identification of the important units in the network. Given a model's computational graph, ACDC finds subgraphs that explain a behavior of the model. ACDC was able to reproduce a previously identified circuit for Python docstrings in a small transformer, identifying 6/7 important attention heads that compose up to 3 layers deep, while including 91% fewer connections.

arXiv.org e-Print Archive
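
As a rough illustration of the workflow the ACDC abstract describes (swap a unit's activation for one taken from a corrupted input, and keep only the connections whose removal changes the metric), here is a minimal Python sketch. The toy graph, its weights, the names `WEIGHTS`, `ORDER`, `run`, and `prune`, the absolute-difference metric, and the threshold `tau` are all assumptions made for illustration. This is not the paper's implementation.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (parent, child)

# Hypothetical toy graph: every node sums its weighted parent activations.
WEIGHTS: Dict[Edge, float] = {
    ("in", "a"): 1.0,
    ("in", "b"): 0.01,
    ("a", "out"): 2.0,
    ("b", "out"): 0.5,
}
ORDER: List[str] = ["in", "a", "b", "out"]  # topological order of the nodes


def run(x: float, x_corrupt: float, removed: Set[Edge]) -> float:
    """Forward pass in which every removed edge carries the corrupted activation."""
    patched = {"in": x}          # activations of the (partially ablated) run
    corrupt = {"in": x_corrupt}  # activations of the fully corrupted run
    for node in ORDER[1:]:
        patched_total, corrupt_total = 0.0, 0.0
        for (parent, child), weight in WEIGHTS.items():
            if child != node:
                continue
            corrupt_total += weight * corrupt[parent]
            source = corrupt if (parent, child) in removed else patched
            patched_total += weight * source[parent]
        patched[node] = patched_total
        corrupt[node] = corrupt_total
    return patched["out"]


def prune(x: float, x_corrupt: float, tau: float) -> Set[Edge]:
    """Greedily drop edges whose removal raises the metric by less than tau."""
    removed: Set[Edge] = set()
    reference = run(x, x_corrupt, removed)  # output of the full graph
    # Visit edges starting from those closest to the output.
    edges = sorted(WEIGHTS, key=lambda e: ORDER.index(e[1]), reverse=True)
    for edge in edges:
        current = abs(run(x, x_corrupt, removed) - reference)
        trial = abs(run(x, x_corrupt, removed | {edge}) - reference)
        if trial - current < tau:
            removed.add(edge)
    return set(WEIGHTS) - removed
```

With these toy weights, `prune(1.0, 0.0, 0.1)` keeps only the `in -> a -> out` path, because ablating the path through `b` barely changes the output.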

Jardins per a la salut

Author: Albertí Sancho Blanca
Alguacil Aguilar Julia
Ardanuy Comas Albert
Arderiu Formentí Alba
Armengol Andos Rosa
Arribas Queralt Teresa
Bachiller García Mireia
Baladón Ramírez Jorge
Balcells de Martí Inés
Baldi Lartuna Judith
Bentanachs Raset Roger
Berredo Roldán Noelia
Berrio Avalos Víctor
Bolance Navarro Raquel
Bordoy Guerra Maria Milagros
Borràs Expósito Mireia
Borràs Rodrigo Marta
Caballero Prior Laura
Calafat Pla Joan Feliu
Calvo Silveria Sara
Canillas Mata Laura
Carbonell Vergés Núria
Cardoso Gasch Maria
Carrillo Ruíz Andrea
Casanovas Montasell Mireia
Castellà Soler Àngels
Cavero Garriga Eduard
Chanla Pizá Marina
Codina Jiménez Carla
Coll Satué Clara
Collado Lorenzo Jessica
Comajuan Mendoza Carla
Creixell Turón Anna
Desoi Artús Anna
Domingo Llopart Joan
Díaz Tejada Héctor
El Muhandiz Ikram
Escudero Rotger Jose María
Espinosa Busquets Martí
Estrada Nieto Lidia
Farré Altarriba Nil
Fernandez Martinez Marta
Fernández Tomás Carlos
Ferré Viña Gemma
Franco Fobe Laura
Franco Pons Clàudia
Fraschi Nieto Alex
Frigola Beván Gemma
Garcia Gonzalez Susana
Garcia Pastallé Arnau
Garcia Xipell Sandra
García Navarro Sandra
Gomez Olivella Adrià
González Cerezo Patricia
González Melarde Blanca
González Molina Paula
Grañana Castillo Sandra
Gurung Ashma
Gómez Folch Paula
Gómez i Serra Enric
Hermoso Gallego Yaiza
Hernández Hotter Elena
Hidalgo Josa Dana
Jaume Capó Marta
Jiménez Martín Paola
Jorba Cortada Cèlia
Lalueza Puértolas Jana
Lamiel Membrilla Alberto
Lasurt Barés Claudia
Llop Algueró Alba
Luque Gimeno Paula
López Sánchez Irene
Manchón Contreras Marc
Martell Alonso Clàudia
Martínez Castro Paula
Martínez Escobar Maria
Martínez Montañez Noelia
Masip Guasch Victor
Molina Pita Patricia
Mondejar Ferrer Júlia
Muro Blanc Elena
Oliva Falcó Laura
Ortuño Ruiz Yaiza
Padilla Patón Laura
Pagès Sanchis Marta
Palma Galeto Sara
Paredes García Maria Luisa
Pascual Carbonell David
Pegueroles Monllau Lluís
Picazos Muniesa Maria del Mar
Porta Magriña Maria
Pou Torres Pilar
Pradera Carazo Elena
Prat Castro Sandra
Puente Rodríguez Celia
Pérez Cao Gerard
Pérez Herrero Sara
Ramis Barceló Marina
Ramírez Martín Ana
Ramírez Rojo Paula
Redondo Vahle Ana
Rodríguez Sabates Mercè
Roig Rallo Laia
Roig Turner Gemma
Rosés Gimeno Marta
Ruyra Ripoll Jordi
Safont-Tria Jové Laura
San José Oliva Nerea
Sastre Gelabert Aina
Saurí Ramos Albert
Sempere Comet Anna
Sendín Emiliano Alejandro
Shults Vladyslav
Tarré Vandrell Gina
Tomàs Güell Núria
Torrandell i Haro Georgina
Torras Romero Mariona
Torres Solera Olga
Valero Via Eugènia
Vargas Guerras Pablo
Ventós Martí Laia
Vidosa Artigas Guillem
Villanova Errando Santiago
Álvarez Aunòs Maria
Álvarez Lorenzo Paula
Publication date: 01/07/2015

Faculty of Pharmacy, Universitat de Barcelona. Degree: Bachelor's Degree in Pharmacy. Course: Pharmaceutical Botany. Academic year: 2014-2015. Coordinators: Joan Simon, Cèsar Blanché and Maria Bosch. The materials presented here are the collected botanical fact sheets for 128 species found in the Jardí Ferran Soldevila of the Edifici Històric of the UB. The work was carried out individually by the students of groups M-3 and T-1 of the Pharmaceutical Botany course between February and May of the 2014-15 academic year, as the final outcome of the Teaching Innovation Project «Jardins per a la salut: aprenentatge servei a Botànica farmacèutica» ("Gardens for health: service learning in Pharmaceutical Botany", code 2014PID-UB/054). All the work was done on the GoogleDocs platform and supervised by the course lecturers. The main aim of the activity was to foster autonomous and collaborative learning in Pharmaceutical Botany. A further aim was to motivate students by returning part of their effort to society through a Service-Learning experience, ultimately making the students' work available for consultation on a public website, which can also be accessed on site in the garden itself by scanning QR codes with a smartphone.

Diposit Digital de la Universitat de Barcelona

Priors in finite and infinite Bayesian convolutional neural networks

Author: Garriga Alonso Adrià
Publication venue: https://agarri.ga/publication/prior-bayesian-cnn/
Publication date: 29/01/2023

Bayesian neural networks (BNNs) have undergone many changes since the seminal work of Neal [Nea96]. Advances in approximate inference and the use of GPUs have scaled BNNs to larger data sets, and much higher layer and parameter counts. Yet, the priors used for BNN parameters have remained essentially the same. The isotropic Gaussian prior introduced by Neal, where each element of the weights and biases is drawn independently from a Gaussian, is still used almost everywhere. This thesis seeks to undo the neglect in the development of priors for BNNs, especially convolutional BNNs, using a two-pronged approach. First, I theoretically examine the effect of the Gaussian isotropic prior on the distribution over functions of a deep BNN prior. I show that, as the number of channels of a convolutional BNN goes to infinity, its output converges in distribution to a Gaussian process (GP). Thus, we can draw rough conclusions about the function-space of finite BNNs by looking at the mean and covariance of their limiting GPs. The limiting GP itself performs surprisingly well at image classification, suggesting that knowledge encoded in the convolutional neural network (CNN) architecture, as opposed to the learned features, plays a larger role than previously thought. Examining the derived CNN kernel shows that, if the weights are independent, the output of the limiting GP loses translation equivariance. This is an important inductive bias for learning from images. We can prevent this loss by introducing spatial correlations in the weight prior of a Bayesian CNN, which still results in a GP in the infinite width limit. The second prong is an empirical methodology for identifying new priors for BNNs. Since BNNs are often considered to underfit, I examine the empirical distribution of weights learned using stochastic gradient descent (SGD). The resulting weight distributions tend to have heavier tails than a Gaussian, and display strong spatial correlations in CNNs. 
I incorporate the found features into BNN priors, and test the performance of the resulting posterior. The spatially correlated priors, recommended by both prongs, greatly increase the classification performance of Bayesian CNNs. However, they do not at all reduce the cold-posterior effect (CPE), which indicates model misspecification or inference failure in BNNs. Heavy-tailed priors somewhat reduce the CPE in fully connected neural networks. Ultimately, it is unlikely that the remaining misspecification is all in the prior. Nevertheless, I have found better priors for Bayesian CNNs. I have provided empirical methods that can be used to further improve BNN priors.

Apollo (Cambridge)
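
To make the contrast in the abstract above concrete (an isotropic Gaussian prior, where each weight is drawn independently from a Gaussian, versus a heavier-tailed alternative), here is a minimal Python sketch using only the standard library. The Student-t distribution, the degrees of freedom, the sample sizes, and all function names are assumptions chosen for illustration. They are not taken from the thesis.

```python
import math
import random
from typing import List


def sample_gaussian_prior(n: int, sigma: float, rng: random.Random) -> List[float]:
    """Draw n weights independently from N(0, sigma^2) (isotropic Gaussian prior)."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def sample_student_t_prior(n: int, df: float, rng: random.Random) -> List[float]:
    """Draw n weights independently from a Student-t with df degrees of freedom."""
    samples = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # A chi-squared variable with df degrees of freedom is Gamma(df / 2, scale 2).
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        samples.append(z / math.sqrt(chi2 / df))
    return samples


def excess_kurtosis(xs: List[float]) -> float:
    """Sample excess kurtosis: near 0 for Gaussian data, positive for heavy tails."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    fourth = sum((x - mean) ** 4 for x in xs) / n
    return fourth / var ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the comparison is reproducible
gaussian_kurtosis = excess_kurtosis(sample_gaussian_prior(20000, 1.0, rng))
student_t_kurtosis = excess_kurtosis(sample_student_t_prior(20000, 5.0, rng))
```

Comparing `gaussian_kurtosis` with `student_t_kurtosis` gives one simple numerical handle on what "heavier tails than a Gaussian" means for a weight distribution.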

Solving Montezuma's Revenge with Planning and Reinforcement Learning

Author: Garriga Alonso Adrià
Publication date: 21/04/2017

Bachelor's thesis in computer science. Tutor: Anders Jonsson. Traditionally, methods for solving Sequential Decision Processes (SDPs) have not worked well with those that feature sparse feedback. Both planning and reinforcement learning, which are methods for solving SDPs, struggle with sparse feedback. With the rise to prominence of the Arcade Learning Environment (ALE) in the broader research community of sequential decision processes, one SDP featuring sparse feedback has become familiar: the Atari game Montezuma's Revenge. In this particular game, the gap created by the large amount of knowledge the human player already possesses, and uses to find rewards, cannot be bridged by blind exploration in a realistic time. We apply planning and reinforcement learning approaches, combined with domain knowledge, to enable an agent to obtain better scores in this game. We hope that these domain-specific algorithms can inspire better approaches to solve SDPs with sparse feedback in general.

UPF Digital Repository

BNNpriors: A library for Bayesian neural network inference with different prior distributions

Author: Aitchison Laurence
Fortuin Vincent
Garriga-Alonso Adrià
van der Wilk Mark
Publication venue: Elsevier BV
Publication date: 01/08/2021

Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial and can often also lead to a higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice due to their simplicity, they do not reflect our true prior beliefs well and can lead to suboptimal performance. Our new library, BNNpriors, enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks with a wide range of predefined priors, including heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows a modular approach that eases the design and implementation of new custom priors. It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks and will hopefully catalyze future research as well as practical applications in this area. ISSN: 2665-963

Repository for Publications and Research Data

TuringLang/AdvancedHMC.jl: v0.6.0

Author: Adrià Garriga-Alonso
Andreas Noack
Andreas Scheidegger
Cameron Pfiffer
Chad Scherrer
Chris Elrod
Christopher Rackauckas
Colin Eberl Coe
David Widmann
Hong Ge
Ivan Yashchuk
Jaime RZ
Julia TagBot
Kaan Öcal
Kai Xu
Martin Trapp
Mohamed Tarek
Saranjeet Kaur
Seth Axen
Simeon Schaub
Théo Galy-Fajou
Tim Reichelt
Tor Erlend Fjelde
Vaibhav Kumar Dixit
Will Tebbutt
Publication venue: Zenodo
Publication date: 02/11/2023

AdvancedHMC v0.6.0

Diff since v0.5.5: https://github.com/TuringLang/AdvancedHMC.jl/compare/v0.5.5...v0.6.0

Merged pull requests:
- fix: arg order (#349) (@xukai92)
- CompatHelper: bump compat for AbstractMCMC to 5, (keep existing compat) (#352) (@github-actions[bot])
- Deprecate init_params which is no longer in AbstractMCMC (#353) (@torfjelde)
- CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#354) (@github-actions[bot])
- Removed deprecation of init_params + bump minor version (#355) (@torfjelde)
- Fix some tests. (#356) (@yebai)
- Fix docs CI (#357) (@yebai)

Closed issues:
- Doc string error for NUTS (#346)

ZENODO