13 research outputs found

    Towards Automated Circuit Discovery for Mechanistic Interpretability

    Recent work in mechanistic interpretability has reverse-engineered nontrivial behaviors of transformer models. These contributions required considerable effort and researcher intuition, which makes it difficult to apply the same methods to understand the complex behavior that current models display. At their core, however, the workflows behind these discoveries are surprisingly similar. Researchers create a data set and metric that elicit the desired model behavior, subdivide the network into appropriate abstract units, replace activations of those units to identify which are involved in the behavior, and then interpret the functions that these units implement. By varying the data set, metric, and units under investigation, researchers can understand the functionality of each neural network region and the circuits they compose. This work proposes a novel algorithm, Automatic Circuit DisCovery (ACDC), to automate the identification of the important units in the network. Given a model's computational graph, ACDC finds subgraphs that explain a behavior of the model. ACDC was able to reproduce a previously identified circuit for Python docstrings in a small transformer, identifying 6/7 important attention heads that compose up to 3 layers deep, while including 91% fewer connections.

    Jardins per a la salut

    Faculty of Pharmacy, Universitat de Barcelona. Degree: Bachelor's in Pharmacy. Course: Pharmaceutical Botany (Botànica farmacèutica). Academic year: 2014-2015. Coordinators: Joan Simon, Cèsar Blanché and Maria Bosch. The materials presented here are the collected botanical fact sheets for 128 species found in the Jardí Ferran Soldevila of the UB Historic Building (Edifici Històric). The work was carried out individually by students in groups M-3 and T-1 of the Pharmaceutical Botany course from February to May of the 2014-15 academic year, as the final result of the Teaching Innovation Project «Jardins per a la salut: aprenentatge servei a Botànica farmacèutica» (code 2014PID-UB/054). All of the work was done on the GoogleDocs platform and supervised by the course instructors. The main goal of the activity was to foster autonomous and collaborative learning in pharmaceutical botany. It also aimed to motivate students by returning part of their effort to society through a Service-Learning experience, ultimately making the students' work available for consultation on a public website, with the option of consulting it in situ in the garden itself by scanning QR codes with a smartphone.

    Solving Montezuma's Revenge with Planning and Reinforcement Learning

    No full text
    Bachelor's thesis in computer science. Tutor: Anders Jonsson. Traditionally, methods for solving Sequential Decision Processes (SDPs) have not worked well on problems that feature sparse feedback. Both planning and reinforcement learning, the two main families of methods for solving SDPs, have trouble with it. With the rise to prominence of the Arcade Learning Environment (ALE) in the broader research community of sequential decision processes, one SDP featuring sparse feedback has become familiar: the Atari game Montezuma’s Revenge. In this particular game, the great amount of knowledge the human player already possesses, and uses to find rewards, cannot be made up for by blind exploration in a realistic amount of time. We apply planning and reinforcement learning approaches, combined with domain knowledge, to enable an agent to obtain better scores in this game. We hope that these domain-specific algorithms can inspire better approaches to solving SDPs with sparse feedback in general.

    BNNpriors: A library for Bayesian neural network inference with different prior distributions

    No full text
    Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial, and they can often also lead to higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice due to their simplicity, they do not reflect our true prior beliefs well and can lead to suboptimal performance. Our new library, BNNpriors, enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks with a wide range of predefined priors, including heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows a modular approach that eases the design and implementation of new custom priors. It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks and will hopefully catalyze future research as well as practical applications in this area. ISSN:2665-963

    TuringLang/AdvancedHMC.jl: v0.6.0

    No full text
    AdvancedHMC v0.6.0. Diff since v0.5.5 (https://github.com/TuringLang/AdvancedHMC.jl/compare/v0.5.5...v0.6.0). Merged pull requests: fix: arg order (#349) (@xukai92); CompatHelper: bump compat for AbstractMCMC to 5, (keep existing compat) (#352) (@github-actions[bot]); Deprecate init_params which is no longer in AbstractMCMC (#353) (@torfjelde); CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#354) (@github-actions[bot]); Removed deprecation of init_params + bump minor version (#355) (@torfjelde); Fix some tests. (#356) (@yebai); Fix docs CI (#357) (@yebai). Closed issues: Doc string error for NUTS (#346).