Representation Engineering: A Top-Down Approach to AI Transparency
In this paper, we identify and characterize the emerging area of
representation engineering (RepE), an approach to enhancing the transparency of
AI systems that draws on insights from cognitive neuroscience. RepE places
population-level representations, rather than neurons or circuits, at the
center of analysis, equipping us with novel methods for monitoring and
manipulating high-level cognitive phenomena in deep neural networks (DNNs). We
provide baselines and an initial analysis of RepE techniques, showing that they
offer simple yet effective solutions for improving our understanding and
control of large language models. We showcase how these methods can provide
traction on a wide range of safety-relevant problems, including honesty,
harmlessness, power-seeking, and more, demonstrating the promise of top-down
transparency research. We hope that this work catalyzes further exploration of
RepE and fosters advancements in the transparency and safety of AI systems. Comment: Code is available at
https://github.com/andyzoujm/representation-engineerin
Groundwater residence time distributions in peatlands: implications for peat decomposition and accumulation
Peat soils consist of poorly decomposed plant detritus, preserved by low decay rates, and deep peat deposits are globally significant stores in the carbon cycle. High water tables and low soil temperatures are commonly held to be the primary reasons for low peat decay rates. However, recent studies suggest a thermodynamic limit to peat decay, whereby the slow turnover of peat soil pore water may lead to high concentrations of phenols and dissolved inorganic carbon. In sufficient concentrations, these chemicals may slow or even halt microbial respiration, providing a negative feedback to peat decay. We document the analysis of a simple, one-dimensional theoretical model of peatland pore water residence time distributions (RTDs). The model suggests that broader, thicker peatlands may be more resilient to rapid decay caused by climate change because of slow pore water turnover in deep layers. Even shallow peat deposits may also be resilient to rapid decay if rainfall rates are low. However, the model suggests that even thick peatlands may be vulnerable to rapid decay under prolonged high rainfall rates, which may act to flush pore water with fresh rainwater. We also used the model to illustrate a particular limitation of the diplotelmic (i.e., acrotelm and catotelm) model of peatland structure. Model peatlands of contrasting hydraulic structure exhibited identical water tables but contrasting RTDs. These scenarios would be treated identically by diplotelmic models, although the thermodynamic limit suggests contrasting decay regimes. We therefore conclude that the diplotelmic model be discarded in favor of model schemes that consider continuous variation in peat properties and processes
Vessel Shrinkage as a Sign of Atherosclerosis Progression in Type 2 Diabetes: A Serial Intravascular Ultrasound Analysis
Objective: The aim of this study was to determine the natural history of vascular remodeling of atherosclerotic plaques in patients with type 2 diabetes and the predictors of vessel shrinkage.
Point mutations in the Rpb9-homologous domain of Rpc11 that impair transcription termination by RNA polymerase III
RNA polymerase III recognizes and pauses at its terminator, an oligo(dT) tract in non-template DNA, terminates 3′ oligo(rU) synthesis within this sequence, and releases the RNA. The pol III subunit Rpc11p (C11) mediates RNA 3′–5′ cleavage in the catalytic center of pol III during pausing. The amino and carboxyl regions of C11 are homologous to domains of the pol II subunit Rpb9p, and the pol II elongation and RNA cleavage factor, TFIIS, respectively. We isolated C11 mutants from Schizosaccharomyces pombe that cause pol III to read through terminators in vivo. Mutant RNA confirmed the presence of terminator readthrough transcripts. A predominant mutation site, F32, resides in the C11 Rpb9-like domain. Another mutagenic approach confirmed the F32 mutation and also isolated I34 and Y30 mutants. Modeling Y30, F32 and I34 of C11 in available cryoEM pol III structures predicts a hydrophobic patch that may interface with C53/37. Another termination mutant, Rpc2-T455I, appears to reside internally, near the RNA–DNA hybrid. We show that the Rpb9 and TFIIS homologous mutants of C11 reflect distinct activities that differentially affect terminator recognition and RNA 3′ cleavage. We propose that these C11 domains integrate action at the upper jaw and center of pol III during termination.
Headwater Streams and Wetlands are Critical for Sustaining Fish, Fisheries, and Ecosystem Services
Headwater streams and wetlands are integral components of watersheds that are critical for biodiversity, fisheries, ecosystem functions, natural resource-based economies, and human society and culture. These and other ecosystem services provided by intact and clean headwater streams and wetlands are critical for a sustainable future. Loss of legal protections for these vulnerable ecosystems would create a cascade of consequences, including reduced water quality, impaired ecosystem functioning, and loss of fish habitat for commercial and recreational fish species. Many fish species currently listed as threatened or endangered would face increased risks, and other taxa would become more vulnerable. In most regions of the USA, increased pollution and other impacts to headwaters would have negative economic consequences. Headwaters and the fishes they sustain have major cultural importance for many segments of U.S. society. Native peoples, in particular, have intimate relationships with fish and the streams that support them. Headwaters ecosystems and the natural, socio-cultural, and economic services they provide are already severely threatened, and would face even more loss under the Waters of the United States (WOTUS) rule recently proposed by the Trump administration
Assisting Human Cognition in Visual Data Mining
As discussed in Part 1 of the book, in the chapter "Form-Semantics-Function. A Framework for Designing Visualisation Models for Visual Data Mining", the development of consistent visualisation techniques requires a systematic approach related to the tasks of the visual data mining process. The chapter "Visual discovery of network patterns of interaction between attributes" presents a methodology based on viewing visual data mining as a reflection-in-action process. This chapter follows the same perspective and focuses on the subjective bias that may appear in visual data mining. The work is motivated by the fact that visual, though very attractive, also means subjective, and non-experts are often left to utilise visualisation methods (as an understandable alternative to the highly complex statistical approaches) without the ability to understand their applicability and limitations. The chapter presents two strategies addressing the subjective bias: guided cognition and validated cognition, which result in two types of visual data mining techniques: interaction with visual data representations, mediated by statistical techniques, and validation of the hypotheses coming as an output of the visual analysis through another analytics method, respectively.
Existential categories in some works of Hemingway and Camus
The purpose of this thesis is to apply a method of criticism that has been proposed by Dr. James V. Baker. This method proposes that an examination of literature through the "lens" of existentialist philosophy will produce a heightened sense of appreciation. Existential criticism is not intended as a substitute for all other methods of criticism; rather, it is primarily a means to the appreciation of the literary work of art viewed as a whole. [...]
Water system of the lake Druksiai Transboundary Catchment under anthropogenic pressure
This work assesses the surface water and groundwater balance in the transboundary catchment of Lake Druksiai, located in north-eastern Lithuania and also covering adjacent areas of Belarus and Latvia. These water resources have been, and continue to be, intensively exploited. The catchment is subject to strong anthropogenic pressure from urbanisation and industrialisation and, to a lesser extent, from agricultural development. This pressure is caused mainly by heated water from the cooling of the Ignalina nuclear power plant, by pollutants discharged from the municipal wastewater treatment plant (the town of Visaginas), and by chemical pollution from point and diffuse sources. From a hydrogeological point of view, the Lake Druksiai catchment belongs to the eastern part of the Baltic artesian basin. Groundwater does not significantly affect the lake's water exchange (the coefficient of exchange of these waters with unconfined groundwater is 0.009). On the other hand, groundwater, especially from the confined Upper-Middle Devonian aquifer, is the main source of water supply for the entire region and thus indirectly becomes the most important carrier of organic constituents delivered to the lake, originating from households, as well as of heated water from the Ignalina nuclear power plant.
Visual Data Mining: An Introduction and Overview
In our everyday life we interact with various information media, which present us with facts and opinions, supported with some evidence, based, usually, on condensed information extracted from data. It is common to communicate such condensed information in a visual form - a static or animated, preferably interactive, visualisation. For example, when we watch familiar weather programs on TV, landscapes with cloud, rain and sun icons and numbers next to them quickly allow us to build a picture of the predicted weather pattern in a region. Playing sequences of such visualisations will easily communicate the dynamics of the weather pattern, based on the large amount of data collected by many thousands of climate sensors and monitors scattered across the globe and on weather satellites. These pictures are fine when one watches the weather on Friday to plan what to do on Sunday - after all, if the patterns are wrong there are always alternative ways of enjoying a holiday. Professional decision making would be a rather different scenario. It will require weather forecasts at a high level of granularity and precision, and in real time. Such requirements translate into requirements for high-volume data collection, processing, mining, modelling and communicating the models quickly to the decision makers. Further, the requirements translate into high-performance computing with integrated efficient interactive visualisation. From a practical point of view, if a weather pattern cannot be depicted fast enough, then it has no value. Recognising the power of the human visual perception system and pattern recognition skills adds another twist to the requirements - data manipulations need to be completed at least an order of magnitude faster than real time in order to combine them with a variety of highly interactive visualisations, allowing easy remapping of data attributes to the features of the visual metaphor used to present the data.
In these few steps in the weather domain, we have specified some requirements towards a visual data mining system.