    Representation Engineering: A Top-Down Approach to AI Transparency

    In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience. RepE places population-level representations, rather than neurons or circuits, at the center of analysis, equipping us with novel methods for monitoring and manipulating high-level cognitive phenomena in deep neural networks (DNNs). We provide baselines and an initial analysis of RepE techniques, showing that they offer simple yet effective solutions for improving our understanding and control of large language models. We showcase how these methods can provide traction on a wide range of safety-relevant problems, including honesty, harmlessness, power-seeking, and more, demonstrating the promise of top-down transparency research. We hope that this work catalyzes further exploration of RepE and fosters advancements in the transparency and safety of AI systems.
    Comment: Code is available at https://github.com/andyzoujm/representation-engineerin

    A Search for Technosignatures Around 31 Sun-like Stars with the Green Bank Telescope at 1.15-1.73 GHz

    We conducted a search for technosignatures in April of 2018 and 2019 with the L-band receiver (1.15-1.73 GHz) of the 100 m diameter Green Bank Telescope. These observations focused on regions surrounding 31 Sun-like stars near the plane of the Galaxy. We present the results of our search for narrowband signals in this data set as well as improvements to our data processing pipeline. Specifically, we applied an improved candidate signal detection procedure that relies on the topographic prominence of the signal power, which nearly doubles the signal detection count of some previously analyzed data sets. We also improved the direction-of-origin filters that remove most radio frequency interference (RFI) to ensure that they uniquely link signals observed in separate scans. We performed a preliminary signal injection and recovery analysis to test the performance of our pipeline. We found that our pipeline recovers 93% of the injected signals over the usable frequency range of the receiver and 98% if we exclude regions with dense RFI. In this analysis, 99.73% of the recovered signals were correctly classified as technosignature candidates. Our improved data processing pipeline classified over 99.84% of the ~26 million signals detected in our data as RFI. Of the remaining candidates, 4539 were detected outside of known RFI frequency regions. The remaining candidates were visually inspected and verified to be of anthropogenic nature. Our search compares favorably to other recent searches in terms of end-to-end sensitivity, frequency drift rate coverage, and signal detection count per unit bandwidth per unit integration time.
    Comment: 20 pages, 8 figures, in press at the Astronomical Journal (submitted on Sept. 9, 2020; reviews received Nov. 6; re-submitted Nov. 6; accepted Nov. 17
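The abstract above says only that candidate detection "relies on the topographic prominence of the signal power". As a rough illustration of what prominence-based detection can look like, the following is a minimal sketch, not the authors' pipeline. The function names, the toy spectrum, and the threshold of 3.0 are all hypothetical; the prominence definition used is the standard one (peak height minus the higher of the two lowest points found before reaching a taller sample on either side).

```python
def prominence(power, i):
    """Topographic prominence of power[i], assumed to be a local maximum."""
    peak = power[i]
    # Walk left until a strictly higher sample or the edge; track the lowest point.
    left_min = peak
    j = i - 1
    while j >= 0 and power[j] <= peak:
        left_min = min(left_min, power[j])
        j -= 1
    # Same walk to the right.
    right_min = peak
    j = i + 1
    while j < len(power) and power[j] <= peak:
        right_min = min(right_min, power[j])
        j += 1
    # The higher of the two bases is the reference level for this peak.
    return peak - max(left_min, right_min)


def detect_candidates(power, min_prominence):
    """Return (index, prominence) for local maxima that clear the threshold."""
    hits = []
    for i in range(1, len(power) - 1):
        # Left-strict comparison reports a flat-topped peak only once.
        if power[i] > power[i - 1] and power[i] >= power[i + 1]:
            p = prominence(power, i)
            if p >= min_prominence:
                hits.append((i, p))
    return hits


# Hypothetical toy spectrum (arbitrary power units), not real telescope data.
spectrum = [1.0, 1.2, 0.9, 6.0, 1.1, 1.0, 2.5, 2.0, 8.0, 1.0, 1.1, 0.8]
candidates = detect_candidates(spectrum, min_prominence=3.0)
candidate_indices = [i for i, _ in candidates]
```

A prominence threshold, unlike a plain amplitude threshold, measures each peak against its local surroundings, so a weak signal sitting next to a strong one is judged on how far it rises above the dip between them.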

    Proportional Aggregation of Preferences for Sequential Decision Making

    We study the problem of fair sequential decision making given voter preferences. In each round, a decision rule must choose a decision from a set of alternatives where each voter reports which of these alternatives they approve. Instead of going with the most popular choice in each round, we aim for proportional representation, using axioms inspired by the multi-winner voting literature. The axioms require that if a group comprising α% of the voters agrees in every round (i.e., approves a common alternative), then those voters must approve at least α% of the decisions. A stronger version of the axioms requires that every group of α% of the voters that agrees in a β fraction of rounds must approve β⋅α% of the decisions. We show that three attractive voting rules satisfy axioms of this style. One of them (Sequential Phragmén) makes its decisions online, and the other two satisfy strengthened versions of the axioms but make decisions semi-online (Method of Equal Shares) or fully offline (Proportional Approval Voting). We present empirical results for these rules based on synthetic data and U.S. political elections. We also run experiments using the moral machine dataset about ethical dilemmas. We train preference models on user responses from different countries and let the models cast votes. We find that aggregating these votes using our rules leads to a more equal utility distribution across demographics than making decisions using a single global preference model.
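The abstract names Sequential Phragmén as the online rule but does not spell it out. The sketch below assumes the usual load-balancing formulation from the multi-winner literature, applied round by round: each decision costs one unit of load, shared among its approvers so that their loads end up equal, and the alternative yielding the smallest resulting load is chosen. The function name, data layout, and alphabetical tie-breaking are assumptions made for illustration and may differ from the paper's definition.

```python
from fractions import Fraction


def sequential_phragmen(rounds, voters):
    """Pick one alternative per round with a load-balancing rule.

    rounds: list of dicts mapping alternative -> set of approving voters.
    voters: iterable of voter identifiers.
    """
    load = {v: Fraction(0) for v in voters}
    decisions = []
    for approvals in rounds:
        best, best_load = None, None
        # Sorted iteration makes tie-breaking deterministic (an assumption).
        for alt in sorted(approvals):
            supporters = approvals[alt]
            if not supporters:
                continue
            # Load every supporter would carry if this alternative were chosen.
            new_load = (1 + sum(load[v] for v in supporters)) / len(supporters)
            if best_load is None or new_load < best_load:
                best, best_load = alt, new_load
        if best is None:
            raise ValueError("no alternative has any approver in this round")
        for v in approvals[best]:
            load[v] = best_load
        decisions.append(best)
    return decisions


# Hypothetical example: three voters always approve "a", one always approves "b".
voters = [1, 2, 3, 4]
rounds = [{"a": {1, 2, 3}, "b": {4}} for _ in range(4)]
decisions = sequential_phragmen(rounds, voters)
```

In this toy instance, voter 4 makes up 25% of the electorate and approves one of the four decisions, whereas choosing the most popular alternative in every round would select "a" four times.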

    Donor–Acceptor Biarylcarbazoles as Efficient Host Materials for Solution-Processable High-Performance Phosphorescent Organic Light-Emitting Diodes

    Host materials having high triplet energies offer great commercial potential for the development of solution-processable high-performance phosphorescent organic light-emitting diodes (PhOLEDs). While plenty of vacuum-deposited host materials are available, the literature reveals a dearth of solution-processable host materials. Therefore, a series of biarylcarbazoles (BACs) were designed as host materials by incorporating donor–acceptor functionalities and doped with blue, green, yellow, and orange phosphorescent emitters to develop energy-saving high-performance PhOLEDs with low turn-on voltages. All of the synthesized host materials exhibited good thermal stability in the range of 294–355 °C and exhibited remarkably high triplet energies of 2.50–2.81 eV. Surprisingly, PhOLEDs prepared by incorporating a host material 6a doped with a green phosphorescent emitter, i.e., Ir(ppy)₃, displayed admirable efficiencies with a maximum power efficiency (PE) of 55.6 lm/W, a current efficiency (CE) of 53.2 cd/A, and an external quantum efficiency of 17.1% with a maximum brightness (Lmax) of 27 000 cd/m². BAC host material 6a exhibited better performance compared to that of commercial host 4,4′-bis(N-carbazolyl)-1,1′-biphenyl (CBP) and 4,4′,4″-tris(carbazol-9-yl)triphenylamine. The BAC 6a host was also found to be compatible with orange, yellow, and blue phosphorescent emitters, which displayed PEs of 31.9, 21.4, and 14.1 lm/W, respectively, at a brightness of 100 cd/m². Notably, the green PhOLED with donor–acceptor-based host 6a exhibited 23% roll-up in CE while moving from 100 to 1000 cd/m². The enhancement of the performance of the green PhOLED is attributed to higher singlet and triplet energies of host 6a compared to that of the utilized green emitter tris(2-phenylpyridine)iridium(III), leading to effective host–guest energy transfer and the ability to form efficient excitons in the host–guest matrix, thus enhancing the OLED performance. Thus, BAC 6a has commercial potential as a suitable host material for the fabrication of efficient multicolor PhOLEDs.