
    Can LLMs get help from other LLMs without revealing private information?

    Cascades are a common type of machine learning system in which a large, remote model can be queried when a local model cannot accurately label a user's data by itself. Serving stacks for large language models (LLMs) increasingly use cascades due to their ability to preserve task performance while dramatically reducing inference costs. However, applying cascade systems in situations where the local model has access to sensitive data constitutes a significant privacy risk for users, since such data could be forwarded to the remote model. In this work, we show the feasibility of applying cascade systems in such settings by equipping the local model with privacy-preserving techniques that reduce the risk of leaking private information when querying the remote model. To quantify information leakage in these setups, we introduce two privacy measures. We then propose a system that leverages the recently introduced social learning paradigm, in which LLMs collaboratively learn from each other by exchanging natural language. Using this paradigm, we demonstrate on several datasets that our methods minimize the privacy loss while at the same time improving task performance compared to a non-cascade baseline.
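    The routing logic behind such a cascade can be sketched in a few lines. The sketch below is illustrative only: `local_model`, `remote_model`, and `scrub_private_entities` are hypothetical stand-ins, not components named in the paper, and the confidence-threshold rule is just one plausible escalation criterion.

```python
# Minimal sketch of a privacy-aware LLM cascade. All model and helper
# names here are hypothetical stand-ins, not the paper's actual API.

def cascade_answer(query, local_model, remote_model, scrub_private_entities,
                   confidence_threshold=0.8):
    """Answer locally when confident; otherwise query the remote model
    with a privacy-scrubbed version of the query."""
    answer, confidence = local_model(query)  # assume (label, score) output
    if confidence >= confidence_threshold:
        return answer                        # no data leaves the device
    # Before escalating, rewrite the query so that names, numbers, and
    # other sensitive spans are removed or replaced with placeholders.
    safe_query = scrub_private_entities(query)
    return remote_model(safe_query)
```

    The key design point is that the raw user query never reaches the remote model; only the scrubbed rewrite does, which is what the paper's privacy measures would then quantify.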

    Social Learning: Towards Collaborative Learning with Large Language Models

    We introduce the framework of "social learning" in the context of large language models (LLMs), whereby models share knowledge with each other in a privacy-aware manner using natural language. We present and evaluate two approaches for knowledge transfer between LLMs. In the first scenario, we allow the model to generate abstract prompts aimed at teaching the task. In our second approach, models transfer knowledge by generating synthetic examples. We evaluate these methods across diverse datasets and quantify memorization as a proxy for privacy loss. These techniques inspired by social learning yield promising results with low memorization of the original data. In particular, we show that performance using these methods is comparable to results obtained with the original labels and prompts. Our work demonstrates the viability of social learning for LLMs, establishes baseline approaches, and highlights several unexplored areas for future work.
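    Of the two transfer routes, the synthetic-example one is the easiest to sketch. Below is a minimal, hypothetical illustration; `teacher` and `student` are assumed text-in/text-out LLM callables and do not reflect the paper's actual implementation.

```python
# Sketch of knowledge transfer via teacher-generated synthetic examples.
# `teacher` and `student` are hypothetical LLM callables (str -> str).

def transfer_by_synthetic_examples(teacher, student, task_description,
                                   query, n_examples=4):
    """Teacher invents fresh labeled examples (rather than sharing its
    private data); the student uses them as few-shot context."""
    examples = []
    for _ in range(n_examples):
        examples.append(teacher(
            "Invent one new labeled example for this task, not copied "
            f"from any real data.\nTask: {task_description}"))
    prompt = "\n\n".join(examples) + f"\n\nNow solve: {query}"
    return student(prompt)
```

    Measuring how often the generated examples reproduce spans of the teacher's private data is one way to operationalize the memorization proxy the abstract mentions.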

    Engaging Engineering Teams Through Moral Imagination: A Bottom-Up Approach for Responsible Innovation and Ethical Culture Change in Technology Companies

    We propose a "Moral Imagination" methodology to facilitate a culture of responsible innovation for engineering and product teams in technology companies. Our approach has been operationalized over the past two years at Google, where we have conducted over 50 workshops with teams across the organization. We argue that our approach is a crucial complement to existing formal and informal initiatives for fostering a culture of ethical awareness, deliberation, and decision-making in technology design, such as company principles, ethics and privacy review procedures, and compliance controls. We characterize some of the distinctive benefits of our methodology for the technology sector in particular.

    Shifts in Coding Properties and Maintenance of Information Transmission during Adaptation in Barrel Cortex

    Neuronal responses to ongoing stimulation in many systems change over time, or “adapt.” Despite the ubiquity of adaptation, its effects on the stimulus information carried by neurons are often unknown. Here we examine how adaptation affects sensory coding in barrel cortex. We used spike-triggered covariance analysis of single-neuron responses to continuous, rapidly varying vibrissa motion stimuli, recorded in anesthetized rats. Changes in stimulus statistics induced spike rate adaptation over hundreds of milliseconds. Vibrissa motion encoding changed with adaptation as follows. In every neuron that showed rate adaptation, the input–output tuning function scaled with the changes in stimulus distribution, allowing the neurons to maintain the quantity of information conveyed about stimulus features. A single neuron that did not show rate adaptation also lacked input–output rescaling and did not maintain information across changes in stimulus statistics. Therefore, in barrel cortex, rate adaptation occurs on a slow timescale relative to the features driving spikes and is associated with gain rescaling matched to the stimulus distribution. Our results suggest that adaptation enhances tactile representations in primary somatosensory cortex, where they could directly influence perceptual decisions.
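    As a rough illustration of the analysis method: spike-triggered covariance compares the covariance of stimulus snippets preceding spikes against a prior covariance from random snippets, and eigenvectors of the difference with large-magnitude eigenvalues approximate the stimulus features that drive spiking. The NumPy sketch below is generic STC under assumed data shapes, not the paper's exact pipeline.

```python
# Generic spike-triggered covariance (STC) sketch with NumPy. Variable
# names and shapes are illustrative assumptions, not the paper's code.
import numpy as np

def spike_triggered_covariance(stimulus, spike_times, window=30):
    """stimulus: 1-D array of vibrissa motion samples;
    spike_times: sample indices of spikes; window: samples per snippet."""
    # Collect the stimulus snippet preceding each spike.
    snippets = np.array([stimulus[t - window:t]
                         for t in spike_times if t >= window])
    sta = snippets.mean(axis=0)               # spike-triggered average
    c_spike = np.cov(snippets, rowvar=False)  # spike-conditioned covariance
    # Prior covariance from random, spike-independent snippets.
    rng = np.random.default_rng(0)
    rand_times = rng.integers(window, len(stimulus), size=len(snippets))
    prior = np.array([stimulus[t - window:t] for t in rand_times])
    c_prior = np.cov(prior, rowvar=False)
    # Eigenvectors of the covariance difference with large-magnitude
    # eigenvalues identify the motion features that drive spiking.
    eigvals, eigvecs = np.linalg.eigh(c_spike - c_prior)
    return sta, eigvals, eigvecs
```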

    Large Language Models Encode Clinical Knowledge

    Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.
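    Instruction prompt tuning builds on soft prompt tuning: the LLM's weights stay frozen and only a short block of learned "virtual token" embeddings is trained. The PyTorch sketch below shows that general idea under stated assumptions (`base_model` is any model that accepts input embeddings); it is not Med-PaLM's actual training code.

```python
# Sketch of soft prompt tuning in PyTorch: only the prompt embeddings
# train while the base LLM stays frozen. `base_model` is an assumed
# embedding-input model; this is an illustration, not Med-PaLM's code.
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    def __init__(self, base_model, embed_dim, prompt_len=20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False  # freeze every LLM weight
        # The only trainable parameters: prompt_len learned virtual tokens.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, token_embeddings):
        # Prepend the learned prompt to every sequence in the batch.
        batch = token_embeddings.shape[0]
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        return self.base_model(torch.cat([prompt, token_embeddings], dim=1))
```

    Because only the prompt embeddings receive gradients, a handful of exemplars suffices to adapt the frozen model to a new domain, which is the parameter-efficiency the abstract highlights.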