Return on Data: Personalizing Consumer Guidance in Data Exchanges
Consumers routinely supply personal data to technology companies in exchange for services. Yet, the relationship between the utility (U) consumers gain and the data (D) they supply — “return on data” (ROD) — remains largely unexplored. Expressed as a ratio, ROD = U / D. While lawmakers strongly advocate protecting consumer privacy, they tend to overlook ROD. Are the benefits of the services enjoyed by consumers, such as social networking and predictive search, commensurate with the value of the data extracted from them? How can consumers compare competing data-for-services deals? Currently, the legal frameworks regulating these transactions, including privacy law, aim primarily to protect personal data.
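The ROD ratio lends itself to a simple numeric comparison of competing data-for-services deals. The sketch below is purely illustrative: the deal names and numbers are invented for this example, and the only element taken from the abstract is the definition ROD = U / D.

```python
def return_on_data(utility: float, data_supplied: float) -> float:
    """ROD = U / D: utility gained per unit of personal data supplied."""
    if data_supplied <= 0:
        raise ValueError("data supplied must be positive")
    return utility / data_supplied

# Two hypothetical data-for-services deals, scored on a common (made-up) scale.
deals = {
    "social_network": {"utility": 40.0, "data": 80.0},    # much data, moderate utility
    "predictive_search": {"utility": 30.0, "data": 20.0}, # less data, similar utility
}

rods = {name: return_on_data(d["utility"], d["data"]) for name, d in deals.items()}
best_deal = max(rods, key=rods.get)  # the deal with the highest return on data
```

Under these invented numbers, the search service offers the better return on data despite delivering less total utility, which is the kind of comparison the abstract suggests consumers currently lack tools to make.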
Model evaluation for extreme risks
Current approaches to building general-purpose AI systems tend to produce
systems with both beneficial and harmful capabilities. Further progress in AI
development could lead to capabilities that pose extreme risks, such as
offensive cyber capabilities or strong manipulation skills. We explain why
model evaluation is critical for addressing extreme risks. Developers must be
able to identify dangerous capabilities (through "dangerous capability
evaluations") and the propensity of models to apply their capabilities for harm
(through "alignment evaluations"). These evaluations will become critical for
keeping policymakers and other stakeholders informed, and for making
responsible decisions about model training, deployment, and security.
Frontier AI Regulation: Managing Emerging Risks to Public Safety
Advanced AI models hold the promise of tremendous benefits for humanity, but
society needs to proactively manage the accompanying risks. In this paper, we
focus on what we term "frontier AI" models: highly capable foundation models
that could possess dangerous capabilities sufficient to pose severe risks to
public safety. Frontier AI models pose a distinct regulatory challenge:
dangerous capabilities can arise unexpectedly; it is difficult to robustly
prevent a deployed model from being misused; and, it is difficult to stop a
model's capabilities from proliferating broadly. To address these challenges,
at least three building blocks for the regulation of frontier models are
needed: (1) standard-setting processes to identify appropriate requirements for
frontier AI developers, (2) registration and reporting requirements to provide
regulators with visibility into frontier AI development processes, and (3)
mechanisms to ensure compliance with safety standards for the development and
deployment of frontier AI models. Industry self-regulation is an important
first step. However, wider societal discussions and government intervention
will be needed to create standards and to ensure compliance with them. We
consider several options to this end, including granting enforcement powers to
supervisory authorities and licensure regimes for frontier AI models. Finally,
we propose an initial set of safety standards. These include conducting
pre-deployment risk assessments; external scrutiny of model behavior; using
risk assessments to inform deployment decisions; and monitoring and responding
to new information about model capabilities and uses post-deployment. We hope
this discussion contributes to the broader conversation on how to balance
public safety risks and innovation benefits from advances at the frontier of AI
development.
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning — which distinguish between its many forms — correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.