The History and Risks of Reinforcement Learning and Human Feedback
Reinforcement learning from human feedback (RLHF) has emerged as a powerful
technique to make large language models (LLMs) easier to use and more
effective. A core piece of the RLHF process is the training and utilization of
a model of human preferences that acts as a reward function for optimization.
This approach, which operates at the intersection of many stakeholders and
academic disciplines, remains poorly understood. RLHF reward models are often
cited as being central to achieving performance, yet very few descriptors of
capabilities, evaluations, training methods, or open-source models exist. Given
this lack of information, further study and transparency are needed for learned
RLHF reward models. In this paper, we illustrate the complex history of
optimizing preferences, and articulate lines of inquiry to understand the
sociotechnical context of reward models. In particular, we highlight the
ontological differences between costs, rewards, and preferences at stake in
RLHF's foundations, related methodological tensions, and possible research
directions to improve general understanding of how reward models function.
Comment: 14 pages, 3 figures
Food Support Networks and their Relationship to Food Insecurity in Colorado Counties
Food insecurity has reemerged as a significant social problem in the United States, despite the fact that we produce more than enough food as a nation to feed all of our citizens. Since the economic recession in 2007, food insecurity has increased, and in recent years has remained at 14.3%. Many strategies have been adopted to address food insecurity in the U.S., some of which are sponsored by the federal government, such as SNAP and the School Lunch Program, while others are donation-driven non-profit organizations, such as food pantries. While there are a number of food support networks that have been established with the intent of decreasing food insecurity, there are still gaps in the food system in which food is wasted and people are hungry. This study explores contemporary food insecurity within Colorado counties, specifically the effectiveness of existing food support networks, the drivers of food insecurity (aside from the factors that are used to calculate the county food insecurity rate), and how effective two local non-profit organizations, Boulder Food Rescue and Denver Food Rescue, have been at addressing hunger and food insecurity in the communities in which they operate. In this study I used both quantitative analyses of Colorado counties as well as qualitative interviews with key players addressing food insecurity. Results demonstrated that the number of food pantries in a county and the presence of a food rescue organization are both positively related to the county's food insecurity rate. As the literature suggests, this indicates that food pantries and food rescue organizations are more likely to locate in areas of high food insecurity.
The most statistically significant drivers of food insecurity are the percentage of individuals with a high school diploma (the higher the percentage, the lower the rate of food insecurity) and the number of individuals whose first language is not English (the higher the percentage, the higher the rate of food insecurity). Lastly, both Boulder and Denver Food Rescue have filled an interesting gap in the food system, as both organizations are helping supplement a growing trend of providing fresh and nutritious fruits and vegetables to food-insecure individuals. Furthermore, both non-profits have succeeded at reaching several key, traditionally unreachable food-insecure populations, such as the elderly and people for whom English is a second language.
The Making of Modern America: Quantifying Chaos
As we begin to explore the Gilded Age (1870-1900), the era in American history sandwiched between the Civil War and Reconstruction and the Progressive Era leading up to the Great War, we want students to grasp the enormity of the changes affecting the lives of Americans, most of whom had been engaged in farming, in many cases in ways not so different from those of their ancestors over several hundred years. Technological changes in the first half of the 19th century contributed to some mechanization and manufacturing, but the enormity of the Civil War and the acquisition of the entire continental territory in the 1850s accelerated changes in the production of goods, in the development of communication and transportation, in the growth of cities, in the opportunities for immigrants, in participation in politics, and in the reach of the government. In this lesson, students will dip into the many changes over the decades from 1860 to 1900 by searching for information on a variety of topics, including: Banking or Finance, Demographics, Government, Industrialization, Immigration, Middle Class Angst, Military, Natural Resources, Politics, Racism, Robber Barons/Captains of Industry, Technological Innovations, Transportation, Urbanization, Voter Turnout, and Xenophobia.
Reward Reports for Reinforcement Learning
The desire to build good systems in the face of complex societal effects requires a dynamic approach towards equity and access. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning design has shown that the effects of optimization objectives on the resultant system behavior can be wide-ranging and unpredictable. In this paper we sketch a framework for documenting deployed learning systems, which we call Reward Reports. Taking inspiration from various contributions to the technical literature on reinforcement learning, we outline Reward Reports as living documents that track updates to design choices and assumptions behind what a particular automated system is optimizing for. They are intended to track dynamic phenomena arising from system deployment, rather than merely static properties of models or data. After presenting the elements of a Reward Report, we provide three examples: DeepMind's MuZero, MovieLens, and a hypothetical deployment of a Project Flow traffic control policy.