Search CORE

29 research outputs found

The Public Distribution Systems of Foodgrains and Implications for Food Security: A Comparison of the Experiences of India and China

Authors: Wan Guanghua, Zhou Zhangyue

public distribution system, food security, poverty, food subsidy, India, China

Research Papers in Economics

Income Inequality in Rural China: Regression-based Decomposition Using Household Data

Authors: Wan Guanghua, Zhou Zhangyue

inequality decomposition, regression, income generating function, China

Research Papers in Economics

Secrets of RLHF in Large Language Models Part I: PPO

Authors: Chang Cheng, Chen Lu, Cheng Wensen, Dou Shihan, Gao Songyang, Gui Tao, Hua Yuan, Huang Haoran, Huang Xuanjing, Jin Senjie, Lai Wenbin, Liu Qin, Liu Yan, Qiu Xipeng, Shen Wei, Sun Tianxiang, Wang Binghai, Weng Rongxiang, Xi Zhiheng, Xiong Limao, Xu Nuo, Yan Hang, Yin Zhangyue, Zhang Qi, Zheng Rui, Zhou Yuhao, Zhu Minghao
Publication date: 10/07/2023

Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Its primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, due to the challenges of reward design, environment interaction, and agent training, coupled with huge trial and error cost of large language models, there is a significant barrier for AI researchers to motivate the development of technical alignment and safe landing of LLMs. The stable training of RLHF has still been a puzzle. In the first report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training. We identify policy constraints being the key factor for the effective implementation of the PPO algorithm. Therefore, we explore the PPO-max, an advanced version of PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLMs alignment. Therefore, we are eager to release technical reports, reward models and PPO code.

arXiv.org e-Print Archive

Expert Judgement on the Effects of the Grain Marketing System on Grain Production in China: A Survey

Author: Zhou Zhangyue
Publication date: 01/04/2017

AgEcon Search - Research in Agricultural & Applied Economics

Grain Supply Response in China and India: A Comparative Study

Author: Zhou Zhangyue
Publication date: 01/04/2017

AgEcon Search - Research in Agricultural & Applied Economics

Expert Judgement on the Effects of the Grain Marketing System on Grain Production in India: A Survey

Author: Zhou Zhangyue
Publication date: 01/04/2017

AgEcon Search - Research in Agricultural & Applied Economics

Achieving food security in China: past three decades and beyond

Author: Zhangyue Zhou
Publication venue: Emerald

Crossref

Achieving food security in China: past three decades and beyond

Author: Zhangyue Zhou

Purpose – The paper aims to review and assess China's food security practice over the past three decades with a view to drawing implications for further improving its food security in the future. Design/methodology/approach – A normative food security framework is used to assess China's food security achievements and examine any remaining and emerging issues in its pursuit of food security. Findings – China has done well in achieving grain security in the past three decades. However, it cannot be concluded that China has achieved its food security according to the normative food security framework. This is because there are serious problems in the aspects of food safety and quality, environmental sustainability, and social stability. To achieve long-term food security, China has to tackle the widespread issues of unsafe foods and foods of dubious quality, environmental pollution and degradation, and the establishment of a social security system. Originality/value – Examining China's food security practice over the past three decades can generate experiences and lessons valuable not only for China, but also for other developing countries in their efforts to achieve national food security. Issues are identified to which the Chinese government needs to pay attention in order to improve China's food security in the future.

China, Contamination, Environmental health and safety, Food products, Food safety, Social welfare

Research Papers in Economics

China's feedgrain demand in global perspective

Authors: Xin Xian, Zhou Zhangyue
Publication venue: China Agriculture Press
Publication date: 01/01/2006

China's feedgrain use has increased remarkably in the past two decades. Its demand for feedgrain is expected to further grow, and by 2010, China's demand for feedgrain is expected to exceed that of foodgrain. This paper places China's feedgrain demand in the global perspective and discusses the likely impact of China's rising demand for feedgrains on the world grain market and in turn on China's own domestic grain market and livestock industries. The paper concludes with recommendations on policy options available for China to deal with its fast-growing demand for feedgrains.

ResearchOnline@JCU

Income inequality in rural China: regression-based decomposition using household data

Authors: Wan Guanghua, Zhou Zhangyue
Publication venue: Wiley-Blackwell
Publication date: 01/02/2005

A considerable literature exists on the measurement of income inequality in China and its increasing trend. Much less is known about the driving forces of this trend and their quantitative contributions. Conventional decompositions, by factor components or by population subgroups, provide only limited information on the determinants of income inequality. This paper represents an early attempt to apply the regression-based decomposition framework to the study of inequality accounting in rural China, using household-level data. It is found that geography has been the dominant factor but is becoming less important in explaining total inequality. Capital input emerges as a most significant determinant of income inequality. Farming structure is more important than labor and other inputs in contributing to income inequality across households.

ResearchOnline@JCU
