30 research outputs found

    The Public Distribution Systems of Foodgrains and Implications for Food Security: A Comparison of the Experiences of India and China

    Get PDF
    public distribution system, food security, poverty, food subsidy, India, China

    Income Inequality in Rural China: Regression-based Decomposition Using Household Data

    Get PDF
    inequality decomposition, regression, income generating function, China

    Turn Waste into Worth: Rectifying Top-k Router of MoE

    Full text link
    Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-k routing mechanism incurs redundant computation and memory costs due to unbalanced routing: some experts overflow, and their excess tokens are dropped, while other experts are vacant and are padded with zeros, which hurts model performance. To address the dropped tokens and padding, we propose the Rectify-Router, comprising Intra-GPU Rectification and Fill-in Rectification. Intra-GPU Rectification handles dropped tokens by routing them to experts within the GPU where they are located, avoiding inter-GPU communication. Fill-in Rectification addresses padding by replacing padding tokens with the tokens that have the highest routing scores. Our experimental results demonstrate that Intra-GPU Rectification and Fill-in Rectification effectively handle dropped tokens and padding, respectively, and that their combination achieves superior performance, surpassing the accuracy of the vanilla top-1 router by 4.7%.
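
    To make the routing problem concrete, the following is a minimal Python/PyTorch sketch of capacity-limited top-1 routing and of one plausible reading of Fill-in Rectification. The function names (`route_top1`, `fill_in_rectify`) and the heuristic of reusing high-scoring dropped tokens as fillers are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the dropped-token / padding problem in capacity-limited
# top-1 MoE routing, plus one plausible reading of Fill-in Rectification.
# Function names and the fill-in heuristic are illustrative assumptions,
# not the paper's implementation.
import torch

def route_top1(router_logits, num_experts, capacity):
    """Assign each token to its top-1 expert; overflow tokens are dropped."""
    scores = router_logits.softmax(dim=-1)       # [n_tokens, num_experts]
    expert_idx = scores.argmax(dim=-1)           # top-1 expert per token
    assigned = [[] for _ in range(num_experts)]  # token ids per expert
    dropped = []
    for t in range(router_logits.size(0)):
        e = expert_idx[t].item()
        if len(assigned[e]) < capacity:
            assigned[e].append(t)
        else:
            dropped.append(t)                    # expert overflow: token dropped
    return assigned, dropped, scores

def fill_in_rectify(assigned, dropped, scores, capacity):
    """Instead of zero-padding vacant slots, fill them with the dropped
    tokens that score highest for that expert (assumed heuristic)."""
    remaining = set(dropped)
    for e, slots in enumerate(assigned):
        for t in sorted(remaining, key=lambda t: -scores[t, e].item()):
            if len(slots) >= capacity:
                break
            slots.append(t)
            remaining.discard(t)
    return assigned, sorted(remaining)

# toy usage: 8 tokens, 2 experts, capacity 3 per expert
logits = torch.randn(8, 2)
assigned, dropped, scores = route_top1(logits, num_experts=2, capacity=3)
assigned, still_dropped = fill_in_rectify(assigned, dropped, scores, capacity=3)
print(assigned, still_dropped)
```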

    Secrets of RLHF in Large Language Models Part I: PPO

    Full text link
    Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Their primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans is therefore of paramount significance, and reinforcement learning with human feedback (RLHF) has emerged as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, the challenges of reward design, environment interaction, and agent training, coupled with the huge trial-and-error cost of large language models, pose a significant barrier to developing technical alignment and safely deploying LLMs, and stable RLHF training remains a puzzle. In this first report, we dissect the RLHF framework, re-evaluate the inner workings of PPO, and explore how the components of the PPO algorithm impact policy agent training. We identify policy constraints as the key factor in the effective implementation of the PPO algorithm, and we therefore explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLM alignment; we therefore release technical reports, reward models, and PPO code.
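
    For context, here is a minimal sketch of the clipped PPO surrogate objective the abstract builds on, with an explicit KL-style policy constraint of the kind the report identifies as key to stability. The coefficients and the approximate-KL term are illustrative assumptions, not the paper's PPO-max configuration.

```python
# Minimal sketch of the clipped PPO policy loss with a KL-style policy
# constraint. Hyperparameter values and the approximate-KL penalty are
# illustrative assumptions, not the paper's PPO-max settings.
import torch

def ppo_loss(logp_new, logp_old, advantages, clip_eps=0.2, kl_coef=0.1):
    """Clipped PPO surrogate loss plus an explicit policy constraint."""
    ratio = (logp_new - logp_old).exp()   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = ratio.clamp(1 - clip_eps, 1 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()
    # approximate KL(old || new) as a penalty: one simple way to constrain
    # the policy, which the abstract flags as key to stable PPO training
    approx_kl = (logp_old - logp_new).mean()
    return policy_loss + kl_coef * approx_kl

# toy usage on stand-in log-probs of sampled actions and advantages
logp_old = -torch.rand(4) - 0.5
logp_new = logp_old + 0.05 * torch.randn(4)
advantages = torch.randn(4)
print(ppo_loss(logp_new, logp_old, advantages))
```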

    Achieving food security in China: past three decades and beyond

    No full text
    Purpose – The paper aims to review and assess China's food security practice over the past three decades with a view to drawing implications for further improving its food security in the future.
    Design/methodology/approach – A normative food security framework is used to assess China's food security achievements and to examine remaining and emerging issues in its pursuit of food security.
    Findings – China has done well in achieving grain security over the past three decades. However, it cannot be concluded that China has achieved food security according to the normative food security framework, because serious problems remain in food safety and quality, environmental sustainability, and social stability. To achieve long-term food security, China has to tackle the widespread issues of unsafe foods and foods of dubious quality, environmental pollution and degradation, and the establishment of a social security system.
    Originality/value – Examining China's food security practice over the past three decades can generate experiences and lessons valuable not only for China, but also for other developing countries in their efforts to achieve national food security. Issues are identified to which the Chinese government needs to pay attention in order to improve China's food security in the future.
    China, Contamination, Environmental health and safety, Food products, Food safety, Social welfare

    China's feedgrain demand in global perspective

    No full text
    China's feedgrain use has increased remarkably in the past two decades. Its demand for feedgrain is expected to grow further, and by 2010 China's demand for feedgrain is expected to exceed that for foodgrain. This paper places China's feedgrain demand in a global perspective and discusses the likely impact of China's rising demand for feedgrains on the world grain market and, in turn, on China's own domestic grain market and livestock industries. The paper concludes with recommendations on policy options available to China for dealing with its fast-growing demand for feedgrains.