26 research outputs found

    Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment

    Large Language Models (LLMs) can acquire extensive world knowledge through pre-training on large corpora. However, due to exposure to low-quality data, LLMs may exhibit harmful behavior without aligning with human values. The dominant approach for steering LLMs towards beneficial behavior involves Reinforcement Learning with Human Feedback (RLHF), with Proximal Policy Optimization (PPO) serving as the default RL optimizer. Despite its effectiveness, PPO has limitations when optimizing rewards trained from comparison-based loss. Primarily, PPO is not invariant to equivalent reward functions containing identical preference information due to the need to calibrate the reward scale. Additionally, PPO's necessity for token-wise updates introduces complexity in both function approximation and algorithm design compared to trajectory-wise optimization. This paper proposes a new framework, reinforcement learning with relative feedback, and a novel trajectory-wise policy gradient algorithm, Pairwise Proximal Policy Optimization (P3O) that operates directly on comparative rewards. We show theoretically that P3O is invariant to equivalent rewards and avoids the complexity of PPO. Empirical evaluations demonstrate that P3O outperforms PPO in the KL-Reward trade-off and can align with human preferences as well as or better than prior methods. In summary, this work introduces a simpler yet effective approach for aligning LLMs to human preferences through relative feedback. Comment: 19 pages, 5 figures
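
The invariance claim in the abstract above can be illustrated with a small sketch. It is not the P3O algorithm; it only shows, under the assumption of a Bradley-Terry style comparison loss, that adding the same offset to the rewards of two responses to one prompt leaves both the comparison loss and the reward difference unchanged, while the absolute rewards change. The function names and the reward values are hypothetical.

```python
import math


def bradley_terry_loss(r_chosen, r_rejected):
    """Negative log-likelihood that the chosen response beats the rejected one.

    Depends only on the reward difference r_chosen - r_rejected.
    """
    return math.log1p(math.exp(-(r_chosen - r_rejected)))


def shift_rewards(rewards, offset):
    """Add the same (prompt-dependent) offset to every reward."""
    return [r + offset for r in rewards]


# Hypothetical rewards for two responses to the same prompt.
rewards = [1.25, -0.5]
shifted = shift_rewards(rewards, 8.0)

# The comparison loss cannot tell the two reward functions apart.
loss_original = bradley_terry_loss(*rewards)
loss_shifted = bradley_terry_loss(*shifted)

# A pairwise quantity (the reward difference) is also unchanged,
# whereas the absolute rewards themselves are not.
diff_original = rewards[0] - rewards[1]
diff_shifted = shifted[0] - shifted[1]
```

An objective built on absolute rewards sees different numbers for `rewards` and `shifted`, although both encode the same preference; an objective built on the difference does not.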

    PO-084 Research of HIIT Detraining on Mitochondria of Soleus Muscle Beclin1 and Bnip3 Contents in Aging Rats

    Objective: To observe the temporal variation of Beclin1 and Bnip3 protein during age-related degeneration of skeletal muscle in an aged rat model, and to observe the effect of a HIIT intervention on the changes in Beclin1 and Bnip3 protein and on the relationship between the two. The study aims to provide a theoretical basis for the effect of exercise on age-related skeletal muscle degeneration through its influence on the level of mitochondrial autophagy. Methods: Forty male Wistar rats aged 8 months were randomly divided into a quiet control group (C) and a HIIT intervention group (H). After one week of adaptive feeding and exercise in the animal room, the rats in group C did not exercise, while group H exercised at alternating intensities of 70%-90%-50% VO2max, set according to the rats' maximal oxygen uptake test results. The maximal oxygen uptake of the rats in group H and group C was tested once every two weeks. Group H trained for 50 min/day, 5 days/week, for 16 weeks. Rats from the two groups were randomly selected for sampling after the first VO2 test and after the eighth and sixteenth weeks of intervention. After anesthesia, blood was collected from the abdominal aorta and soleus tissue was obtained. ROS activity in the soleus muscle was measured by a fluorescence enzyme-labeling method. Mitochondria were isolated from the soleus muscle using a tissue mitochondria isolation kit, and the expression of Beclin1 and Bnip3 in the soleus mitochondria was measured by Western blot. Image Lab 4 software was used to collect data from the protein bands, and SPSS 17 was used to analyze the data. Results are presented as mean ± standard deviation. For the band analysis, the relative protein content of each sample was obtained by gray-scale analysis.
The results of the first sampling were taken as the baseline, and the 8-week and 16-week values of group H and group C were expressed as ratios to that baseline, giving the relative protein content. Repeated-measures analysis of variance was then used to analyze differences in each indicator at baseline, 8 weeks and 12 weeks between group C and group H. The independent-samples t test was used when there was no interaction effect, and multivariate analysis of variance was also used. The significance level was set at alpha = 0.05. Results: (1) ROS content in rat skeletal muscle was related to natural aging (F=119.314, P < 0.001), and comparison of the time points within group C and group H showed that ROS levels rose with natural aging (F=28.884, P=0.001; F=127.607, P < 0.001). ROS levels in group H were lower than in group C, but the difference was not significant (P=0.310), and the interaction of time and exercise mode (HIIT) did not affect the result (F=0.814, P=0.477). However, ROS increased more slowly in group H than in group C. (2) Exercise, time, and their interaction did not affect Beclin1 content in rat skeletal muscle mitochondria (P > 0.05). (3) Mitochondrial Bnip3 content differed significantly between group H and group C at 8 weeks (F=14.500, P=0.001), being significantly higher in group H, but did not differ significantly at 16 weeks (F=0.090, P=0.767). Bnip3 content in skeletal muscle mitochondria changed with age (F=20.852, 0.001). In group H it first increased and then decreased. There was a linear trend (F=6.950, P=0.005) between mitochondrial Bnip3 content and the time-point factor and the interaction between time and HIIT exercise in rats.
Conclusions: With aging, (1) ROS content in rat skeletal muscle increased significantly; long-term HIIT training could delay this increase, but the optimal exercise duration is unknown. (2) Beclin1 content in rat skeletal muscle mitochondria showed no obvious change, and HIIT training had no obvious effect on it. However, changes in mitochondrial Beclin1 content relative to the total Beclin1 content of skeletal muscle need further study. (3) Bnip3 content in rat skeletal muscle mitochondria increased, and long-term HIIT training has a delayed effect

    PO-098 Effect of HIIT on mitochondrial telomerase of skeletal muscle in aged rats

    Objective: HIIT and moderate-intensity exercise are two different exercise models in public fitness. HIIT has become increasingly popular in recent years, but there is very little research on how HIIT affects mitochondrial telomerase reverse transcriptase (TERT) in age-related degeneration of skeletal muscle. The purpose of this study was to compare a HIIT group and a moderate-intensity group, and to examine differences in telomerase expression and cardiopulmonary endurance between the exercise groups and a quiet control group. Methods: Fifty-nine male Wistar rats were randomly divided into three groups: control group (Q, n=19), moderate-intensity intervention group (M, n=20), and HIIT intervention group (H, n=20). The rats in group Q did no exercise, and the rats in group M exercised at 60% VO2max intensity for 8 weeks. Group H followed an 8-week training program alternating 40%, 60%, and 80% VO2max intensities. Rats in the exercise groups trained for 50 minutes per day, 5 days per week. After the baseline group was sampled, rats from each group were sampled once training reached the specified number of weeks (4 and 8 weeks), with a maximal oxygen uptake test performed before sampling. One-way analysis of variance was used to assess differences in VO2max and protein expression between conditions. Results: VO2max in group H was significantly higher than in group M and group Q (P<0.05). The mTERT expression of group M at week 4 was significantly higher than that of group Q (P<0.05). The mTERT expression in group H was significantly higher than that in group Q at week 8 (P<0.05). There was no significant difference between group H and group Q at week 8 (P<0.05). Conclusions: 1.
HIIT lasting 8 weeks can effectively inhibit the decrease of maximal oxygen uptake in aging rats compared with moderate exercise. 2. HIIT training for 8 weeks promotes the expression of mTERT. 3. The maintenance of VO2max in aging rats may be related to enhanced mitochondrial antioxidant function resulting from HIIT-promoted translocation of TERT to mitochondria

    OR-005 Effects of HIIT and MICT for 10 weeks on myocardial AMPK and PGC-1α in rats

    Objective: Improving cardiorespiratory fitness (CRF) is known to be an effective strategy for preventing cardiovascular risk. Myocardial aerobic oxidation, which is controlled by the adenosine monophosphate-activated protein kinase (AMPK)-peroxisome proliferator-activated receptor γ coactivator-1α (PGC-1α) signaling pathway, is key to CRF. Previous studies have examined the effects of Moderate-Intensity Continuous Training (MICT) and High-Intensity Interval Training (HIIT) on the AMPK-PGC-1α signaling pathway only in skeletal muscle, not in the myocardium. The aim of this study was to compare the effects of 10 weeks of HIIT and MICT on the expression of AMPK and PGC-1α in the myocardium of male Wistar rats. Methods: Male Wistar rats (n=30) aged 6 weeks were randomly divided into HIIT, MICT, or control (CON) groups. The training groups ran on a treadmill 5 days/week for 10 weeks. The HIIT group ran six 3-minute bouts (0° slope) at 90% of maximal speed (Vmax), separated by 3 minutes at 50% of Vmax, and the MICT group ran for 50 min (0° slope) at 60–70% of Vmax. The expression of AMPK and PGC-1α was assessed by Western blotting. Results: After 10 weeks of training, both HIIT and MICT increased AMPK and PGC-1α expression compared with the CON group. The expression of AMPK and PGC-1α was significantly higher in the HIIT group than in the MICT group (p<0.05). AMPK increased significantly, 1.16-fold relative to CON in the MICT group and 1.28-fold relative to CON in the HIIT group (P<0.05). PGC-1α in the HIIT group increased significantly to 1.32 times the CON level and to 1.15 times the MICT level (P<0.05); PGC-1α in the MICT group increased significantly to 1.15 times the CON level. Conclusion: HIIT seems to improve myocardial AMPK and PGC-1α more efficiently than MICT in rats after 10 weeks of training.

    PO-014 The effects of HIIT on ROS-AMPK- PGC-1α pathway in skeletal muscle and VO2max of ageing Wistar rats.

    Objective: To observe the effects of a 16-week HIIT intervention on SOD, ROS, the related factor AMPK, and expression of the oxidative-capacity factor PGC-1α, as well as on VO2max and its pattern of change, in naturally aging rats; to explore the correlation between the expression of ROS, AMPK and PGC-1α and the change in VO2max; and to provide a theoretical basis for using HIIT to delay the reduction of aerobic capacity in aging skeletal muscle. Methods: Fifty-eight male Wistar rats (age: 32 weeks) were randomly divided into a quiet group (C) and a HIIT intervention group (H). All rats were housed in a barrier environment. Each group underwent one week of adaptive feeding and exercise after entering the animal laboratory. VO2max was tested every two weeks in each group. Rats in group C did not exercise; group H exercised at alternating intensities of 50%, 70% and 90% VO2max for 50 min/day, 5 days/week, for 16 weeks, with exercise intensity adjusted according to the VO2max test results. Rats from both groups were sampled 24 hours after the end of 8 and 16 weeks of intervention. The soleus was dissected, SOD and ROS content were measured with a multifunctional microplate reader, and the expression of AMPK and PGC-1α was measured by Western blot. VO2max, SOD and ROS results and the relative expression of AMPK and PGC-1α were analyzed by one-way ANOVA in SPSS. Results: 1. Cardiopulmonary endurance declined in both group C and group H during the intervention, but the decline was slower in group H than in group C. 2. During 16 weeks of aging, SOD expression in group C first fell and then rose: SOD content at 8 weeks was significantly lower than the baseline value (P < 0.05), and at 16 weeks SOD levels in group C were higher than baseline.
In group H, SOD expression was relatively flat during the first 8 weeks, declined during weeks 8-16, and was significantly lower at 16 weeks than at 8 weeks (P < 0.05). 3. ROS content was significantly higher than baseline at 8 weeks and 16 weeks of the intervention (P<0.05), and significantly higher at 16 weeks than at 8 weeks (P<0.05). The ROS content of group C and group H was significantly higher than that in the group at 8 weeks (P < 0.05). 4. AMPK content in group C was significantly lower than the baseline value (P<0.05), and AMPK content in group H was significantly higher than that in group C (P<0.05). 5. After 16 weeks of intervention, PGC-1α content in both group C and group H showed a decreasing trend and was significantly lower than the baseline value (P<0.05), but was significantly higher in group H than in group C (P<0.05). 6. AMPK, PGC-1α and cardiopulmonary endurance changed in the same way in all groups during the intervention. Conclusions: 1. Sixteen weeks of HIIT can effectively delay the decrease of SOD content in aging rats, thereby inhibiting the accumulation of ROS in the body. 2. Sixteen weeks of HIIT intervention can effectively delay the decline of VO2max and of AMPK and PGC-1α expression in aging rats. 3. Sixteen weeks of HIIT may delay the decrease of AMPK-PGC1 protein expression by inhibiting the accumulation of skeletal muscle ROS in aging rats, thus inhibiting the decrease of VO2max

    PO-082 16-Week high intensity interval training does not alter LKB1 and AMPKα protein in Rats Liver

    Objective: The liver is one of the most important organs involved in lipid and glucose metabolism, yet no study has examined the response of liver kinase B1 (LKB1) and AMP-activated protein kinase α (AMPKα) signaling to high intensity interval training. This study aims to evaluate the effect of a 16-week high intensity interval training intervention on the expression of LKB1 and AMPKα in the liver of aging rats. Methods: Eight-month-old male Wistar rats (n=40) were randomly divided into a control group (C) and a HIIT group (H). Group H trained at 70%-90%-50% VO2max intensity for 50 min/day, 5 days/week, for 16 weeks. Rats were killed at 0, 8 and 16 weeks. Protein expression of LKB1 and AMPKα in the liver was analyzed by Western blot. Data are mean±SD; for ANOVA, p<0.05 was significant. Results: AMPKα levels in group C and group H increased with time, and there was no significant difference between the groups. LKB1 content in group C and group H both increased first and then decreased, but there was no significant difference between the groups. Conclusions: A 16-week high intensity interval training intervention had no effect on LKB1 or AMPKα protein expression in aging rats

    Sintering behavior and mechanism of tungsten powders prepared by solution combustion synthesis combined with hydrogen reduction

    Nanosized tungsten powders were fabricated by solution combustion synthesis combined with hydrogen reduction. The powder had a size of 20 nm but possessed a large number of lattice defects. The fracture surface images at different temperatures show that the as-synthesized tungsten powder could be sintered via a pressureless process to a relative density of up to 95.78% at 1773 K. Kinetic analysis suggests that grain-boundary diffusion is one of the primary mechanisms of mass transport during the intermediate stage of sintering. The sintering properties are attributed to the ultrafine grain and the high sintering activation caused by the solution combustion synthesis method. The analysis shows in detail that the as-synthesized tungsten powder has a lower sintering activation energy than commercial nanosized tungsten powder, with a measured hardness of 633 HV