Preference-grounded Token-level Guidance for Language Model Fine-tuning
Aligning language models (LMs) with preferences is an important problem in
natural language generation. A key challenge is that preferences are typically
provided at the sequence level while LM training and generation both occur at
the token level. There is, therefore, a granularity mismatch between the
preference and the LM training losses, which may complicate the learning
problem. In this paper, we address this issue by developing an alternate
training process, where we iterate between grounding the sequence-level
preference into token-level training guidance, and improving the LM with the
learned guidance. For guidance learning, we design a framework that extends the
pairwise-preference learning in imitation learning to both variable-length LM
generation and utilizing the preference among multiple generations. For LM
training, based on the amount of supervised data, we present two minimalist
learning objectives that utilize the learned guidance. In experiments, our
method performs competitively on two distinct representative LM tasks --
discrete-prompt generation and text summarization.
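To make the granularity mismatch concrete, here is a minimal sketch of how token-level scores could be tied to a sequence-level pairwise preference. It is an illustration, not the paper's method: the function names are hypothetical, averaging the token rewards is an assumed aggregation chosen so that sequences of different lengths stay comparable, and the loss is a generic Bradley-Terry style objective.

```python
import math


def sequence_score(token_rewards):
    """Aggregate token-level rewards into one sequence-level score.

    Assumption: a plain average, so longer generations are not
    favored merely for having more tokens.
    """
    if not token_rewards:
        raise ValueError("token_rewards must not be empty")
    return sum(token_rewards) / len(token_rewards)


def pairwise_preference_loss(preferred, dispreferred):
    """Bradley-Terry style loss: -log sigmoid(score(preferred) - score(dispreferred))."""
    margin = sequence_score(preferred) - sequence_score(dispreferred)
    # -log sigmoid(m) equals softplus(-m); computed in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical token-level rewards for two generations of different lengths.
better = [0.9, 0.8, 0.7]
worse = [0.1, 0.2]
loss = pairwise_preference_loss(better, worse)
```

Minimizing such a loss with respect to the parameters of a token-level reward model would push the preferred sequence's tokens toward higher scores, which is one way a sequence-level label can yield token-level guidance.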
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
Offline reinforcement learning (RL) extends the paradigm of classical RL
algorithms to purely learning from static datasets, without interacting with
the underlying environment during the learning process. A key challenge of
offline RL is the instability of policy training, caused by the mismatch
between the distribution of the offline data and the undiscounted stationary
state-action distribution of the learned policy. To avoid the detrimental
impact of distribution mismatch, we regularize the undiscounted stationary
distribution of the current policy towards the offline data during the policy
optimization process. Further, we train a dynamics model to both implement this
regularization and better estimate the stationary distribution of the current
policy, reducing the error induced by distribution mismatch. On a wide range of
continuous-control offline RL datasets, our method achieves competitive
performance, which validates our algorithm. The code is publicly available.
Comment: International Conference on Machine Learning (ICML) 202
A Regularized Implicit Policy for Offline Reinforcement Learning
Offline reinforcement learning enables learning from a fixed dataset, without
further interactions with the environment. The lack of environmental
interactions makes the policy training vulnerable to state-action pairs far
from the training dataset and prone to missing rewarding actions. For training
more effective agents, we propose a framework that supports learning a flexible
yet well-regularized fully-implicit policy. We further propose a simple
modification to the classical policy-matching methods for regularizing with
respect to the dual form of the Jensen--Shannon divergence and the integral
probability metrics. We theoretically show the correctness of the
policy-matching approach, and the correctness and a good finite-sample property
of our modification. An effective instantiation of our framework through the
GAN structure is provided, together with techniques to explicitly smooth the
state-action mapping for robust generalization beyond the static dataset.
Extensive experiments and ablation study on the D4RL dataset validate our
framework and the effectiveness of our algorithmic designs.
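For reference, the dual (variational) form of the Jensen--Shannon divergence mentioned above can be written as the standard identity familiar from GAN training. The symbols are assumptions for illustration: $P$ and $Q$ are the two distributions being matched, and $D$ is a discriminator taking values in $(0, 1)$. How the paper modifies this form is not stated in the abstract.

```latex
\mathrm{JSD}(P \,\|\, Q)
  = \log 2 + \frac{1}{2} \sup_{D}
    \Big\{ \mathbb{E}_{x \sim P}\big[\log D(x)\big]
         + \mathbb{E}_{x \sim Q}\big[\log\big(1 - D(x)\big)\big] \Big\}
```

The supremum is attained at $D^*(x) = p(x) / (p(x) + q(x))$, where $p$ and $q$ denote the densities of $P$ and $Q$.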
Visual analysis of global research on immunotherapy for gastric cancer: A literature mining from 2012 to 2022
Gastric cancer (GC) is one of the most common malignancies. Immunotherapy has become an indispensable part of GC treatment. This study conducts a bibliometric analysis of immunotherapy for GC to clarify the research status and identify potential new research directions. VOSviewer and CiteSpace visualization software were used to demonstrate collaborations and correlations. A total of 1141 English publications from 2012 to 2022 were included. The number of publications increased year by year. The publications were mainly from China (n = 579, 50.70%), followed by the United States. Fudan University published the most publications (n = 48, 4.21%). Frontiers in Oncology and Journal of Clinical Oncology ranked first among cited and co-cited journals, respectively. Kim Kyoung-Mee published the most publications on immunotherapy for GC (n = 14). The clustering of the timeline view and the co-cited references show how the hotspots in immunotherapy for GC have shifted. Initially, the hot topics were “cytokine-induced killer cells” and “myeloid-derived suppressor cells.” In recent years, the focus has turned to “targeted therapy.” “CAR-T” has become the hottest topic, and GC has entered the precision therapy phase. Screening patients who can benefit from immunotherapy is key to improving prognosis. The combination of immunotherapy with other treatment options, such as chemotherapy and targeted therapy, is currently the focus of research. Chimeric antigen receptor T cells will be further studied in the future.
Efficacy and safety of totally laparoscopic gastrectomy with uncut Roux-en-Y for gastric cancer: a dual-center retrospective study
Background. Uncut Roux-en-Y (URY) effectively alleviates the common complications associated with RY, such as Roux-en-Y stasis syndrome (RSS). Nevertheless, for gastric cancer (GC) patients, it is still controversial whether URY has an impact on long-term prognosis and whether it results in less afferent loop recanalization. Therefore, we compared whether URY and RY differ in the prognosis and long-term complications of GC patients undergoing totally laparoscopic gastrectomy (TLG). Methods. We analyzed the data of patients who underwent TLG combined with digestive tract reconstruction at two centers between 2016 and 2022. Only patients undergoing URY and RY were selected for analysis. Relapse-free survival (RFS) and overall survival (OS) were estimated. Bias between the groups was reduced by propensity score matching (PSM). The Cox proportional hazard regression model was used to further analyze the influence of URY on prognosis. Results. Two hundred forty-two GC patients were enrolled. The URY group had significantly shorter operation time, time to liquid food intake, and in-hospital stay than the RY group (P < 0.001). The URY group had fewer long-term and short-term postoperative complications than the RY group, especially with regard to RSS, reflux esophagitis, and reflux gastritis. Before PSM, the 3-year and 5-year OS of the URY group and the RY group were 87.5% vs. 65.6% (P < 0.001) and 81.4% vs. 61.7% (P = 0.001), respectively. PSM and Cox multivariate analysis confirmed that, compared to RY, URY can improve the short-term and long-term prognosis of GC patients. Conclusion. TLG combined with URY for GC, especially for advanced, older, and poorly differentiated patients, may promote postoperative recovery and improve long-term prognosis.
The Efficacy and Safety of the Chinese Herbal Formula, JTTZ, for the Treatment of Type 2 Diabetes with Obesity and Hyperlipidemia: A Multicenter Randomized, Positive-Controlled, Open-Label Clinical Trial
Background and Aim. Studies have shown an increasing number of type 2 diabetes (T2D) patients with concomitant obesity and hyperlipidemia syndromes, resulting from relevant metabolic disorders. However, there are few medications and therapies that can thoroughly address these issues. Therefore, the current study evaluated the efficacy and safety of using JTTZ, a Chinese herbal formula, to treat T2D with obesity and hyperlipidemia. Methods. A total of 450 participants with T2D (HbA1c ≥ 7.0%; waist circumference ≥ 90 cm in males and ≥ 80 cm in females; and triglycerides (TG) ≥ 1.7 mmol/L) were randomly assigned, in equal proportions, to two groups in this multicenter randomized, positive-controlled, open-label trial. One group received the JTTZ formula, and the other received metformin (MET) for 12 consecutive weeks. The primary efficacy outcomes were changes in HbA1c, TG, weight, and waist circumference. Adverse reactions and hypoglycemia were monitored. Results. HbA1c decreased by 0.75 ± 1.32% and 0.71 ± 1.2% in the JTTZ and MET groups, respectively, after 12 weeks of treatment. TG levels in the JTTZ and MET groups were reduced by 0.64 ± 2.37 mmol/L and 0.37 ± 2.18 mmol/L, respectively. Weight decreased by 2.47 ± 2.71 kg in the JTTZ group and by 2.03 ± 2.36 kg in the MET group. JTTZ also appeared to alleviate insulin resistance and increase HOMA-β. In addition, symptoms were significantly relieved in participants in the JTTZ group compared to those in the MET group. One case of hypoglycemia was reported in the MET group. No severe adverse events were reported in either group. Conclusions. The JTTZ formula led to safe and significant improvements in blood glucose, blood lipids, and weight; relieved symptoms; and enhanced β cell function for T2D patients with obesity and hyperlipidemia.
The JTTZ formula could potentially be developed as an alternative medicine for patients with T2D, particularly those who cannot tolerate metformin or other hypoglycemic drugs. This trial was registered with Clinicaltrials.gov NCT01471275.
Highly Hydroxide-Conductive Nanostructured Solid Electrolyte via Predesigned Ionic Nanoaggregates
The creation of interconnected ionic nanoaggregates within solid electrolytes
is a crucial yet challenging task for fabricating high-performance
alkaline fuel cells. Herein, we present a facile and generic approach
to embedding ionic nanoaggregates via predesigned hybrid core–shell
nanoarchitecture within nonionic polymer membranes as follows: (i)
synthesizing core–shell nanoparticles composed of SiO₂/densely quaternary ammonium-functionalized polystyrene. Because
of the spatial confinement effect of the SiO₂ “core”,
the abundant hydroxide-conducting groups are locally aggregated in
the functionalized polystyrene “shell”, forming ionic
nanoaggregates bearing intrinsic continuous ion channels; (ii) embedding
these ionic nanoaggregates (20–70 wt %) into the polysulfone
matrix to construct interconnected hydroxide-conducting channels.
The chemical composition, physical morphology, amount, and distribution
of the ionic nanoaggregates are readily regulated, leading to highly
connected ion channels with high effective ion mobility comparable
to that of the state-of-the-art Nafion. The resulting membranes display
strikingly high hydroxide conductivity (188.1 mS cm⁻¹ at 80 °C), which is one of the highest results to date. The
membranes also exhibit good mechanical properties. The independent
manipulation of the conduction function and nonconduction function
by the ionic nanoaggregates and nonionic polymer matrix, respectively,
opens a new avenue, free of microphase separation, for designing high-performance
solid electrolytes for diverse application realms.