TextGAIL: Generative Adversarial Imitation Learning for Text Generation
Generative Adversarial Networks (GANs) for text generation have recently
received many criticisms, as they perform worse than their MLE counterparts. We
suspect previous text GANs' inferior performance is due to the lack of a
reliable guiding signal in their discriminators. To address this problem, we
propose a generative adversarial imitation learning framework for text
generation that uses large pre-trained language models to provide more reliable
reward guidance. Our approach uses a contrastive discriminator and proximal
policy optimization (PPO) to stabilize and improve text generation performance.
For evaluation, we conduct experiments on a diverse set of unconditional and
conditional text generation tasks. Experimental results show that TextGAIL
achieves better performance in terms of both quality and diversity than the MLE
baseline. We also validate our intuition that TextGAIL's discriminator
demonstrates the capability of providing reasonable rewards with an additional
task. Comment: AAAI 202
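As a rough illustration of the PPO component mentioned in the abstract above, the standard clipped surrogate objective can be sketched in a few lines. This is a generic sketch, not TextGAIL's implementation: the function name `ppo_clipped_objective`, the default `clip_eps=0.2`, and the use of plain per-sample log-probabilities and advantages are all assumptions made for illustration.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    logp_new / logp_old: log-probabilities of the sampled outputs under the
    current and the previous policy (hypothetical inputs for this sketch).
    advantages: per-sample advantage estimates, e.g. derived from a
    discriminator reward.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes the incentive to move the policy
        # far outside the [1 - eps, 1 + eps] trust region.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage; the clipping only takes effect once the policy update moves the ratio outside the allowed band.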
Cu-Based Electrocatalysts for Carbon Dioxide Conversion to Value-Added Chemicals
Massive use of fossil fuels has been causing considerable CO2 emissions, which raise the temperature of the planet and seriously threaten the human living environment through soil degradation, lower agricultural productivity, desertification, loss of biodiversity, reduced fresh water, ocean acidification, ozone layer destruction, and so on. A number of technologies are being developed to reduce the amount of CO2; however, none of the existing technologies, except those that use CO2 as a feedstock, can truly close the anthropogenic carbon loop. Currently, considering economy and operability, electroreduction of CO2 seems to be the most promising strategy for converting CO2 to high-value chemicals.
For CO2 electroreduction, Cu-based catalysts have become the most popular because they meet the requirements of activating CO2 and intermediates, suppressing hydrogen formation, and transporting electrons. Herein, the factors that affect the performance of Cu-based catalysts, including morphology, particle size, the presence of atomic-scale defects, surface roughness, residual oxygen atoms, and so on, are surveyed and discussed. In addition, the most probable reaction pathways for synthesizing the desired C2 products under different conditions are identified; these follow *CO + *CO → *COCO, *CO + *COH → C2, *CO + *CHO → C2, and *COH → *CH2 → C2. This report will benefit the design and optimization of Cu-based catalysts for converting CO2 to high-value chemicals with high efficiency and selectivity.
Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning
A fundamental question in any peer-to-peer ride-sharing system is how to meet
passengers' requests both effectively and efficiently so as to balance
supply and demand in real time. On the passenger side, traditional approaches
focus on pricing strategies that adjust the distribution of demand by
increasing the probability that users request rides. However, previous methods
do not take into account how a change in strategy affects future supply and
demand: passengers' calls reposition drivers to different destinations, which
affects drivers' income for some time afterwards. Motivated by this
observation, we make an attempt to optimize
the distribution of demand to handle this problem by learning the long-term
spatio-temporal values as a guideline for pricing strategy. In this study, we
propose an offline deep reinforcement learning based method focusing on the
demand side to improve the utilization of transportation resources and customer
satisfaction. We adopt a spatio-temporal learning method to learn the value of
different times and locations, then incentivize the ride requests of passengers
to adjust the distribution of demand to balance the supply and demand in the
system. In particular, we model the problem as a Markov Decision Process (MDP)
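To make the idea of learning a value for each time and location concrete, here is a minimal sketch that estimates state values from logged transitions with tabular TD(0). It is a deliberately simplified stand-in: the abstract describes an offline deep reinforcement learning method, whereas this sketch uses a lookup table. The state encoding `(time_slot, zone)`, the function name `td0_values`, and the discount and step-size defaults are all assumptions for illustration.

```python
from collections import defaultdict


def td0_values(transitions, gamma=0.9, alpha=0.1, sweeps=200):
    """Estimate state values from an offline log with tabular TD(0).

    transitions: list of (state, reward, next_state) tuples, where a state
    is a hypothetical (time_slot, zone) pair and next_state is None when
    the logged episode ends.
    """
    values = defaultdict(float)
    for _ in range(sweeps):  # replay the fixed log repeatedly (offline)
        for state, reward, next_state in transitions:
            bootstrap = gamma * values[next_state] if next_state is not None else 0.0
            # Move the estimate toward the one-step bootstrapped target.
            values[state] += alpha * (reward + bootstrap - values[state])
    return dict(values)
```

For example, a log with one trip that earns 1.0 moving from `(0, "A")` to `(1, "B")`, followed by a final trip earning 2.0 from `(1, "B")`, yields values close to 2.8 and 2.0 respectively. A pricing or incentive rule could then compare such values across zones, although how the paper does this is not stated in the abstract.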
Synthesizing mixed-integer linear programming models from natural language descriptions
Numerous real-world decision-making problems can be formulated and solved
using Mixed-Integer Linear Programming (MILP) models. However, the
transformation of these problems into MILP models heavily relies on expertise
in operations research and mathematical optimization, which restricts
non-experts' accessibility to MILP. To address this challenge, we propose a
framework for automatically formulating MILP models from unstructured natural
language descriptions of decision problems, which integrates Large Language
Models (LLMs) and mathematical modeling techniques. This framework consists of
three phases: i) identification of decision variables, ii) classification of
objective and constraints, and iii) finally, generation of MILP models.
In this study, we present a constraint classification scheme and a set of
constraint templates that can guide the LLMs in synthesizing a complete MILP
model. After fine-tuning LLMs, our approach can identify and synthesize logic
constraints in addition to classic demand and resource constraints. The logic
constraints have not been studied in existing work.
To evaluate the performance of the proposed framework, we extend the NL4Opt
dataset with more problem descriptions and constraint types, and with the new
dataset, we compare our framework with one-step model generation methods
offered by LLMs. The experimental results reveal that with respect to the
accuracies of generating the correct model, objective, and constraints, our
method, which integrates constraint classification and templates with LLMs,
significantly outperforms the others. The prototype system that we developed
has great potential to capture more constraints for more complex MILPs. It
opens up opportunities for developing training tools for operations research
practitioners and has the potential to be a powerful tool for automatic
decision problem modeling and solving in practice.
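To show what a MILP with the three constraint types named in the abstract above (resource, demand, and logic) might look like, here is a small hand-written example solved by plain enumeration. Everything in it is hypothetical: the furniture scenario, the coefficients, and the helper `solve_small_milp` are invented for illustration and are not taken from the paper or from the NL4Opt dataset. Enumeration only works for tiny bounded problems; a real system would pass the model to a MILP solver.

```python
from itertools import product


def solve_small_milp(objective, constraints, bounds):
    """Maximize a linear objective over bounded integer points by enumeration.

    constraints: list of (coefficients, rhs) meaning sum(c_i * x_i) <= rhs.
    bounds: list of inclusive (low, high) integer bounds, one per variable.
    """
    best_value, best_point = None, None
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    for point in product(*ranges):
        feasible = all(
            sum(c * x for c, x in zip(coeffs, point)) <= rhs
            for coeffs, rhs in constraints
        )
        if feasible:
            value = sum(c * x for c, x in zip(objective, point))
            if best_value is None or value > best_value:
                best_value, best_point = value, point
    return best_value, best_point


# Hypothetical description: "A workshop earns 2 per chair and 5 per table.
# A chair takes 1 labor hour, a table 2, and 8 hours are available. At least
# 2 chairs are on order. Making any tables incurs a one-off setup cost of 2."
# Variables: x = chairs, y = tables, z = 1 if any tables are made (binary).
objective = [2, 5, -2]
constraints = [
    ([1, 2, 0], 8),    # resource: x + 2y <= 8
    ([-1, 0, 0], -2),  # demand:   x >= 2
    ([0, 1, -4], 0),   # logic (big-M with M = 4): y <= 4z
]
bounds = [(0, 8), (0, 4), (0, 1)]

best_value, best_point = solve_small_milp(objective, constraints, bounds)
```

The third constraint is the logic constraint: it links the integer quantity `y` to the binary indicator `z`, so tables can only be produced if the setup cost is paid.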