6,011 research outputs found
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Aligning language models (LMs) with preferences is an important problem in
natural language generation. A key challenge is that preferences are typically
provided at the sequence level while LM training and generation both occur at
the token level. There is, therefore, a granularity mismatch between the
preference and the LM training losses, which may complicate the learning
problem. In this paper, we address this issue by developing an alternate
training process, where we iterate between grounding the sequence-level
preference into token-level training guidance, and improving the LM with the
learned guidance. For guidance learning, we design a framework that extends
pairwise-preference learning in imitation learning both to variable-length LM
generation and to preferences among multiple generations. For LM training,
based on the amount of supervised data, we present two minimalist learning
objectives that utilize the learned guidance. In experiments, our method
performs competitively on two distinct representative LM tasks:
discrete-prompt generation and text summarization.
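The alternating scheme described above can be sketched as a toy loop: fit token-level guidance weights from sequence-level preference pairs, then use them to reweight token losses. The Bradley-Terry-style logistic fit, the feature vectors, and all names below are illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each "sequence" is a vector of 8 token features; an unknown
# per-token reward r induces the sequence-level preference labels.
n_tokens = 8
r = rng.normal(size=n_tokens)
pairs = []
for _ in range(300):
    a, b = rng.normal(size=n_tokens), rng.normal(size=n_tokens)
    pairs.append((a, b, 1.0 if r @ a > r @ b else 0.0))

def learn_token_guidance(pairs, n_tokens, lr=0.05, epochs=20):
    """Ground sequence-level preferences into token-level weights w via a
    logistic (Bradley-Terry-style) fit on sequence score differences."""
    w = np.zeros(n_tokens)
    for _ in range(epochs):
        for a, b, pref in pairs:
            p = 1.0 / (1.0 + np.exp(-(w @ a - w @ b)))  # P(a preferred)
            w += lr * (pref - p) * (a - b)              # logistic gradient step
    return w

w = learn_token_guidance(pairs, n_tokens)
# The learned token-level guidance reproduces the sequence-level ranking:
acc = np.mean([(w @ a > w @ b) == bool(pref) for a, b, pref in pairs])
```

In a second phase, `w` would reweight per-token training losses before repeating the cycle, which is the alternation the abstract describes.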
Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture
In this paper, we propose a deep-learning-based vehicle trajectory prediction
technique that can generate the future trajectory sequence of surrounding
vehicles in real time. We employ an encoder-decoder architecture that
analyzes the pattern underlying the past trajectory using a long
short-term memory (LSTM) encoder and generates the future trajectory
sequence using an LSTM decoder. This structure produces the most
likely trajectory candidates over an occupancy grid map by employing the beam
search technique, which keeps the locally best candidates from the decoder
output. Experiments conducted on highway traffic scenarios show that the
prediction accuracy of the proposed method is significantly higher than that of
conventional trajectory-prediction techniques.
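The beam-search step over grid cells can be sketched independently of the LSTM itself. The `step_fn` below is a hypothetical stand-in for the trained decoder's per-step log-probabilities over occupancy-grid cells; the 5-cell grid and its transition preferences are invented for illustration.

```python
import numpy as np

def beam_search(step_fn, init_state, beam_width, horizon):
    """Keep the `beam_width` locally best partial trajectories each step."""
    beams = [(0.0, [], init_state)]  # (cumulative log-prob, cells so far, state)
    for _ in range(horizon):
        candidates = []
        for logp, traj, state in beams:
            cell_logps = step_fn(state)  # log-prob of each grid cell next
            for cell, lp in enumerate(cell_logps):
                candidates.append((logp + lp, traj + [cell], cell))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]  # prune to the locally best
    return beams

# Hypothetical decoder step: on a 5-cell grid, strongly favor moving to
# the next cell, standing in for the LSTM decoder's output distribution.
def step_fn(state, n_cells=5):
    scores = np.where(np.arange(n_cells) == (state + 1) % n_cells, 5.0, 0.0)
    return scores - np.log(np.exp(scores).sum())  # log-softmax

beams = beam_search(step_fn, init_state=0, beam_width=3, horizon=3)
best_logp, best_traj, _ = beams[0]
# best_traj follows the favored path 0 -> 1 -> 2 -> 3, i.e. [1, 2, 3]
```

Pruning to the top `beam_width` candidates at every step is what the abstract means by keeping the "locally best" candidates, trading global optimality for real-time cost.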
Learning the Tangent Space of Dynamical Instabilities from Data
For a large class of dynamical systems, the optimally time-dependent (OTD)
modes, a set of deformable orthonormal tangent vectors that track directions of
instabilities along any trajectory, are known to depend "pointwise" on the
state of the system on the attractor, and not on the history of the trajectory.
We leverage the power of neural networks to learn this "pointwise" mapping from
phase space to OTD space directly from data. The result of the learning process
is a cartography of the directions associated with the strongest instabilities
in phase space. Implications for data-driven prediction and control of
dynamical instabilities are discussed.
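The pointwise map from a phase-space state to a set of orthonormal tangent vectors can be sketched with a small network whose raw output is orthonormalized. The two-layer MLP and the QR-based orthonormalization here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def otd_map(x, params, n_modes):
    """Hypothetical pointwise map: phase-space point x -> n_modes
    orthonormal tangent vectors (the columns of the returned matrix)."""
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)                    # hidden features
    V = (W2 @ h + b2).reshape(x.size, n_modes)  # raw tangent directions
    Q, _ = np.linalg.qr(V)                      # enforce orthonormality
    return Q

d, n_modes, hidden = 3, 2, 16
params = (rng.normal(size=(hidden, d)), np.zeros(hidden),
          rng.normal(size=(d * n_modes, hidden)), np.zeros(d * n_modes))
x = rng.normal(size=d)  # a point on the attractor
Q = otd_map(x, params, n_modes)
# Q.T @ Q is (numerically) the 2x2 identity: the modes are orthonormal
```

Because the map depends only on `x` and not on trajectory history, it realizes the "pointwise" dependence the abstract emphasizes: evaluating the network at any state yields the local instability directions directly.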