Understanding trajectory diversity is a fundamental aspect of addressing
practical traffic tasks. However, capturing the diversity of trajectories
presents challenges, particularly with traditional machine learning and
recurrent neural networks due to the requirement of large-scale parameters. The
emerging Transformer technology, renowned for its parallel computation
capabilities enabling the utilization of models with hundreds of millions of
parameters, offers a promising solution. In this study, we apply the
Transformer architecture to traffic tasks, aiming to learn the diversity of
trajectories within vehicle populations. We analyze the Transformer's attention
mechanism and its adaptability to the goals of traffic tasks, and subsequently,
design specific pre-training tasks. To achieve this, we create a data structure
tailored to the attention mechanism and introduce a set of noises that
correspond to spatio-temporal demands, which are incorporated into the
structured data during the pre-training process. The designed pre-training
model demonstrates excellent performance in capturing the spatial distribution
of the vehicle population, with no instances of vehicle overlap and an RMSE of
0.6059 when compared to the ground truth values. In the context of time series
prediction, approximately 95% of the predicted trajectories' speeds closely
align with the true speeds, within a deviation of 7.5144m/s. Furthermore, in
the stability test, the model exhibits robustness by continuously predicting a
time series ten times longer than the input sequence, delivering smooth
trajectories and showcasing diverse driving behaviors. The pre-trained model
also provides a good basis for downstream fine-tuning tasks. The number of
parameters of our model is over 50 million.Comment: 16 pages, 6 figures, under reviewed by Transportation Research Board
Annual Meeting, work in updat