5 research outputs found

    Another Vertical View: A Hierarchical Network for Heterogeneous Trajectory Prediction via Spectrums

    With the rapid development of AI-related techniques, applications of trajectory prediction are no longer limited to simple scenes and trajectories. More and more heterogeneous trajectories with different representation forms, such as 2D or 3D coordinates, 2D or 3D bounding boxes, and even high-dimensional human skeletons, need to be analyzed and forecasted. In these heterogeneous trajectories, the interactions between different elements within a single trajectory frame, which we call "dimension-wise interactions", are more complex and challenging. However, most previous approaches focus on one specific form of trajectory, so they cannot forecast heterogeneous trajectories, let alone model dimension-wise interactions. Besides, previous methods mostly treat trajectory prediction as an ordinary time-sequence generation task, which means they may require extra work to directly analyze agents' behaviors and social interactions at different temporal scales. In this paper, we bring a new "view" to trajectory prediction: we model and forecast trajectories hierarchically according to different frequency portions in the spectral domain, learning to forecast trajectories by considering their frequency responses. Moreover, we expand the current trajectory prediction task by introducing the dimension M from "another view", thus extending its application scenarios to heterogeneous trajectories vertically. Finally, we adopt a bilinear structure to fuse two factors, the frequency response and the dimension-wise interaction, to forecast heterogeneous trajectories via spectrums hierarchically in a generic way. Experiments show that the proposed model outperforms most state-of-the-art methods on ETH-UCY, the Stanford Drone Dataset, and nuScenes with heterogeneous trajectories, including 2D coordinates and 2D and 3D bounding boxes.

    BGM: Building a Dynamic Guidance Map without Visual Images for Trajectory Prediction

    Visual images usually contain informative context about the environment, thereby helping to predict agents' behaviors. However, because their semantics are relatively fixed, they can hardly capture dynamic effects on agents' actual behaviors. To solve this problem, we propose a deterministic model named BGM that constructs a guidance map to represent dynamic semantics. The map avoids the use of visual images and reflects, for each agent, how activities differ across periods. We first record all agents' activities in the scene within a period close to the current moment to construct a guidance map, and then feed it to a Context CNN to obtain context features. We adopt a Historical Trajectory Encoder to extract trajectory features and combine them with the context features as the input of a social-energy-based trajectory decoder, thus obtaining predictions that meet social rules. Experiments demonstrate that BGM achieves state-of-the-art prediction accuracy on the two widely used ETH and UCY datasets and handles more complex scenarios.
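
    The idea of recording recent agent activity into a map, without any image input, can be illustrated with a small sketch. This is not the BGM implementation: the abstract does not say how the guidance map is rasterized, so the function name `build_guidance_map`, the grid resolution, the scene bounds, and the peak normalization below are all assumptions made for illustration. The sketch simply counts how often recently observed positions fall into each grid cell.

```python
def build_guidance_map(tracks, x_range, y_range, grid=(4, 4)):
    """Rasterize recent agent positions into a normalized activity grid.

    tracks  : list of trajectories, each a list of (x, y) positions
              observed in a recent time window (assumed input format).
    x_range : (x_min, x_max) scene bounds, assumed known.
    y_range : (y_min, y_max) scene bounds, assumed known.
    grid    : (rows, cols) resolution of the map (hypothetical choice).
    """
    rows, cols = grid
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    counts = [[0] * cols for _ in range(rows)]

    for track in tracks:
        for x, y in track:
            # Ignore positions that fall outside the scene bounds.
            if not (x_min <= x <= x_max and y_min <= y <= y_max):
                continue
            # Map the position to a cell; clamp the upper edge into the last cell.
            col = min(int((x - x_min) / (x_max - x_min) * cols), cols - 1)
            row = min(int((y - y_min) / (y_max - y_min) * rows), rows - 1)
            counts[row][col] += 1

    # Normalize by the busiest cell so values lie in [0, 1].
    peak = max(max(r) for r in counts)
    if peak == 0:
        return [[0.0] * cols for _ in range(rows)]
    return [[c / peak for c in r] for r in counts]


# Two short tracks observed in a 4 m x 4 m scene.
recent_tracks = [[(0.5, 0.5), (1.5, 0.5)], [(0.6, 0.4)]]
guidance = build_guidance_map(recent_tracks, (0.0, 4.0), (0.0, 4.0))
```

    Rebuilding the grid from a sliding time window would make it change as activity in the scene changes, which is the "dynamic" property the abstract attributes to the guidance map. In the paper the map is then passed to a Context CNN, which is not reproduced here.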

    View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums

    Understanding and forecasting the future trajectories of agents is critical for behavior analysis, robot navigation, autonomous cars, and other related applications. Previous methods mostly treat trajectory prediction as time-sequence generation. In contrast, this work studies agents' trajectories from a "vertical" view, i.e., modeling and forecasting trajectories in the spectral domain. Different frequency bands in the trajectory spectrums can hierarchically reflect agents' motion preferences at different scales: the low-frequency and high-frequency portions represent their coarse motion trends and fine motion variations, respectively. Accordingly, we propose a hierarchical network, V^2-Net, which contains two sub-networks, to hierarchically model and predict agents' trajectories with trajectory spectrums. The coarse-level keypoints estimation sub-network first predicts the "minimal" spectrums of agents' trajectories on several "key" frequency portions. Then the fine-level spectrum interpolation sub-network interpolates the spectrums to reconstruct the final predictions. Experimental results demonstrate the competitiveness and superiority of V^2-Net on both the ETH-UCY benchmark and the Stanford Drone Dataset. Comment: Accepted to ECCV 202
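
    The claim that low-frequency bands carry the coarse motion trend and high-frequency bands carry fine variations can be checked on a toy signal. The sketch below is not V^2-Net and contains no learned components: it applies a plain discrete Fourier transform to one coordinate of a trajectory, keeps only the lowest frequency bins, and treats the remainder as the fine part. The helper names (`dft`, `idft`, `low_pass`, `coarse_and_fine`) and the cut-off `keep` are hypothetical choices for illustration.

```python
import cmath
import math


def dft(signal):
    """Discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def low_pass(spectrum, keep):
    """Keep the DC bin and the `keep` lowest positive and negative frequencies."""
    n = len(spectrum)
    return [c if (k <= keep or k >= n - keep) else 0 for k, c in enumerate(spectrum)]


def coarse_and_fine(coords, keep=1):
    """Split one trajectory coordinate into a coarse trend and a fine residual."""
    spectrum = dft(coords)
    # The kept bins are conjugate-symmetric for real input, so the result is real
    # up to rounding error; the imaginary part is discarded.
    coarse = [v.real for v in idft(low_pass(spectrum, keep))]
    fine = [x - c for x, c in zip(coords, coarse)]
    return coarse, fine


# Toy x-coordinate: one slow oscillation plus a small frame-to-frame jitter.
xs = [math.sin(2 * math.pi * t / 8) + 0.1 * (-1) ** t for t in range(8)]
coarse, fine = coarse_and_fine(xs, keep=1)
print([round(c, 3) for c in coarse])
print([round(f, 3) for f in fine])
```

    With `keep=1` the coarse part recovers the slow oscillation and the fine part is the alternating jitter, because the jitter sits entirely in the highest frequency bin. In the paper, the coarse level predicts spectrums on a few key frequency portions and the fine level interpolates the rest with sub-networks; the fixed cut-off here only stands in for that hierarchy.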

    MSN: Multi-Style Network for Trajectory Prediction

    Trajectory prediction aims to forecast agents' possible future locations from their observed trajectories and the video context. It is needed by many autonomous platforms and tasks, such as tracking, detection, robot navigation, self-driving cars, and many other computer vision applications. Agents' internal personality factors, their interactive behaviors with neighbors, and the influence of their surroundings may all affect their future plans. However, many previous methods model and predict agents' behaviors with the same strategy or a "single" feature distribution, which makes it hard for them to give predictions with sufficient style differences. This manuscript proposes the Multi-Style Network (MSN), which uses two sub-networks, style hypothesis and stylized prediction, to adaptively give agents multi-style predictions in a novel categorical way. We use agents' end-point plans and their interaction context as the basis for behavior classification, so as to adaptively learn multiple diverse behavior styles through a series of style channels in the network. Then, we assume in turn that the target agents will plan their future behaviors according to each of these categorized styles, and use the different style channels to give a series of predictions with significant style differences in parallel. Experiments show that the proposed MSN outperforms current state-of-the-art methods by up to 10% - 20% quantitatively on two widely used datasets, and presents better multi-style characteristics qualitatively.