MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge - Motion Prediction
Abstract
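- This report presents a variant of MTR: a transformer-based multimodal trajectory prediction framework built on motion query pairs, which jointly optimizes intention localization and iterative motion refinement.
- Code is open source: https://github.com/sshaoshuai/MTR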
1 Introduction
- Prediction approaches:
  - goal-based strategy
  - directly predicting a set of future trajectories
2 Method
2.1 Context Encoding with Transformer Encoders
- Models the interactions among agents and encodes the road environment
Input representation
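- Uses polyline representations
- Agent features: $A \in \mathbb{R}^{N_a \times D}$
- Map features: $M \in \mathbb{R}^{N_m \times D}$
- $N_a$ is the number of agents, $N_m$ is the number of map polylines, and $D$ is the feature dimension.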
Scene context encoding with transformer encoder
- Takes the agent features and map features as input
- A set of self-attention modules is then applied to A and M to model the interactions among agents and to encode the scene environment features for the subsequent decoder network.
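The self-attention step above can be illustrated with a minimal sketch. It assumes the agent and map tokens are simply concatenated into one sequence and uses a single head with identity query/key/value projections. Both are simplifications for illustration, not the configuration of the released model, and the names `self_attention`, `agent_tokens`, and `map_tokens` are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections (no learned weights).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out

# Toy example: 2 agent tokens and 1 map-polyline token, feature dimension 2.
agent_tokens = [[1.0, 0.0], [0.0, 1.0]]
map_tokens = [[1.0, 1.0]]
scene_tokens = self_attention(agent_tokens + map_tokens)
```

Each output token is a weighted average of all input tokens, so every agent token can absorb information from other agents and from the map.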
2.2 Multimodal Trajectory Prediction
- Inspired by the concept of the object query in object detection
- Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In ECCV, 2020.
- A motion query pair is designed to model motion prediction; each pair consists of two parts, a static intention query and a dynamic searching query
Motion query pair for motion prediction
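- Generate $K$ representative intention points $I \in \mathbb{R}^{K \times 2}$, obtained by running k-means on the endpoints of the GT trajectories
- Static intention query: $Q_I = \text{MLP}(\text{PE}(I))$
- $\text{PE}(\cdot)$ is the sinusoidal position encoding
- Dynamic searching query: retrieves trajectory-specific features for iterative motion refinement; for the (j+1)-th layer, $Q_S^{j+1} = \text{MLP}(\text{PE}(Y_T^j))$
- $Y_T^j$ is the endpoint of the trajectory predicted by the j-th decoder layer, and $T$ is the number of frames in the predicted trajectory
- In addition, the 128 map polylines closest to the predicted trajectory are collected as road features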
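As a rough illustration of how intention points and their position encodings could be produced, the sketch below runs a plain k-means on 2D trajectory endpoints and applies a sinusoidal encoding to a 2D point. The deterministic initialization, the encoding dimension, and the function names `kmeans` and `sinusoidal_pe` are assumptions made for this example; the MLP that follows the encoding is omitted.

```python
import math

def kmeans(points, k, iters=20):
    # Plain Lloyd's k-means on 2D points.
    # Assumption: the first k points serve as initial centers.
    centers = [points[i] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2
                + (p[1] - centers[c][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Recompute each center as the cluster mean; keep empty clusters fixed.
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centers[c]
            for c, cl in enumerate(clusters)
        ]
    return centers

def sinusoidal_pe(xy, dim=8):
    # Sinusoidal encoding of a 2D point; dim must be divisible by 4.
    # Each coordinate gets dim // 2 features (interleaved sin/cos).
    per_coord = dim // 2
    feats = []
    for v in xy:
        for i in range(per_coord // 2):
            freq = 1.0 / (10000 ** (2 * i / per_coord))
            feats += [math.sin(v * freq), math.cos(v * freq)]
    return feats

# Toy GT endpoints forming two obvious groups.
endpoints = [(0.0, 0.0), (0.2, 0.0), (10.0, 10.0), (10.2, 10.0)]
intention_points = sorted(kmeans(endpoints, k=2))
static_query_inputs = [sinusoidal_pe(p) for p in intention_points]
```

The same `sinusoidal_pe` could be applied to the predicted endpoint of the previous decoder layer to form the input of the dynamic searching query.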
Attention with motion query pair
Motion prediction head with GMM
- In the GMM loss, the target is the ground-truth (GT) trajectory point.
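To make the role of the GT trajectory point concrete, here is a minimal sketch of the negative log-likelihood of a single bivariate Gaussian component evaluated at a GT point. It assumes each component is parameterized by a mean, per-axis standard deviations, and a correlation coefficient, and that training minimizes this NLL. These notes do not spell out the loss, so treat the parameterization and the function name `gaussian_nll` as assumptions.

```python
import math

def gaussian_nll(gt, mu, sigma, rho):
    # Negative log-likelihood of a 2D Gaussian at the GT point.
    # gt, mu: (x, y); sigma: (sigma_x, sigma_y); rho: correlation in (-1, 1).
    dx, dy = gt[0] - mu[0], gt[1] - mu[1]
    sx, sy = sigma
    one_minus_rho2 = 1.0 - rho * rho
    log_norm = math.log(2.0 * math.pi * sx * sy * math.sqrt(one_minus_rho2))
    quad = (dx / sx) ** 2 + (dy / sy) ** 2 - 2.0 * rho * dx * dy / (sx * sy)
    return log_norm + quad / (2.0 * one_minus_rho2)

# A prediction that hits the GT point has a lower loss than one that misses.
loss_hit = gaussian_nll((1.0, 2.0), (1.0, 2.0), (1.0, 1.0), 0.0)
loss_miss = gaussian_nll((1.0, 2.0), (3.0, 2.0), (1.0, 1.0), 0.0)
```

With unit variances and zero correlation, the loss at the mean reduces to log(2π), and it grows quadratically with the distance between the predicted mean and the GT point.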
2.3 Model Ensemble
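To boost the final performance, a model ensemble strategy is proposed.
Given $N$ trained models, each model produces 6 trajectory modes, giving $6N$ trajectories in total. The top 6 trajectories are then selected by confidence, applying non-maximum suppression (NMS) to the trajectory endpoints with a distance threshold that varies with the trajectory length L.
The MTR model with this ensemble strategy is called MTR-A; this is the main difference between this model and MTR.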
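The endpoint-based NMS used for the ensemble can be sketched as a greedy selection. The version below assumes all candidate trajectories from all models are pooled into one list with their confidences and that the distance threshold is passed in as a fixed number. How the threshold depends on the trajectory length L is not reproduced here, and the function name `ensemble_nms` and the value 2.5 are illustrative assumptions.

```python
import math

def ensemble_nms(trajs, scores, dist_thresh, top_k=6):
    # Greedy NMS on trajectory endpoints.
    # trajs: list of trajectories, each a list of (x, y) points.
    # scores: one confidence per trajectory.
    # Returns indices of kept trajectories, highest confidence first.
    order = sorted(range(len(trajs)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        end_i = trajs[i][-1]
        # Keep a candidate only if its endpoint is far from all kept endpoints.
        if all(math.dist(end_i, trajs[j][-1]) > dist_thresh for j in keep):
            keep.append(i)
        if len(keep) == top_k:
            break
    return keep

# Toy pool: two near-duplicate candidates and one distinct candidate.
pooled_trajs = [
    [(0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (0.5, 0.0)],
    [(0.0, 0.0), (10.0, 0.0)],
]
pooled_scores = [0.9, 0.8, 0.7]
kept = ensemble_nms(pooled_trajs, pooled_scores, dist_thresh=2.5)
```

In this toy pool, the second candidate is suppressed because its endpoint lies within the threshold of the higher-confidence first candidate, while the third survives.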