- This article summarizes end-to-end planning algorithms from industry and academia in 2022-2023:
- Tesla FSD V12
- Momenta 2023
- Horizon Robotics 2023
- Motional RoboTaxi 2022
- Woven Planet (Toyota): Urban Driver
- Nvidia
- State-of-the-art academic studies
1. Introduction
- https://opendrivelab.com/e2ead/cvpr23
- https://youtu.be/OKDRsVXv49A?si=Y7dYYFXLqPorcxQL
- https://www.youtube.com/watch?v=hFQLJIvdQNU
- https://youtu.be/ZwhXilQKULY?si=nQjgx96PCiXld-uJ
- https://planning.l5kit.org/
- https://github.com/georgeliu233/OPGP
- https://github.com/autonomousvision/tuplan_garage
2. INDUSTRY
A. Tesla FSD V12 2023
- The CVPR 2023 Workshop on End-to-End Autonomous Driving released a video on Tesla's end-to-end planning technology
- An end-to-end occupancy network in BEV space for planning reduces the dependence on HD maps, but this kind of approach requires large amounts of data
- Few further details have been made public so far
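As a rough illustration of how a BEV occupancy output can feed a planner (a toy sketch, not Tesla's implementation; the grid size, resolution, and selection rule are all assumptions), candidate ego trajectories can be filtered against predicted occupancy:

```python
import numpy as np

def trajectory_is_free(occupancy, trajectory, resolution=1.0):
    """True if no waypoint of the trajectory lands in an occupied BEV cell."""
    h, w = occupancy.shape
    for x, y in trajectory:
        i, j = int(y / resolution), int(x / resolution)
        if not (0 <= i < h and 0 <= j < w):
            return False  # off-grid waypoints treated as unsafe
        if occupancy[i, j]:
            return False  # waypoint collides with predicted occupancy
    return True

def select_trajectory(occupancy, candidates):
    """Return the first collision-free candidate (candidates in priority order)."""
    for traj in candidates:
        if trajectory_is_free(occupancy, traj):
            return traj
    return None

# Demo: a 10x10 BEV grid with an obstacle blocking the straight path.
grid = np.zeros((10, 10), dtype=int)
grid[5, 4:7] = 1                                   # occupied cells at y=5, x=4..6
straight = [(5.0, float(y)) for y in range(10)]    # hits the obstacle at (5, 5)
swerve = [(2.0, float(y)) for y in range(10)]      # passes left of the obstacle
print(select_trajectory(grid, [straight, swerve]) is swerve)  # True
```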
B. Momenta 2023
- A public video talk at the CVPR 2023 Workshop on Autonomous Driving, titled "How Data-Driven Flywheel Enables Scalable Path to Full Autonomy"
C. Horizon Robotics 2023
Horizon Robotics' solution won 2nd place in the nuPlan Challenge 2023; they proposed a heatmap-based planning representation
"Imitation with spatial-temporal heatmap: 2nd place solution for nuPlan challenge," arXiv preprint arXiv:2306.15700, 2023
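The heatmap representation can be sketched as follows (illustrative only; the grid resolution and the argmax decoding rule are assumptions, not details of the Horizon solution): the network outputs per-cell logits over a BEV grid, and the plan target is decoded from the heatmap peak:

```python
import numpy as np

def softmax_heatmap(logits):
    """Normalize raw per-cell logits into a probability heatmap."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def decode_endpoint(heatmap, resolution=0.5):
    """Decode the plan endpoint as the metric (x, y) of the heatmap peak."""
    i, j = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return (j * resolution, i * resolution)

# Demo: a peaked logit map whose maximum sits at row 3, column 8.
logits = np.zeros((16, 16))
logits[3, 8] = 5.0
heatmap = softmax_heatmap(logits)
endpoint = decode_endpoint(heatmap)
print(endpoint)  # (4.0, 1.5)
```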
D. Motional L4-RoboTaxi 2022
- Uses IRL (Inverse Reinforcement Learning), but the details of the approach are not public, and it is not strictly an end-to-end solution
- “Driving in real life with inverse reinforcement learning,” arXiv preprint arXiv:2206.03004, 2022
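Although Motional's system is not public, the general shape of maximum-entropy IRL trajectory scoring can be sketched (the weights and features below are invented for illustration): each candidate plan gets a linear reward w·phi(traj), and candidates are ranked by softmax probability:

```python
import math

def linear_reward(weights, features):
    """IRL reward as a learned linear combination of trajectory features."""
    return sum(w * f for w, f in zip(weights, features))

def maxent_probabilities(weights, feature_sets):
    """Max-ent distribution over candidates: P(traj) proportional to exp(reward)."""
    scores = [linear_reward(weights, f) for f in feature_sets]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Demo: features = (progress, comfort, safety); candidate 0 dominates.
weights = [1.0, 0.5, 2.0]
candidates = [[1.0, 0.8, 0.9], [0.6, 0.9, 0.2], [1.0, 0.1, 0.0]]
probs = maxent_probabilities(weights, candidates)
print(max(range(3), key=lambda i: probs[i]))  # 0
```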
E. Woven Planet (Toyota): Urban Driver
- Urban Driver (2022) has become a widely used comparison baseline
- Uses closed-loop training, mid-level representations, and a data-driven simulator
- The approach was proposed relatively early, so its performance may no longer be competitive
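A minimal sketch of the closed-loop rollout idea, contrasted with open-loop behavior cloning (the 1-D dynamics and proportional policy are toy assumptions, not L5Kit code): in closed loop the policy consumes states produced by its own actions, so compounding errors appear in training just as in deployment:

```python
def rollout_closed_loop(policy, dynamics, state, horizon):
    """Closed loop: each action is taken from the state the previous action produced."""
    states = [state]
    for _ in range(horizon):
        action = policy(states[-1])
        states.append(dynamics(states[-1], action))
    return states

def rollout_open_loop(policy, dynamics, logged_states, state):
    """Open loop (behavior-cloning style): actions come from logged expert states,
    so the policy never sees the consequences of its own mistakes."""
    states = [state]
    for logged in logged_states:
        states.append(dynamics(states[-1], policy(logged)))
    return states

# Demo with 1-D position dynamics: a proportional policy chasing target 10.
dynamics = lambda s, a: s + a
policy = lambda s: 0.5 * (10.0 - s)
closed = rollout_closed_loop(policy, dynamics, 0.0, 5)
print(closed[-1])  # 9.6875
```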
F. Nvidia
- Tree-structured Policy Planning
2023, uses two trees: an ego trajectory tree for ego trajectory options, and a scenario tree for multi-modal ego-conditioned environment predictions
- "Tree-structured policy planning with learned behavior models," arXiv preprint arXiv:2301.11902, 2023.
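A toy version of the two-tree evaluation (the options, scenario modes, probabilities, and costs below are all invented for illustration): each ego branch is scored against every scenario branch, weighted by the scenario probabilities, and the branch with the lowest expected cost wins:

```python
def expected_cost(ego_branch, scenario_tree, cost_fn):
    """Expected cost of one ego option over the scenario tree's predicted modes."""
    return sum(p * cost_fn(ego_branch, scenario) for scenario, p in scenario_tree)

def plan(ego_tree, scenario_tree, cost_fn):
    """Pick the ego trajectory branch with minimum expected cost."""
    return min(ego_tree, key=lambda b: expected_cost(b, scenario_tree, cost_fn))

# Demo: two ego options against two predicted modes of a neighboring vehicle
# ("cut_in" with p=0.7, "stay" with p=0.3).
costs = {("keep", "cut_in"): 10.0, ("keep", "stay"): 1.0,
         ("yield", "cut_in"): 2.0, ("yield", "stay"): 2.0}
cost_fn = lambda ego, scenario: costs[(ego, scenario)]
scenario_tree = [("cut_in", 0.7), ("stay", 0.3)]
best = plan(["keep", "yield"], scenario_tree, cost_fn)
print(best)  # yield
```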
- Differentiable Tree Policy Planning (DTPP) 2023: uses a query-centric Transformer model, and in the planning part a learnable context-aware cost function with latent interaction features
- Two conclusions:
- Joint training delivers significantly better performance than separate training of the two modules
- Tree-structured policy planning outperforms the conventional single-stage planning approach
- DTPP consists of three main modules:
- Conditional prediction: generates the predicted trajectories of all vehicles, taking the ego vehicle and the other vehicles as inputs
- Scoring module: inverse reinforcement learning is used to learn the scoring of the predicted trajectories
- Tree policy search: used to explore the various candidate trajectories
- Some advantages of DTPP:
- Conditional prediction captures the game-theoretic interaction effect efficiently during planning
- DTPP is differentiable, so gradients flow through it and prediction and IRL can be trained together, which is also a necessary condition for end-to-end autonomous driving
- Tree policy planning provides a degree of interactive reasoning capability
- DTPP uses one-time encoding and multiple lightweight decodes to effectively reduce computation latency
- DTPP shows state-of-the-art learning-based planning and remarkable closed-loop planning performance
- Possible improvements to DTPP:
- Prediction models can be used to guide rule-based generation of candidate trajectories
- Conditional joint prediction accounts for interaction: it generates predictions for each candidate trajectory, which improves performance through gaming effects
- Conditional joint prediction can be merged with inverse reinforcement learning to score the trajectories and select the best one
- Related paper: "DTPP: Differentiable joint conditional prediction and cost evaluation for tree policy planning in autonomous driving," arXiv preprint arXiv:2310.05885, 2023.
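The latency argument behind one-time encoding plus multiple lightweight decodes can be demonstrated with a call-counting sketch (the encoder/decoder bodies are stand-ins, not DTPP's networks): the heavy scene encoding runs once per planning cycle, while a cheap decode is invoked once per tree node:

```python
class Counter:
    """Wraps a function and counts how many times it is invoked."""
    def __init__(self, fn):
        self.fn, self.calls = fn, 0
    def __call__(self, *args):
        self.calls += 1
        return self.fn(*args)

# Stand-in cost model: one "expensive" scene encoding, one "lightweight"
# decode per node of the trajectory tree.
encode = Counter(lambda scene: [x * 2 for x in scene])
decode = Counter(lambda feats, node: sum(feats) + node)

def score_tree(scene, nodes):
    feats = encode(scene)  # runs exactly once, whatever the tree size
    return {node: decode(feats, node) for node in nodes}

scores = score_tree([1, 2, 3], nodes=range(8))
print(encode.calls, decode.calls)  # 1 8
```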
3. ACADEMIA
A. Occupancy Prediction Planning
- Original paper: "Occupancy prediction-guided neural planner for autonomous driving," arXiv preprint arXiv:2305.03303, 2023.
- Two stages:
- Stage 1: an integrated learning-based framework with Transformer backbones is designed for comprehensive occupancy predictions and multi-modal planning objectives
- Stage 2: a transformed occupancy-guided optimization, built upon a curvilinear frame, achieves direct planning refinement through handcrafted cost function designs
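A minimal sketch of the stage-2 idea, assuming a 1-D lateral-offset search in the curvilinear frame and a made-up handcrafted cost (obstacle proximity plus deviation from the stage-1 proposal; the weights and candidate grid are arbitrary):

```python
def refinement_cost(d, proposal, obstacle, w_obs=1.0, w_dev=0.1):
    """Handcrafted cost over lateral offsets in a curvilinear frame:
    an obstacle-proximity penalty plus deviation from the learned proposal."""
    return w_obs / (abs(d - obstacle) + 0.1) + w_dev * abs(d - proposal)

def refine(proposal, obstacle, candidates):
    """Stage 2: grid-search lateral offsets for the minimum handcrafted cost."""
    return min(candidates, key=lambda d: refinement_cost(d, proposal, obstacle))

# Demo: the learned proposal sits on an obstacle at offset 0; refinement
# nudges the plan to a nearby free offset.
refined = refine(proposal=0.0, obstacle=0.0,
                 candidates=[-1.0, -0.5, 0.0, 0.5, 1.0])
print(refined)  # -1.0
```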
B. UniAD 2023
- The UniAD framework:
- Constructs task-specific queries and jointly optimizes multiple tasks
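A conceptual sketch of task-specific queries over a shared scene representation (single-head dot-product attention on random features; the shapes, task names, and the toy joint objective are assumptions, not the UniAD architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.normal(size=(6, 4))               # shared scene tokens (6 tokens, dim 4)
queries = {"track": rng.normal(size=(2, 4)),  # task-specific learnable queries
           "map": rng.normal(size=(3, 4)),
           "plan": rng.normal(size=(1, 4))}

def attend(q, keys):
    """Single-head dot-product attention: each query pools the scene tokens."""
    logits = q @ keys.T
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ keys

# Every task reads the same scene features; one summed loss trains them jointly.
task_feats = {name: attend(q, scene) for name, q in queries.items()}
joint_loss = sum(float((f ** 2).mean()) for f in task_feats.values())
print(sorted(task_feats))  # ['map', 'plan', 'track']
```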
C. NTU planning
- “Conditional predictive behavior planning with inverse reinforcement learning for human-like autonomous driving,” IEEE Transactions on Intelligent Transportation Systems, 2023
- Three steps:
- 1) a behavior generation module that produces a diverse set of candidate behaviors in the form of trajectory proposals
- 2) a conditional prediction module that predicts future trajectories of other agents based on each proposal
- 3) a scoring module that evaluates the candidate plans using maximum entropy inverse reinforcement learning
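The three steps above can be sketched end to end (every component here is a toy stand-in: target speeds as "behaviors", a hand-written yield model as the conditional predictor, and fixed linear weights in place of learned IRL weights):

```python
def generate_proposals():
    """Step 1: a diverse set of candidate ego behaviors (here, target speeds)."""
    return [2.0, 5.0, 8.0]

def conditional_prediction(proposal, lead_speed=4.0):
    """Step 2 (toy): the lead agent yields slightly when the ego is faster."""
    return lead_speed - 0.2 * max(0.0, proposal - lead_speed)

def irl_score(proposal, predicted_lead, w_progress=1.0, w_gap=2.0):
    """Step 3: linear (IRL-style) reward: progress minus closing-speed risk."""
    return w_progress * proposal - w_gap * max(0.0, proposal - predicted_lead)

def plan():
    """Score every proposal against its own conditional prediction."""
    return max(generate_proposals(),
               key=lambda p: irl_score(p, conditional_prediction(p)))

print(plan())  # 5.0
```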
D. nuPlan Planning Challenge CVPR 2023
Combines short-term planning with long-term prediction, achieving the best result in the nuPlan challenge
"Parting with misconceptions about learning-based vehicle motion planning," arXiv preprint arXiv:2306.07962, 2023.
Summary
This article gives a basic overview of planning algorithms, but the analysis of each algorithm is not very deep: most entries only sketch the overall framework, and there are even many errors in the figures and citations. Specific frameworks and problems are worth digging into further.