- This article summarizes end-to-end planning algorithms from industry and academia in 2022-2023:
- Tesla FSD V12
- Momenta 2023
- Horizon Robotics 2023
- Motional RoboTaxi 2022
- Woven Planet (Toyota): Urban Driver
- Nvidia
- State-of-the-art academic studies
1. Introduction
- https://opendrivelab.com/e2ead/cvpr23
- https://youtu.be/OKDRsVXv49A?si=Y7dYYFXLqPorcxQL
- https://www.youtube.com/watch?v=hFQLJIvdQNU
- https://youtu.be/ZwhXilQKULY?si=nQjgx96PCiXld-uJ
- https://planning.l5kit.org/
- https://github.com/georgeliu233/OPGP
- https://github.com/autonomousvision/tuplan_garage
2. INDUSTRY
A. Tesla FSD V12 2023
- The CVPR 2023 Workshop on End-to-End Autonomous Driving released a video on Tesla's end-to-end planning technology
- An end-to-end occupancy network in BEV space for planning reduces the dependence on HD maps, but this kind of approach requires large amounts of data
- Few further details have been made public so far
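As a rough illustration of how a BEV occupancy output can feed a planner (a toy sketch, not Tesla's implementation; the grid size, resolution, and selection rule are all assumptions), candidate ego trajectories can be filtered against predicted occupancy:

```python
import numpy as np

def trajectory_is_free(occupancy, trajectory, resolution=1.0):
    """True if no waypoint of the trajectory lands in an occupied BEV cell."""
    h, w = occupancy.shape
    for x, y in trajectory:
        i, j = int(y / resolution), int(x / resolution)
        if not (0 <= i < h and 0 <= j < w):
            return False  # off-grid waypoints treated as unsafe
        if occupancy[i, j]:
            return False  # waypoint collides with predicted occupancy
    return True

def select_trajectory(occupancy, candidates):
    """Return the first collision-free candidate (candidates in priority order)."""
    for traj in candidates:
        if trajectory_is_free(occupancy, traj):
            return traj
    return None

# Demo: a 10x10 BEV grid with an obstacle blocking the straight path.
grid = np.zeros((10, 10), dtype=int)
grid[5, 4:7] = 1                                   # occupied cells at y=5, x=4..6
straight = [(5.0, float(y)) for y in range(10)]    # hits the obstacle at (5, 5)
swerve = [(2.0, float(y)) for y in range(10)]      # passes left of the obstacle
print(select_trajectory(grid, [straight, swerve]) is swerve)  # True
```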
B. Momenta 2023
- A public video talk at the CVPR 2023 Workshop on Autonomous Driving, titled "How Data-Driven Flywheel Enables Scalable Path to Full Autonomy"
C. Horizon Robotics 2023
Horizon Robotics' solution won 2nd place in the nuPlan Challenge 2023; they proposed a heatmap-based planning representation
"Imitation with spatial-temporal heatmap: 2nd place solution for nuPlan challenge," arXiv preprint arXiv:2306.15700, 2023
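The heatmap representation can be sketched as follows (illustrative only; the grid resolution and the argmax decoding rule are assumptions, not details of the Horizon solution): the network outputs per-cell logits over a BEV grid, and the plan target is decoded from the heatmap peak:

```python
import numpy as np

def softmax_heatmap(logits):
    """Normalize raw per-cell logits into a probability heatmap."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def decode_endpoint(heatmap, resolution=0.5):
    """Decode the plan endpoint as the metric (x, y) of the heatmap peak."""
    i, j = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return (j * resolution, i * resolution)

# Demo: a peaked logit map whose maximum sits at row 3, column 8.
logits = np.zeros((16, 16))
logits[3, 8] = 5.0
heatmap = softmax_heatmap(logits)
endpoint = decode_endpoint(heatmap)
print(endpoint)  # (4.0, 1.5)
```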
D. Motional L4-RoboTaxi 2022
- Uses IRL (Inverse Reinforcement Learning), but the details of the approach are not public, and it is not strictly an end-to-end solution
- “Driving in real life with inverse reinforcement learning,” arXiv preprint arXiv:2206.03004, 2022
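Although Motional's system is not public, the general shape of maximum-entropy IRL trajectory scoring can be sketched (the weights and features below are invented for illustration): each candidate plan gets a linear reward w·phi(traj), and candidates are ranked by softmax probability:

```python
import math

def linear_reward(weights, features):
    """IRL reward as a learned linear combination of trajectory features."""
    return sum(w * f for w, f in zip(weights, features))

def maxent_probabilities(weights, feature_sets):
    """Max-ent distribution over candidates: P(traj) proportional to exp(reward)."""
    scores = [linear_reward(weights, f) for f in feature_sets]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Demo: features = (progress, comfort, safety); candidate 0 dominates.
weights = [1.0, 0.5, 2.0]
candidates = [[1.0, 0.8, 0.9], [0.6, 0.9, 0.2], [1.0, 0.1, 0.0]]
probs = maxent_probabilities(weights, candidates)
print(max(range(3), key=lambda i: probs[i]))  # 0
```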
E. Woven Planet (Toyota): Urban Driver
- Urban Driver (2022) has become a widely used comparison baseline
- Uses closed-loop training, mid-level representations, and a data-driven simulator
- The approach was proposed relatively early, so its performance may no longer be competitive
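A minimal sketch of the closed-loop rollout idea, contrasted with open-loop behavior cloning (the 1-D dynamics and proportional policy are toy assumptions, not L5Kit code): in closed loop the policy consumes states produced by its own actions, so compounding errors appear in training just as in deployment:

```python
def rollout_closed_loop(policy, dynamics, state, horizon):
    """Closed loop: each action is taken from the state the previous action produced."""
    states = [state]
    for _ in range(horizon):
        action = policy(states[-1])
        states.append(dynamics(states[-1], action))
    return states

def rollout_open_loop(policy, dynamics, logged_states, state):
    """Open loop (behavior-cloning style): actions come from logged expert states,
    so the policy never sees the consequences of its own mistakes."""
    states = [state]
    for logged in logged_states:
        states.append(dynamics(states[-1], policy(logged)))
    return states

# Demo with 1-D position dynamics: a proportional policy chasing target 10.
dynamics = lambda s, a: s + a
policy = lambda s: 0.5 * (10.0 - s)
closed = rollout_closed_loop(policy, dynamics, 0.0, 5)
print(closed[-1])  # 9.6875
```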
F. Nvidia
- Tree-structured Policy Planning
2023, uses two trees: an ego trajectory tree for ego trajectory options, and a scenario tree for multi-modal ego-conditioned environment predictions
- "Tree-structured policy planning with learned behavior models," arXiv preprint arXiv:2301.11902, 2023.
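A toy version of the two-tree evaluation (the options, scenario modes, probabilities, and costs below are all invented for illustration): each ego branch is scored against every scenario branch, weighted by the scenario probabilities, and the branch with the lowest expected cost wins:

```python
def expected_cost(ego_branch, scenario_tree, cost_fn):
    """Expected cost of one ego option over the scenario tree's predicted modes."""
    return sum(p * cost_fn(ego_branch, scenario) for scenario, p in scenario_tree)

def plan(ego_tree, scenario_tree, cost_fn):
    """Pick the ego trajectory branch with minimum expected cost."""
    return min(ego_tree, key=lambda b: expected_cost(b, scenario_tree, cost_fn))

# Demo: two ego options against two predicted modes of a neighboring vehicle
# ("cut_in" with p=0.7, "stay" with p=0.3).
costs = {("keep", "cut_in"): 10.0, ("keep", "stay"): 1.0,
         ("yield", "cut_in"): 2.0, ("yield", "stay"): 2.0}
cost_fn = lambda ego, scenario: costs[(ego, scenario)]
scenario_tree = [("cut_in", 0.7), ("stay", 0.3)]
best = plan(["keep", "yield"], scenario_tree, cost_fn)
print(best)  # yield
```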
- Differentiable Tree Policy Planning (DTPP) 2023: uses a query-centric Transformer model, and in the planning part a learnable context-aware cost function with latent interaction features
- Two conclusions:
- Joint training delivers significantly better performance than separate training of the two modules
- Tree-structured policy planning outperforms the conventional single-stage planning approach
- DTPP consists of three main modules:
- Conditional prediction: generates the predicted trajectories of all vehicles, taking the ego vehicle and the other vehicles as inputs
- Scoring module: inverse reinforcement learning is used to learn the scoring of the predicted trajectories
- Tree policy search: used to explore the various candidate trajectories
- Some advantages of DTPP:
- Conditional prediction captures the game-theoretic interaction effect efficiently during planning
- DTPP is differentiable, so gradients flow through it and prediction and IRL can be trained together, which is also a necessary condition for end-to-end autonomous driving
- Tree policy planning provides a degree of interactive reasoning capability
- DTPP uses one-time encoding and multiple lightweight decodes to effectively reduce computation latency
- DTPP shows state-of-the-art learning-based planning and remarkable closed-loop planning performance
- Possible improvements to DTPP:
- Prediction models can be used to guide rule-based generation of candidate trajectories
- Conditional joint prediction accounts for interaction: it generates predictions for each candidate trajectory, which improves performance through gaming effects
- Conditional joint prediction can be merged with inverse reinforcement learning to score the trajectories and select the best one
- Related paper: "DTPP: Differentiable joint conditional prediction and cost evaluation for tree policy planning in autonomous driving," arXiv preprint arXiv:2310.05885, 2023.
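The latency argument behind one-time encoding plus multiple lightweight decodes can be demonstrated with a call-counting sketch (the encoder/decoder bodies are stand-ins, not DTPP's networks): the heavy scene encoding runs once per planning cycle, while a cheap decode is invoked once per tree node:

```python
class Counter:
    """Wraps a function and counts how many times it is invoked."""
    def __init__(self, fn):
        self.fn, self.calls = fn, 0
    def __call__(self, *args):
        self.calls += 1
        return self.fn(*args)

# Stand-in cost model: one "expensive" scene encoding, one "lightweight"
# decode per node of the trajectory tree.
encode = Counter(lambda scene: [x * 2 for x in scene])
decode = Counter(lambda feats, node: sum(feats) + node)

def score_tree(scene, nodes):
    feats = encode(scene)  # runs exactly once, whatever the tree size
    return {node: decode(feats, node) for node in nodes}

scores = score_tree([1, 2, 3], nodes=range(8))
print(encode.calls, decode.calls)  # 1 8
```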
3. ACADEMIA
A. Occupancy Prediction Planning
- Original paper: "Occupancy prediction-guided neural planner for autonomous driving," arXiv preprint arXiv:2305.03303, 2023.
- Two stages:
- Stage 1: an integrated learning-based framework with Transformer backbones is designed for comprehensive occupancy predictions and multi-modal planning objectives
- Stage 2: a transformed occupancy-guided optimization, built upon a curvilinear frame, achieves direct planning refinement through handcrafted cost function designs
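A minimal sketch of the stage-2 idea, assuming a 1-D lateral-offset search in the curvilinear frame and a made-up handcrafted cost (obstacle proximity plus deviation from the stage-1 proposal; the weights and candidate grid are arbitrary):

```python
def refinement_cost(d, proposal, obstacle, w_obs=1.0, w_dev=0.1):
    """Handcrafted cost over lateral offsets in a curvilinear frame:
    an obstacle-proximity penalty plus deviation from the learned proposal."""
    return w_obs / (abs(d - obstacle) + 0.1) + w_dev * abs(d - proposal)

def refine(proposal, obstacle, candidates):
    """Stage 2: grid-search lateral offsets for the minimum handcrafted cost."""
    return min(candidates, key=lambda d: refinement_cost(d, proposal, obstacle))

# Demo: the learned proposal sits on an obstacle at offset 0; refinement
# nudges the plan to a nearby free offset.
refined = refine(proposal=0.0, obstacle=0.0,
                 candidates=[-1.0, -0.5, 0.0, 0.5, 1.0])
print(refined)  # -1.0
```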
B. UniAD 2023
- The UniAD framework:
- Constructs task-specific queries and jointly optimizes multiple tasks
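A conceptual sketch of task-specific queries over a shared scene representation (single-head dot-product attention on random features; the shapes, task names, and the toy joint objective are assumptions, not the UniAD architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.normal(size=(6, 4))               # shared scene tokens (6 tokens, dim 4)
queries = {"track": rng.normal(size=(2, 4)),  # task-specific learnable queries
           "map": rng.normal(size=(3, 4)),
           "plan": rng.normal(size=(1, 4))}

def attend(q, keys):
    """Single-head dot-product attention: each query pools the scene tokens."""
    logits = q @ keys.T
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ keys

# Every task reads the same scene features; one summed loss trains them jointly.
task_feats = {name: attend(q, scene) for name, q in queries.items()}
joint_loss = sum(float((f ** 2).mean()) for f in task_feats.values())
print(sorted(task_feats))  # ['map', 'plan', 'track']
```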
C. NTU planning
- “Conditional predictive behavior planning with inverse reinforcement learning for human-like autonomous driving,” IEEE Transactions on Intelligent Transportation Systems, 2023
- Three steps:
- 1) a behavior generation module that produces a diverse set of candidate behaviors in the form of trajectory proposals
- 2) a conditional prediction module that predicts future trajectories of other agents based on each proposal
- 3) a scoring module that evaluates the candidate plans using maximum entropy inverse reinforcement learning
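The three steps above can be sketched end to end (every component here is a toy stand-in: target speeds as "behaviors", a hand-written yield model as the conditional predictor, and fixed linear weights in place of learned IRL weights):

```python
def generate_proposals():
    """Step 1: a diverse set of candidate ego behaviors (here, target speeds)."""
    return [2.0, 5.0, 8.0]

def conditional_prediction(proposal, lead_speed=4.0):
    """Step 2 (toy): the lead agent yields slightly when the ego is faster."""
    return lead_speed - 0.2 * max(0.0, proposal - lead_speed)

def irl_score(proposal, predicted_lead, w_progress=1.0, w_gap=2.0):
    """Step 3: linear (IRL-style) reward: progress minus closing-speed risk."""
    return w_progress * proposal - w_gap * max(0.0, proposal - predicted_lead)

def plan():
    """Score every proposal against its own conditional prediction."""
    return max(generate_proposals(),
               key=lambda p: irl_score(p, conditional_prediction(p)))

print(plan())  # 5.0
```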
D. nuPlan Planning Challenge CVPR 2023
Combines short-term planning with long-term prediction, achieving the best result in the nuPlan challenge
"Parting with misconceptions about learning-based vehicle motion planning," arXiv preprint arXiv:2306.07962, 2023.
Summary
This article gives a basic overview of planning algorithms, but the analysis of each algorithm is not very deep: most entries only sketch the overall framework, and there are even many errors in the figures and citations. Specific frameworks and problems are worth digging into further.