Q1470 Shuffle the Array-Easy
Q226 Invert Binary Tree-Easy-Recursion
Reinforcement Learning | Unconscious RL of hidden brain states supported by confidence
Reinforcement Learning | Making Efficient Use of Demonstrations to Solve Hard Exploration Problems (R2D3)
Learning efficiently from demonstrations on tasks with hard exploration.
Neuroscience | Vector-based navigation using grid-like representations
Agents endowed with grid-like representations outperformed human experts on navigation tasks in challenging, unfamiliar, and changeable environments, and were even able to take shortcuts.
Q1431 Kids With the Greatest Number of Candies-Easy-Enumeration
Q1480 Running Sum of 1d Array-Easy
Reinforcement Learning | HIerarchical Reinforcement learning with Off-policy correction (HIRO)
This paper trains the higher- and lower-level controllers off-policy, which improves the model's data efficiency; its performance exceeds the previous SOTA, Option-Critic.