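The 39th International Conference on Machine Learning (ICML 2022) was held from July 17 to July 23 (Beijing time) in Baltimore, Maryland, USA, in a hybrid in-person and online format.
This post lists the conference papers on topics related to reinforcement learning (RL):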
- [1]. EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning.
- [2]. Optimizing Sequential Experimental Design with Deep Reinforcement Learning.
- [3]. Interactive Inverse Reinforcement Learning for Cooperative Games.
- [4]. Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency.
- [5]. Stabilizing Off-Policy Deep Reinforcement Learning from Pixels.
- [6]. Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation.
- [7]. Adversarially Trained Actor Critic for Offline Reinforcement Learning.
- [8]. Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning.
- [9]. Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation.
- [10]. DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations.
- [11]. Branching Reinforcement Learning.
- [12]. Provable Reinforcement Learning with a Short-Term Memory.
- [13]. DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck.
- [14]. Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning.
- [15]. Fast Population-Based Reinforcement Learning on a Single Machine.
- [16]. Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning.
- [17]. Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning.
- [18]. Retrieval-Augmented Reinforcement Learning.
- [19]. The State of Sparse Training in Deep Reinforcement Learning.
- [20]. Learning Pseudometric-based Action Representations for Offline Reinforcement Learning.
- [21]. Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity.
- [22]. Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes.
- [23]. Off-Policy Reinforcement Learning with Delayed Rewards.
- [24]. Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning.
- [25]. Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation.
- [26]. On the Role of Discount Factor in Offline Reinforcement Learning.
- [27]. MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer.
- [28]. Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling.
- [29]. Curriculum Reinforcement Learning via Constrained Optimal Transport.
- [30]. Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters.
- [31]. Goal Misgeneralization in Deep Reinforcement Learning.
- [32]. Scalable Deep Reinforcement Learning Algorithms for Mean Field Games.
- [33]. Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning.
- [34]. Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning.
- [35]. PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration.
- [36]. Delayed Reinforcement Learning by Imitation.
- [37]. Constrained Variational Policy Optimization for Safe Reinforcement Learning.
- [38]. Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy.
- [39]. Learning Dynamics and Generalization in Deep Reinforcement Learning.
- [40]. Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning.
- [41]. On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning.
- [42]. Optimizing Tensor Network Contraction Using Reinforcement Learning.
- [43]. A Simple Reward-free Approach to Constrained Reinforcement Learning.
- [44]. EqR: Equivariant Representations for Data-Efficient Reinforcement Learning.
- [45]. The Primacy Bias in Deep Reinforcement Learning.
- [46]. History Compression via Language Models in Reinforcement Learning.
- [47]. Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification.