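Playing Sokoban with gym/Gymnasium reinforcement learning
The gym framework
Source code: https://github.com/openai/gym
Documentation: https://www.gymlibrary.dev/
The team that has maintained Gym since 2021 has moved all future development to Gymnasium, a drop-in replacement for Gym (import gymnasium as gym). Gym will not receive any further updates, so please switch to Gymnasium as soon as possible.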
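One practical difference to keep in mind when switching: environments written against the classic Gym API return a 4-tuple (observation, reward, done, info) from step(), while Gymnasium returns a 5-tuple with separate terminated and truncated flags. The helper below is a minimal sketch that normalizes both shapes. The name unpack_step is hypothetical and not part of either library, and mapping the old done flag to terminated is a simplifying assumption.

```python
from typing import Any, Dict, Tuple


def unpack_step(result: tuple) -> Tuple[Any, float, bool, bool, Dict]:
    """Normalize a step() result to (obs, reward, terminated, truncated, info).

    Hypothetical helper, not part of gym or gymnasium.
    """
    if len(result) == 5:
        # New-style API: terminated and truncated are already separate.
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated), bool(truncated), info
    if len(result) == 4:
        # Old-style API: a single done flag.
        # Assumption: treat done as terminated and report truncated=False.
        obs, reward, done, info = result
        return obs, reward, bool(done), False, info
    raise ValueError(f"unexpected step() result of length {len(result)}")
```

A training loop can then end the episode on `terminated or truncated` regardless of which API the environment follows.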
The Gymnasium framework
Source code: https://github.com/Farama-Foundation/Gymnasium
Documentation: https://gymnasium.farama.org/
The Sokoban environment
Source code: https://github.com/mpSchrader/gym-sokoban
The environment I use is:
$ python --version
Python 3.7.16
$ python -m pip list
Package Version
------------------ ---------
certifi 2022.12.7
charset-normalizer 3.3.2
cloudpickle 2.2.1
gym 0.26.2
gym-notices 0.0.8
gym-sokoban 0.0.6
idna 3.7
imageio 2.31.2
importlib-metadata 6.7.0
numpy 1.21.6
Pillow 9.5.0
pip 22.3.1
pygame 2.6.0
requests 2.31.0
setuptools 65.6.3
tqdm 4.66.5
typing_extensions 4.7.1
urllib3 2.0.7
wheel 0.37.1
zipp 3.15.0
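To compare another machine against the versions listed above, the installed versions can also be queried from Python. This is a minimal sketch: the function name installed_version is hypothetical, and the fallback import assumes the importlib-metadata backport is available on Python 3.7 (it appears in the list above).

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.7


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages most relevant to this post.
    for name in ("gym", "gym-sokoban", "numpy", "pygame"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```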
Installation
I use Python 3.7.16, in a conda environment:
conda create -p ./venv python=3.7
conda activate ./venv
Install directly with pip:
python -m pip install gym-sokoban
Or install from source:
git clone git@github.com:mpSchrader/gym-sokoban.git
cd gym-sokoban
python -m pip install -e .
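Then run a test script that drives the environment with random actions:
test.py
import gym
import gym_sokoban  # importing this registers the Sokoban environments with gym

env = gym.make('Sokoban-v2')

# Initialize the environment
observation = env.reset()

for t in range(10000):
    env.render(mode='human')
    # Sample a random action from the action space
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    print(f"Step {t}: Action={action}, Reward={reward}, Done={done}, Info={info}")
    # Start a new episode once the current one ends
    if done:
        observation = env.reset()

env.close()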
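The loop above can be wrapped in a function that runs a single episode and returns the total reward, which gives a random-agent baseline to compare a trained agent against. The sketch below is illustrative only: CountdownEnv is a hypothetical stand-in so the snippet runs without gym-sokoban installed, its reward value is made up, and it follows the same 4-tuple step() convention as test.py.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in environment using the 4-tuple step() convention."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.steps_left = horizon

    def reset(self):
        self.steps_left = self.horizon
        return self.steps_left

    def step(self, action):
        self.steps_left -= 1
        reward = -0.1  # illustrative per-step penalty, not Sokoban's real reward
        done = self.steps_left == 0
        return self.steps_left, reward, done, {}


def random_policy(observation, n_actions=4):
    """Ignore the observation and pick a uniformly random action index."""
    return random.randrange(n_actions)


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total_reward, steps_taken)."""
    observation = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return total_reward, step
    # The step cap was reached before the episode ended.
    return total_reward, max_steps


if __name__ == "__main__":
    total, steps = run_episode(CountdownEnv(horizon=5), random_policy)
    print(f"steps={steps}, total_reward={total:.2f}")
```

With the real environment, the policy could instead be a function that calls env.action_space.sample(), as test.py does.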