Question: How do you register a custom environment in OpenAI Gym?
Background:
I have created a custom environment, as per the OpenAI Gym framework, containing `step`, `reset`, `action`, and `reward` functions. I aim to run OpenAI baselines on this custom environment. But prior to this, the environment has to be registered with OpenAI Gym. I would like to know how the custom environment can be registered with OpenAI Gym. Also, should I be modifying the OpenAI baselines code to incorporate this?
Solution:
You do not need to modify the baselines repo.
Here is a minimal example. Say you have `myenv.py`, with all the needed functions (`step`, `reset`, ...). The name of the environment class is `MyEnv`, and you want to add it to the `classic_control` folder. You have to:
- Place the `myenv.py` file in `gym/gym/envs/classic_control`.
- Add to `__init__.py` (located in the same folder):

  ```python
  from gym.envs.classic_control.myenv import MyEnv
  ```
- Register the environment in `gym/gym/envs/__init__.py` by adding:

  ```python
  gym.envs.register(
      id='MyEnv-v0',
      entry_point='gym.envs.classic_control:MyEnv',
      max_episode_steps=1000,
  )
  ```
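Once registered, `gym.make` can construct the environment by its id. A minimal smoke test might look like this (assuming the old Gym API this answer targets, where `step` returns a 4-tuple):

```python
import gym

env = gym.make('MyEnv-v0')
obs = env.reset()
# Take one random action to verify the environment is wired up correctly
obs, reward, done, info = env.step(env.action_space.sample())
```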
At registration, you can also add `reward_threshold` and `kwargs` (if your class takes some arguments).
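For instance, a registration using both options might look like the sketch below; the threshold value and the `some_arg` keyword are illustrative placeholders, not part of the original answer:

```python
gym.envs.register(
    id='MyEnv-v1',
    entry_point='gym.envs.classic_control:MyEnv',
    max_episode_steps=1000,
    reward_threshold=195.0,      # hypothetical return above which the task counts as solved
    kwargs={'some_arg': 42},     # hypothetical argument forwarded to MyEnv.__init__
)
```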
You can also directly register the environment in the script you will run (TRPO, PPO, or whatever) instead of doing it in `gym/gym/envs/__init__.py`.
EDIT
This is a minimal example to create the LQR environment.
Save the code below in `lqr_env.py` and place it in the `classic_control` folder of gym.
```python
import gym
from gym import spaces
from gym.utils import seeding
import numpy as np


class LqrEnv(gym.Env):

    def __init__(self, size, init_state, state_bound):
        self.init_state = init_state
        self.size = size
        # Actions and observations both live in the box [-state_bound, state_bound]^size
        self.action_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self.observation_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
        self._seed()

    def _seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def _step(self, u):
        # Quadratic (LQR) cost on both the control u and the current state
        costs = np.sum(u**2) + np.sum(self.state**2)
        self.state = np.clip(self.state + u, self.observation_space.low,
                             self.observation_space.high)
        return self._get_obs(), -costs, False, {}

    def _reset(self):
        # Start from a state drawn uniformly from [-init_state, init_state]^size
        high = self.init_state * np.ones((self.size,))
        self.state = self.np_random.uniform(low=-high, high=high)
        self.last_u = None
        return self._get_obs()

    def _get_obs(self):
        return self.state
```
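Note that the underscore-prefixed `_step`, `_reset`, and `_seed` methods follow the old Gym core API that this answer targets; in more recent Gym versions the overridable methods are named `step`, `reset`, and `seed` instead.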
Add `from gym.envs.classic_control.lqr_env import LqrEnv` to `__init__.py` (also in `classic_control`).
In your script, when you create the environment, do:
```python
import gym
import numpy as np

gym.envs.register(
    id='Lqr-v0',
    entry_point='gym.envs.classic_control:LqrEnv',
    max_episode_steps=150,
    kwargs={'size': 1, 'init_state': 10., 'state_bound': np.inf},
)
env = gym.make('Lqr-v0')
```
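As a quick sanity check (not part of the original answer), you can then roll the environment forward for a few steps; here a simple proportional controller pushes the state toward the origin, since sampling from the unbounded action space is not well defined:

```python
obs = env.reset()
for _ in range(5):
    action = -0.1 * obs                          # simple proportional controller
    obs, reward, done, info = env.step(action)   # old Gym API: 4-tuple return
    print(obs, reward)
```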