Topic: OpenAI Gym: understanding the action_space notation (spaces.Box)
Question background:
I want to set up an RL agent on the OpenAI CarRacing-v0 environment, but before that I want to understand the action space. Line 119 of the code on GitHub says:
self.action_space = spaces.Box( np.array([-1,0,0]), np.array([+1,+1,+1])) # steer, gas, brake
How do I read this line? Although my problem is concrete with respect to CarRacing-v0, I would like to understand the spaces.Box() notation in general.
Solution:
Box means that you are dealing with real-valued quantities.
The first array, np.array([-1,0,0]), holds the lowest accepted values, and the second, np.array([+1,+1,+1]), holds the highest accepted values, one entry per dimension. In this case (using the comment) we see that each action has 3 components, each with its own range:
- Steering: real-valued in [-1, 1]
- Gas: real-valued in [0, 1]
- Brake: real-valued in [0, 1]
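
To make the per-dimension bounds concrete, here is a minimal NumPy-only sketch of what such a box means. It does not use Gym itself; the helper names sample_action and is_valid are hypothetical and exist only for this illustration. The bounds are the ones from the CarRacing-v0 line quoted in the question.

```python
import numpy as np

# Bounds copied from the CarRacing-v0 line: [steer, gas, brake]
low = np.array([-1.0, 0.0, 0.0], dtype=np.float32)
high = np.array([1.0, 1.0, 1.0], dtype=np.float32)


def sample_action(rng):
    """Draw one action uniformly from the box (hypothetical helper)."""
    return rng.uniform(low, high).astype(np.float32)


def is_valid(action):
    """Check shape and that every component lies within its own bounds."""
    action = np.asarray(action)
    if action.shape != low.shape:
        return False
    return bool(np.all((action >= low) & (action <= high)))


rng = np.random.default_rng(0)
action = sample_action(rng)  # one vector: [steer, gas, brake]

print(is_valid(action))             # True
print(is_valid([0.5, 0.3, 0.0]))    # True: steer 0.5, gas 0.3, no brake
print(is_valid([0.0, -0.2, 0.0]))   # False: gas is below its lower bound of 0

# Out-of-range values can be clamped back into the box component by component.
clipped = np.clip([2.0, 0.5, -1.0], low, high)
print(clipped)
```

With Gym installed, the Box object offers sample() and contains() methods that play the same roles as the two helpers above, so env.action_space.sample() would return one such 3-component vector.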