RandomPolicy
- torchrl.envs.utils.RandomPolicy(action_spec: TensorSpec, action_key: NestedKey = 'action')
A random policy for data collectors.
This is a thin wrapper around the action_spec.rand method.
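- Parameters:
action_spec – a TensorSpec object describing the action specs
action_key – the key under which the sampled action is written in the output tensordict. Defaults to 'action'.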
Examples
>>> import torch
>>> from tensordict import TensorDict
>>> from torchrl.data.tensor_specs import BoundedTensorSpec
>>> from torchrl.envs.utils import RandomPolicy
>>> action_spec = BoundedTensorSpec(-torch.ones(3), torch.ones(3))
>>> actor = RandomPolicy(action_spec=action_spec)
>>> td = actor(TensorDict({}, batch_size=[]))  # selects a random action in the cube [-1; 1]
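To make the "wrapper around action_spec.rand" idea concrete, here is a minimal dependency-free sketch of the pattern. It does not use torchrl: `BoxSpec` and `SketchRandomPolicy` are hypothetical stand-ins invented for illustration, a plain dict plays the role of the TensorDict, and the sketch assumes the sampled action is stored under `action_key`, as the signature suggests.

```python
import random


class BoxSpec:
    """Hypothetical stand-in for a bounded spec (not a torchrl class)."""

    def __init__(self, low, high):
        if len(low) != len(high):
            raise ValueError("low and high must have the same length")
        self.low = list(low)
        self.high = list(high)

    def rand(self):
        # Sample each dimension uniformly between its lower and upper bound.
        return [random.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]


class SketchRandomPolicy:
    """Hypothetical mimic of the wrapper pattern (not the torchrl class)."""

    def __init__(self, action_spec, action_key="action"):
        self.action_spec = action_spec
        self.action_key = action_key

    def __call__(self, td):
        # Delegate sampling to the spec and store the result under action_key.
        td[self.action_key] = self.action_spec.rand()
        return td


spec = BoxSpec([-1.0] * 3, [1.0] * 3)
policy = SketchRandomPolicy(spec)
out = policy({})  # a plain dict stands in for an empty TensorDict
```

The policy holds no parameters and ignores its input; all of the sampling logic lives in the spec, which is why such a policy can be built from the action spec alone.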