MarlGroupMapType
- torchrl.envs.MarlGroupMapType(value, names=None, *, module=None, qualname=None, type=None, start=1)
MARL group map type.
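As part of torchrl's multi-agent support, you can control how the agents in your environment are grouped. Grouping agents together stacks their tensors, which lets you exploit vectorization when passing them through the same neural network. Splitting agents into different groups is appropriate when they are heterogeneous or should be processed by different neural networks. To define a grouping, pass a group_map when constructing the environment. Alternatively, choose one of the pre-made grouping strategies from this class.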
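A custom group_map is a plain dictionary that maps each group name to the list of agent names in that group. The sketch below builds one for two hypothetical teams and checks that every agent lands in exactly one group. The helper `validate_group_map` and the names `team_a` and `team_b` are assumptions made for illustration and are not part of torchrl; how the map is actually passed depends on the constructor of the environment you use.

```python
from typing import Dict, List


def validate_group_map(group_map: Dict[str, List[str]], agent_names: List[str]) -> None:
    """Raise ValueError unless every agent belongs to exactly one non-empty group.

    Hypothetical helper for illustration only; it is not a torchrl function.
    """
    seen = set()
    for group, members in group_map.items():
        if not members:
            raise ValueError(f"group {group!r} is empty")
        for name in members:
            if name not in agent_names:
                raise ValueError(f"unknown agent {name!r} in group {group!r}")
            if name in seen:
                raise ValueError(f"agent {name!r} appears in more than one group")
            seen.add(name)
    missing = [name for name in agent_names if name not in seen]
    if missing:
        raise ValueError(f"agents without a group: {missing}")


agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]

# Two heterogeneous teams, each meant to be processed by its own network.
custom_group_map = {
    "team_a": ["agent_0", "agent_1"],
    "team_b": ["agent_2", "agent_3"],
}

validate_group_map(custom_group_map, agent_names)  # passes without raising
```

With a map like this, agents within a team would share a stacked entry, while the two teams remain separate entries.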
With group_map=MarlGroupMapType.ALL_IN_ONE_GROUP and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and coming out of your environment will look something like:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agents: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([4, 9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([4, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([4, 3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([4]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}
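The leading dimension of 4 in the shapes above comes from stacking the four agents' tensors along a new first axis. The following minimal sketch reproduces that shape arithmetic with plain tuples. The function `stacked_shape` and the `per_agent_shapes` dictionary are hypothetical names introduced here, and the sketch assumes all agents in the group share the same per-agent shapes.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def stacked_shape(n_agents: int, per_agent_shape: Shape) -> Shape:
    """Shape obtained by stacking n_agents tensors of per_agent_shape on a new first axis."""
    return (n_agents, *per_agent_shape)


group_members = ["agent_0", "agent_1", "agent_2", "agent_3"]

# Per-agent shapes, taken from the example output (assumed identical for all agents).
per_agent_shapes: Dict[str, Shape] = {
    "action": (9,),
    "done": (1,),
    "observation": (3, 3, 2),
}

group_shapes = {
    key: stacked_shape(len(group_members), shape)
    for key, shape in per_agent_shapes.items()
}
print(group_shapes)
# {'action': (4, 9), 'done': (4, 1), 'observation': (4, 3, 3, 2)}
```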
With group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and coming out of your environment will look something like:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agent_0: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_1: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_2: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_3: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agent_0": ["agent_0"], "agent_1": ["agent_1"], "agent_2": ["agent_2"], "agent_3": ["agent_3"]}
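The two pre-made strategies differ only in how they turn a list of agent names into a group map. The standalone sketch below reproduces the two maps printed above without importing torchrl. The class `GroupingStrategy` and its method `build` are hypothetical stand-ins written for illustration; torchrl's own enum may be implemented differently.

```python
from enum import Enum
from typing import Dict, List


class GroupingStrategy(Enum):
    """Hypothetical stand-in mirroring the two strategies described above."""

    ALL_IN_ONE_GROUP = 1
    ONE_GROUP_PER_AGENT = 2

    def build(self, agent_names: List[str]) -> Dict[str, List[str]]:
        """Return a group map (group name -> agent names) for this strategy."""
        if self is GroupingStrategy.ALL_IN_ONE_GROUP:
            # A single group named "agents" holding every agent.
            return {"agents": list(agent_names)}
        # One group per agent, named after the agent it contains.
        return {name: [name] for name in agent_names}


agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]

one_group = GroupingStrategy.ALL_IN_ONE_GROUP.build(agent_names)
per_agent = GroupingStrategy.ONE_GROUP_PER_AGENT.build(agent_names)

print(one_group)
# {'agents': ['agent_0', 'agent_1', 'agent_2', 'agent_3']}
print(per_agent)
# {'agent_0': ['agent_0'], 'agent_1': ['agent_1'], 'agent_2': ['agent_2'], 'agent_3': ['agent_3']}
```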