MarlGroupMapType

torchrl.envs.MarlGroupMapType(value, names=None, *, module=None, qualname=None, type=None, start=1)[source]

Marl Group Map Type.

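torchrl's multi-agent support lets you control how the agents in your environment are grouped. You can group agents together (stacking their tensors) to take advantage of vectorization when they pass through the same neural network. You can also split agents into separate groups when they are heterogeneous or should be processed by different neural networks. To define a grouping, you just need to pass a group_map at environment construction time.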
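As an illustration of what a custom group_map can look like, the sketch below builds one as a plain dictionary mapping group names to lists of agent names. The group names ("attackers", "defenders") and the validate_group_map helper are hypothetical and are not part of torchrl; the helper only checks the property a grouping would be expected to have, namely that every agent belongs to exactly one non-empty group.

```python
from typing import Dict, List

# Agent names as used elsewhere on this page.
agent_names: List[str] = ["agent_0", "agent_1", "agent_2", "agent_3"]

# A custom group map: keys are group names, values are the agents in that group.
# "attackers" and "defenders" are made-up group names for illustration only.
group_map: Dict[str, List[str]] = {
    "attackers": ["agent_0", "agent_1"],
    "defenders": ["agent_2", "agent_3"],
}


def validate_group_map(group_map: Dict[str, List[str]], agent_names: List[str]) -> None:
    """Illustrative check (not torchrl's own): each agent is in exactly one group."""
    if any(len(members) == 0 for members in group_map.values()):
        raise ValueError("every group must contain at least one agent")
    grouped = [name for members in group_map.values() for name in members]
    if len(grouped) != len(set(grouped)):
        raise ValueError("an agent appears in more than one group")
    if set(grouped) != set(agent_names):
        raise ValueError("grouped agents do not match the environment's agents")


validate_group_map(group_map, agent_names)  # raises ValueError if the map is inconsistent
```

A dictionary like this is what would be passed as group_map when constructing an environment that supports grouping. Agents within a group have their tensors stacked, while separate groups can have different shapes and be handled by different networks.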

Otherwise, you can choose one of the premade grouping strategies from this class.

  • With group_map=MarlGroupMapType.ALL_IN_ONE_GROUP and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going to and coming from your environment will look something like:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agents: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([4, 9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([4, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([4, 3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([4]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}
    
  • With group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going to and coming from your environment will look something like:

    >>> print(env.rand_action(env.reset()))
    TensorDict(
        fields={
            agent_0: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_1: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_2: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([])),
            agent_3: TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                    observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
                batch_size=torch.Size([]))},
        batch_size=torch.Size([]))
    >>> print(env.group_map)
    {"agent_0": ["agent_0"], "agent_1": ["agent_1"], "agent_2": ["agent_2"], "agent_3": ["agent_3"]}
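
The two premade strategies above can be mirrored in plain Python to make the resulting mappings concrete. This is a minimal sketch: the function names all_in_one_group and one_group_per_agent are hypothetical stand-ins written for this page, not torchrl functions, and they only reproduce the group_map dictionaries printed in the examples above.

```python
from typing import Dict, List


def all_in_one_group(agent_names: List[str]) -> Dict[str, List[str]]:
    """Mirror of ALL_IN_ONE_GROUP: a single group named "agents" holding every agent."""
    return {"agents": list(agent_names)}


def one_group_per_agent(agent_names: List[str]) -> Dict[str, List[str]]:
    """Mirror of ONE_GROUP_PER_AGENT: one single-agent group per agent, named after it."""
    return {name: [name] for name in agent_names}


agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]

# One stacked group: tensors gain a leading dimension of size len(agent_names).
print(all_in_one_group(agent_names))

# One group per agent: each group holds unstacked, per-agent tensors.
print(one_group_per_agent(agent_names))
```

With a single group, the agents' tensors are stacked along a leading dimension (for example, actions of shape [4, 9] above). With one group per agent, each entry keeps its own per-agent shape ([9] above).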
    
