remove_duplicates

tensordict.utils.remove_duplicates(input: TensorDictBase, key: NestedKey, dim: int = 0, *, return_indices: bool = False)

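Removes the indices that are duplicated in key along the specified dimension.

This function detects duplicate elements in the tensor associated with the specified key along the specified dim, and removes the elements at the same indices from every other tensor in the TensorDict. dim is expected to be one of the dimensions of the input TensorDict's batch size, so that all tensors stay consistent with one another. Otherwise, an error is raised.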
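The behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper name drop_duplicate_rows is hypothetical, plain lists stand in for tensors, only dim=0 is handled, and rows are kept in order of first occurrence, which may differ from the ordering the library produces.

```python
from typing import Dict, List, Tuple


def drop_duplicate_rows(
    columns: Dict[str, List[list]],
    key: str,
) -> Tuple[Dict[str, List[list]], List[int]]:
    """Hypothetical helper: keep rows whose value under `key` is seen for the first time."""
    seen = set()
    keep: List[int] = []
    for i, row in enumerate(columns[key]):
        marker = tuple(row)  # lists are unhashable, so rows are compared as tuples
        if marker not in seen:
            seen.add(marker)
            keep.append(i)
    # Apply the same row selection to every entry so that all entries stay aligned.
    deduped = {name: [values[i] for i in keep] for name, values in columns.items()}
    return deduped, keep


data = {
    "tensor1": [[1, 2, 3], [4, 5, 6], [1, 2, 3], [7, 8, 9]],
    "tensor2": [[10, 20], [30, 40], [40, 50], [50, 60]],
}
deduped, kept = drop_duplicate_rows(data, key="tensor1")
print(kept)                # indices of the rows that were kept
print(deduped["tensor2"])  # "tensor2" loses the same row as "tensor1"
```

Only "tensor1" is inspected for duplicates, but the row it marks as a duplicate is dropped from "tensor2" as well, even though the values of "tensor2" in that row are unique.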

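Parameters:

input (TensorDictBase) – The TensorDict containing potentially duplicated elements.
key (NestedKey) – The key of the tensor along which duplicate elements are identified and removed. It must be a leaf key of the TensorDict, pointing to a tensor and not to another TensorDict.
dim (int, optional) – The dimension along which duplicate elements are identified and removed. It must be one of the dimensions of the input TensorDict's batch size. Defaults to 0.
return_indices (bool, optional) – If True, the indices of the unique elements in the input tensor are returned as well. Defaults to False.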

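Returns:

The input tensordict with the indices corresponding to duplicated elements of tensor key along dimension dim removed.
unique_indices (torch.Tensor, optional): The first indices of the unique elements in the input tensordict for the specified key along the specified dim. Only provided if return_indices is True.

Return type:

output (TensorDictBase)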

Examples

>>> import torch
>>> from tensordict import TensorDict
>>> from tensordict.utils import remove_duplicates
>>> td = TensorDict(
...     {
...         "tensor1": torch.tensor([[1, 2, 3], [4, 5, 6], [1, 2, 3], [7, 8, 9]]),
...         "tensor2": torch.tensor([[10, 20], [30, 40], [40, 50], [50, 60]]),
...     },
...     batch_size=[4],
... )
>>> # Row 2 of "tensor1" duplicates row 0, so row 2 is dropped from every entry.
>>> output_tensordict = remove_duplicates(td, key="tensor1", dim=0)
>>> expected_output = TensorDict(
...     {
...         "tensor1": torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
...         "tensor2": torch.tensor([[10, 20], [30, 40], [50, 60]]),
...     },
...     batch_size=[3],
... )
>>> assert (output_tensordict == expected_output).all()
