torcharrow.DataFrame.to_tensor
- DataFrame.to_tensor(conversion=None)
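Convert to PyTorch containers (Tensor, PackedList, PackedMap, etc.).
- Parameters:
conversion (TensorConversion or dict) – For DataFrame.to_tensor(), conversion can only be a dict. The dict maps column names to conversion methods. Columns whose names do not appear in the dict use the default PyTorch conversion.
Examples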
>>> import torcharrow as ta
>>> import torcharrow.pytorch as tap
>>> df = ta.dataframe({"label_ids": [0, 1], "token_ids": [[1, 2, 3, 4, 5], [101, 102]]})
>>> df
  index    label_ids  token_ids
-------  -----------  ---------------
      0            0  [1, 2, 3, 4, 5]
      1            1  [101, 102]
dtype: Struct([Field('label_ids', int64), Field('token_ids', List(int64))]), count: 2, null_count: 0
>>> df.to_tensor({"token_ids": tap.PadSequence(padding_value=-1)})
TorchArrowStruct_0(
    label_ids=tensor([0, 1]),
    token_ids=tensor([
        [  1,   2,   3,   4,   5],
        [101, 102,  -1,  -1,  -1]]
    )
)
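The example above relies on two behaviors: columns listed in the conversion dict use the given conversion while all other columns fall back to a default, and the padding conversion right-pads ragged lists to a common length. The following is a minimal pure-Python sketch of those semantics only. It is not torcharrow's implementation; `pad_rows` and `convert_columns` are hypothetical helper names, and plain lists stand in for tensors.

```python
from typing import Any, Callable, Dict, List, Optional


def pad_rows(rows: List[List[int]], padding_value: int) -> List[List[int]]:
    """Right-pad every row with padding_value up to the longest row's length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [padding_value] * (width - len(row)) for row in rows]


def convert_columns(
    columns: Dict[str, Any],
    conversion: Optional[Dict[str, Callable[[Any], Any]]] = None,
) -> Dict[str, Any]:
    """Apply a per-column conversion; unlisted columns use a default.

    Assumption: the default here is the identity function, standing in
    for whatever default conversion the real library applies.
    """
    conversion = conversion or {}
    return {
        name: conversion[name](values) if name in conversion else values
        for name, values in columns.items()
    }


columns = {"label_ids": [0, 1], "token_ids": [[1, 2, 3, 4, 5], [101, 102]]}
result = convert_columns(
    columns,
    {"token_ids": lambda rows: pad_rows(rows, padding_value=-1)},
)
print(result["label_ids"])
print(result["token_ids"])
```

Here `label_ids` passes through the default path unchanged, while `token_ids` becomes a rectangular 2 x 5 structure, mirroring the shapes shown in the `to_tensor` output above.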