torchtext.functional

to_tensor

torchtext.functional.to_tensor(input: Any, padding_value: Optional[int] = None, dtype: dtype = torch.int64) → Tensor

Convert the input to a torch tensor.

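Parameters:

input (Union[List[int], List[List[int]]]) – Sequence or batch of token ids
padding_value (Optional[int]) – Pad value used to make each input in the batch as long as the longest sequence in the batch
dtype (torch.dtype) – torch.dtype of the output tensor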

Return type:

Tensor
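The role of `padding_value` can be sketched in plain Python, without depending on torch. This is a minimal illustration rather than the library's implementation: it assumes `padding_value` right-pads every sequence to the length of the longest one in the batch, `pad_batch` is a hypothetical helper name, and it returns nested lists instead of a tensor.

```python
from typing import List


def pad_batch(batch: List[List[int]], padding_value: int) -> List[List[int]]:
    """Hypothetical helper: right-pad each sequence to the longest length in the batch."""
    max_len = max((len(seq) for seq in batch), default=0)
    return [seq + [padding_value] * (max_len - len(seq)) for seq in batch]


# Two sequences of different lengths become a rectangular batch.
padded = pad_batch([[1, 2], [1, 2, 3]], padding_value=0)
print(padded)  # [[1, 2, 0], [1, 2, 3]]
```

Assuming torch and torchtext are installed, `torchtext.functional.to_tensor([[1, 2], [1, 2, 3]], padding_value=0)` should produce a tensor with the same padded contents and the default dtype `torch.int64`.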

Tutorials using to_tensor: SST-2 Binary text classification with XLM-RoBERTa model

truncate

torchtext.functional.truncate(input: Any, max_seq_len: int) → Any

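Truncate an input sequence or batch.

Parameters:

input (Union[List[Union[str, int]], List[List[Union[str, int]]]]) – Input sequence or batch to be truncated
max_seq_len (int) – Maximum sequence length; tokens beyond this length are discarded

Returns:

The truncated sequence or batch

Return type: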

Union[List[Union[str, int]], List[List[Union[str, int]]]]
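The truncation semantics can be sketched in plain Python as follows. This is a hedged stand-in, not the library's code: `truncate_sketch` is a hypothetical name, and telling a batch from a single sequence by inspecting the first element is an assumption made for illustration.

```python
from typing import List, Union

Token = Union[str, int]
Sequence = List[Token]


def truncate_sketch(data: Union[Sequence, List[Sequence]], max_seq_len: int):
    """Hypothetical stand-in: keep at most max_seq_len tokens per sequence."""
    if data and isinstance(data[0], list):
        # Batch: truncate each sequence independently.
        return [seq[:max_seq_len] for seq in data]
    # Single sequence (or empty input): slice directly.
    return data[:max_seq_len]


print(truncate_sketch([1, 2, 3, 4], 2))              # [1, 2]
print(truncate_sketch([["a", "b", "c"], ["d"]], 2))  # [['a', 'b'], ['d']]
```

Sequences already shorter than `max_seq_len` pass through unchanged. With torchtext installed, the corresponding call would be `torchtext.functional.truncate(input, max_seq_len)`.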

add_token

torchtext.functional.add_token(input: Any, token_id: Any, begin: bool = True) → Any

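Add a token to the start or end of a sequence.

Parameters:

input (Union[List[Union[str, int]], List[List[Union[str, int]]]]) – Input sequence or batch
token_id (Union[str, int]) – Token to be added
begin (bool, optional) – If True, insert the token at the start of the sequence; otherwise insert it at the end. Defaults to True

Returns:

The sequence or batch with token_id added at the start or end of the input

Return type: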

Union[List[Union[str, int]], List[List[Union[str, int]]]]
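A plain-Python sketch of the `begin` flag's effect, under the assumption that a batch is a list of lists and that every sequence in it receives the token. `add_token_sketch` is a hypothetical name and is not the library function.

```python
from typing import List, Union

Token = Union[str, int]
Sequence = List[Token]


def add_token_sketch(data: Union[Sequence, List[Sequence]], token_id: Token, begin: bool = True):
    """Hypothetical stand-in: prepend (begin=True) or append (begin=False) token_id."""

    def add_one(seq: Sequence) -> Sequence:
        # Build a new list so the caller's input is not mutated.
        return [token_id] + seq if begin else seq + [token_id]

    if data and isinstance(data[0], list):
        # Batch: apply the same insertion to every sequence.
        return [add_one(seq) for seq in data]
    return add_one(data)


print(add_token_sketch([1, 2, 3], 0))                   # [0, 1, 2, 3]
print(add_token_sketch([[1, 2], [3]], 9, begin=False))  # [[1, 2, 9], [3, 9]]
```

A typical use is to wrap token ids with beginning-of-sequence and end-of-sequence markers: call once with `begin=True` and once with `begin=False`.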

str_to_int

torchtext.functional.str_to_int(input: Any) → Any

Convert string tokens to integers (for either a single sequence or a batch).
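The conversion can be illustrated with a short plain-Python sketch. It assumes every token is a numeric string that `int()` can parse (for example `"42"`); `str_to_int_sketch` is a hypothetical name and is not the library function.

```python
from typing import List, Union


def str_to_int_sketch(data: Union[List[str], List[List[str]]]):
    """Hypothetical stand-in: convert numeric string tokens to ints."""
    if data and isinstance(data[0], list):
        # Batch: convert each sequence separately, preserving the nesting.
        return [[int(tok) for tok in seq] for seq in data]
    return [int(tok) for tok in data]


print(str_to_int_sketch(["1", "2", "3"]))      # [1, 2, 3]
print(str_to_int_sketch([["1", "2"], ["3"]]))  # [[1, 2], [3]]
```

In this sketch a non-numeric token such as `"hello"` raises `ValueError`, so it only suits tokens that are already ids stored as strings.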