cuda_ctc_decoder

torchaudio.models.decoder.cuda_ctc_decoder(tokens: Union[str, List[str]], nbest: int = 1, beam_size: int = 10, blank_skip_threshold: float = 0.95) → CUCTCDecoder [source]

Builds an instance of CUCTCDecoder.

Parameters:

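tokens (str or List[str]) – File or list containing the valid tokens. If a file is used, tokens that map to the same index are expected to be on the same line.
beam_size (int, optional) – Maximum number of hypotheses to keep after each decoding step (Default: 10).
nbest (int) – Number of best decodings to return (Default: 1).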
blank_id (int) – The token ID corresponding to the blank symbol. This parameter does not appear in the function signature shown above.
blank_skip_threshold (float) – Frames with log_prob(blank) > log(blank_skip_threshold) are skipped to speed up decoding (Default: 0.95).
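The blank-skip rule described for blank_skip_threshold can be illustrated in plain Python. This is only a sketch of the stated condition, not the library's CUDA implementation: the helper name `frames_to_keep` is hypothetical, and using index 0 for the blank token is an assumption made for the example.

```python
import math


def frames_to_keep(log_probs, blank_id=0, blank_skip_threshold=0.95):
    """Return the indices of frames that survive the blank-skip rule.

    log_probs: list of frames, each a list of per-token log-probabilities.
    blank_id: index of the blank token (0 is an assumption for this sketch).
    """
    log_threshold = math.log(blank_skip_threshold)
    # A frame is skipped when log_prob(blank) > log(blank_skip_threshold).
    return [
        t for t, frame in enumerate(log_probs)
        if not frame[blank_id] > log_threshold
    ]


# Three frames over a two-token vocabulary (index 0 is the blank).
frames = [
    [math.log(0.99), math.log(0.01)],  # blank dominates: skipped
    [math.log(0.50), math.log(0.50)],  # kept
    [math.log(0.96), math.log(0.04)],  # blank dominates: skipped
]
kept = frames_to_keep(frames)
print(kept)
```

Raising the threshold toward 1.0 skips fewer frames, while lowering it skips more frames.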

Returns:

decoder

Return type:

CUCTCDecoder

Example

>>> from torchaudio.models.decoder import cuda_ctc_decoder
>>> decoder = cuda_ctc_decoder(
...     tokens="tokens.txt",
...     blank_skip_threshold=0.95,
... )
>>> results = decoder(log_probs, encoder_out_lens)  # List of shape (B, nbest) of Hypotheses
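The example above reads its tokens from a file named tokens.txt. The following sketch shows one way such a file might be laid out and read back into a list. It assumes one token per line, with the zero-based line number as the token index and the blank token on the first line. The helper `load_tokens` and the sample vocabulary are hypothetical and are not part of torchaudio.

```python
import os
import tempfile


def load_tokens(path):
    """Read one token per line; the zero-based line number is the token index."""
    with open(path, "r", encoding="utf-8") as f:
        # Blank lines are ignored in this sketch.
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tokens.txt")
    # Hypothetical vocabulary: blank symbol first, then three characters.
    with open(path, "w", encoding="utf-8") as f:
        f.write("<blk>\na\nb\nc\n")
    tokens = load_tokens(path)

# Map each token to its index, i.e. the line it appeared on.
token_to_index = {tok: i for i, tok in enumerate(tokens)}
print(tokens)
```

Since tokens accepts either a str or a List[str], either the file path or a list such as the one built here could be supplied.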

Tutorials using cuda_ctc_decoder

ASR Inference with CUDA CTC Decoder
