torchaudio.prototype.models.conformer_wav2vec2_base
- torchaudio.prototype.models.conformer_wav2vec2_base(extractor_input_dim: int = 64, extractor_output_dim: int = 256, encoder_projection_dropout: float = 0.0) → Wav2Vec2Model
Build a Conformer Wav2Vec2 model with the "small" architecture from Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks [Srivastava et al., 2022].
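- Parameters:
extractor_input_dim (int, optional) – Input dimension of the feature extractor. (Default: 64)
extractor_output_dim (int, optional) – Output dimension of the feature extractor. (Default: 256)
encoder_projection_dropout (float, optional) – Dropout probability applied after the feature projection. (Default: 0.0)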
- Returns:
The resulting wav2vec2 model with a conformer encoder and `base` configuration.
- Return type:
Wav2Vec2Model