HiFiGANVocoder

class torchaudio.prototype.models.HiFiGANVocoder(in_channels: int, upsample_rates: Tuple[int, ...], upsample_initial_channel: int, upsample_kernel_sizes: Tuple[int, ...], resblock_kernel_sizes: Tuple[int, ...], resblock_dilation_sizes: Tuple[Tuple[int, ...], ...], resblock_type: int, lrelu_slope: float)

Generator part of HiFi GAN [Kong et al., 2020]. Source: https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75

Note

To build the model, use one of the factory functions: hifigan_vocoder(), hifigan_vocoder_v1(), hifigan_vocoder_v2(), hifigan_vocoder_v3().

Parameters:

in_channels (int) – Number of channels in the input features.
upsample_rates (tuple of int) – Factors by which each upsampling layer increases the time dimension.
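upsample_initial_channel (int) – Number of channels in the feature tensor fed to the first upsampling layer (the output of the initial convolution, not the raw input features described by in_channels).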
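upsample_kernel_sizes (tuple of int) – Kernel size for each upsampling layer.
resblock_kernel_sizes (tuple of int) – Kernel size for each residual block.
resblock_dilation_sizes (tuple of tuples of int) – Dilation sizes for each 1D convolutional layer in each residual block. For resblock type 1, the inner tuples should have length 3, because each layer contains 3 convolutions. For resblock type 2, they should have length 2.
resblock_type (int, 1 or 2) – Determines whether ResBlock1 or ResBlock2 is used.
lrelu_slope (float) – Slope of the leaky ReLUs in the activation functions.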
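
The length constraint on resblock_dilation_sizes is easy to get wrong when writing a configuration by hand. The following is a minimal sketch of a pre-flight check, assuming only the rules stated above. The helper name check_resblock_config is hypothetical and is not part of torchaudio, and the dilation values in the usage lines are illustrative only.

```python
from typing import Tuple


def check_resblock_config(
    resblock_type: int,
    resblock_dilation_sizes: Tuple[Tuple[int, ...], ...],
) -> None:
    """Hypothetical helper (not a torchaudio API).

    Raises ValueError if the arguments break the documented constraints:
    resblock_type must be 1 or 2, and every inner dilation tuple must have
    length 3 for type 1 or length 2 for type 2.
    """
    if resblock_type not in (1, 2):
        raise ValueError(f"resblock_type must be 1 or 2, got {resblock_type}")

    expected_len = 3 if resblock_type == 1 else 2
    for index, dilations in enumerate(resblock_dilation_sizes):
        if len(dilations) != expected_len:
            raise ValueError(
                f"resblock_dilation_sizes[{index}] has length {len(dilations)}, "
                f"expected {expected_len} for resblock type {resblock_type}"
            )


# Illustrative values only: three residual blocks of type 1.
check_resblock_config(1, ((1, 3, 5), (1, 3, 5), (1, 3, 5)))

# Illustrative values only: two residual blocks of type 2.
check_resblock_config(2, ((1, 3), (1, 3)))
```

Running a check like this before calling a factory function surfaces a mismatched configuration as a clear error message rather than as a failure inside the model.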

Methods

forward

HiFiGANVocoder.forward(x: Tensor) → Tensor

Parameters: x (Tensor) – Input feature tensor of shape (batch_size, num_channels, time_length).
Returns: Tensor of shape (batch_size, 1, time_length * upsample_rate), where upsample_rate is the product of the upsampling rates of all layers.
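
The output shape rule can be worked through without instantiating the model. The sketch below computes the expected shape from the formula above. The function name expected_output_shape is hypothetical, and the rates (8, 8, 2, 2) are an assumed example configuration, not values stated on this page.

```python
import math
from typing import Tuple


def expected_output_shape(
    batch_size: int,
    time_length: int,
    upsample_rates: Tuple[int, ...],
) -> Tuple[int, int, int]:
    """Hypothetical helper (not a torchaudio API).

    Returns (batch_size, 1, time_length * upsample_rate), where
    upsample_rate is the product of all per-layer upsampling rates.
    """
    upsample_rate = math.prod(upsample_rates)
    return (batch_size, 1, time_length * upsample_rate)


# Assumed example: 4 inputs of 100 frames each, upsampled by 8 * 8 * 2 * 2 = 256.
shape = expected_output_shape(batch_size=4, time_length=100, upsample_rates=(8, 8, 2, 2))
print(shape)  # (4, 1, 25600)
```

Note that the channel dimension of the input does not appear in the result: the output always has a single channel, and only the time dimension is scaled.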

Factory Functions

`hifigan_vocoder`	Builds the HiFi GAN Vocoder [Kong et al., 2020].
`hifigan_vocoder_v1`	Builds the HiFiGAN Vocoder with the V1 architecture [Kong et al., 2020].
`hifigan_vocoder_v2`	Builds the HiFiGAN Vocoder with the V2 architecture [Kong et al., 2020].
`hifigan_vocoder_v3`	Builds the HiFiGAN Vocoder with the V3 architecture [Kong et al., 2020].
