
prepare_layer_dropout

torchtune.modules.prepare_layer_dropout(layers: Union[ModuleList, Iterable[Module]], prob_max: float = 0.0, prob_layer_scale: Optional[ScaleType] = ScaleType.UNIFORM, layers_str: Optional[str] = None, disable_on_eval: Optional[bool] = True) → None

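Prepares a model's layers for layer dropout by wrapping each layer in a ModuleLayerDropoutWrapper. The function takes a list of layers, the maximum layer dropout probability, the scaling type for the dropout probability across layers, a string specifying which layers dropout applies to, and a boolean indicating whether to disable dropout during evaluation. It then wraps each layer of the model in place with a ModuleLayerDropoutWrapper, which applies layer dropout to the input tensor.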
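As a rough illustration of the wrapping idea described above, the following pure-Python sketch skips a wrapped layer with a given probability and honors a disable-on-eval flag. The class `SkippableLayer` and its attributes are hypothetical and are not part of torchtune; the real ModuleLayerDropoutWrapper operates on tensors and may make its decision differently (for example, per sample in a batch).

```python
import random


class SkippableLayer:
    """Hypothetical stand-in for a wrapper that sometimes skips its layer."""

    def __init__(self, layer, prob, disable_on_eval=True, seed=None):
        self.layer = layer                  # any callable taking one argument
        self.prob = prob                    # probability of skipping the layer
        self.disable_on_eval = disable_on_eval
        self.training = True                # mimics nn.Module's training flag
        self._rng = random.Random(seed)

    def __call__(self, x):
        # In eval mode with disable_on_eval=True, always run the layer.
        if not self.training and self.disable_on_eval:
            return self.layer(x)
        # Otherwise skip the layer with probability `prob`,
        # passing the input through unchanged.
        if self._rng.random() < self.prob:
            return x
        return self.layer(x)


def double(v):
    return v * 2


wrapped = SkippableLayer(double, prob=1.0, seed=0)
always_skipped = wrapped(3)   # training mode, prob=1.0: the layer is skipped
wrapped.training = False
never_skipped = wrapped(3)    # eval mode with disable_on_eval=True: the layer runs
```

With `prob=0.0` the wrapper behaves exactly like the original layer, which matches the default `prob_max` of 0.0 in the signature above.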

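Parameters:
  • layers (Union[torch.nn.ModuleList, Iterable[torch.nn.Module]]) – The list of layers to prepare for layer dropout.

  • prob_max (float) – The maximum probability of layer dropout. Defaults to 0.0.

  • prob_layer_scale (Optional[ScaleType]) – The type of scaling applied to the dropout probability across layers. Defaults to ScaleType.UNIFORM.

  • layers_str (Optional[str]) – A string specifying which layers to apply dropout to. Defaults to None, which applies dropout to all layers.

  • disable_on_eval (Optional[bool]) – Whether to disable dropout during evaluation. Defaults to True.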

Returns:

None
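To make the `layers_str` parameter more concrete, here is a minimal sketch of how a slice-style string such as "::2" could be turned into a per-layer selection mask. The helper `select_layers` is hypothetical and is not torchtune's parser; it assumes the string is either a single integer index or a single Python-style slice, and the formats torchtune actually accepts may be broader.

```python
def select_layers(layers_str, num_layers):
    """Hypothetical helper: boolean mask of the layers picked by a slice string."""
    if layers_str is None:
        # None means dropout applies to every layer.
        return [True] * num_layers
    if ":" not in layers_str:
        # A bare integer selects one layer; negative values count from the end.
        chosen = {range(num_layers)[int(layers_str)]}
    else:
        # Empty fields map to None, as in Python's own slice syntax.
        parts = [int(p) if p.strip() else None for p in layers_str.split(":")]
        chosen = set(range(num_layers)[slice(*parts)])
    return [i in chosen for i in range(num_layers)]


every_other = select_layers("::2", 5)
all_layers = select_layers(None, 5)
```

Under these assumptions, "::2" selects layers 0, 2 and 4 of a five-layer model.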

Example

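>>> import torch
>>> from torch import nn
>>> from torchtune.modules import prepare_layer_dropout
>>> from torchtune.modules.layer_dropout import ScaleType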
>>> # Define a simple model
>>> class MyModel(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.layers = nn.ModuleList([
...             nn.Linear(5, 3),
...             nn.Linear(3, 2),
...             nn.Linear(2, 1),
...             nn.Linear(1, 2),
...             nn.Linear(2, 3),
...         ])
...
...     def forward(self, x):
...         for layer in self.layers:
...             x = layer(x)
...         return x
>>> model = MyModel()
>>> # Apply layer dropout uniformly to all layers
>>> prepare_layer_dropout(model.layers, prob_max=0.2, prob_layer_scale=ScaleType.UNIFORM)
>>> # Apply layer dropout to every other layer, as described in the LayerDrop paper
>>> # (Fan et al., https://arxiv.org/abs/1909.11556v1)
>>> prepare_layer_dropout(model.layers, prob_max=0.2, prob_layer_scale=ScaleType.UNIFORM, layers_str="::2")
>>> # Apply layer dropout that increases linearly across layers, as described in the
>>> # Progressive Layer Dropout paper (Zhang et al., https://arxiv.org/abs/2010.13369)
>>> prepare_layer_dropout(model.layers, prob_max=0.2, prob_layer_scale=ScaleType.LINEAR)
>>> # Apply layer dropout that increases exponentially across layers, as described in the
>>> # LayerSkip paper (Elhoushi et al., https://arxiv.org/abs/2404.16710)
>>> prepare_layer_dropout(model.layers, prob_max=0.2, prob_layer_scale=ScaleType.EXP)
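
The scaling types used in the examples above differ in how the per-layer dropout probability grows with depth. The sketch below shows one plausible shape for each schedule, assuming the last layer reaches `prob_max`. The function `layer_probs` and the exact formulas are assumptions made for illustration, not torchtune's implementation; consult the library source for the real definitions.

```python
def layer_probs(num_layers, prob_max, schedule="uniform"):
    """Hypothetical helper: per-layer dropout probabilities for a schedule."""
    probs = []
    for i in range(num_layers):
        # Relative depth in [0, 1]; a single layer is treated as the deepest.
        frac = i / (num_layers - 1) if num_layers > 1 else 1.0
        if schedule == "uniform":
            scale = 1.0                # same probability for every layer
        elif schedule == "linear":
            scale = frac               # grows linearly from 0 to 1
        elif schedule == "exp":
            scale = 2.0 ** frac - 1.0  # assumed exponential curve from 0 to 1
        else:
            raise ValueError(f"unknown schedule: {schedule}")
        probs.append(prob_max * scale)
    return probs


uniform = layer_probs(5, 0.2, "uniform")
linear = layer_probs(5, 0.2, "linear")
exponential = layer_probs(5, 0.2, "exp")
```

In this sketch the uniform schedule assigns 0.2 to all five layers, while the linear and exponential schedules start at 0 for the first layer and reach 0.2 at the last, with the exponential curve staying below the linear one in between.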
