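Probability distributions - torch.distributions

The `distributions` package contains parameterizable probability distributions and sampling functions. This allows the construction of stochastic computation graphs and stochastic gradient estimators for optimization. The package generally follows the design of the TensorFlow Distributions package.

It is not possible to backpropagate directly through random samples. However, there are two main methods for creating surrogate functions that can be backpropagated through: the score function estimator (also called the likelihood ratio estimator or REINFORCE) and the pathwise derivative estimator. REINFORCE is commonly seen as the basis for policy gradient methods in reinforcement learning, and the pathwise derivative estimator is commonly seen in the reparameterization trick in variational autoencoders. Whilst the score function only requires the value of samples $f(x)$, the pathwise derivative requires the derivative $f'(x)$. The next sections discuss these two in a reinforcement learning example. For more details see Gradient Estimation Using Stochastic Computation Graphs.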

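Score function

When the probability density function is differentiable with respect to its parameters, we only need sample() and log_prob() to implement REINFORCE:

$$\Delta\theta = \alpha r \frac{\partial\log p(a|\pi^\theta(s))}{\partial\theta}$$

where $\theta$ are the parameters, $\alpha$ is the learning rate, $r$ is the reward and $p(a|\pi^\theta(s))$ is the probability of taking action $a$ in state $s$ given policy $\pi^\theta$.

In practice we would sample an action from the output of a network, apply this action in an environment, and then use log_prob to construct an equivalent loss function. Note that we use a negative because optimizers use gradient descent, whilst the rule above assumes gradient ascent. With a categorical policy, the code for implementing REINFORCE would be as follows: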

probs = policy_network(state)
# Note that this is equivalent to what used to be called multinomial
m = Categorical(probs)
action = m.sample()
next_state, reward = env.step(action)
loss = -m.log_prob(action) * reward
loss.backward()
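
The snippet above leaves policy_network and env undefined. A minimal runnable sketch of the same update follows, assuming a toy linear policy and a hypothetical one-step environment `toy_env_step`; both are illustrative stand-ins and not part of torch.distributions.

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical stand-in for policy_network: 4 state features -> 3 action probabilities.
policy_network = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.Softmax(dim=-1))

def toy_env_step(action):
    # Hypothetical environment: reward 1.0 for action 0, otherwise 0.0.
    return torch.randn(4), float(action.item() == 0)

state = torch.randn(4)
probs = policy_network(state)
m = Categorical(probs)
action = m.sample()                      # no gradient flows through the sample itself
next_state, reward = toy_env_step(action)
loss = -m.log_prob(action) * reward      # surrogate loss for gradient descent
loss.backward()

grad = policy_network[0].weight.grad     # gradient obtained via the score function
```

The gradient reaches the network only through log_prob(action); the sampled action is treated as a constant.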

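Pathwise derivative

The other way to implement these stochastic/policy gradients is to use the reparameterization trick from the rsample() method, where the parameterized random variable is constructed as a parameterized deterministic function of a parameter-free random variable. The reparameterized sample therefore becomes differentiable. The code for implementing the pathwise derivative would be as follows: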

params = policy_network(state)
m = Normal(*params)
# Any distribution with .has_rsample == True could work based on the application
action = m.rsample()
next_state, reward = env.step(action)  # Assuming that reward is differentiable
loss = -reward
loss.backward()
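
As a runnable illustration of the snippet above, the sketch below replaces policy_network with two hypothetical learnable scalars and uses an assumed differentiable reward (a quadratic centred at 2.0). These choices are for illustration only.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

# Hypothetical learnable parameters standing in for policy_network(state).
loc = torch.tensor(0.5, requires_grad=True)
log_scale = torch.tensor(0.0, requires_grad=True)

m = Normal(loc, log_scale.exp())
action = m.rsample()                 # action = loc + scale * eps, with eps ~ N(0, 1)

# Assumed differentiable reward: largest when the action is close to 2.0.
reward = -(action - 2.0) ** 2
loss = -reward
loss.backward()                      # gradients flow through the sample into loc and log_scale
```

Because the sample is a deterministic function of loc, scale and parameter-free noise, loc.grad equals 2 * (action - 2.0) here.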

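Distribution

class torch.distributions.distribution.Distribution(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)

Bases: object

`Distribution` is the abstract base class for probability distributions.

property arg_constraints: dict[str, torch.distributions.constraints.Constraint]

Returns a dictionary from argument names to Constraint objects that should be satisfied by each argument of this distribution. Arguments that are not tensors need not appear in this dictionary.

property batch_shape: Size

Returns the shape over which parameters are batched.

cdf(value)

Returns the cumulative density/mass function evaluated at value.

Parameters: value (Tensor)
Return type: Tensor

entropy()

Returns the entropy of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor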

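enumerate_support(expand=True)

Returns a tensor containing all values supported by a discrete distribution. The result will enumerate over dimension 0, so the shape of the result will be (cardinality,) + batch_shape + event_shape (where event_shape = () for univariate distributions).

Note that this enumerates over all batched tensors in lock-step, e.g. [[0, 0], [1, 1], …]. With expand=False, enumeration happens along dimension 0, but with the remaining batch dimensions being singleton dimensions, e.g. [[0], [1], ...

To iterate over the full Cartesian product, use itertools.product(m.enumerate_support()).

Parameters: expand (bool) – whether to expand the support over the batch dimensions to match the distribution's batch_shape.
Returns: Tensor iterating over dimension 0.
Return type: Tensor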
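
The shapes described for enumerate_support can be checked on a small batched Bernoulli. This is a minimal sketch; the probabilities are arbitrary.

```python
import torch
from torch.distributions import Bernoulli

m = Bernoulli(probs=torch.tensor([0.2, 0.7]))    # batch_shape == (2,)

full = m.enumerate_support()                     # expand=True: shape (2, 2)
compact = m.enumerate_support(expand=False)      # shape (2, 1), broadcastable over the batch

# Summing the probability of every support value gives 1 for each batch element.
total = m.log_prob(full).exp().sum(dim=0)
```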

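property event_shape: Size

Returns the shape of a single sample (without batching).

expand(batch_shape, _instance=None)

Returns a new distribution instance (or populates an existing instance provided by a derived class) with batch dimensions expanded to batch_shape. This method calls expand on the distribution's parameters. As such, this does not allocate new memory for the expanded distribution instance. Additionally, this does not repeat any argument checking or parameter broadcasting done in __init__.py when an instance is first created.

Parameters

batch_shape (torch.Size) – the desired expanded size.
_instance – new instance provided by subclasses that need to override .expand.

Returns

New distribution instance with batch dimensions expanded to batch_shape.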
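
A short sketch of expand on a Normal distribution, showing that the expanded parameters are views of the original tensors rather than copies. The shapes are chosen only for illustration.

```python
import torch
from torch.distributions import Normal

base = Normal(torch.tensor([0.0]), torch.tensor([1.0]))   # batch_shape == (1,)
big = base.expand(torch.Size([3, 1]))                     # batch_shape == (3, 1)

sample = big.sample()                                     # shape (3, 1)

# The expanded loc shares storage with the original parameter.
shares_memory = big.loc.data_ptr() == base.loc.data_ptr()
```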

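icdf(value)

Returns the inverse cumulative density/mass function evaluated at value.

Parameters: value (Tensor)
Return type: Tensor

log_prob(value)

Returns the log of the probability density/mass function evaluated at value.

Parameters: value (Tensor)
Return type: Tensor

property mean: Tensor

Returns the mean of the distribution.

property mode: Tensor

Returns the mode of the distribution.

perplexity()

Returns the perplexity of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor

rsample(sample_shape=torch.Size([]))

Generates a sample_shape shaped reparameterized sample, or a sample_shape shaped batch of reparameterized samples if the distribution parameters are batched.

Return type: Tensor

sample(sample_shape=torch.Size([]))

Generates a sample_shape shaped sample, or a sample_shape shaped batch of samples if the distribution parameters are batched.

Return type: Tensor

sample_n(n)

Generates n samples, or n batches of samples if the distribution parameters are batched.

Return type: Tensor

static set_default_validate_args(value)

Sets whether validation is enabled or disabled.

The default behavior mimics Python's assert statement: validation is on by default, but is disabled if Python is run in optimized mode (via python -O). Validation may be expensive, so you may want to disable it once a model is working.

Parameters: value (bool) – whether to enable validation.

property stddev: Tensor

Returns the standard deviation of the distribution.

property support: Optional[Constraint]

Returns a Constraint object representing this distribution's support.

property variance: Tensor

Returns the variance of the distribution.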

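ExponentialFamily

class torch.distributions.exp_family.ExponentialFamily(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)

Bases: Distribution

`ExponentialFamily` is the abstract base class for probability distributions belonging to an exponential family, whose probability mass/density function has the form defined below:

$$p_{F}(x; \theta) = \exp(\langle t(x), \theta\rangle - F(\theta) + k(x))$$

where $\theta$ denotes the natural parameters, $t(x)$ denotes the sufficient statistic, $F(\theta)$ is the log normalizer function for a given family and $k(x)$ is the carrier measure.

Note

This class is an intermediary between the Distribution class and distributions which belong to an exponential family, mainly to check the correctness of the .entropy() and analytic KL divergence methods. We use this class to compute the entropy and KL divergence using the AD framework and Bregman divergences (courtesy of: Frank Nielsen and Richard Nock, Entropies and Cross-entropies of Exponential Families).

entropy()

Method to compute the entropy using the Bregman divergence of the log normalizer.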

Bernoulli

class torch.distributions.bernoulli.Bernoulli(probs=None, logits=None, validate_args=None)

Bases: ExponentialFamily

Creates a Bernoulli distribution parameterized by probs or logits (but not both).

Samples are binary (0 or 1). They take the value 1 with probability p and 0 with probability 1 - p.

Example

>>> m = Bernoulli(torch.tensor([0.3]))
>>> m.sample()  # 30% chance 1; 70% chance 0
tensor([ 0.])
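
The two parameterizations are interchangeable. The sketch below builds the same distribution from probs and from the corresponding log-odds and compares log_prob; the value 0.3 is taken from the example above.

```python
import torch
from torch.distributions import Bernoulli

p = torch.tensor([0.3])
by_probs = Bernoulli(probs=p)
by_logits = Bernoulli(logits=torch.log(p) - torch.log1p(-p))   # log-odds of p

value = torch.tensor([1.0])
lp_probs = by_probs.log_prob(value)      # log(0.3)
lp_logits = by_logits.log_prob(value)    # same value, up to floating-point error
```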

Parameters

probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

entropy()

enumerate_support(expand=True)

expand(batch_shape, _instance=None)

has_enumerate_support = True

log_prob(value)

property logits: Tensor

property mean: Tensor

property mode: Tensor

property param_shape: Size

property probs: Tensor

sample(sample_shape=torch.Size([]))

support = Boolean()

property variance: Tensor

Beta

class torch.distributions.beta.Beta(concentration1, concentration0, validate_args=None)

Bases: ExponentialFamily

Beta distribution parameterized by concentration1 and concentration0.

Example

>>> m = Beta(torch.tensor([0.5]), torch.tensor([0.5]))
>>> m.sample()  # Beta distributed with concentration concentration1 and concentration0
tensor([ 0.1046])

Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}

property concentration0: Tensor

property concentration1: Tensor

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=())

Return type: Tensor

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance: Tensor

Binomial

class torch.distributions.binomial.Binomial(total_count=1, probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a Binomial distribution parameterized by total_count and either probs or logits (but not both). total_count must be broadcastable with probs/logits.

Example

>>> m = Binomial(100, torch.tensor([0 , .2, .8, 1]))
>>> x = m.sample()
tensor([   0.,   22.,   71.,  100.])

>>> m = Binomial(torch.tensor([[5.], [10.]]), torch.tensor([0.5, 0.8]))
>>> x = m.sample()
tensor([[ 4.,  5.],
        [ 7.,  6.]])

Parameters

total_count (int or Tensor) – number of Bernoulli trials
probs (Tensor) – event probabilities
logits (Tensor) – event log-odds

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0), 'total_count': IntegerGreaterThan(lower_bound=0)}

entropy()

enumerate_support(expand=True)

expand(batch_shape, _instance=None)

has_enumerate_support = True

log_prob(value)

property logits: Tensor

property mean: Tensor

property mode: Tensor

property param_shape: Size

property probs: Tensor

sample(sample_shape=torch.Size([]))

property support

Return type: _DependentProperty

property variance: Tensor

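Categorical

class torch.distributions.categorical.Categorical(probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a categorical distribution parameterized by either probs or logits (but not both).

Note

It is equivalent to the distribution that torch.multinomial() samples from.

Samples are integers from $\{0, \ldots, K-1\}$ where K is probs.size(-1).

If probs is 1-dimensional with length K, each element is the relative probability of sampling the class at that index.

If probs is N-dimensional, the first N-1 dimensions are treated as a batch of relative probability vectors.

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.multinomial()

Example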

>>> m = Categorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor(3)
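
The normalization of probs and logits can be observed directly. A minimal sketch with arbitrary unnormalized weights:

```python
import torch
from torch.distributions import Categorical

m = Categorical(probs=torch.tensor([1.0, 1.0, 2.0]))   # unnormalized relative weights
normalized = m.probs                                   # rescaled to sum to 1: 0.25, 0.25, 0.5

n = Categorical(logits=torch.tensor([0.0, 0.0, 0.0]))
log_probs = n.logits                                   # normalized log-probabilities, each log(1/3)
```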

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

entropy()

enumerate_support(expand=True)

expand(batch_shape, _instance=None)

has_enumerate_support = True

log_prob(value)

property logits: Tensor

property mean: Tensor

property mode: Tensor

property param_shape: Size

property probs: Tensor

sample(sample_shape=torch.Size([]))

property support

Return type: _DependentProperty

property variance: Tensor

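Cauchy

class torch.distributions.cauchy.Cauchy(loc, scale, validate_args=None)

Bases: Distribution

Samples from a Cauchy (Lorentz) distribution. The distribution of the ratio of independent normally distributed random variables with means 0 follows a Cauchy distribution.

Example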

>>> m = Cauchy(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Cauchy distribution with loc=0 and scale=1
tensor([ 2.3214])

Parameters

loc (float or Tensor) – mode or median of the distribution.
scale (float or Tensor) – half width at half maximum.

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(value)

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

support = Real()

property variance: Tensor

Chi2

class torch.distributions.chi2.Chi2(df, validate_args=None)

Bases: Gamma

Creates a Chi-squared distribution parameterized by shape parameter df. This is exactly equivalent to Gamma(alpha=0.5*df, beta=0.5).

Example

>>> m = Chi2(torch.tensor([1.0]))
>>> m.sample()  # Chi2 distributed with shape df=1
tensor([ 0.1046])
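
The equivalence with Gamma can be checked numerically. Note that the Gamma constructor names its arguments concentration and rate; alpha and beta are only the conventional symbols. The df value and evaluation points below are arbitrary.

```python
import torch
from torch.distributions import Chi2, Gamma

df = torch.tensor([3.0])
chi2 = Chi2(df)
gamma = Gamma(concentration=0.5 * df, rate=torch.tensor([0.5]))

x = torch.tensor([0.5, 1.0, 4.0])
lp_chi2 = chi2.log_prob(x)
lp_gamma = gamma.log_prob(x)     # matches lp_chi2 element-wise
```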

Parameters: df (float or Tensor) – shape parameter of the distribution

arg_constraints = {'df': GreaterThan(lower_bound=0.0)}

property df: Tensor

expand(batch_shape, _instance=None)

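ContinuousBernoulli

class torch.distributions.continuous_bernoulli.ContinuousBernoulli(probs=None, logits=None, lims=(0.499, 0.501), validate_args=None)

Bases: ExponentialFamily

Creates a continuous Bernoulli distribution parameterized by probs or logits (but not both).

The distribution is supported in [0, 1] and parameterized by 'probs' (in (0,1)) or 'logits' (real-valued). Note that, unlike the Bernoulli, 'probs' does not correspond to a probability and 'logits' does not correspond to log-odds, but the same names are used due to the similarity with the Bernoulli. See [1] for more details.

Example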

>>> m = ContinuousBernoulli(torch.tensor([0.3]))
>>> m.sample()
tensor([ 0.2538])

Parameters

probs (Number, Tensor) – (0,1) valued parameters
logits (Number, Tensor) – real valued parameters whose sigmoid matches 'probs'

[1] The continuous Bernoulli: fixing a pervasive error in variational autoencoders, Loaiza-Ganem G and Cunningham JP, NeurIPS 2019. https://arxiv.org/abs/1907.06845

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(value)

log_prob(value)

property logits: Tensor

property mean: Tensor

property param_shape: Size

property probs: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

sample(sample_shape=torch.Size([]))

property stddev: Tensor

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance: Tensor

Dirichlet

class torch.distributions.dirichlet.Dirichlet(concentration, validate_args=None)

Bases: ExponentialFamily

Creates a Dirichlet distribution parameterized by concentration concentration.

Example

>>> m = Dirichlet(torch.tensor([0.5, 0.5]))
>>> m.sample()  # Dirichlet distributed with concentration [0.5, 0.5]
tensor([ 0.1046,  0.8954])

Parameters: concentration (Tensor) – concentration parameter of the distribution (often referred to as alpha)

arg_constraints = {'concentration': IndependentConstraint(GreaterThan(lower_bound=0.0), 1)}

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=())

Return type: Tensor

support = Simplex()

property variance: Tensor

Exponential

class torch.distributions.exponential.Exponential(rate, validate_args=None)

Bases: ExponentialFamily

Creates an Exponential distribution parameterized by rate.

Example

>>> m = Exponential(torch.tensor([1.0]))
>>> m.sample()  # Exponential distributed with rate=1
tensor([ 0.1046])

Parameters: rate (float or Tensor) – rate = 1 / scale of the distribution

arg_constraints = {'rate': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(value)

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

property stddev: Tensor

support = GreaterThanEq(lower_bound=0.0)

property variance: Tensor

FisherSnedecor

class torch.distributions.fishersnedecor.FisherSnedecor(df1, df2, validate_args=None)

Bases: Distribution

Creates a Fisher-Snedecor distribution parameterized by df1 and df2.

Example

>>> m = FisherSnedecor(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # Fisher-Snedecor-distributed with df1=1 and df2=2
tensor([ 0.2453])

Parameters

df1 (float or Tensor) – degrees of freedom parameter 1
df2 (float or Tensor) – degrees of freedom parameter 2

arg_constraints = {'df1': GreaterThan(lower_bound=0.0), 'df2': GreaterThan(lower_bound=0.0)}

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

support = GreaterThan(lower_bound=0.0)

property variance: Tensor

Gamma

class torch.distributions.gamma.Gamma(concentration, rate, validate_args=None)

Bases: ExponentialFamily

Creates a Gamma distribution parameterized by shape concentration and rate.

Example

>>> m = Gamma(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # Gamma distributed with concentration=1 and rate=1
tensor([ 0.1046])

Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate parameter of the distribution (often referred to as beta), rate = 1 / scale

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

support = GreaterThanEq(lower_bound=0.0)

property variance: Tensor

Geometric

class torch.distributions.geometric.Geometric(probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a Geometric distribution parameterized by probs, where probs is the probability of success of Bernoulli trials.

$$P(X=k) = (1-p)^{k} p, \quad k = 0, 1, \ldots$$

Note

torch.distributions.geometric.Geometric() models the case where the $(k+1)$-th trial is the first success, hence draws samples in $\{0, 1, \ldots\}$, whereas torch.Tensor.geometric_() models the case where the k-th trial is the first success, hence draws samples in $\{1, 2, \ldots\}$.

Example

>>> m = Geometric(torch.tensor([0.3]))
>>> m.sample()  # underlying Bernoulli has 30% chance 1; 70% chance 0
tensor([ 2.])
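
The probability mass function above can be compared against log_prob. A minimal sketch with p = 0.3 as in the example; the evaluation points are arbitrary.

```python
import math
import torch
from torch.distributions import Geometric

p = 0.3
m = Geometric(probs=torch.tensor([p]))

k = torch.tensor([0.0, 1.0, 5.0])            # number of failures before the first success
log_pmf = m.log_prob(k)
manual = k * math.log(1 - p) + math.log(p)   # log of (1 - p)^k * p
mean = m.mean                                # 1 / p - 1, since counting starts at 0
```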

Parameters

probs (Number, Tensor) – the probability of sampling 1. Must be in range (0, 1]
logits (Number, Tensor) – the log-odds of sampling 1.

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

entropy()

expand(batch_shape, _instance=None)

log_prob(value)

property logits: Tensor

property mean: Tensor

property mode: Tensor

property probs: Tensor

sample(sample_shape=torch.Size([]))

support = IntegerGreaterThan(lower_bound=0)

property variance: Tensor

Gumbel

class torch.distributions.gumbel.Gumbel(loc, scale, validate_args=None)

Bases: TransformedDistribution

Samples from a Gumbel distribution.

Example

>>> m = Gumbel(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # sample from Gumbel distribution with loc=1, scale=2
tensor([ 1.0124])

Parameters

loc (float or Tensor) – location parameter of the distribution
scale (float or Tensor) – scale parameter of the distribution

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

entropy()

expand(batch_shape, _instance=None)

log_prob(value)

property mean: Tensor

property mode: Tensor

property stddev: Tensor

support = Real()

property variance: Tensor

HalfCauchy

class torch.distributions.half_cauchy.HalfCauchy(scale, validate_args=None)

Bases: TransformedDistribution

Creates a half-Cauchy distribution parameterized by scale, where:

X ~ Cauchy(0, scale)
Y = |X| ~ HalfCauchy(scale)

Example

>>> m = HalfCauchy(torch.tensor([1.0]))
>>> m.sample()  # half-cauchy distributed with scale=1
tensor([ 2.3214])

Parameters: scale (float or Tensor) – scale of the full Cauchy distribution

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'scale': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(prob)

log_prob(value)

property mean: Tensor

property mode: Tensor

property scale: Tensor

support = GreaterThanEq(lower_bound=0.0)

property variance: Tensor

HalfNormal

class torch.distributions.half_normal.HalfNormal(scale, validate_args=None)

Bases: TransformedDistribution

Creates a half-normal distribution parameterized by scale, where:

X ~ Normal(0, scale)
Y = |X| ~ HalfNormal(scale)

Example

>>> m = HalfNormal(torch.tensor([1.0]))
>>> m.sample()  # half-normal distributed with scale=1
tensor([ 0.1046])

Parameters: scale (float or Tensor) – scale of the full Normal distribution

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'scale': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(prob)

log_prob(value)

property mean: Tensor

property mode: Tensor

property scale: Tensor

support = GreaterThanEq(lower_bound=0.0)

property variance: Tensor

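Independent

class torch.distributions.independent.Independent(base_distribution, reinterpreted_batch_ndims, validate_args=None)

Bases: Distribution

Reinterprets some of the batch dimensions of a distribution as event dimensions.

This is mainly useful for changing the shape of the result of log_prob(). For example, to create a diagonal Normal distribution with the same shape as a Multivariate Normal distribution (so they are interchangeable), you can: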

>>> from torch.distributions.multivariate_normal import MultivariateNormal
>>> from torch.distributions.normal import Normal
>>> loc = torch.zeros(3)
>>> scale = torch.ones(3)
>>> mvn = MultivariateNormal(loc, scale_tril=torch.diag(scale))
>>> [mvn.batch_shape, mvn.event_shape]
[torch.Size([]), torch.Size([3])]
>>> normal = Normal(loc, scale)
>>> [normal.batch_shape, normal.event_shape]
[torch.Size([3]), torch.Size([])]
>>> diagn = Independent(normal, 1)
>>> [diagn.batch_shape, diagn.event_shape]
[torch.Size([]), torch.Size([3])]
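
The effect on log_prob can be seen by evaluating all three distributions at the same point. A minimal sketch continuing the example above; the evaluation point x is arbitrary.

```python
import torch
from torch.distributions import Independent, MultivariateNormal, Normal

loc, scale = torch.zeros(3), torch.ones(3)
normal = Normal(loc, scale)
diagn = Independent(normal, 1)
mvn = MultivariateNormal(loc, scale_tril=torch.diag(scale))

x = torch.tensor([0.1, -0.2, 0.3])
per_dim = normal.log_prob(x)     # shape (3,): one value per batch element
joint = diagn.log_prob(x)        # shape (): summed over the reinterpreted dimension
reference = mvn.log_prob(x)      # shape (): agrees with joint
```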

Parameters

base_distribution (torch.distributions.distribution.Distribution) – a base distribution
reinterpreted_batch_ndims (int) – the number of batch dimensions to reinterpret as event dimensions

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {}

entropy()

enumerate_support(expand=True)

expand(batch_shape, _instance=None)

property has_enumerate_support: bool

property has_rsample: bool

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

sample(sample_shape=torch.Size([]))

Return type: Tensor

property support

Return type: _DependentProperty

property variance: Tensor

InverseGamma

class torch.distributions.inverse_gamma.InverseGamma(concentration, rate, validate_args=None)

Bases: TransformedDistribution

Creates an inverse gamma distribution parameterized by concentration and rate, where:

X ~ Gamma(concentration, rate)
Y = 1 / X ~ InverseGamma(concentration, rate)

Example

>>> m = InverseGamma(torch.tensor([2.0]), torch.tensor([3.0]))
>>> m.sample()
tensor([ 1.2953])

Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate = 1 / scale of the distribution (often referred to as beta)

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}

property concentration: Tensor

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

property mean: Tensor

property mode: Tensor

property rate: Tensor

support = GreaterThan(lower_bound=0.0)

property variance: Tensor

Kumaraswamy

class torch.distributions.kumaraswamy.Kumaraswamy(concentration1, concentration0, validate_args=None)

Bases: TransformedDistribution

Samples from a Kumaraswamy distribution.

Example

>>> m = Kumaraswamy(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Kumaraswamy distribution with concentration alpha=1 and beta=1
tensor([ 0.1729])

Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

property mean: Tensor

property mode: Tensor

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance: Tensor

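LKJCholesky

class torch.distributions.lkj_cholesky.LKJCholesky(dim, concentration=1.0, validate_args=None)

Bases: Distribution

LKJ distribution for lower Cholesky factors of correlation matrices. The distribution is controlled by the concentration parameter $\eta$ so that the probability of the correlation matrix $M$ generated from a Cholesky factor is proportional to $\det(M)^{\eta - 1}$. Because of that, when concentration == 1, we have a uniform distribution over Cholesky factors of correlation matrices:

L ~ LKJCholesky(dim, concentration)
X = L @ L' ~ LKJCorr(dim, concentration)

Note that this distribution samples the Cholesky factor of correlation matrices and not the correlation matrices themselves, and thereby differs slightly from the derivations in [1] for the LKJCorr distribution. For sampling, this uses the Onion method from [1] Section 3.

Example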

>>> l = LKJCholesky(3, 0.5)
>>> l.sample()  # l @ l.T is a sample of a correlation 3x3 matrix
tensor([[ 1.0000,  0.0000,  0.0000],
        [ 0.3516,  0.9361,  0.0000],
        [-0.1899,  0.4748,  0.8593]])
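
A short check that a sample is indeed a lower Cholesky factor of a correlation matrix. This is a sketch with a fixed seed; the property holds for any sample.

```python
import torch
from torch.distributions import LKJCholesky

torch.manual_seed(0)
lkj = LKJCholesky(3, concentration=0.5)
L = lkj.sample()             # lower-triangular Cholesky factor, shape (3, 3)
corr = L @ L.T               # implied correlation matrix

diag = torch.diagonal(corr)  # a correlation matrix has ones on its diagonal
```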

Parameters

dim (int) – dimension of the matrices
concentration (float or Tensor) – concentration/shape parameter of the distribution (often referred to as eta)

References

[1] Generating random correlation matrices based on vines and extended onion method (2009), Daniel Lewandowski, Dorota Kurowicka, Harry Joe. Journal of Multivariate Analysis. 100. 10.1016/j.jmva.2009.04.008

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0)}

expand(batch_shape, _instance=None)

log_prob(value)

sample(sample_shape=torch.Size([]))

support = CorrCholesky()

Laplace

class torch.distributions.laplace.Laplace(loc, scale, validate_args=None)

Bases: Distribution

Creates a Laplace distribution parameterized by loc and scale.

Example

>>> m = Laplace(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # Laplace distributed with loc=0, scale=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(value)

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

property stddev: Tensor

support = Real()

property variance: Tensor

LogNormal

class torch.distributions.log_normal.LogNormal(loc, scale, validate_args=None)

Bases: TransformedDistribution

Creates a log-normal distribution parameterized by loc and scale, where:

X ~ Normal(loc, scale)
Y = exp(X) ~ LogNormal(loc, scale)

Example

>>> m = LogNormal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # log-normal distributed with mean=0 and stddev=1
tensor([ 0.1046])
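
The relation Y = exp(X) implies a change-of-variables formula for the density, which can be checked against log_prob. A minimal sketch; the evaluation points are arbitrary.

```python
import math
import torch
from torch.distributions import LogNormal, Normal

loc, scale = torch.tensor([0.0]), torch.tensor([1.0])
log_normal = LogNormal(loc, scale)
normal = Normal(loc, scale)

y = torch.tensor([0.5, 1.0, 3.0])
lp = log_normal.log_prob(y)
# Change of variables: log p_Y(y) = log p_X(log y) - log y
manual = normal.log_prob(y.log()) - y.log()
mean = log_normal.mean       # exp(loc + scale**2 / 2)
```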

Parameters

loc (float or Tensor) – mean of the log of the distribution
scale (float or Tensor) – standard deviation of the log of the distribution

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

property loc: Tensor

property mean: Tensor

property mode: Tensor

property scale: Tensor

support = GreaterThan(lower_bound=0.0)

property variance: Tensor

LowRankMultivariateNormal

class torch.distributions.lowrank_multivariate_normal.LowRankMultivariateNormal(loc, cov_factor, cov_diag, validate_args=None)

Bases: Distribution

Creates a multivariate normal distribution whose covariance matrix has a low-rank form parameterized by cov_factor and cov_diag:

covariance_matrix = cov_factor @ cov_factor.T + cov_diag

Example

>>> m = LowRankMultivariateNormal(
...     torch.zeros(2), torch.tensor([[1.0], [0.0]]), torch.ones(2)
... )
>>> m.sample()  # normally distributed with mean=`[0,0]`, cov_factor=`[[1],[0]]`, cov_diag=`[1,1]`
tensor([-0.2102, -0.5429])

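Parameters

loc (Tensor) – mean of the distribution with shape batch_shape + event_shape
cov_factor (Tensor) – factor part of the low-rank form of the covariance matrix with shape batch_shape + event_shape + (rank,)
cov_diag (Tensor) – diagonal part of the low-rank form of the covariance matrix with shape batch_shape + event_shape

Note

Thanks to the Woodbury matrix identity and the matrix determinant lemma, the computation of the determinant and inverse of the covariance matrix is avoided when cov_factor.shape[1] << cov_factor.shape[0]. With these formulas, we only need to compute the determinant and inverse of the small "capacitance" matrix:

capacitance = I + cov_factor.T @ inv(cov_diag) @ cov_factor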
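
The two formulas above can be written out explicitly. In the sketch below cov_diag is a vector, so it enters the dense covariance as torch.diag(cov_diag); the capacitance matrix is computed by hand for illustration and is not read from the library.

```python
import torch
from torch.distributions import LowRankMultivariateNormal

cov_factor = torch.tensor([[1.0], [0.0]])   # shape event_shape + (rank,) = (2, 1)
cov_diag = torch.ones(2)
m = LowRankMultivariateNormal(torch.zeros(2), cov_factor, cov_diag)

# Dense covariance assembled from the low-rank form.
dense = cov_factor @ cov_factor.T + torch.diag(cov_diag)

# Capacitance matrix of shape (rank, rank), here 1 x 1.
capacitance = torch.eye(1) + cov_factor.T @ torch.diag(1.0 / cov_diag) @ cov_factor
```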

arg_constraints = {'cov_diag': IndependentConstraint(GreaterThan(lower_bound=0.0), 1), 'cov_factor': IndependentConstraint(Real(), 2), 'loc': IndependentConstraint(Real(), 1)}

property covariance_matrix: Tensor

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

property precision_matrix: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

property scale_tril: Tensor

support = IndependentConstraint(Real(), 1)

property variance: Tensor

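MixtureSameFamily

class torch.distributions.mixture_same_family.MixtureSameFamily(mixture_distribution, component_distribution, validate_args=None)

Bases: Distribution

The MixtureSameFamily distribution implements a (batch of) mixture distribution where all components come from different parameterizations of the same distribution type. It is parameterized by a Categorical "selecting distribution" (over k components) and a component distribution, i.e., a Distribution with a rightmost batch shape (equal to [k]) which indexes each (batch of) component.

Example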

>>> # Construct Gaussian Mixture Model in 1D consisting of 5 equally
>>> # weighted normal distributions
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Normal(torch.randn(5,), torch.rand(5,))
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct Gaussian Mixture Model in 2D consisting of 5 equally
>>> # weighted bivariate normal distributions
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Independent(D.Normal(
...          torch.randn(5,2), torch.rand(5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct a batch of 3 Gaussian Mixture Models in 2D each
>>> # consisting of 5 random weighted bivariate normal distributions
>>> mix = D.Categorical(torch.rand(3,5))
>>> comp = D.Independent(D.Normal(
...         torch.randn(3,5,2), torch.rand(3,5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

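Parameters

mixture_distribution (Categorical) – torch.distributions.Categorical-like instance. Manages the probability of selecting a component. The number of categories must match the rightmost batch dimension of component_distribution. Must have either a scalar batch_shape or a batch_shape matching component_distribution.batch_shape[:-1]
component_distribution (Distribution) – torch.distributions.Distribution-like instance. The rightmost batch dimension indexes the components.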
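
A runnable variant of the first example, assuming the alias `import torch.distributions as D` that the examples use without showing. The 0.1 offset on the scales is an addition here so that every scale is strictly positive, and the manual log-sum-exp is an illustrative cross-check.

```python
import torch
import torch.distributions as D

torch.manual_seed(0)
mix = D.Categorical(torch.ones(5))                      # 5 equally weighted components
comp = D.Normal(torch.randn(5), torch.rand(5) + 0.1)    # offset keeps every scale positive
gmm = D.MixtureSameFamily(mix, comp)

x = gmm.sample((4,))          # shape (4,)
lp = gmm.log_prob(x)          # shape (4,)

# The mixture log-density is a log-sum-exp over weighted component log-densities.
manual = torch.logsumexp(mix.logits + comp.log_prob(x.unsqueeze(-1)), dim=-1)
```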

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {}

cdf(x)

property component_distribution: Distribution

expand(batch_shape, _instance=None)

has_rsample = False

log_prob(x)

property mean: Tensor

property mixture_distribution: Categorical

sample(sample_shape=torch.Size([]))

property support

Return type: _DependentProperty

property variance: Tensor

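Multinomial

class torch.distributions.multinomial.Multinomial(total_count=1, probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a Multinomial distribution parameterized by total_count and either probs or logits (but not both). The innermost dimension of probs indexes over categories. All other dimensions index over batches.

Note that total_count need not be specified if only log_prob() is called (see the example below).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

sample() requires a single shared total_count for all parameters and samples.
log_prob() allows different total_count for each parameter and sample.

Example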

>>> m = Multinomial(100, torch.tensor([ 1., 1., 1., 1.]))
>>> x = m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 21.,  24.,  30.,  25.])

>>> Multinomial(probs=torch.tensor([1., 1., 1., 1.])).log_prob(x)
tensor([-4.1338])

Parameters

total_count (int) – number of trials
probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

entropy()

expand(batch_shape, _instance=None)

log_prob(value)

property logits: Tensor

property mean: Tensor

property param_shape: Size

property probs: Tensor

sample(sample_shape=torch.Size([]))

property support

Return type: _DependentProperty

total_count: int

property variance: Tensor

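MultivariateNormal

class torch.distributions.multivariate_normal.MultivariateNormal(loc, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)

Bases: Distribution

Creates a multivariate normal (also called Gaussian) distribution parameterized by a mean vector and a covariance matrix.

The multivariate normal distribution can be parameterized in terms of a positive definite covariance matrix $\mathbf{\Sigma}$, a positive definite precision matrix $\mathbf{\Sigma}^{-1}$, or a lower-triangular matrix $\mathbf{L}$ with positive-valued diagonal entries such that $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$. This triangular matrix can be obtained via, for example, a Cholesky decomposition of the covariance.

Example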

>>> m = MultivariateNormal(torch.zeros(2), torch.eye(2))
>>> m.sample()  # normally distributed with mean=`[0,0]` and covariance_matrix=`I`
tensor([-0.2102, -0.5429])

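Parameters

loc (Tensor) – mean of the distribution
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of covariance, with positive-valued diagonal

Note

Only one of covariance_matrix, precision_matrix or scale_tril can be specified.

Using scale_tril will be more efficient: all computations internally are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower triangular matrix using a Cholesky decomposition.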
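
The three parameterizations describe the same distribution. The sketch below constructs each one from an arbitrary positive definite covariance and compares log_prob; torch.linalg.cholesky and torch.linalg.inv are used only to derive the alternative inputs.

```python
import torch
from torch.distributions import MultivariateNormal

loc = torch.zeros(2)
cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]])
scale_tril = torch.linalg.cholesky(cov)           # lower-triangular L with cov = L @ L.T

by_cov = MultivariateNormal(loc, covariance_matrix=cov)
by_prec = MultivariateNormal(loc, precision_matrix=torch.linalg.inv(cov))
by_tril = MultivariateNormal(loc, scale_tril=scale_tril)

x = torch.tensor([0.3, -0.7])
log_probs = torch.stack([d.log_prob(x) for d in (by_cov, by_prec, by_tril)])
```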

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'loc': IndependentConstraint(Real(), 1), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}

property covariance_matrix: Tensor

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

log_prob(value)

property mean: Tensor

property mode: Tensor

property precision_matrix: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

property scale_tril: Tensor

support = IndependentConstraint(Real(), 1)

property variance: Tensor

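NegativeBinomial

class torch.distributions.negative_binomial.NegativeBinomial(total_count, probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a Negative Binomial distribution, i.e. the distribution of the number of successful independent and identical Bernoulli trials before total_count failures are achieved. The probability of success of each Bernoulli trial is probs.

Parameters

total_count (float or Tensor) – non-negative number of failed Bernoulli trials at which to stop, although the distribution is still valid for a real-valued count
probs (Tensor) – event probabilities of success in the half-open interval [0, 1)
logits (Tensor) – event log-odds for probabilities of success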
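
This class has no example in the text, so here is a minimal usage sketch with arbitrary parameter values. The mean formula in the comment follows from the definition above (successes counted until total_count failures occur).

```python
import torch
from torch.distributions import NegativeBinomial

total_count = torch.tensor([5.0])
probs = torch.tensor([0.25])
m = NegativeBinomial(total_count, probs=probs)

torch.manual_seed(0)
samples = m.sample((1000,))      # non-negative counts, shape (1000, 1)
mean = m.mean                    # total_count * probs / (1 - probs)
```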

arg_constraints = {'logits': Real(), 'probs': HalfOpenInterval(lower_bound=0.0, upper_bound=1.0), 'total_count': GreaterThanEq(lower_bound=0)}

expand(batch_shape, _instance=None)

log_prob(value)

property logits: Tensor

property mean: Tensor

property mode: Tensor

property param_shape: Size

property probs: Tensor

sample(sample_shape=torch.Size([]))

support = IntegerGreaterThan(lower_bound=0)

property variance: Tensor

Normal

class torch.distributions.normal.Normal(loc, scale, validate_args=None)

Bases: ExponentialFamily

Creates a normal (also called Gaussian) distribution parameterized by loc and scale.

Example

>>> m = Normal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # normally distributed with loc=0 and scale=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of the distribution (often referred to as mu)
scale (float or Tensor) – standard deviation of the distribution (often referred to as sigma)

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)

entropy()

expand(batch_shape, _instance=None)

has_rsample = True

icdf(value)

log_prob(value)

property mean: Tensor

property mode: Tensor

rsample(sample_shape=torch.Size([]))

Return type: Tensor

sample(sample_shape=torch.Size([]))

property stddev: Tensor

support = Real()

property variance: Tensor

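OneHotCategorical

class torch.distributions.one_hot_categorical.OneHotCategorical(probs=None, logits=None, validate_args=None)

Bases: Distribution

Creates a one-hot categorical distribution parameterized by probs or logits.

Samples are one-hot coded vectors of size probs.size(-1).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.distributions.Categorical() for specifications of probs and logits.

Example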

>>> m = OneHotCategorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 0.,  0.,  0.,  1.])

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

entropy()[source][source]¶

enumerate_support(expand=True)[source][source]¶

expand(batch_shape, _instance=None)[source][source]¶

has_enumerate_support = True¶

log_prob(value)[source][source]¶

property logits: Tensor¶

property mean: Tensor¶

property mode: Tensor¶

property param_shape: Size¶

property probs: Tensor¶

sample(sample_shape=torch.Size([]))[source][source]¶

support = OneHot()¶

property variance: Tensor¶
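A short sketch of the one-hot interface, using the same uniform probabilities as the example above: enumerate_support() lists every one-hot vector, and log_prob() expects a one-hot vector rather than a class index.

```python
import math
import torch
from torch.distributions import OneHotCategorical

m = OneHotCategorical(probs=torch.tensor([0.25, 0.25, 0.25, 0.25]))

sample = m.sample()                # one-hot vector of length 4
support = m.enumerate_support()    # all 4 one-hot vectors, stacked along dim 0

# log_prob takes a one-hot encoded value
log_p = m.log_prob(torch.tensor([0.0, 0.0, 0.0, 1.0]))
```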

Pareto¶

class torch.distributions.pareto.Pareto(scale, alpha, validate_args=None)[source][source]¶

Bases: TransformedDistribution

Samples from a Pareto Type 1 distribution.

Example

>>> m = Pareto(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Pareto distribution with scale=1 and alpha=1
tensor([ 1.5623])

Parameters

scale (float or Tensor) – scale parameter of the distribution
alpha (float or Tensor) – shape parameter of the distribution

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'alpha': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source][source]¶

Return type: Tensor

expand(batch_shape, _instance=None)[source][source]¶

Return type: Pareto

property mean: Tensor¶

property mode: Tensor¶

property support: Constraint¶

Return type: _DependentProperty

property variance: Tensor¶

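Poisson¶

class torch.distributions.poisson.Poisson(rate, validate_args=None)[source][source]¶

Bases: ExponentialFamily

Creates a Poisson distribution parameterized by rate, the rate parameter.

Samples are non-negative integers, with a probability mass function (pmf) given by

\mathrm{rate}^k \frac{e^{-\mathrm{rate}}}{k!}

Example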

>>> m = Poisson(torch.tensor([4]))
>>> m.sample()
tensor([ 3.])

Parameters: rate (Number, Tensor) – the rate parameter

arg_constraints = {'rate': GreaterThanEq(lower_bound=0.0)}¶

expand(batch_shape, _instance=None)[source][source]¶

log_prob(value)[source][source]¶

property mean: Tensor¶

property mode: Tensor¶

sample(sample_shape=torch.Size([]))[source][source]¶

support = IntegerGreaterThan(lower_bound=0)¶

property variance: Tensor¶
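To connect log_prob() with the pmf stated above, the following sketch evaluates both for illustrative values rate=4 and k=3 and compares them in log space.

```python
import math
import torch
from torch.distributions import Poisson

rate = 4.0
k = 3
m = Poisson(torch.tensor(rate))

# Library value of log pmf at k
log_p = m.log_prob(torch.tensor(float(k)))

# Same quantity from the formula: log(rate^k * exp(-rate) / k!)
manual = k * math.log(rate) - rate - math.lgamma(k + 1)
```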

RelaxedBernoulli¶

class torch.distributions.relaxed_bernoulli.RelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source][source]¶

Bases: TransformedDistribution

Creates a RelaxedBernoulli distribution, parameterized by temperature and either probs or logits (but not both). This is a relaxed version of the Bernoulli distribution, so its values lie in (0, 1) and its samples are reparameterizable.

Example

>>> m = RelaxedBernoulli(torch.tensor([2.2]),
...                      torch.tensor([0.1, 0.2, 0.3, 0.99]))
>>> m.sample()
tensor([ 0.2951,  0.3442,  0.8918,  0.9021])

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = True¶

property logits: Tensor¶

property probs: Tensor¶

support = Interval(lower_bound=0.0, upper_bound=1.0)¶

property temperature: Tensor¶

LogitRelaxedBernoulli¶

class torch.distributions.relaxed_bernoulli.LogitRelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source][source]¶

Bases: Distribution

Creates a LogitRelaxedBernoulli distribution parameterized by probs or logits (but not both), which is the logit of a RelaxedBernoulli distribution.

Samples are logits of values in (0, 1). See [1] for more details.

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1

[1] The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables (Maddison et al., 2017)

[2] Categorical Reparametrization with Gumbel-Softmax (Jang et al., 2017)

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}¶

expand(batch_shape, _instance=None)[source][source]¶

log_prob(value)[source][source]¶

property logits: Tensor¶

property param_shape: Size¶

property probs: Tensor¶

rsample(sample_shape=torch.Size([]))[source][source]¶

Return type: Tensor

support = Real()¶

RelaxedOneHotCategorical¶

class torch.distributions.relaxed_categorical.RelaxedOneHotCategorical(temperature, probs=None, logits=None, validate_args=None)[source][source]¶

Bases: TransformedDistribution

Creates a RelaxedOneHotCategorical distribution parameterized by temperature and either probs or logits. This is a relaxed version of the OneHotCategorical distribution, so its samples lie on the simplex and are reparameterizable.

Example

>>> m = RelaxedOneHotCategorical(torch.tensor([2.2]),
...                              torch.tensor([0.1, 0.2, 0.3, 0.4]))
>>> m.sample()
tensor([ 0.1294,  0.2324,  0.3859,  0.2523])

Parameters

temperature (Tensor) – relaxation temperature
probs (Tensor) – event probabilities
logits (Tensor) – unnormalized log probability for each event

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = True¶

property logits: Tensor¶

property probs: Tensor¶

support = Simplex()¶

property temperature: Tensor¶

StudentT¶

class torch.distributions.studentT.StudentT(df, loc=0.0, scale=1.0, validate_args=None)[source][source]¶

Bases: Distribution

Creates a Student's t-distribution parameterized by degrees of freedom df, mean loc and scale scale.

Example

>>> m = StudentT(torch.tensor([2.0]))
>>> m.sample()  # Student's t-distributed with degrees of freedom=2
tensor([ 0.1046])

Parameters

df (float or Tensor) – degrees of freedom
loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'df': GreaterThan(lower_bound=0.0), 'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source][source]¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = True¶

log_prob(value)[source][source]¶

property mean: Tensor¶

property mode: Tensor¶

rsample(sample_shape=torch.Size([]))[source][source]¶

Return type: Tensor

support = Real()¶

property variance: Tensor¶

TransformedDistribution¶

class torch.distributions.transformed_distribution.TransformedDistribution(base_distribution, transforms, validate_args=None)[source][source]¶

Bases: Distribution

Extension of the Distribution class, which applies a sequence of Transforms to a base distribution. Let f be the composition of the transforms applied:

X ~ BaseDistribution
Y = f(X) ~ TransformedDistribution(BaseDistribution, f)
log p(Y) = log p(X) + log |det (dX/dY)|

Note that the .event_shape of a TransformedDistribution is the maximum shape of its base distribution and its transforms, since transforms can introduce correlations among events.

An example of the usage of TransformedDistribution:

# Building a Logistic Distribution
# X ~ Uniform(0, 1)
# f = a + b * logit(X)
# Y ~ f(X) ~ Logistic(a, b)
base_distribution = Uniform(0, 1)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)

For more examples, see the implementations of Gumbel, HalfCauchy, HalfNormal, LogNormal, Pareto, Weibull, RelaxedBernoulli and RelaxedOneHotCategorical.
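The Logistic construction can be checked numerically. The sketch below fills in illustrative values a=1 and b=2 (assumptions, since the original leaves a and b unspecified) and compares log_prob at y = a with the closed-form logistic density there, 1 / (4b).

```python
import math
import torch
from torch.distributions import TransformedDistribution, Uniform
from torch.distributions.transforms import AffineTransform, SigmoidTransform

a, b = 1.0, 2.0  # illustrative location and scale

# X ~ Uniform(0, 1), Y = a + b * logit(X) ~ Logistic(a, b)
base_distribution = Uniform(0.0, 1.0)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)

# log p(Y) = log p(X) + log |det (dX/dY)|, evaluated at the mode y = a
log_p = logistic.log_prob(torch.tensor(a))
expected = math.log(1.0 / (4.0 * b))

# rsample() pushes reparameterized base samples through the transforms
draws = logistic.rsample((5,))
```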

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {}¶

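cdf(value)[source][source]¶: Computes the cumulative distribution function by inverting the transform(s) and evaluating the cumulative distribution function of the base distribution.

expand(batch_shape, _instance=None)[source][source]¶

property has_rsample: bool¶

icdf(value)[source][source]¶: Computes the inverse cumulative distribution function by evaluating the inverse cumulative distribution function of the base distribution and then applying the transform(s).

log_prob(value)[source][source]¶: Scores the sample by inverting the transform(s) and combining the base distribution's log probability with the log absolute determinant of the Jacobian.

rsample(sample_shape=torch.Size([]))[source][source]¶

Generates a sample_shape shaped reparameterized sample, or a sample_shape shaped batch of reparameterized samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.

Return type: Tensor

sample(sample_shape=torch.Size([]))[source][source]¶: Generates a sample_shape shaped sample, or a sample_shape shaped batch of samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.

property support¶

Return type: _DependentProperty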

Uniform¶

class torch.distributions.uniform.Uniform(low, high, validate_args=None)[source][source]¶

Bases: Distribution

Generates uniformly distributed random samples from the half-open interval [low, high).

Example

>>> m = Uniform(torch.tensor([0.0]), torch.tensor([5.0]))
>>> m.sample()  # uniformly distributed in the range [0.0, 5.0)
tensor([ 2.3418])

Parameters

low (float or Tensor) – lower bound of the range (inclusive).
high (float or Tensor) – upper bound of the range (exclusive).

arg_constraints = {'high': Dependent(), 'low': Dependent()}¶

cdf(value)[source][source]¶

entropy()[source][source]¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = True¶

icdf(value)[source][source]¶

log_prob(value)[source][source]¶

property mean: Tensor¶

property mode: Tensor¶

rsample(sample_shape=torch.Size([]))[source][source]¶

Return type: Tensor

property stddev: Tensor¶

property support¶

Return type: _DependentProperty

property variance: Tensor¶

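VonMises¶

class torch.distributions.von_mises.VonMises(loc, concentration, validate_args=None)[source][source]¶

Bases: Distribution

A circular von Mises distribution.

This implementation uses polar coordinates. The loc and value arguments can be any real number (to facilitate unconstrained optimization), but are interpreted as angles modulo 2 pi.

Example: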

>>> m = VonMises(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # von Mises distributed with loc=1 and concentration=1
tensor([1.9777])

Parameters

loc (torch.Tensor) – an angle in radians.
concentration (torch.Tensor) – concentration parameter

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'loc': Real()}¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = False¶

log_prob(value)[source][source]¶

property mean: Tensor¶: The provided mean is the circular one.

property mode: Tensor¶

sample(sample_shape=torch.Size([]))[source][source]¶

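The sampling algorithm for the von Mises distribution is based on the following paper: D.J. Best and N.I. Fisher, “Efficient simulation of the von Mises distribution.” Applied Statistics (1979): 152-157.

Sampling is always done in double precision internally to avoid a hang in _rejection_sample() for small values of concentration, which starts to happen at around 1e-4 in single precision (see issue #88443).

support = Real()¶

property variance: Tensor¶: The provided variance is the circular one.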

Weibull¶

class torch.distributions.weibull.Weibull(scale, concentration, validate_args=None)[source][source]¶

Bases: TransformedDistribution

Samples from a two-parameter Weibull distribution.

Example

>>> m = Weibull(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Weibull distribution with scale=1, concentration=1
tensor([ 0.4784])

Parameters

scale (float or Tensor) – scale parameter of the distribution (lambda).
concentration (float or Tensor) – concentration parameter of the distribution (k/shape).

arg_constraints: dict[str, torch.distributions.constraints.Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}¶

entropy()[source][source]¶

expand(batch_shape, _instance=None)[source][source]¶

property mean: Tensor¶

property mode: Tensor¶

support = GreaterThan(lower_bound=0.0)¶

property variance: Tensor¶

Wishart¶

class torch.distributions.wishart.Wishart(df, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)[source][source]¶

Bases: ExponentialFamily

Creates a Wishart distribution parameterized by a symmetric positive definite matrix $\Sigma$, or its Cholesky decomposition $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$.

Example

>>> m = Wishart(torch.Tensor([2]), covariance_matrix=torch.eye(2))
>>> m.sample()  # Wishart distributed with mean=`df * I` and
>>> # variance(x_ij)=`df` for i != j and variance(x_ij)=`2 * df` for i == j

Parameters

df (float or Tensor) – real-valued parameter larger than (dimension of the square matrix) - 1
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of the covariance matrix, with positive-valued diagonal

Note

Only one of covariance_matrix, precision_matrix or scale_tril can be specified. Using scale_tril is more efficient: all computations internally are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower-triangular matrix via a Cholesky decomposition. torch.distributions.LKJCholesky is a restricted Wishart distribution. [1]

References

[1] Wang, Z., Wu, Y. and Chu, H., 2018. On equivalence of the LKJ distribution and the restricted Wishart distribution.
[2] Sawyer, S., 2007. Wishart Distributions and Inverse-Wishart Sampling.
[3] Anderson, T. W., 2003. An Introduction to Multivariate Statistical Analysis (3rd ed.).
[4] Odell, P. L. & Feiveson, A. H., 1966. A Numerical Procedure to Generate a Sample Covariance Matrix. JASA, 61(313):199-203.
[5] Ku, Y.-C. & Bloomfield, P., 2010. Generating Random Wishart Matrices with Fractional Degrees of Freedom in OX.

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'df': GreaterThan(lower_bound=0), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}¶

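property covariance_matrix: Tensor¶

entropy()[source][source]¶

expand(batch_shape, _instance=None)[source][source]¶

has_rsample = True¶

log_prob(value)[source][source]¶

property mean: Tensor¶

property mode: Tensor¶

property precision_matrix: Tensor¶

rsample(sample_shape=torch.Size([]), max_try_correction=None)[source][source]¶

Warning

In some cases, the sampling algorithm based on the Bartlett decomposition may return singular matrix samples. By default, several attempts are made to correct singular samples, but a singular matrix sample may still be returned in the end. Singular samples may yield -inf values in .log_prob(). In those cases, the user should validate the samples and either fix the value of df or adjust the max_try_correction argument of .rsample accordingly.

Return type: Tensor

property scale_tril: Tensor¶

support = PositiveDefinite()¶

property variance: Tensor¶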
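A small sketch of the Wishart interface under illustrative parameters (df=3 with a 2x2 identity covariance, chosen so that df exceeds the matrix dimension minus one). It reads the mean, which should equal df times the covariance matrix, and draws one reparameterized sample.

```python
import torch
from torch.distributions import Wishart

df = torch.tensor([3.0])
m = Wishart(df, covariance_matrix=torch.eye(2))

# Mean is df * covariance_matrix; the batch dimension comes from df
mean = m.mean

# One reparameterized draw: a 2x2 symmetric matrix per batch element
x = m.rsample()
```

As the warning above notes, a drawn matrix can occasionally be singular, so validating samples before passing them to log_prob() is a reasonable precaution.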

KL Divergence¶

torch.distributions.kl.kl_divergence(p, q)[source][source]¶

Computes the Kullback-Leibler divergence $KL(p \| q)$ between two distributions.

KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

Parameters

p (Distribution) – A Distribution object.
q (Distribution) – A Distribution object.

Returns

A batch of KL divergences of shape batch_shape.

Return type

Tensor

Raises

NotImplementedError – If the distribution types have not been registered via register_kl().
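As a quick check of kl_divergence(), the sketch below compares the library result for two univariate normals (illustrative parameters) with the standard closed form log(σ_q/σ_p) + (σ_p² + (μ_p − μ_q)²) / (2σ_q²) − 1/2, and shows the NotImplementedError raised for a pair that does not appear in the list of registered pairs.

```python
import math
import torch
from torch.distributions import Cauchy, Normal
from torch.distributions.kl import kl_divergence

p = Normal(torch.tensor(0.0), torch.tensor(1.0))
q = Normal(torch.tensor(1.0), torch.tensor(2.0))
kl_pq = kl_divergence(p, q)

# Closed form for two univariate normals
expected = math.log(2.0 / 1.0) + (1.0**2 + (0.0 - 1.0) ** 2) / (2 * 2.0**2) - 0.5

# Normal/Cauchy is not a registered pair, so the lookup fails
try:
    kl_divergence(p, Cauchy(torch.tensor(0.0), torch.tensor(1.0)))
    unregistered_raises = False
except NotImplementedError:
    unregistered_raises = True
```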

KL divergence is currently implemented for the following distribution pairs:

Bernoulli and Bernoulli
Bernoulli and Poisson
Beta and Beta
Beta and ContinuousBernoulli
Beta and Exponential
Beta and Gamma
Beta and Normal
Beta and Pareto
Beta and Uniform
Binomial and Binomial
Categorical and Categorical
Cauchy and Cauchy
ContinuousBernoulli and ContinuousBernoulli
ContinuousBernoulli and Exponential
ContinuousBernoulli and Normal
ContinuousBernoulli and Pareto
ContinuousBernoulli and Uniform
Dirichlet and Dirichlet
Exponential and Beta
Exponential and ContinuousBernoulli
Exponential and Exponential
Exponential and Gamma
Exponential and Gumbel
Exponential and Normal
Exponential and Pareto
Exponential and Uniform
ExponentialFamily and ExponentialFamily
Gamma and Beta
Gamma and ContinuousBernoulli
Gamma and Exponential
Gamma and Gamma
Gamma and Gumbel
Gamma and Normal
Gamma and Pareto
Gamma and Uniform
Geometric and Geometric
Gumbel and Beta
Gumbel and ContinuousBernoulli
Gumbel and Exponential
Gumbel and Gamma
Gumbel and Gumbel
Gumbel and Normal
Gumbel and Pareto
Gumbel and Uniform
HalfNormal and HalfNormal
Independent and Independent
Laplace and Beta
Laplace and ContinuousBernoulli
Laplace and Exponential
Laplace and Gamma
Laplace and Laplace
Laplace and Normal
Laplace and Pareto
Laplace and Uniform
LowRankMultivariateNormal and LowRankMultivariateNormal
LowRankMultivariateNormal and MultivariateNormal
MultivariateNormal and LowRankMultivariateNormal
MultivariateNormal and MultivariateNormal
Normal and Beta
Normal and ContinuousBernoulli
Normal and Exponential
Normal and Gamma
Normal and Gumbel
Normal and Laplace
Normal and Normal
Normal and Pareto
Normal and Uniform
OneHotCategorical and OneHotCategorical
Pareto and Beta
Pareto and ContinuousBernoulli
Pareto and Exponential
Pareto and Gamma
Pareto and Normal
Pareto and Pareto
Pareto and Uniform
Poisson and Bernoulli
Poisson and Binomial
Poisson and Poisson
TransformedDistribution and TransformedDistribution
Uniform and Beta
Uniform and ContinuousBernoulli
Uniform and Exponential
Uniform and Gamma
Uniform and Gumbel
Uniform and Normal
Uniform and Pareto
Uniform and Uniform

torch.distributions.kl.register_kl(type_p, type_q)[source][source]¶

Decorator to register a pairwise function with kl_divergence(). Usage:

@register_kl(Normal, Normal)
def kl_normal_normal(p, q):
    # insert implementation here

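Lookup returns the most specific (type, type) match ordered by subclass. If the match is ambiguous, a RuntimeWarning is raised. For example, to resolve the ambiguous situation: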

@register_kl(BaseP, DerivedQ)
def kl_version1(p, q): ...
@register_kl(DerivedP, BaseQ)
def kl_version2(p, q): ...

you should register a third, most-specific implementation, e.g.:

register_kl(DerivedP, DerivedQ)(kl_version1)  # Break the tie.

Parameters

type_p (type) – A subclass of Distribution.
type_q (type) – A subclass of Distribution.
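The subclass-based lookup can be illustrated with a hypothetical StandardNormal subclass (defined here only for the example; it is not part of the library). Registering a pair for the subclass makes it the most specific match, while mixed pairs still fall back to the Normal/Normal implementation.

```python
import torch
from torch.distributions import Normal
from torch.distributions.kl import kl_divergence, register_kl


class StandardNormal(Normal):
    """Hypothetical subclass fixed at loc=0, scale=1, for illustration only."""

    def __init__(self, validate_args=None):
        super().__init__(torch.tensor(0.0), torch.tensor(1.0), validate_args=validate_args)


@register_kl(StandardNormal, StandardNormal)
def kl_standard_standard(p, q):
    # Both arguments are N(0, 1), so the divergence is identically zero
    return torch.zeros(p.batch_shape)


# Most specific match: the function registered above
kl_same = kl_divergence(StandardNormal(), StandardNormal())

# No (StandardNormal, Normal) entry exists, so lookup falls back to (Normal, Normal)
kl_fallback = kl_divergence(StandardNormal(), Normal(torch.tensor(1.0), torch.tensor(2.0)))
```

Note that register_kl() modifies a global registry, so a registration like this stays in effect for the rest of the process.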

Transforms¶

class torch.distributions.transforms.AbsTransform(cache_size=0)[source][source]¶: Transform via the mapping $y = |x|$.

class torch.distributions.transforms.AffineTransform(loc, scale, event_dim=0, cache_size=0)[source][source]¶

Transform via the pointwise affine mapping $y = \text{loc} + \text{scale} \times x$.

Parameters

loc (Tensor or float) – Location parameter.
scale (Tensor or float) – Scale parameter.
event_dim (int) – Optional size of event_shape. This should be zero for univariate random variables, 1 for distributions over vectors, 2 for distributions over matrices, etc.

class torch.distributions.transforms.CatTransform(tseq, dim=0, lengths=None, cache_size=0)[source][source]¶

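Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim, of length lengths[dim], in a way compatible with torch.cat().

Example

import torch
from torch.distributions.transforms import CatTransform, ExpTransform, identity_transform

# torch.arange(1.0, 11.0) yields the values 1..10 (torch.range is deprecated)
x0 = torch.cat([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=0)
x = torch.cat([x0, x0], dim=0)
t0 = CatTransform([ExpTransform(), identity_transform], dim=0, lengths=[10, 10])
t = CatTransform([t0, t0], dim=0, lengths=[20, 20])
y = t(x)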

class torch.distributions.transforms.ComposeTransform(parts, cache_size=0)[source][source]¶

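Composes multiple transforms in a chain. The transforms being composed are responsible for caching.

Parameters

parts (list of Transform) – A list of transforms to compose.
cache_size (int) – Size of cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.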

class torch.distributions.transforms.CorrCholeskyTransform(cache_size=0)[source][source]¶

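Transforms an unconstrained real vector $x$ with length $D*(D-1)/2$ into the Cholesky factor of a D-dimensional correlation matrix. This Cholesky factor is a lower triangular matrix with positive diagonals and unit Euclidean norm for each row. The transform proceeds as follows:

First, we convert x into a lower triangular matrix in row order.

For each row $X_i$ of the lower triangular part, we apply a *signed* version of the class StickBreakingTransform to transform $X_i$ into a unit Euclidean length vector using the following steps:
- Scale into the interval $(-1, 1)$ domain: $r_i = \tanh(X_i)$.
- Transform into an unsigned domain: $z_i = r_i^2$.
- Apply $s_i = StickBreakingTransform(z_i)$.
- Transform back into the signed domain: $y_i = sign(r_i) * \sqrt{s_i}$.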
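The properties stated above can be observed directly. This sketch applies CorrCholeskyTransform to a random vector of length D(D-1)/2 for an illustrative D=3 and inspects the result: lower triangular, unit row norms, and a product L Lᵀ with ones on the diagonal.

```python
import torch
from torch.distributions.transforms import CorrCholeskyTransform

D = 3  # illustrative dimension of the correlation matrix
x = torch.randn(D * (D - 1) // 2)

# Unconstrained vector -> Cholesky factor of a correlation matrix
L = CorrCholeskyTransform()(x)

row_norms = L.norm(dim=-1)   # each row has unit Euclidean norm
corr = L @ L.T               # a correlation matrix, so its diagonal is all ones
```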

class torch.distributions.transforms.CumulativeDistributionTransform(distribution, cache_size=0)[source][source]¶

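Transform via the cumulative distribution function of a probability distribution.

Parameters: distribution (Distribution) – Distribution whose cumulative distribution function is used for the transformation.

Example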

# Construct a Gaussian copula from a multivariate normal.
base_dist = MultivariateNormal(
    loc=torch.zeros(2),
    scale_tril=LKJCholesky(2).sample(),
)
transform = CumulativeDistributionTransform(Normal(0, 1))
copula = TransformedDistribution(base_dist, [transform])

class torch.distributions.transforms.ExpTransform(cache_size=0)[source][source]¶: Transform via the mapping $y = \exp(x)$.

class torch.distributions.transforms.IndependentTransform(base_transform, reinterpreted_batch_ndims, cache_size=0)[source][source]¶

Wrapper around another transform that treats reinterpreted_batch_ndims extra rightmost dimensions as dependent. This has no effect on the forward or inverse transforms, but sums out the reinterpreted_batch_ndims rightmost dimensions in log_abs_det_jacobian().

Parameters

base_transform (Transform) – A base transform.
reinterpreted_batch_ndims (int) – The number of extra rightmost dimensions to treat as dependent.

class torch.distributions.transforms.LowerCholeskyTransform(cache_size=0)[source][source]¶

Transform from unconstrained matrices to lower-triangular matrices with nonnegative diagonal entries.

This is useful for parameterizing positive definite matrices in terms of their Cholesky factorization.

class torch.distributions.transforms.PositiveDefiniteTransform(cache_size=0)[source][source]¶: Transform from unconstrained matrices to positive-definite matrices.

class torch.distributions.transforms.PowerTransform(exponent, cache_size=0)[source][source]¶: Transform via the mapping $y = x^{\text{exponent}}$.

class torch.distributions.transforms.ReshapeTransform(in_shape, out_shape, cache_size=0)[source][source]¶

Unit Jacobian transform to reshape the rightmost part of a tensor.

Note that in_shape and out_shape must have the same number of elements, just as for torch.Tensor.reshape().

Parameters

in_shape (torch.Size) – The input event shape.
out_shape (torch.Size) – The output event shape.
cache_size (int) – Size of cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported. (Default: 0)

class torch.distributions.transforms.SigmoidTransform(cache_size=0)[source][source]¶: Transform via the mapping $y = \frac{1}{1 + \exp(-x)}$ and $x = \text{logit}(y)$.

class torch.distributions.transforms.SoftplusTransform(cache_size=0)[source][source]¶: Transform via the mapping $\text{Softplus}(x) = \log(1 + \exp(x))$. The implementation reverts to the linear function when $x > 20$.

class torch.distributions.transforms.TanhTransform(cache_size=0)[source][source]¶

Transform via the mapping $y = \tanh(x)$.

It is equivalent to

ComposeTransform(
    [
        AffineTransform(0.0, 2.0),
        SigmoidTransform(),
        AffineTransform(-1.0, 2.0),
    ]
)

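However, this composition might not be numerically stable, so using TanhTransform is recommended instead.

Note that one should use cache_size=1 when NaN/Inf values are involved.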
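The stated equivalence follows from the identity tanh(x) = 2·sigmoid(2x) − 1. The sketch below compares the two transforms on a moderate input range, where both are numerically well behaved; the grid of inputs is an arbitrary choice for illustration.

```python
import torch
from torch.distributions.transforms import (
    AffineTransform,
    ComposeTransform,
    SigmoidTransform,
    TanhTransform,
)

x = torch.linspace(-3.0, 3.0, steps=7)

# y = -1 + 2 * sigmoid(2 * x)
composed = ComposeTransform(
    [
        AffineTransform(0.0, 2.0),
        SigmoidTransform(),
        AffineTransform(-1.0, 2.0),
    ]
)
tanh = TanhTransform()

y_tanh = tanh(x)
y_composed = composed(x)

# The log-Jacobians agree as well, since both equal log(1 - tanh(x)^2)
ladj_tanh = tanh.log_abs_det_jacobian(x, y_tanh)
ladj_composed = composed.log_abs_det_jacobian(x, y_composed)
```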

class torch.distributions.transforms.SoftmaxTransform(cache_size=0)[source][source]¶

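Transform from unconstrained space to the simplex via $y = \exp(x)$ followed by normalization.

This is not bijective and cannot be used for HMC. However, it acts mostly coordinate-wise (except for the final normalization), and is therefore appropriate for coordinate-wise optimization algorithms.

class torch.distributions.transforms.StackTransform(tseq, dim=0, cache_size=0)[source][source]¶

Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim, in a way compatible with torch.stack().

Example

import torch
from torch.distributions.transforms import ExpTransform, StackTransform, identity_transform

# torch.arange(1.0, 11.0) yields the values 1..10 (torch.range is deprecated)
x = torch.stack([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=1)
t = StackTransform([ExpTransform(), identity_transform], dim=1)
y = t(x)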

class torch.distributions.transforms.StickBreakingTransform(cache_size=0)[source][source]¶

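Transform from unconstrained space to the simplex of one additional dimension via a stick-breaking process.

This transform arises as an iterated sigmoid transform in a stick-breaking construction of the Dirichlet distribution: the first logit is transformed via sigmoid to the first probability and the probability of everything else, and then the process recurses.

This is bijective and appropriate for use in HMC; however, it mixes coordinates together and is less appropriate for optimization.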
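The contrast between the two simplex transforms can be seen on a small input (the values are arbitrary). StickBreakingTransform adds one dimension and can be inverted exactly, whereas SoftmaxTransform keeps the dimension and maps shifted inputs to the same point, which is why it is not bijective.

```python
import torch
from torch.distributions.transforms import SoftmaxTransform, StickBreakingTransform

x = torch.tensor([0.5, -1.0, 2.0])

stick = StickBreakingTransform()
y_stick = stick(x)              # 3 unconstrained values -> 4 simplex coordinates
x_back = stick.inv(y_stick)     # bijective: the input is recovered

softmax = SoftmaxTransform()
y_soft = softmax(x)             # same number of coordinates in and out
y_shifted = softmax(x + 10.0)   # a constant shift yields the same output
```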

class torch.distributions.transforms.Transform(cache_size=0)[source][source]¶

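Abstract class for invertible transformations with computable log det jacobians. They are primarily used in torch.distributions.TransformedDistribution.

Caching is useful for transforms whose inverses are either expensive or numerically unstable. Note that care must be taken with memoized values, since the autograd graph may be reversed. For example, the following works with or without caching: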

y = t(x)
t.log_abs_det_jacobian(x, y).backward()  # x will receive gradients.

However, the following will raise an error when caching is used, due to dependency reversal:

y = t(x)
z = t.inv(y)
grad(z.sum(), [y])  # error because z is x

Derived classes should implement one or both of _call() or _inverse(). Derived classes that set bijective=True should also implement log_abs_det_jacobian().
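The memoization behavior described above can be observed with object identity. In this sketch, which uses ExpTransform as an arbitrary example, a transform created with cache_size=1 hands back the original input object from its inverse, whereas the uncached transform recomputes a new tensor.

```python
import torch
from torch.distributions.transforms import ExpTransform

x = torch.tensor([0.5, 1.0], requires_grad=True)

cached = ExpTransform(cache_size=1)
y = cached(x)
z_cached = cached.inv(y)        # memoized: the very same object as x

uncached = ExpTransform()       # cache_size=0
y2 = uncached(x)
z_uncached = uncached.inv(y2)   # recomputed as log(y2), a new tensor
```

Because z_cached is x itself, it does not depend on y in the autograd graph, which is the dependency reversal that makes differentiating z with respect to y fail when caching is on.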

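Parameters

cache_size (int) – Size of cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.

Variables

domain (Constraint) – The constraint representing valid inputs to this transform.
codomain (Constraint) – The constraint representing valid outputs of this transform, which are inputs to the inverse transform.
bijective (bool) – Whether this transform is bijective. A transform t is bijective iff t.inv(t(x)) == x and t(t.inv(y)) == y for every x in the domain and y in the codomain. Transforms that are not bijective should at least maintain the weaker pseudoinverse properties t(t.inv(t(x))) == t(x) and t.inv(t(t.inv(y))) == t.inv(y).
sign (int or Tensor) – For bijective univariate transforms, this should be +1 or -1 depending on whether the transform is monotone increasing or decreasing.

property inv: Transform¶: Returns the inverse Transform of this transform. This should satisfy t.inv.inv is t.

property sign: int¶: Returns the sign of the determinant of the Jacobian, if applicable. In general this only makes sense for bijective transforms.

log_abs_det_jacobian(x, y)[source][source]¶: Computes the log det jacobian log |dy/dx| given input and output.

forward_shape(shape)[source][source]¶: Infers the shape of the forward computation, given the input shape. Defaults to preserving shape.

inverse_shape(shape)[source][source]¶: Infers the shape of the inverse computation, given the output shape. Defaults to preserving shape.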

Constraints¶

class torch.distributions.constraints.Constraint[source][source]¶

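Abstract base class for constraints.

A constraint object represents a region over which a variable is valid, e.g. within which a variable can be optimized.

Variables

is_discrete (bool) – Whether the constrained space is discrete. Defaults to False.
event_dim (int) – Number of rightmost dimensions that together define an event. The check() method will remove this many dimensions when computing validity.

check(value)[source][source]¶: Returns a byte tensor of shape sample_shape + batch_shape indicating whether each event in value satisfies this constraint.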
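A brief sketch of how check() and event_dim interact, using two built-in constraints with arbitrary inputs. An elementwise constraint such as constraints.positive returns one flag per element, while constraints.simplex has event_dim 1 and therefore reduces the last dimension to a single flag per event. In recent PyTorch versions the result appears to be a boolean tensor rather than a byte tensor; treat the exact dtype as version-dependent.

```python
import torch
from torch.distributions import constraints

# event_dim == 0: one validity flag per element
elementwise = constraints.positive.check(torch.tensor([-1.0, 2.0]))

# event_dim == 1: the last dimension is consumed, one flag per vector
on_simplex = constraints.simplex.check(torch.tensor([0.2, 0.8]))
off_simplex = constraints.simplex.check(torch.tensor([0.7, 0.8]))

simplex_event_dim = constraints.simplex.event_dim
```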

torch.distributions.constraints.cat[source]¶: alias of _Cat

torch.distributions.constraints.dependent_property[source]¶: alias of _DependentProperty

torch.distributions.constraints.greater_than[source]¶: alias of _GreaterThan

torch.distributions.constraints.greater_than_eq[source]¶: alias of _GreaterThanEq

torch.distributions.constraints.independent[source]¶: alias of _IndependentConstraint

torch.distributions.constraints.integer_interval[source]¶: alias of _IntegerInterval

torch.distributions.constraints.interval[source]¶: alias of _Interval

torch.distributions.constraints.half_open_interval[source]¶: alias of _HalfOpenInterval

is_dependent(constraint)[source][source]¶

Checks if constraint is a _Dependent object.

Parameters: constraint – A Constraint object.
Returns: True if constraint can be refined to the type _Dependent, False otherwise.
Return type: bool

Examples

>>> import torch
>>> from torch.distributions import Bernoulli
>>> from torch.distributions.constraints import is_dependent

>>> dist = Bernoulli(probs=torch.tensor([0.6], requires_grad=True))
>>> constraint1 = dist.arg_constraints["probs"]
>>> constraint2 = dist.arg_constraints["logits"]

>>> for constraint in [constraint1, constraint2]:
>>>     if is_dependent(constraint):
>>>         continue

torch.distributions.constraints.less_than[source]¶: alias of _LessThan

torch.distributions.constraints.multinomial[source]¶: alias of _Multinomial

torch.distributions.constraints.stack[source]¶: alias of _Stack

Constraint Registry¶

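PyTorch provides two global ConstraintRegistry objects that link Constraint objects to Transform objects. These objects both take constraints as input and return transforms, but they offer different guarantees on bijectivity.

biject_to(constraint) looks up a bijective Transform from constraints.real to the given constraint. The returned transform is guaranteed to have .bijective = True and should implement log_abs_det_jacobian().
transform_to(constraint) looks up a not-necessarily bijective Transform from constraints.real to the given constraint. The returned transform is not guaranteed to implement log_abs_det_jacobian().

The transform_to() registry is useful for performing unconstrained optimization on constrained parameters of probability distributions, which are indicated by each distribution's .arg_constraints dict. These transforms often overparameterize a space in order to avoid rotation; they are thus better suited to coordinate-wise optimization algorithms like Adam: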

loc = torch.zeros(100, requires_grad=True)
unconstrained = torch.zeros(100, requires_grad=True)
scale = transform_to(Normal.arg_constraints["scale"])(unconstrained)
loss = -Normal(loc, scale).log_prob(data).sum()

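The biject_to() registry is useful for Hamiltonian Monte Carlo, where samples from a probability distribution with constrained .support are propagated in an unconstrained space, and algorithms are typically rotation invariant: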

dist = Exponential(rate)
unconstrained = torch.zeros(100, requires_grad=True)
sample = biject_to(dist.support)(unconstrained)
potential_energy = -dist.log_prob(sample).sum()

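Note

An example where transform_to and biject_to differ is constraints.simplex: transform_to(constraints.simplex) returns a SoftmaxTransform that simply exponentiates and normalizes its inputs; this is a cheap and mostly coordinate-wise operation appropriate for algorithms like SVI. In contrast, biject_to(constraints.simplex) returns a StickBreakingTransform that bijects its input down to a space with one fewer dimension; this is a more expensive and less numerically stable transform, but it is needed for algorithms like HMC.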
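The difference described in this note can be confirmed by querying both registries for constraints.simplex. In the sketch below (zero inputs are an arbitrary choice), the softmax route needs four unconstrained values to reach a four-point simplex, while the stick-breaking route needs only three.

```python
import torch
from torch.distributions import constraints
from torch.distributions.constraint_registry import biject_to, transform_to
from torch.distributions.transforms import SoftmaxTransform, StickBreakingTransform

t_soft = transform_to(constraints.simplex)   # overparameterized, not bijective
t_stick = biject_to(constraints.simplex)     # bijective, one fewer input dimension

p_soft = t_soft(torch.zeros(4))    # 4 inputs -> 4 probabilities
p_stick = t_stick(torch.zeros(3))  # 3 inputs -> 4 probabilities
```

For all-zero inputs both transforms return the uniform point on the simplex.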

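The biject_to and transform_to objects can be extended with user-defined constraints and transforms through their .register() method, either as a function on singleton constraints:

transform_to.register(my_constraint, my_transform)

or as a decorator on parameterized constraints: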

@transform_to.register(MyConstraintClass)
def my_factory(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.param1, constraint.param2)

You can create your own registry by creating a new ConstraintRegistry object.

class torch.distributions.constraint_registry.ConstraintRegistry[source][source]¶

Registry to link constraints to transforms.

register(constraint, factory=None)[source][source]¶

Registers a Constraint subclass in this registry. Usage:

@my_registry.register(MyConstraintClass)
def construct_transform(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.arg_constraints)

Parameters

constraint (subclass of Constraint) – A subclass of Constraint, or a singleton object of the desired class.
factory (Callable) – A callable that takes a constraint object as input and returns a Transform object.
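A minimal sketch of a user-defined registry follows. The registry name my_registry and the factory are hypothetical; the factory maps the built-in greater_than constraint to the transform y = lower_bound + exp(x), which is one possible choice and not necessarily what the global registries use.

```python
import torch
from torch.distributions import constraints
from torch.distributions.constraint_registry import ConstraintRegistry
from torch.distributions.transforms import AffineTransform, ComposeTransform, ExpTransform

my_registry = ConstraintRegistry()  # hypothetical user-defined registry


@my_registry.register(constraints.greater_than)
def greater_than_factory(constraint):
    # Map the reals onto (lower_bound, inf) via y = lower_bound + exp(x)
    return ComposeTransform(
        [ExpTransform(), AffineTransform(constraint.lower_bound, 1.0)]
    )


# Calling the registry with a constraint instance invokes the factory
transform = my_registry(constraints.greater_than(2.0))
y = transform(torch.tensor(0.0))  # 2 + exp(0)
```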
