Model Understanding with Captum

Created On: Nov 30, 2021 | Last Updated: Jan 19, 2024 | Last Verified: Nov 05, 2024

Captum ("comprehension" in Latin) is an open source, extensible library for model interpretability built on PyTorch.

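With the increase in model complexity and the resulting lack of transparency, model interpretability methods have become increasingly important. Model understanding is both an active area of research and an area of focus for practical applications of machine learning across industries. Captum provides state-of-the-art algorithms, including Integrated Gradients, that give researchers and developers an easy way to understand which features contribute to a model's output.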

Full documentation, an API reference, and a suite of tutorials on specific topics are available at the captum.ai website.

Introduction

Captum approaches model interpretability in terms of attributions. Three kinds of attribution are available in Captum:

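Feature Attribution seeks to explain a particular output in terms of features of the input that generated it. Explaining whether a movie review was positive or negative in terms of certain words in the review is an example of feature attribution.
Layer Attribution examines the activity of a model's hidden layer subsequent to a particular input. Examining the spatially mapped output of a convolutional layer in response to an input image is an example of layer attribution.
Neuron Attribution is analogous to layer attribution, but focuses on the activity of a single neuron.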

In this interactive notebook, we'll look at Feature Attribution and Layer Attribution.

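Each of the three attribution types has multiple attribution algorithms associated with it. Many attribution algorithms fall into two broad categories: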

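Gradient-based algorithms calculate the backward gradients of a model output, layer output, or neuron activation with respect to the input. Integrated Gradients (for features), Layer Gradient * Activation, and Neuron Conductance are all gradient-based algorithms.
Perturbation-based algorithms examine how the output of a model, layer, or neuron changes in response to changes in the input. The input perturbations may be directed or random. Occlusion, Feature Ablation, and Feature Permutation are all perturbation-based algorithms.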
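
To make the two categories concrete, here is a minimal, dependency-free sketch. It does not use Captum: the linear model, its weights, and both helper functions are hypothetical stand-ins chosen for illustration. The gradient-based helper estimates the gradient by finite differences and multiplies it by the input; the perturbation-based helper replaces one feature at a time with a baseline value and records how much the output drops.

```python
# Toy illustration only: a hand-written linear "model", not a Captum API.
WEIGHTS = [2.0, -1.0, 0.5]  # hypothetical weights


def model(x):
    """Score = weighted sum of the input features."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x))


def gradient_times_input(f, x, eps=1e-6):
    """Gradient-based: estimate df/dx_i by central differences, then scale by x_i."""
    attributions = []
    for i, xi in enumerate(x):
        up = list(x)
        up[i] = xi + eps
        down = list(x)
        down[i] = xi - eps
        grad_i = (f(up) - f(down)) / (2 * eps)
        attributions.append(grad_i * xi)
    return attributions


def feature_ablation(f, x, baseline=0.0):
    """Perturbation-based: replace one feature with a baseline, record the output drop."""
    full = f(x)
    attributions = []
    for i in range(len(x)):
        ablated = list(x)
        ablated[i] = baseline
        attributions.append(full - f(ablated))
    return attributions


x = [1.0, 3.0, 4.0]
grad_attr = gradient_times_input(model, x)
ablation_attr = feature_ablation(model, x)
print(grad_attr)
print(ablation_attr)
```

For a purely linear model with a zero baseline the two approaches agree (up to floating-point error). For nonlinear models such as the ResNet used later they generally differ, which is one reason to try algorithms from both families.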

We'll be examining algorithms of both types below.

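Especially where large models are involved, it can be valuable to visualize attribution data in ways that relate it easily to the input features being examined. While it is certainly possible to create your own visualizations with Matplotlib, Plotly, or similar tools, Captum offers enhanced tools specific to its attributions: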

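The captum.attr.visualization module (imported below as viz) provides helpful functions for visualizing attributions related to images.
Captum Insights is an easy-to-use API on top of Captum that provides a visualization widget with ready-made visualizations for image, text, and arbitrary model types.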

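Both of these visualization toolsets will be demonstrated in this notebook. The first few examples will focus on computer vision use cases, but the Captum Insights section at the end will demonstrate visualization of attributions in a multi-model, visual question-and-answer model.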

Installation

Before you get started, you need to have a Python environment with:

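Python version 3.6 or higher
For the Captum Insights example, Flask 1.1 or higher and Flask-Compress (the latest version is recommended)
PyTorch version 1.2 or higher (the latest version is recommended)
TorchVision version 0.6 or higher (the latest version is recommended)
Captum (the latest version is recommended)
Matplotlib version 3.3.4, since Captum currently uses a Matplotlib function whose arguments have been renamed in later versions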

To install Captum in an Anaconda or pip virtual environment, use the appropriate command for your environment below:

With conda:

conda install pytorch torchvision captum flask-compress matplotlib=3.3.4 -c pytorch

With pip:

pip install torch torchvision captum matplotlib==3.3.4 Flask-Compress

Restart this notebook in the environment you set up, and you're ready to go!
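
One way to confirm which versions ended up in the environment is sketched below. It relies only on the standard-library importlib.metadata module, which assumes Python 3.8 or newer (stricter than the 3.6 minimum listed above). The helper name installed_versions is hypothetical, and the package names passed to it are the distribution names used in the pip command.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["torch", "torchvision", "captum", "matplotlib", "flask"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```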

A First Example

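To start, let's take a simple, visual example. We'll begin with a ResNet model pretrained on the ImageNet dataset. We'll take a test input and use different Feature Attribution algorithms to examine how the input image affects the output, and see a helpful visualization of the resulting input attribution map for some test images.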

First, some imports:

import torch
import torch.nn.functional as F
import torchvision.transforms as transforms
import torchvision.models as models

import captum
from captum.attr import IntegratedGradients, Occlusion, LayerGradCam, LayerAttribution
from captum.attr import visualization as viz

import os, sys
import json

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

Now we'll use the TorchVision model library to download a pretrained ResNet. Since we're not training, we'll place it in evaluation mode for now.

model = models.resnet18(weights='IMAGENET1K_V1')
model = model.eval()

The place where you got this interactive notebook should also have an img folder containing a file named cat.jpg.

test_img = Image.open('img/cat.jpg')
test_img_data = np.asarray(test_img)
plt.imshow(test_img_data)
plt.show()

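Our ResNet model was trained on the ImageNet dataset and expects images of a certain size, with the channel data normalized to a specific range of values. We'll also pull in the list of human-readable labels for the categories our model recognizes, which should be in the img folder as well.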

# model expects 224x224 3-color image
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor()
])

# standard ImageNet normalization
transform_normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)

transformed_img = transform(test_img)
input_img = transform_normalize(transformed_img)
input_img = input_img.unsqueeze(0) # the model requires a dummy batch dimension

labels_path = 'img/imagenet_class_index.json'
with open(labels_path) as json_data:
    idx_to_labels = json.load(json_data)
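
The normalization step is simple per-channel arithmetic: subtract the channel mean and divide by the channel standard deviation. The sketch below works it through for a single pixel in plain Python, using the same mean and standard deviation values as above. The helper names are hypothetical, and the inverse function is included only to show how a normalized value could be mapped back for display.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]


def denormalize_pixel(rgb):
    """Invert the normalization, e.g. to display a normalized image."""
    return [v * s + m for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]


mid_gray = [0.5, 0.5, 0.5]  # ToTensor() scales 8-bit values into [0, 1]
normalized = normalize_pixel(mid_gray)
print(normalized)
print(denormalize_pixel(normalized))
```

This is also why the visualizations later in the notebook pass transformed_img (before normalization) rather than input_img as the image to display.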

Now we can ask the question: what does our model think this image represents?

output = model(input_img)
output = F.softmax(output, dim=1)
prediction_score, pred_label_idx = torch.topk(output, 1)
pred_label_idx.squeeze_()
predicted_label = idx_to_labels[str(pred_label_idx.item())][1]
print('Predicted:', predicted_label, '(', prediction_score.squeeze().item(), ')')
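
For readers less familiar with the two calls above, the following plain-Python sketch shows what a softmax followed by a top-1 selection computes. The three logits are hypothetical; the real model produces one score per ImageNet class.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top1(probabilities):
    """Return (probability, index) of the most likely class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return probabilities[index], index


logits = [1.0, 3.0, 0.5]  # hypothetical scores for three classes
probs = softmax(logits)
score, label_idx = top1(probs)
print(label_idx, round(score, 3))
```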

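We've confirmed that ResNet thinks our image of a cat is, in fact, a cat. But why does the model think this is an image of a cat?

For the answer to that, we turn to Captum.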

Feature Attribution with Integrated Gradients

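Feature attribution attributes a particular output to features of the input. It uses a specific input (here, our test image) to generate a map of the relative importance of each input feature to a particular output feature.

Integrated Gradients is one of the feature attribution algorithms available in Captum. Integrated Gradients assigns an importance score to each input feature by approximating the integral of the gradients of the model's output with respect to the inputs.

In our case, we'll take a specific element of the output vector (the one indicating the model's confidence in its chosen category) and use Integrated Gradients to understand which parts of the input image contributed to this output.

Once we have the importance map from Integrated Gradients, we'll use the visualization tools in Captum to give a helpful representation of it. Captum's visualize_image_attr() function provides a variety of options for customizing the display of your attribution data. Here, we pass in a custom Matplotlib color map.

Running the cell with the integrated_gradients.attribute() call will usually take a minute or two.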

# Initialize the attribution algorithm with the model
integrated_gradients = IntegratedGradients(model)

# Ask the algorithm to attribute our output target to the input features
attributions_ig = integrated_gradients.attribute(input_img, target=pred_label_idx, n_steps=200)

# Show the original image for comparison
_ = viz.visualize_image_attr(None, np.transpose(transformed_img.squeeze().cpu().detach().numpy(), (1,2,0)),
                      method="original_image", title="Original Image")

default_cmap = LinearSegmentedColormap.from_list('custom blue',
                                                 [(0, '#ffffff'),
                                                  (0.25, '#0000ff'),
                                                  (1, '#0000ff')], N=256)

_ = viz.visualize_image_attr(np.transpose(attributions_ig.squeeze().cpu().detach().numpy(), (1,2,0)),
                             np.transpose(transformed_img.squeeze().cpu().detach().numpy(), (1,2,0)),
                             method='heat_map',
                             cmap=default_cmap,
                             show_colorbar=True,
                             sign='positive',
                             title='Integrated Gradients')

In the image above, you should see that Integrated Gradients gives its strongest signal around the cat's location in the image.
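
To make the "integral of the gradients" idea concrete, here is a small dependency-free sketch. It is not Captum's implementation: the three-feature model, its hand-written gradient, the zero baseline, and the midpoint Riemann sum are all assumptions chosen so the arithmetic is easy to follow. The attribution for feature i is (x_i - baseline_i) times the average gradient along the straight path from the baseline to the input.

```python
def model(x):
    """Hypothetical differentiable model with a known gradient."""
    return x[0] ** 2 + 3.0 * x[1] + x[0] * x[2]


def gradient(x):
    """Analytic gradient of `model` at x."""
    return [2.0 * x[0] + x[2], 3.0, x[0]]


def integrated_gradients(x, baseline, n_steps=200):
    """Midpoint Riemann-sum approximation of the path integral of gradients."""
    totals = [0.0] * len(x)
    for step in range(n_steps):
        alpha = (step + 0.5) / n_steps
        # Point on the straight line from the baseline (alpha=0) to x (alpha=1).
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g
    return [(xi - b) * t / n_steps for xi, b, t in zip(x, baseline, totals)]


x = [2.0, 1.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)
print(attributions)
print(sum(attributions), model(x) - model(baseline))
```

The two numbers on the last line should match closely: the attributions sum to the difference between the model's output at the input and at the baseline. A larger n_steps gives a finer approximation of the integral at the cost of more gradient evaluations, which is why the call with n_steps=200 on a full ResNet takes a while.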

Feature Attribution with Occlusion

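Gradient-based attribution methods help to understand the model by directly computing how the output changes with respect to the input. Perturbation-based attribution methods approach this more directly, by introducing changes to the input and measuring the effect on the output. Occlusion is one such method. It involves replacing sections of the input image and examining the effect on the output signal.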

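Below, we set up Occlusion attribution. Similarly to configuring a convolutional neural network, you can specify the size of the target region and a stride length that determines the spacing of individual measurements. We'll visualize the output of our Occlusion attribution with visualize_image_attr_multiple(), showing heat maps of both positive and negative attribution by region, and masking the original image with the positive attribution regions. The masking gives a very instructive view of which regions of our cat photo the model found most "cat-like".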

occlusion = Occlusion(model)

attributions_occ = occlusion.attribute(input_img,
                                       target=pred_label_idx,
                                       strides=(3, 8, 8),
                                       sliding_window_shapes=(3,15, 15),
                                       baselines=0)


_ = viz.visualize_image_attr_multiple(np.transpose(attributions_occ.squeeze().cpu().detach().numpy(), (1,2,0)),
                                      np.transpose(transformed_img.squeeze().cpu().detach().numpy(), (1,2,0)),
                                      ["original_image", "heat_map", "heat_map", "masked_image"],
                                      ["all", "positive", "negative", "positive"],
                                      show_colorbar=True,
                                      titles=["Original", "Positive Attribution", "Negative Attribution", "Masked"],
                                      fig_size=(18, 6)
                                     )

Again, we see greater significance placed on the region of the image that contains the cat.
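
The sliding-window idea behind occlusion can be sketched without any libraries. Everything here is a hypothetical toy: the "model" simply sums the four centre pixels of a 4x4 image, and the occlusion helper is a simplified stand-in rather than Captum's implementation. Each window position is blanked out with the baseline value, the drop in score is recorded, and each pixel's attribution is the average drop over all windows that covered it.

```python
def model(image):
    """Hypothetical scorer: sums the pixels in the 2x2 centre of a 4x4 image."""
    return sum(image[r][c] for r in (1, 2) for c in (1, 2))


def occlusion(f, image, window=2, stride=1, baseline=0.0):
    """Slide a window over the image, blank it out, and average the score drops."""
    height, width = len(image), len(image[0])
    full_score = f(image)
    totals = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            occluded = [row[:] for row in image]
            for r in range(top, top + window):
                for c in range(left, left + window):
                    occluded[r][c] = baseline
            drop = full_score - f(occluded)
            for r in range(top, top + window):
                for c in range(left, left + window):
                    totals[r][c] += drop
                    counts[r][c] += 1
    return [[totals[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(width)] for r in range(height)]


image = [[1.0] * 4 for _ in range(4)]
heat_map = occlusion(model, image)
for row in heat_map:
    print([round(v, 2) for v in row])
```

The centre pixels receive the highest scores, but the border pixels are not zero even though the toy model ignores them: a window that covers a border pixel may also cover part of the centre. Window size and stride therefore trade spatial precision against the number of forward passes.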

Layer Attribution with Layer GradCAM

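Layer Attribution allows you to attribute the activity of hidden layers within your model to features of your input. Below, we'll use a layer attribution algorithm to examine the activity of one of the convolutional layers within our model.

GradCAM computes the gradients of the target output with respect to the given layer, averages them for each output channel (the second dimension of the output), and multiplies the average gradient for each channel by the layer activations. The results are summed over all channels. GradCAM is designed for convolutional networks; since the activity of convolutional layers often maps spatially to the input, GradCAM attributions are often upsampled and used to mask the input.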
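
The arithmetic described above can be sketched on a tiny example. The activations and gradients below are hypothetical hand-picked numbers for a two-channel, 2x2 layer output, and grad_cam is an illustrative helper, not Captum's LayerGradCam. Some formulations additionally apply a ReLU to the result; that step is omitted here so negative contributions stay visible.

```python
def grad_cam(activations, gradients):
    """activations, gradients: nested lists shaped [channels][height][width]."""
    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for act, grad in zip(activations, gradients):
        # Channel weight = spatial average of the gradient for this channel.
        weight = sum(sum(row) for row in grad) / (height * width)
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * act[r][c]
    return cam


# Hypothetical 2-channel, 2x2 layer output and matching gradients.
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]],
             [[-1.0, 0.0], [0.0, -1.0]]]
cam = grad_cam(activations, gradients)
print(cam)
```

The result has one value per spatial location, with the channel dimension summed away. This is why the attribution map for a convolutional layer is a single-channel image at the layer's spatial resolution.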

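Setting up layer attribution is similar to input attribution, except that in addition to the model, you must specify a hidden layer within the model that you wish to examine. As above, when we call attribute(), we specify the target class of interest.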

layer_gradcam = LayerGradCam(model, model.layer3[1].conv2)
attributions_lgc = layer_gradcam.attribute(input_img, target=pred_label_idx)

_ = viz.visualize_image_attr(attributions_lgc[0].cpu().permute(1,2,0).detach().numpy(),
                             sign="all",
                             title="Layer 3 Block 1 Conv 2")

We'll use the convenience method interpolate() in the LayerAttribution base class to upsample this attribution data for comparison to the input image.

upsamp_attr_lgc = LayerAttribution.interpolate(attributions_lgc, input_img.shape[2:])

print(attributions_lgc.shape)
print(upsamp_attr_lgc.shape)
print(input_img.shape)

_ = viz.visualize_image_attr_multiple(upsamp_attr_lgc[0].cpu().permute(1,2,0).detach().numpy(),
                                      transformed_img.permute(1,2,0).numpy(),
                                      ["original_image","blended_heat_map","masked_image"],
                                      ["all","positive","positive"],
                                      show_colorbar=True,
                                      titles=["Original", "Positive Attribution", "Masked"],
                                      fig_size=(18, 6))

Visualizations such as this can give you novel insights into how your hidden layers respond to your input.
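
As a rough picture of what upsampling a coarse attribution map involves, the sketch below resizes a 2x2 grid to 4x4 in plain Python. It assumes nearest-neighbour interpolation, where each output cell copies the closest coarse cell; the helper is hypothetical, and the interpolation mode actually used by interpolate() may differ depending on its arguments.

```python
def upsample_nearest(grid, out_height, out_width):
    """Nearest-neighbour resize of a 2D nested list."""
    in_height, in_width = len(grid), len(grid[0])
    return [[grid[r * in_height // out_height][c * in_width // out_width]
             for c in range(out_width)]
            for r in range(out_height)]


coarse = [[1, 2],
          [3, 4]]
fine = upsample_nearest(coarse, 4, 4)
for row in fine:
    print(row)
```

Each coarse value becomes a constant block in the larger grid, which explains the blocky look of upsampled layer attributions when they are overlaid on the input image.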

Visualization with Captum Insights

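Captum Insights is an interpretability visualization widget built on top of Captum to facilitate model understanding. Captum Insights works across images, text, and other features to help users understand feature attribution. It allows you to visualize attribution for multiple input/output pairs, and provides visualization tools for image, text, and arbitrary data.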

In this section of the notebook, we'll visualize multiple image classification inferences with Captum Insights.

First, let's gather some images and see what the model thinks of them. For variety, we'll take our cat, a teapot, and a trilobite fossil:

imgs = ['img/cat.jpg', 'img/teapot.jpg', 'img/trilobite.jpg']

for img_path in imgs:
    img = Image.open(img_path)
    transformed_img = transform(img)
    input_img = transform_normalize(transformed_img)
    input_img = input_img.unsqueeze(0) # the model requires a dummy batch dimension

    output = model(input_img)
    output = F.softmax(output, dim=1)
    prediction_score, pred_label_idx = torch.topk(output, 1)
    pred_label_idx.squeeze_()
    predicted_label = idx_to_labels[str(pred_label_idx.item())][1]
    print('Predicted:', predicted_label, '/', pred_label_idx.item(), ' (', prediction_score.squeeze().item(), ')')

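...and it looks like our model is identifying them all correctly, but of course we want to dig deeper. For that we'll use the Captum Insights widget, which we configure with an AttributionVisualizer object, imported below. The AttributionVisualizer expects batches of data, so we'll bring in Captum's Batch helper class. We'll be looking at images specifically, so we'll also import ImageFeature.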

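We configure the AttributionVisualizer with the following arguments:

An array of models to be examined (in our case, just the one)
A scoring function, which allows Captum Insights to pull out the top-k predictions from a model
An ordered, human-readable list of the classes our model is trained on
A list of features to look for (in our case, an ImageFeature)
A dataset, which is an iterable object returning batches of inputs and labels, just like you'd use for training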
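
Because the class list must be ordered so that position i holds the label for output element i, it can be worth building it with an explicit sort rather than relying on the key order of the JSON file. The sketch below assumes the same {"index": [id, label]} layout that the label lookup earlier in this notebook implies; the three entries and their identifiers are made-up placeholders, not real ImageNet data.

```python
import json

# Hypothetical three-entry excerpt in the same {"index": [id, label]} layout.
raw = '{"2": ["id_c", "teapot"], "0": ["id_a", "cat"], "1": ["id_b", "trilobite"]}'
idx_to_labels = json.loads(raw)

# Sort by integer index so that classes[i] is the label for output element i.
classes = [idx_to_labels[key][1] for key in sorted(idx_to_labels, key=int)]
print(classes)
```

Sorting with key=int matters once there are more than ten classes, since a plain string sort would place "10" before "2".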

from captum.insights import AttributionVisualizer, Batch
from captum.insights.attr_vis.features import ImageFeature

# Baseline is all-zeros input - this may differ depending on your data
def baseline_func(input):
    return input * 0

# merging our image transforms from above
def full_img_transform(input):
    i = Image.open(input)
    i = transform(i)
    i = transform_normalize(i)
    i = i.unsqueeze(0)
    return i


input_imgs = torch.cat(list(map(lambda i: full_img_transform(i), imgs)), 0)

visualizer = AttributionVisualizer(
    models=[model],
    score_func=lambda o: torch.nn.functional.softmax(o, 1),
    classes=list(map(lambda k: idx_to_labels[k][1], idx_to_labels.keys())),
    features=[
        ImageFeature(
            "Photo",
            baseline_transforms=[baseline_func],
            input_transforms=[],
        )
    ],
    dataset=[Batch(input_imgs, labels=[282,849,69])]
)

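Note that running the cell above took hardly any time at all, unlike our attributions above. That's because Captum Insights lets you configure different attribution algorithms in a visual widget, after which it will compute and display the attributions. That process will take a few minutes.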

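Running the cell below will render the Captum Insights widget. You can then choose attribution methods and their arguments, filter model responses based on predicted class or prediction correctness, see the model's predictions with their associated probabilities, and view heat maps of the attribution compared with the original image.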

visualizer.render()
