
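Mutable Torch TensorRT Module

We demonstrate how to use the Mutable Torch TensorRT Module to compile, interact with, and modify a TensorRT graph module.

Compiling a Torch-TensorRT module is straightforward, but modifying a compiled module can be difficult, especially when it comes to keeping the state of the PyTorch module and the corresponding Torch-TensorRT module in sync. In ahead-of-time (AoT) scenarios, integrating Torch-TensorRT with complex pipelines, such as the Hugging Face Stable Diffusion pipeline, is harder still. The Mutable Torch TensorRT Module is designed to address these challenges and make interacting with a Torch-TensorRT module easier.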

In this tutorial, we walk through:

1. A sample workflow for the Mutable Torch TensorRT Module with ResNet-18
2. Saving a Mutable Torch TensorRT Module
3. Integration with a Hugging Face pipeline in a LoRA use case

import numpy as np
import torch
import torch_tensorrt as torch_trt
import torchvision.models as models

np.random.seed(5)
torch.manual_seed(5)
inputs = [torch.rand((1, 3, 224, 224)).to("cuda")]

Initialize the Mutable Torch TensorRT Module with settings.

settings = {
    # Use the C++ runtime (not the Python runtime); saving the module later requires it.
    "use_python_runtime": False,
    "enabled_precisions": {torch.float32},
    "make_refittable": True,
}

# Load ResNet-18 with its default pretrained ImageNet weights.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval().to("cuda")
mutable_module = torch_trt.MutableTorchTensorRTModule(model, **settings)
# You can use the mutable module just like the original PyTorch module.
# Compilation happens the first time you call the mutable module.
mutable_module(*inputs)
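
The comment above says that compilation is deferred until the first call. As a rough mental model only, that behaviour resembles the lazy wrapper sketched below. `LazyCompiled`, `fake_compile`, and `compile_log` are hypothetical names invented for this illustration; they are not part of torch_tensorrt and do not reflect how the library is implemented.

```python
from typing import Any, Callable, List, Optional


class LazyCompiled:
    """Hypothetical wrapper that compiles on first call (illustration only)."""

    def __init__(
        self,
        fn: Callable[..., Any],
        compile_fn: Callable[[Callable[..., Any]], Callable[..., Any]],
    ) -> None:
        self._fn = fn
        self._compile_fn = compile_fn
        self._compiled: Optional[Callable[..., Any]] = None  # nothing built yet

    def __call__(self, *args: Any) -> Any:
        # Build the "engine" only when it is first needed, then reuse it.
        if self._compiled is None:
            self._compiled = self._compile_fn(self._fn)
        return self._compiled(*args)


compile_log: List[str] = []


def fake_compile(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Stand-in for an expensive compilation step; it only records that it ran."""
    compile_log.append("compiled")
    return fn


lazy_square = LazyCompiled(lambda x: x * x, fake_compile)
compiled_before_call = list(compile_log)  # snapshot taken before any call

first = lazy_square(3)   # triggers the one-time "compilation"
second = lazy_square(4)  # reuses the already compiled callable
```

In this toy version, constructing the wrapper is cheap and the cost is paid once, on the first call, which is why the first inference through a freshly created mutable module can be expected to take noticeably longer than later ones.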

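Make modifications to the mutable module.

Making changes to the mutable module may trigger a refit or a recompilation. For example, loading a different state_dict and setting new weight values triggers a refit, while adding a module to the model triggers a recompilation.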

# A second ResNet-18 with randomly initialized weights (no pretrained checkpoint).
model2 = models.resnet18(weights=None).eval().to("cuda")
mutable_module.load_state_dict(model2.state_dict())

# Check the output.
# The refit happens the next time the mutable module is called.
expected_outputs, refitted_outputs = model2(*inputs), mutable_module(*inputs)
for expected_output, refitted_output in zip(expected_outputs, refitted_outputs):
    assert torch.allclose(
        expected_output, refitted_output, rtol=1e-2, atol=1e-2
    ), "Refit result is not correct. Refit failed."

print("Refit successfully!")
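
To make the distinction between a refit and a recompilation more concrete, here is a minimal sketch of the idea in plain Python. It assumes a simplified picture in which a model is just a mapping from parameter names to flat lists of values, and the list length stands in for the tensor shape. `classify_change` is a hypothetical helper written for this illustration; the actual change tracking inside torch_tensorrt is not shown here and may work differently.

```python
from typing import Dict, List

ToyStateDict = Dict[str, List[float]]


def classify_change(old: ToyStateDict, new: ToyStateDict) -> str:
    """Toy rule: same structure with new values -> refit; new structure -> recompile."""
    # A parameter was added or removed, so the graph itself has changed.
    if old.keys() != new.keys():
        return "recompile"
    # A parameter changed size, so the existing engine no longer fits.
    if any(len(old[name]) != len(new[name]) for name in old):
        return "recompile"
    # Same names and sizes, but different values: only the weights need updating.
    if any(old[name] != new[name] for name in old):
        return "refit"
    return "unchanged"


original = {"fc.weight": [0.1, 0.2], "fc.bias": [0.0]}
new_weights = {"fc.weight": [0.3, 0.4], "fc.bias": [0.5]}
extra_layer = {"fc.weight": [0.1, 0.2], "fc.bias": [0.0], "head.weight": [1.0]}

after_load_state_dict = classify_change(original, new_weights)
after_adding_module = classify_change(original, extra_layer)
```

Under this simplified rule, swapping in the weights of a second ResNet-18 keeps every parameter name and size intact, so it falls in the refit case, whereas attaching an extra layer changes the structure and falls in the recompile case.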

Saving Mutable Torch TensorRT Module

# Currently, saving is only supported with the C++ runtime, not the Python runtime.
torch_trt.MutableTorchTensorRTModule.save(mutable_module, "mutable_module.pkl")
reload = torch_trt.MutableTorchTensorRTModule.load("mutable_module.pkl")

Stable Diffusion with Hugging Face

# The LoRA checkpoint is from https://civitai.com/models/12597/moxin

from diffusers import DiffusionPipeline

with torch.no_grad():
    settings = {
        "use_python_runtime": True,
        "enabled_precisions": {torch.float16},
        "debug": True,
        "make_refittable": True,
    }

    model_id = "runwayml/stable-diffusion-v1-5"
    device = "cuda:0"

    prompt = "house in forest, shuimobysim, wuchangshuo, best quality"
    negative = "(worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, out of focus, cloudy, (watermark:2),"

    pipe = DiffusionPipeline.from_pretrained(
        model_id, revision="fp16", torch_dtype=torch.float16
    )
    pipe.to(device)

    # The only extra line you need
    pipe.unet = torch_trt.MutableTorchTensorRTModule(pipe.unet, **settings)

    image = pipe(prompt, negative_prompt=negative, num_inference_steps=30).images[0]
    image.save("./without_LoRA_mutable.jpg")

    # Standard Huggingface LoRA loading procedure
    pipe.load_lora_weights(
        "stablediffusionapi/load_lora_embeddings",
        weight_name="moxin.safetensors",
        adapter_name="lora1",
    )
    pipe.set_adapters(["lora1"], adapter_weights=[1])
    pipe.fuse_lora()
    pipe.unload_lora_weights()

    # Refit triggered
    image = pipe(prompt, negative_prompt=negative, num_inference_steps=30).images[0]
    image.save("./with_LoRA_mutable.jpg")
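
The second pipeline call above is marked as triggering a refit rather than a recompilation. One way to see why is to look at what fusing a LoRA adapter does to a weight matrix. The sketch below uses the standard LoRA low-rank update, W' = W + scale * (B x A), on tiny hand-written matrices in plain Python. `matmul` and `fuse_low_rank_update` are hypothetical helpers for this illustration only; they are not the diffusers implementation, and details such as the alpha/rank scaling used by real checkpoints are folded into the single `scale` argument here.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_low_rank_update(
    weight: Matrix, lora_b: Matrix, lora_a: Matrix, scale: float = 1.0
) -> Matrix:
    """Return weight + scale * (lora_b @ lora_a); the shape matches `weight`."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def shape(m: Matrix) -> tuple:
    """Number of rows and columns of a rectangular matrix."""
    return (len(m), len(m[0]))


weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 x 3 base weight
lora_b = [[1.0], [2.0]]                      # 2 x 1, rank-1 factor
lora_a = [[0.5, 0.0, -0.5]]                  # 1 x 3, rank-1 factor

fused = fuse_low_rank_update(weight, lora_b, lora_a, scale=1.0)
```

The fused matrix has new values but the same shape as the original weight, and after the adapter is unloaded no extra layers remain in the model. That is consistent with the tutorial's comment that only a refit of the existing engine is needed.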
