
Multi-Image Generation Streamlit App: Chaining Llama & Stable Diffusion using TorchServe, torch.compile & OpenVINO

This Multi-Image Generation Streamlit app is designed to generate multiple images from a provided text prompt. Instead of using Stable Diffusion directly, the app chains Llama and Stable Diffusion to enhance the image generation process. Here's how it works:

Multi-Image Generation App Workflow
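
The chaining idea can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual TorchServe handler code; it assumes the transformers and diffusers libraries and uses the app's default model identifiers:

import torch
from transformers import pipeline
from diffusers import DiffusionPipeline

# Step 1: Llama turns one user prompt into several prompt variations.
llm = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")
messages = [{"role": "user", "content":
             "Write 3 short, varied image prompts, one per line, for: a lighthouse at sunset"}]
reply = llm(messages, max_new_tokens=128)[0]["generated_text"][-1]["content"]
variations = [line.strip("-* ").strip() for line in reply.splitlines() if line.strip()]

# Step 2: Stable Diffusion renders one image per prompt variation.
sd = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
for i, prompt in enumerate(variations):
    sd(prompt=prompt, num_inference_steps=20).images[0].save(f"image-{i}.png")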

Quick Start Guide

Prerequisites:

  • Docker installed on your system

  • Hugging Face token: Create a Hugging Face account and obtain a token with access to the meta-llama/Llama-3.2-3B-Instruct model (you can verify access with the sketch after this list).
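
To confirm the token works before building, a quick check like the following can help. This is a sketch using huggingface_hub (not part of the app itself); it raises an error when the token lacks access to the gated repo:

import os
from huggingface_hub import HfApi

# Raises an error (e.g. a gated-repo error) if the token cannot access the repo.
HfApi().model_info("meta-llama/Llama-3.2-3B-Instruct", token=os.environ["HUGGINGFACE_TOKEN"])
print("Token has access.")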

To launch the Multi-Image Generation App, follow these steps:

# 1: Set HF Token as Env variable
export HUGGINGFACE_TOKEN=<HUGGINGFACE_TOKEN>

# 2: Build Docker image for this Multi-Image Generation App
git clone https://github.com/pytorch/serve.git
cd serve
./examples/usecases/llm_diffusion_serving_app/docker/build_image.sh

# 3: Launch the streamlit app for server & client
# After the Docker build is successful, you will see a "docker run" command printed to the console. 
# Run that "docker run" command to launch the Streamlit app for both the server and client.

Sample output of the Docker build:

ubuntu@ip-10-0-0-137:~/serve$ ./examples/usecases/llm_diffusion_serving_app/docker/build_image.sh 
EXAMPLE_DIR: .//examples/usecases/llm_diffusion_serving_app/docker
ROOT_DIR: /home/ubuntu/serve
DOCKER_BUILDKIT=1 docker buildx build --platform=linux/amd64 --file .//examples/usecases/llm_diffusion_serving_app/docker/Dockerfile --build-arg BASE_IMAGE="pytorch/torchserve:latest-cpu" --build-arg EXAMPLE_DIR=".//examples/usecases/llm_diffusion_serving_app/docker" --build-arg HUGGINGFACE_TOKEN=hf_<token> --build-arg HTTP_PROXY= --build-arg HTTPS_PROXY= --build-arg NO_PROXY= -t "pytorch/torchserve:llm_diffusion_serving_app" .
[+] Building 1.4s (18/18) FINISHED                                                                                                                                                               docker:default
 => [internal] load .dockerignore                                                                                                                                                                          0.0s
 .
 .
 .
 => => naming to docker.io/pytorch/torchserve:llm_diffusion_serving_app                                                                                                                                    0.0s

Docker Build Successful ! 

............................ Next Steps ............................
--------------------------------------------------------------------
[Optional] Run the following command to benchmark Stable Diffusion:
--------------------------------------------------------------------

docker run --rm --platform linux/amd64 \
        --name llm_sd_app_bench \
        -v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
        --entrypoint python \
        pytorch/torchserve:llm_diffusion_serving_app \
        /home/model-server/llm_diffusion_serving_app/sd-benchmark.py -ni 3

-------------------------------------------------------------------
Run the following command to start the Multi-Image generation App:
-------------------------------------------------------------------

docker run --rm -it --platform linux/amd64 \
        --name llm_sd_app \
        -p 127.0.0.1:8080:8080 \
        -p 127.0.0.1:8081:8081 \
        -p 127.0.0.1:8082:8082 \
        -p 127.0.0.1:8084:8084 \
        -p 127.0.0.1:8085:8085 \
        -v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
        -e MODEL_NAME_LLM=meta-llama/Llama-3.2-3B-Instruct \
        -e MODEL_NAME_SD=stabilityai/stable-diffusion-xl-base-1.0 \
        pytorch/torchserve:llm_diffusion_serving_app

Note: You can replace the model identifiers (MODEL_NAME_LLM, MODEL_NAME_SD) as needed.

What to Expect

After launching the Docker container using the docker run .. command displayed after a successful build, you can access two separate Streamlit apps:

  1. TorchServe Server App (running at http://127.0.0.1:8084) to start/stop TorchServe, load/register models, and scale workers up/down.

  2. Client App (running at http://127.0.0.1:8085) where you can enter prompts for image generation.
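
Once TorchServe has been started from the server app, you can sanity-check it from the host with TorchServe's standard /ping endpoint on the inference port (8080, as mapped in the docker run command above). A minimal sketch:

import urllib.request

# TorchServe's inference API responds on port 8080; expect {"status": "Healthy"}.
with urllib.request.urlopen("http://127.0.0.1:8080/ping") as resp:
    print(resp.read().decode())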

Note: You can also run a quick benchmark comparing Stable Diffusion performance in eager mode, with torch.compile using the inductor backend, and with the openvino backend. Check the docker run .. command for benchmarking displayed after a successful build.
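
The three run modes differ only in how the Stable Diffusion UNet is prepared. A minimal sketch of that distinction, assuming openvino is installed (importing openvino.torch registers the "openvino" backend for torch.compile); the benchmark script's exact settings may differ:

import torch
import openvino.torch  # registers the "openvino" backend for torch.compile
from diffusers import DiffusionPipeline

def prepare_pipeline(mode: str):
    pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
    if mode == "eager":
        return pipe  # eager: run the pipeline as loaded, no compilation
    # tc_inductor / tc_openvino: compile the UNet with the chosen backend
    backend = "inductor" if mode == "tc_inductor" else "openvino"
    pipe.unet = torch.compile(pipe.unet, backend=backend)
    return pipe

pipe = prepare_pipeline("tc_openvino")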

Sample output of launching the app:

ubuntu@ip-10-0-0-137:~/serve$ docker run --rm -it --platform linux/amd64 \
        --name llm_sd_app \
        -p 127.0.0.1:8080:8080 \
        -p 127.0.0.1:8081:8081 \
        -p 127.0.0.1:8082:8082 \
        -p 127.0.0.1:8084:8084 \
        -p 127.0.0.1:8085:8085 \
        -v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
        -e MODEL_NAME_LLM=meta-llama/Llama-3.2-3B-Instruct \
        -e MODEL_NAME_SD=stabilityai/stable-diffusion-xl-base-1.0 \
        pytorch/torchserve:llm_diffusion_serving_app

Preparing meta-llama/Llama-3.2-1B-Instruct
/home/model-server/llm_diffusion_serving_app/llm /home/model-server/llm_diffusion_serving_app
Model meta-llama---Llama-3.2-1B-Instruct already downloaded.
Model archive for meta-llama---Llama-3.2-1B-Instruct exists.
/home/model-server/llm_diffusion_serving_app

Preparing stabilityai/stable-diffusion-xl-base-1.0
/home/model-server/llm_diffusion_serving_app/sd /home/model-server/llm_diffusion_serving_app
Model stabilityai/stable-diffusion-xl-base-1.0 already downloaded
Model archive for stabilityai---stable-diffusion-xl-base-1.0 exists.
/home/model-server/llm_diffusion_serving_app

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://127.0.0.1:8085
  Network URL: http://123.11.0.2:8085
  External URL: http://123.123.12.34:8085


  You can now view your Streamlit app in your browser.

  Local URL: http://127.0.0.1:8084
  Network URL: http://123.11.0.2:8084
  External URL: http://123.123.12.34:8084

Sample output of the Stable Diffusion benchmark:

To run the Stable Diffusion benchmark, use sd-benchmark.py. See below for the sample console output.

ubuntu@ip-10-0-0-137:~/serve$ docker run --rm --platform linux/amd64 \
        --name llm_sd_app_bench \
        -v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
        --entrypoint python \
        pytorch/torchserve:llm_diffusion_serving_app \
        /home/model-server/llm_diffusion_serving_app/sd-benchmark.py -ni 3
.
.
.

Hardware Info:
--------------------------------------------------------------------------------
cpu_model: Intel(R) Xeon(R) Platinum 8488C
cpu_count: 64
threads_per_core: 2
cores_per_socket: 32
socket_count: 1
total_memory: 247.71 GB

Software Versions:
--------------------------------------------------------------------------------
Python: 3.9.20
TorchServe: 0.12.0
OpenVINO: 2024.5.0
PyTorch: 2.5.1+cpu
Transformers: 4.46.3
Diffusers: 0.31.0

Benchmark Summary:
--------------------------------------------------------------------------------
+-------------+----------------+---------------------------+
| Run Mode    | Warm-up Time   | Average Time for 3 iter   |
+=============+================+===========================+
| eager       | 11.25 seconds  | 10.13 +/- 0.02 seconds    |
+-------------+----------------+---------------------------+
| tc_inductor | 85.40 seconds  | 8.85 +/- 0.03 seconds     |
+-------------+----------------+---------------------------+
| tc_openvino | 52.57 seconds  | 2.58 +/- 0.04 seconds     |
+-------------+----------------+---------------------------+

Results saved in directory: /home/model-server/model-store/benchmark_results_20241123_071103
Files in the /home/model-server/model-store/benchmark_results_20241123_071103 directory:
benchmark_results.json
image-eager-final.png
image-tc_inductor-final.png
image-tc_openvino-final.png

Results are saved at /home/model-server/model-store/, which is a Docker container mount corresponding to 'serve/model-store-local/' on the host machine.
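
Because the results directory is a host mount, you can inspect it directly on the host. For example, a small sketch that loads and prints the most recent summary (directory names are timestamped, so yours will differ):

import json
from pathlib import Path

# Pick the newest benchmark_results_* directory under the host-side mount.
results_dir = sorted(Path("serve/model-store-local").glob("benchmark_results_*"))[-1]
with open(results_dir / "benchmark_results.json") as f:
    print(json.dumps(json.load(f), indent=2))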

Sample output of the Stable Diffusion benchmark with profiling:

To run the Stable Diffusion benchmark with profiling, use --run_profiling or -rp. See below for the sample console output. Sample profiling benchmark output files are available in assets/benchmark_results_20241123_044407/

ubuntu@ip-10-0-0-137:~/serve$ docker run --rm --platform linux/amd64 \
        --name llm_sd_app_bench \
        -v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
        --entrypoint python \
        pytorch/torchserve:llm_diffusion_serving_app \
        /home/model-server/llm_diffusion_serving_app/sd-benchmark.py -rp
.
.
.
Hardware Info:
--------------------------------------------------------------------------------
cpu_model: Intel(R) Xeon(R) Platinum 8488C
cpu_count: 64
threads_per_core: 2
cores_per_socket: 32
socket_count: 1
total_memory: 247.71 GB

Software Versions:
--------------------------------------------------------------------------------
Python: 3.9.20
TorchServe: 0.12.0
OpenVINO: 2024.5.0
PyTorch: 2.5.1+cpu
Transformers: 4.46.3
Diffusers: 0.31.0

Benchmark Summary:
--------------------------------------------------------------------------------
+-------------+----------------+---------------------------+
| Run Mode    | Warm-up Time   | Average Time for 1 iter   |
+=============+================+===========================+
| eager       | 9.33 seconds   | 8.57 +/- 0.00 seconds     |
+-------------+----------------+---------------------------+
| tc_inductor | 81.11 seconds  | 7.20 +/- 0.00 seconds     |
+-------------+----------------+---------------------------+
| tc_openvino | 50.76 seconds  | 1.72 +/- 0.00 seconds     |
+-------------+----------------+---------------------------+

Results saved in directory: /home/model-server/model-store/benchmark_results_20241123_071629
Files in the /home/model-server/model-store/benchmark_results_20241123_071629 directory:
benchmark_results.json
image-eager-final.png
image-tc_inductor-final.png
image-tc_openvino-final.png
profile-eager.txt
profile-tc_inductor.txt
profile-tc_openvino.txt

num_iter is set to 1 as run_profiling flag is enabled !

Results are saved at /home/model-server/model-store/, which is a Docker container mount corresponding to 'serve/model-store-local/' on the host machine.
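
The per-mode profile-*.txt files contain operator-level summaries of the kind torch.profiler produces. A hedged sketch of how such a table can be generated (the script's actual profiling settings may differ; the matmul workload here is just a stand-in):

import torch
from torch.profiler import profile, ProfilerActivity

def profile_to_file(workload, out_path):
    # Profile a callable on CPU and dump an operator-level summary table.
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        workload()
    with open(out_path, "w") as f:
        f.write(prof.key_averages().table(sort_by="cpu_time_total", row_limit=25))

profile_to_file(lambda: torch.randn(1024, 1024) @ torch.randn(1024, 1024), "profile-eager.txt")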

Multi-Image Generation App UI

App Workflow

Multi-Image Generation App Workflow Gif

App Screenshots

Server App Screenshot 1 Server App Screenshot 2 Server App Screenshot 3
Client App Screenshot 1 Client App Screenshot 2 Client App Screenshot 3
