Build Instructions

Note: The most up-to-date build instructions are embedded in a set of scripts bundled in the FBGEMM repo under setup_env.bash.
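For reference, these scripts can be sourced directly from a checkout of the repo. A minimal sketch, assuming the scripts still live under .github/scripts/ (the exact path may vary between releases):

# !! Run inside a checkout of the FBGEMM repo !!

# Load the bundled build helper functions into the current shell
# (the .github/scripts/ location is an assumption; verify against your checkout)
. .github/scripts/setup_env.bash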

The currently available FBGEMM GenAI build variants are:

  • CUDA

The general steps for building FBGEMM GenAI are as follows:

  1. Set up an isolated build environment.

  2. Set up the toolchain for a CUDA build.

  3. Install PyTorch.

  4. Run the build script.

Set Up an Isolated Build Environment

Follow the instructions for setting up the Conda environment (a minimal sketch of the environment creation step follows the list):

  1. Set Up an Isolated Build Environment

  2. Set Up for CUDA Build

  3. Install the Build Tools

  4. Install PyTorch
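To make the first step concrete, here is a minimal sketch of creating and activating the build environment (the environment name and Python version below are hypothetical placeholders; choose your own):

# Create and activate a fresh Conda environment for the build
# (the env name and Python version are placeholders, not prescribed values)
conda create -y --name fbgemm_genai_build python=3.13
conda activate fbgemm_genai_build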

Other Pre-Build Setup

Since FBGEMM GenAI leverages the same build process as that of FBGEMM_GPU, please refer to Preparing the Build for additional pre-build setup information.

Preparing the Build

Clone the repo along with its submodules, and install requirements_genai.txt:

# !! Run inside the Conda environment !!

# Select a version tag
FBGEMM_VERSION=v1.2.0

# Clone the repo along with its submodules
git clone --recursive -b ${FBGEMM_VERSION} https://github.com/pytorch/FBGEMM.git fbgemm_${FBGEMM_VERSION}

# Install additional required packages for building and testing
cd fbgemm_${FBGEMM_VERSION}/fbgemm_gpu
pip install -r requirements_genai.txt

Set Wheel Build Variables

When building the Python wheel, the package name, the Python version tag, and the Python platform name must first be properly set:

# Set the package name depending on the build variant (currently only CUDA is available)
export package_name=fbgemm_genai_cuda

# Set the Python version tag.  It should follow the convention `py<major><minor>`,
# e.g. Python 3.13 --> py313
export python_tag=py313

# Determine the processor architecture
export ARCH=$(uname -m)

# Set the Python platform name for the Linux case
export python_plat_name="manylinux_2_28_${ARCH}"
# For the macOS (x86_64) case
export python_plat_name="macosx_10_9_${ARCH}"
# For the macOS (arm64) case
export python_plat_name="macosx_11_0_${ARCH}"
# For the Windows case
export python_plat_name="win_${ARCH}"
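Where convenient, the version tag can also be derived from the active interpreter rather than hard-coded; a small sketch of the same py<major><minor> convention:

# Derive the Python version tag (py<major><minor>) from the active interpreter
export python_tag="py$(python -c 'import sys; print(f"{sys.version_info.major}{sys.version_info.minor}")')"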

CUDA Build

Building FBGEMM GenAI for CUDA requires both NVML and cuDNN to be installed and made available to the build through environment variables. The presence of a CUDA device, however, is not required for building the package.
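If these libraries were installed via Conda packages, one way to locate the relevant headers and shared objects is to search the environment prefix. A minimal sketch, assuming conda-provided cuDNN and CUDA driver stubs:

# !! Run inside the Conda environment !!

# Locate the cuDNN headers and libraries under the environment prefix
find "${CONDA_PREFIX}" -name cudnn.h
find "${CONDA_PREFIX}" -name "libcudnn*"

# Locate the NVML library (often shipped as a driver stub with the CUDA toolkit)
find "${CONDA_PREFIX}" -name "libnvidia-ml.so*"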

Similar to the CPU-only build, building with Clang + libstdc++ can be enabled by appending --cxxprefix=$CONDA_PREFIX to the build command, presuming the toolchains have been properly installed (a sketch of this variant follows the build commands below).

# !! Run in fbgemm_gpu/ directory inside the Conda environment !!

# [OPTIONAL] Specify the CUDA installation paths
# This may be required if CMake is unable to find nvcc
export CUDACXX=/path/to/nvcc
export CUDA_BIN_PATH=/path/to/cuda/installation

# [OPTIONAL] Provide the CUB installation directory (applicable only to CUDA versions prior to 11.1)
export CUB_DIR=/path/to/cub

# [OPTIONAL] Allow NVCC to use host compilers that are newer than what NVCC officially supports
nvcc_prepend_flags=(
  -allow-unsupported-compiler
)

# [OPTIONAL] If clang is the host compiler, set NVCC to use libstdc++ since libc++ is not supported
nvcc_prepend_flags+=(
  -Xcompiler -stdlib=libstdc++
  -ccbin "/path/to/clang++"
)

# [OPTIONAL] Set NVCC_PREPEND_FLAGS as needed
export NVCC_PREPEND_FLAGS="${nvcc_prepend_flags[@]}"

# [OPTIONAL] Enable verbose NVCC logs
export NVCC_VERBOSE=1

# Specify cuDNN header and library paths
export CUDNN_INCLUDE_DIR=/path/to/cudnn/include
export CUDNN_LIBRARY=/path/to/cudnn/lib

# Specify NVML filepath
export NVML_LIB_PATH=/path/to/libnvidia-ml.so

# Specify NCCL filepath
export NCCL_LIB_PATH=/path/to/libnccl.so.2

# Build for SM70/80 (V100/A100 GPU); update as needed
# If not specified, only the CUDA architecture supported by current system will be targeted
# If not specified and no CUDA device is present either, all CUDA architectures will be targeted
cuda_arch_list="7.0;8.0"

# Unset TORCH_CUDA_ARCH_LIST if it exists, because it takes precedence over
# -DTORCH_CUDA_ARCH_LIST during the invocation of setup.py
unset TORCH_CUDA_ARCH_LIST

# Build the wheel artifact only
python setup.py bdist_wheel \
    --package_variant=genai \
    --python-tag="${python_tag}" \
    --plat-name="${python_plat_name}" \
    --nvml_lib_path=${NVML_LIB_PATH} \
    --nccl_lib_path=${NCCL_LIB_PATH} \
    -DTORCH_CUDA_ARCH_LIST="${cuda_arch_list}"

# Build and install the library into the Conda environment
python setup.py install \
    --package_variant=genai \
    --nvml_lib_path=${NVML_LIB_PATH} \
    --nccl_lib_path=${NCCL_LIB_PATH} \
    -DTORCH_CUDA_ARCH_LIST="${cuda_arch_list}"
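A sketch of the Clang + libstdc++ variant mentioned earlier; it is the same install command with --cxxprefix appended (assumes the Clang toolchain has been installed into the Conda environment):

# !! Run in fbgemm_gpu/ directory inside the Conda environment !!

# Same install as above, with the Clang + libstdc++ toolchain enabled
python setup.py install \
    --package_variant=genai \
    --cxxprefix=$CONDA_PREFIX \
    --nvml_lib_path=${NVML_LIB_PATH} \
    --nccl_lib_path=${NCCL_LIB_PATH} \
    -DTORCH_CUDA_ARCH_LIST="${cuda_arch_list}"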

Post-Build Checks (For Developers)

Since FBGEMM GenAI leverages the same build process as that of FBGEMM_GPU, please refer to Post-Build Checks (For Developers) for information on additional post-build checks.
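As a quick smoke test after installation, it can be useful to verify that the installed package imports cleanly. A minimal sketch; the experimental gen_ai module path is an assumption based on current FBGEMM_GPU packaging and may differ between versions:

# !! Run inside the Conda environment !!

# Verify that the GenAI extension loads without missing-symbol errors
# (the module path below is an assumption; adjust to your installed version)
python -c "import fbgemm_gpu.experimental.gen_ai"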
