torch.cuda.cudart

torch.cuda.cudart()

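Retrieves the CUDA runtime API module.

This function initializes the CUDA runtime environment if it is not already initialized and returns the CUDA runtime API module (_cudart). The module exposes a set of CUDA runtime functions, such as cudaProfilerStart and cudaProfilerStop.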

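Parameters

None

Returns

The CUDA runtime API module (_cudart).

Return type

module

Raises

RuntimeError – if CUDA cannot be re-initialized in a forked subprocess.
AssertionError – if PyTorch is not compiled with CUDA support, or if libcudart functions are not available.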
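
The two exceptions listed above can be handled before any runtime function is called. The following is a minimal sketch, not part of PyTorch: `try_get_cudart` is a hypothetical helper name, and returning `None` on failure (including when PyTorch itself is not installed) is a policy assumed here for illustration.

```python
def try_get_cudart():
    """Hypothetical helper: return the CUDA runtime module, or None on failure."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: a missing PyTorch install means "no runtime".
        return None
    if not torch.cuda.is_available():
        # CPU-only build or no usable CUDA device; skip initialization entirely.
        return None
    try:
        return torch.cuda.cudart()
    except (RuntimeError, AssertionError):
        # The two exception types documented for torch.cuda.cudart().
        return None


runtime = try_get_cudart()
if runtime is None:
    print("CUDA runtime not available")
else:
    print("CUDA runtime module loaded:", runtime)
```

Checking `torch.cuda.is_available()` first avoids triggering CUDA initialization on machines where it would fail anyway; the `except` clause then covers only the documented failure modes.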

Example of CUDA operations with profiling

>>> import torch
>>> from torch.cuda import cudart, check_error
>>> import os
>>>
>>> os.environ['CUDA_PROFILE'] = '1'
>>>
>>> def perform_cuda_operations_with_streams():
...     # Run an element-wise multiply on a dedicated (non-default) stream.
...     stream = torch.cuda.Stream()
...     with torch.cuda.stream(stream):
...         x = torch.randn(100, 100, device='cuda')
...         y = torch.randn(100, 100, device='cuda')
...         z = torch.mul(x, y)
...     return z
...
>>> torch.cuda.synchronize()  # drain pending work before profiling starts
>>> print("====== Start nsys profiling ======")
>>> check_error(cudart().cudaProfilerStart())
>>> with torch.autograd.profiler.emit_nvtx():
...     result = perform_cuda_operations_with_streams()
...     print("CUDA operations completed.")
...
>>> torch.cuda.synchronize()  # wait for the side stream before stopping the profiler
>>> check_error(cudart().cudaProfilerStop())
>>> print("====== End nsys profiling ======")

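To run this example and save the profiling information, save the script as cudart_test.py and execute the following command: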

$ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py

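This command profiles the CUDA operations in the given script and saves the profiling information to a file named trace_name.prof. The --profile-from-start off option ensures that profiling begins only after the cudaProfilerStart call in the script. The --csv and --print-summary options format the profiling output as CSV and print a summary, respectively. The -o option specifies the output file name, and the -f option forces an existing output file to be overwritten.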
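
Because profiling is delimited by a start call and a stop call, an exception raised between them would leave the profiler running. One way to pair the two calls is a small context manager. This is a sketch under stated assumptions: `cuda_profiler_range` is a hypothetical helper, not a PyTorch API; `runtime` is assumed to be the object returned by `torch.cuda.cudart()`, and `check` is assumed to be `torch.cuda.check_error` or any callable that raises on a non-success status.

```python
from contextlib import contextmanager


@contextmanager
def cuda_profiler_range(runtime, check=None):
    """Hypothetical helper: bracket a block with cudaProfilerStart/cudaProfilerStop.

    runtime: object exposing cudaProfilerStart() and cudaProfilerStop(),
             assumed to be the result of torch.cuda.cudart().
    check:   optional callable applied to each returned status code,
             assumed to be torch.cuda.check_error.
    """
    status = runtime.cudaProfilerStart()
    if check is not None:
        check(status)
    try:
        yield
    finally:
        # Always stop the profiler, even if the profiled block raised.
        status = runtime.cudaProfilerStop()
        if check is not None:
            check(status)


# Usage with PyTorch (requires a CUDA build and a visible GPU):
#
#     import torch
#     with cuda_profiler_range(torch.cuda.cudart(), torch.cuda.check_error):
#         z = torch.randn(100, 100, device="cuda") * 2
#         torch.cuda.synchronize()
```

Passing the runtime object in as an argument keeps the helper independent of how the module was obtained, which also makes it easy to exercise with a stand-in object on machines without a GPU.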
