Illustration of transforms¶

This example illustrates some of the transforms available in the torchvision.transforms.v2 module.

from PIL import Image
from pathlib import Path
import matplotlib.pyplot as plt

import torch
from torchvision.transforms import v2

plt.rcParams["savefig.bbox"] = 'tight'

# if you change the seed, make sure that the randomly-applied transforms
# properly show that the image can be both transformed and *not* transformed!
torch.manual_seed(0)

# If you're trying to run this on Colab, you can download the assets and the
# helpers from https://github.com/pytorch/vision/tree/main/gallery/
from helpers import plot
orig_img = Image.open(Path('../assets') / 'astronaut.jpg')

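Geometric Transforms¶

Geometric image transformation refers to the process of altering the geometric properties of an image, such as its shape, size, orientation, or position. It involves applying mathematical operations to the image pixels or coordinates to achieve the desired transformation.

Pad¶

The Pad transform (also see pad()) pads all image borders with some pixel values.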

padded_imgs = [v2.Pad(padding=padding)(orig_img) for padding in (3, 10, 30, 50)]
plot([orig_img] + padded_imgs)

Resize¶

The Resize transform (also see resize()) resizes an image.

resized_imgs = [v2.Resize(size=size)(orig_img) for size in (30, 50, 100, orig_img.size)]
plot([orig_img] + resized_imgs)

CenterCrop¶

The CenterCrop transform (also see center_crop()) crops the given image at the center.

center_crops = [v2.CenterCrop(size=size)(orig_img) for size in (30, 50, 100, orig_img.size)]
plot([orig_img] + center_crops)

FiveCrop¶

The FiveCrop transform (also see five_crop()) crops the given image into four corners and the central crop.

(top_left, top_right, bottom_left, bottom_right, center) = v2.FiveCrop(size=(100, 100))(orig_img)
plot([orig_img] + [top_left, top_right, bottom_left, bottom_right, center])

RandomPerspective¶

The RandomPerspective transform (also see perspective()) performs a random perspective transform on an image.

perspective_transformer = v2.RandomPerspective(distortion_scale=0.6, p=1.0)
perspective_imgs = [perspective_transformer(orig_img) for _ in range(4)]
plot([orig_img] + perspective_imgs)

RandomRotation¶

The RandomRotation transform (also see rotate()) rotates an image by a random angle.

rotater = v2.RandomRotation(degrees=(0, 180))
rotated_imgs = [rotater(orig_img) for _ in range(4)]
plot([orig_img] + rotated_imgs)

RandomAffine¶

The RandomAffine transform (also see affine()) performs a random affine transform on an image.

affine_transformer = v2.RandomAffine(degrees=(30, 70), translate=(0.1, 0.3), scale=(0.5, 0.75))
affine_imgs = [affine_transformer(orig_img) for _ in range(4)]
plot([orig_img] + affine_imgs)

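ElasticTransform¶

The ElasticTransform transform (also see elastic_transform()) randomly transforms the morphology of objects in images and produces a see-through-water-like effect.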

elastic_transformer = v2.ElasticTransform(alpha=250.0)
transformed_imgs = [elastic_transformer(orig_img) for _ in range(2)]
plot([orig_img] + transformed_imgs)

RandomCrop¶

The RandomCrop transform (also see crop()) crops an image at a random location.

cropper = v2.RandomCrop(size=(128, 128))
crops = [cropper(orig_img) for _ in range(4)]
plot([orig_img] + crops)

RandomResizedCrop¶

The RandomResizedCrop transform (also see resized_crop()) crops an image at a random location, and then resizes the crop to a given size.

resize_cropper = v2.RandomResizedCrop(size=(32, 32))
resized_crops = [resize_cropper(orig_img) for _ in range(4)]
plot([orig_img] + resized_crops)

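Photometric Transforms¶

Photometric image transformation refers to the process of modifying the photometric properties of an image, such as its brightness, contrast, color, or tone. These transformations are applied to change the visual appearance of an image while preserving its geometric structure.

Except for Grayscale, the following transforms are random, which means that the same transform instance will generally produce a different result each time it transforms a given image.

Grayscale¶

The Grayscale transform (also see to_grayscale()) converts an image to grayscale.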

gray_img = v2.Grayscale()(orig_img)
plot([orig_img, gray_img], cmap='gray')

ColorJitter¶

The ColorJitter transform randomly changes the brightness, contrast, saturation, hue, and other properties of an image.

jitter = v2.ColorJitter(brightness=.5, hue=.3)
jittered_imgs = [jitter(orig_img) for _ in range(4)]
plot([orig_img] + jittered_imgs)

GaussianBlur¶

The GaussianBlur transform (also see gaussian_blur()) applies a Gaussian blur to an image.

blurrer = v2.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 5.))
blurred_imgs = [blurrer(orig_img) for _ in range(4)]
plot([orig_img] + blurred_imgs)

RandomInvert¶

The RandomInvert transform (also see invert()) randomly inverts the colors of the given image.

inverter = v2.RandomInvert()
inverted_imgs = [inverter(orig_img) for _ in range(4)]
plot([orig_img] + inverted_imgs)

RandomPosterize¶

The RandomPosterize transform (also see posterize()) randomly posterizes the image by reducing the number of bits of each color channel.

posterizer = v2.RandomPosterize(bits=2)
posterized_imgs = [posterizer(orig_img) for _ in range(4)]
plot([orig_img] + posterized_imgs)

RandomSolarize¶

The RandomSolarize transform (also see solarize()) randomly solarizes the image by inverting all pixel values above the threshold.

solarizer = v2.RandomSolarize(threshold=192.0)
solarized_imgs = [solarizer(orig_img) for _ in range(4)]
plot([orig_img] + solarized_imgs)

RandomAdjustSharpness¶

The RandomAdjustSharpness transform (also see adjust_sharpness()) randomly adjusts the sharpness of the given image.

sharpness_adjuster = v2.RandomAdjustSharpness(sharpness_factor=2)
sharpened_imgs = [sharpness_adjuster(orig_img) for _ in range(4)]
plot([orig_img] + sharpened_imgs)

RandomAutocontrast¶

The RandomAutocontrast transform (also see autocontrast()) randomly applies autocontrast to the given image.

autocontraster = v2.RandomAutocontrast()
autocontrasted_imgs = [autocontraster(orig_img) for _ in range(4)]
plot([orig_img] + autocontrasted_imgs)

RandomEqualize¶

The RandomEqualize transform (also see equalize()) randomly equalizes the histogram of the given image.

equalizer = v2.RandomEqualize()
equalized_imgs = [equalizer(orig_img) for _ in range(4)]
plot([orig_img] + equalized_imgs)

JPEG¶

The JPEG transform (also see jpeg()) applies JPEG compression to the given image with a random degree of compression.

jpeg = v2.JPEG((5, 50))
jpeg_imgs = [jpeg(orig_img) for _ in range(4)]
plot([orig_img] + jpeg_imgs)

Augmentation Transforms¶

The following transforms are combinations of multiple transforms, either geometric or photometric, or both.

AutoAugment¶

The AutoAugment transform automatically augments data based on a given auto-augmentation policy. See AutoAugmentPolicy for the available policies.

policies = [v2.AutoAugmentPolicy.CIFAR10, v2.AutoAugmentPolicy.IMAGENET, v2.AutoAugmentPolicy.SVHN]
augmenters = [v2.AutoAugment(policy) for policy in policies]
imgs = [
    [augmenter(orig_img) for _ in range(4)]
    for augmenter in augmenters
]
row_title = [str(policy).split('.')[-1] for policy in policies]
plot([[orig_img] + row for row in imgs], row_title=row_title)

RandAugment¶

RandAugment is an alternate version of AutoAugment.

augmenter = v2.RandAugment()
imgs = [augmenter(orig_img) for _ in range(4)]
plot([orig_img] + imgs)

TrivialAugmentWide¶

TrivialAugmentWide is an alternate implementation of AutoAugment. However, instead of transforming an image multiple times, it transforms an image only once, using a random transform from a given list with a random strength.

augmenter = v2.TrivialAugmentWide()
imgs = [augmenter(orig_img) for _ in range(4)]
plot([orig_img] + imgs)

AugMix¶

The AugMix transform interpolates between augmented versions of an image.

augmenter = v2.AugMix()
imgs = [augmenter(orig_img) for _ in range(4)]
plot([orig_img] + imgs)

Randomly-applied Transforms¶

The following transforms are randomly-applied given a probability p. That is, given p = 0.5, there is a 50% chance to return the original image, and a 50% chance to return the transformed image, even when called with the same transform instance!

RandomHorizontalFlip¶

The RandomHorizontalFlip transform (also see hflip()) performs horizontal flip of an image, with a given probability.

hflipper = v2.RandomHorizontalFlip(p=0.5)
transformed_imgs = [hflipper(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)

RandomVerticalFlip¶

The RandomVerticalFlip transform (also see vflip()) performs vertical flip of an image, with a given probability.

vflipper = v2.RandomVerticalFlip(p=0.5)
transformed_imgs = [vflipper(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)

RandomApply¶

The RandomApply transform randomly applies a list of transforms, with a given probability.

applier = v2.RandomApply(transforms=[v2.RandomCrop(size=(64, 64))], p=0.5)
transformed_imgs = [applier(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)
