Illustration of transforms¶
This example illustrates some of the various transforms available in the torchvision.transforms.v2 module.
from PIL import Image
from pathlib import Path
import matplotlib.pyplot as plt
import torch
from torchvision.transforms import v2
plt.rcParams["savefig.bbox"] = 'tight'
# if you change the seed, make sure that the randomly-applied transforms
# properly show that the image can be both transformed and *not* transformed!
torch.manual_seed(0)
# If you're trying to run this on Colab, you can download the assets and the
# helpers from https://github.com/pytorch/vision/tree/main/gallery/
from helpers import plot
orig_img = Image.open(Path('../assets') / 'astronaut.jpg')
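Geometric Transforms¶
Geometric image transformation refers to the process of altering the geometric properties of an image, such as its shape, size, orientation, or position. It involves applying mathematical operations to the image pixels or coordinates to achieve the desired transformation.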
Pad¶
The Pad transform (also see pad()) pads all image borders with some pixel values.
padded_imgs = [v2.Pad(padding=padding)(orig_img) for padding in (3, 10, 30, 50)]
plot([orig_img] + padded_imgs)
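
To make the effect of the padding argument on the output size concrete, here is a minimal sketch. It assumes the documented convention that a single int pads every side, a pair means (left/right, top/bottom), and a 4-tuple means (left, top, right, bottom). The helper name padded_size and the 512x512 input size are hypothetical and not part of torchvision or this example's assets.

```python
def padded_size(width, height, padding):
    """Return (width, height) after padding. Illustrative helper, not a torchvision API."""
    if isinstance(padding, int):
        # One value: the same padding on all four sides
        left = top = right = bottom = padding
    elif len(padding) == 2:
        # Two values: (left/right, top/bottom)
        left = right = padding[0]
        top = bottom = padding[1]
    elif len(padding) == 4:
        # Four values: (left, top, right, bottom)
        left, top, right, bottom = padding
    else:
        raise ValueError("padding must be an int, a pair, or a 4-tuple")
    return width + left + right, height + top + bottom


# Assumed 512x512 input, with the padding values used above
for p in (3, 10, 30, 50):
    print(p, padded_size(512, 512, p))
```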

Resize¶
The Resize transform (also see resize()) resizes an image.
# PIL's Image.size is (width, height), while Resize expects (height, width)
resized_imgs = [v2.Resize(size=size)(orig_img) for size in (30, 50, 100, orig_img.size[::-1])]
plot([orig_img] + resized_imgs)
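
When size is a single int, the smaller edge of the image is matched to it and the aspect ratio is kept. When size is a sequence, it is read as (height, width). The sketch below illustrates that arithmetic only. The helper resize_output_size is hypothetical, it ignores the max_size option, and its rounding may differ slightly from the library's.

```python
def resize_output_size(height, width, size):
    """Return the (height, width) a resize would target. Illustrative helper only."""
    if isinstance(size, int):
        # Match the smaller edge to `size`, scale the longer edge proportionally
        short, long = (height, width) if height <= width else (width, height)
        new_short = size
        new_long = int(size * long / short)
        return (new_short, new_long) if height <= width else (new_long, new_short)
    # A sequence is taken as an explicit (height, width)
    return tuple(size)


print(resize_output_size(400, 600, 100))
print(resize_output_size(400, 600, (30, 50)))
```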

CenterCrop¶
The CenterCrop transform (also see center_crop()) crops the given image at the center.
# PIL's Image.size is (width, height), while CenterCrop expects (height, width)
center_crops = [v2.CenterCrop(size=size)(orig_img) for size in (30, 50, 100, orig_img.size[::-1])]
plot([orig_img] + center_crops)

FiveCrop¶
The FiveCrop transform (also see five_crop()) crops the given image into four corners and the central crop.
(top_left, top_right, bottom_left, bottom_right, center) = v2.FiveCrop(size=(100, 100))(orig_img)
plot([orig_img] + [top_left, top_right, bottom_left, bottom_right, center])
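
The five crop regions can be worked out by hand. The sketch below returns them as (left, top, right, bottom) boxes in the same order as the unpacking above. The helper five_crop_boxes is hypothetical, and its center offset uses floor division, so it may differ from the library's result by one pixel when the leftover margin is odd.

```python
def five_crop_boxes(width, height, crop_w, crop_h):
    """Return five (left, top, right, bottom) boxes: four corners, then center."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size must not exceed the image size")
    cx = (width - crop_w) // 2   # left edge of the center crop
    cy = (height - crop_h) // 2  # top edge of the center crop
    return [
        (0, 0, crop_w, crop_h),                            # top left
        (width - crop_w, 0, width, crop_h),                # top right
        (0, height - crop_h, crop_w, height),              # bottom left
        (width - crop_w, height - crop_h, width, height),  # bottom right
        (cx, cy, cx + crop_w, cy + crop_h),                # center
    ]


for box in five_crop_boxes(10, 8, 4, 4):
    print(box)
```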

RandomPerspective¶
The RandomPerspective transform (also see perspective()) performs a random perspective transform on an image.
perspective_transformer = v2.RandomPerspective(distortion_scale=0.6, p=1.0)
perspective_imgs = [perspective_transformer(orig_img) for _ in range(4)]
plot([orig_img] + perspective_imgs)

RandomRotation¶
The RandomRotation transform (also see rotate()) rotates an image by a random angle.
rotater = v2.RandomRotation(degrees=(0, 180))
rotated_imgs = [rotater(orig_img) for _ in range(4)]
plot([orig_img] + rotated_imgs)

RandomAffine¶
The RandomAffine transform (also see affine()) performs a random affine transform on an image.
affine_transformer = v2.RandomAffine(degrees=(30, 70), translate=(0.1, 0.3), scale=(0.5, 0.75))
affine_imgs = [affine_transformer(orig_img) for _ in range(4)]
plot([orig_img] + affine_imgs)
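
To show what the arguments above control, here is a rough sketch of how one set of affine parameters could be drawn. It assumes the documented meaning of the arguments: the angle is drawn from the degrees range, the horizontal and vertical shifts are bounded by a fraction of the image width and height, and the scale is drawn from the scale range. The function sample_affine_params and the 512x512 size are hypothetical. The sketch omits shear and does not round the shifts to whole pixels.

```python
import random


def sample_affine_params(width, height, degrees, translate, scale, rng=random):
    """Draw (angle, (dx, dy), scale) for one call. Illustrative helper only."""
    angle = rng.uniform(degrees[0], degrees[1])
    # translate holds maximum shifts as fractions of the image size
    max_dx = translate[0] * width
    max_dy = translate[1] * height
    dx = rng.uniform(-max_dx, max_dx)
    dy = rng.uniform(-max_dy, max_dy)
    s = rng.uniform(scale[0], scale[1])
    return angle, (dx, dy), s


rng = random.Random(0)
for _ in range(4):
    print(sample_affine_params(512, 512, (30, 70), (0.1, 0.3), (0.5, 0.75), rng=rng))
```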

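ElasticTransform¶
The ElasticTransform transform (also see elastic_transform()) randomly transforms the morphology of objects in images and produces a see-through-water-like effect.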
elastic_transformer = v2.ElasticTransform(alpha=250.0)
transformed_imgs = [elastic_transformer(orig_img) for _ in range(2)]
plot([orig_img] + transformed_imgs)

RandomCrop¶
The RandomCrop transform (also see crop()) crops an image at a random location.
cropper = v2.RandomCrop(size=(128, 128))
crops = [cropper(orig_img) for _ in range(4)]
plot([orig_img] + crops)

RandomResizedCrop¶
The RandomResizedCrop transform (also see resized_crop()) crops an image at a random location, and then resizes the crop to a given size.
resize_cropper = v2.RandomResizedCrop(size=(32, 32))
resized_crops = [resize_cropper(orig_img) for _ in range(4)]
plot([orig_img] + resized_crops)
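
The crop region is usually described as a random fraction of the image area combined with a random aspect ratio. The sketch below follows that description, assuming the default scale=(0.08, 1.0) and ratio=(3/4, 4/3) ranges. The function sample_crop_box is hypothetical, and its fallback (returning the whole image) is a simplification rather than the library's exact behavior. The sampled box would then be resized to the requested size.

```python
import math
import random


def sample_crop_box(width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                    rng=random, attempts=10):
    """Return (left, top, crop_width, crop_height). Illustrative helper only."""
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(attempts):
        # Random fraction of the area and a log-uniform aspect ratio
        target_area = area * rng.uniform(scale[0], scale[1])
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            left = rng.randint(0, width - w)
            top = rng.randint(0, height - h)
            return left, top, w, h
    # Simplified fallback: use the whole image
    return 0, 0, width, height


rng = random.Random(0)
for _ in range(4):
    print(sample_crop_box(500, 300, rng=rng))
```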

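Photometric Transforms¶
Photometric image transformation refers to the process of modifying the photometric properties of an image, such as its brightness, contrast, color, or tone. These transformations are applied to change the visual appearance of an image while preserving its geometric structure.
Except Grayscale, the following transforms are random, which means that the same transform instance will produce a different result each time it transforms a given image.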
Grayscale¶
The Grayscale transform (also see to_grayscale()) converts an image to grayscale.
gray_img = v2.Grayscale()(orig_img)
plot([orig_img, gray_img], cmap='gray')
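
Grayscale conversion is commonly a weighted sum of the red, green, and blue channels. The sketch below uses the ITU-R 601-2 luma weights that PIL documents for its L mode. Treat the exact coefficients as an assumption about what this transform computes, since the library's values may differ in the last digit. The function luma is hypothetical.

```python
def luma(r, g, b):
    """Weighted sum of RGB using ITU-R 601-2 luma weights (assumed coefficients)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Green contributes most to perceived brightness, blue least
print(luma(255, 0, 0), luma(0, 255, 0), luma(0, 0, 255))
```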

ColorJitter¶
The ColorJitter transform randomly changes the brightness, contrast, saturation, hue, and other properties of an image.
jitter = v2.ColorJitter(brightness=.5, hue=.3)
jittered_imgs = [jitter(orig_img) for _ in range(4)]
plot([orig_img] + jittered_imgs)
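
The float arguments above define sampling ranges rather than fixed values. As a sketch, assuming the documented convention that a brightness value b gives a factor drawn uniformly from [max(0, 1 - b), 1 + b] and a hue value h gives a shift drawn from [-h, h], the ranges for brightness=.5 and hue=.3 can be reproduced as follows. The function sample_jitter_factors is hypothetical.

```python
import random


def sample_jitter_factors(brightness, hue, rng=random):
    """Draw (brightness_factor, hue_shift). Illustrative helper only."""
    b_lo = max(0.0, 1.0 - brightness)
    b_hi = 1.0 + brightness
    return rng.uniform(b_lo, b_hi), rng.uniform(-hue, hue)


rng = random.Random(0)
for _ in range(4):
    print(sample_jitter_factors(0.5, 0.3, rng=rng))
```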

GaussianBlur¶
The GaussianBlur transform (also see gaussian_blur()) performs a Gaussian blur on an image.
blurrer = v2.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 5.))
blurred_imgs = [blurrer(orig_img) for _ in range(4)]
plot([orig_img] + blurred_imgs)
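
When sigma is given as a (min, max) pair, a value is drawn from that range on each call, so the amount of blur varies. The sketch below builds a normalized one-dimensional Gaussian kernel for a given size and sigma, which is the basic building block of such a blur. The function gaussian_kernel1d is hypothetical and is not the library's implementation.

```python
import math


def gaussian_kernel1d(kernel_size, sigma):
    """Return a normalized 1-D Gaussian kernel as a list of weights."""
    half = (kernel_size - 1) / 2
    # Sample positions centered on zero, e.g. [-2, -1, 0, 1, 2] for size 5
    xs = [i - half for i in range(kernel_size)]
    weights = [math.exp(-0.5 * (x / sigma) ** 2) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]


# A small sigma concentrates weight at the center; a large one spreads it out
print(gaussian_kernel1d(5, 0.1))
print(gaussian_kernel1d(5, 5.0))
```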

RandomInvert¶
The RandomInvert transform (also see invert()) randomly inverts the colors of the given image.
inverter = v2.RandomInvert()
inverted_imgs = [inverter(orig_img) for _ in range(4)]
plot([orig_img] + inverted_imgs)

RandomPosterize¶
The RandomPosterize transform (also see posterize()) randomly posterizes the image by reducing the number of bits of each color channel.
posterizer = v2.RandomPosterize(bits=2)
posterized_imgs = [posterizer(orig_img) for _ in range(4)]
plot([orig_img] + posterized_imgs)
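
Reducing the number of bits amounts to clearing the low-order bits of each 8-bit channel value. A minimal sketch for a single value follows. The function posterize here is an illustrative stand-in that works on plain integers, not the torchvision function of the same name.

```python
def posterize(value, bits):
    """Keep only the top `bits` bits of an 8-bit value (0-255)."""
    # For bits=2 the mask is 0b11000000, so only four levels remain
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return value & mask


print([posterize(v, 2) for v in (0, 63, 100, 200, 255)])
```

With bits=2, every channel value collapses to one of 0, 64, 128, or 192, which is why the posterized images above show large flat regions of color.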

RandomSolarize¶
The RandomSolarize transform (also see solarize()) randomly solarizes the image by inverting all pixel values above the threshold.
solarizer = v2.RandomSolarize(threshold=192.0)
solarized_imgs = [solarizer(orig_img) for _ in range(4)]
plot([orig_img] + solarized_imgs)
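
For a single 8-bit value, solarizing can be sketched as below. The sketch assumes that a value exactly at the threshold is inverted as well, which is how PIL's ImageOps.solarize behaves. The function solarize here is an illustrative stand-in that works on plain integers.

```python
def solarize(value, threshold):
    """Invert an 8-bit value if it is at or above the threshold (assumed boundary)."""
    return 255 - value if value >= threshold else value


# With threshold=192, only the brightest values are inverted
print([solarize(v, 192) for v in (0, 100, 191, 192, 255)])
```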

RandomAdjustSharpness¶
The RandomAdjustSharpness transform (also see adjust_sharpness()) randomly adjusts the sharpness of the given image.
sharpness_adjuster = v2.RandomAdjustSharpness(sharpness_factor=2)
sharpened_imgs = [sharpness_adjuster(orig_img) for _ in range(4)]
plot([orig_img] + sharpened_imgs)

RandomAutocontrast¶
The RandomAutocontrast transform (also see autocontrast()) randomly applies autocontrast to the given image.
autocontraster = v2.RandomAutocontrast()
autocontrasted_imgs = [autocontraster(orig_img) for _ in range(4)]
plot([orig_img] + autocontrasted_imgs)

RandomEqualize¶
The RandomEqualize transform (also see equalize()) randomly equalizes the histogram of the given image.
equalizer = v2.RandomEqualize()
equalized_imgs = [equalizer(orig_img) for _ in range(4)]
plot([orig_img] + equalized_imgs)

JPEG¶
The JPEG transform (also see jpeg()) applies JPEG compression to the given image with a random degree of compression.

Augmentation Transforms¶
The following transforms are combinations of multiple transforms, either geometric or photometric, or both.
AutoAugment¶
The AutoAugment transform automatically augments data based on a given auto-augmentation policy. See AutoAugmentPolicy for the available policies.
policies = [v2.AutoAugmentPolicy.CIFAR10, v2.AutoAugmentPolicy.IMAGENET, v2.AutoAugmentPolicy.SVHN]
augmenters = [v2.AutoAugment(policy) for policy in policies]
imgs = [
[augmenter(orig_img) for _ in range(4)]
for augmenter in augmenters
]
row_title = [str(policy).split('.')[-1] for policy in policies]
plot([[orig_img] + row for row in imgs], row_title=row_title)

RandAugment¶
RandAugment is an alternate version of AutoAugment.
augmenter = v2.RandAugment()
imgs = [augmenter(orig_img) for _ in range(4)]
plot([orig_img] + imgs)

TrivialAugmentWide¶
TrivialAugmentWide is an alternate implementation of AutoAugment. However, instead of transforming an image multiple times, it transforms an image only once, using a random transform from a given list with a random strength.
augmenter = v2.TrivialAugmentWide()
imgs = [augmenter(orig_img) for _ in range(4)]
plot([orig_img] + imgs)

AugMix¶
The AugMix transform interpolates between augmented versions of an image.

Randomly-applied Transforms¶
The following transforms are randomly applied given a probability p. That is, given p = 0.5, there is a 50% chance to return the original image, and a 50% chance to return the transformed image, even when called with the same transform instance!
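
The behavior described above can be sketched in a few lines of plain Python. The function randomly_apply is hypothetical and uses Python's random module rather than torch's random number generator, so it illustrates the idea without reproducing the library's implementation.

```python
import random


def randomly_apply(transform, image, p, rng=random):
    """Return transform(image) with probability p, else the image unchanged."""
    if rng.random() < p:
        return transform(image)
    return image


# Toy "image" and "flip" to count how often the transform fires
rng = random.Random(0)
image = [1, 2, 3]
results = [randomly_apply(lambda im: im[::-1], image, p=0.5, rng=rng) for _ in range(1000)]
flipped = sum(r == [3, 2, 1] for r in results)
print(f"flipped {flipped} of 1000 times")
```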
RandomHorizontalFlip¶
The RandomHorizontalFlip transform (also see hflip()) performs a horizontal flip of an image, with a given probability.
hflipper = v2.RandomHorizontalFlip(p=0.5)
transformed_imgs = [hflipper(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)

RandomVerticalFlip¶
The RandomVerticalFlip transform (also see vflip()) performs a vertical flip of an image, with a given probability.
vflipper = v2.RandomVerticalFlip(p=0.5)
transformed_imgs = [vflipper(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)

RandomApply¶
The RandomApply transform randomly applies a list of transforms, with a given probability.
applier = v2.RandomApply(transforms=[v2.RandomCrop(size=(64, 64))], p=0.5)
transformed_imgs = [applier(orig_img) for _ in range(4)]
plot([orig_img] + transformed_imgs)
