
A Visual Guide to Affine Transformations: Translation, Scaling, Rotation, and Shear

Published at
12/5/2024
Categories
python
tutorial
numpy
opencv
Author
San Askaruly

Gif affine transformation

Image credits: murray, Stack Exchange

What is affine transformation exactly?

Affine transformation is a technique used in image processing to modify the geometry of an image while preserving certain properties. It combines a linear transformation with a translation, and it can change an image's position, size, shape, and orientation [1, 2].

In simple terms, affine transformations allow you to:

  1. Move the image (translation)
  2. Resize the image (scaling)
  3. Tilt or slant the image (shear)
  4. Rotate the image (rotation)

These transformations can be applied individually or combined to achieve various effects [1, 2]. The key characteristic of affine transformations is that they preserve:

  • Straight lines (they remain straight after transformation)
  • Parallel lines (they stay parallel after transformation)
  • Ratios of distances between points on a line [3, 4]
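
The properties listed above can be checked numerically. The sketch below applies an arbitrary affine map to two parallel segments and to a midpoint, using NumPy only. The matrix `A`, the vector `b`, and the helper names `affine` and `cross2` are illustrative assumptions, not part of the original post.

```python
import numpy as np

# Illustrative affine map x -> A @ x + b (values chosen arbitrarily)
A = np.array([[2.0, 0.5],
              [-0.3, 1.5]])
b = np.array([10.0, -4.0])

def affine(p):
    """Apply the affine map to a 2D point."""
    return A @ np.asarray(p, dtype=float) + b

def cross2(u, v):
    """z-component of the 2D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Two parallel segments: both have direction (1, 2)
p0, p1 = (0, 0), (1, 2)
q0, q1 = (3, 1), (4, 3)

# Directions of the transformed segments
d_p = affine(p1) - affine(p0)
d_q = affine(q1) - affine(q0)
parallel_after = np.isclose(cross2(d_p, d_q), 0.0)

# The midpoint of a segment maps to the midpoint of the transformed segment
mid = (np.asarray(p0) + np.asarray(p1)) / 2
midpoint_preserved = np.allclose(affine(mid), (affine(p0) + affine(p1)) / 2)

print(parallel_after, midpoint_preserved)
```

Both checks should hold for any choice of `A` and `b`, since the translation cancels out when two transformed points are subtracted.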

Why it is useful to know

In the context of image processing, affine transformations are commonly used to:

  • Correct distortions in images, such as those caused by camera angles or lens effects
  • Align or register multiple images
  • Prepare images for further analysis or processing

For example, in satellite imagery, affine transformations help correct distortions from wide-angle lenses and create accurate, flat maps from curved Earth images [2]. This makes it easier to analyze and work with the imagery without having to account for distortions.

Outline of this post

In this tutorial, we will cover and visualize the common affine transformations: translation, scaling, shear, and rotation. We will use Python with the OpenCV and NumPy libraries. The post is structured as follows:

  • Definitions
  • Translation
  • Scale
  • Shear
    • Shear in X-direction
    • Shear in Y-direction
  • Rotation
  • References

Source code: https://github.com/tuttelikz/notes/blob/main/affine/code.ipynb

Definitions

Mathematically, an affine transformation is a relation between two images that can be expressed as a matrix multiplication (linear transformation), $A$, followed by a vector addition (translation), $B$.

$$T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix}$$

$$T = \begin{bmatrix} a_{00}x + a_{01}y + b_{00} \\ a_{10}x + a_{11}y + b_{10} \end{bmatrix}$$

In Python, an affine transformation can be applied using cv2.warpAffine, which expects the $2 \times 3$ transformation matrix $M$. This matrix is $A$ and $B$ placed side by side:

$$M = \begin{bmatrix} A & B \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \end{bmatrix}$$

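import cv2
import numpy as np

src = cv2.imread("images/lena.jpg")  # returns None if the file cannot be read
h, w = src.shape[:2]

# M is the 2x3 affine transformation matrix. The identity is used here as a
# placeholder; the sections below define M for each transformation.
M = np.array([
    [1, 0, 0],
    [0, 1, 0]
]).astype(np.float32)

# dsize is the output size given as (width, height)
warp_dst = cv2.warpAffine(
    src=src, M=M, dsize=(w, h)
)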
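
To make the link between the two forms concrete, here is a minimal sketch showing that multiplying by $A$ and adding $B$ gives the same result as multiplying the $2 \times 3$ matrix by the homogeneous coordinate $[x, y, 1]$. The numeric values are illustrative assumptions.

```python
import numpy as np

# Illustrative values (assumptions, not from the post)
A = np.array([[1.0, 0.2],
              [0.0, 1.0]])
B = np.array([[5.0],
              [3.0]])

# M is A and B stacked side by side, giving a 2x3 matrix
M = np.hstack([A, B])

x, y = 4.0, 2.0

# Form 1: matrix multiplication followed by vector addition
t1 = A @ np.array([[x], [y]]) + B

# Form 2: a single product with the homogeneous coordinate [x, y, 1]
t2 = M @ np.array([[x], [y], [1.0]])

print(t1.ravel(), t2.ravel())
```

The extra `1` in the input vector is what lets the last column of the matrix act as the translation.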

Translation

A translation

To perform image translation, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} dx \\ dy \end{bmatrix}$$

where $dx$ is the desired shift in the horizontal direction and $dy$ is the shift in the vertical direction. Recalling that an affine transformation is $T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the translation for the simplest case. Let's take the four corners of the unit square, $(0,0), (0,1), (1,0), (1,1)$, and work out where each point ends up after translation (the labels $A$ to $D$ below name points, not the matrices above):

  • $A=(0,0) \rightarrow A'=(dx, dy)$
  • $B=(0,1) \rightarrow B'=(dx, 1+dy)$
  • $C=(1,0) \rightarrow C'=(1+dx, dy)$
  • $D=(1,1) \rightarrow D'=(1+dx, 1+dy)$

Translation

To translate an image using cv2.warpAffine as described above, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np

dx = 20 # shift in X (pixels)
dy = 50 # shift in Y (pixels)
M = np.array([
    [1,0,dx],
    [0,1,dy]
]).astype(np.float32)

And here is the output of image translation:
Image translation

Note: You may have noticed that the vertical shift in the output image appears inverted compared to the graph. This is because in an image the origin is at the top-left corner and the y-axis points down, whereas in a typical coordinate system the origin is at the bottom-left and the y-axis points up.
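
As a quick check of the derivation, the translation matrix can be applied directly to the four corner points with NumPy, without loading an image. This is a minimal sketch; the names `pts`, `pts_h`, and `moved` are introduced here for illustration.

```python
import numpy as np

dx, dy = 20, 50  # same shifts as in the snippet above
M = np.array([
    [1, 0, dx],
    [0, 1, dy]
]).astype(np.float32)

# Corners A, B, C, D of the unit square, one (x, y) pair per row
pts = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)

# Append a column of ones to get homogeneous coordinates [x, y, 1]
pts_h = np.hstack([pts, np.ones((len(pts), 1), dtype=np.float32)])

# Each row of the result is M @ [x, y, 1]
moved = pts_h @ M.T

print(moved)
```

The rows of `moved` should match $A'$, $B'$, $C'$, and $D'$ with $dx = 20$ and $dy = 50$.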

Scale

A scale

To perform image scaling, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} wx & 0 \\ 0 & wy \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $wx$ is the desired scaling in the horizontal direction and $wy$ is the scaling in the vertical direction. Recalling that an affine transformation is $T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the scaling for the simplest case. Let's take the four corners of the unit square, $(0,0), (0,1), (1,0), (1,1)$, and work out where each point ends up after scaling:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(0, wy)$
  • $C=(1,0) \rightarrow C'=(wx, 0)$
  • $D=(1,1) \rightarrow D'=(wx, wy)$

Scale

To scale an image using cv2.warpAffine as described above, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np

wx = 2 # scale in horizontal direction
wy = 3 # scale in vertical direction
M = np.array([
    [wx,0,0],
    [0,wy,0]
]).astype(np.float32)

And here is the output of image scaling:
Image scaling
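
With dsize=(w, h), as in the Definitions snippet, the output canvas keeps the original size, so an enlarged image is cropped to its top-left region. A minimal sketch of computing a canvas large enough for the scaled image follows; `scaled_dsize` is a hypothetical helper introduced here, and the input size is an assumption.

```python
import math

def scaled_dsize(w, h, wx, wy):
    """Return (width, height) large enough for a w x h image scaled by wx, wy.

    Hypothetical helper for illustration; assumes wx > 0 and wy > 0.
    """
    return (math.ceil(w * wx), math.ceil(h * wy))

w, h = 512, 512   # example input size (an assumption)
wx, wy = 2, 3     # scale factors from the snippet above
dsize = scaled_dsize(w, h, wx, wy)
print(dsize)  # (1024, 1536)
```

The resulting tuple can be passed as the dsize argument of cv2.warpAffine in place of (w, h).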

Shear

Shear in X direction

A shear in X

To perform shear in the X direction, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & \tan\phi \\ 0 & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\tan\phi$ represents the amount of horizontal shear. Recalling that an affine transformation is $T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the horizontal shear for the simplest case. Let's take the four corners of the unit square, $(0,0), (0,1), (1,0), (1,1)$, and work out where each point ends up after shearing:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(\tan\phi, 1)$
  • $C=(1,0) \rightarrow C'=(1, 0)$
  • $D=(1,1) \rightarrow D'=(1+\tan\phi, 1)$

Shear in X direction

To shear an image horizontally using cv2.warpAffine as described above, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import math
import numpy as np

phi = math.pi/8
M = np.array([
    [1,math.tan(phi),0], # amount of shear in X direction
    [0,1,0]
]).astype(np.float32)

And here is the output of image shear in X direction:
Image shear in X direction
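
A horizontal shear pushes the lower rows of the image to the right (in image coordinates, where y grows downward), so part of the result can fall outside a canvas of the original size. The sketch below estimates the canvas needed by transforming the image corners. The image size and the names `corners`, `out_w`, and `out_h` are assumptions for illustration, and the estimate is approximate at the pixel level.

```python
import math
import numpy as np

phi = math.pi / 8
M = np.array([
    [1, math.tan(phi), 0],
    [0, 1, 0]
], dtype=np.float64)

w, h = 400, 300  # example image size (an assumption)

# Image corners as homogeneous column vectors [x, y, 1]
corners = np.array([
    [0, w, 0, w],
    [0, 0, h, h],
    [1, 1, 1, 1]
], dtype=np.float64)

sheared = M @ corners        # shape (2, 4): x row, then y row
x_max = sheared[0].max()     # right-most x after the shear
y_max = sheared[1].max()     # y is unchanged by a horizontal shear

out_w = math.ceil(x_max)
out_h = math.ceil(y_max)
print(out_w, out_h)
```

Here the required width grows by roughly $h \tan\phi$, while the height stays the same.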

Shear in Y direction

A shear in Y

To perform shear in the Y direction, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \tan\psi & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\tan\psi$ represents the amount of vertical shear. Recalling that an affine transformation is $T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the vertical shear for the simplest case. Let's take the four corners of the unit square, $(0,0), (0,1), (1,0), (1,1)$, and work out where each point ends up after shearing:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(0, 1)$
  • $C=(1,0) \rightarrow C'=(1, \tan\psi)$
  • $D=(1,1) \rightarrow D'=(1, 1+\tan\psi)$

Shear in Y direction

To shear an image vertically using cv2.warpAffine as described above, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import math
import numpy as np

psi = math.pi/10
M = np.array([
    [1,0,0],
    [math.tan(psi),1,0] # amount of shear in Y direction
]).astype(np.float32)

And here is the output of image shear in Y direction:
Image shear in Y direction
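
The two shears can also be combined into a single matrix, and the order in which they are applied changes the result. A minimal sketch, assuming the common approach of lifting each $2 \times 3$ matrix to $3 \times 3$ so that the matrices can be multiplied; `to_3x3` is a hypothetical helper introduced here.

```python
import math
import numpy as np

def to_3x3(M):
    """Lift a 2x3 affine matrix to 3x3 by appending the row [0, 0, 1]."""
    return np.vstack([M, [0.0, 0.0, 1.0]])

phi = math.pi / 8    # horizontal shear angle
psi = math.pi / 10   # vertical shear angle

shear_x = np.array([[1.0, math.tan(phi), 0.0],
                    [0.0, 1.0, 0.0]])
shear_y = np.array([[1.0, 0.0, 0.0],
                    [math.tan(psi), 1.0, 0.0]])

# In a matrix product the right-most transform is applied first
x_then_y = (to_3x3(shear_y) @ to_3x3(shear_x))[:2]
y_then_x = (to_3x3(shear_x) @ to_3x3(shear_y))[:2]

order_matters = not np.allclose(x_then_y, y_then_x)
print(order_matters)
```

Taking the first two rows of the product gives a $2 \times 3$ matrix again, which is the shape cv2.warpAffine expects.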

Rotation

A rotation

To perform rotation, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\theta$ represents the desired angle of rotation. Recalling that an affine transformation is $T = A \cdot \begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the rotation for the simplest case. Let's take the four corners of the unit square, $(0,0), (0,1), (1,0), (1,1)$, and work out where each point ends up after rotation:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(-\sin\theta, \cos\theta)$
  • $C=(1,0) \rightarrow C'=(\cos\theta, \sin\theta)$
  • $D=(1,1) \rightarrow D'=(\cos\theta-\sin\theta, \sin\theta+\cos\theta)$

Rotation

To rotate an image using cv2.warpAffine as described above, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np
import math

angle = math.pi/6 # rotation angle
M = np.array([
    [math.cos(angle),-math.sin(angle),0],
    [math.sin(angle),math.cos(angle),0]
]).astype(np.float32)

And here is the output of image rotation:
Image rotation
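
The matrix above rotates about the origin, which for an image is the top-left corner. To rotate about another point, such as the image center, one common approach is to shift that point to the origin, rotate, and shift back. The sketch below shows this with NumPy; `rotation_about` is a hypothetical helper, and the image size is an assumption.

```python
import math
import numpy as np

def rotation_about(cx, cy, angle):
    """2x3 matrix rotating by `angle` radians about the point (cx, cy).

    Hypothetical helper; uses the same sign convention as the matrix above.
    """
    c, s = math.cos(angle), math.sin(angle)
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=np.float64)
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)
    # Right-most matrix is applied first: shift, rotate, shift back
    return (back @ rotate @ to_origin)[:2]

w, h = 400, 300  # example image size (an assumption)
angle = math.pi / 6
M = rotation_about(w / 2, h / 2, angle)

# The center should stay where it is after the rotation
center = np.array([w / 2, h / 2, 1.0])
print(M @ center)
```

OpenCV also provides cv2.getRotationMatrix2D(center, angle, scale) for this purpose. It takes the angle in degrees, and its sign convention may differ from the matrix used here, so it is worth checking the direction on a test image.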

References

[1] Educative. What is affine transformation? Educative. Retrieved from https://www.educative.io/answers/what-is-affine-transformation
[2] MathWorks. Affine transformation. MathWorks. Retrieved from https://www.mathworks.com/discovery/affine-transformation.html
[3] Hughes, R. Affine transformation. University of Edinburgh. Retrieved from https://homepages.inf.ed.ac.uk/rbf/HIPR2/affine.htm
[4] Wikipedia. Affine transformation. Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Affine_transformation

Thanks for reading! Stay tuned for more content, and feel free to share your thoughts and feedback! Your reactions help me improve and create even more useful posts 🙂

Alternate URL: https://github.com/tuttelikz/notes/tree/main/affine
