A Visual Guide to Affine Transformations: Translation, Scaling, Rotation, and Shear
What exactly is an affine transformation?
An affine transformation is a technique used in image processing to modify the geometry of an image while preserving certain properties. It combines a linear transformation with a translation and can change an image's position, size, shape, and orientation [1, 2].
In simple terms, affine transformations allow you to:
- Move the image (translation)
- Resize the image (scaling)
- Tilt or slant the image (shear)
- Rotate the image (rotation)
These transformations can be applied individually or combined to achieve various effects [1, 2]. The key characteristic of affine transformations is that they preserve:
- Straight lines (they remain straight after transformation)
- Parallel lines (they stay parallel after transformation)
- Ratios of distances between points on a line [3, 4]
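The preserved properties listed above can be checked numerically. The following is a minimal sketch using NumPy only; the matrix `A`, the offset `b`, the helper `affine`, and the sample points are arbitrary values chosen for illustration and are not taken from the original post.

```python
import numpy as np

# Arbitrary illustrative affine map: p -> A @ p + b (values are assumptions).
A = np.array([[2.0, 0.5],
              [-0.3, 1.5]])
b = np.array([4.0, -1.0])

def affine(p):
    """Apply the affine map to a 2D point."""
    return A @ np.asarray(p, dtype=float) + b

# Two parallel segments: both have direction (1, 2).
p0, p1 = np.array([0.0, 0.0]), np.array([1.0, 2.0])
q0, q1 = np.array([3.0, 1.0]), np.array([4.0, 3.0])

d_p = affine(p1) - affine(p0)   # direction of the first segment after mapping
d_q = affine(q1) - affine(q0)   # direction of the second segment after mapping

# A (near-)zero 2D cross product means the mapped directions are still parallel.
cross = d_p[0] * d_q[1] - d_p[1] * d_q[0]

# The midpoint of a segment maps to the midpoint of the mapped segment,
# which is one instance of "ratios along a line are preserved".
mid_then_map = affine((p0 + p1) / 2)
map_then_mid = (affine(p0) + affine(p1)) / 2

print(abs(cross) < 1e-9)
print(np.allclose(mid_then_map, map_then_mid))
```

Lengths and angles are generally not preserved: the mapped segments above are longer than the originals, yet they remain parallel.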
Why is it useful to know?
In the context of image processing, affine transformations are commonly used to:
- Correct distortions in images, such as those caused by camera angles or lens effects
- Align or register multiple images
- Prepare images for further analysis or processing
For example, in satellite imagery, affine transformations help correct distortions introduced by wide-angle lenses and help produce accurate, flat maps from images of the curved Earth [2]. The corrected imagery can then be analyzed without having to account for those distortions at every step.
Outline of this post
In this tutorial, we will cover and visualize the common affine transformations: translation, scaling, shear, and rotation. We will use Python with the OpenCV and NumPy libraries. The post is structured as follows:
- Definitions
- Translation
- Scale
- Shear
  - Shear in X direction
  - Shear in Y direction
- Rotation
- References
Source code: https://github.com/tuttelikz/notes/blob/main/affine/code.ipynb
Definitions
Mathematically, an affine transformation is a relation between two images that can be expressed as a matrix multiplication (linear transformation) by $A$, followed by a vector addition (translation) of $B$.
In Python, an affine transformation can be applied with cv2.warpAffine. It expects a single $2 \times 3$ transformation matrix $M$, formed by placing $B$ as a third column next to $A$, i.e. $M = [A \mid B]$.
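import cv2
import numpy as np

src = cv2.imread("images/lena.jpg")  # returns None if the file cannot be read
h, w = src.shape[:2]

# M is the 2x3 affine transformation matrix; each section below defines its own.
# The identity matrix is used here as a placeholder (it leaves the image unchanged).
M = np.float32([[1, 0, 0], [0, 1, 0]])

# dsize is the output size, given as (width, height)
warp_dst = cv2.warpAffine(src=src, M=M, dsize=(w, h))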
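To make the relation between $A$, $B$, and $M$ concrete, here is a minimal NumPy sketch. The helper names `make_affine_matrix` and `apply_affine` are hypothetical (they are not OpenCV functions), and the numbers in `A` and `B` are arbitrary. It stacks $A$ and $B$ into a $2 \times 3$ matrix and applies it to a few points by appending a 1 to each point.

```python
import numpy as np

def make_affine_matrix(A, B):
    """Stack a 2x2 matrix A and a length-2 vector B into the 2x3 matrix M = [A | B]."""
    A = np.asarray(A, dtype=np.float32).reshape(2, 2)
    B = np.asarray(B, dtype=np.float32).reshape(2, 1)
    return np.hstack([A, B])

def apply_affine(M, points):
    """Map an (N, 2) array of (x, y) points with M, i.e. compute A @ p + B for each p."""
    points = np.asarray(points, dtype=np.float32)
    ones = np.ones((points.shape[0], 1), dtype=np.float32)
    homogeneous = np.hstack([points, ones])   # (N, 3): each row is (x, y, 1)
    return homogeneous @ M.T                  # (N, 2)

# Illustrative values only.
A = [[1, 2], [3, 4]]
B = [10, 20]
M = make_affine_matrix(A, B)
square = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(M)
print(apply_affine(M, square))
```

The same helper can be used to check the point mappings derived by hand in the sections that follow.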
Translation
To perform image translation, the elements of $A$ and $B$ should be as follows:

$$
A=\begin{vmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11}
\end{vmatrix}=\begin{vmatrix}
1 & 0 \\
0 & 1
\end{vmatrix}
$$

$$
B=\begin{vmatrix}
b_{00} \\
b_{01}
\end{vmatrix}=\begin{vmatrix}
dx \\
dy
\end{vmatrix}
$$

where $dx$ is the desired shift in the horizontal direction, whereas $dy$ represents the shift in the vertical direction. Remembering that an affine transformation is

$$
T=A\cdot\begin{vmatrix}
x \\
y
\end{vmatrix}+B
$$

we can derive the translation for the simplest case. Let's take four coordinates as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of these points should be located after translation:
- $A=(0,0) \rightarrow A'=(dx, dy)$
- $B=(0,1) \rightarrow B'=(dx, 1+dy)$
- $C=(1,0) \rightarrow C'=(1+dx, dy)$
- $D=(1,1) \rightarrow D'=(1+dx, 1+dy)$
To translate an image using cv2.warpAffine, mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:
import numpy as np
dx = 20 # shift in X (pixels)
dy = 50 # shift in Y (pixels)
M = np.array([
[1,0,dx],
[0,1,dy]
]).astype(np.float32)
And here is the output of image translation:
Note: You may have noticed that the vertical shift in the image looks inverted compared to the graph. This is because an image has its origin at the top-left corner with the y-axis pointing down, whereas a typical coordinate system has its origin at the bottom-left with the y-axis pointing up.
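The effect of the downward y-axis can be seen on a tiny array without OpenCV. This is only a sketch: the 4x5 "image", the pixel position, and the shift values are made up for illustration, and the pixel is moved by hand rather than with cv2.warpAffine.

```python
import numpy as np

# A tiny "image" with 4 rows and 5 columns and one bright pixel.
# NumPy indexes images as img[row, col], i.e. img[y, x].
img = np.zeros((4, 5), dtype=np.uint8)
x, y = 1, 0
img[y, x] = 255

dx, dy = 2, 3                       # illustrative shift in pixels
M = np.float32([[1, 0, dx],
                [0, 1, dy]])

# Map the pixel's (x, y) position through M.
new_x, new_y = (M @ np.array([x, y, 1], dtype=np.float32)).astype(int)

shifted = np.zeros_like(img)
shifted[new_y, new_x] = img[y, x]   # a larger y means a lower row in the array

print(shifted)
```

A positive `dy` moves the pixel from the top row to the bottom row of the printed array, which is the "inverted" direction described in the note.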
Scale
To perform image scaling, the elements of $A$ and $B$ should be as follows:

$$
A=\begin{vmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11}
\end{vmatrix}=\begin{vmatrix}
wx & 0 \\
0 & wy
\end{vmatrix}
$$

$$
B=\begin{vmatrix}
b_{00} \\
b_{01}
\end{vmatrix}=\begin{vmatrix}
0 \\
0
\end{vmatrix}
$$

where $wx$ is the desired scaling in the horizontal direction, whereas $wy$ represents the scaling in the vertical direction. Remembering that an affine transformation is

$$
T=A\cdot\begin{vmatrix}
x \\
y
\end{vmatrix}+B
$$

we can derive the scaling transformation for the simplest case. Let's take four coordinates as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of these points should be located after scaling:
- $A=(0,0) \rightarrow A'=(0, 0)$
- $B=(0,1) \rightarrow B'=(0, wy)$
- $C=(1,0) \rightarrow C'=(wx, 0)$
- $D=(1,1) \rightarrow D'=(wx, wy)$
To scale an image using cv2.warpAffine, mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:
import numpy as np
wx = 2 # scale in horizontal direction
wy = 3 # scale in vertical direction
M = np.array([
[wx,0,0],
[0,wy,0]
]).astype(np.float32)
And here is the output of image scaling:
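One practical detail: cv2.warpAffine writes into an output of size dsize, so if dsize stays at the original (w, h), only the top-left part of the enlarged image is visible. A possible way to pick a larger dsize is to map the image corners through $M$ and take their extent. The helper name `warped_size` and the 512x512 image size below are assumptions for illustration, and the sketch assumes no mapped corner has a negative coordinate (true for positive scale factors).

```python
import numpy as np

def warped_size(M, w, h):
    """Return (width, height) of the box that contains all four mapped image corners."""
    corners = np.float32([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]])
    mapped = corners @ np.asarray(M, dtype=np.float32).T   # (4, 2) mapped corners
    max_x, max_y = mapped.max(axis=0)
    return int(np.ceil(max_x)), int(np.ceil(max_y))

wx, wy = 2, 3                       # same scale factors as above
M = np.float32([[wx, 0, 0],
                [0, wy, 0]])

w, h = 512, 512                     # assumed image size, for illustration only
print(warped_size(M, w, h))
```

The returned pair could then be passed as dsize instead of (w, h) so the whole scaled image fits in the output.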
Shear
Shear in X direction
To perform shear in X direction, the elements of $A$ and $B$ should be as follows:

$$
A=\begin{vmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11}
\end{vmatrix}=\begin{vmatrix}
1 & \tan\phi \\
0 & 1
\end{vmatrix}
$$

$$
B=\begin{vmatrix}
b_{00} \\
b_{01}
\end{vmatrix}=\begin{vmatrix}
0 \\
0
\end{vmatrix}
$$

where $\tan\phi$ represents the amount of horizontal shear. Remembering that an affine transformation is

$$
T=A\cdot\begin{vmatrix}
x \\
y
\end{vmatrix}+B
$$

we can derive the horizontal shear for the simplest case. Let's take four coordinates as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of these points should be located after shearing:
- $A=(0,0) \rightarrow A'=(0, 0)$
- $B=(0,1) \rightarrow B'=(\tan\phi, 1)$
- $C=(1,0) \rightarrow C'=(1, 0)$
- $D=(1,1) \rightarrow D'=(1+\tan\phi, 1)$
To horizontally shear an image using cv2.warpAffine, mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:
import math
import numpy as np

phi = math.pi / 8  # shear angle in radians
M = np.array([
    [1, math.tan(phi), 0],  # amount of shear in X direction
    [0, 1, 0]
]).astype(np.float32)
And here is the output of image shear in X direction:
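The corner mapping derived by hand can be confirmed numerically. This is a small sketch using the same angle as the code above; the `square` and `sheared` arrays are introduced here for illustration. It also prints the determinant of $A$, which is the factor by which areas are scaled.

```python
import math
import numpy as np

phi = math.pi / 8                       # same angle as above
A = np.array([[1.0, math.tan(phi)],
              [0.0, 1.0]])

# Unit-square corners as columns: (0,0), (0,1), (1,0), (1,1).
square = np.array([[0.0, 0.0, 1.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0]])
sheared = A @ square

print(np.round(sheared.T, 4))           # one mapped (x, y) corner per row
print(np.linalg.det(A))                 # area scale factor of the shear
```

Only the points with a non-zero y coordinate move, and they move along x only. The determinant is 1, so the sheared square keeps its area even though its shape changes.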
Shear in Y direction
To perform shear in Y direction, the elements of $A$ and $B$ should be as follows:

$$
A=\begin{vmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11}
\end{vmatrix}=\begin{vmatrix}
1 & 0 \\
\tan\psi & 1
\end{vmatrix}
$$

$$
B=\begin{vmatrix}
b_{00} \\
b_{01}
\end{vmatrix}=\begin{vmatrix}
0 \\
0
\end{vmatrix}
$$

where $\tan\psi$ represents the amount of vertical shear. Remembering that an affine transformation is

$$
T=A\cdot\begin{vmatrix}
x \\
y
\end{vmatrix}+B
$$

we can derive the vertical shear for the simplest case. Let's take four coordinates as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of these points should be located after shearing:
- $A=(0,0) \rightarrow A'=(0, 0)$
- $B=(0,1) \rightarrow B'=(0, 1)$
- $C=(1,0) \rightarrow C'=(1, \tan\psi)$
- $D=(1,1) \rightarrow D'=(1, 1+\tan\psi)$
To vertically shear an image using cv2.warpAffine, mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:
import math
import numpy as np

psi = math.pi / 10  # shear angle in radians
M = np.array([
    [1, 0, 0],
    [math.tan(psi), 1, 0]  # amount of shear in Y direction
]).astype(np.float32)
And here is the output of image shear in Y direction:
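Since affine transformations can be combined, the two shears can also be merged into a single matrix by multiplying their $A$ parts. The sketch below assumes the two angles used in the shear examples and shows that the order of multiplication matters; the variable names `shear_x`, `shear_y`, `x_then_y`, and `y_then_x` are chosen here for illustration.

```python
import math
import numpy as np

phi = math.pi / 8     # X-shear angle used earlier
psi = math.pi / 10    # Y-shear angle used above

shear_x = np.array([[1.0, math.tan(phi)],
                    [0.0, 1.0]])
shear_y = np.array([[1.0, 0.0],
                    [math.tan(psi), 1.0]])

# Matrix products apply right to left: (shear_y @ shear_x) shears in X first.
x_then_y = shear_y @ shear_x
y_then_x = shear_x @ shear_y

# Append a zero translation column to get a 2x3 matrix for cv2.warpAffine.
M = np.hstack([x_then_y, np.zeros((2, 1))]).astype(np.float32)

print(np.allclose(x_then_y, y_then_x))
print(M)
```

The two products differ, so shearing in X and then in Y does not give the same image as shearing in Y and then in X.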
Rotation
To perform rotation, the elements of $A$ and $B$ should be as follows:

$$
A=\begin{vmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11}
\end{vmatrix}=\begin{vmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{vmatrix}
$$

$$
B=\begin{vmatrix}
b_{00} \\
b_{01}
\end{vmatrix}=\begin{vmatrix}
0 \\
0
\end{vmatrix}
$$

where $\theta$ represents the desired angle of rotation. Remembering that an affine transformation is

$$
T=A\cdot\begin{vmatrix}
x \\
y
\end{vmatrix}+B
$$

we can derive the rotation for the simplest case. Let's take four coordinates as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of these points should be located after rotation:
- $A=(0,0) \rightarrow A'=(0, 0)$
- $B=(0,1) \rightarrow B'=(-\sin\theta, \cos\theta)$
- $C=(1,0) \rightarrow C'=(\cos\theta, \sin\theta)$
- $D=(1,1) \rightarrow D'=(\cos\theta-\sin\theta, \sin\theta+\cos\theta)$
To rotate an image using cv2.warpAffine, mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:
import numpy as np
import math
angle = math.pi/6 # rotation angle
M = np.array([
[math.cos(angle),-math.sin(angle),0],
[math.sin(angle),math.cos(angle),0]
]).astype(np.float32)
And here is the output of image rotation:
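With $B = 0$, the matrix above rotates around the origin, which for an image is the top-left corner, so part of the picture can leave the frame. A common alternative is to rotate around the image centre by combining three steps: shift the centre to the origin, rotate, and shift back. The following is a sketch of that idea with 3x3 homogeneous matrices; the helper name `rotation_about` and the 512x512 image size are assumptions for illustration.

```python
import math
import numpy as np

def rotation_about(cx, cy, angle):
    """Return the 2x3 matrix that rotates by `angle` (radians) around (cx, cy)."""
    c, s = math.cos(angle), math.sin(angle)
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=np.float64)
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)
    # Applied right to left: move the centre to the origin, rotate, move back.
    full = back @ rotate @ to_origin
    return full[:2].astype(np.float32)    # drop the constant (0, 0, 1) row

w, h = 512, 512                           # assumed image size, for illustration only
cx, cy = w / 2, h / 2
M = rotation_about(cx, cy, math.pi / 6)

centre = M @ np.array([cx, cy, 1.0], dtype=np.float32)
print(centre)                             # the centre should map to (approximately) itself
```

OpenCV also offers cv2.getRotationMatrix2D(center, angle, scale) for this purpose. It takes the angle in degrees and treats positive angles as counter-clockwise on screen, so its sign convention differs from the matrix used in this post.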
References
[1] Educative. What is affine transformation? Educative. Retrieved from https://www.educative.io/answers/what-is-affine-transformation
[2] MathWorks. Affine transformation. MathWorks. Retrieved from https://www.mathworks.com/discovery/affine-transformation.html
[3] Hughes, R. Affine transformation. University of Edinburgh. Retrieved from https://homepages.inf.ed.ac.uk/rbf/HIPR2/affine.htm
[4] Wikipedia. Affine transformation. Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Affine_transformation
Thanks for reading! Stay tuned for more content, and feel free to share your thoughts and feedback! Your reactions help me improve and create even more useful posts 🙂
Alternate URL: https://github.com/tuttelikz/notes/tree/main/affine