I’ll have one slice of 4D, thank you!

An ode to Affine Transformations.

Prologue

Almost every render you see today, be it in a game, in a movie, or in your favourite graphics application, is the result of a really cool technique that takes 3D objects, “expands” them into a 4D realm, twists and turns them in the fourth dimension, and ultimately extracts a slice, all so that you can witness your friend vroom past you in Mario Kart™ in full 3D glory. Wut? Why do we have to enter this realm of four dimensions, you ask? You have come to the right place, my friend, where we talk about matrices, what they can do for us, what they cannot (jump here if you have witnessed the glory of matrices already), and how we overcome their limitations.

Let us take a look at this tiny and abundantly common example:


Let’s consider the four corner points of this mesh. Say each of them sits at some point $(x, y)$. To create the animation shown above, we need to move the points around the origin, $(0, 0)$. Say we are required to rotate by an angle $\theta$ every frame. Using a tad bit of trigonometry and jumping onto the complex plane, if the current point $(x, y)$ is written as $k\cdot e^{i\alpha}$, then the new point $(x', y')$, rotated by $\theta$ radians, is $k\cdot e^{i\beta}$ with $\beta = \alpha + \theta$:
$$\begin{eqnarray}
k \cdot e^{i \beta} &=& k\cdot e^{i(\alpha + \theta)} \\
&=& k\cdot e^{i\alpha} \cdot e^{i\theta} \\
&=& k \cdot (\cos \alpha + i\sin \alpha) \cdot \\
& & (\cos \theta + i\sin \theta) \\
&=& (x + iy) \cdot (\cos \theta + i\sin \theta) \\
x'+iy' &=& (x \cos \theta - y \sin \theta) +\\
& & i(y \cos \theta + x \sin \theta)
\end{eqnarray}$$
Separating the real and imaginary parts:
$$\begin{eqnarray}
x' &=& x \cos \theta - y \sin \theta \\
y' &=& x \sin \theta + y \cos \theta
\end{eqnarray}$$
So we can place the points in their new coordinates every frame according to our equation here and we would end up with an animation shown above, where the points would be rotating about the origin by $\theta$ every frame.
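
As a quick sanity check, here is a minimal Python sketch that rotates a point once by complex multiplication and once with the two equations above, and confirms that both routes agree. The function names are made up for this illustration.

```python
import cmath
import math

def rotate_complex(x, y, theta):
    """Rotate (x, y) about the origin by multiplying with e^(i*theta)."""
    z = complex(x, y) * cmath.exp(1j * theta)
    return z.real, z.imag

def rotate_trig(x, y, theta):
    """Rotate (x, y) about the origin using the derived equations."""
    c, s = math.cos(theta), math.sin(theta)
    return x * c - y * s, x * s + y * c

# A quarter turn should carry (1, 0) to (0, 1), up to rounding.
p = rotate_complex(1.0, 0.0, math.pi / 2)
q = rotate_trig(1.0, 0.0, math.pi / 2)
print(p, q)
```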

Another way of writing the same equation is:
$$
\begin{bmatrix}
x'\\
y'
\end{bmatrix}
=
\begin{bmatrix}
\cos \theta & -\sin \theta\\
\sin \theta & \cos \theta
\end{bmatrix}
\cdot
\begin{bmatrix}
x\\
y
\end{bmatrix}
$$ If we multiply the matrices on the right hand side of this equation, we see that it’s the same old wine, just in a new bottle:
$$
\begin{bmatrix}
x'\\
y'
\end{bmatrix}
=
\begin{bmatrix}
x \cos \theta - y \sin \theta\\
x \sin \theta + y \cos \theta
\end{bmatrix}
$$
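
The matrix form translates almost word for word into code. Below is a small sketch in plain Python; the helper names `rotation_matrix` and `mat_vec` and the choice of a square with corners at $(\pm 1, \pm 1)$ are assumptions made for the example, not something taken from the demo.

```python
import math

def rotation_matrix(theta):
    """2x2 rotation matrix, stored as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# The four corners of a square centred on the origin, counter-clockwise.
square = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]

theta = math.radians(90)
rotated = [mat_vec(rotation_matrix(theta), p) for p in square]
# After a quarter turn each corner lands (up to rounding) on the next corner.
```

Calling this once per frame with a small `theta` gives the spinning-square animation described earlier.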

Part I
Rise of the Planet of the Matrices

Then, why would we bother writing a perfectly fine equation as a matrix multiplication? While matrices are not mathemagic, they definitely are a convenience. Please, allow me to demonstrate.

Say our new task is to both rotate and scale a square like this:


A scaling matrix can be built in much the same way as the rotation matrix above. We can readily see that the following equation holds:
$$
\begin{bmatrix}
x'\\
y'
\end{bmatrix}
=
\begin{bmatrix}
s_x & 0\\
0 & s_y
\end{bmatrix}
\cdot
\begin{bmatrix}
x\\
y
\end{bmatrix}
\\
=
\begin{bmatrix}
s_x\cdot x\\
s_y\cdot y
\end{bmatrix}
$$ where $s_x$ and $s_y$ are the scaling factors along $x$ and $y$ respectively.

To achieve multiple transformations (rotation and scaling as seen above), we can take a shortcut: instead of applying the rotation matrix first and then the scaling matrix, we can multiply the corresponding matrices together, like so, and apply the result once to the points:
$$
\begin{bmatrix}
s_x & 0\\
0 & s_y
\end{bmatrix} \cdot
\begin{bmatrix}
\cos \theta & -\sin \theta\\
\sin \theta & \cos \theta
\end{bmatrix}$$ How cool is that?! You can chain as many transformations as you please. Imagine the possibilities! Say you have a thousand points and a hundred transformations to apply to them. You could apply all hundred transformations to each point, resulting in 100 × 1000 = 100,000 matrix multiplications, or you can "collapse" the transformations into one matrix by multiplying the hundred matrices together (99 multiplications), and then multiply this single matrix with each of the thousand points (1,000 more). To put things in perspective, assuming a second per matrix multiplication, this shortcut solves the problem in under 20 minutes, as opposed to more than 24 hours with the naive method. There are a bunch of other nice things about matrices that make them the go-to choice for a variety of tasks, ranging from computer graphics to machine learning.
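
Here is a minimal sketch of the collapsing trick in plain Python. The helper names and the particular angle and scale factors are arbitrary choices for the illustration. One thing to keep in mind is that matrix multiplication is not commutative, so the matrix that should act first goes on the right, next to the point.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

theta = math.radians(30)
rotate = [[math.cos(theta), -math.sin(theta)],
          [math.sin(theta),  math.cos(theta)]]
scale = [[2.0, 0.0],
         [0.0, 0.5]]

# Collapse "rotate, then scale" into a single matrix.
combined = mat_mul(scale, rotate)

point = [1.0, 1.0]
step_by_step = mat_vec(scale, mat_vec(rotate, point))
in_one_go = mat_vec(combined, point)
# Both routes give the same result, up to floating-point rounding.
```

The saving comes from computing `combined` once and reusing it for every point.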

Okay, so matrices do have a few tricks up their sleeves. Can we use matrices for any transformation? The answer is, no. But before we go there, let's get a feel for what a 2D matrix allows us to do. Below is a demo with a $2 \times 2$ matrix represented as sliders:


Playing around with the 2D matrix, we can make a bunch of observations:
• The top left and bottom right elements scale our square like we expected.
• The top right and bottom left elements "shear" the square. But more importantly, we can use a combination of the two shears to rotate the object, albeit with some unintended scaling.
• We can of course fix the scaling by adjusting the two diagonal elements from the first point so that the object remains the same size. This is essentially what happens during rotation: $\cos\theta$ and $\sin\theta$ work together so that the shearing and re-scaling ultimately result in a pure rotation.
• Ultimately, we soon realise that the square is always anchored to the same point, the origin.
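
One way to see the second and third observations together, offered as an illustrative decomposition rather than a description of how the widget works: for $|\theta| < \pi/2$, set the top right element to $-\tan\theta$ and the bottom left element to $\tan\theta$. The square then turns by $\theta$, but every length is also stretched by $1/\cos\theta$. Multiplying the whole matrix by $\cos\theta$ removes the stretch and leaves exactly the rotation matrix:
$$
\cos\theta \cdot
\begin{bmatrix}
1 & -\tan\theta\\
\tan\theta & 1
\end{bmatrix}
=
\begin{bmatrix}
\cos \theta & -\sin \theta\\
\sin \theta & \cos \theta
\end{bmatrix}
$$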

It's so dull, we could say our square is rather, square! In using matrices, we are limiting ourselves to a subset of transformations called Linear Transformations. TL;DR, linear transformations always leave the origin where it is. If you look at the square in the widget above, its origin is never moved; it stays anchored to the same point irrespective of how the other points move around it. So, how in the world do we move our dear squarie if 2D matrices do not allow "translations"?!

Let's evaluate our options. We can of course add the translation vector $(t_x, t_y)$ post matrix multiplication like so:
$$
\begin{bmatrix}
x'\\
y'
\end{bmatrix}
=
\begin{bmatrix}
t_x\\
t_y
\end{bmatrix}
+
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
\cdot
\begin{bmatrix}
x\\
y
\end{bmatrix}
$$ but if we do this, we lose the sweet matrix properties, such as chaining transformations by simply multiplying matrices together. So, this idea is out.
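
To make the bookkeeping concrete, here is what happens in this matrix-plus-vector form when a second transformation (matrix $A_2$, translation $\mathbf{t}_2$) follows a first one (matrix $A_1$, translation $\mathbf{t}_1$). The symbols $A_i$, $\mathbf{t}_i$ and $\mathbf{p}$ are introduced here purely for illustration:
$$
A_2 \cdot (A_1 \cdot \mathbf{p} + \mathbf{t}_1) + \mathbf{t}_2
=
(A_2 \cdot A_1) \cdot \mathbf{p} + (A_2 \cdot \mathbf{t}_1 + \mathbf{t}_2)
$$
The result is still a matrix plus a translation, but combining two transformations is no longer a single matrix product: the matrix part and the translation part have to be tracked and updated separately at every step.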

Back to the drawing board! So matrices allow only certain kinds of transformations, as seen in the demo above, where the points move around the origin. Let's think about what linear transformations mean in the next dimension!

Part II
The Prodigal Mathematician

A mathematician does not play miser when it comes to dimensions! So off we go, adding another dimension to our square problem, which gives us a stack of squares, or a cuboid. Let us now apply linear transformations to the cuboid. We can imagine the cuboid transforming around the origin akin to how the square did, i.e., with no translations. If we take a slice of the cuboid along a plane parallel to the $xy$-plane, we will see this (feel free to change the camera by dragging on the canvas):

Time to make some observations:
• The top left $2 \times 2$ elements affect the square exactly like the $2 \times 2$ matrix we saw above. This is good, we are starting off from solid foundations!
• The bottom left element and the element to its right shear the cuboid along the $z$-axis and, as such, generally do not have an effect on our square.
• The top right element and the element below it shear the cuboid parallel to the $xy$-plane, which in turn translates our square within the plane of the slice. This is exactly what we wanted!

By promoting ourselves to a higher dimension, we have successfully found a way to translate our squarie all while retaining the sweet properties of matrices!
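
Written out as a worked example, with the top right element called $t_x$ and the one below it $t_y$, the shear acts on a point $(x, y, z)$ of the cuboid like this:
$$
\begin{bmatrix}
1 & 0 & t_x\\
0 & 1 & t_y\\
0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
x\\
y\\
z
\end{bmatrix}
=
\begin{bmatrix}
x + t_x \cdot z\\
y + t_y \cdot z\\
z
\end{bmatrix}
$$
On the slice at $z = 1$ this is simply $(x + t_x, y + t_y)$, a translation, while the slice at $z = 0$, which contains the origin, does not move at all.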

Just an aside: we do not really create the cuboid, you must have guessed that much; all we do is convert a 2D vector to 3D as:
$$\begin{bmatrix}
x\\
y
\end{bmatrix} \rightarrow
\begin{bmatrix}
x\\
y\\
1
\end{bmatrix}
$$
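
Putting the pieces together, the sketch below lifts a 2D point to $(x, y, 1)$, builds $3 \times 3$ rotation and translation matrices, and chains them with an ordinary matrix product. It is plain Python with helper names invented for this example, so treat it as an illustration under those assumptions rather than a reference implementation.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def translation(tx, ty):
    """Translation as a 3x3 matrix: the offsets sit in the last column."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

def rotation(theta):
    """The familiar 2x2 rotation in the top left, padded to 3x3."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c,   -s,  0.0],
            [s,    c,  0.0],
            [0.0, 0.0, 1.0]]

def to_homogeneous(x, y):
    """Lift a 2D point onto the z = 1 slice."""
    return [x, y, 1.0]

# Rotate by 90 degrees first, then move 5 units along x.
combined = mat_mul(translation(5.0, 0.0), rotation(math.pi / 2))

x, y, w = mat_vec(combined, to_homogeneous(1.0, 0.0))
# (1, 0) rotates to (0, 1), then shifts to (5, 1); w stays 1.
```

Dropping the last component `w` takes us back from the slice to plain 2D coordinates.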

Epilogue

This concludes our journey, which took us from the simple problem of wanting to move a 2D shape, to linear transformations and matrices, to tapping into a higher dimension to solve the problem, and eventually to a way of performing the standard transformations (Rotate, Scale and Translate) using only matrices. Exactly the same trick is used in three-dimensional graphics too, only now we tap into the fourth dimension and "slice" it to get the 3D representation that we are oh so fond of! Finally, for the sake of completeness, let's take a look at a typical transformation of a 3D point:
$$
\begin{eqnarray}
\begin{bmatrix}
x'\\
y'\\
z'\\
1\\
\end{bmatrix}
&=&
\begin{bmatrix}
s_x & 0 & 0 & 0\\
0 & s_y & 0 & 0\\
0 & 0 & s_z & 0\\
0 & 0 & 0 & 1\\
\end{bmatrix}\\
& \cdot &
\begin{bmatrix}
\cos \theta & -\sin \theta & 0 & 0\\
\sin \theta & \cos \theta & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
\end{bmatrix}\\
& \cdot &
\begin{bmatrix}
1 & 0 & 0 & t_x\\
0 & 1 & 0 & t_y\\
0 & 0 & 1 & t_z\\
0 & 0 & 0 & 1\\
\end{bmatrix}\\
& \cdot &
\begin{bmatrix}
x\\
y\\
z\\
1\\
\end{bmatrix}\\
\end{eqnarray}$$
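
The same equation can be sketched in plain Python. The helper names are hypothetical, and the rotation is taken about the $z$-axis, as in the matrix above. Since the rightmost matrix acts on the point first, this particular product translates, then rotates, then scales; reordering the factors generally gives a different result.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def scaling(sx, sy, sz):
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c,   -s,  0.0, 0.0],
            [s,    c,  0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

S = scaling(2.0, 2.0, 2.0)
R = rotation_z(math.pi / 2)
T = translation(1.0, 0.0, 0.0)
origin = [0.0, 0.0, 0.0, 1.0]

# Same order as the equation: S . R . T . p
as_written = mat_vec(mat_mul(S, mat_mul(R, T)), origin)
# origin -> (1, 0, 0) -> (0, 1, 0) -> (0, 2, 0)

# Reversed order: T . R . S . p (scale, then rotate, then translate)
reordered = mat_vec(mat_mul(T, mat_mul(R, S)), origin)
# origin stays put under S and R, then moves to (1, 0, 0)
```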

Author: Harsha

Harsha is not particularly keen on the out-of-body experience of writing in third-person. What a weirdo!
