of this collection [1], [2], and [3], we have now noticed:
- interpretation of multiplication of a matrix by a vector,
- the bodily which means of matrix-matrix multiplication,
- the conduct of a number of special-type matrices, and
- visualization of matrix transpose.
On this story, I need to share my perspective on what lies beneath matrix inversion, why completely different formulation associated to inversion are the way in which they really are, and at last, why calculating the inverse will be achieved far more simply for matrices of a number of particular sorts.
Listed here are the definitions that I take advantage of all through the tales of this collection:
- Matrices are denoted with uppercase (like ‘A‘, ‘B‘), whereas vectors and scalars are denoted with lowercase (like ‘x‘, ‘y‘ or ‘m‘, ‘n‘).
- |x| – is the size of vector ‘x‘,
- AT – is the transpose of matrix ‘A‘,
- B-1 – is the inverse of matrix ‘B‘.
Definition of the inverse matrix
From half 1 of this collection – “matrix-vector multiplication” [1], we keep in mind that a sure matrix “A“, when multiplied by a vector ‘x‘ as “y = Ax“, will be handled as a change of enter vector ‘x‘ into the output vector ‘y‘. If that’s the case, then the inverse matrix A-1 ought to do the reverse transformation – it ought to rework vector ‘y‘ again to ‘x‘:
[begin{equation*}
x = A^{-1}y
end{equation*}]

Substituting “y = Ax” there’ll give us:
[begin{equation*}
x = A^{-1}y = A^{-1}(Ax) = (A^{-1}A)x
end{equation*}]
which signifies that the product of the unique matrix and its inverse – A-1A, ought to be such a matrix, which does no transformation to any enter vector ‘x‘. In different phrases:
[begin{equation*}
(A^{-1}A) = E
end{equation*}]
the place “E” is the id matrix.

The primary query that may come up right here is, is it at all times attainable to reverse the affect of a sure matrix “A“? The reply is – it’s attainable, provided that no 2 completely different enter vectors x1 and x2 are being reworked by way of “A” into the identical output vector ‘y‘. In different phrases, the inverse matrix A-1 exists provided that for any output vector ‘y‘ there exists precisely one enter vector ‘x‘, which is reworked by way of “A” into it:
[begin{equation*}
y = Ax
end{equation*}]


On this collection, I don’t need to dive an excessive amount of into the formal a part of definitions and proofs. As an alternative, I need to observe a number of instances the place it’s really attainable to invert the given matrix “A“, and we are going to see how the inverse matrix A-1 is calculated for every of these instances.
Inverting chains of matrices
An essential system associated to matrix inverse is:
[begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
end{equation*}]
which states that the inverse of the product of matrices is the same as the product of inverse matrices, however within the reverse order. Let’s perceive why the order of matrices is being reversed.
What’s the bodily which means of the inverse (AB)-1? It ought to be such a matrix that turns again the affect of the matrix (AB). So if:
[begin{equation*}
y = (AB)x,
end{equation*}]
then, we must always have:
[begin{equation*}
x = (AB)^{-1}y.
end{equation*}]
Now the transformation “y = (AB)x” goes in 2 steps: first, we do:
[begin{equation*}
Bx = t,
end{equation*}]
which supplies an intermediate vector ‘t‘, after which that ‘t‘ is multiplied by “A“:
[begin{equation*}
y = At = A(Bx).
end{equation*}]

So the matrix “A” influenced the vector after it was already influenced by “B“. On this case, to show again such a sequential affect, at first we must always flip again the affect of “A“, by multiplying A-1 over ‘y‘, which can give us:
[begin{equation*}
A^{-1}y = A^{-1}(ABx) = (A^{-1}A)Bx = EBx = Bx = t,
end{equation*}]
… the intermediate vector ‘t‘, produced a bit above.

Notice, the vector ‘t’ participates right here twice.
Then, after getting again the intermediate vector ‘t‘, to revive ‘x‘, we also needs to reverse the affect of matrix “B“. And that’s achieved by multiplying B-1 over ‘t‘:
[begin{equation*}
B^{-1}t = B^{-1}(Bx) = (B^{-1}B)x = Ex = x,
end{equation*}]
or writing all of it in an expanded method:
[begin{equation*}
x = B^{-1}(A^{-1}A)Bx = (B^{-1}A^{-1})(AB)x,
end{equation*}]
which explicitly exhibits that to show again the affect of the matrix (AB) we must always use (B-1A-1).

Notice, each vectors ‘x’ and ‘t’ take part right here twice.
For this reason within the inverse of a product of matrices, their order is reversed:
[begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
end{equation*}]
The identical precept is utilized when we have now extra matrices in a series, like:
[begin{equation*}
(ABC)^{-1} = C^{-1}B^{-1}A^{-1}
end{equation*}]
Inversion of a number of particular matrices
Now, with the notion of what lies beneath matrix inversion, let’s view how matrices of a number of particular sorts are being inverted.
Inverse of cyclic-shift matrix
A cyclic-shift matrix is such a matrix “V“, which when multiplied by an enter vector ‘x‘, produces an output vector “y = Vx“, the place all values of ‘x‘ are cyclic shifted by some ‘ok‘ positions. To realize that, the cyclic-shift matrix “V” has 2 traces of ‘1’s, which reside parallel to its major diagonal, whereas all different cells of it are ‘0’s.
[begin{equation*}
begin{pmatrix}
y_1 y_2 y_3 y_4 y_5
end{pmatrix}
= y = Vx =
begin{bmatrix}
0 & 0 & 1 & 0 & 0
0 & 0 & 0 & 1 & 0
0 & 0 & 0 & 0 & 1
1 & 0 & 0 & 0 & 0
0 & 1 & 0 & 0 & 0
end{bmatrix}
*
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
=
begin{pmatrix}
x_3 x_4 x_5 x_1 x_2
end{pmatrix}
end{equation*}]

Now, how ought to we undo the transformation of the cyclic-shift matrix “V“? Clearly, we must always apply one other cyclic-shift matrix V-1, which now cyclic shifts all of the values of ‘y‘ downwards by ‘ok‘ positions (bear in mind, “V” was shifting all of the values of ‘x‘ upwards).
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= x = V^{-1}Vx =
begin{bmatrix}
0 & 0 & 0 & 1 & 0
0 & 0 & 0 & 0 & 1
1 & 0 & 0 & 0 & 0
0 & 1 & 0 & 0 & 0
0 & 0 & 1 & 0 & 0
end{bmatrix}
begin{bmatrix}
0 & 0 & 1 & 0 & 0
0 & 0 & 0 & 1 & 0
0 & 0 & 0 & 0 & 1
1 & 0 & 0 & 0 & 0
0 & 1 & 0 & 0 & 0
end{bmatrix}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= V^{-1}y
end{equation*}]

For this reason the inverse of a cyclic-shift matrix is one other cyclic-shift matrix:
[begin{equation*}
V_1^{-1} = V_2
end{equation*}]
Greater than that, we are able to be aware that the X-diagram of V-1 is definitely the horizontal flip of the X-diagram of “V“. And from the earlier a part of this collection – “transpose of a matrix” [3], we keep in mind that the horizontal flip of an X-diagram corresponds to the transpose of that matrix. For this reason the inverse of a cyclic shift matrix is the same as its transpose:
[begin{equation*}
V^{-1} = V^T
end{equation*}]
Inverse of an change matrix
An change matrix, usually denoted by “J“, is such a matrix, which when multiplied by an enter vector ‘x‘, produces an output vector ‘y‘, having all of the values of ‘x‘, however in reverse order. To realize that, “J” has ‘1’s on its anti-diagonal, whereas all different cells are ‘0’s.
[begin{equation*}
begin{pmatrix}
y_1 y_2 y_3 y_4 y_5
end{pmatrix}
= y = Jx =
begin{bmatrix}
0 & 0 & 0 & 0 & 1
0 & 0 & 0 & 1 & 0
0 & 0 & 1 & 0 & 0
0 & 1 & 0 & 0 & 0
1 & 0 & 0 & 0 & 0
end{bmatrix}
*
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
=
begin{pmatrix}
x_5 x_4 x_3 x_2 x_1
end{pmatrix}
end{equation*}]

Clearly, to undo the sort of transformation, we must always apply yet one more change matrix.
[
begin{equation*}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= x = J^{-1}Jx =
begin{bmatrix}
0 & 0 & 0 & 0 & 1
0 & 0 & 0 & 1 & 0
0 & 0 & 1 & 0 & 0
0 & 1 & 0 & 0 & 0
1 & 0 & 0 & 0 & 0
end{bmatrix}
begin{bmatrix}
0 & 0 & 0 & 0 & 1
0 & 0 & 0 & 1 & 0
0 & 0 & 1 & 0 & 0
0 & 1 & 0 & 0 & 0
1 & 0 & 0 & 0 & 0
end{bmatrix}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= J^{-1}y
end{equation*}]

For this reason the inverse of an change matrix is the change matrix itself:
[begin{equation*}
J^{-1} = J
end{equation*}]
Inverse of a permutation matrix
A permutation matrix is such a matrix “P” which, when multiplied by an enter vector ‘x‘, rearranges its values in a special order. To realize that, an n*n-sized permutation matrix “P” has ‘n‘ 1(s), organized in such a method that no two 1(s) seem on the identical row or the identical column. All different cells of “P” are 0(s).
[begin{equation*}
begin{pmatrix}
y_1 y_2 y_3 y_4 y_5
end{pmatrix}
= y = Px =
begin{bmatrix}
0 & 0 & 1 & 0 & 0
1 & 0 & 0 & 0 & 0
0 & 0 & 0 & 1 & 0
0 & 0 & 0 & 0 & 1
0 & 1 & 0 & 0 & 0
end{bmatrix}
*
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
=
begin{pmatrix}
x_3 x_1 x_4 x_5 x_2
end{pmatrix}
end{equation*}]

Now, what sort of matrix ought to be the inverse of a permutation matrix? In different phrases, the right way to undo the transformation of a permutation matrix “P“? Clearly, we have to do one other rearrangement, which acts in reverse order. So, for instance, if the enter worth x3 was moved by “P” to output worth y1, then within the inverse permutation matrix P-1, the enter worth y1 ought to be moved again to output worth x3. Which means that when drawing X-diagrams of permutation matrices “P-1” and “P“, one would be the reflection of the opposite.

Equally to the case of an change matrix, within the case of a permutation matrix, we are able to visually be aware that the X-diagrams of “P” and P-1 differ solely by a horizontal flip. That’s the reason the inverse of any permutation matrix “P” is the same as its transposition:
[begin{equation*}
P^{-1} = P^T
end{equation*}]
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= x = P^{-1}Px =
begin{bmatrix}
0 & 1 & 0 & 0 & 0
0 & 0 & 0 & 0 & 1
1 & 0 & 0 & 0 & 0
0 & 0 & 1 & 0 & 0
0 & 0 & 0 & 1 & 0
end{bmatrix}
begin{bmatrix}
0 & 0 & 1 & 0 & 0
1 & 0 & 0 & 0 & 0
0 & 0 & 0 & 1 & 0
0 & 0 & 0 & 0 & 1
0 & 1 & 0 & 0 & 0
end{bmatrix}
begin{pmatrix}
x_1 x_2 x_3 x_4 x_5
end{pmatrix}
= P^{-1}y
end{equation*}]
Inverse of a rotation matrix
A rotation matrix on 2D airplane is such a matrix “R“, which, when multiplied by a vector (x1, x2), rotates the purpose “x=(x1, x2)” counter-clockwise by a sure angle “ϴ” across the null-point. Its system is:
[
begin{equation*}
begin{pmatrix}
y_1 y_2
end{pmatrix}
= y = Rx =
begin{bmatrix}
cos(theta) & -sin(theta)
sin(theta) & phantom{+} cos(theta)
end{bmatrix}
*
begin{pmatrix}
x_1 x_2
end{pmatrix}
end{equation*}]

Now, what ought to be the inverse of a rotation matrix? The best way to undo the rotation produced by a matrix “R“? Clearly, that ought to be one other rotation matrix, this time with an angle “-ϴ” (or “360°-ϴ“):
[begin{equation*}
R^{-1} =
begin{bmatrix}
cos(-theta) & -sin(-theta)
sin(-theta) & phantom{+} cos(-theta)
end{bmatrix}
=
begin{bmatrix}
phantom{+} cos(theta) & sin(theta)
-sin(theta) & cos(theta)
end{bmatrix}
=
R^T
end{equation*}]
Which is why the inverse of a rotation matrix is one other rotation matrix. We additionally see that the inverse R-1 is the same as the transpose of the unique matrix “R“.
Inverse of a triangular matrix
An upper-triangular matrix is a sq. matrix that has zeros under its diagonal. Due to that, in its X-diagram, there aren’t any arrows directed downwards:

The horizontal arrows correspond to cells of the diagonal, whereas the arrows which might be directed upwards correspond to the cells above the diagonal.
Equally, the lower-triangular matrix is outlined, which has zeroes above its major diagonal. On this article, we are going to focus solely on upper-triangular matrices, as for lower-triangular ones, inversion is carried out in an identical method.
For simplicity, let’s at first tackle inverting a 2×2-sized upper-triangular matrix ‘A‘.

As soon as ‘A‘ is multiplied by an enter vector ‘x‘, the consequence vector “y = Ax” has the next kind:
[begin{equation*}
y =
begin{pmatrix}
y_1 y_2
end{pmatrix}
=
begin{bmatrix}
a_{1,1} & a_{1,2}
0 & a_{2,2}
end{bmatrix}
begin{pmatrix}
x_1 x_2
end{pmatrix}
=
begin{pmatrix}
begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2
a_{2,2}x_2
end{aligned}
end{pmatrix}
end{equation*}]
Now, when calculating the inverse matrix A-1, we would like it to behave within the reverse order:

How ought to we restore (x1, x2) from (y1, y2)? The primary and easiest step is to revive x2, utilizing solely y2, as a result of y2 was initially affected solely by x2. We don’t want the worth of y1 for that:

Subsequent, how ought to we restore x1? This time, we are able to’t use solely y1, as a result of the worth “y1 = a1,1x1 + a1,2x2” is type of a mix of x1 and x2. However we are able to restore x1 if utilizing each y1 and y2 correctly. This time, y2 will assist to filter out the affect of x2, so the pure worth of x1 will be restored:

We see now that the inverse A-1 of the upper-triangular matrix “A” can be an upper-triangular matrix.
What about triangular matrices of bigger sizes? Let’s take this time a 3×3-sized matrix and discover its inverse analytically.

Values of the output vector ‘y‘ are obtained now from ‘x‘ within the following method:
[
begin{equation*}
y =
begin{pmatrix}
y_1 y_2 y_3
end{pmatrix}
= Ax =
begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3}
0 & a_{2,2} & a_{2,3}
0 & 0 & a_{3,3}
end{bmatrix}
begin{pmatrix}
x_1 x_2 x_3
end{pmatrix}
=
begin{pmatrix}
begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3
a_{2,2}x_2 + a_{2,3}x_3
a_{3,3}x_3
end{aligned}
end{pmatrix}
end{equation*}]
As we’re eager about constructing the inverse matrix A-1, our goal is to seek out (x1, x2, x3), having the values of (y1, y2, y3):
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3
end{pmatrix}
= A^{-1}y =
begin{bmatrix}
text{?} & text{?} & text{?}
text{?} & text{?} & text{?}
text{?} & text{?} & text{?}
end{bmatrix}
*
begin{pmatrix}
y_1 y_2 y_3
end{pmatrix}
end{equation*}]
In different phrases, we should remedy the system of linear equations talked about above.
Doing that can restore at first the worth of x3 as:
[begin{equation*}
y_3 = a_{3,3}x_3, hspace{1cm} x_3 = frac{1}{a_{3,3}} y_3
end{equation*}]
which can make clear cells of the final row of A-1 :
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3
end{pmatrix}
= A^{-1}y =
begin{bmatrix}
text{?} & text{?} & text{?}
text{?} & text{?} & text{?}
0 & 0 & frac{1}{a_{3,3}}
end{bmatrix}
*
begin{pmatrix}
y_1 y_2 y_3
end{pmatrix}
end{equation*}]
Having x3 found out, we are able to deliver all its occurrences to the left aspect of the system:
[begin{equation*}
begin{pmatrix}
y_1 – a_{1,3}x_3
y_2 – a_{2,3}x_3
y_3 – a_{3,3}x_3
end{pmatrix}
=
begin{pmatrix}
begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2
a_{2,2}x_2
0
end{aligned}
end{pmatrix}
end{equation*}]
which can enable us to calculate x2 as:
[begin{equation*}
y_2 – a_{2,3}x_3 = a_{2,2}x_2, hspace{1cm}
x_2 = frac{y_2 – a_{2,3}x_3}{a_{2,2}} = frac{y_2 – (a_{2,3}/a_{3,3})y_3}{a_{2,2}}
end{equation*}]
This already clarifies the cells of the second row of A-1 :
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3
end{pmatrix}
= A^{-1}y =
begin{bmatrix}
text{?} & text{?} & text{?} [0.2cm]
0 & frac{1}{a_{2,2}} & – frac{a_{2,3}}{a_{2,2}a_{3,3}} [0.2cm]
0 & 0 & frac{1}{a_{3,3}}
finish{bmatrix}
*
start{pmatrix}
y_1 y_2 y_3
finish{pmatrix}
finish{equation*}]
Lastly, having the values of x3 and x2 found out, we are able to do the identical trick of shifting now x2 to the left aspect of the system:
[begin{equation*}
begin{pmatrix}
begin{aligned}
y_1 – a_{1,3}x_3 & – a_{1,2}x_2
y_2 – a_{2,3}x_3 & – a_{2,2}x_2
y_3 – a_{3,3}x_3 &
end{aligned}
end{pmatrix}
=
begin{pmatrix}
a_{1,1}x_1
0
0
end{pmatrix}
end{equation*}]
from which x1 will probably be derived as:
[begin{equation*}
begin{aligned}
& y_1 – a_{1,3}x_3 – a_{1,2}x_2 = a_{1,1}x_1,
& x_1
= frac{y_1 – a_{1,3}x_3 – a_{1,2}x_2}{a_{1,1}}
= frac{y_1 – (a_{1,3}/a_{3,3})y_3 – a_{1,2}frac{y_2 – (a_{2,3}/a_{3,3})y_3}{a_{2,2}}}{a_{1,1}}
end{aligned}
end{equation*}]
so the primary row of matrix A-1 may also be clarified:
[begin{equation*}
begin{pmatrix}
x_1 x_2 x_3
end{pmatrix}
= A^{-1}y =
begin{bmatrix}
frac{1}{a_{1,1}} & – frac{a_{1,2}}{a_{1,1}a_{2,2}} & frac{a_{1,2}a_{2,3} – a_{1,3}a_{2,2}}{a_{1,1}a_{2,2}a_{3,3}} [0.2cm]
0 & frac{1}{a_{2,2}} & – frac{a_{2,3}}{a_{2,2}a_{3,3}} [0.2cm]
0 & 0 & frac{1}{a_{3,3}}
finish{bmatrix}
*
start{pmatrix}
y_1 y_2 y_3
finish{pmatrix}
finish{equation*}]
After deriving A-1 analytically, we are able to see that additionally it is an upper-triangular matrix.
Listening to the sequence of actions that we used right here to calculate A-1, we are able to say for positive now that the inverse of any upper-triangular matrix ‘A‘ can be an upper-triangular matrix:

A similar judgment will present that the inverse of a lower-triangular matrix is one other lower-triangular matrix.
A numerical instance of inverting a series of matrices
Let’s have one other have a look at why, throughout an inversion of a series of matrices, their order is reversed. Recalling the system:
[begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
end{equation*}]
This time, for each ‘A‘ and ‘B‘, we are going to take sure sorts of matrices. The primary matrix “A=V” will probably be a cyclic shift matrix:

Let’s recall right here that to revive the enter vector ‘x‘, the inverse V-1 ought to do the other – cyclic shift values of the argument vector ‘y‘ downwards:

The second matrix “B=S” will probably be a diagonal matrix with completely different values on its major diagonal:

The inverse S-1 of such a scale matrix, to revive the unique vector ‘x‘, should halve solely the primary 2 values of its argument vector ‘y‘:

Now, what sort of conduct will the product matrix “VS” have? When calculating “y = VSx“, it should double solely the primary 2 values of the enter vector ‘x‘, and cyclic shift the whole consequence upwards.

We all know already that when the output vector “y = VSx” is calculated, to reverse the affect of the product matrix “VS” and to revive the enter vector ‘x‘, we must always do:
[begin{equation*}
x = (VS)^{-1}y = S^{-1}V^{-1}y
end{equation*}]
In different phrases, the order of matrices ‘V‘ and ‘S‘ ought to be reversed throughout inversion:

And what is going to occur if we attempt to invert the love of “VS” in an improper method, with out reversing the order of the matrices, assuming that V-1S-1 is what ought to be used for it:

We see that the unique vector (x1, x2, x3, x4) from the suitable aspect isn’t restored on the left aspect now. As an alternative, we have now vector
(2x1, x2, 0.5x3, x4) there. One motive for that is that the worth x3 shouldn’t be halved on its path, however it really will get halved as a result of for the time being when matrix S-1 is utilized, x3 seems on the second place from the highest, which really halves it. Similar refers back to the path of worth x1. All that leads to having an altered vector on the left aspect.
Conclusion
On this story, we have now checked out matrix inversion operation A-1 as one thing that undoes the transformation of the given matrix “A“. We have now noticed why inverting a series of matrices like (ABC)-1 really reverses the order of multiplication, leading to C-1B-1A-1. Additionally, we acquired a visible perspective on why inverting a number of particular sorts of matrices leads to one other matrix of the identical sort.
Thanks for studying!
That is in all probability the final a part of my “Understanding Matrices” collection. I hope you loved studying all 4 components! If that’s the case, be happy to observe me on LinkedIn, as hopefully different articles will probably be coming quickly, and I’ll put up updates there!
My gratitude to:
– Asya Papyan, for exact design of all of the used illustrations ( behance.web/asyapapyan ).
– Roza Galstyan, for cautious evaluation of the draft, and helpful strategies ( linkedin.com/in/roza-galstyan-a54a8b352/ ).If you happen to loved studying this story, be happy to attach with me on LinkedIn ( linkedin.com/in/tigran-hayrapetyan-cs/ ).
All used pictures, except in any other case famous, are designed by request of the creator.
References:
[1] – Understanding matrices | Half 1: Matrix-Vector Multiplication
[2] – Understanding matrices | Half 2: Matrix-Matrix Multiplication