Defactoring: Difference between revisions
Cmloegcmluin (talk | contribs) |
Cmloegcmluin (talk | contribs) →by hand: HNF and inverse by hand |
||
| Line 313: | Line 313: | ||
<div class="mw-collapsible"> | <div class="mw-collapsible"> | ||
To work through DCF by hand, we first need to know how to do a couple of its building blocks by hand: Hermite decomposition, and inversion. Both of these processes can be done using | To work through DCF by hand, we first need to know how to do a couple of its building blocks by hand: Hermite decomposition, and inversion. Both of these processes can be done using a technique you may be familiar with already if you've calculated the null-space of a mapping by hand (as demonstrated [[User:Cmloegcmluin/RTT_How-To#null-space|here]]): augmenting your matrix with an identity matrix, then performing elementary row or column operations until a desired state is achieved. | ||
In the Wolfram Language algorithm described above, the matrix is first transposed and then Hermite decomposed. By hand, we do it a bit more simply. | ===== Hermite decomposition by hand ===== | ||
Let's do a Hermite decomposition example first. We begin with the mapping for the temperament 12&26bbb, which looks like {{vector|{{map|12 19 28}} {{map|26 43 60}}}}, or as a matrix: | |||
<math>\left[ \begin{array} {rrr} | |||
12 & 19 & 28 \\ | |||
26 & 43 & 60 \\ | |||
\end{array} \right]</math> | |||
Then we augment it with an identity matrix. However, unlike with the method for finding the null-space, we do not augment it on the bottom, but to the right: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
12 & 19 & 28 \\ | |||
26 & 43 & 60 \\ | |||
\end{array} & \rule[-5mm]{0.375mm}{12mm} & \begin{array} {r} | |||
1 & 0 \\ | |||
0 & 1 \\ | |||
\end{array} \end{array} \right]</math> | |||
Now we begin applying elementary row operations until the part on the left is in Hermite Normal Form. If you need to review the definition of HNF and its constraints, you can find that in detail later in this article [[User:Cmloegcmluin/Defactored_canonical_form#HNF|here]]. The quick and dirty is: | |||
# all pivots > 0 | |||
# all entries in pivot columns below the pivots = 0 | |||
# all entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot | |||
One special thing about computing the HNF is that we're not allowed to use all elementary operations; in particular we're not allowed to multiply (or divide) rows. Our main technique, then, will be adding or subtract rows from each other. This, of course, includes adding or subtracting ''multiples'' of rows from each other, because doing so is equivalent to performing those additions or subtractions one at a time (note that adding or subtracting ''multiples'' of rows from each other is significantly different than simply ''multiplying'' a row by itself). | |||
So let's begin by subtracting the 1st row from the 2nd row, and let's do it 2 times, because we can see that would get the pivot of the 2nd row pretty close to 0, which is where we're trying to get it, per the 2nd HNF constraint above. | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
12 & 19 & 28 \\ | |||
2 & 5 & 4 \\ | |||
\end{array} & \rule[-5mm]{0.375mm}{12mm} & \begin{array} {r} | |||
1 & 0 \\ | |||
-2 & 1 \\ | |||
\end{array} \end{array} \right]</math> | |||
Okay. We can see now that if we subtract 6 times the 2nd row back from the 1st row now, we'll create a 0 in the first column. It's not in the bottom left where we need it, but that's no big deal; reordering the rows is an allowed row operation. So let's do that subtraction: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
0 & -11 & 4 \\ | |||
2 & 5 & 4 \\ | |||
\end{array} & \rule[-5mm]{0.375mm}{12mm} & \begin{array} {r} | |||
13 & -6 \\ | |||
-2 & 1 \\ | |||
\end{array} \end{array} \right]</math> | |||
And then reorder the rows: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
2 & 5 & 4 \\ | |||
0 & -11 & 4 \\ | |||
\end{array} & \rule[-5mm]{0.375mm}{12mm} & \begin{array} {r} | |||
-2 & 1 \\ | |||
13 & -6 \\ | |||
\end{array} \end{array} \right]</math> | |||
We're actually quite close to done! All we need to do is flip the signs on the 2nd row. But wait, you protest! Isn't that multiplying a row by -1, which we specifically forbade? Well, sure, but that just shows we need to clarity what we're concerned about, which is essentially enfactoring. Multiplying by -1 does not change the GCD of the row, where multiplying by -2 or 2 would. Note that because the process for taking the HNF forbids multiplying ''or dividing'', it will never introduce enfactoring where was there was none previously, but it also does not remove enfactoring that is there. | |||
Perhaps another helpful way of thinking about this is that multiplying the row by -1 does not alter the potential effects this row could have being added or subtracted from other rows. It merely swaps addition and subtraction. Whereas multiplying the row by any integer with absolute value greater than 1 ''would'' affect the potential effects this row could have being added or subtracted from other rows: it would limit them. | |||
So, let's do that sign flip: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
2 & 5 & 4 \\ | |||
0 & 11 & -4 \\ | |||
\end{array} & \rule[-5mm]{0.375mm}{12mm} & \begin{array} {r} | |||
-2 & 1 \\ | |||
-13 & 6 \\ | |||
\end{array} \end{array} \right]</math> | |||
And we're done! Let's confirm though. | |||
# '''all pivots > 0?''' Check. The 1st row's pivot is 2 and the 2nd row's pivot is 11. | |||
# '''all entries in pivot columns below the pivots = 0'''? Check. This only applies to one entry — the bottom right one, below the 1st row's pivot — but it is indeed 0. | |||
# '''all entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot'''? Check. Again, this only applies to one entry — the center top one, above the 2nd row's pivot of 11 — but it is 5, and 5 is indeed non-negative and < 11. | |||
And so, we have performed the Hermite decomposition. The matrix to the left of the augmentation line — the one in place of our original matrix — is that original matrix in HNF: | |||
<math>\left[ \begin{array} {rrr} | |||
2 & 5 & 4 \\ | |||
0 & 11 & -4 \\ | |||
\end{array} \right]</math> | |||
And the matrix to the right of the augmentation line is the other output of the Hermite decomposition: a unimodular matrix with a special property. | |||
<math>\left[ \begin{array} {rrr} | |||
-2 & 1 \\ | |||
-13 & 6 \\ | |||
\end{array} \right]</math> | |||
This special property is that if you take the original matrix and left multiply it by this unimodular matrix, you will get the HNF. | |||
<math>\begin{array} {l} \left[ \begin{array} {rrr} | |||
-2 & 1 \\ | |||
-13 & 6 \\ | |||
\end{array} \right] & × & \left[ \begin{array} {r} | |||
12 & 19 & 28 \\ | |||
26 & 43 & 60 \\ | |||
\end{array} \right] & = & \left[ \begin{array} {r} | |||
2 & 5 & 4 \\ | |||
0 & 11 & -4 \\ | |||
\end{array} \right] \end{array}</math> | |||
And that's all there is to the Hermite decomposition. | |||
===== inversion by hand ===== | |||
Now let's take a look at how to do inversion by hand. The first thing to note is that this process only works for rectangular matrices. So you will not be using on non-trivial mappings or comma-bases directly. But, as you know, there is a useful application for this process on the unimodular matrix which is the other result of the Hermite decomposition than the HNF, and unimodular matrices are always square. | |||
Here's a random example matrix (well, one I stole from a quick web search, anyway): | |||
<math>\left[ \begin{array} {rrr} | |||
3 & -2 & 4 \\ | |||
1 & 0 & 2 \\ | |||
0 & 1 & 0 \\ | |||
\end{array} \right]</math> | |||
The first step is to augment it. Similar to the Hermite decomposition process, we augment it to the right: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
3 & -2 & 4 \\ | |||
1 & 0 & 2 \\ | |||
0 & 1 & 0 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
1 & 0 & 0 \\ | |||
0 & 1 & 0 \\ | |||
0 & 0 & 1 \\ | |||
\end{array} \end{array} \right]</math> | |||
Our goal now is to use elementary row operations until the original matrix — the one to the left of the augmentation line — is an identity matrix. At that point, the matrix on the right side of the augmentation line — the one that started out as an identity matrix — will be the inverse of the original matrix. I don't know about you, but for me, this relationship was instantly intuitive and memorable, and I'm super glad I learned how to this one by hand! | |||
As our first step, let's rearrange the rows of the matrix. Just some obvious-looking moves (that ''probably'' won't backfire) to get us superficially closer to the identity matrix on the left: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
1 & 0 & 2 \\ | |||
0 & 1 & 0 \\ | |||
3 & -2 & 4 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
0 & 1 & 0 \\ | |||
0 & 0 & 1 \\ | |||
1 & 0 & 0 \\ | |||
\end{array} \end{array} \right]</math> | |||
Okay, now let's target the bottom-right entry. How can we make that 3 into a 0? Let's subtract the 1st row from the 3rd row 3 times: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
1 & 0 & 2 \\ | |||
0 & 1 & 0 \\ | |||
0 & -2 & -2 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
0 & 1 & 0 \\ | |||
0 & 0 & 1 \\ | |||
1 & -3 & 0 \\ | |||
\end{array} \end{array} \right]</math> | |||
Okay, let's next target the bottom-center entry. How can we make that -2 into a 0? Let's add the 2nd row to the 3rd row 2 times: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
1 & 0 & 2 \\ | |||
0 & 1 & 0 \\ | |||
0 & 0 & -2 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
0 & 1 & 0 \\ | |||
0 & 0 & 1 \\ | |||
1 & -3 & 2 \\ | |||
\end{array} \end{array} \right]</math> | |||
Next, let's target the top-right entry, making that a 0 by adding the 3rd row to the 1st row: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
1 & 0 & 0 \\ | |||
0 & 1 & 0 \\ | |||
0 & 0 & -2 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
1 & -2 & 2 \\ | |||
0 & 0 & 1 \\ | |||
1 & -3 & 2 \\ | |||
\end{array} \end{array} \right]</math> | |||
Finally, we just need to divide the 3rd row by -2. Yes, unlike with the Hermite decomposition, all elementary row operations are permitted, including multiplying or dividing rows. And in this case there's no restrictions against non-integers (which we didn't even explicitly mention when doing the HNF, but yes, HNF requires integers). So here's where we end up: | |||
<math>\left[ \begin{array} {l} \begin{array} {rrr} | |||
1 & 0 & 0 \\ | |||
0 & 1 & 0 \\ | |||
0 & 0 & 1 \\ | |||
\end{array} & \rule[-7.5mm]{0.375mm}{18mm} & \begin{array} {r} | |||
1 & -2 & 2 \\ | |||
0 & 0 & 1 \\ | |||
-\frac12 & \frac32 & -1 \\ | |||
\end{array} \end{array} \right]</math> | |||
I chose this matrix specifically to demonstrate the importance of the unimodularity of the other matrix produced by the Hermite decomposition. A unimodular matrix is defined by having a determinant of ±1. And what does this have to do with inverses? Well, take a look at the determinant of our original matrix here, {{vector|{{map|3 -2 4}} {{map|1 0 2}} {{map|0 1 0}}}}. It's 2. The determinant of an invertible matrix will tell you what the LCM of all the denominators in the inverse will be<ref>If you're familiar with the formula for the Moore-Penrose pseudoinverse of rectangular matrices, you may recognize this fact as akin to how you multiply the outside of the pseudoinverse by the reciprocal of the determinant of the matrix.</ref><ref>This may also shed some light on the fact that the only square matrices that are not invertible are those with determinants equal to 0.</ref> And so, the fact that the other matrix produced by the Hermite decomposition is unimodular means that not only is it invertible, if it has only integer terms (which it will, being involved in HNF), then its inverse will also have only integer terms. And this is important because the inverse of a Hermite unimodular matrix is just one step away from the defactored form of an input matrix. | |||
===== putting it all together ===== | |||
In the Wolfram Language algorithm described above, the input matrix is first transposed and then Hermite decomposed. In fact, this is due to a limitation of the built-in <code>HermiteDecomposition[]</code> function in Wolfram Language. The HNF is well understood both in its more common and default row-style as well as a column-style, so it might be expected for Wolfram Language to provide an option for the <code>HermiteDecomposition[]</code> function to perform it by columns. The transposition our code does is a workaround for this lack. But by hand, there's no reason we can't simply do a column-style Hermite decomposition directly. Literally the only difference is that instead of augmenting the original matrix to the right, we augment it to the bottom, and use elementary column operations instead of elementary row operations (so it's actually more similar to the process for computing the null-space in that way). | |||
Because column-style Hermite decomposition is | |||
By hand, because we know what we're doing, we do it a bit more simply. | |||
<math>\begin{array} {l} \begin{array} {r} | <math>\begin{array} {l} \begin{array} {r} | ||