Defactoring algorithms: Difference between revisions
ArrowHead294 (talk | contribs) m Update links and footnote references |
ArrowHead294 (talk | contribs) m Formatting of markup |
||
| Line 183: | Line 183: | ||
<math>\displaystyle (U^{-1})_{1:r}</math> | <math>\displaystyle (U^{-1})_{1:r}</math> | ||
where hnf is the column-style Hermite normal form, and 1:''r'' is taking the first ''r'' rows. | where hnf is the column-style Hermite normal form, and {{nowrap|1:''r''}} is taking the first ''r'' rows. | ||
Here is an implementation in Python (with SciPy and SymPy): | Here is an implementation in Python (with SciPy and SymPy): | ||
| Line 217: | Line 217: | ||
The basic idea is that the column Hermite decomposition leaves any common row factor the mapping might have had in the HNF part, while preserving in the unimodular part everything that's still meaningful about how the mapping works. So that's why we throw the column HNF away and keep the unimodular part. The rest of the algorithm is basically just "undoing" stuff so we get back to the structure of the matrix that we input. | The basic idea is that the column Hermite decomposition leaves any common row factor the mapping might have had in the HNF part, while preserving in the unimodular part everything that's still meaningful about how the mapping works. So that's why we throw the column HNF away and keep the unimodular part. The rest of the algorithm is basically just "undoing" stuff so we get back to the structure of the matrix that we input. | ||
So inverting is one of those "undo" type operations. To understand why, we have to understand the nature of this decomposition. What the Hermite decomposition does is return a unimodular matrix U and a Hermite normal form matrix H such that if you left-multiply your original matrix A by the unimodular matrix U you get the normal form matrix H, or in other words, UA = H. So, think of it this way. If A is what we input, and we want something sort of like A, but U is what we've taken, and U is multiplied with A in this equality to get H, where H is also kind of like A, then probably what we really want is something like U, but inverted. | So inverting is one of those "undo" type operations. To understand why, we have to understand the nature of this decomposition. What the Hermite decomposition does is return a unimodular matrix U and a Hermite normal form matrix H such that if you left-multiply your original matrix A by the unimodular matrix U you get the normal form matrix H, or in other words, {{nowrap|UA {{=}} H}}. So, think of it this way. If A is what we input, and we want something sort of like A, but U is what we've taken, and U is multiplied with A in this equality to get H, where H is also kind of like A, then probably what we really want is something like U, but inverted. | ||
Finally, we take only the top <math>r</math> rows, which again is an "undo" type operation. Here what we're undoing is that we had to graduate from a rectangle to a square temporarily, storing our important information in the form of this invertible square unimodular matrix temporarily, so we could invert it while keeping it integer, but now we need to get it back into the same type of rectangular shape as we put in. So that's what this part is for.<ref group="note">There is probably some special meaning or information in the rows you throw away here, but we're not sure what it might be.</ref> | Finally, we take only the top <math>r</math> rows, which again is an "undo" type operation. Here what we're undoing is that we had to graduate from a rectangle to a square temporarily, storing our important information in the form of this invertible square unimodular matrix temporarily, so we could invert it while keeping it integer, but now we need to get it back into the same type of rectangular shape as we put in. So that's what this part is for.<ref group="note">There is probably some special meaning or information in the rows you throw away here, but we're not sure what it might be.</ref> | ||
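The three steps described above (unimodular column reduction, inversion, truncation to the top ''r'' rows) can be sketched in plain Python. This is only an illustrative sketch, not the SciPy/SymPy implementation referred to earlier: the function name <code>column_hermite_defactor</code> and its helpers are hypothetical, it assumes an integer mapping with full row rank, and it only pushes the all-zero columns to one side rather than producing a fully conventional HNF. Instead of inverting U at the end, it applies the inverse of every column operation as a row operation on a second matrix, so that matrix is U{{inv}} at all times.

```python
def column_hermite_defactor(mapping):
    """Return the first r rows of U^-1, where A @ U = [S | 0] and U is unimodular.

    Assumes `mapping` is a list of r integer rows of equal length n, with rank r.
    """
    a = [list(row) for row in mapping]            # working copy of A (r x n)
    r, n = len(a), len(a[0])
    v = [[int(i == j) for j in range(n)] for i in range(n)]  # tracks U^-1

    def swap_cols(i, k):
        # Column swap on A; the matching operation on U^-1 is a row swap.
        for row in a:
            row[i], row[k] = row[k], row[i]
        v[i], v[k] = v[k], v[i]

    def add_col(j, i, q):
        # col_j += q * col_i on A; the matching operation on U^-1 is
        # row_i -= q * row_j.
        for row in a:
            row[j] += q * row[i]
        v[i] = [x - q * y for x, y in zip(v[i], v[j])]

    for i in range(r):
        while True:
            nonzero = [j for j in range(i, n) if a[i][j] != 0]
            if not nonzero:
                raise ValueError("mapping does not have full row rank")
            # Bring the entry of smallest magnitude in row i to the diagonal.
            pivot = min(nonzero, key=lambda j: abs(a[i][j]))
            if pivot != i:
                swap_cols(i, pivot)
            if len(nonzero) == 1:
                break                             # rest of row i is already zero
            # Euclidean step: reduce every later entry of row i modulo the pivot.
            for j in range(i + 1, n):
                add_col(j, i, -(a[i][j] // a[i][i]))

    return v[:r]                                  # top r rows of U^-1


# The 11-enfactored mapping used elsewhere in this article.
defactored = column_hermite_defactor([[6, 5, -4], [4, -4, 1]])
print(defactored)
```

Only additions of integer multiples of columns and column swaps are used, so the transformation stays unimodular throughout. The result is defactored but not canonical; taking its ordinary HNF afterwards would be a separate step.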
| Line 237: | Line 237: | ||
The following proof is adapted primarily from Tom Price's thinking: | The following proof is adapted primarily from Tom Price's thinking: | ||
# The input matrix is an m×n matrix A. | # The input matrix is an m×n matrix A. | ||
# It decomposes into a slightly bigger and square ( | # It decomposes into a slightly bigger and square {{nowrap|(''n'' × ''n'')}} unimodular matrix U and another m×n matrix which is not exactly A in HNF (because we only have to use unimodular operations so far as to get all the all-zero columns off to one side of A; we don't need to satisfy all of the conventional constraints of HNF), but we'll still call it H. The unimodular matrix is a transformation from A into H, so, {{nowrap|AU {{=}} H}}. | ||
# If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (m×m) matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A). | # If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (m×m) matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A). | ||
# We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get | # We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get {{nowrap|S{{inv}}AUU{{inv}} {{=}} S{{inv}}HU{{inv}}}}. The U's cancel out on the left so we end up with {{nowrap|S{{inv}}A {{=}} S{{inv}}HU{{inv}}}}. At first glance we don't seem to have gained any further insight. But there's more we can do from here. | ||
# Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have S{{inv}}A = TU{{inv}}. | # Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have {{nowrap|S{{inv}}A {{=}} TU{{inv}}}}. | ||
# Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating U⁻¹. That's how we think of the output of column Hermite defactoring—our supposedly defactored matrix—so let's call that D. We now have | # Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating U⁻¹. That's how we think of the output of column Hermite defactoring—our supposedly defactored matrix—so let's call that D. We now have {{nowrap|S{{inv}}A {{=}} D}}. (This is how we can see that the Pernet-Stein method of multiplying the input matrix by a transformation matrix that is a truncated and inversed column Hermite normal form of the input is equivalent to our column Hermite method, which takes the other route to the same result: inverting and truncating the unimodular result of the Hermite decomposition.) | ||
# We need to prove now that D has three qualities: | # We need to prove now that D has three qualities: | ||
:: a) It's defactored, | :: a) It's defactored, | ||
:: b) It still represents the same temperament (i.e. it has the same nullspace as A), and | :: b) It still represents the same temperament (i.e. it has the same nullspace as A), and | ||
:: c) It's integer. | :: c) It's integer. | ||
# Proving (a) is easy. It's defactored because U was unimodular. U's determinant was 1, and neither inverting it nor truncating it would change that. Alternatively, we can prove this by showing how on the other side of the equation, | # Proving (a) is easy. It's defactored because U was unimodular. U's determinant was 1, and neither inverting it nor truncating it would change that. Alternatively, we can prove this by showing how on the other side of the equation, S{{inv}}A is surjective as a function on lattice points (in other words, there's no points in the tempered lattice that JI lattice points don't map to). We begin with the fact that H has the same image as A, because right-multiplication with a unimodular matrix such as U doesn't change the image. Then S has the same image as H, too, and therefore the same image as A, because removing the all-zero columns doesn't change the image either. Now that we've established this, we can assert that S⁻¹A is surjective by describing a lattice point x such that {{nowrap|y {{=}} S{{inv}}Ax}} for any given lattice point y. And because S and A have the same image, we know that {{nowrap|Sy {{=}} Ax}}, and therefore {{nowrap|y {{=}} S{{inv}}Ax}}. | ||
# Proving (b) is even easier. Multiplying any matrix with an invertible matrix on the left keeps the nullspace the same. | # Proving (b) is even easier. Multiplying any matrix with an invertible matrix on the left keeps the nullspace the same. S{{inv}} is clearly invertible, being itself the inverse of S. A way to understand this is: a non-invertible matrix is the same as a singular matrix, i.e. one whose determinant is 0. So as long as you don't wipe things out by essentially multiplying by 0, the nullspace information is preserved, just scaled. | ||
# Proving (c) is a bit trickier, because | # Proving (c) is a bit trickier, because S{{inv}} is not necessarily an integer matrix. But we can show that S{{inv}}A is an integer matrix by showing that it maps lattice points to lattice points. Suppose we have that same equation from the proof of (a), namely that {{nowrap|y {{=}} S{{inv}}Ax}}, where x is a lattice point. We want to show that y is a lattice point. Again, since A and S have the same image (when considered as functions on lattice points), there must be some lattice point z with {{nowrap|Sz {{=}} Ax}}. But we also know that Sy = Ax. Since S is invertible, and therefore injective, {{nowrap|y {{=}} z}}, so y is a lattice point. | ||
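The key identity of the proof, that S{{inv}}A is an integer matrix even though S{{inv}} itself is not, can be checked numerically with exact fractions. The sketch below uses the 11-enfactored mapping from this article as A. The matrix <code>S</code> is an assumption: it is one valid greatest factor matrix for this A, worked out by hand with unimodular column operations, and other sequences of operations would give a different but equally valid S.

```python
from fractions import Fraction

A = [[6, 5, -4], [4, -4, 1]]      # 11-enfactored mapping
S = [[-1, 0], [-8, -11]]          # assumed greatest factor matrix (one valid choice)

# Inverse of a 2x2 matrix via the adjugate, kept exact with Fraction.
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[Fraction(S[1][1], det_S), Fraction(-S[0][1], det_S)],
         [Fraction(-S[1][0], det_S), Fraction(S[0][0], det_S)]]

# D = S^-1 A, computed entry by entry.
D = [[sum(S_inv[i][k] * A[k][j] for k in range(2)) for j in range(3)]
     for i in range(2)]

print("det S =", det_S)
print("S^-1  =", S_inv)           # contains non-integer entries
print("D     =", D)               # every entry has denominator 1

# Multiplying back confirms that S D reproduces A.
SD = [[sum(S[i][k] * D[k][j] for k in range(2)) for j in range(3)]
      for i in range(2)]
print("S D == A:", SD == A)
```

The absolute value of <code>det_S</code> is the common factor being removed, and the non-integer entries of <code>S_inv</code> all have denominators dividing it.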
===== Relationship with other defactoring methods ===== | ===== Relationship with other defactoring methods ===== | ||
| Line 289: | Line 289: | ||
After we know how to do these two things individually, we will learn how to tweak them and assemble them together in order to perform a complete column Hermite defactoring. | After we know how to do these two things individually, we will learn how to tweak them and assemble them together in order to perform a complete column Hermite defactoring. | ||
Fortunately, both of these two processes can be done using a technique you may already be familiar with if you have learned how to calculate the nullspace of a mapping by hand (as demonstrated [[Dave Keenan | Fortunately, both of these two processes can be done using a technique you may already be familiar with if you have learned how to calculate the nullspace of a mapping by hand (as demonstrated [[Dave Keenan & Douglas Blumeyer's guide to RTT/Exploring temperaments#Nullspace|here]]): | ||
# Augmenting your matrix with an identity matrix | # Augmenting your matrix with an identity matrix | ||
| Line 320: | Line 320: | ||
Now we begin applying elementary row operations until the part on the left is in Hermite Normal Form. If you need to review the definition of HNF and its constraints, you can find more detail [[matrix echelon forms#HNF|here]]. The quick and dirty is: | Now we begin applying elementary row operations until the part on the left is in Hermite Normal Form. If you need to review the definition of HNF and its constraints, you can find more detail [[matrix echelon forms#HNF|here]]. The quick and dirty is: | ||
# All pivots > 0 | # All pivots > 0 | ||
# All entries in pivot columns below the pivots | # All entries in pivot columns below the pivots = 0 | ||
# All entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot | # All entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot | ||
One special thing about computing the HNF is that we're not allowed to use all elementary operations; in particular we're not allowed to multiply (or divide) rows. Our main technique, then, will be adding or subtracting rows from each other. This, of course, includes adding or subtracting ''multiples'' of rows from each other, because doing so is equivalent to performing those additions or subtractions one at a time (note that adding or subtracting ''multiples'' of rows from each other is significantly different than simply ''multiplying'' a row by a constant).<ref group="note">The fact that you're not allowed to multiply or divide is equivalent to the fact that at every step along the way, the augmented matrix remains unimodular.</ref> | One special thing about computing the HNF is that we're not allowed to use all elementary operations; in particular we're not allowed to multiply (or divide) rows. Our main technique, then, will be adding or subtracting rows from each other. This, of course, includes adding or subtracting ''multiples'' of rows from each other, because doing so is equivalent to performing those additions or subtractions one at a time (note that adding or subtracting ''multiples'' of rows from each other is significantly different than simply ''multiplying'' a row by a constant).<ref group="note">The fact that you're not allowed to multiply or divide is equivalent to the fact that at every step along the way, the augmented matrix remains unimodular.</ref> | ||
| Line 353: | Line 353: | ||
</math> | </math> | ||
We're actually quite close to done! All we need to do is flip the signs on the 2nd row. But wait, you protest! Isn't that multiplying a row by | We're actually quite close to done! All we need to do is flip the signs on the 2nd row. But wait, you protest! Isn't that multiplying a row by −1, which we specifically forbade? Well, sure, but that just shows we need to clarify what we're concerned about, which is essentially enfactoring. Multiplying by −1 does not change the GCD of the row, where multiplying by −2 or 2 would. Note that because the process for taking the HNF forbids multiplying ''or dividing'', it will never introduce enfactoring where there was none previously, but it also does not remove enfactoring that is there. | ||
Perhaps another helpful way of thinking about this is that multiplying the row by | Perhaps another helpful way of thinking about this is that multiplying the row by −1 does not alter the potential effects this row could have being added or subtracted from other rows. It merely swaps addition and subtraction. Whereas multiplying the row by any integer with absolute value greater than 1 ''would'' affect the potential effects this row could have being added or subtracted from other rows: it would limit them. | ||
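The point about the GCD of a row can be seen in a few lines of Python. This is a small illustration only; the helper <code>row_gcd</code> and the sample row are hypothetical and not taken from the worked example.

```python
from functools import reduce
from math import gcd


def row_gcd(row):
    """GCD of the entries of an integer row (math.gcd ignores signs)."""
    return reduce(gcd, row)


row = [0, -11, 4]                      # hypothetical row with no common factor
negated = [-x for x in row]            # sign flip: multiply by -1
doubled = [2 * x for x in row]         # multiply by 2

print(row_gcd(row), row_gcd(negated), row_gcd(doubled))
```

The sign flip leaves the GCD unchanged, while doubling introduces a common factor of 2, which is exactly the kind of enfactoring the restriction is meant to prevent.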
So, let's do that sign flip: | So, let's do that sign flip: | ||
| Line 368: | Line 368: | ||
And we're done! Let's confirm though. | And we're done! Let's confirm though. | ||
# '''All pivots | # '''All pivots > 0?''' Check. The 1st row's pivot is 2 and the 2nd row's pivot is 11. | ||
# '''All entries in pivot columns below the pivots | # '''All entries in pivot columns below the pivots = 0'''? Check. This only applies to one entry—the bottom right one, below the 1st row's pivot—but it is indeed 0. | ||
# '''All entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot'''? Check. Again, this only applies to one entry—the center top one, above the 2nd row's pivot of 11—but it is 5, and 5 is indeed non-negative and | # '''All entries in pivot columns above the pivots ≥ 0 and strictly less than the pivot'''? Check. Again, this only applies to one entry—the center top one, above the 2nd row's pivot of 11—but it is 5, and 5 is indeed non-negative and < 11. | ||
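The confirmation above can also be done mechanically. The following is a minimal sketch of a checker for the three constraints, plus the usual echelon requirement that each pivot sits strictly to the right of the pivot in the row above. The function name <code>is_row_hnf</code> is hypothetical, and the sample matrix only borrows the pivots 2 and 11 and the entry 5 from the text; its last column is made up for illustration.

```python
def is_row_hnf(matrix):
    """Check the row-style HNF constraints on a list of integer rows."""
    prev_pivot_col = -1
    for i, row in enumerate(matrix):
        nonzero_cols = [j for j, x in enumerate(row) if x != 0]
        if not nonzero_cols:
            # An all-zero row is only allowed if every later row is zero too.
            return all(x == 0 for later in matrix[i:] for x in later)
        p = nonzero_cols[0]                      # pivot column of this row
        # Echelon shape, and constraint 1: the pivot must be positive.
        if p <= prev_pivot_col or row[p] <= 0:
            return False
        # Constraint 2: entries below the pivot are 0.
        if any(matrix[k][p] != 0 for k in range(i + 1, len(matrix))):
            return False
        # Constraint 3: entries above the pivot are >= 0 and < pivot.
        if any(not (0 <= matrix[k][p] < row[p]) for k in range(i)):
            return False
        prev_pivot_col = p
    return True


example = [[2, 5, 1], [0, 11, 3]]                # last column is hypothetical
print(is_row_hnf(example))                       # True
print(is_row_hnf([[2, 5, 1], [0, -11, 3]]))      # False: negative pivot
print(is_row_hnf([[2, 12, 1], [0, 11, 3]]))      # False: 12 is not < 11
```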
And so, we have performed the Hermite decomposition. The matrix to the left of the augmentation line—the one in place of our original matrix—is that original matrix in HNF: | And so, we have performed the Hermite decomposition. The matrix to the left of the augmentation line—the one in place of our original matrix—is that original matrix in HNF: | ||
| Line 449: | Line 449: | ||
</math> | </math> | ||
Okay, let's next target the bottom-center entry. How can we make that | Okay, let's next target the bottom-center entry. How can we make that −2 into a 0? Let's add the 2nd row to the 3rd row 2 times: | ||
<math> | <math> | ||
| Line 469: | Line 469: | ||
</math> | </math> | ||
Finally, we just need to divide the 3rd row by | Finally, we just need to divide the 3rd row by −2. Yes, unlike with the Hermite decomposition, all elementary row operations are permitted, including multiplying or dividing rows. And in this case there's no restrictions against non-integers (which we didn't even explicitly mention when doing the HNF, but yes, HNF requires integers). So here's where we end up: | ||
<math> | <math> | ||
| Line 507: | Line 507: | ||
</math> | </math> | ||
This matrix is chosen specifically to demonstrate the importance of the unimodularity of the other matrix produced by the Hermite decomposition. A unimodular matrix is defined by having a determinant of | This matrix is chosen specifically to demonstrate the importance of the unimodularity of the other matrix produced by the Hermite decomposition. A unimodular matrix is defined by having a determinant of ±1. And what does this have to do with inverses? Well, take a look at the determinant of our original matrix here, {{rket|{{map|3 -2 4}} {{map|1 0 2}} {{map|0 1 0}}}}. It's −2, so its absolute value is 2. For an invertible integer matrix, every denominator in the inverse divides the absolute value of the determinant, and in this case the LCM of all the denominators in the inverse is exactly 2.<ref group="note">If you're familiar with the formula for the Moore-Penrose inverse of rectangular matrices, you may recognize this fact as akin to how you multiply the outside of the pseudoinverse by the reciprocal of the determinant of the matrix.</ref><ref group="note">This may also shed some light on the fact that the only square matrices that are not invertible are those with determinants equal to 0.</ref> And so, the fact that the other matrix produced by the Hermite decomposition is unimodular means that not only is it invertible, if it has only integer terms (which it will, being involved in HNF), then its inverse will also have only integer terms. And this is important because the inverse of a Hermite unimodular matrix is just one step away from the defactored form of an input matrix. | ||
}} | }} | ||
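The relationship between the determinant and the denominators of the inverse can be checked for this matrix with exact arithmetic. The sketch below inverts the matrix by Gauss-Jordan elimination on an identity-augmented copy, in the same spirit as the hand method, using Python's <code>Fraction</code> type. The helper names <code>inverse</code> and <code>det3</code> are hypothetical, and the code assumes a square, invertible input.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column (fails if singular).
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]          # dividing is allowed here
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


M = [[3, -2, 4], [1, 0, 2], [0, 1, 0]]
M_inv = inverse(M)
denominators = [x.denominator for row in M_inv for x in row]
denominator_lcm = reduce(lambda a, b: a * b // gcd(a, b), denominators)

print("det =", det3(M))
print("inverse =", M_inv)
print("LCM of denominators =", denominator_lcm)
```

For this matrix the determinant has absolute value 2 and the inverse contains halves, so the LCM of the denominators is also 2. A unimodular matrix run through the same code would yield denominators of 1 only.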
| Line 782: | Line 782: | ||
\end{matrix} \right]</math> | \end{matrix} \right]</math> | ||
The pivots are 1 and 11, so that 11 tells us that we had a common factor of 11<ref group="note">In the doubly-enfactored case of {{rket|{{map|17 16 -4}} {{map|4 -4 1}}}}, i.e. with a common factor of 33 = 3 | The pivots are 1 and 11, so that 11 tells us that we had a common factor of 11<ref group="note">In the doubly-enfactored case of {{rket|{{map|17 16 -4}} {{map|4 -4 1}}}}, i.e. with a common factor of {{nowrap|33 {{=}} 3 × 11}}, the two pivots of the HNF are 3 and 11, putting each of them on display separately.</ref><ref group="note">It's interesting to observe that while the 11-enfactoring can be observed in the original matrix as a linear combination of 2 of the 1st row with -3 of the 2nd row, i.e. 2{{map|6 5 -4}} + -3{{map|4 -4 1}} = {{map|0 22 -11}}, the linear combination of ''columns'', i.e. slicing the original {{rket|{{map|6 5 -4}} {{map|4 -4 1}}}} mapping the other direction like {{rbra|{{vector|6 4}} {{vector|5 -4}} {{vector|-4 1}}}}, that leads to the revelation of this 11 is completely different: 1{{vector|6 4}} + −2{{vector|5 -4}} + −1{{vector|-4 1}} = {{vector|0 11}}.</ref>. You could say that the HNF is useful for identifying common factors, but not for removing them. But if you leave them behind in the column-style HNF, the information that is retained in the unimodular matrix which is the other product of the Hermite decomposition, is enough to preserve everything important about the temperament, to get you back to where you started via an inverse and a trimming of extraneous rows. | ||
}} | }} | ||
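The common factors 11 and 33 mentioned above can be cross-checked without computing any HNF. This sketch does not use the column-style HNF described in the text; it relies instead on the standard fact that the product of those pivots equals, up to sign, the GCD of the maximal minors of the mapping. The function name <code>greatest_factor</code> is hypothetical, and the code handles two-row mappings only.

```python
from itertools import combinations
from math import gcd


def greatest_factor(mapping):
    """GCD of all 2x2 minors of a two-row integer mapping."""
    top, bottom = mapping
    result = 0
    for i, j in combinations(range(len(top)), 2):
        minor = top[i] * bottom[j] - top[j] * bottom[i]
        result = gcd(result, minor)           # math.gcd ignores signs
    return result


print(greatest_factor([[6, 5, -4], [4, -4, 1]]))     # 11
print(greatest_factor([[17, 16, -4], [4, -4, 1]]))   # 33
```

This only identifies the size of the common factor; it says nothing about how the factor splits across the individual pivots, which is the extra information the HNF puts on display.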