Defactoring
Finally, we take only the top <span><math>r</math></span> rows of this, where <span><math>r</math></span> is the rank of the original matrix. That's found with <code>MatrixRank[m]</code>.
==== how/why it works ====
The basic idea is that the column Hermite decomposition leaves any common row factor the mapping might have had in the HNF part, while preserving in the unimodular part everything that's still meaningful about how the mapping works. So that's why we throw the column HNF away and keep the unimodular part. The rest of the algorithm is basically just "undoing" stuff so we get back to the structure of the matrix that we input.
Finally, we take only the top <span><math>r</math></span> rows, which again is an "undo" type operation. Here what we're undoing is that we had to graduate from a rectangle to a square temporarily, storing our important information in the form of this invertible square unimodular matrix so we could invert it while keeping it integer; but now we need to get it back into the same type of rectangular shape as we put in. So that's what this part is for.<ref>There is probably some special meaning or information in the rows you throw away here, but we're not sure what it might be.</ref>
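The steps described above can be sketched in code. The article's own snippets use Wolfram Language, but the following is a self-contained Python sketch; all function names here are ours, not from any library, and it computes only a relaxed echelon form rather than a strict HNF, which suffices because the Hermite part gets discarded anyway:

```python
from fractions import Fraction

def egcd(a, b):
    # extended Euclid: returns (g, s, t) with s*a + t*b == g >= 0
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, s, t = egcd(b, a % b)
    return (g, t, s - (a // b) * t)

def transpose(m):
    return [list(col) for col in zip(*m)]

def row_reduce(m):
    """Integer row reduction: returns (u, h) with u unimodular and
    h == u @ m in echelon form (a relaxed row Hermite form; enough
    here, since the Hermite part is thrown away anyway)."""
    rows, cols = len(m), len(m[0])
    h = [list(row) for row in m]
    u = [[int(i == j) for j in range(rows)] for i in range(rows)]
    pivot = 0
    for col in range(cols):
        if pivot >= rows:
            break
        nonzero = [i for i in range(pivot, rows) if h[i][col] != 0]
        if not nonzero:
            continue
        if nonzero[0] != pivot:
            h[pivot], h[nonzero[0]] = h[nonzero[0]], h[pivot]
            u[pivot], u[nonzero[0]] = u[nonzero[0]], u[pivot]
        for i in range(pivot + 1, rows):
            if h[i][col] == 0:
                continue
            a, b = h[pivot][col], h[i][col]
            g, s, t = egcd(a, b)
            p, q = a // g, b // g
            # determinant of [[s, t], [-q, p]] is s*p + t*q == 1,
            # so this two-row operation keeps u unimodular
            for mat in (h, u):
                rp, ri = mat[pivot], mat[i]
                mat[pivot] = [s * x + t * y for x, y in zip(rp, ri)]
                mat[i] = [-q * x + p * y for x, y in zip(rp, ri)]
        pivot += 1
    return u, h

def unimodular_inverse(u):
    """Invert an integer matrix with determinant +-1; the inverse is
    again integer, computed here with exact Fractions."""
    n = len(u)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(u)]
    for col in range(n):
        piv = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [x / d for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[col])]
    return [[int(x) for x in row[n:]] for row in aug]

def column_hermite_defactor(m):
    """Column-Hermite defactoring: column-decompose (i.e. row-decompose
    the transpose), discard the Hermite part, invert the unimodular
    part, and keep only the top r rows."""
    u, _ = row_reduce(transpose(m))           # u @ m^T == (column HNF)^T; discard it
    v_inv = transpose(unimodular_inverse(u))  # the d x d matrix to trim back down
    r = sum(1 for row in row_reduce(m)[1] if any(row))  # rank of the input
    return v_inv[:r]
```

For example, the 2-enfactored meantone mapping [[1, 0, -4], [0, 2, 8]] defactors to [[1, 0, -4], [0, 1, 4]], the familiar 5-limit meantone mapping.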
==== various additional ways of thinking about how/why it works ====
===== relationship with Smith defactoring =====
If you describe column Hermite defactoring as "column-Hermite inverse take-r-rows", the analogous way to describe Smith defactoring would be "row-Hermite column-Hermite row-Hermite column-Hermite row-Hermite column-Hermite ... inverse take-r-rows". In other words, the nature of Smith decomposition is essentially to repeatedly Hermite decompose from either angle until you've achieved Smith normal form, which is wasteful and unnecessary in this context, when all that is required is a single column-Hermite decomposition. This helps explain why column Hermite defactoring is in general more performant than Smith defactoring, when code is run against thousands of examples at a time.
According to Wolfram<ref>See: https://reference.wolfram.com/language/ref/SmithDecomposition.html</ref>, "the Smith decomposition can often be found by two iterations of the Hermite decomposition". This notion is echoed in the Sage docs<ref>See: https://doc.sagemath.org/html/en/reference/matrices/sage/matrix/matrix_integer_dense.html</ref>, which read, "Use Hermite normal form twice to find an invertible matrix whose inverse transforms a matrix with the same row span as self to its saturation," remembering that saturation is the same as defactoring. The reason multiple applications of Hermite decomposition are needed to achieve Smith normal form is that you won't necessarily get a diagonal matrix on the first go. But as we know from Smith defactoring, the center matrix of the Smith decomposition — the Smith normal form one — is not the target matrix, so achieving its exact form is not necessary to accomplish defactoring.
===== relationship with VEA =====
Another thought that might help congeal the notion of column Hermite defactoring for you is to use what you know about multimaps (AKA "wedgies"), in particular a) what they are, and b) how to defactor them. The answer to a) is that they are just the minor determinants (or "minors" for short) of rectangular matrices, or in other words, the closest thing rectangular matrices such as mappings have to a real determinant. And the answer to b) is that you simply extract the GCD of the entries in this list of minors. So if defactoring a list of minor determinants means dividing common factors out, then it should be little surprise that achieving a real determinant of ±1 is equivalent to defactoring, and thereby that leveraging the unimodularity of the other matrix produced by the Hermite decomposition should be valuable in this capacity.
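For the rank-2 case, the two answers above are small enough to sketch directly. This is a Python illustration rather than the article's Wolfram Language, and the function names are ours: one computes the list of 2×2 minors of a 2×d mapping (the entries of the multimap), and the other defactors that list by dividing out its GCD.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def minors_of_rank_2_mapping(m):
    """The 2x2 minor determinants of a 2 x d mapping, one per column
    pair in lexicographic order -- the entries of the multimap."""
    top, bottom = m
    return [top[i] * bottom[j] - top[j] * bottom[i]
            for i, j in combinations(range(len(top)), 2)]

def defactor_multimap(w):
    """Defactor a multimap by dividing the GCD out of its entries."""
    g = reduce(gcd, (abs(x) for x in w))
    return [x // g for x in w]
```

For example, the 2-enfactored meantone mapping [[1, 0, -4], [0, 2, 8]] has minors [2, 8, 8]; dividing out the GCD of 2 gives [1, 4, 4], which are exactly the minors of the defactored mapping [[1, 0, -4], [0, 1, 4]].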
===== unimodular matrix size =====
One reason for doing a column Hermite decomposition on a mapping can be understood by comparing the sizes of the unimodular matrices. Matrices are often described as <span><math>m×n</math></span>, where <span><math>m</math></span> is the row count and <span><math>n</math></span> is the column count. In the case of mappings it may be preferable to use variable names corresponding to the domain concepts of rank <span><math>r</math></span> and dimension <span><math>d</math></span>, i.e. to speak of <span><math>r×d</math></span> mappings. The key bit of info here is that — for non-trivial mappings, anyway — <span><math>d</math></span> is always greater than <span><math>r</math></span>. So a standard row-based Hermite decomposition, i.e. to the right, is going to produce an <span><math>r×r</math></span> unimodular matrix, while a column-based Hermite decomposition, i.e. to the bottom, is going to produce a <span><math>d×d</math></span> unimodular matrix. For example, 5-limit meantone has <span><math>r=2</math></span> and <span><math>d=3</math></span>, so a standard row-based Hermite decomposition produces a unimodular matrix that is only 2×2, while the column-based Hermite decomposition produces one that is 3×3. With <span><math>d>r</math></span>, it's clear that the column-based decomposition will always produce the larger unimodular matrix. In fact, the row-based decomposition's unimodular matrix is too small to hold as many entries as the original mapping, and therefore it could never preserve the entirety of the important information from the input (in terms of our example, a 3×3 matrix can hold a 2×3 matrix, but a 2×2 matrix cannot).
==== by hand ====