Defactoring algorithms: Difference between revisions

m Polish the intro
Algorithms: details in math and Python
Line 50: Line 50:
Smith defactoring has not yet been mathematically proven to always defactor mappings, while Pernet-Stein and column Hermite defactoring have been proven. Tests Douglas ran on thousands of random mappings, however, have empirically proven that all three methods work all of the time. Pernet-Stein and column Hermite are more closely related, and so they give the exact same results as each other every time, whereas Smith defactoring sometimes gives different results; however, after taking the HNF of the results, all three do become exactly the same.  
Smith defactoring has not yet been mathematically proven to always defactor mappings, while Pernet-Stein and column Hermite defactoring have been proven. Tests Douglas ran on thousands of random mappings, however, have empirically proven that all three methods work all of the time. Pernet-Stein and column Hermite are more closely related, and so they give the exact same results as each other every time, whereas Smith defactoring sometimes gives different results; however, after taking the HNF of the results, all three do become exactly the same.  


Douglas argues that Column Hermite defactoring is the preferable defactoring algorithm particularly when the routine for Hermite normal form also gives a [[Wikipedia: Unimodular matrix|unimodular]] transformation matrix, such as that in [[Wikipedia: Wolfram Language|Wolfram Language]]. The following reasons are listed:  
Douglas argues that Column Hermite defactoring is the preferable defactoring algorithm particularly when the routine for Hermite normal form also gives a [[Wikipedia: Unimodular matrix|unimodular]] transformation matrix, such as that in [[Wikipedia: Wolfram Language|Wolfram Language]] (formerly Mathematica, see [https://www.wolfram.com/language/ site]). The following reasons are listed:  
* It is computationally cheap, wasting little resources computing things irrelevant to the result<ref>
* It is computationally cheap, wasting little resources computing things irrelevant to the result<ref>
Using the following code in Wolfram Language:<br>
Using the following code in Wolfram Language:<br>
Line 80: Line 80:
=== Smith defactoring ===
=== Smith defactoring ===


Dave and Douglas did much of their work in [https://www.wolfram.com/language/ Wolfram Language] (formerly Mathematica), a popular programming language used for math problems. In this section we'll give examples using it.
Dave and Douglas did much of their work in Wolfram Language, a popular programming language used for math problems. Flora provided the mathematical formulae and [https://www.python.org/ Python] implementations based on their result.  


An input mapping <math>m</math>, such as the example Gene gives [[saturation|on the xen wiki page for Saturation]], {{rket|{{map|12 19 28 34}} {{map|26 41 60 72}}}}, in Wolfram Language you would have to write as a list:
An input mapping <math>m</math>, such as the example Gene gives in [[Saturation]], {{rket| {{map| 12 19 28 34 }} {{map| 26 41 60 72 }} }}, in Wolfram Language you would have to write as a list:


<nowiki>m = {{12,19,28,34},{26,41,60,72}};</nowiki>
<pre>
m = {{12,19,28,34},{26,41,60,72}};
</pre>


The implementation of Gene's method in Wolfram Language is simple. Just two lines:
The implementation of Gene's method in Wolfram Language is simple. Just two lines:


<nowiki>rightReducingMatrix[m_] := Last[SmithDecomposition[m]]
<pre>
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]]</nowiki>
rightReducingMatrix[m_] := Last[SmithDecomposition[m]]
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]]
</pre>


So the first thing that happens to <math>m</math> when you pass it in to <code>smithDefactor[]</code> is that it calls <code>rightReducingMatrix[]</code> on it. This will find the Smith decomposition (using a function built in to Wolfram Language), which gives you three outputs: the Smith normal form, flanked by its left and right reducing matrices. Gene asks only for the right reducing matrix, so we grab that with <code>Last[]</code>. So that's what the function on the first line, <code>rightReducingMatrix[]</code>, does.
So the first thing that happens to <math>m</math> when you pass it in to <code>smithDefactor[]</code> is that it calls <code>rightReducingMatrix[]</code> on it. This will find the Smith decomposition (using a function built in to Wolfram Language), which gives you three outputs: the Smith normal form, flanked by its left and right reducing matrices. Gene asks only for the right reducing matrix, so we grab that with <code>Last[]</code>. So that's what the function on the first line, <code>rightReducingMatrix[]</code>, does.
Line 97: Line 101:
Almost done! Except Gene not only defactors, he also calls for HNF, as we would, to achieve canonical (unique ID) form.
Almost done! Except Gene not only defactors, he also calls for HNF, as we would, to achieve canonical (unique ID) form.


<nowiki>normalForm[m_] := Last[HermiteDecomposition[m]]</nowiki>
<pre>
normalForm[m_] := Last[HermiteDecomposition[m]]
</pre>


Similar to the Smith Normal Form, we do a decomposition, which gives you the normal form plus some other bonus results. In this case we actually want the normal form itself, and it happens to be the last element in the result list. So putting it all together, we defactor and then put into normal form:
Similar to the Smith Normal Form, we do a decomposition, which gives you the normal form plus some other bonus results. In this case we actually want the normal form itself, and it happens to be the last element in the result list. So putting it all together, we defactor and then put into normal form:


<nowiki>rightReducingMatrix[m_] := Last[SmithDecomposition[m]];
<pre>
rightReducingMatrix[m_] := Last[SmithDecomposition[m]];
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]];
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]];


Line 108: Line 115:
m = {{12,19,28,34},{26,41,60,72}};
m = {{12,19,28,34},{26,41,60,72}};
normalForm[smithDefactor[m]]</nowiki>
normalForm[smithDefactor[m]]</nowiki>
</pre>


<nowiki>→ {{1,0,-4,-13},{0,1,4,10}}</nowiki>
<pre>
→ {{1,0,-4,-13},{0,1,4,10}}
</pre>


And that result matches what Gene finds in that xen wiki article. Defactoring and putting into normal form is equivalent to canonicalization.  
And that result matches what Gene finds in that article. Defactoring and putting into normal form is equivalent to canonicalization.  


=== Pernet-Stein defactoring ===
=== Pernet-Stein defactoring ===


This algorithm was described in the 2009 paper "Fast computation of HNF of random integer matrices"<ref>https://www.wstein.org/papers/hnf/pernet-stein-fast_computation_of_hnf_of_random_integer_matrices.pdf</ref> by Clément Pernet and William Stein. At the time Dave and Douglas wrote the first draft of this article and developed column Hermite defactoring, they were unaware of this algorithm. After publicizing column Hermite defactoring, they were referred by [[Graham Breed]] to a similar method in [http://x31eq.com/temper/ Graham's popular online regular temperament tool], implemented as <code>saturate</code><ref>https://bitbucket.org/x31eq/regular/src/9bc9b5bd8c8e0ced6223b29c3ea487719d120c43/kernel.py#lines-178</ref> since 2011, and which includes in its commented documentation a link to the aforementioned paper. Unable to reverse-engineer Gene Ward Smith's saturation algorithm, Graham had gone back to the same source Gene had supposedly gotten his inspiration from — the Sage software developed by William Stein, co-author of this paper — and came across this paper. Graham's implementation turned out to be much more similar to the original description by Pernet and Stein than Gene's, differing only by an additional unnecessary use of the HNF at the beginning (while Gene's, by virtue of using the Smith Normal Form, could be said to essentially use some variable number of extraneous uses of HNF). It is not clear how Gene derived his saturation algorithm from Pernet and Stein's work, however, if the fact that Dave and Douglas derived something almost identical to Pernet and Stein's algorithm from Gene's, it suggests that it's not unreasonable for development to lead someone in the opposite direction along the same path. The very close relationship between column Hermite defactoring and Pernet-Stein defactoring will be discussed shortly.
This algorithm was described in the 2009 paper ''Fast Computation of HNF of Random Integer Matrices'' by Clément Pernet and William Stein.<ref>[https://www.wstein.org/papers/hnf/pernet-stein-fast_computation_of_hnf_of_random_integer_matrices.pdf Clément Pernet and William Stein. ''Fast Computation of HNF of Random Integer Matrices''. Journal of Number Theory.]</ref>
 
For a mapping V of rank ''r'', its defactored form is given by
 
<math>\displaystyle (\operatorname {hnf} (V)_{:, 1:r})^{-1} \cdot V</math>
 
where hnf is the column-style Hermite normal form, and :, 1:''r'' is taking the first ''r'' columns.
 
Here is an implementation in Python (with [https://scipy.org/ SciPy] and [https://www.sympy.org/en/index.html SymPy]):
 
<syntaxhighlight lang="python">
import numpy as np
from scipy import linalg
from sympy import Matrix, normalforms
 
def __hnf_col (main):
    return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main))), dtype = int))
 
def __sat (main):
    return np.rint (linalg.inv (__hnf_col (main)) @ main).astype (int)
</syntaxhighlight>
 
Note that <code>__hnf_col</code> is just a wrapper for the column-style Hermite normal form, which takes the first ''r'' columns already. Here is the same functionality implemented in Wolfram:
 
<pre>
getGreatestFactorA[a_] := Transpose[Take[hnf[Transpose[a]], MatrixRank[a]]];
pernetSteinDefactor[m_] := Inverse[getGreatestFactorA[m].m;
</pre>


It should also be noted that toward the very beginning of Dave and Douglas's effort to develop a defactoring algorithm, Thomas McMurray Price described — in a message sent to the Xenharmonic Alliance Discord server — an algorithm almost identical to the Pernet-Stein algorithm, while also still being unaware of the Pernet-Stein paper. At this time, Dave and Douglas could not understand Tom's math well enough to realize that he'd just dropped the solution in their laps. Again, it wasn't until column Hermite defactoring was published that Tom commented on the findings and brought his ideas back into the conversation that Dave and Douglas realized the close connection between his ideas, Pernet-Stein defactoring, and column Hermite defactoring.
Note that the <code>hnf</code> is only available in row-style, so we have added a transpose at each end to convert it to column-style.  


Here is the implementation of Pernet-Stein defactoring:
==== Development note ====
At the time Dave and Douglas wrote the first draft of this article and developed column Hermite defactoring, they were unaware of this algorithm. After publicizing column Hermite defactoring, they were referred by [[Graham Breed]] to a similar method in [http://x31eq.com/temper/ Graham's popular online regular temperament tool], implemented as <code>saturate</code><ref>[https://bitbucket.org/x31eq/regular/src/9bc9b5bd8c8e0ced6223b29c3ea487719d120c43/kernel.py#lines-178 Bitbucket | x31eq / regular / kernel.py]</ref> since 2011, and which includes in its commented documentation a link to the aforementioned paper. Unable to reverse-engineer Gene Ward Smith's saturation algorithm, Graham had gone back to the same source Gene had supposedly gotten his inspiration from – the Sage software developed by William Stein, co-author of this paper – and came across this paper. Graham's implementation turned out to be much more similar to the original description by Pernet and Stein than Gene's, differing only by an additional unnecessary use of the HNF at the beginning (while Gene's, by virtue of using the Smith Normal Form, could be said to essentially use some variable number of extraneous uses of HNF). It is not clear how Gene derived his saturation algorithm from Pernet and Stein's work, however, if the fact that Dave and Douglas derived something almost identical to Pernet and Stein's algorithm from Gene's, it suggests that it is not unreasonable for development to lead someone in the opposite direction along the same path. The very close relationship between column Hermite defactoring and Pernet-Stein defactoring will be discussed shortly.


<nowiki>getGreatestFactorA[a_] := Transpose[Take[hnf[Transpose[a]], MatrixRank[a]]];
It should also be noted that toward the very beginning of Dave and Douglas's effort to develop a defactoring algorithm, Thomas McMurray Price described – in a message sent to the Xenharmonic Alliance Discord server – an algorithm almost identical to the Pernet-Stein algorithm, while also still being unaware of the Pernet-Stein paper. At this time, Dave and Douglas could not understand Tom's math well enough to realize that he'd just dropped the solution in their laps. Again, it was not until column Hermite defactoring was published that Tom commented on the findings and brought his ideas back into the conversation that Dave and Douglas realized the close connection between his ideas, Pernet-Stein defactoring, and column Hermite defactoring.
pernetSteinDefactor[m_] := Inverse[getGreatestFactorA[m].m;</nowiki>


=== Column Hermite defactoring ===
=== Column Hermite defactoring ===
For a mapping V of rank ''r'', its defactored form is found through the following steps. First, we must find the unimodular transformation matrix U associated with the Hermite normal form. This can be obtained by performing the column-style Hermite normal form operator on the mapping vertically augmented with an identity matrix:
<math>\displaystyle
\operatorname {hnf}:
\begin{bmatrix}
V \\
\hline
I
\end{bmatrix}
\rightarrow
\begin{bmatrix}
H \\
\hline
U
\end{bmatrix}
</math>
Then the defactored form is given by
<math>\displaystyle (U^{-1})_{1:r}</math>
where hnf is the column-style Hermite normal form, and 1:''r'' is taking the first ''r'' rows.
Here is an implementation in Python (with SciPy and SymPy):
<syntaxhighlight lang="python">
import numpy as np
from scipy import linalg
from sympy import Matrix, normalforms
def __hnf_col (main):
    return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main))), dtype = int))
def __sat (main):
    r = Matrix (main).rank ()
    unimodular = __hnf_col (np.vstack ((main, np.eye (main.shape[1], dtype = int))))[r:]
    return np.rint (linalg.inv (unimodular)[:r]).astype (int)
</syntaxhighlight>


Here is the implementation for column Hermite defactoring:
Note that <code>__hnf_col</code> is just a wrapper for the column-style Hermite normal form. Here is the same functionality implemented in Wolfram:  


<nowiki>hermiteUnimodular[m_] := Transpose[First[HermiteDecomposition[Transpose[m]]]]
<pre>
columnHermiteDefactor[m_] := Take[Inverse[hermiteUnimodular[m]],MatrixRank[m]]</nowiki>
hermiteUnimodular[m_] := Transpose[First[HermiteDecomposition[Transpose[m]]]]
columnHermiteDefactor[m_] := Take[Inverse[hermiteUnimodular[m]],MatrixRank[m]]
</pre>


So this implementation begins by finding the unimodular matrix from a column Hermite decomposition. To make it a column Hermite decomposition, we first transpose the matrix. The decomposition in Wolfram returns two items - the unimodular matrix, and the input matrix in Hermite normal form, in that order and in this case we actually want the unimodular matrix. So we take that with <code>First[]</code>. Then we <code>Transpose[]</code> it to in effect undo the transposition we did at the beginning.  
So this implementation begins by finding the unimodular matrix from a column Hermite decomposition. Note that the <code>hnf</code> is only available in row-style, so we first transpose the matrix to convert it to column-style. The decomposition in Wolfram returns two items the unimodular matrix, and the input matrix in Hermite normal form, in that order and in this case we actually want the unimodular matrix. So we take that with <code>First[]</code>. Then we transpose it again to in effect undo the transposition we did at the beginning.  


The next step is to invert that matrix, which is doable because it is unimodular; a key property of unimodular matrices is that they are always invertible, and because their determinant is ±1, if they contain all integer entries, their inverse will also contain all integer entries (which it does, and we need it to)<ref>Interesting tidbit regarding [[full-rank]] matrices and unimodular matrices: for square matrices, unimodularity implies full-rank, and while full-rank doesn't imply unimodularity, it does imply a non-zero determinant.</ref>.
The next step is to invert that matrix, which is doable because it is unimodular; a key property of unimodular matrices is that they are always invertible, and because their determinant is ±1, if they contain all integer entries, their inverse will also contain all integer entries (which it does, and we need it to)<ref>Interesting tidbit regarding [[full-rank]] matrices and unimodular matrices: for square matrices, unimodularity implies full-rank, and while full-rank does not imply unimodularity, it does imply a non-zero determinant.</ref>.


Finally we take only the top <math>r</math> rows of this, where <math>r</math> is the rank of the original matrix. That's found with <code>MatrixRank[m]</code>.
Finally we take only the top ''r'' rows of this. That is found with <code>MatrixRank[m]</code>.


==== How/why it works ====
==== How/why it works ====