Talk:Defactoring algorithms: Difference between revisions

(5 intermediate revisions by 2 users not shown)

Line 46:

::::# find H = HNF (A);

::::# find the pseudoinverse A+;

::::# ~~Multiply~~ them.

::::# multiply them.

:::: By contrast, the Pernet-Stein method only needs H i.e. the first step at this point. It could be simplified a bit in an algorithm that's designed to return both, but the unimodular matrix is still something more than the HNF itself (see discussion at [https://github.com/sympy/sympy/pull/22845 add igcdLLL to numbers by smichr · Pull Request #22845 · sympy/sympy]). Your comparsion made it as if they were the same.

Line 57:

import numpy as np

from scipy import linalg

from sympy import Matrix

from sympy import Matrix, normalforms

def ~~__hnf~~ (main):

def __hnf_col (main):

return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main)).T).T, dtype = int))

return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main))), dtype = int))

def __sat (main):

return np.rint (linalg.inv (~~__hnf~~ (main.T).T) @ main).astype (int)

return np.rint (linalg.inv (__hnf_col (main)) @ main).astype (int)

</syntaxhighlight>

Line 73:

:::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 10:58, 10 February 2023 (UTC)

::::: Ah, very interesting. Thank you for this. I've reviewed that GitHub thread, in particular https://github.com/sympy/sympy/pull/22845#issuecomment-1013936367, "It looks like [Cohen's] Alg 2.4.10 is almost exactly the same [as Alg 2.4.5] except that it also computes the unimodular matrix." Cohen's 2.4.5 is what's implemented in sympy here: https://github.com/sympy/sympy/blob/99b4abb6313990136cde93735f552a6a98152fc6/sympy/polys/matrices/normalforms.py#L177-L248 . Cohen's work can be found here: http://tomlr.free.fr/Math%E9matiques/Math%20Complete/Number%20theory/A%20course%20in%20computational%20algebraic%20number%20theory%20-%20Cohen%20H..pdf (2.4.5 is on page 69, and 2.4.10 is on page 74). So, yes, I can now see that it is possible to compute HNF without also computing the unimodular matrix. However, the unimodular matrix comes essentially for free; it's nothing but the same operations the algorithm chooses to apply to the original matrix, but also applied to an identity matrix. The main effort of the algorithm is choosing those unimodular transformations; tracking them is almost zero cost, computationally.

::::: So I think the article should be amended to reflect this fact. Sorry I missed it.

::::: However, in order to prove that Pernet-Stein is more efficient than column Hermite defactoring, you'd have to prove that its extra matrix multiplication step at the end is cheaper than just tracking the unimodular transformations applied to the original matrix. Intuitively, I doubt this is the case. Maybe they'd be almost the same. But empirically testing this would be more work than I'm willing to invest. So it's fine by me if you want to include both methods for now.

::::: I don't particularly care about "beating" Pernet and Stein. I only wanted to simplify things for our readership if I could. Maybe one day we could settle it. It would be a big win if we only had to teach them one method.

::::: And regardless to the outcome of possible empirical performance testing, assuming the results will be very very similar whichever way that goes, I think we should teach the one which is the most straightforward to understand ''why''/''how'' it works. I make an argument in the article for that being column Hermite defactoring. You are welcome to write a counterargument for Pernet-Stein if you see it differently, and believe that some readers will find ''its'' process more obvious and illuminating.

::::: In short, if we can't agree on one algorithm, that's alright. It's not necessarily a strict loss. We can still find a different sort of mutual strength through our different approaches to the problem.

::::: As for this part of what you said, though, I'm sorry, but I can't figure out what you're referring to: "Then you actually use U-1 instead of U, so to sum it up you're taking an inverse after taking a pseudoinverse, with multiplication involved also. That looks to me like there's a lot of back and forth in it." Can you explain another way? Perhaps it may be helpful to reference this set of flow charts I prepared: https://en.xen.wiki/w/File:Comparison_of_defactoring_algorithms.png --[[User:Cmloegcmluin|Cmloegcmluin]] ([[User talk:Cmloegcmluin|talk]]) 17:37, 10 February 2023 (UTC)

:::::: I hope you've actively tried to understand what I said, cuz I believe I was presenting it clearly enough. I'm summarizing them as lists here since I apparently can't edit the figure. The Pernet-Stein method does these things for a mapping V:

::::::# find the transpose A = VT;

::::::# find the Hermite normal form H = HNF (A);

::::::# take the first ''r'' rows H1:''r'';

::::::# find the transpose H1:''r''T;

::::::# find the inverse (H1:''r''T)-1;

::::::# multiply it by the original matrix and return.

:::::: You should see it involves an HNF, two transposes, an inverse, a multiplication, and a slice, as you noted. In comparison, the column Hermite method does these things:

::::::# find the transpose A = VT;

::::::# find the Hermite normal form H = HNF (A);

::::::# find the pseudoinverse A+;

::::::# multiply them for the unimodular matrix U = HA+;

::::::# find the inverse U-1;

::::::# find the transpose (U-1)T;

::::::# take the first ''r'' rows and return.

:::::: You should see it involves an HNF, two transposes, an inverse, a pseudoinverse, a multiplication, and a slice. So to sum it up it has an additional pseudoinverse step compared to Pernet-Stein. Note I unwrapped the "Hermite decomposition" into an HNF, a pseudoinverse, and a multiplication, shown in the steps 2–4, cuz that's what happens in general. After that you immediately take the inverse as in step 5. A pseudoinverse (step 3) followed by an inverse (step 5), that's what I meant by "back and forth".

:::::: Now, as I said, an algorithm designed to return both the HNF and the unimodular matrix could somewhat optimize it, but that's still not the same as HNF itself. Apparently you missed the point of the SymPy development thread I showed you – they said asking for the unimodular matrix would slow it much down. Moreover you can't rely on the implementation of such an optimized algorithm as I've seen many libraries that come with an HNF routine without also giving the unimodular matrix, such as the current stable version of SymPy, as well as Sage, another library based on Python. And I'm afraid expecting the user to implement the Cohen algorithm themselves is even less practical.

:::::: In fact, using the unimodular matrix with SymPy is more difficult than it seems, since the HNF routine removes the extra rows of zeros. I have to pad them back to the correct dimension, which I'm yet to implement, before doing the multiplication. I think it's more advisable to simply avoid the unimodular matrix and its quirks.

:::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 21:43, 10 February 2023 (UTC)

::::::: I am actively trying to understand what you're saying. I'm sorry I missed the point you were trying to make with the GitHub thread. Given how long and obscure it was, I should have asked you to be more specific, rather than try to guess at your intentions. It looked to me, based on a quick look over their code and Cohen's work it was based on, that their algorithm worked in a way fairly similar to my existing understanding of how HNF is computed, whereby tracking unimodular transformations is cheap. It's quite possible that their algorithm looks superficially similar but is completely different. I am not interested enough in this problem to figure it out, though, at least not now. Sorry. So again, it's fine by me for now if you want to keep both algorithms.

::::::: But I'm still at a loss about how you've gotten the impression that column Hermite defactoring involves a pseudoinverse step. As you can see in the diagram I shared in my previous reply, it does not. The word "pseudoinverse" only appears once in the article, in a tangential footnote. On my end, "column Hermite defactoring" is a method which Dave Keenan and I developed about a year ago, and the only information about it exists on this wiki article. Are you confusing it with something else? These are the steps:

::::::::# find the transpose A = VT;

::::::::# find the unimodular matrix U = First[HermiteDecomposition[A]];

::::::::# find the inverse U-1;

::::::::# find the transpose (U-1)T;

::::::::# take the first ''r'' rows and return.

::::::: You can find an example of us computing the entire thing by hand in the article. No pseudoinverse is required. --[[User:Cmloegcmluin|Cmloegcmluin]] ([[User talk:Cmloegcmluin|talk]]) 22:12, 10 February 2023 (UTC)

:::::::: Oh, allow me to correct myself. After some trial and errors it turns out the pseudoinverse doesn't get you the unimodular matrix, but a singular matrix, so it doesn't work. The correct way seems to be using the augmented matrix to track the elementary row operations in HNF: (A|I) → (H|U), as you pointed out and as one of them showed in the development thread. Here's my implementation of the column Hermite method:

:::::::: <syntaxhighlight lang="python">

def __sat (main):

r = Matrix (main).rank ()

unimodular = __hnf_col (np.vstack ((main, np.eye (main.shape[1], dtype = int))))[r:]

return np.rint (linalg.inv (unimodular)[:r]).astype (int)

</syntaxhighlight>

:::::::: As I said, "Hermite decomposition" isn't a single operation. Here it involves concatenation, HNF, and slice, tho it doesn't involve the pseudoinverse. Compare Pernet-Stein:

:::::::: <syntaxhighlight lang="python">

def __sat (main):

r = Matrix (main).rank ()

return np.rint (linalg.inv (__hnf_col (main)[:, :r]) @ main).astype (int)

</syntaxhighlight>

:::::::: Note the slice and thus computing the rank are redundant; they're there just for futureproofness.

:::::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 10:58, 11 February 2023 (UTC)

:::::::: Edit: update the code to adopt column-style HNF so as to get rid of all the transpositions.

:::::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 07:08, 15 February 2023 (UTC)

@@ Line 46: / Line 46: @@
 ::::# find H = HNF (A);
 ::::# find the pseudoinverse A<sup>+</sup>;
-::::# Multiply them.
+::::# multiply them.
 :::: By contrast, the Pernet-Stein method only needs H i.e. the first step at this point. It could be simplified a bit in an algorithm that's designed to return both, but the unimodular matrix is still something more than the HNF itself (see discussion at [https://github.com/sympy/sympy/pull/22845 add igcdLLL to numbers by smichr · Pull Request #22845 · sympy/sympy]). Your comparsion made it as if they were the same.
@@ Line 57: / Line 57: @@
 import numpy as np
 from scipy import linalg
-from sympy import Matrix
+from sympy import Matrix, normalforms
-def __hnf (main):
+def __hnf_col (main):
-     return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main)).T).T, dtype = int))
+     return np.flip (np.array (normalforms.hermite_normal_form (Matrix (np.flip (main))), dtype = int))
 def __sat (main):
-     return np.rint (linalg.inv (__hnf (main.T).T) @ main).astype (int)
+     return np.rint (linalg.inv (__hnf_col (main)) @ main).astype (int)
 </syntaxhighlight>
@@ Line 73: / Line 73: @@
 :::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 10:58, 10 February 2023 (UTC)
+::::: Ah, very interesting. Thank you for this. I've reviewed that GitHub thread, in particular https://github.com/sympy/sympy/pull/22845#issuecomment-1013936367, "It looks like [Cohen's] Alg 2.4.10 is almost exactly the same [as Alg 2.4.5] except that it also computes the unimodular matrix." Cohen's 2.4.5 is what's implemented in sympy here: https://github.com/sympy/sympy/blob/99b4abb6313990136cde93735f552a6a98152fc6/sympy/polys/matrices/normalforms.py#L177-L248 . Cohen's work can be found here: http://tomlr.free.fr/Math%E9matiques/Math%20Complete/Number%20theory/A%20course%20in%20computational%20algebraic%20number%20theory%20-%20Cohen%20H..pdf (2.4.5 is on page 69, and 2.4.10 is on page 74). So, yes, I can now see that it is possible to compute HNF without also computing the unimodular matrix. However, the unimodular matrix comes essentially for free; it's nothing but the same operations the algorithm chooses to apply to the original matrix, but also applied to an identity matrix. The main effort of the algorithm is choosing those unimodular transformations; tracking them is almost zero cost, computationally.
+::::: So I think the article should be amended to reflect this fact. Sorry I missed it.
+::::: However, in order to prove that Pernet-Stein is more efficient than column Hermite defactoring, you'd have to prove that its extra matrix multiplication step at the end is cheaper than just tracking the unimodular transformations applied to the original matrix. Intuitively, I doubt this is the case. Maybe they'd be almost the same. But empirically testing this would be more work than I'm willing to invest. So it's fine by me if you want to include both methods for now.
+::::: I don't particularly care about "beating" Pernet and Stein. I only wanted to simplify things for our readership if I could. Maybe one day we could settle it. It would be a big win if we only had to teach them one method.
+::::: And regardless to the outcome of possible empirical performance testing, assuming the results will be very very similar whichever way that goes, I think we should teach the one which is the most straightforward to understand ''why''/''how'' it works. I make an argument in the article for that being column Hermite defactoring. You are welcome to write a counterargument for Pernet-Stein if you see it differently, and believe that some readers will find ''its'' process more obvious and illuminating.
+::::: In short, if we can't agree on one algorithm, that's alright. It's not necessarily a strict loss. We can still find a different sort of mutual strength through our different approaches to the problem.
+::::: As for this part of what you said, though, I'm sorry, but I can't figure out what you're referring to: "Then you actually use U<sup>-1</sup> instead of U, so to sum it up you're taking an inverse after taking a pseudoinverse, with multiplication involved also. That looks to me like there's a lot of back and forth in it." Can you explain another way? Perhaps it may be helpful to reference this set of flow charts I prepared: https://en.xen.wiki/w/File:Comparison_of_defactoring_algorithms.png --[[User:Cmloegcmluin|Cmloegcmluin]] ([[User talk:Cmloegcmluin|talk]]) 17:37, 10 February 2023 (UTC)
+:::::: I hope you've actively tried to understand what I said, cuz I believe I was presenting it clearly enough. I'm summarizing them as lists here since I apparently can't edit the figure. The Pernet-Stein method does these things for a mapping V:
+::::::# find the transpose A = V<sup>T</sup>;
+::::::# find the Hermite normal form H = HNF (A);
+::::::# take the first ''r'' rows H<sub>1:''r''</sub>;
+::::::# find the transpose H<sub>1:''r''</sub><sup>T</sup>;
+::::::# find the inverse (H<sub>1:''r''</sub><sup>T</sup>)<sup>-1</sup>;
+::::::# multiply it by the original matrix and return.
+:::::: You should see it involves an HNF, two transposes, an inverse, a multiplication, and a slice, as you noted. In comparison, the column Hermite method does these things:
+::::::# find the transpose A = V<sup>T</sup>;
+::::::# find the Hermite normal form H = HNF (A);
+::::::# find the pseudoinverse A<sup>+</sup>;
+::::::# multiply them for the unimodular matrix U = HA<sup>+</sup>;
+::::::# find the inverse U<sup>-1</sup>;
+::::::# find the transpose (U<sup>-1</sup>)<sup>T</sup>;
+::::::# take the first ''r'' rows and return.
+:::::: You should see it involves an HNF, two transposes, an inverse, a pseudoinverse, a multiplication, and a slice. So to sum it up it has an additional pseudoinverse step compared to Pernet-Stein. Note I unwrapped the "Hermite decomposition" into an HNF, a pseudoinverse, and a multiplication, shown in the steps 2–4, cuz that's what happens in general. After that you immediately take the inverse as in step 5. A pseudoinverse (step 3) followed by an inverse (step 5), that's what I meant by "back and forth".
+:::::: Now, as I said, an algorithm designed to return both the HNF and the unimodular matrix could somewhat optimize it, but that's still not the same as HNF itself. Apparently you missed the point of the SymPy development thread I showed you – they said asking for the unimodular matrix would slow it much down. Moreover you can't rely on the implementation of such an optimized algorithm as I've seen many libraries that come with an HNF routine without also giving the unimodular matrix, such as the current stable version of SymPy, as well as Sage, another library based on Python. And I'm afraid expecting the user to implement the Cohen algorithm themselves is even less practical.
+:::::: In fact, using the unimodular matrix with SymPy is more difficult than it seems, since the HNF routine removes the extra rows of zeros. I have to pad them back to the correct dimension, which I'm yet to implement, before doing the multiplication. I think it's more advisable to simply avoid the unimodular matrix and its quirks.
+:::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 21:43, 10 February 2023 (UTC)
+::::::: I am actively trying to understand what you're saying. I'm sorry I missed the point you were trying to make with the GitHub thread. Given how long and obscure it was, I should have asked you to be more specific, rather than try to guess at your intentions. It looked to me, based on a quick look over their code and Cohen's work it was based on, that their algorithm worked in a way fairly similar to my existing understanding of how HNF is computed, whereby tracking unimodular transformations is cheap. It's quite possible that their algorithm looks superficially similar but is completely different. I am not interested enough in this problem to figure it out, though, at least not now. Sorry. So again, it's fine by me for now if you want to keep both algorithms.
+::::::: But I'm still at a loss about how you've gotten the impression that column Hermite defactoring involves a pseudoinverse step. As you can see in the diagram I shared in my previous reply, it does not. The word "pseudoinverse" only appears once in the article, in a tangential footnote. On my end, "column Hermite defactoring" is a method which Dave Keenan and I developed about a year ago, and the only information about it exists on this wiki article. Are you confusing it with something else? These are the steps:
+::::::::# find the transpose A = V<sup>T</sup>;
+::::::::# find the unimodular matrix U = First[HermiteDecomposition[A]];
+::::::::# find the inverse U<sup>-1</sup>;
+::::::::# find the transpose (U<sup>-1</sup>)<sup>T</sup>;
+::::::::# take the first ''r'' rows and return.
+::::::: You can find an example of us computing the entire thing by hand in the article. No pseudoinverse is required. --[[User:Cmloegcmluin|Cmloegcmluin]] ([[User talk:Cmloegcmluin|talk]]) 22:12, 10 February 2023 (UTC)
+:::::::: Oh, allow me to correct myself. After some trial and errors it turns out the pseudoinverse doesn't get you the unimodular matrix, but a singular matrix, so it doesn't work. The correct way seems to be using the augmented matrix to track the elementary row operations in HNF: (A|I) → (H|U), as you pointed out and as one of them showed in the development thread. Here's my implementation of the column Hermite method:
+:::::::: <syntaxhighlight lang="python">
+def __sat (main):
+    r = Matrix (main).rank ()
+    unimodular = __hnf_col (np.vstack ((main, np.eye (main.shape[1], dtype = int))))[r:]
+    return np.rint (linalg.inv (unimodular)[:r]).astype (int)
+</syntaxhighlight>
+:::::::: As I said, "Hermite decomposition" isn't a single operation. Here it involves concatenation, HNF, and slice, tho it doesn't involve the pseudoinverse. Compare Pernet-Stein:
+:::::::: <syntaxhighlight lang="python">
+def __sat (main):
+    r = Matrix (main).rank ()
+    return np.rint (linalg.inv (__hnf_col (main)[:, :r]) @ main).astype (int)
+</syntaxhighlight>
+:::::::: Note the slice and thus computing the rank are redundant; they're there just for futureproofness.
+:::::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 10:58, 11 February 2023 (UTC)
+:::::::: Edit: update the code to adopt column-style HNF so as to get rid of all the transpositions.
+:::::::: [[User:FloraC|FloraC]] ([[User talk:FloraC|talk]]) 07:08, 15 February 2023 (UTC)