Defactoring algorithms: Difference between revisions

Line 236:

===== Proof of why column Hermite defactoring (and Pernet-Stein defactoring) work =====

The following proof is adapted primarily from Tom Price's thinking:

# The input matrix is an ~~m×n~~ matrix A.

# The input matrix is an ''m''×''n'' matrix A.

# It decomposes into a slightly bigger and square (~~n×n~~) unimodular matrix U and another ~~m×n~~ matrix which is not exactly A in HNF (because we only have to use unimodular operations so far as to get all the all-zero columns off to one side of A; we don't need to satisfy all of the conventional constraints of HNF), but we'll still call it H. The unimodular matrix is a transformation from A into H, so, {{nowrap|AU {{=}} H}}.

# It decomposes into a slightly bigger and square (''n''×''n'') unimodular matrix U and another ''m''×''n'' matrix which is not exactly A in HNF (because we only have to use unimodular operations so far as to get all the all-zero columns off to one side of A; we don't need to satisfy all of the conventional constraints of HNF), but we'll still call it H. The unimodular matrix is a transformation from A into H, so, {{nowrap|AU {{=}} H}}.

# If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (~~m×m~~) matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A).

# If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (''m''×''m'') matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A).

# We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get S{{inv}}AUU{{inv}} = S{{inv}}HU{{inv}}. The U's cancel out on the left so we end up with S{{inv}}A = S{{inv}}HU{{inv}}. At first glance we don't seem to have gained any further insight. But there's more we can do from here.

# We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get {{nowrap|S{{inv}}AUU{{inv}} {{=}} S{{inv}}HU{{inv}}}}. The U's cancel out on the left so we end up with {{nowrap|S{{inv}}A {{=}} S{{inv}}HU{{inv}}}}. At first glance we don't seem to have gained any further insight. But there's more we can do from here.

# Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have S{{inv}}A = TU{{inv}}.

# Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have {{nowrap|S{{inv}}A {{=}} TU{{inv}}}}.

# Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating ~~U⁻¹~~. That's how we think of the output of column Hermite defactoring—our supposedly defactored matrix—so let's call that D. We now have ~~S⁻¹A~~ = D. (This is how we can see that the Pernet-Stein method of multiplying the input matrix by a transformation matrix that is a truncated and inversed column Hermite normal form of the input is equivalent to our column Hermite method, which takes the other route to the same result: inverting and truncating the unimodular result of the Hermite decomposition.)

# Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating U{{inv}}, that is, keeping only its first ''m'' rows. That's how we think of the output of column Hermite defactoring, our supposedly defactored matrix, so let's call that D. We now have {{nowrap|S{{inv}}A {{=}} D}}. (This is how we can see that the Pernet-Stein method of multiplying the input matrix by a transformation matrix that is a truncated and inverted column Hermite normal form of the input is equivalent to our column Hermite method, which takes the other route to the same result: inverting and truncating the unimodular result of the Hermite decomposition.)

# We need to prove now that D has three qualities:

:: a) It's defactored,

:: b) It still represents the same temperament (i.e. it has the same nullspace as A), and

:: c) It's integer.

:: c) It's an integer matrix.

# Proving (a) is easy. It's defactored because U was unimodular. U's determinant was 1, and neither inverting it nor truncating it would change that. Alternatively, we can prove this by showing how on the other side of the equation, ~~S⁻¹A~~ is surjective as a function on lattice points (in other words, there's no points in the tempered lattice that JI lattice points don't map to). We begin with the fact that H has the same image as A, because right-multiplication with a unimodular matrix such as U doesn't change the image. Then S has the same image as H, too, and therefore the same image as A, because removing the all-zero columns doesn't change the image either. Now that we've established this, we can assert that ~~S⁻¹A~~ is surjective by describing a lattice point x such that y = ~~S⁻¹Ax~~ for any given lattice point y. And because S and A have the same image, we know that Sy = Ax, and therefore y = ~~S⁻¹Ax~~.

# Proving (a) is easy. It's defactored because U is unimodular: its determinant is ±1, so U{{inv}} is also an integer matrix with determinant ±1, and keeping only some of the rows of such a matrix cannot introduce a common factor. Alternatively, we can prove this by showing that, on the other side of the equation, S{{inv}}A is surjective as a function on lattice points (in other words, there are no points in the tempered lattice that JI lattice points don't map to). We begin with the fact that H has the same image as A, because right-multiplication by a unimodular matrix such as U doesn't change the image. Then S has the same image as H, too, and therefore the same image as A, because removing the all-zero columns doesn't change the image either. Now that we've established this, we can show that S{{inv}}A is surjective by finding, for any given lattice point ''y'', a lattice point ''x'' such that {{nowrap|''y'' {{=}} S{{inv}}A''x''}}. The point S''y'' lies in the image of S, and because S and A have the same image, there is some lattice point ''x'' with {{nowrap|S''y'' {{=}} A''x''}}, and therefore {{nowrap|''y'' {{=}} S{{inv}}A''x''}}.

# Proving (b) is even easier. Multiplying any matrix with an invertible matrix on the left keeps the nullspace the same. ~~S⁻¹~~ is clearly invertible, being itself the inverse of S. A way to understand this is: a non-invertible matrix is the same as a singular matrix, i.e. one whose determinant is 0. So as long as you don't wipe things out by essentially multiplying by 0, the nullspace information is preserved, just scaled.

# Proving (b) is even easier. Multiplying any matrix by an invertible matrix on the left keeps the nullspace the same. S{{inv}} is clearly invertible, being itself the inverse of S. So if {{nowrap|A''x'' {{=}} 0}} then {{nowrap|S{{inv}}A''x'' {{=}} 0}}, and conversely, left-multiplying {{nowrap|S{{inv}}A''x'' {{=}} 0}} by S gives back {{nowrap|A''x'' {{=}} 0}}. A way to understand this is: a non-invertible matrix is the same as a singular matrix, i.e. one whose determinant is 0. So as long as you don't wipe things out by essentially multiplying by 0, the nullspace information is preserved.

# Proving (c) is a bit trickier, because ~~S⁻¹~~ is not necessarily an integer matrix. But can show that ~~S⁻¹A~~ is an integer matrix by showing that it maps lattice points to lattice points. Suppose we have that same equation from the proof of (a), namely that y = ~~S⁻¹Ax~~, where x is a lattice point. We want to show that y is a lattice point. Again, since A and S have the same image (when considered as functions on lattice points), there must be some lattice point z with Sz = Ax. But we also know that Sy = Ax. Since S is invertible, and therefore injective, y = z, so y is a lattice point.

# Proving (c) is a bit trickier, because S{{inv}} is not necessarily an integer matrix. But we can show that S{{inv}}A is an integer matrix by showing that it maps lattice points to lattice points. Suppose we have that same equation from the proof of (a), namely that {{nowrap|''y'' {{=}} S{{inv}}A''x''}}, where ''x'' is a lattice point. We want to show that ''y'' is a lattice point. Again, since A and S have the same image (when considered as functions on lattice points), there must be some lattice point ''z'' with {{nowrap|S''z'' {{=}} A''x''}}. But we also know that {{nowrap|S''y'' {{=}} A''x''}}, by left-multiplying {{nowrap|''y'' {{=}} S{{inv}}A''x''}} by S. Since S is invertible, and therefore injective, {{nowrap|''y'' {{=}} ''z''}}, so ''y'' is a lattice point.
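
The steps of the proof can be checked numerically on a small case. The following is a minimal sketch, not the article's implementation: the helper names (<code>matmul</code>, <code>inverse</code>, <code>column_reduce</code>) and the example matrix A are assumptions chosen for illustration, and the column reduction is a plain Euclidean elimination that only pushes the all-zero columns to the right, without enforcing the remaining HNF constraints. It assumes A has full row rank.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Exact inverse of a square matrix (Gauss-Jordan over the rationals)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]


def column_reduce(A):
    """Return (H, U) with A*U = H, U unimodular, zero columns of H on the right."""
    m, n = len(A), len(A[0])
    H = [row[:] for row in A]
    U = [[int(i == j) for j in range(n)] for i in range(n)]

    def swap(j, k):  # swap columns j and k in both H and U
        for M in (H, U):
            for row in M:
                row[j], row[k] = row[k], row[j]

    def add_multiple(j, k, q):  # column j -= q * column k, in both H and U
        for M in (H, U):
            for row in M:
                row[j] -= q * row[k]

    pivot = 0
    for r in range(m):
        if pivot >= n:
            break
        while True:  # Euclid's algorithm along row r, columns pivot..n-1
            nonzero = [j for j in range(pivot, n) if H[r][j] != 0]
            if len(nonzero) <= 1:
                break
            k = min(nonzero, key=lambda j: abs(H[r][j]))
            for j in nonzero:
                if j != k:
                    add_multiple(j, k, H[r][j] // H[r][k])
        nonzero = [j for j in range(pivot, n) if H[r][j] != 0]
        if nonzero:
            swap(pivot, nonzero[0])
            pivot += 1
    return H, U


# Hypothetical enfactored input: the 2x2 minors of A are 9, 15, 3 (gcd 3).
A = [[3, 0, -1], [0, 3, 5]]
m = len(A)

H, U = column_reduce(A)           # AU = H
S = [row[:m] for row in H]        # slice off the all-zero columns of H
D = matmul(inverse(S), A)         # Pernet-Stein route: S^-1 A
D_hermite = inverse(U)[:m]        # column Hermite route: truncated U^-1

print([[int(x) for x in row] for row in D])  # [[-3, 0, 1], [5, 1, 0]]
```

In this example S has determinant −3, matching the greatest factor 3 of A, and both routes give the same integer matrix D, whose 2×2 minors (−3, −5, −1) have no common factor.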

===== Relationship with other defactoring methods =====

@@ Line 236: / Line 236: @@
 ===== Proof of why column Hermite defactoring (and Pernet-Stein defactoring) work =====
 The following proof is adapted primarily from Tom Price's thinking:
-# The input matrix is an m×n matrix A.
+# The input matrix is an ''m''&#215;''n'' matrix A.
-# It decomposes into a slightly bigger and square (n×n) unimodular matrix U and another m×n matrix which is not exactly A in HNF (because we only have to use unimodular operations so far as to get all the all-zero columns off to one side of A; we don't need to satisfy all of the conventional constraints of HNF), but we'll still call it H. The unimodular matrix is a transformation from A into H, so, {{nowrap|AU {{=}} H}}.
+# It decomposes into a slightly bigger and square (''n''&#215;''n'') unimodular matrix U and another ''m''&#215;''n'' matrix which is not exactly A in HNF (because we only have to use unimodular operations so far as to get all the all-zero columns off to one side of A; we don't need to satisfy all of the conventional constraints of HNF), but we'll still call it H. The unimodular matrix is a transformation from A into H, so, {{nowrap|AU {{=}} H}}.
-# If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (m×m) matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A).
+# If we were to actually slice off the all-zero cols we've isolated in H, we'd end up with a slightly smaller and square (''m''&#215;''m'') matrix. So let's call this little square matrix S (this is our "[[Defactoring algorithms#Finding the greatest factor|greatest factor matrix]]", because its determinant is the greatest factor of A).
-# We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get  S{{inv}}AUU{{inv}} = S{{inv}}HU{{inv}}. The U's cancel out on the left so we end up with S{{inv}}A = S{{inv}}HU{{inv}}. At first glance we don't seem to have gained any further insight. But there's more we can do from here.
+# We can left-multiply both sides of our equation by the inverse of S (S{{inv}}) and right-multiply both sides of our equation by the inverse of U (U{{inv}}) to get {{nowrap|S{{inv}}AUU{{inv}} {{=}} S{{inv}}HU{{inv}}}}. The U's cancel out on the left so we end up with {{nowrap|S{{inv}}A {{=}} S{{inv}}HU{{inv}}}}. At first glance we don't seem to have gained any further insight. But there's more we can do from here.
-# Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have S{{inv}}A = TU{{inv}}.
+# Because H is just S with a bunch of 0 cols appended, S{{inv}}H is just the identity matrix with a bunch of zero columns appended, in other words it is a truncated identity matrix. We could call that T, and now we have {{nowrap|S{{inv}}A {{=}} TU{{inv}}}}.
-# Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating U⁻¹. That's how we think of the output of column Hermite defactoring&mdash;our supposedly defactored matrix&mdash;so let's call that D. We now have S⁻¹A = D. (This is how we can see that the Pernet-Stein method of multiplying the input matrix by a transformation matrix that is a truncated and inversed column Hermite normal form of the input is equivalent to our column Hermite method, which takes the other route to the same result: inverting and truncating the unimodular result of the Hermite decomposition.)
+# Multiplying U{{inv}} on the left by a truncated identity matrix is the same as truncating U{{inv}}, that is, keeping only its first ''m'' rows. That's how we think of the output of column Hermite defactoring, our supposedly defactored matrix, so let's call that D. We now have {{nowrap|S{{inv}}A {{=}} D}}. (This is how we can see that the Pernet-Stein method of multiplying the input matrix by a transformation matrix that is a truncated and inverted column Hermite normal form of the input is equivalent to our column Hermite method, which takes the other route to the same result: inverting and truncating the unimodular result of the Hermite decomposition.)
 # We need to prove now that D has three qualities:
 :: a) It's defactored,
 :: b) It still represents the same temperament (i.e. it has the same nullspace as A), and
-:: c) It's integer.
+:: c) It's an integer matrix.
-# Proving (a) is easy. It's defactored because U was unimodular. U's determinant was 1, and neither inverting it nor truncating it would change that. Alternatively, we can prove this by showing how on the other side of the equation, S⁻¹A is surjective as a function on lattice points (in other words, there's no points in the tempered lattice that JI lattice points don't map to). We begin with the fact that H has the same image as A, because right-multiplication with a unimodular matrix such as U doesn't change the image. Then S has the same image as H, too, and therefore the same image as A, because removing the all-zero columns doesn't change the image either. Now that we've established this, we can assert that S⁻¹A is surjective by describing a lattice point x such that y = S⁻¹Ax for any given lattice point y. And because S and A have the same image, we know that Sy = Ax, and therefore y = S⁻¹Ax.
+# Proving (a) is easy. It's defactored because U is unimodular: its determinant is ±1, so U{{inv}} is also an integer matrix with determinant ±1, and keeping only some of the rows of such a matrix cannot introduce a common factor. Alternatively, we can prove this by showing that, on the other side of the equation, S{{inv}}A is surjective as a function on lattice points (in other words, there are no points in the tempered lattice that JI lattice points don't map to). We begin with the fact that H has the same image as A, because right-multiplication by a unimodular matrix such as U doesn't change the image. Then S has the same image as H, too, and therefore the same image as A, because removing the all-zero columns doesn't change the image either. Now that we've established this, we can show that S{{inv}}A is surjective by finding, for any given lattice point ''y'', a lattice point ''x'' such that {{nowrap|''y'' {{=}} S{{inv}}A''x''}}. The point S''y'' lies in the image of S, and because S and A have the same image, there is some lattice point ''x'' with {{nowrap|S''y'' {{=}} A''x''}}, and therefore {{nowrap|''y'' {{=}} S{{inv}}A''x''}}.
-# Proving (b) is even easier. Multiplying any matrix with an invertible matrix on the left keeps the nullspace the same. S⁻¹ is clearly invertible, being itself the inverse of S. A way to understand this is: a non-invertible matrix is the same as a singular matrix, i.e. one whose determinant is 0. So as long as you don't wipe things out by essentially multiplying by 0, the nullspace information is preserved, just scaled.
+# Proving (b) is even easier. Multiplying any matrix by an invertible matrix on the left keeps the nullspace the same. S{{inv}} is clearly invertible, being itself the inverse of S. So if {{nowrap|A''x'' {{=}} 0}} then {{nowrap|S{{inv}}A''x'' {{=}} 0}}, and conversely, left-multiplying {{nowrap|S{{inv}}A''x'' {{=}} 0}} by S gives back {{nowrap|A''x'' {{=}} 0}}. A way to understand this is: a non-invertible matrix is the same as a singular matrix, i.e. one whose determinant is 0. So as long as you don't wipe things out by essentially multiplying by 0, the nullspace information is preserved.
-# Proving (c) is a bit trickier, because S⁻¹ is not necessarily an integer matrix. But can show that S⁻¹A is an integer matrix by showing that it maps lattice points to lattice points. Suppose we have that same equation from the proof of (a), namely that y = S⁻¹Ax, where x is a lattice point. We want to show that y is a lattice point. Again, since A and S have the same image (when considered as functions on lattice points), there must be some lattice point z with Sz = Ax. But we also know that Sy = Ax. Since S is invertible, and therefore injective, y = z, so y is a lattice point.
+# Proving (c) is a bit trickier, because S{{inv}} is not necessarily an integer matrix. But we can show that S{{inv}}A is an integer matrix by showing that it maps lattice points to lattice points. Suppose we have that same equation from the proof of (a), namely that {{nowrap|''y'' {{=}} S{{inv}}A''x''}}, where ''x'' is a lattice point. We want to show that ''y'' is a lattice point. Again, since A and S have the same image (when considered as functions on lattice points), there must be some lattice point ''z'' with {{nowrap|S''z'' {{=}} A''x''}}. But we also know that {{nowrap|S''y'' {{=}} A''x''}}, by left-multiplying {{nowrap|''y'' {{=}} S{{inv}}A''x''}} by S. Since S is invertible, and therefore injective, {{nowrap|''y'' {{=}} ''z''}}, so ''y'' is a lattice point.
 ===== Relationship with other defactoring methods =====