Defactoring
A regular temperament mapping is in defactored canonical (DC) form when it is put into Hermite Normal Form (HNF) after being "defactored".
vs. normal form
normal vs. canonical
A mapping in canonical form uniquely identifies a set of mappings that are equivalent to it. Historically, the xenharmonic community has most often used the word normal for this idea, and evidence of this can be found on many pages across this wiki. And this is not wrong; normal forms are indeed often required to be unique. However, canonical forms are required to be unique even more often than normal forms are[1], and so we prefer the term canonical to normal for this purpose.
Also, using "canonical" helps establish a clear distinction from previous efforts to establish unique representations of equivalent mappings; due to its lack of historical use in RTT, it appears to be safe to simply use "canonical form" for short to refer to matrices in defactored canonical form.
vs. HNF
More importantly, and perhaps partially as a result of this weak understanding of the difference between the conventions for normal and canonical forms, the xenharmonic community has mistakenly used HNF as if it provides a unique representation of equivalent mappings. To be more specific, HNF does provide a unique representation of matrices, i.e. from a perspective of pure mathematics, and so you will certainly find throughout mathematical literature that HNF is described as providing a unique representation, and this is correct. However, when applied to the RTT domain, i.e. to mappings, the HNF sometimes fails to identify equivalent mappings as such.
The critical flaw with HNF is its failure to defactor matrices. The DC form that will be described here, on the other hand, does defactor matrices, and therefore it delivers a truly canonical result.
defactoring
Defactoring a matrix means to perform an operation on it which ensures that it is not "enfactored". And a matrix is considered to be "enfactored" if linear combinations of its rows can produce another row whose elements have a common factor (other than 1). This definition includes matrices whose rows already include a common factor, such as ⟨24 38 56] which has a common factor of 2. Being enfactored is a bad thing. Enfactored matrices — those in the RTT domain, at least — are sick, in a way[2]; it's no accident that "enfactored" sounds sort of like "infected". Fortunately, the remedy is simple: all one has to do is "defactor" it — identify and divide out the common factor — to produce a healthy mapping.
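For the single-row case, defactoring is just a gcd computation. Here is a quick Python sketch (our own illustration; `defactor_row` is a made-up helper name):

```python
from math import gcd

def defactor_row(row):
    """Find and divide out the common factor of a one-row mapping."""
    d = gcd(*row)               # gcd of all elements: 2 for <24 38 56]
    return [x // d for x in row], d

defactor_row([24, 38, 56])      # → ([12, 19, 28], 2): the healthy 12-ET map
```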
Due to complications associated with enfactored mappings which we'll get into later in this article, we discourage treating them as representations of true temperaments. Instead we recommend that they be considered to represent mere "temperoids": temperament-like structures.
vs. saturation and (con)torsion
If you've studied RTT extensively, you've probably encountered the terms "saturated" and "contorted" that are sometimes used to describe mappings. These two terms each have several flaws, and so this article presents alternative terms that are clearer and more descriptive: "defactored" and "enfactored", respectively. These new terms were coined by Dave Keenan in collaboration with Douglas Blumeyer in June of 2021.
Several concerns with the term "saturation" may be identified:
- It does not have any obvious musical or mathematical meaning in this context (whereas enfactored and defactored do have obvious mathematical meaning).
- The most common everyday usage of that word is for "saturated fats", which are the bad kind of fats, so it has negative associations, despite "saturation" being the good state for a matrix to be in.
- Research on the tuning list archives suggests that Gene Ward Smith chose the word "saturation" because it was used in the mathematical software he was using at the time, Sage[3]. However, there is another common but conflicting sense of saturation for matrices which clamps entry values to between -1 and 1[4]. We think we should avoid propagating Sage's decision to overload matrix saturation with a second meaning.
As for the term "contorsion", the concerns with it are:
- Again, it does not have any obvious musical or mathematical meaning in this context.
- It's a word that was invented for RTT and has no meaning outside of RTT[5].
- It was made up due to false assumptions. Through researching on tuning list archives, Dave and Douglas concluded that the associated concept of "torsion" was first described in January of 2002[6], with regards to commas used to form Fokker periodicity blocks. The concept of enfactoring was recognized in temperament mappings (though of course it did not yet go by that name), and — because torsion in lists of commas for Fokker blocks looks the same way as enfactoring looks in temperament comma-bases — torsion got conflated with it[7]. But they can't truly be the same thing; the critical difference is that periodicity blocks do not involve tempering, while temperaments do. In concrete terms, while it can make sense to construct a Fokker block with [-4 4 -1⟩ in the middle and [-8 8 -2⟩ = 2[-4 4 -1⟩ at the edge, it does not make sense to imagine a temperament which tempers out 2[-4 4 -1⟩ but does not temper out [-4 4 -1⟩. Unfortunately, however, this critical difference seems to have been overlooked, and so it seemed that enfactored comma-bases exhibited torsion, and thus because mappings are the dual of comma-bases, then enfactoring of a mapping should be the dual of torsion, and because the prefix co- or con- means "dual" (as in vectors and covectors), the term "con-torsion" was coined for it. "Torsion" already has the problem of being an obscure mathematical term that means nothing to most people, "contorsion" just compounds that problem by being made up, and it is made up in order to convey a duality which is false. So while "torsion" could be preserved as a term for the effect on periodicity blocks (though there's almost certainly something more helpful than that, but that's a battle for another day), the term "contorsion" must be banished from the RTT community altogether.
In accordance with this research and reasoning, this article henceforth will eschew the terms saturation and contorsion in favor of defactored and enfactored.
identifying enfactored mappings
immediately apparent enfactoring
Sometimes the enfactoring of a mapping is immediately apparent. For example:
⟨24 38 56]
This mapping has only a single row, and we can see that every element in that row is even. Therefore we have a common factor of at least 2. In this case it is in fact exactly 2. So we can say that this is a 2-enfactored mapping.
Being enfactored tells us that it's wasteful to use this mapping. Specifically, being 2-enfactored tells us that we have 2x as many pitches as we need. Said another way, half of the pitches in our system are bringing nothing to the table, at least not in terms of approximating intervals built out of the 5-limit primes 2, 3, and 5, which is the primary goal of a temperament.
This is the mapping for 5-limit 24-ET. To be clear, we're not saying there's a major problem with 24 as an EDO. The point here is only that — if you're after a 5-limit temperament — you may as well use 12-ET. So we would consider 24-ET to stand for 24 Equal Temperoid.
Think of it this way: because every element is even, any JI interval you'd map with this mapping must come out as an even number of steps of 24-ET, by the definition of the dot product, and an even number of steps of 24-ET is just a whole number of steps of 12-ET. Examples: [1 -2 1⟩.⟨24 38 56] = 24 - 76 + 56 = 4, [1 1 -1⟩.⟨24 38 56] = 24 + 38 - 56 = 6.
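The mapping arithmetic above can be checked in a couple of lines of Python (an illustrative sketch; `steps` is a hypothetical helper name, not from the article):

```python
def steps(vector, map_row):
    """Map a prime-count vector through a one-row mapping (a dot product)."""
    return sum(v * m for v, m in zip(vector, map_row))

m24 = [24, 38, 56]
steps([1, -2, 1], m24)   # [1 -2 1> (10/9): 24 - 76 + 56 = 4 steps
steps([1, 1, -1], m24)   # [1 1 -1> (6/5):  24 + 38 - 56 = 6 steps
```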
Other times, enfactoring is less apparent. Consider this example:
[⟨3 0 -1] ⟨0 3 5]⟩
This is a form of 5-limit porcupine, a rank-2 temperament. Looking at either row, neither map has a common factor. But remember that we also need to check linear combinations of rows. If we subtract the 2nd row from the 1st row, we can produce the row ⟨3 -3 -6], which has a common factor of 3. So this mapping is also enfactored, even though it's not obvious from just looking at it.
If you're unsure why this ⟨3 -3 -6] matters despite not appearing in [⟨3 0 -1] ⟨0 3 5]⟩, we may need to quickly review some linear algebra fundamentals. It may take some getting used to, but a mapping can be changed to another equivalent mapping (both mappings will map input vectors to the same scalars) by replacing any row with a linear combination of its rows. That is, we could replace either ⟨3 0 -1] or ⟨0 3 5] in our original matrix [⟨3 0 -1] ⟨0 3 5]⟩ to get [⟨3 -3 -6] ⟨0 3 5]⟩ or [⟨3 0 -1] ⟨3 -3 -6]⟩, and all of these mappings represent the same temperament.
Sometimes the hidden common factor is even harder to find. Consider the mapping [⟨6 5 -4] ⟨4 -4 1]⟩. To find its common factor, we need to take two times the first row ⟨6 5 -4] minus three times the second row ⟨4 -4 1], producing ⟨0 22 -11]. So we can see here that its common factor is 11.
And so we can begin to see that the problem of identifying enfactored mappings may not be very simple or straightforward.
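There is, however, a systematic check. A standard linear-algebra fact (not spelled out in this article) is that an integer matrix is defactored exactly when the gcd of its largest (rank-sized) minors is 1, and otherwise that gcd is the hidden common factor. A Python sketch for the rank-2 case (`greatest_factor` is our own hypothetical name):

```python
from itertools import combinations
from math import gcd

def greatest_factor(mapping):
    """gcd of all 2x2 minors of a 2-row integer mapping; 1 means defactored."""
    r1, r2 = mapping
    minors = [r1[i] * r2[j] - r1[j] * r2[i]
              for i, j in combinations(range(len(r1)), 2)]
    return gcd(*minors)  # math.gcd takes the absolute value, so signs don't matter

greatest_factor([[3, 0, -1], [0, 3, 5]])   # → 3: the hidden factor found above
greatest_factor([[6, 5, -4], [4, -4, 1]])  # → 11: the harder example above
```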
defactoring methods
Even better than identifying enfactored mappings is actually full-on defactoring them. Here are two methods that do just that: Smith defactoring, developed by Gene Ward Smith[8], and column Hermite defactoring, developed by Dave and Douglas (the name comes, of course, from Hermite normal form, which it uses[9]).
Neither of these methods has been rigorously proven to always defactor mappings, but tests Douglas ran on thousands of random mappings strongly suggested that both methods work and give exactly the same results as each other.
This article prefers column Hermite defactoring to Smith defactoring because it:
- is computationally cheaper, wasting fewer resources on computing things irrelevant to the result,
- is easier to understand, and can be worked out by hand (as we will demonstrate below),
- lets you see, if you are interested, what the common factor was, if there was any.
Column Hermite defactoring could not have been developed, however, were it not for Gene's pioneering work with the Smith defactoring (what he calls the process of "saturating" a mapping). At first Dave and Douglas had no idea what the right reducing matrix of the Smith decomposition (the process which also provides the Smith normal form) had to do with common factors, only that it somehow magically worked. So they analyzed the Smith decomposition until they isolated its key actions which actually effect the defactoring, and then honed their method down to do only these necessary actions. Again, they wouldn't have known where to start were it not for Gene.
precedent: Smith defactoring
Dave and Douglas did much of their work in Wolfram Language (formerly Mathematica), a popular programming language used for math problems. In this section we'll give examples using it.
In Wolfram Language, an input mapping [math]\displaystyle{ m }[/math], such as the example Gene gives on the xen wiki page for Saturation, [⟨12 19 28 34] ⟨26 41 60 72]⟩, is written as a nested list:
m = {{12,19,28,34},{26,41,60,72}};
The implementation of Gene's method in Wolfram Language is simple. Just two lines:
rightReducingMatrix[m_] := Last[SmithDecomposition[m]]
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]]
So the first thing that happens to [math]\displaystyle{ m }[/math] when you pass it in to smithDefactor[] is that it calls rightReducingMatrix[] on it. This finds the Smith decomposition (using a function built in to Wolfram Language), which gives three outputs: the Smith normal form, flanked by its left and right reducing matrices. We want only the right reducing matrix, so we grab that with Last[]. That's what the function on the first line, rightReducingMatrix[], does.
Then Gene asks us to invert this result and take its first [math]\displaystyle{ r }[/math] rows, where [math]\displaystyle{ r }[/math] is the rank of the temperament. Inverse[] takes care of the inversion, of course. MatrixRank[m] gives the count of linearly independent rows of the mapping, AKA the rank, or the count of generators of this temperament; in this case that's 2. And so Take[list, 2] simply returns the first 2 entries of the list.
Almost done! Except Gene not only defactors, he also calls for HNF, as we would, to achieve canonical form.
normalize[m_] := Last[HermiteDecomposition[m]]
Similar to the Smith Normal Form, we do a decomposition, which gives you the normal form plus some other bonus results. In this case we actually want the normal form itself, and it happens to be the last element in the result list. So putting it all together, we defactor and then normalize:
rightReducingMatrix[m_] := Last[SmithDecomposition[m]];
smithDefactor[m_] := Take[Inverse[rightReducingMatrix[m]], MatrixRank[m]];
normalize[m_] := Last[HermiteDecomposition[m]];
m = {{12,19,28,34},{26,41,60,72}};
normalize[smithDefactor[m]]
→ {{1,0,-4,-13},{0,1,4,10}}
And that result matches what Gene finds in that xen wiki article. Defactoring and normalizing is equivalent to canonicalization.
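As a quick sanity check (our own Python sketch, not part of Gene's method), each row of the original mapping should be an integer combination of the canonical rows. Because the canonical rows begin with an identity block, the needed coefficients are just the first two entries of each original row:

```python
original  = [[12, 19, 28, 34], [26, 41, 60, 72]]
canonical = [[1, 0, -4, -13], [0, 1, 4, 10]]

def combo(coeffs, rows):
    """Integer linear combination of the given rows."""
    return [sum(c * r[k] for c, r in zip(coeffs, rows))
            for k in range(len(rows[0]))]

combo([12, 19], canonical)  # → [12, 19, 28, 34], the first original row
combo([26, 41], canonical)  # → [26, 41, 60, 72], the second original row
```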
new development: column Hermite defactoring
Here is the implementation for column Hermite defactoring:
hermiteUnimodular[m_] := Transpose[First[HermiteDecomposition[Transpose[m]]]]
columnEchelonDefactor[m_] := Take[Inverse[hermiteUnimodular[m]], MatrixRank[m]]
So this implementation begins by transposing the matrix, so that when it then performs the Hermite decomposition, it is doing a column decomposition. We then take the unimodular matrix from the decomposition using First[], and Transpose[] it to in effect undo the transposition we did at the beginning.
- include the L-shape work-through example
- refer to the email thread with Dave and talk about how it leaves the enfactoring behind while preserving the important information in the unimodular matrix, then does an inverse
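To make the column Hermite recipe concrete, here is a rough pure-Python sketch of the same idea. This is our own illustrative translation, not the authors' Wolfram code; `hnf_with_transform`, `column_hermite_defactor`, and the other names are made up for this sketch, and it assumes the input mapping has full row rank:

```python
from fractions import Fraction

def hnf_with_transform(mat):
    """Row-style Hermite normal form; returns (u, h) where u is unimodular
    and u times mat equals h, using integer row operations only."""
    h = [list(row) for row in mat]
    n = len(h)
    u = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    r = 0
    for c in range(len(h[0])):
        piv = next((i for i in range(r, n) if h[i][c] != 0), None)
        if piv is None:
            continue
        h[r], h[piv], u[r], u[piv] = h[piv], h[r], u[piv], u[r]
        for i in range(r + 1, n):
            while h[i][c] != 0:  # Euclidean gcd steps on rows r and i
                q = h[r][c] // h[i][c]
                h[r] = [a - q * b for a, b in zip(h[r], h[i])]
                u[r] = [a - q * b for a, b in zip(u[r], u[i])]
                h[r], h[i], u[r], u[i] = h[i], h[r], u[i], u[r]
        if h[r][c] < 0:  # make the pivot positive
            h[r], u[r] = [-x for x in h[r]], [-x for x in u[r]]
        for i in range(r):  # reduce the entries above the pivot
            q = h[i][c] // h[r][c]
            h[i] = [a - q * b for a, b in zip(h[i], h[r])]
            u[i] = [a - q * b for a, b in zip(u[i], u[r])]
        r += 1
    return u, h

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse(mat):
    """Exact inverse of a unimodular integer matrix (result is integer)."""
    n = len(mat)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for c in range(n):
        piv = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [[int(x) for x in row[n:]] for row in aug]

def column_hermite_defactor(m):
    """Column HNF of m via the transpose, keep the unimodular matrix,
    invert it, and take the first r rows (r = row count, assuming full rank)."""
    u, _ = hnf_with_transform(transpose(m))
    hermite_unimodular = transpose(u)
    return inverse(hermite_unimodular)[:len(m)]

# the 3-enfactored porcupine example from earlier in the article:
defactored = column_hermite_defactor([[3, 0, -1], [0, 3, 5]])
_, canonical = hnf_with_transform(defactored)  # then HNF gives canonical form
```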
other stuff to report
- talk about the criteria, like has to be integer, full-rank
In addition to being canonical and defactored, DC form has other important properties:
- It is integer, i.e. contains only integer terms.
- reduced
- full-rank
- more I think, check email
- show the examples we tried, like in the big defactoring table
Relationship between various matrix echelon forms
There are several well-known echelon matrix forms. The most general form, with the fewest constraints, is simply called Row Echelon Form, or REF. Its only constraint is:
- the rows are in echelon (staircase) shape
and therefore it does not produce a unique representation.
An Integer Row Echelon Form, or IREF, is, unsurprisingly, any REF whose terms are all integers. Again, this is still not necessarily unique.
The Reduced Row Echelon Form, or RREF, takes REF in a different direction. It stipulates that the pivots are all 1's. This may require dividing rows by a number such that resulting elements are no longer integers. Because of this constraint, however, the RREF form of a matrix is unique.
So IREF and RREF make a Venn Diagram inside the category of REF: some IREF are RREF, but there are some RREF that are not IREF and some IREF that are not RREF. When we scope the situation to a specific matrix, however, because RREF is a unique form, this means that one or the other sector of the Venn diagram for RREF will be empty; either the unique RREF form will also be IREF (and therefore the RREF-but-not-IREF sector will be empty), or it will not be IREF (and vice versa).
The next form to discuss is the Integer Reduced Row Echelon Form, or IRREF. Based on the name, one might expect this form to be a combination of the constraints for RREF and IREF, and therefore if represented in an Euler diagram (generalization of Venn diagram) would only exist within their intersection. However this is not the case. That's because the IRREF does not include the key constraint of RREF which is that all of the pivots must be 1. IRREF is produced by simply taking the unique RREF form and multiplying each row by whatever minimum value is necessary to make all of the elements integers. Of course, this sometimes results in the pivots no longer being 1, so sometimes it is no longer RREF. It is always still REF, though, and because it is also always integer, that makes it always IREF; therefore, IRREF is strictly a subcategory of IREF. And because the RREF form is unique, and the conversion process does not alter that, the IRREF form is also unique.
It is not possible for an RREF form to be IREF without also being IRREF.
The last form to discuss is the HNF. Its constraints are:
- the rows are in echelon shape
- all entries are integers
- each pivot is positive
- the entries above each pivot are nonnegative and less than that pivot
and so it is always integer, and always REF, and therefore always IREF, but it is not necessarily identical to the IRREF, which is the IREF you find by converting the RREF.
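These constraints can be made concrete with a small Python sketch (our own illustrative checker with a hypothetical name, `is_hnf`, following one common row-style HNF convention matching the forms discussed here, and assuming an integer matrix):

```python
def is_hnf(m):
    """Check row-style HNF constraints on an integer matrix: echelon shape,
    positive pivots, and entries above each pivot reduced modulo that pivot."""
    last_pivot_col = -1
    for i, row in enumerate(m):
        c = next((j for j, x in enumerate(row) if x != 0), None)
        if c is None:
            continue  # an all-zero row; a fuller check would require these last
        if c <= last_pivot_col or row[c] <= 0:
            return False  # echelon shape broken, or pivot not positive
        if any(not (0 <= m[k][c] < row[c]) for k in range(i)):
            return False  # an entry above the pivot is not reduced
        last_pivot_col = c
    return True

is_hnf([[1, 2, 3], [0, 3, 5]])      # True: porcupine's HNF
is_hnf([[3, 0, -1], [0, 3, 5]])     # True: in HNF, yet still 3-enfactored
is_hnf([[12, 19, 28], [7, 11, 16]]) # False: not in echelon shape
```

Note that the second example passes the check despite being enfactored, which is exactly the article's point that HNF alone is not canonical for mappings.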
And so, four different states are possible:
- The RREF, IRREF, and HNF are all different. Example: porcupine.
- The RREF, IRREF, and HNF are all the same. Example: meantone.
- The RREF and IRREF are the same, but the HNF is different. Example: [⟨2 3 5] ⟨7 6 13]⟩ (I haven't found a realistic one yet)
- The IRREF and HNF are the same, but the RREF is different. Example: hanson.
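For the porcupine case (first bullet), the three forms can be computed with a short Python sketch using exact fractions. The `rref` and `irref` helpers are our own hypothetical names, and porcupine's HNF mapping form [⟨1 2 3] ⟨0 3 5]⟩ is assumed as the starting point:

```python
from fractions import Fraction
from math import lcm

def rref(mat):
    """Reduced row echelon form over the rationals (pivots forced to 1)."""
    m = [[Fraction(x) for x in row] for row in mat]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]  # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def irref(mat):
    """IRREF: scale each RREF row by the least multiplier making it integer."""
    return [[int(x * lcm(*[y.denominator for y in row])) for x in row]
            for row in rref(mat)]

porcupine = [[1, 2, 3], [0, 3, 5]]  # also its own HNF
rref(porcupine)   # [[1, 0, -1/3], [0, 1, 5/3]]
irref(porcupine)  # [[3, 0, -1], [0, 3, 5]]
```

Here the RREF, IRREF, and HNF are indeed three different matrices, as the first bullet claims.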
Canonical comma-bases
DC form is not only for mappings. Comma-bases — the duals of mappings — may also be put into DC form, as long as they are first antitransposed[10], and then antitransposed again at the end, or in other words, you sandwich the defactoring and HNF operations between antitransposes.
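The antitranspose itself is simple: transpose across the antidiagonal, which amounts to reversing the row order and the column order and then transposing ordinarily. A minimal Python sketch (our own illustrative helper, assuming comma-bases store vectors as columns):

```python
def antitranspose(m):
    """Transpose across the antidiagonal: reverse the row order and the
    column order, then transpose ordinarily."""
    flipped = [row[::-1] for row in m[::-1]]
    return [list(col) for col in zip(*flipped)]

meantone_comma = [[-4], [4], [-1]]  # [-4 4 -1> as a one-column comma-basis
antitranspose(meantone_comma)       # → [[-1, 4, -4]], a one-row matrix
antitranspose(antitranspose(meantone_comma)) == meantone_comma  # an involution
```

Applying it twice returns the original matrix, which is why the defactor-and-HNF steps can safely be sandwiched between two antitransposes.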
DC form is arguably even more important for comma-bases than it is for mappings, because enfactored mappings at least have clear musical meaning, while enfactored comma-bases are little but a wellspring of confusion. In other words, ⟨24 38 56] may not be a true temperament, but it still represents a temperoid and an EDO.
- ↑ According to the Wikipedia page for canonical form, 'the distinction between "canonical" and "normal" forms varies from subfield to subfield. In most fields, a canonical form specifies a unique representation for every object, while a normal form simply specifies its form, without the requirement of uniqueness.'
- ↑ According to saturation, "...if [an RTT matrix] isn't saturated the supposed temperament it defines may be regarded as pathological..."
- ↑ See: https://yahootuninggroupsultimatebackup.github.io/tuning-math/topicId_18026.html and https://doc.sagemath.org/html/en/reference/search.html?q=index_in_saturation
- ↑ See https://math.stackexchange.com/questions/1964814/linear-transformation-of-a-saturated-vector and https://faculty.uml.edu//thu/tcs01-june.pdf
- ↑ Here is the tuning list post where it was coined by Paul Erlich: https://yahootuninggroupsultimatebackup.github.io/tuning-math/topicId_2033.html#2456
- ↑ See: https://yahootuninggroupsultimatebackup.github.io/tuning-math/topicId_2937 which is also referred to here http://tonalsoft.com/enc/t/torsion.aspx
- ↑ See: https://yahootuninggroupsultimatebackup.github.io/tuning-math/topicId_2033.html#2405
- ↑ but the name comes from a different Smith: Henry John Stephen Smith, for whom the Smith normal form is named, which this method uses
- ↑ named for Charles Hermite, who was French, by the way, and so his name is pronounced more like err-MEET, not like HER-might
- ↑ See a discussion of the antitranspose here: https://en.xen.wiki/w/User:Cmloegcmluin/Sandbox#null-space