Generator embedding optimization


When optimizing tunings of regular temperaments, it is fairly quick and easy to find approximate solutions, using (for example) the general method which is discussed in D&D's guide to RTT and available in D&D's RTT library in Wolfram Language. This RTT library also includes four other methods which quickly and easily find exact solutions. These four methods are further different from the general method insofar as they are not general; each one works only for certain optimization problems. It is these four specialized exact-solution methods which are the subject of this article.

Two of these four specialized methods were briefly discussed in D&D's guide, along with the general method, because these specialized methods are actually even quicker and easier than the general method. These two are the only held-intervals method, and the pseudoinverse method. But there's still plenty more insight to be had into how and why exactly these methods work, in particular the pseudoinverse method, so we'll be doing a much deeper dive into it in this article than was done in D&D's guide.

The other two of these four specialized methods — the zero-damage method, and the coinciding-damage method — are significantly more challenging to understand than the general method. Most students of RTT would not gain enough musical insight from familiarizing themselves with them to justify the investment. This is why these two methods were not discussed in D&D's guide. However, if you feel compelled to understand the nuts and bolts of these methods anyway, then those sections of the article may well appeal to you.

This article is titled "Generator embedding optimization" because of a key feature these four specialized methods share: they can all give their solutions as generator embeddings, i.e. lists of prime-count vectors, one for each generator, where typically these prime-count vectors have non-integer entries (and are thus not JI). This is different from the general method, which can only give generator tuning maps, i.e. sizes in cents for each generator. As we'll see, a tuning optimization method's ability to give solutions as generator embeddings is equivalent to its ability to give solutions that are exact.

Intro

A summary of the methods

The three biggest sections of this article are dedicated to three specialized tuning methods, one for each of the three special optimization powers: the pseudoinverse method is used for [math]\displaystyle{ p = 2 }[/math] (miniRMS tuning schemes), the zero-damage method is used for [math]\displaystyle{ p = 1 }[/math] (miniaverage tuning schemes), and the coinciding-damage method is used for [math]\displaystyle{ p = ∞ }[/math] (minimax tuning schemes).

These three methods also work for all-interval tuning schemes, which by definition are all minimax tuning schemes (optimization power [math]\displaystyle{ ∞ }[/math]), differing instead by the power of the power norm used for the interval complexity by which they simplicity-weight damage. But it's not the interval complexity norm power [math]\displaystyle{ q }[/math] which directly determines the method used, but rather its dual power, [math]\displaystyle{ \text{dual}(q) }[/math]: the power of the dual norm minimized on the retuning magnitude. So the pseudoinverse method is used for [math]\displaystyle{ \text{dual}(q) = 2 }[/math], the zero-damage method is used for [math]\displaystyle{ \text{dual}(q) = 1 }[/math], and the coinciding-damage method is used for [math]\displaystyle{ \text{dual}(q) = ∞ }[/math].

If for some reason you've decided that you want to use a different optimization power than those three, then no exact solution in the form of a generator embedding is available, and you'll need to fall back to the general tuning computation method, linked above.

The general method also works for those special powers [math]\displaystyle{ 1 }[/math], [math]\displaystyle{ 2 }[/math], and [math]\displaystyle{ ∞ }[/math], however, so if you're in a hurry, you should skip this article and lean on that method instead (though you should be aware that the general method offers less insight about each of those tuning schemes than their specialized methods do).

Exact vs approximate solutions

Tuning computation methods can be classified by whether they give an approximate or exact solution.

The general method is an approximate type; it finds the generator tuning map [math]\displaystyle{ 𝒈 }[/math] directly, using trial-and-error methods such as gradient descent or differential evolution whose details we won't go into. The accuracy of approximate types depends on how long you are willing to wait.

In contrast, the exact types work by solving for a matrix [math]\displaystyle{ G }[/math], the generator embedding.

We can calculate [math]\displaystyle{ 𝒈 }[/math] from this [math]\displaystyle{ G }[/math] via [math]\displaystyle{ 𝒋G }[/math], that is, the generator tuning map is obtained as the product of the just tuning map and the generator embedding.

Because [math]\displaystyle{ 𝒈 = 𝒋G }[/math], if [math]\displaystyle{ 𝒈 }[/math] is the primary target, not [math]\displaystyle{ G }[/math], and a formula for [math]\displaystyle{ G }[/math] is known, then it is possible to substitute that into [math]\displaystyle{ 𝒈 = 𝒋G }[/math] and thereby bypass explicitly solving for [math]\displaystyle{ G }[/math]. For example, this was essentially what was done in the Only held-intervals method and Pseudoinverse method sections of Dave Keenan & Douglas Blumeyer's guide to RTT: tuning computation.

Note that with any exact type that solves for [math]\displaystyle{ G }[/math], since it is possible to have an exact [math]\displaystyle{ 𝒋 }[/math], it is also possible to find an exact [math]\displaystyle{ 𝒈 }[/math]. For example, the approximate value of the 5-limit [math]\displaystyle{ 𝒋 }[/math] we're quite familiar with is ⟨1200.000 1901.955 2786.314], but its exact value is ⟨[math]\displaystyle{ 1200×\log_2(2) }[/math] [math]\displaystyle{ 1200×\log_2(3) }[/math] [math]\displaystyle{ 1200×\log_2(5) }[/math]], so if the exact tuning of quarter-comma meantone is [math]\displaystyle{ G }[/math] = {[1 0 0⟩ [0 0 ¼⟩], then this can be expressed as an exact generator tuning map [math]\displaystyle{ 𝒈 }[/math] = {[math]\displaystyle{ (1200×\log_2(2))(1) + (1200×\log_2(3))(0) + (1200×\log_2(5))(0) }[/math] [math]\displaystyle{ (1200×\log_2(2))(0) + (1200×\log_2(3))(0) + (1200×\log_2(5))(\frac14) }[/math]] = {[math]\displaystyle{ 1200 }[/math] [math]\displaystyle{ \dfrac{1200×\log_2(5)}{4} }[/math]].
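
Here is a minimal sketch of that exact product [math]\displaystyle{ 𝒈 = 𝒋G }[/math] in plain Python (not D&D's Wolfram-language RTT library), for the quarter-comma meantone case just described; the only inputs are the exact just tuning map and the generator embedding.

<syntaxhighlight lang="python">
from math import log2

j = [1200 * log2(2), 1200 * log2(3), 1200 * log2(5)]    # exact just tuning map, in cents
G = [[1, 0],                                            # first column: [1 0 0⟩, the octave
     [0, 0],
     [0, 1/4]]                                          # second column: [0 0 1/4⟩, a quarter of prime 5

g = [sum(j[p] * G[p][col] for p in range(3)) for col in range(2)]
print(g)                                                # [1200.0, 696.578...]
</syntaxhighlight>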

Also note that any method which solves for [math]\displaystyle{ G }[/math] can also produce [math]\displaystyle{ 𝒈 }[/math] via this [math]\displaystyle{ 𝒋G }[/math] formula. But methods which solve directly for [math]\displaystyle{ 𝒈 }[/math] cannot provide a [math]\displaystyle{ G }[/math], even if a [math]\displaystyle{ G }[/math] could have been computed for the given type of optimization problem (such as a minimax type, which notably is the majority of tuning optimizations used on the wiki). In a way, tuning maps are like a lossily compressed form of information from embeddings.

Here's a breakdown of which computation methods solve directly for [math]\displaystyle{ 𝒈 }[/math], and which can solve for [math]\displaystyle{ G }[/math] instead:

optimization power | method | solution type | solves for
[math]\displaystyle{ 2 }[/math] | pseudoinverse | exact | [math]\displaystyle{ G }[/math]
[math]\displaystyle{ 1 }[/math] | zero-damage | exact | [math]\displaystyle{ G }[/math]
[math]\displaystyle{ ∞ }[/math] | coinciding-damage | exact | [math]\displaystyle{ G }[/math]
any power or power limit | general | approximate | [math]\displaystyle{ 𝒈 }[/math]
n/a | only held-intervals | exact | [math]\displaystyle{ G }[/math]

The generator embedding

Roughly speaking, if [math]\displaystyle{ M }[/math] is the matrix which isolates the temperament information, and [math]\displaystyle{ 𝒋 }[/math] is the matrix which isolates the sizing information, then [math]\displaystyle{ G }[/math] is the matrix that isolates the tuning information. This is a matrix whose columns are prime-count vectors representing the generators of the temperament. For example, a Pythagorean tuning of meantone temperament would look like this:


[math]\displaystyle{ G = \left[ \begin{array} {rrr} 1 & {-1} \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] }[/math]


The first column is the vector [1 0 0⟩ representing [math]\displaystyle{ \frac21 }[/math], and the second column is the vector [-1 1 0⟩ representing [math]\displaystyle{ \frac32 }[/math]. So generator embeddings will always have the shape [math]\displaystyle{ (d, r) }[/math]: one row for each prime harmonic in the domain basis (the dimensionality), one column for each generator (the rank).

Pythagorean tuning is not a common tuning of meantone, however, and is an extreme enough tuning of that temperament that it should be considered unreasonable. We gave it as our first example anyway, though, in order to more gently introduce the concept of generator embeddings, because its prime-count vector columns are simple and familiar, while in reality, most generator embeddings consist of prime-count vectors which do not have integer entries. Therefore, these prime-count vectors do not represent JI intervals, and are unlike any prime-count vectors we've worked with so far. For another example of a meantone tuning, then, one which is more common and reasonable, let's consider the quarter-comma tuning of meantone. Its generator embedding looks like this:


[math]\displaystyle{ G = \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] }[/math]


Algebraic setup

The basic algebraic setup of tuning optimization looks like this:


[math]\displaystyle{ \textbf{d} = |\,𝒈M\mathrm{T}W - 𝒋\mathrm{T}W\,| }[/math]


When we break [math]\displaystyle{ 𝒈 }[/math] down into [math]\displaystyle{ 𝒋 }[/math] and a [math]\displaystyle{ G }[/math] we're solving for, the algebraic setup of tuning optimization comes out like this:


[math]\displaystyle{ \textbf{d} = |\,𝒋GM\mathrm{T}W - 𝒋G_{\text{j}}M_{\text{j}}\mathrm{T}W\,| }[/math]


We can factor things in both directions this time (and we'll take [math]\displaystyle{ 𝒋 }[/math] outside the absolute value bars since it's guaranteed to have no negative entries):


[math]\displaystyle{ \textbf{d} = 𝒋\,|\,(GM - G_{\text{j}}M_{\text{j}})\mathrm{T}W\,| }[/math]


But wait — there are actually two more matrices we haven't recognized yet, on the just side of things. These are [math]\displaystyle{ G_{\text{j}} }[/math] and [math]\displaystyle{ M_{\text{j}} }[/math]. Unsurprisingly, these two are closely related to [math]\displaystyle{ G }[/math] and [math]\displaystyle{ M }[/math], respectively. The subscript [math]\displaystyle{ \text{j} }[/math] stands for "just intonation", so this is intended to indicate that these are the generators and mapping for JI.

Both [math]\displaystyle{ G_{\text{j}} }[/math] and [math]\displaystyle{ M_{\text{j}} }[/math] are in fact identity matrices, so we could replace either or both of them with [math]\displaystyle{ I }[/math]:


[math]\displaystyle{ \textbf{d} = 𝒋\,|\,(GM - II)\mathrm{T}W\,| }[/math]


Which reduces to:


[math]\displaystyle{ \textbf{d} = 𝒋\,|\,(P - I)\mathrm{T}W\,| }[/math]


Where [math]\displaystyle{ P }[/math] is the projection matrix found as [math]\displaystyle{ P = GM }[/math].

So why do we have [math]\displaystyle{ G_{\text{j}} }[/math] and [math]\displaystyle{ M_{\text{j}} }[/math] there at all? For maximal parallelism between the tempered side and the just side. In part this is a pragmatic decision, because as we work with these sorts of expressions moving forward, we'll prefer something rather than nothing in this position anyway. But there's also a pedagogical goal here, which is to convey how in JI, the mapping matrix and the generator embedding really are identity matrices, and it can be helpful to stay mindful of it.

You can imagine reading a [math]\displaystyle{ (3, 3) }[/math]-shaped identity matrix like a mapping matrix: how many generators does it take to approximate prime 2? One of the first generator, and nothing else. How many to approximate prime 3? One of the second generator, and nothing else. How many to approximate prime 5? One of the third generator, and nothing else. So this mapping is not much of a mapping at all. It shows us only that in this temperament, the first generator may as well be a perfect approximation of prime 2, the second generator may as well be a perfect approximation of prime 3, and the third generator may as well be a perfect approximation of prime 5. Any temperament which has as many generators as it has primes may as well be JI like this.

And then the fact that the generator embedding on the just side is also an identity matrix finishes the point. The vector for the first generator is [1 0 0⟩, a representation of the interval [math]\displaystyle{ \frac21 }[/math]; the vector for the second generator is [0 1 0⟩, a representation of the interval [math]\displaystyle{ \frac31 }[/math]; and the vector for the third generator is [0 0 1⟩, a representation of the interval [math]\displaystyle{ \frac51 }[/math].

We can even understand this in terms of a units analysis, where if [math]\displaystyle{ M_{\text{j}} }[/math] is taken to have units of g/p, and [math]\displaystyle{ G_{\text{j}} }[/math] is taken to have units of p/g, then together we find their units to be ... nothing. And an identity matrix that isn't even understood to have units is definitely useless and to be eliminated. Though it's actually not as simple as the [math]\displaystyle{ \small \sf p }[/math]'s and [math]\displaystyle{ \small \sf g }[/math]'s canceling out; for more details, see Dave Keenan & Douglas Blumeyer's guide to RTT: units analysis#The JI mapping times the JI generators embedding.

So when the interval vectors constituting the target-interval list [math]\displaystyle{ \mathrm{T} }[/math] are multiplied by [math]\displaystyle{ G_{\text{j}}M_{\text{j}} }[/math] they are unchanged, which means that multiplying the result by [math]\displaystyle{ 𝒋 }[/math] simply computes their just sizes.

Deduplication

Between target-interval set and held-interval basis

Generally speaking, held-intervals should be removed if they also appear in the target-interval set. If these intervals are not removed, the correct tuning can still be computed; however, during optimization, effort will have been wasted on minimizing damage to these intervals, because their damage would have been held to 0 by other means anyway.

Of course, there is some cost to the deduplication itself, but in general it should be more computationally efficient to remove these intervals from the target-interval set in advance, rather than submit them to the optimization procedures as-is.

Duplication of intervals between these two sets will most likely occur when using a target-interval set scheme (such as a TILT or OLD) that automatically chooses the target-interval set.

Constant damage target-intervals

There is also a possibility, when holding intervals, that some target-intervals' damages will be constant everywhere within the tuning damage space to be searched, and thus these target-intervals will have no effect on the tuning. Their preservation in the target-interval set will only serve to slow down computation.

For example, in pajara temperament, with mapping [⟨2 3 5 6] ⟨0 1 -2 -2]}, if the octave is held unchanged, then there is no sense keeping [math]\displaystyle{ \frac75 }[/math] in the target-interval set. The octave [1 0 0 0⟩ maps to [2 0} in this temperament, and [math]\displaystyle{ \frac75 }[/math] [0 0 -1 1⟩ maps to [1 0}. So if the first generator is fixed in order to hold the octave unchanged, then ~[math]\displaystyle{ \frac75 }[/math]'s tuning will also be fixed.
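
A tiny numpy sketch of this point (the variable names here are ad hoc, not the RTT library's): both the octave and ~7/5 involve only the first generator, so holding the octave unchanged, which fixes that generator at 600 ¢, fixes ~7/5's tuning too.

<syntaxhighlight lang="python">
import numpy as np

M = np.array([[2, 3, 5, 6], [0, 1, -2, -2]])   # pajara mapping [⟨2 3 5 6] ⟨0 1 -2 -2]}
octave     = np.array([1, 0, 0, 0])            # 2/1, as a prime-count vector
seven_five = np.array([0, 0, -1, 1])           # 7/5

print(M @ octave)       # [2 0], i.e. the generator-count vector [2 0}
print(M @ seven_five)   # [1 0], i.e. the generator-count vector [1 0}
# Holding 2/1 unchanged forces g1 = 600 cents, so ~7/5 = 1*g1 = 600 cents regardless of g2.
</syntaxhighlight>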

Within target-interval set

We also note a potential for duplication within the target-interval set, irrespective of held-intervals: depending on the temperament, some target-intervals may map to the same tempered interval. For another pajara example, using the TILT as a target-interval set scheme, the target-interval set will contain [math]\displaystyle{ \frac{10}{7} }[/math] and [math]\displaystyle{ \frac75 }[/math], but pajara maps both of those intervals to [1 0}, and thus the damage to these two intervals will always be the same.

However, critically, this is only truly redundant information in the case of a minimax tuning scheme, where the optimization power [math]\displaystyle{ p = ∞ }[/math]. In this case, if the damage to [math]\displaystyle{ \frac75 }[/math] is the max, then it's irrelevant whether the damage to [math]\displaystyle{ \frac{10}{7} }[/math] is also the max. But in the case of any other optimization power, both the presence of [math]\displaystyle{ \frac75 }[/math] and of [math]\displaystyle{ \frac{10}{7} }[/math] in the target-interval set will have some effect; for example, with [math]\displaystyle{ p = 1 }[/math], miniaverage tuning schemes, this means that whatever the identical damage to this one mapped target-interval [1 0} may be, since two different target-intervals of ours map to it, we care about its damage twice as much, and thus it essentially gets counted twice in our average damage computation.

Should redundant mapped target-intervals be removed when computing minimax tuning schemes? It's a reasonable consideration. The RTT Library in Wolfram Language does not do this. In general, this may add more complexity to the code than the benefit is worth; it requires minding the difference between the requested target-interval set count [math]\displaystyle{ k }[/math] and the count of deduped mapped target-intervals, which would require a new variable.

Only held-intervals method

The only held-intervals method was mostly covered here: Dave Keenan & Douglas Blumeyer's guide to RTT: tuning computation#Only held-intervals method. But there are a couple adjustments we'll make to how we talk about it here.

Unchanged-interval basis

In D&D's guide article, this method was discussed in terms of held-intervals, which are a trait of a tuning scheme, or in other words, a request that a person makes of a tuning optimization procedure which that procedure will then satisfy. But there's something interesting that happens once we request enough intervals to be held unchanged — that is, when our held-interval count [math]\displaystyle{ h }[/math] reaches our generator count, also known as the rank [math]\displaystyle{ r }[/math] — then we have no room left for optimization. At this point, the tuning is entirely determined by the held-intervals. And thus we get another, perhaps better, way to look at this interval basis: no longer in terms of a request on a tuning scheme, but as a characteristic of a specific tuning itself. Under this conceptualization, what we have is not a held-interval basis [math]\displaystyle{ \mathrm{H} }[/math], but an unchanged-interval basis [math]\displaystyle{ \mathrm{U} }[/math].

Because in the majority of cases within this article it will be more appropriate to conceive of this basis as a characteristic of a fully-determined tuning, as opposed to a request of a tuning scheme, we will henceforth be dealing with this method in terms of [math]\displaystyle{ \mathrm{U} }[/math], not [math]\displaystyle{ \mathrm{H} }[/math].

Generator embedding

So, substituting [math]\displaystyle{ \mathrm{U} }[/math] in for [math]\displaystyle{ \mathrm{H} }[/math] in the formula we learned from D&D's guide article:


[math]\displaystyle{ 𝒈 = 𝒋\mathrm{U}(M\mathrm{U})^{-1} }[/math]


This tells us that if we know the unchanged-interval basis for a tuning, i.e. every unchanged-interval in the form of a prime-count vector, then we can get our generators. But the next difference we want to look at here is this: the formula has bypassed the computation of [math]\displaystyle{ G }[/math]! We can expand [math]\displaystyle{ 𝒈 }[/math] to [math]\displaystyle{ 𝒋G }[/math]:


[math]\displaystyle{ 𝒋G = 𝒋\mathrm{U}(M\mathrm{U})^{-1} }[/math]


And cancel out:


[math]\displaystyle{ \cancel{𝒋}G = \cancel{𝒋}\mathrm{U}(M\mathrm{U})^{-1} }[/math]


To find:


[math]\displaystyle{ G = \mathrm{U}(M\mathrm{U})^{-1} }[/math]
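
As a quick numerical sketch of this formula (numpy here, not the RTT library), assume the meantone mapping in the form [⟨1 1 0] ⟨0 1 4]} used later in this article, and take 2/1 and 5/4 as the unchanged-intervals (pure octaves and pure major thirds, which is what characterizes quarter-comma meantone); the formula then recovers the quarter-comma generator embedding shown earlier.

<syntaxhighlight lang="python">
import numpy as np

M = np.array([[1, 1, 0], [0, 1, 4]])   # meantone mapping (assumed form)
U = np.array([[1, -2],                 # columns: 2/1 = [1 0 0⟩ and 5/4 = [-2 0 1⟩
              [0,  0],
              [0,  1]])

G = U @ np.linalg.inv(M @ U)           # G = U(MU)^-1
print(G)                               # [[1, 0], [0, 0], [0, 0.25]]: the quarter-comma meantone G
print(1200 * np.log2([2, 3, 5]) @ G)   # [1200, 696.578...]: octave and quarter-comma fifth, in cents
</syntaxhighlight>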


Pseudoinverse method

Similarly, we can take the pseudoinverse formula as presented in Dave Keenan & Douglas Blumeyer's guide to RTT: tuning computation#Pseudoinverse method, substitute [math]\displaystyle{ 𝒋G }[/math] for [math]\displaystyle{ 𝒈 }[/math], and cancel out:


[math]\displaystyle{ \begin{align} 𝒈 &= 𝒋\mathrm{T}W(M\mathrm{T}W)^{+} \\ 𝒋G &= 𝒋\mathrm{T}W(M\mathrm{T}W)^{+} \\ \cancel{𝒋}G &= \cancel{𝒋}\mathrm{T}W(M\mathrm{T}W)^{+} \\ G &= \mathrm{T}W(M\mathrm{T}W)^{+} \\ \end{align} }[/math]


Connection with the only held-intervals method

Note the similarity between the pseudoinverse formula [math]\displaystyle{ A^{+} = A^\mathsf{T}(AA^\mathsf{T})^{-1} }[/math] and the only held-intervals formula [math]\displaystyle{ G = \mathrm{U}(M\mathrm{U})^{-1} }[/math]; in fact, it's the same formula, if we simply substitute in [math]\displaystyle{ M^\mathsf{T} }[/math] for [math]\displaystyle{ \mathrm{U} }[/math].

What this tells us is that for any tuning of a temperament where [math]\displaystyle{ G = M^{+} }[/math], the unchanged-intervals are given by the transpose of the mapping, [math]\displaystyle{ M^\mathsf{T} }[/math]. (Historically this tuning scheme has been called "Frobenius", but we would call it "minimax-E-copfr-S".)

For example, in the [math]\displaystyle{ G = M^{+} }[/math] tuning of meantone temperament {1202.607 696.741], with mapping [math]\displaystyle{ M }[/math] equal to:


[math]\displaystyle{ \left[ \begin{array} {r} 1 & 1 & 0 \\ 0 & 1 & 4 \\ \end{array} \right] }[/math]


The unchanged-intervals are the columns of [math]\displaystyle{ M^\mathsf{T} }[/math]:


[math]\displaystyle{ \left[ \begin{array} {r} 1 & 0 \\ 1 & 1 \\ 0 & 4 \\ \end{array} \right] }[/math]


or in other words, the two unchanged-intervals are [1 1 0⟩ and [0 1 4⟩, which as ratios are [math]\displaystyle{ \frac61 }[/math] and [math]\displaystyle{ \frac{1875}{1} }[/math], respectively. Those may seem like some pretty strange intervals to be unchanged, for sure, but there is a way to think about it that makes it seem less strange. This tells us that whatever the error is on [math]\displaystyle{ \frac21 }[/math], it is the negation of the error on [math]\displaystyle{ \frac31 }[/math], because when those intervals are combined, we get a pure [math]\displaystyle{ \frac61 }[/math]. This also tells us that whatever the error is on [math]\displaystyle{ \frac31 }[/math], it is in turn the negation of the error on [math]\displaystyle{ \frac{625}{1} = \frac{5^4}{1} }[/math].[1] Also, remember that these intervals form a basis for the unchanged-intervals; any interval that is a linear combination of them is also unchanged.

As another example, the unchanged-interval of the primes miniRMS-U tuning of 12-ET would be [12 19 28⟩. Don't mistake that for the 12-ET map ⟨12 19 28]; that's the prime-count vector you get from transposing it! That interval, while rational and thus theoretically JI, could not be heard directly by humans, considering that [math]\displaystyle{ 2^{12}3^{19}5^{28} }[/math] is over 107 octaves above unison and would typically call for scientific notation to express; it's 128553.929 ¢, which is exactly 1289 ([math]\displaystyle{ = 12^2+19^2+28^2 }[/math]) iterations of the 99.732 ¢ generator for this tuning.
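
To make these two examples concrete, here is a small numpy sketch (ad hoc names, not the RTT library) that recomputes [math]\displaystyle{ 𝒈 = 𝒋M^{+} }[/math] for both temperaments and confirms that the claimed intervals really are unchanged, by comparing each one's tempered size with its just size.

<syntaxhighlight lang="python">
import numpy as np

j = 1200 * np.log2([2, 3, 5])

M = np.array([[1, 1, 0], [0, 1, 4]])     # meantone, in the form shown just above
g = j @ np.linalg.pinv(M)                # G = M+, so g = jM+
print(np.round(g, 3))                    # about [1202.607, 696.741]
for u in M:                              # the rows of M are the columns of M^T
    print(g @ (M @ u), j @ u)            # tempered size vs just size: equal, so u is unchanged

M12 = np.array([[12, 19, 28]])           # 12-ET
g12 = j @ np.linalg.pinv(M12)
print(np.round(g12, 3))                  # about [99.732]
u12 = np.array([12, 19, 28])             # the unchanged-interval [12 19 28⟩
print(g12 @ (M12 @ u12), j @ u12)        # both about 128553.93 cents
</syntaxhighlight>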

Example

Let's refer back to the example given in Dave Keenan & Douglas Blumeyer's guide to RTT: tuning computation#Plugging back in, picking up from this point:


[math]\displaystyle{ 𝒈 = \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T}C \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1.000 & \;\;\;0.000 & {-2.585} & 7.170 & {-3.322} & 0.000 & {-8.644} & 4.907 \\ 0.000 & 1.585 & 2.585 & {-3.585} & 0.000 & {-3.907} & 0.000 & 4.907 \\ 0.000 & 0.000 & 0.000 & 0.000 & 3.322 & 3.907 & 4.322 & {-4.907} \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} 1.000 & 0.000 \\ \hline 3.170 & {-4.755} \\ \hline 2.585 & {-7.755} \\ \hline 0.000 & 10.755 \\ \hline 6.644 & {-16.610} \\ \hline 3.907 & {-7.814} \\ \hline 4.322 & {-21.610} \\ \hline 0.000 & 9.814 \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} \\ \left[ \begin{array} {rrr} 0.0336 & 0.00824 \\ 0.00824 & 0.00293 \\ \end{array} \right] \end{array} }[/math]


In the original article, we simply multiplied through the entire right half of this expression. But what if we stopped before multiplying in the [math]\displaystyle{ 𝒋 }[/math] part, instead?


[math]\displaystyle{ 𝒈 = \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T}C(M\mathrm{T}C)^\mathsf{T}(M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} \\ \left[ \begin{array} {rrr} 1.003 & 0.599 \\ {-0.016} & 0.007 \\ 0.010 & {-0.204} \\ \end{array} \right] \end{array} }[/math]


The matrices with shapes [math]\displaystyle{ (3, 8)(8, 2)(2, 2) }[/math] led us to a [math]\displaystyle{ (3, \cancel{8})(\cancel{8}, \cancel{2})(\cancel{2}, 2) = (3, 2) }[/math]-shaped matrix, and that's just what we want in a [math]\displaystyle{ G }[/math] here. Specifically, we want a [math]\displaystyle{ (d, r) }[/math]-shaped matrix, one that will convert [math]\displaystyle{ (r, 1) }[/math]-shaped generator-count vectors — those that are results of mapping [math]\displaystyle{ (d, 1) }[/math]-shaped prime-count vectors by the temperament mapping matrix — back into [math]\displaystyle{ (d, 1) }[/math]-shaped prime-count vectors, but now representing the intervals as they sound under this tuning of this temperament.

And so we've found what we were looking for, [math]\displaystyle{ G = \mathrm{T}C(M\mathrm{T}C)^\mathsf{T}(M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} }[/math].

At first glance, this might seem surprising or crazy, that we find ourselves looking at musical intervals described by raising prime harmonics to powers that are precise fractions. But they do, in fact, work out to reasonable interval sizes. Let's check by actually working these generators out through their decimal powers.

This generator embedding [math]\displaystyle{ G }[/math] is telling us that the tuning of our first generator may be represented by the prime-count vector [1.003 -0.016 0.010, or in other words, it's the interval [math]\displaystyle{ 2^{1.003}3^{-0.016}5^{0.010} }[/math], which is equal to [math]\displaystyle{ 2.00018 }[/math], or 1200.159 ¢. As for the second generator, then, we find that [math]\displaystyle{ 2^{0.599}3^{0.007}5^{-0.205} = 1.0985 }[/math], or 162.664 ¢. By checking the porcupine article we can see that these are both reasonable generator sizes.

What we've just worked out with this sanity check is our generator tuning map, [math]\displaystyle{ 𝒈 }[/math]. In general we can find these by left-multiplying the generators [math]\displaystyle{ G }[/math] by [math]\displaystyle{ 𝒋 }[/math]:


[math]\displaystyle{ \begin{array} {ccc} \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} G \\ \left[ \begin{array} {rrr} 1.003 & 0.599 \\ {-0.016} & 0.007 \\ 0.010 & {-0.204} \\ \end{array} \right] \end{array} = \begin{array} {ccc} \mathbf{g} \\ \left[ \begin{array} {rrr} 1200.159 & 162.664 \\ \end{array} \right] \end{array} \end{array} }[/math]
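
For readers who want to reproduce this with a general-purpose tool rather than the RTT library, here is a numpy sketch. It rests on two assumptions, both consistent with the matrices shown above: the mapping is porcupine's [⟨1 2 3] ⟨0 -3 -5]}, and the columns of [math]\displaystyle{ \mathrm{T}C }[/math] correspond to the target-intervals 2/1, 3/1, 3/2, 4/3, 5/2, 5/3, 5/4, and 6/5 with log-product-complexity weights. Small differences in the final decimals are just rounding.

<syntaxhighlight lang="python">
import numpy as np

j = 1200 * np.log2([2, 3, 5])
M = np.array([[1, 2, 3], [0, -3, -5]])            # porcupine mapping (assumed)

ratios = [(2, 1), (3, 1), (3, 2), (4, 3), (5, 2), (5, 3), (5, 4), (6, 5)]
T = np.array([[ 1,  0, -1,  2, -1,  0, -2,  1],   # the same intervals as prime-count vector columns
              [ 0,  1,  1, -1,  0, -1,  0,  1],
              [ 0,  0,  0,  0,  1,  1,  1, -1]])
C = np.diag([np.log2(n * d) for n, d in ratios])  # log-product complexity weights

G = T @ C @ np.linalg.pinv(M @ T @ C)             # = TC(MTC)^T (MTC(MTC)^T)^-1
print(np.round(G, 3))                             # close to the (3, 2) generator embedding above
print(np.round(j @ G, 2))                         # about [1200.16, 162.66] cents
</syntaxhighlight>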


Pseudoinverse: the "how"

Here we will investigate how, mechanically speaking, the pseudoinverse almost magically takes us straight to that answer we want.

Like an inverse

As you might suppose — given a name like pseudoinverse — this thing is like a normal matrix inverse, but not exactly. True inverses are only defined for square matrices, so the pseudoinverse is essentially a way to make something similar available for non-square i.e. rectangular matrices. This is useful for RTT because the [math]\displaystyle{ M\mathrm{T}W }[/math] matrices we use it on are usually rectangular; they are always [math]\displaystyle{ (r, k) }[/math]-shaped matrices.

But why would we want to take the inverse of [math]\displaystyle{ M\mathrm{T}W }[/math] in the first place? To understand this, it will help to first simplify the problem.

  1. Our first simplification will be to use unity-weight damage, meaning that the weight on each of the target-intervals is the same, and may as well be 1. This makes our weight matrix [math]\displaystyle{ W }[/math] a matrix of all zeros with 1's running down the main diagonal, or in other words, it makes [math]\displaystyle{ W = I }[/math]. So we can eliminate it.
  2. Our second simplification is to consider the case where the target-interval set [math]\displaystyle{ \mathrm{T} }[/math] is the primes. This makes [math]\displaystyle{ \mathrm{T} }[/math] also equal to [math]\displaystyle{ I }[/math], so we can eliminate it as well.

At this point we're left with simply [math]\displaystyle{ M }[/math]. And this is still a rectangular matrix; it's [math]\displaystyle{ (r, d) }[/math]-shaped. So if we want to invert it, we'll only be able to pseudoinvert it. But we're still in the dark about why we would ever want to invert it.

To finally get to understanding why, let's look back to an expression discussed above, in the Algebraic setup section:


[math]\displaystyle{ GM \approx G_{\text{j}}M_{\text{j}} }[/math]


This expression captures the idea that a tuning based on [math]\displaystyle{ G }[/math] of a temperament [math]\displaystyle{ M }[/math] (the left side of this) is intended to approximate just intonation, where both [math]\displaystyle{ G_{\text{j}} = I }[/math] and [math]\displaystyle{ M_{\text{j}} = I }[/math] (the right side of this).

So given some mapping [math]\displaystyle{ M }[/math], which [math]\displaystyle{ G }[/math] makes that happen? Well, based on the above, it should be the inverse of [math]\displaystyle{ M }[/math]! That's because anything times its own inverse equals an identity, i.e. [math]\displaystyle{ M^{-1}M = I }[/math].

Definition of inverse

Multiplying by something to give an identity is, in fact, the very definition of "inverse". To illustrate, here's an example of a true inverse, in the case of [math]\displaystyle{ (2, 2) }[/math]-shaped matrices:


[math]\displaystyle{ \begin{array} {c} A^{-1} \\ \left[ \begin{array} {rrr} 1 & \frac23 \\ 0 & {-\frac13} \\ \end{array} \right] \end{array} \begin{array} {c} A \\ \left[ \begin{array} {rrr} 1 & 2 \\ 0 & {-3} \\ \end{array} \right] \end{array} \begin{array} {c} \\ = \end{array} \begin{array} {c} I \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


So the point is, if we could plug [math]\displaystyle{ M^{-1} }[/math] in for [math]\displaystyle{ G }[/math] here, we'd get a reasonable approximation of just intonation, i.e. an identity matrix [math]\displaystyle{ I }[/math].

But the problem is, as we know already, that [math]\displaystyle{ M^{-1} }[/math] doesn't exist, because [math]\displaystyle{ M }[/math] is a rectangular matrix. That's why we use its pseudoinverse [math]\displaystyle{ M^{+} }[/math] instead. Or to be absolutely clear, we choose our generator embedding [math]\displaystyle{ G }[/math] to be [math]\displaystyle{ M^{+} }[/math].

Sometimes an inverse

Now to be completely accurate, when we multiply a rectangular matrix by its pseudoinverse, we can also get an identity matrix, but only if we do it a certain way. (And this fact that we can get an identity matrix at all is a critical example of how the pseudoinverse provides inverse-like powers for rectangular matrices.) But there are still a few key differences between this situation and the situation of a square matrix and its true inverse:

  1. The first big difference is that in the case of square matrices, as we saw a moment ago, all the matrices have the same shape. However, for a non-square (rectangular) matrix with shape [math]\displaystyle{ (m, n) }[/math], it will have a pseudoinverse with shape [math]\displaystyle{ (n, m) }[/math]. This difference perhaps could have gone without saying.
  2. The second big difference is that in the case of square matrices, the multiplication order is irrelevant: you can either left-multiply the original matrix by its inverse or right-multiply it, and either way, you'll get the same identity matrix. But there's no way you could get the same identity matrix in the case of a rectangular matrix and its pseudoinverse; an [math]\displaystyle{ (m, n) }[/math]-shaped matrix times an [math]\displaystyle{ (n, m) }[/math]-shaped matrix gives an [math]\displaystyle{ (m, m) }[/math]-shaped matrix, while an [math]\displaystyle{ (n, m) }[/math]-shaped matrix times an [math]\displaystyle{ (m, n) }[/math]-shaped matrix gives an [math]\displaystyle{ (n, n) }[/math]-shaped matrix (the inner height and width always have to match, and the resulting matrix always has shape matching the outer width and height). So: either way we will get a square matrix, but one way we get an [math]\displaystyle{ (m, m) }[/math] shape, and the other way we get an [math]\displaystyle{ (n, n) }[/math] shape.
  3. The third big difference — and this is probably the most important one, but we had to build up to it by looking at the other two big differences first — is that only one of those two possible results of multiplying a rectangular matrix by its pseudoinverse will actually even give an identity matrix! It will be the one of the two that gives the smaller square matrix.

Example of when the pseudoinverse behaves like a true inverse

Here's an example with meantone temperament as [math]\displaystyle{ M }[/math]. Its pseudoinverse [math]\displaystyle{ M^{+} = M^\mathsf{T}(MM^\mathsf{T})^{-1} }[/math] is {[17 16 -4⟩ [16 17 4⟩]/33. First, we'll look at the multiplication order that gives an identity matrix, when the [math]\displaystyle{ (2, 3) }[/math]-shaped rectangular matrix right-multiplied by its [math]\displaystyle{ (3, 2) }[/math]-shaped rectangular pseudoinverse gives a [math]\displaystyle{ (2, 2) }[/math]-shaped square identity matrix:


[math]\displaystyle{ \begin{array} {c} M \\ \left[ \begin{array} {r} 1 & 0 & {-4} \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} \begin{array} {c} M^{+} \\ \left[ \begin{array} {c} \frac{17}{33} & \frac{16}{33} \\ \frac{16}{33} & \frac{17}{33} \\ {-\frac{4}{33}} & \frac{4}{33} \\ \end{array} \right] \end{array} \begin{array} {c} \\ = \end{array} \begin{array} {c} I \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Let's give an RTT way to interpret this first result. Basically it tells us that [math]\displaystyle{ M^{+} }[/math] might be a reasonable generator embedding [math]\displaystyle{ G }[/math] for this temperament. First of all, let's note that [math]\displaystyle{ M }[/math] was not specifically designed to handle non-JI intervals like those represented by the prime-count vector columns of [math]\displaystyle{ M^{+} }[/math], like we are making it do here. But we can get away with it anyway. And in this case, [math]\displaystyle{ M }[/math] maps the first column of [math]\displaystyle{ M^{+} }[/math] to the generator-count vector [1 0}, and its second column to the generator-count vector [0 1}; we can find these two vectors as the columns of the identity matrix [math]\displaystyle{ I }[/math].

Now, one fact we can take from this is that the first column of [math]\displaystyle{ M^{+} }[/math] — the non-JI vector [[math]\displaystyle{ \frac{17}{33} }[/math] [math]\displaystyle{ \frac{16}{33} }[/math] [math]\displaystyle{ \frac{-4}{33} }[/math]⟩ — shares at least one thing in common with other JI intervals such as [math]\displaystyle{ \frac21 }[/math] [1 0 0⟩, [math]\displaystyle{ \frac{81}{40} }[/math] [-3 4 -1⟩, and [math]\displaystyle{ \frac{160}{81} }[/math] [5 -4 1⟩: they all get mapped to [1 0} by this meantone mapping matrix [math]\displaystyle{ M }[/math]. Note that this is no guarantee that [[math]\displaystyle{ \frac{17}{33} }[/math] [math]\displaystyle{ \frac{16}{33} }[/math] [math]\displaystyle{ \frac{-4}{33} }[/math]⟩ is close to these intervals (in theory, we can add or subtract an indefinite number of temperament commas from an interval without altering what it maps to!), but it at least suggests that it's reasonably close to them, i.e. that it's about an octave in size.

And a similar statement can be made about the second column vector of [math]\displaystyle{ M^{+} }[/math], [[math]\displaystyle{ \frac{16}{33} }[/math] [math]\displaystyle{ \frac{17}{33} }[/math] [math]\displaystyle{ \frac{4}{33} }[/math]⟩, with respect to [math]\displaystyle{ \frac31 }[/math] [0 1 0⟩ and [math]\displaystyle{ \frac{80}{27} }[/math] [4 -3 1⟩, etc.: they all map to [0 1}, and so [[math]\displaystyle{ \frac{16}{33} }[/math] [math]\displaystyle{ \frac{17}{33} }[/math] [math]\displaystyle{ \frac{4}{33} }[/math]⟩ is probably about a perfect twelfth in size like the rest of them.

(In this case, both likelihoods are indeed true: our two tuned generators are 1202.607 ¢ and 696.741 ¢ in size.)

Example of when the pseudoinverse does not behave like a true inverse

Before we get to that, we should finish what we've got going here, and show for contrast what happens when we flip-flop [math]\displaystyle{ M }[/math] and [math]\displaystyle{ M^{+} }[/math], so that the [math]\displaystyle{ (3, 2) }[/math]-shaped rectangular pseudoinverse times the original [math]\displaystyle{ (2, 3) }[/math]-shaped rectangular matrix leads to a [math]\displaystyle{ (3, 3) }[/math]-shaped matrix which is not an identity matrix:


[math]\displaystyle{ \begin{array} {c} M^{+} \\ \left[ \begin{array} {c} \frac{17}{33} & \frac{16}{33} \\ \frac{16}{33} & \frac{17}{33} \\ -{\frac{4}{33}} & \frac{4}{33} \\ \end{array} \right] \end{array} \begin{array} {c} M \\ \left[ \begin{array} {r} 1 & 0 & {-4} \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} \begin{array} {c} \\ = \end{array} \begin{array} {c} M^{+}M \\ \left[ \begin{array} {c} \frac{17}{33} & \frac{16}{33} & {-\frac{4}{33}} \\ \frac{16}{33} & \frac{17}{33} & \frac{4}{33} \\ {-\frac{4}{33}} & \frac{4}{33} & \frac{32}{33} \\ \end{array} \right] \end{array} }[/math]


While this matrix [math]\displaystyle{ M^{+}M }[/math] clearly isn't an identity matrix, since it's not all zeros except for ones running along its main diagonal, and it doesn't really look anything like an identity matrix from a superficial perspective — just judging by the numbers we can read off its entries — it turns out that behavior-wise this matrix does actually work out to be as "close" to an identity matrix as we can get, at least in a certain sense. And since our goal with tuning this temperament was to approximate JI as closely as possible, from this certain mathematical perspective, this is the matrix that accomplishes that. But again, we'll get to why exactly this matrix is the one that accomplishes that in a little bit.
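
Here is a small numpy sketch of those two multiplication orders, using the same meantone mapping; numpy's built-in pseudoinverse gives the same [math]\displaystyle{ M^{+} }[/math] as shown above, and the two products come out as described.

<syntaxhighlight lang="python">
import numpy as np

M = np.array([[1, 0, -4], [0, 1, 4]])   # meantone, as above
M_pinv = np.linalg.pinv(M)

print(np.round(M_pinv * 33, 3))         # [[17, 16], [16, 17], [-4, 4]]: this is 33 times M+
print(np.round(M @ M_pinv, 3))          # the (2, 2) identity
print(np.round(M_pinv @ M, 3))          # the (3, 3) matrix above, which is not an identity
</syntaxhighlight>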

Un-simplifying

First, to show how we can un-simplify things. The insight leading to this choice of [math]\displaystyle{ G = M^{+} }[/math] was made under the simplifying circumstances of [math]\displaystyle{ W = I }[/math] (unity-weight damage) and [math]\displaystyle{ \mathrm{T} = \mathrm{T}_{\text{p}} = I }[/math] (primes as target-intervals). But nothing about those choices of [math]\displaystyle{ W }[/math] or [math]\displaystyle{ \mathrm{T} }[/math] affects how this method works; setting them to [math]\displaystyle{ I }[/math] was only to help us humans see the way forward. There's nothing stopping us now from using any other weights and target-intervals for [math]\displaystyle{ W }[/math] and [math]\displaystyle{ \mathrm{T} }[/math]; the concept behind this method still holds. Choosing [math]\displaystyle{ G = \mathrm{T}W(M\mathrm{T}W)^{+} }[/math], that is, still finds for us the [math]\displaystyle{ p = 2 }[/math] optimum for the problem.

Demystifying the formula

One way to think about what's happening in the formula of the pseudoinverse uses a technique we might call the "transform-act-antitransform technique": we want to take some action, but we can't do it in the current state, so we transform into a state where we can, then we take the action, and we finish off by performing the opposite of the initial transformation so that we get back to more of a similar state to the one we began with, yet having accomplished the action we intended.

In the case of the pseudoinverse, the action we want to take is inverting a matrix. But we can't exactly invert it, because [math]\displaystyle{ A }[/math] is rectangular (to understand why, you can review the inversion process here: matrix inversion by hand). We happen to know that a matrix times its transpose is invertible, though (more on that in a moment), so:

  1. Multiplying by the matrix's transpose, finding [math]\displaystyle{ AA^\mathsf{T} }[/math], becomes our "transform" step.
  2. Then we invert like we wanted to do originally, so that's the "act" step: [math]\displaystyle{ (AA^\mathsf{T})^{-1} }[/math].
  3. Finally, we might think that we should multiply by the inverse of the matrix's transpose in order to undo our initial transformation step; however, we actually simply repeat the same thing, that is, we multiply by the transpose again! This is because we've put the matrix into an inverted state, so multiplying by the original's transpose here is essentially the opposite transformation. So that's the whole formula, then: [math]\displaystyle{ A^\mathsf{T}(AA^\mathsf{T})^{-1} }[/math] (a quick numerical check of it follows below).
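
As that quick check, here is a numpy sketch confirming that this transform-act-antitransform formula agrees with numpy's built-in pseudoinverse; any full-row-rank rectangular matrix will do as [math]\displaystyle{ A }[/math] (the one below is just the meantone mapping from earlier).

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 0, -4], [0, 1, 4]])             # any full-row-rank rectangular matrix
explicit = A.T @ np.linalg.inv(A @ A.T)           # transform, act, anti-transform
print(np.allclose(explicit, np.linalg.pinv(A)))   # True
</syntaxhighlight>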

Now, as for why we know a matrix times its own transpose is invertible: there's a ton of little linear algebra facts that all converge to guarantee that this is so. Please consider the following diagram, which lays all these facts out at once.

 

Pseudoinverse: the "why"

In the previous section we took a look at how, mechanically, the pseudoinverse gives the solution for optimization power [math]\displaystyle{ p = 2 }[/math]. As for why, conceptually speaking, the pseudoinverse gives us the minimum point for the RMS graph in tuning damage space, it's sort of just one of those seemingly miraculously useful mathematical results. But we can try to give a basic explanation here.

Derivative, for slope

First, let's briefly go over some math facts. For some readers, these will be review:

  • The slope of a graph means its rate of change. When slope is positive, the graph is going up, and when negative, it's going down.
  • Wherever a graph has a local minimum or maximum, the slope is 0. That's because that's the point where it changes direction, between going up or down.
  • We can find the slope at every point of a graph by taking its derivative.

So, considering that we want to find the minimum of a graph, one approach should be to find the derivative of this graph, then find the point(s) where its value is 0, which is where the slope is 0. This means those are the possible points where we have a local minimum, which means therefore that those are the points where we maybe have a global minimum, which is what we're after.

A unique minimum

As discussed in the tuning fundamentals article (in the section Non-unique tunings - power continuum), the graphs of mean damage and max damage — which are equivalent to the power means with powers [math]\displaystyle{ p = 1 }[/math] and [math]\displaystyle{ p = ∞ }[/math], respectively — consist of straight line segments connected by sharp corners, while all other optimization powers between [math]\displaystyle{ 1 }[/math] and [math]\displaystyle{ ∞ }[/math] form smooth curves. This is important because it is only for smoothly curved graphs that we can use the derivative to find the minimum point; at the sharp corners of the other type of graph, the graph is still continuous, but it has no definite slope, so its derivative is undefined at those points. The simple mathematical methods we use to find slope for smooth graphs get all confused and crash or give wrong results if we try to use them on these types of graphs.

So we can use the derivative slope technique for other powers [math]\displaystyle{ 1 \lt p \lt ∞ }[/math], but the pseudoinverse will only match the solution when [math]\displaystyle{ p = 2 }[/math].

And, spoiler alert: another key thing that's true about the [math]\displaystyle{ 2 }[/math]-mean graph whose minimum point we seek: it has only one point where the slope is equal to 0, and it's our global minimum. Again, this is true of any of our curved [math]\displaystyle{ p }[/math]-mean graphs, but we only really care about it in the case of [math]\displaystyle{ p = 2 }[/math].

A toy example using the derivative

To get our feet on solid ground, let's just work through the math for an equal temperament example, i.e. one with only a single generator.

Kicking off with the setup discussed in the Algebraic setup section above, we have:


[math]\displaystyle{ \textbf{d} = 𝒋\,|\,(GM - G_{\text{j}}M_{\text{j}})\mathrm{T}W\,| }[/math]


Let's rewrite this a tad, using the fact that [math]\displaystyle{ 𝒋G }[/math] is our generator tuning map [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ 𝒋G_{\text{j}}M_{\text{j}} }[/math] is equivalent to simply [math]\displaystyle{ 𝒋 }[/math]:


[math]\displaystyle{ \textbf{d} = |\,(𝒈M - 𝒋)\mathrm{T}W\,| }[/math]


Let's say our rank-1 temperament is 12-ET, so our mapping [math]\displaystyle{ M }[/math] is ⟨12 19 28]. And our target-interval set is the otonal triad, so [math]\displaystyle{ \{ \frac54, \frac65, \frac32 \} }[/math]. And let's say we're complexity weighting, so [math]\displaystyle{ 𝒘 = \left[ \begin{array}{rrr} 4.322 & 4.907 & 2.585 \end{array} \right] }[/math], and [math]\displaystyle{ W }[/math] therefore is the diagonalized version of that (or [math]\displaystyle{ C }[/math] is the diagonalized version of [math]\displaystyle{ 𝒄 }[/math]). As for [math]\displaystyle{ 𝒈 }[/math]: in general it's a [math]\displaystyle{ (1, r) }[/math]-shaped matrix, and since this is a rank-1 temperament, here it's actually a [math]\displaystyle{ (1, 1) }[/math]-shaped matrix, and since we don't know what it is yet, its single entry is the variable [math]\displaystyle{ g_1 }[/math]. This can be understood to represent the size of our ET generator in cents.


[math]\displaystyle{ \textbf{d} = \Huge | \normalsize \begin{array} {ccc} 𝒈 \\ \left[ \begin{array} {rrr} g_1 \\ \end{array} \right] \end{array} \begin{array} {ccc} M \\ \left[ \begin{array} {rrr} 12 & 19 & 28 \\ \end{array} \right] \end{array} - \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r} {-2} & 1 & {-1} \\ 0 & 1 & 1 \\ 1 & {-1} & 0 \\ \end{array} \right] \end{array} \begin{array} {ccc} C \\ \left[ \begin{array} {rrr} 4.322 & 0 & 0 \\ 0 & 4.907 & 0 \\ 0 & 0 & 2.585 \\ \end{array} \right] \end{array} \Huge | \normalsize }[/math]


Here's what that looks like graphed:

 

As alluded to earlier, for rank-1 cases, it's pretty easy to read the value straight off the chart. Clearly we're expecting a generator size that's just a smidge bigger than 100 ¢. The point here is to understand the computation process.

So, let's simplify:


[math]\displaystyle{ \textbf{d} = \Huge | \normalsize \begin{array} {ccc} 𝒈M = 𝒕 \\ \left[ \begin{array} {rrr} 12g_1 & 19g_1 & 28g_1 \\ \end{array} \right] \end{array} - \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T}C \\ \left[ \begin{array} {r|r|r} {-8.644} & 4.907 & {-2.585} \\ 0 & 4.907 & 2.585 \\ 4.322 & {-4.907} & 0 \\ \end{array} \right] \end{array} \Huge | \normalsize }[/math]


Another pass:

[math]\displaystyle{ \textbf{d} = \Huge | \normalsize \begin{array} {ccc} 𝒕 - 𝒋 \\ \left[ \begin{array} {rrr} 12g_1 - 1200 & 19g_1 - 1901.955 & 28g_1 - 2786.31 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T}C \\ \left[ \begin{array} {r|r|r} {-8.644} & 4.907 & {-2.585} \\ 0 & 4.907 & 2.585 \\ 4.322 & {-4.907} & 0 \\ \end{array} \right] \end{array} \Huge | \normalsize }[/math]


And once more:


[math]\displaystyle{ \textbf{d} = \Huge | \normalsize \begin{array} {ccc} (𝒕 - 𝒋)\mathrm{T}C = 𝒓\mathrm{T}C = \textbf{e}C \\ \left[ \begin{array} {rrr} 17.288g_1 - 1669.605 & 14.721g_1 - 1548.835 & 18.095g_1 - 1814.526 \\ \end{array} \right] \end{array} \Huge | \normalsize }[/math]


And remember these bars are actually entry-wise absolute values, so we can put them on each entry. Though it actually won't matter much in a minute, since squaring makes everything non-negative anyway.


[math]\displaystyle{ \textbf{d} = \begin{array} {ccc} |\textbf{e}|C \\ \left[ \begin{array} {rrr} |17.288g_1 - 1669.605| & |14.721g_1 - 1548.835| & |18.095g_1 - 1814.526| \\ \end{array} \right] \end{array} }[/math]


[math]\displaystyle{ % \slant{} command approximates italics to allow slanted bold characters, including digits, in MathJax. \def\slant#1{\style{display:inline-block;margin:-.05em;transform:skew(-14deg)translateX(.03em)}{#1}} % Latex equivalents of the wiki templates llzigzag and rrzigzag for double zigzag brackets. \def\llzigzag{\hspace{-1.6mu}\style{display:inline-block;transform:scale(.62,1.24)translateY(.07em);font-family:sans-serif}{ꗨ\hspace{-3mu}ꗨ}\hspace{-1.6mu}} \def\rrzigzag{\hspace{-1.6mu}\style{display:inline-block;transform:scale(-.62,1.24)translateY(.07em);font-family:sans-serif}{ꗨ\hspace{-3mu}ꗨ}\hspace{-1.6mu}} }[/math] Because what we're going to do now is change this to the formula for the SOS of damage, that is, [math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 }[/math]:


[math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 = |17.288g_1 - 1669.605|^2 + |14.721g_1 - 1548.835|^2 + |18.095g_1 - 1814.526|^2 }[/math]


So we can get rid of those absolute value signs:


[math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 = (17.288g_1 - 1669.605)^2 + (14.721g_1 - 1548.835)^2 + (18.095g_1 - 1814.526)^2 }[/math]


Then we're just going to work these out:


[math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 = \small (17.288g_1 - 1669.605)(17.288g_1 - 1669.605) + (14.721g_1 - 1548.835)(14.721g_1 - 1548.835) + (18.095g_1 - 1814.526)(18.095g_1 - 1814.526) }[/math]


Distribute:


[math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 = \small (298.875g_1^2 - 57728.262g_1 + 2787580.856) + (216.708g_1^2 - 45600.800g_1 + 2398889.857) + (327.429g_1^2 - 65667.696g_1 + 3292504.605) }[/math]


Combine like terms:


[math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 = 843.012g_1^2 - 168996.758g_1 + 8478975.318 }[/math]


At this point, we take the derivative. Basically, each power of [math]\displaystyle{ g_1 }[/math] comes down to multiply its term's coefficient, and the exponent decreases by 1; we won't be doing a full review of this here, but good tutorials on that should be easy to find online.


[math]\displaystyle{ \dfrac{\partial}{\partial{g_1}} \llzigzag \textbf{d} \rrzigzag _2 = 2×843.012g_1 - 168996.758 }[/math]


This is the formula for the slope of the graph, and we want to know where it's equal to zero.


[math]\displaystyle{ 0 = 2×843.012g_1 - 168996.758 }[/math]


So we can now solve for [math]\displaystyle{ g_1 }[/math]:


[math]\displaystyle{ \begin {align} 0 &= 1686.024g_1 - 168996.758 \\[4pt] 168996.758 &= 1686.024g_1 \\[6pt] \dfrac{168996.758}{1686.024} &= g_1 \\[6pt] 100.234 &= g_1 \\ \end {align} }[/math]


Ta-da! There's our generator size: 100.234 ¢.[2]
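
Here is the same computation as a small numpy sketch (ad hoc names, not the RTT library). Note that carrying full precision, rather than the rounded intermediate values used above, moves the answer slightly, to about 100.236 ¢; we'll meet that figure again in the next section.

<syntaxhighlight lang="python">
import numpy as np

j = 1200 * np.log2([2, 3, 5])
M = np.array([[12, 19, 28]])
T = np.array([[-2,  1, -1],             # columns: 5/4, 6/5, 3/2
              [ 0,  1,  1],
              [ 1, -1,  0]])
C = np.diag(np.log2([20, 30, 6]))       # log-product complexity weights

a = (M @ T @ C).flatten()               # coefficients of g1 in each weighted error
b = j @ T @ C                           # the weighted just sizes being approximated
# SOS(g1) = sum((a*g1 - b)**2); its derivative 2*sum(a*(a*g1 - b)) is zero when:
g1 = (a @ b) / (a @ a)
print(round(g1, 3))                     # about 100.236
</syntaxhighlight>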

Verifying the toy example with the pseudoinverse

Okay... but what the heck does this have to do with a pseudoinverse? Well, for a sanity check, let's double-check against our pseudoinverse method.


[math]\displaystyle{ G = \mathrm{T}C(M\mathrm{T}C)^{+} = \mathrm{T}C(M\mathrm{T}C)^\mathsf{T}(M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} }[/math]


We already know [math]\displaystyle{ \mathrm{T}C }[/math] from an earlier step above. And so [math]\displaystyle{ M\mathrm{T}C }[/math] is:

[math]\displaystyle{ \begin{array} {ccc} M \\ \left[ \begin{array} {rrr} 12 & 19 & 28 \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T}C \\ \left[ \begin{array} {r|r|r} {-8.644} & 4.907 & {-2.585} \\ 0 & 4.907 & 2.585 \\ 4.322 & {-4.907} & 0 \\ \end{array} \right] \end{array} = \begin{array} {ccc} M\mathrm{T}C \\ \left[ \begin{array} {r|r|r} 17.288 & 14.721 & 18.095 \\ \end{array} \right] \end{array} }[/math]


So plugging these in we get:


[math]\displaystyle{ G = \begin{array} {ccc} \mathrm{T}C \\ \left[ \begin{array} {r|r|r} {-8.644} & 4.907 & {-2.585} \\ 0 & 4.907 & 2.585 \\ 4.322 & {-4.907} & 0 \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} 17.288 \\ \hline 14.721 \\ \hline 18.095 \\ \end{array} \right] \end{array} ( \begin{array} {ccc} M\mathrm{T}C \\ \left[ \begin{array} {r|r|r} 17.288 & 14.721 & 18.095 \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} 17.288 \\ \hline 14.721 \\ \hline 18.095 \\ \end{array} \right] \end{array} )^{-1} }[/math]


Which works out to:


[math]\displaystyle{ G = \begin{array} {ccc} \mathrm{T}C(M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} {-123.974} \\ 119.007 \\ 2.484 \\ \end{array} \right] \end{array} ( \begin{array} {ccc} M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} 842.983 \end{array} \right] \end{array} )^{-1} }[/math]


Then take the inverse (interestingly, since this is a [math]\displaystyle{ (1, 1) }[/math]-shaped matrix, this is equivalent to the reciprocal, that is, we're just finding [math]\displaystyle{ \frac{1}{842.983} = 0.00119 }[/math]):


[math]\displaystyle{ G = \begin{array} {ccc} \mathrm{T}C(M\mathrm{T}C)^\mathsf{T} \\ \left[ \begin{array} {rrr} {-123.974} \\ 119.007 \\ 2.484 \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} \\ \left[ \begin{array} {rrr} 0.00119 \end{array} \right] \end{array} }[/math]


And finally multiply:


[math]\displaystyle{ G = \begin{array} {ccc} \mathrm{T}C(M\mathrm{T}C)^\mathsf{T}(M\mathrm{T}C(M\mathrm{T}C)^\mathsf{T})^{-1} \\ \left[ \begin{array} {rrr} {-0.147066} \\ 0.141174 \\ 0.002946 \\ \end{array} \right] \end{array} }[/math]


To compare with our 100.234 ¢ value, we'll have to convert this [math]\displaystyle{ G }[/math] to a [math]\displaystyle{ 𝒈 }[/math], but that's easy enough. As we demonstrated earlier, simply multiply by [math]\displaystyle{ 𝒋 }[/math]:


[math]\displaystyle{ 𝒈 = \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} G \\ \left[ \begin{array} {rrr} {-0.147066} \\ 0.141174 \\ 0.002946 \\ \end{array} \right] \end{array} }[/math]


When we work through that, we get 100.236 ¢. Close enough (shrugging off rounding errors). So we've sanity-checked at least.
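
And here is that pseudoinverse check as a numpy sketch (again, illustrative names only); computed at full precision it lands on the same roughly 100.236 ¢ as the derivative approach.

<syntaxhighlight lang="python">
import numpy as np

j = 1200 * np.log2([2, 3, 5])
M = np.array([[12, 19, 28]])
T = np.array([[-2, 1, -1], [0, 1, 1], [1, -1, 0]])   # 5/4, 6/5, 3/2 as columns
C = np.diag(np.log2([20, 30, 6]))

G = T @ C @ np.linalg.pinv(M @ T @ C)                # G = TC(MTC)+
print(np.round(j @ G, 3))                            # about [100.236]
</syntaxhighlight>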

But if we really want to see the connection between the pseudoinverse and finding the zero of the derivative — how they both find the point where the slope of the RMS graph is zero and therefore it is at its minimum — we're going to have to upgrade from an equal temperament (rank-1 temperament) to a rank-2 temperament. In other words, we need to address tunings with more than one generator, ones that can't be represented by a simple scalar anymore, but instead need to be represented with a vector.

A demonstration using matrix calculus

Technically speaking, even with two generators, meaning two variables, we could take the derivative with respect to one, and then take the derivative with respect to the other. And with three generators we could take three derivatives. But this gets out of hand. And there's a cleverer way we can think about the problem, which involves treating the vector containing all the generators as a single variable. We can do that! But it involves matrix calculus. And in this section we'll work through how.

Graphing damage for a rank-2 temperament, as we've seen previously, means we'll be looking at 3D tuning damage space, with the [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] axes in perpendicular directions across the floor, and the [math]\displaystyle{ z }[/math]-axis coming up out of the floor, where the [math]\displaystyle{ x }[/math]-axis gives the tuning of one generator, the [math]\displaystyle{ y }[/math]-axis gives the tuning of the other generator, and the [math]\displaystyle{ z }[/math]-axis gives the temperament's damage as a function of those two generator tunings.

 

And while in 2D tuning damage space the RMS graph made something like a V-shape but with the tip rounded off, here it makes a cone, again with its tip rounded off.

 

Remember that although we like to think of it and visualize it as minimizing the [math]\displaystyle{ 2 }[/math]-mean of damage, it's equivalent, and simpler computationally, to minimize the [math]\displaystyle{ 2 }[/math]-sum. So here's our function:


[math]\displaystyle{ f(x, y) = \llzigzag \textbf{d} \rrzigzag _2 }[/math]


Which is the same as:


[math]\displaystyle{ f(x, y) = \textbf{d}\textbf{d}^\mathsf{T} }[/math]


Because:


[math]\displaystyle{ \textbf{d}\textbf{d}^\mathsf{T} = \\ \textbf{d}·\textbf{d} = \\ \mathrm{d}_1·\mathrm{d}_1 + \mathrm{d}_2·\mathrm{d}_2 + \mathrm{d}_3·\mathrm{d}_3 = \\ \mathrm{d}_1^2 + \mathrm{d}_2^2 + \mathrm{d}_3^2 }[/math]


Which is the same thing as the [math]\displaystyle{ 2 }[/math]-sum: it's the sum of entries to the 2nd power.

Alright, but we expect you may be concerned: [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] do not even appear in the body of the formula! Well, we can fix that.

As a first step toward resolving this problem, let's choose some better variable names. We had only chosen [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] because those are the most generic variable names available. They're very typically used when graphing things in Euclidean space like this. But we can definitely do better than those names, if we bring in some information more specific to our problem.

One thing we know is that these [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] variables are supposed to represent the tunings of our two generators. So let's call them [math]\displaystyle{ g_1 }[/math] and [math]\displaystyle{ g_2 }[/math] instead:


[math]\displaystyle{ f(g_1, g_2) = \textbf{d}\textbf{d}^\mathsf{T} }[/math]


But we can do even better than this. We're in a world of vectors, so why not express [math]\displaystyle{ g_1 }[/math] and [math]\displaystyle{ g_2 }[/math] together as a vector, [math]\displaystyle{ 𝒈 }[/math]. In other words, they're just a generator tuning map.


[math]\displaystyle{ f(𝒈) = \textbf{d}\textbf{d}^\mathsf{T} }[/math]


You may not be comfortable with the idea of a function of a vector (Douglas: I certainly wasn't when I first saw this!) but after working through this example and meditating on it for a while, you may be surprised to find it ceasing to seem so weird after all.

So we're still trying to connect the left and right sides of this equation by showing explicitly how this is a function of [math]\displaystyle{ 𝒈 }[/math], i.e. how [math]\displaystyle{ \textbf{d} }[/math] can be expressed in terms of [math]\displaystyle{ 𝒈 }[/math]. And we promise, we will get there soon enough.

Next, let's substitute in [math]\displaystyle{ (𝒕 - 𝒋)\mathrm{T}W }[/math] for [math]\displaystyle{ \textbf{d} }[/math]. In other words, the target-interval damage list is the difference between how the tempered-prime tuning map and the just-prime tuning map tune our target-intervals, weighted by each interval's weight. (Damage is normally also absolute-valued, but we can drop the absolute values here, since squaring in the [math]\displaystyle{ 2 }[/math]-sum wipes out any signs anyway.) But the number of symbols necessary to represent this equation is going to get out of hand if we do it exactly like this, so we're actually going to distribute first, finding [math]\displaystyle{ 𝒕\mathrm{T}W - 𝒋\mathrm{T}W }[/math], and then we're going to start following a pattern here of using Fraktur-style letters to represent matrices that are multiplied by [math]\displaystyle{ \mathrm{T}W }[/math], so that in our case [math]\displaystyle{ 𝖙 = 𝒕\mathrm{T}W }[/math] and [math]\displaystyle{ 𝖏 = 𝒋\mathrm{T}W }[/math]:


[math]\displaystyle{ f(𝒈) = (𝖙 - 𝖏)(𝖙^\mathsf{T} - 𝖏^\mathsf{T}) }[/math]


Now let's distribute these two binomials (you know, the old [math]\displaystyle{ (a + b)(c + d) = ac + ad + bc + bd }[/math] trick, AKA "FOIL" = first, outer, inner, last).


[math]\displaystyle{ f(𝒈) = 𝖙𝖙^\mathsf{T} - 𝖙𝖏^\mathsf{T} - 𝖏𝖙^\mathsf{T} + 𝖏𝖏^\mathsf{T} }[/math]


Because both [math]\displaystyle{ 𝖙𝖏^\mathsf{T} }[/math] and [math]\displaystyle{ 𝖏𝖙^\mathsf{T} }[/math] correspond to the dot product of [math]\displaystyle{ 𝖙 }[/math] and [math]\displaystyle{ 𝖏 }[/math], we can consolidate the two inner terms. Let's change [math]\displaystyle{ 𝖙𝖏^\mathsf{T} }[/math] into [math]\displaystyle{ 𝖏𝖙^\mathsf{T} }[/math], so that we will end up with [math]\displaystyle{ 2𝖏𝖙^\mathsf{T} }[/math] in the middle:


[math]\displaystyle{ f(𝒈) = 𝖙𝖙^\mathsf{T} - 2𝖏𝖙^\mathsf{T} + 𝖏𝖏^\mathsf{T} }[/math]


Alright! We're finally ready to surface [math]\displaystyle{ 𝒈 }[/math]. It's been hiding in [math]\displaystyle{ 𝒕 }[/math] all along; the tuning map is equal to the generator tuning map times the mapping, i.e. [math]\displaystyle{ 𝒕 = 𝒈M }[/math]. So we can just substitute that in everywhere. Exactly what we'll do is [math]\displaystyle{ 𝖙 = 𝒕\mathrm{T}W = (𝒈M)\mathrm{T}W = 𝒈(M\mathrm{T}W) = 𝒈𝔐 }[/math], that last step introducing a new Fraktur-style symbol.[3]


[math]\displaystyle{ f(𝒈) = (𝒈𝔐)(𝒈𝔐)^\mathsf{T} - 2𝖏(𝒈𝔐)^\mathsf{T} + 𝖏𝖏^\mathsf{T} }[/math]


And that gets sort of clunky, so let's execute some of those transposes. Note that when we transpose, the order of things reverses, so [math]\displaystyle{ (𝒈𝔐)^\mathsf{T} = 𝔐^\mathsf{T}𝒈^\mathsf{T} }[/math]:


[math]\displaystyle{ f(𝒈) = 𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T} }[/math]
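
If that chain of substitutions feels slippery, it can be spot-checked numerically. Here's a throwaway numpy sketch (the mapping, target-interval list, weights, and generator sizes are all just made up for the check; none of them come from a real tuning problem) confirming that this expanded expression equals [math]\displaystyle{ \textbf{d}\textbf{d}^\mathsf{T} }[/math] computed directly:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
r, d, k = 2, 3, 5                        # rank, dimensionality, target-interval count
M = rng.normal(size=(r, d))              # made-up mapping
T = rng.normal(size=(d, k))              # made-up target-interval list
W = np.diag(rng.uniform(0.5, 2.0, k))    # made-up weights
g = rng.uniform(100, 1300, size=(1, r))  # some arbitrary generator tuning map
j = np.array([[1200.0, 1901.955, 2786.314]])

frak_M = M @ T @ W   # 𝔐 = M T W
frak_j = j @ T @ W   # 𝖏 = 𝒋 T W

d_list   = g @ frak_M - frak_j           # 𝒕 T W − 𝒋 T W
direct   = d_list @ d_list.T             # 𝐝 𝐝ᵀ, computed directly
expanded = (g @ frak_M @ frak_M.T @ g.T
            - 2 * frak_j @ frak_M.T @ g.T
            + frak_j @ frak_j.T)

print(np.allclose(direct, expanded))     # True
</syntaxhighlight>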


And now, we're finally ready to take the derivative!


[math]\displaystyle{ \dfrac{\partial}{\partial𝒈}f(𝒈) = \dfrac{\partial}{\partial𝒈}(𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T}) }[/math]


And remember, we want to find the place where this function is equal to zero. So let's drop the [math]\displaystyle{ \dfrac{\partial}{\partial𝒈}f(𝒈) }[/math] part on the left, and show the [math]\displaystyle{ = \textbf{0} }[/math] part on the right instead (note the boldness of the [math]\displaystyle{ \textbf{0} }[/math]; this indicates that this is not simply a single zero, but a vector of all zeros, one for each generator).


[math]\displaystyle{ \dfrac{\partial}{\partial𝒈}(𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T}) = \textbf{0} }[/math]


Well, now we've come to it. We've run out of things we can do without confronting the question: how in the world do we take derivatives of matrices? This next part is going to require some of that matrix calculus we warned about. Fortunately, if one is previously familiar with normal algebraic differentiation rules, these will not seem too wild:

  1. The last term, [math]\displaystyle{ 𝖏𝖏^\mathsf{T} }[/math], is going to vanish, because with respect to [math]\displaystyle{ 𝒈 }[/math], it's a constant; there's no factor of [math]\displaystyle{ 𝒈 }[/math] in it.
  2. The middle term, [math]\displaystyle{ -2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} }[/math], has a single factor of [math]\displaystyle{ 𝒈 }[/math], so it will remain but with that factor gone. (Technically it's a factor of [math]\displaystyle{ 𝒈^\mathsf{T} }[/math], but for reasons that would probably require a deeper understanding of the subtleties of matrix calculus than the present author commands, it works out this way anyway. Perhaps we should have differentiated instead with respect to [math]\displaystyle{ 𝒈^\mathsf{T} }[/math], rather than [math]\displaystyle{ 𝒈 }[/math]?)[4]
  3. The first term, [math]\displaystyle{ 𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} }[/math], can in a way be seen to have a [math]\displaystyle{ 𝒈^2 }[/math], because it contains both a [math]\displaystyle{ 𝒈 }[/math] as well as a [math]\displaystyle{ 𝒈^\mathsf{T} }[/math] (and we demonstrated earlier how for a vector [math]\displaystyle{ \textbf{v} }[/math], there is a relationship between itself squared and it times its transpose); so, just as an [math]\displaystyle{ x^2 }[/math] differentiates to a [math]\displaystyle{ 2x }[/math], that is, the power is reduced by 1 and multiplies into any existing coefficient, this term becomes [math]\displaystyle{ 2𝒈𝔐𝔐^\mathsf{T} }[/math].

And so we find:


[math]\displaystyle{ 2𝒈𝔐𝔐^\mathsf{T} - 2𝖏𝔐^\mathsf{T} = \textbf{0} }[/math]


That's much nicer to look at, huh. Well, what next? Our goal is to solve for [math]\displaystyle{ 𝒈 }[/math], right? Then let's isolate the solitary remaining term with [math]\displaystyle{ 𝒈 }[/math] as a factor on one side of the equation:


[math]\displaystyle{ 2𝒈𝔐𝔐^\mathsf{T} = 2𝖏𝔐^\mathsf{T} }[/math]


Certainly we can cancel out the 2's on both sides; that's easy:


[math]\displaystyle{ 𝒈𝔐𝔐^\mathsf{T} = 𝖏𝔐^\mathsf{T} }[/math]


And, as we proved in the earlier section "Demystifying the formula", [math]\displaystyle{ AA^\mathsf{T} }[/math] is invertible, so we cancel that out on the left by multiplying both sides of the equation by [math]\displaystyle{ (𝔐𝔐^\mathsf{T})^{-1} }[/math]:


[math]\displaystyle{ 𝒈𝔐𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} = 𝖏𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} \\ 𝒈\cancel{𝔐𝔐^\mathsf{T}}\cancel{(𝔐𝔐^\mathsf{T})^{-1}} = 𝖏𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} \\ 𝒈 = 𝖏𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} }[/math]


Finally, remember that [math]\displaystyle{ 𝒈 = 𝒋G }[/math] and [math]\displaystyle{ 𝒋 = 𝒋G_{\text{j}}M_{\text{j}} }[/math], so we can replace those and cancel out some more stuff (also remember that [math]\displaystyle{ 𝖏 = 𝒋\mathrm{T}W }[/math]):


[math]\displaystyle{ (𝒋G) = (𝒋G_{\text{j}}M_{\text{j}})\mathrm{T}W𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} \\ \cancel{𝒋}G = \cancel{𝒋}\cancel{I}\cancel{I}\mathrm{T}W𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} }[/math]


And that part on the right looks pretty familiar...


[math]\displaystyle{ G = \mathrm{T}W𝔐^\mathsf{T}(𝔐𝔐^\mathsf{T})^{-1} \\ G = \mathrm{T}W𝔐^{+} \\ G = \mathrm{T}W(M\mathrm{T}W)^{+} }[/math]


Voilà! We've found our pseudoinverse-based [math]\displaystyle{ G }[/math] formula, finding it to be the [math]\displaystyle{ G }[/math] that gives the point of zero slope, i.e. the minimum point of the RMS damage graph.
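
As a quick numeric confirmation of that claim (just a sketch; the target-interval list and weights below are made up, and only the mapping is porcupine's, borrowed for concreteness), we can check that at [math]\displaystyle{ 𝒈 = 𝒋G }[/math] the gradient [math]\displaystyle{ 2𝒈𝔐𝔐^\mathsf{T} - 2𝖏𝔐^\mathsf{T} }[/math] comes out as the zero vector, and that nudging [math]\displaystyle{ 𝒈 }[/math] in any direction only increases the [math]\displaystyle{ 2 }[/math]-sum of damage:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
M = np.array([[1., 2., 3.], [0., -3., -5.]])  # porcupine mapping, for concreteness
T = rng.normal(size=(3, 6))                   # made-up target-interval list
W = np.diag(rng.uniform(0.5, 2.0, 6))         # made-up weights
j = np.array([[1200.0, 1901.955, 2786.314]])

frak_M = M @ T @ W
frak_j = j @ T @ W

G = T @ W @ frak_M.T @ np.linalg.inv(frak_M @ frak_M.T)  # T W (M T W)^+
g = j @ G

gradient = 2 * g @ frak_M @ frak_M.T - 2 * frak_j @ frak_M.T
print(np.allclose(gradient, 0, atol=1e-6))    # True: zero slope at this tuning

def two_sum_of_damage(g):
    d = g @ frak_M - frak_j
    return (d @ d.T).item()

print(all(two_sum_of_damage(g + nudge) >= two_sum_of_damage(g)
          for nudge in rng.normal(0, 1, size=(100, 1, 2))))  # True: nowhere downhill
</syntaxhighlight>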

If you're hungry for more information on these concepts, or even just another take on it, please see User:Sintel/Generator optimization#Least squares method.

With held-intervals

The pseudoinverse method can be adapted to handle tuning schemes which have held-intervals. The basic idea here is that we can no longer simply grab the tuning found as the point at the bottom of the tuning damage graph bowl hovering above the floor, because that tuning probably doesn't also happen to be one that leaves the requested interval unchanged. We can imagine an additional feature in our tuning damage space: the line across this bowl which connects every point where the generator tunings work out such that our interval is indeed unchanged. Again, this line probably doesn't pass straight through the bottommost point of our RMS-damage graph. But that's okay. That just means we could still decrease the overall damage further if we didn't hold the interval unchanged. But assuming we're serious about holding this interval unchanged, we've simply modified the problem a bit. Now we're looking for the point along this new held-interval line which is closest to the floor. Simple enough to understand, in concept! The rest of this section is dedicated to explaining how, mathematically speaking, we're able to identify that point. It still involves matrix calculus — derivatives of vectors, and such — but now we also pull in some additional ideas. We hope you dig it.[5]

We'll be talking through this problem assuming a three-dimensional tuning damage graph, which is to say, we're dealing with a rank-2 temperament (the two generator dimensions across the floor, and the damage dimension up from the floor). If we asked for more than one interval to be held unchanged, then we'd flip over to the "only held-intervals" method discussed later, because at that point there's only a single possible tuning. And if we asked for no intervals to be held unchanged, then we'd be back to the ordinary pseudoinverse method which you've already learned. So for this extended example we'll be assuming one held-interval. But the principles discussed here generalize to higher dimensions of temperaments and more held-intervals, if the dimensionality supports them. These higher dimensional examples are more difficult to visualize, though, of course, and so we've chosen the simplest possibility that sufficiently demonstrates the ideas we need to learn.

Topographic view

[math]\displaystyle{ % Latex equivalents of the wiki templates llzigzag and rrzigzag for double zigzag brackets. % Annoyingly, we need slightly different Latex versions for the different Latex sizes. \def\smallLLzigzag{\hspace{-1.4mu}\style{display:inline-block;transform:scale(.62,1.24)translateY(.05em);font-family:sans-serif}{ꗨ\hspace{-2.6mu}ꗨ}\hspace{-1.4mu}} \def\smallRRzigzag{\hspace{-1.4mu}\style{display:inline-block;transform:scale(-.62,1.24)translateY(.05em);font-family:sans-serif}{ꗨ\hspace{-2.6mu}ꗨ}\hspace{-1.4mu}} \def\llzigzag{\hspace{-1.6mu}\style{display:inline-block;transform:scale(.62,1.24)translateY(.07em);font-family:sans-serif}{ꗨ\hspace{-3mu}ꗨ}\hspace{-1.6mu}} \def\rrzigzag{\hspace{-1.6mu}\style{display:inline-block;transform:scale(-.62,1.24)translateY(.07em);font-family:sans-serif}{ꗨ\hspace{-3mu}ꗨ}\hspace{-1.6mu}} \def\largeLLzigzag{\hspace{-1.8mu}\style{display:inline-block;transform:scale(.62,1.24)translateY(.09em);font-family:sans-serif}{ꗨ\hspace{-3.5mu}ꗨ}\hspace{-1.8mu}} \def\largeRRzigzag{\hspace{-1.8mu}\style{display:inline-block;transform:scale(-.62,1.24)translateY(.09em);font-family:sans-serif}{ꗨ\hspace{-3.5mu}ꗨ}\hspace{-1.8mu}} \def\LargeLLzigzag{\hspace{-2.5mu}\style{display:inline-block;transform:scale(.62,1.24)translateY(.1em);font-family:sans-serif}{ꗨ\hspace{-4.5mu}ꗨ}\hspace{-2.5mu}} \def\LargeRRzigzag{\hspace{-2.5mu}\style{display:inline-block;transform:scale(-.62,1.24)translateY(.1em);font-family:sans-serif}{ꗨ\hspace{-4.5mu}ꗨ}\hspace{-2.5mu}} }[/math] Back in the fundamentals article, we briefly demonstrated a special way to visualize a 3-dimensional tuning damage graph 2-dimensionally: in a topographic view, where the [math]\displaystyle{ z }[/math]-axis is pointing straight out of the page, and is represented by contour lines tracing out the shapes of points which share the same [math]\displaystyle{ z }[/math]-value. In the case of a tuning damage graph, then, this will show concentric rings (not necessarily circles) around the lowest point of our damage bowl, representing how damage increases smoothly in any direction you take away from that minimum point. So far we haven't made much use of this visualization approach, but for tuning schemes with [math]\displaystyle{ p=2 }[/math] and at least one held-interval, it's the perfect tool for the job.

So now we draw our held-interval line across this topographic view.

 

Our first guess at the lowest point on this line might be the point closest to the actual minimum damage. Good guess, but not necessarily true. It would be true if the rings were exactly circles. But they're not necessarily; they might be oblong, and the skew may not be at an obvious angle with respect to the held-interval line. So for a generalized means of finding the lowest point on the held-interval line, we need to think a bit deeper about the problem.

The first step to understanding better is to adjust our contour lines. The obvious place to start was at increments of 1 damage. But we're going to want to rescale so that one of our contour lines exactly touches the held-interval line. To be clear, we're not changing the damage graph at all; we're simply changing how we visualize it on this topographic view.

 

The point where this contour line touches the held-interval line, then, is the lowest point on the held-interval line, that is, the point among all those where the held-interval is indeed unchanged where the overall damage to the target-intervals is the least. This should be easy enough to see, because if you step just an infinitesimally small amount in either direction along the held-interval line, you will no longer be touching the contour line, but rather you will be just outside of it, which means you have slightly higher damage than whatever constant damage amount that contour traces.

 

Next, we need to figure out how to identify this point. It may seem frustrating, because we're looking right at it! But we don't already have formulas for these contour lines.

Matching slopes

In order to identify this point, it's going to be more helpful to look at the entire graph of our held-interval's error. That is, rather than only drawing the line where it's zero:


[math]\displaystyle{ 𝒕\mathrm{H} - 𝒋\mathrm{H} = 0 }[/math]


We'll draw the whole thing:


[math]\displaystyle{ 𝒕\mathrm{H} - 𝒋\mathrm{H} }[/math]


If the original graph was like a line drawn diagonally across the floor, the full graph looks like that same line but with a tilted plane running through it: on one side ascending up and out from the floor, on the other side descending down and into the floor. In the topographic view, then, this graph will appear as equally-spaced lines parallel to the original line, emanating outwards in both directions from it.

 

Next, we want to see some little arrows along all of these contour lines, both for the damage graph and for the held-interval graph, which point perpendicularly to them.

 

What these little arrows represent are the derivatives of these graphs at those points, or in other words, the slope. If this isn't clear, it might help to step back for a moment to 2D, and draw little arrows in a similar fashion:

 

In higher dimensions, the generalized way to think about slope is that it's the vector pointing in the direction of steepest slope upwards from the given point.

Now, we're not attempting to distinguish the sizes of these slopes here. We could do that, perhaps by changing the relative scale of the arrows. But that's not particularly important for our purposes. We only need to notice the different directions these slopes point.

You may recall that in the simpler case — with no held-intervals — we identified the point at the bottom of the bowl using derivatives; this point is where the derivative (slope) is equal to zero. Well, what can we notice about the point we're seeking to identify? It's where the slopes of the RMS damage graph for the target-intervals and the error of the held-interval match!

 

So, our first draft of our goal might look something like this:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈}}( \llzigzag \textbf{d} \rrzigzag _2) = \dfrac{\partial}{\partial{𝒈}}(𝒓\mathrm{H}) }[/math]


But that's not quite specific enough. To ensure we grab a point satisfying that condition, but also ensure that it's on our held-interval line, we could simply add another equation:


[math]\displaystyle{ \begin{align} \dfrac{\partial}{\partial{𝒈}}( \llzigzag \textbf{d} \rrzigzag _2) &= \dfrac{\partial}{\partial{𝒈}}(𝒓\mathrm{H}) \\[12pt] 𝒓\mathrm{H} &= 0 \end{align} }[/math]


But there's another special way of asking for the same thing, that isn't as obvious-looking, but consolidates it all down to a single equation, which — due to some mathemagic — eventually works out to give us a really nice solution. Here's what that looks like:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}( \llzigzag \textbf{d} \rrzigzag _2) = \dfrac{\partial}{\partial{𝒈, λ}}(λ𝒓\mathrm{H}) }[/math]


What we've done here is added a new variable [math]\displaystyle{ λ }[/math][6], a multiplier which scales the error in the interval we want to be unchanged. We can visualize its effect as saying: we don't care about the relative lengths of these two vectors; we only care about wherever they point in exactly the same direction. This trick works as long as we take the derivative with respect to [math]\displaystyle{ λ }[/math] as well, which you'll note we're doing now too.[7] We don't expect this to be clear straight away; the reason this works will probably only become clear in later steps of working through the problem.

Let's rework our equation a bit to make things nicer. One thing we can do is put both terms on one side of the equation, equalling zero (rather, the zero vector, with a bolded zero):


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}( \llzigzag \textbf{d} \rrzigzag _2) - \dfrac{\partial}{\partial{𝒈, λ}}(λ𝒓\mathrm{H}) = \textbf{0} }[/math]


And now we can consolidate the derivatives:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}( \llzigzag \textbf{d} \rrzigzag _2 - λ𝒓\mathrm{H}) = \textbf{0} }[/math]


We're going to switch from subtraction to addition here. How can we get away with that? Well, it just changes what [math]\displaystyle{ λ }[/math] comes out to; it'll just flip the sign on it. But we'll get the same answer either way. And we won't actually need to do anything with the value of [math]\displaystyle{ λ }[/math] in the end; we only need to know the answers to the generator sizes in [math]\displaystyle{ 𝒈 }[/math].


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}( \llzigzag \textbf{d} \rrzigzag _2 + λ𝒓\mathrm{H}) = \textbf{0} }[/math]


Similarly, we can do this without changing the result:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}(\frac12 \llzigzag \textbf{d} \rrzigzag _2 + λ𝒓\mathrm{H}) = \textbf{0} }[/math]


That'll make the maths work out nicer, and just means [math]\displaystyle{ λ }[/math] will be half the size it would have been otherwise.

So: we're looking for the value of [math]\displaystyle{ 𝒈 }[/math]. But [math]\displaystyle{ 𝒈 }[/math] doesn't appear in the equation yet. That's because it's hiding inside [math]\displaystyle{ \textbf{d} }[/math] and [math]\displaystyle{ 𝒓 }[/math]. We won't bother repeating all the steps from the simpler case; we'll just replace [math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _2 }[/math] with [math]\displaystyle{ 𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T} }[/math]. And as for [math]\displaystyle{ 𝒓 }[/math], that's just [math]\displaystyle{ 𝒕 - 𝒋 }[/math], or [math]\displaystyle{ 𝒈M - 𝒋 }[/math]. So we have:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}(\frac12(𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T}) + λ(𝒈M - 𝒋)\mathrm{H}) = \textbf{0} }[/math]


And let's just distribute stuff so we have a simple summation:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, λ}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + λ𝒈M\mathrm{H} - λ𝒋\mathrm{H}) = \textbf{0} }[/math]


Everything in that expression other than [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ λ }[/math] is a known value; only [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ λ }[/math] are variables.

As a final change, we're going to recognize the fact that for higher-dimensional temperaments, we might sometimes have multiple held-intervals. Which is to say that our new variable might actually itself be a vector! So we'll use a bold [math]\displaystyle{ \textbf{λ} }[/math] here to capture that idea.[8] (The example we will periodically demonstrate with will still have only one held-interval; that's fine, though: in that case [math]\displaystyle{ \textbf{λ} }[/math] is just a one-entry vector, whose only entry is [math]\displaystyle{ λ_1 }[/math].) Note that we need to locate [math]\displaystyle{ \textbf{λ} }[/math] on the right side of each term now, so that its [math]\displaystyle{ h }[/math] height matches up with the [math]\displaystyle{ h }[/math] width of [math]\displaystyle{ \mathrm{H} }[/math].


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, \textbf{λ}}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = \textbf{0} }[/math]


Now in the simpler case, when we took the derivative simply with respect to [math]\displaystyle{ 𝒈 }[/math], we could almost treat the vectors and matrices like normal variables when taking derivatives: exponents came down as coefficients, and exponents decremented by 1. But now that we're taking the derivative with respect to both [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ \textbf{λ} }[/math], the clearest way forward is to understand this in terms of a system of equations, rather than a single equation of matrices and vectors.

Multiple derivatives

One way of thinking about what we're asking for with [math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, \textbf{λ}}} }[/math] is that we want the vector whose entries are partial derivatives with respect to each scalar entry of [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ \textbf{λ} }[/math]. We hinted at this earlier when we introduced the bold-zero vector [math]\displaystyle{ \textbf{0} }[/math], which represented a zero for each generator. So if:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈}} \llzigzag \textbf{d} \rrzigzag _2 = \\ \dfrac{\partial}{\partial{𝒈}} ( 𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 2𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + 𝖏𝖏^\mathsf{T}) = \\ \dfrac{\partial}{\partial{𝒈}} f(𝒈) \\ }[/math]


Then if we find the miniRMS tuning to be [math]\displaystyle{ 𝒈 }[/math] = {1198.857 162.966], that tells us that:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈}} f(\left[ \begin{array} {c} 1198.857 & 162.966 \\ \end{array} \right]) = \textbf{0} = \left[ \begin{array} {c} 0 & 0 \\ \end{array} \right] }[/math]


Or in other words:


[math]\displaystyle{ \dfrac{\partial}{\partial{g_1}} f(\left[ \begin{array} {c} 1198.857 & 162.966 \\ \end{array} \right]) = 0 \\ \dfrac{\partial}{\partial{g_2}} f(\left[ \begin{array} {c} 1198.857 & 162.966 \\ \end{array} \right]) = 0 }[/math]


And so if we plug some other [math]\displaystyle{ 𝒈 }[/math] into this derivative of [math]\displaystyle{ f() }[/math], what we get out is some vector [math]\displaystyle{ \textbf{v} }[/math] telling us the slope of the damage graph at the tuning represented by that generator tuning map:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈}} f(\left[ \begin{array} {c} 1200.000 & 163.316 \\ \end{array} \right]) = \textbf{v} }[/math]


Or in other words:


[math]\displaystyle{ \dfrac{\partial}{\partial{g_1}} f(\left[ \begin{array} {c} 1200.000 & 163.316 \\ \end{array} \right]) = v_1 \\ \dfrac{\partial}{\partial{g_2}} f(\left[ \begin{array} {c} 1200.000 & 163.316 \\ \end{array} \right]) = v_2 \\ }[/math]


So when we ask for:


[math]\displaystyle{ \dfrac{\partial}{\partial{𝒈, \textbf{λ}}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = \textbf{0} = \left[ \begin{array} {c} 0 & 0 & 0 \\ \end{array} \right] }[/math]


What we really want under the hood is the derivative with respect to [math]\displaystyle{ g_1 }[/math] to be 0, the derivative with respect to [math]\displaystyle{ g_2 }[/math] to be 0, and also the derivative with respect to [math]\displaystyle{ λ_1 }[/math] to be 0:


[math]\displaystyle{ \dfrac{\partial}{\partial{g_1}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = 0 \\ \dfrac{\partial}{\partial{g_2}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = 0 \\ \dfrac{\partial}{\partial{λ_1}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = 0 \\ }[/math]


So, this essentially gives us a vector whose entries are derivatives, and which can be thought of as an arrow in space pointing in the multidimensional direction of the slope of the graph at a point. Sometimes these vector derivatives are called "gradients" and notated with an upside-down triangle, but we're just going to stick with the more familiar algebraic terminology here for our purposes.

To give a quick and dirty answer to the question posed earlier regarding why introducing [math]\displaystyle{ \textbf{λ} }[/math] is a replacement of any sort for the obvious equation [math]\displaystyle{ 𝒓\mathrm{H} = 0 }[/math], notice what the derivative of the third equation will be. We'll work it out in rigorous detail soon, but for now, let's just observe how [math]\displaystyle{ \dfrac{\partial}{\partial{λ_1}}(\frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} - 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} + \frac12𝖏𝖏^\mathsf{T} + 𝒈M\mathrm{H}\textbf{λ} - 𝒋\mathrm{H}\textbf{λ}) = 𝒈M\mathrm{H} - 𝒋\mathrm{H} }[/math]. So if that's equal to 0, and [math]\displaystyle{ 𝒓 }[/math] can be rewritten as [math]\displaystyle{ 𝒕 - 𝒋 }[/math] and further as [math]\displaystyle{ 𝒈M - 𝒋 }[/math], then we can see how this has covered our bases re: [math]\displaystyle{ 𝒓\mathrm{H} = 0 }[/math], while also providing the connective tissue to the other equations re: using [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ \textbf{λ} }[/math] to minimize damage to our target-intervals. That's because [math]\displaystyle{ \textbf{λ} }[/math] figures in terms in the first two equations which also have a [math]\displaystyle{ 𝒈 }[/math] in them, so whatever it comes out to will affect those; this is how we achieve the offsetting from the actual bottom of the damage bowl.

Break down matrices

In order to work this out, though, we'll need to break our occurrences of [math]\displaystyle{ 𝒈 }[/math] down into [math]\displaystyle{ g_1 }[/math] and [math]\displaystyle{ g_2 }[/math] (and [math]\displaystyle{ \textbf{λ} }[/math] down into [math]\displaystyle{ λ_1 }[/math]).

So let's take this daunting task on, one term at a time. Term one of five:


[math]\displaystyle{ \frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} }[/math]


Remember, [math]\displaystyle{ 𝔐 = M\mathrm{T}W }[/math]. We haven't specified our target-interval count [math]\displaystyle{ k }[/math]. Whatever it is, though, if we were to drill all the way down to the [math]\displaystyle{ m_{ij} }[/math], [math]\displaystyle{ t_{ij} }[/math], and [math]\displaystyle{ w_{ij} }[/math] level here as we are doing with [math]\displaystyle{ 𝒈 }[/math], then the entries of [math]\displaystyle{ 𝔐 }[/math] would be so complicated that they'd be hard to fit on the page, with dozens of summed up terms. And the entries of [math]\displaystyle{ 𝔐𝔐^\mathsf{T} }[/math] would be even crazier! So let's not.

Besides, we don't need to drill down into [math]\displaystyle{ M }[/math], [math]\displaystyle{ \mathrm{T} }[/math], or [math]\displaystyle{ W }[/math] in the same way we need to drill down into [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ \mathbf{λ} }[/math], because they're not variables we need to differentiate by; they're all just known constants, information about the temperament we're tuning and the tuning scheme according to which we're tuning it. So why would we drill down into those? Well, we won't.

Instead, let's take an approach where in each term, we'll multiply together every matrix other than [math]\displaystyle{ 𝒈 }[/math] and [math]\displaystyle{ \mathbf{λ} }[/math], then use letters [math]\displaystyle{ \mathrm{A} }[/math], [math]\displaystyle{ \mathrm{B} }[/math], [math]\displaystyle{ \mathrm{C} }[/math], [math]\displaystyle{ \mathrm{D} }[/math], and [math]\displaystyle{ \mathrm{E} }[/math] to identify the results as matrices of constants, one different letter of the alphabet for each term. And while we may not need to have drilled down to the matrix entry level in [math]\displaystyle{ M }[/math], [math]\displaystyle{ \mathrm{T} }[/math], or [math]\displaystyle{ W }[/math], we do at least need to drill down to the entry level of these constant matrices.

So, in the case of our first term, we'll be replacing [math]\displaystyle{ 𝔐𝔐^\mathsf{T} }[/math] with [math]\displaystyle{ \mathrm{A} }[/math]. And if we've set [math]\displaystyle{ r=2 }[/math], then this is a matrix with shape [math]\displaystyle{ (2,2) }[/math], so it'll have entries [math]\displaystyle{ \mathrm{a}_{11} }[/math], [math]\displaystyle{ \mathrm{a}_{12} }[/math], [math]\displaystyle{ \mathrm{a}_{21} }[/math], and [math]\displaystyle{ \mathrm{a}_{22} }[/math]. We've indicated shapes below each matrix in the following:


[math]\displaystyle{ \begin{align} \frac12 \begin{array} {c} 𝒈 \\ \left[ \begin{array} {r} g_1 & g_2 \\ \end{array} \right] \\ \small (1,2) \end{array} \begin{array} {c} \mathrm{A} \\ \left[ \begin{array} {r} \mathrm{a}_{11} & \mathrm{a}_{12} \\ \mathrm{a}_{21} & \mathrm{a}_{22} \\ \end{array} \right] \\ \small (2,2) \end{array} \begin{array} {c} 𝒈^\mathsf{T} \\ \left[ \begin{array} {r} g_1 \\ g_2 \\ \end{array} \right] \\ \small (2,1) \end{array} &= \\[12pt] \frac12 \begin{array} {c} 𝒈 \\ \left[ \begin{array} {r} g_1 & g_2 \\ \end{array} \right] \\ \small (1,2) \end{array} \begin{array} {c} \mathrm{A}𝒈^\mathsf{T} \\ \left[ \begin{array} {r} \mathrm{a}_{11}g_1 + \mathrm{a}_{12}g_2 \\ \mathrm{a}_{21}g_1 + \mathrm{a}_{22}g_2 \\ \end{array} \right] \\ \small (2,1) \end{array} &= \\[12pt] \frac12 \begin{array} {c} 𝒈\mathrm{A}𝒈^\mathsf{T} \\ \left[ \begin{array} {r} (\mathrm{a}_{11}g_1 + \mathrm{a}_{12}g_2)g_1 + (\mathrm{a}_{21}g_1 + \mathrm{a}_{22}g_2)g_2 \end{array} \right] \\ \small (1,1) \end{array} &= \\[12pt] \frac12\mathrm{a}_{11}g_1^2 + \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1g_2 + \frac12\mathrm{a}_{22}g_2^2 \end{align} }[/math]


Yes, there's a reason we haven't pulled the [math]\displaystyle{ \frac12 }[/math] into the constant matrix, despite it clearly being a constant. It's the same reason we deliberately introduced it to our equation out of nowhere earlier. We'll see soon enough.

Now let's work out the second term, [math]\displaystyle{ 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} }[/math]. Again, we should do as little as possible other than breaking down [math]\displaystyle{ 𝒈 }[/math]. So with [math]\displaystyle{ 𝖏 }[/math] a [math]\displaystyle{ (1, k) }[/math]-shaped matrix and [math]\displaystyle{ 𝔐^\mathsf{T} }[/math] a [math]\displaystyle{ (k, r) }[/math]-shaped matrix, those two together are a [math]\displaystyle{ (1, r) }[/math]-shaped matrix, and [math]\displaystyle{ r=2 }[/math] in our example. And that's our [math]\displaystyle{ \mathrm{B} }[/math]. So:


[math]\displaystyle{ \begin{align} \begin{array} {c} \mathrm{B} \\ \left[ \begin{array} {r} \mathrm{b}_{11} & \mathrm{b}_{12} \\ \end{array} \right] \\ \small (1,2) \end{array} \begin{array} {c} 𝒈^\mathsf{T} \\ \left[ \begin{array} {r} g_1 \\ g_2 \\ \end{array} \right] \\ \small (2,1) \end{array} &= \\[12pt] \begin{array} {c} \mathrm{B}𝒈^\mathsf{T} \\ \left[ \begin{array} {r} \mathrm{b}_{11}g_1 + \mathrm{b}_{12}g_2 \\ \end{array} \right] \\ \small (1,1) \end{array} &= \\[12pt] \mathrm{b}_{11}g_1 + \mathrm{b}_{12}g_2 \end{align} }[/math]


Third term to break down: [math]\displaystyle{ \frac12𝖏𝖏^\mathsf{T} }[/math]. This one has neither a [math]\displaystyle{ 𝒈 }[/math] nor a [math]\displaystyle{ \textbf{λ} }[/math] in it, and is a [math]\displaystyle{ (1, 1) }[/math]-shaped matrix, so all we have to do is get it into our constant form: [math]\displaystyle{ \frac12\mathrm{c}_{11} }[/math] (for consistency, leaving the [math]\displaystyle{ \frac12 }[/math] alone, though this one matters less).

Fourth term to break down: [math]\displaystyle{ 𝒈M\mathrm{H}\textbf{λ} }[/math]. Well, [math]\displaystyle{ M\mathrm{H} }[/math] is a [math]\displaystyle{ (r, d)(d, h) = (r, h) }[/math]-shaped matrix, and we know [math]\displaystyle{ r=2 }[/math] and [math]\displaystyle{ h=1 }[/math], so our constant matrix [math]\displaystyle{ \mathrm{D} }[/math] is a [math]\displaystyle{ (2, 1) }[/math]-shaped matrix.


[math]\displaystyle{ \begin{align} \begin{array} {c} 𝒈 \\ \left[ \begin{array} {r} g_1 & g_2 \\ \end{array} \right] \\ \small (1, 2) \end{array} \begin{array} {c} \mathrm{D} \\ \left[ \begin{array} {r} \mathrm{d}_{11} \\ \mathrm{d}_{12} \\ \end{array} \right] \\ \small (2, 1) \end{array} \begin{array} {c} \textbf{λ} \\ \left[ \begin{array} {r} λ_1 \\ \end{array} \right] \\ \small (1, 1) \end{array} &= \\[12pt] \begin{array} {c} 𝒈\mathrm{D} \\ \left[ \begin{array} {r} \mathrm{d}_{11}g_1 + \mathrm{d}_{12}g_2 \end{array} \right] \\ \small (1,1) \end{array} \begin{array} {c} \textbf{λ} \\ \left[ \begin{array} {r} λ_1 \\ \end{array} \right] \\ \small (1, 1) \end{array} &= \\[12pt] \begin{array} {c} 𝒈\mathrm{D}\textbf{λ} \\ \left[ \begin{array} {r} (\mathrm{d}_{11}g_1 + \mathrm{d}_{12}g_2)λ_1 \end{array} \right] \\ \small (1,1) \end{array} &= \\[12pt] \mathrm{d}_{11}g_1λ_1 + \mathrm{d}_{12}g_2λ_1 \end{align} }[/math]


Okay, the fifth and final term to break down: [math]\displaystyle{ 𝒋\mathrm{H}\textbf{λ} }[/math]. This one's on the quicker side: we can just rewrite it as [math]\displaystyle{ \mathrm{e}_{11}λ_1 }[/math].

Now we just have to put all five of those rewritten terms back together!


[math]\displaystyle{ \begin{array} \frac12𝒈𝔐𝔐^\mathsf{T}𝒈^\mathsf{T} & - & 𝖏𝔐^\mathsf{T}𝒈^\mathsf{T} & + & \frac12𝖏𝖏^\mathsf{T} & + & 𝒈M\mathrm{H}\textbf{λ} & - & 𝒋\mathrm{H}\textbf{λ} & = \\ \frac12𝒈\mathrm{A}𝒈^\mathsf{T} & - & \mathrm{B}𝒈^\mathsf{T} & + & \frac12\mathrm{C} & + & 𝒈\mathrm{D}\textbf{λ} & - & \mathrm{E}\textbf{λ} & = \\ \frac12\mathrm{a}_{11}g_1^2 + \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1g_2 + \frac12\mathrm{a}_{22}g_2^2 & - & \mathrm{b}_{11}g_1 - \mathrm{b}_{12}g_2 & + & \frac12\mathrm{c}_{11} & + & \mathrm{d}_{11}g_1λ_1 + \mathrm{d}_{12}g_2λ_1 & - & \mathrm{e}_{11}λ_1 & \end{array} }[/math]


Now that we've gotten our expression in terms of [math]\displaystyle{ g_1 }[/math], [math]\displaystyle{ g_2 }[/math], and [math]\displaystyle{ λ_1 }[/math], we are ready to take our three different derivatives of this, once with respect to each of those three scalar variables (and finally we can see why we introduced the factor of [math]\displaystyle{ \frac12 }[/math]: so that when the exponents of 2 come down as coefficients, they cancel out; well, that's only a partial answer, we suppose, but suffice it to say that if we hadn't done this, later steps wouldn't match up quite right).


[math]\displaystyle{ \begin{array} {c} f(𝒈, \textbf{λ}) & = & \frac12\mathrm{a}_{11}g_1^2 & + & \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1g_2 & + & \frac12\mathrm{a}_{22}g_2^2 & - & \mathrm{b}_{11}g_1 & - & \mathrm{b}_{12}g_2 & + & \frac12\mathrm{c}_{11} & + & \mathrm{d}_{11}g_1λ_1 & + & \mathrm{d}_{12}g_2λ_1 & - & \mathrm{e}_{11}λ_1 & \\ \dfrac{\partial}{\partial{g_1}}f(𝒈, \textbf{λ}) & = & \mathrm{a}_{11}g_1 & + & \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_2 & + & 0 & - & \mathrm{b}_{11} & - & 0 & + & 0 & + & \mathrm{d}_{11}λ_1 & + & 0 & - & 0 \\ \dfrac{\partial}{\partial{g_2}}f(𝒈, \textbf{λ}) & = & 0 & + & \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1 & + & \mathrm{a}_{22}g_2 & - & 0 & - & \mathrm{b}_{12} & + & 0 & + & 0 & + & \mathrm{d}_{12}λ_1 & - & 0 \\ \dfrac{\partial}{\partial{λ_1}}f(𝒈, \textbf{λ}) & = & 0 & + & 0 & + & 0 & - & 0 & - & 0 & + & 0 & + & \mathrm{d}_{11}g_1 & + & \mathrm{d}_{12}g_2 & - & \mathrm{e}_{11} \\ \end{array} }[/math]


And so, replacing the derivatives in our system, we find:


[math]\displaystyle{ \begin{align} \mathrm{a}_{11}g_1 + \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_2 - \mathrm{b}_{11} + \mathrm{d}_{11}λ_1 &= 0 \\ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1 + \mathrm{a}_{22}g_2 - \mathrm{b}_{12} + \mathrm{d}_{12}λ_1 &= 0 \\ \mathrm{d}_{11}g_1 + \mathrm{d}_{12}g_2 - \mathrm{e}_{11} &= 0 \\ \end{align} }[/math]
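
By the way, these three partial derivatives are easy to double-check with a computer algebra system. Here's a small sympy sketch (with symbols named to mirror the constants above) that differentiates the scalar expression and recovers exactly this system:

<syntaxhighlight lang="python">
import sympy as sp

g1, g2, l1 = sp.symbols('g1 g2 lambda1')
a11, a12, a21, a22, b11, b12, c11, d11, d12, e11 = sp.symbols(
    'a11 a12 a21 a22 b11 b12 c11 d11 d12 e11')

# the fully broken-down expression from the previous section
f = (sp.Rational(1, 2) * (a11*g1**2 + (a12 + a21)*g1*g2 + a22*g2**2)
     - (b11*g1 + b12*g2)
     + sp.Rational(1, 2) * c11
     + (d11*g1 + d12*g2) * l1
     - e11 * l1)

print(sp.expand(sp.diff(f, g1)))  # a11*g1 + a12*g2/2 + a21*g2/2 - b11 + d11*lambda1
print(sp.expand(sp.diff(f, g2)))  # a12*g1/2 + a21*g1/2 + a22*g2 - b12 + d12*lambda1
print(sp.expand(sp.diff(f, l1)))  # d11*g1 + d12*g2 - e11
</syntaxhighlight>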


Build matrices back up

In this section we'd like to work our way from this rather clunky and tedious system of equations back to matrices. As our first step, let's space our derivative equations' terms out nicely so we can better understand the relationships between them:


[math]\displaystyle{ \begin{array} {c} \mathrm{a}_{11}g_1 & + & \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_2 & + & \mathrm{d}_{11}λ_1 & - & \mathrm{b}_{11} & = & 0 \\ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1 & + & \mathrm{a}_{22}g_2 & + & \mathrm{d}_{12}λ_1 & - & \mathrm{b}_{12} & = & 0\\ \mathrm{d}_{11}g_1 & + & \mathrm{d}_{12}g_2 & & & - & \mathrm{e}_{11} & = & 0\\ \end{array} }[/math]


Next, notice that all of the terms that contain none of our variables are negative. Let's get all of them to the other side of their respective equations:


[math]\displaystyle{ \begin{array} {c} \mathrm{a}_{11}g_1 & + & \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_2 & + & \mathrm{d}_{11}λ_1 & = & \mathrm{b}_{11} \\ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21})g_1 & + & \mathrm{a}_{22}g_2 & + & \mathrm{d}_{12}λ_1 & = & \mathrm{b}_{12} \\ \mathrm{d}_{11}g_1 & + & \mathrm{d}_{12}g_2 & & & = & \mathrm{e}_{11} \\ \end{array} }[/math]


Notice also that none of our terms contain more than one of our variables anymore. Let's reorganize these terms in a table according to which variable they contain:


equation [math]\displaystyle{ g_1 }[/math] [math]\displaystyle{ g_2 }[/math] [math]\displaystyle{ λ_1 }[/math] (no variable, i.e. constants only)
1 [math]\displaystyle{ \mathrm{a}_{11} }[/math] [math]\displaystyle{ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21}) }[/math] [math]\displaystyle{ \mathrm{d}_{11} }[/math] [math]\displaystyle{ \mathrm{b}_{11} }[/math]
2 [math]\displaystyle{ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21}) }[/math] [math]\displaystyle{ \mathrm{a}_{22} }[/math] [math]\displaystyle{ \mathrm{d}_{12} }[/math] [math]\displaystyle{ \mathrm{b}_{12} }[/math]
3 [math]\displaystyle{ \mathrm{d}_{11} }[/math] [math]\displaystyle{ \mathrm{d}_{12} }[/math] - [math]\displaystyle{ \mathrm{e}_{11} }[/math]


This reorganization is the first step to seeing how we can pull ourselves back into matrix form. Notice some patterns here. The constants are all grouped together by which term they came from. This means we can go back to thinking of this system of equations as a single equation of matrices, replacing these chunks with the original constant matrices:


equation [math]\displaystyle{ g_1 }[/math] [math]\displaystyle{ g_2 }[/math] [math]\displaystyle{ λ_1 }[/math] (no variable, i.e. constants only)
1 [math]\displaystyle{ \mathrm{A} }[/math] [math]\displaystyle{ \mathrm{D} }[/math] [math]\displaystyle{ \mathrm{B}^\mathsf{T} }[/math]
2
3 [math]\displaystyle{ \mathrm{D}^\mathsf{T} }[/math] - [math]\displaystyle{ \mathrm{E}^\mathsf{T} }[/math]


The replacements for [math]\displaystyle{ \mathrm{B} }[/math] and [math]\displaystyle{ \mathrm{D} }[/math] may seem obvious enough, but you may initially balk at the replacement of [math]\displaystyle{ \mathrm{A} }[/math] here; there's a reason that works, though. It's due to the fact that the thing [math]\displaystyle{ \mathrm{A} }[/math] represents is the product of a matrix and its own transpose, which means entries mirrored across the main diagonal are equal to each other. So if [math]\displaystyle{ \mathrm{a}_{12} = \mathrm{a}_{21} }[/math], then [math]\displaystyle{ \frac12(\mathrm{a}_{12} + \mathrm{a}_{21}) = \mathrm{a}_{12} = \mathrm{a}_{21} }[/math]. Feel free to check this yourself, or compare with our work-through in the footnote here.[9]

Also note that we made [math]\displaystyle{ \mathrm{E} }[/math] transposed; it's hard to tell because it's a [math]\displaystyle{ (1, 1) }[/math]-shaped matrix, but if we did have more than one held-interval, this'd be more apparent.

And so now we can go back to our original variables.


equation [math]\displaystyle{ g_1 }[/math] [math]\displaystyle{ g_2 }[/math] [math]\displaystyle{ λ_1 }[/math] (no variable, i.e. constants only)
1 [math]\displaystyle{ 𝔐𝔐^\mathsf{T} }[/math] [math]\displaystyle{ M\mathrm{H} }[/math] [math]\displaystyle{ (𝖏𝔐^\mathsf{T})^\mathsf{T} }[/math]
2
3 [math]\displaystyle{ (M\mathrm{H})^\mathsf{T} }[/math] - [math]\displaystyle{ (𝒋\mathrm{H})^\mathsf{T} }[/math]


And if we think about how matrix multiplication works, we can realize that the headings are just a vector containing our variables. And so the rest is just a couple of augmented matrices. We can fill the matrix with zeros where we don't have any constants. And remember, the data entries in the last column of this table are actually on the right side of the equals signs:


[math]\displaystyle{ \left[ \begin{array} {c|c} \\ \quad 𝔐𝔐^\mathsf{T} \quad & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right] \left[ \begin{array} {c} g_1 \\ g_2 \\ \hline λ_1 \\ \end{array} \right] = \left[ \begin{array} {c} \\ (𝖏𝔐^\mathsf{T})^\mathsf{T} \\ \hline (𝒋\mathrm{H})^\mathsf{T} \\ \end{array} \right] }[/math]


But we prefer to think of our generators in a row vector, or map. And everything on the right half is transposed. So we can address both of those issues by transposing everything. Remember, when we transpose, we also reverse the order. Conveniently, because the augmented matrix on the left side of the equation is symmetric across its main diagonal, transposing it does not change its value:


[math]\displaystyle{ \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] \left[ \begin{array} {c|c} \\ \quad 𝔐𝔐^\mathsf{T} \quad & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right] = \left[ \begin{array} {c|c} \quad 𝖏𝔐^\mathsf{T} \quad & 𝒋\mathrm{H} \\ \end{array} \right] }[/math]


The big matrix is invertible, so we can multiply both sides by its inverse to move it to the other side, to help us solve for [math]\displaystyle{ g_1 }[/math] and [math]\displaystyle{ g_2 }[/math]:


[math]\displaystyle{ \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] = \left[ \begin{array} {c|c} \quad 𝖏𝔐^\mathsf{T} \quad & 𝒋\mathrm{H} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ \quad 𝔐𝔐^\mathsf{T} \quad & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right]^{\large -1} }[/math]


And let's go back from [math]\displaystyle{ 𝖏 }[/math] to [math]\displaystyle{ 𝒋\mathrm{T}W }[/math] and [math]\displaystyle{ 𝔐 }[/math] to [math]\displaystyle{ M\mathrm{T}W }[/math]:


[math]\displaystyle{ \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] = \left[ \begin{array} {c|c} 𝒋\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & 𝒋\mathrm{H} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right]^{\large -1} }[/math]


And extract the [math]\displaystyle{ 𝒋 }[/math] from the right:


[math]\displaystyle{ \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] = 𝒋 \left[ \begin{array} {c|c} \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & \mathrm{H} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right]^{\large -1} }[/math]


At this point you may begin to notice the similarity between this and the pseudoinverse method. We looked at the pseudoinverse as [math]\displaystyle{ G = \mathrm{T}W(M\mathrm{T}W)^{+} = \mathrm{T}W(M\mathrm{T}W)^\mathsf{T}(M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T})^{-1} }[/math], but all we need to do is multiply both sides by [math]\displaystyle{ 𝒋 }[/math] and we get [math]\displaystyle{ 𝒈 = 𝒋\mathrm{T}W(M\mathrm{T}W)^\mathsf{T}(M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T})^{-1} }[/math], which looks almost the same as the above, only without any of the augmentations that are there to account for the held-intervals:


[math]\displaystyle{ \left[ \begin{array} {cc} g_1 & g_2 \\ \end{array} \right] = 𝒋 \left[ \begin{array} {c} \quad \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \quad \\ \end{array} \right] \left[ \begin{array} {c} \quad M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \quad \\ \end{array} \right]^{-1} }[/math]


And so, without held-intervals, the generators can be found as the pseudoinverse of [math]\displaystyle{ M\mathrm{T}W }[/math] (left-multiplied by [math]\displaystyle{ \mathrm{T}W }[/math]); with held-intervals, they can be found as almost the same thing, just with some augmentations to the matrices. This augmentation results in an extra value at the end, [math]\displaystyle{ λ_1 }[/math], but we don't need it and can just discard it. Ta da!

Hardcoded example

At this point everything on the right side of this equation is known. Let's actually plug in some numbers to convince ourselves this makes sense. Suppose we go with an unchanged octave, porcupine temperament, the 6-TILT, and unity-weight damage (and of course, optimization power [math]\displaystyle{ 2 }[/math]). Then we have:

[math]\displaystyle{ \begin{array} {c} 𝒋 \\ \left[ \begin{array} {c} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{H} \\ \left[ \begin{array} {c} 1 \\ 0 \\ 0 \\ \end{array} \right] \end{array} , \begin{array} {c} M \\ \left[ \begin{array} {c} 1 & 2 & 3 \\ 0 & {-3} & {-5} \\ \end{array} \right] \end{array} , \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} , \begin{array} {ccc} W \\ \left[ \begin{array} {c} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Before we can plug into our formula, we need to compute a few things. Let's start with [math]\displaystyle{ M\mathrm{H} }[/math]:


[math]\displaystyle{ \begin{array} {c} M \\ \left[ \begin{array} {c} 1 & 2 & 3 \\ 0 & {-3} & {-5} \\ \end{array} \right] \end{array} \begin{array} {c} \mathrm{H} \\ \left[ \begin{array} {c} 1 \\ 0 \\ 0 \\ \end{array} \right] \end{array} = \begin{array} {c} M\mathrm{H} \\ \left[ \begin{array} {c} 1 \\ 0 \\ \end{array} \right] \end{array} }[/math]


As for [math]\displaystyle{ \mathrm{T}W }[/math], that's easy, because [math]\displaystyle{ W }[/math] — being a unity-weight matrix — is an identity matrix, so it's equal simply to [math]\displaystyle{ \mathrm{T} }[/math]. But regarding [math]\displaystyle{ M\mathrm{T}W = M\mathrm{T} }[/math], that would be helpful to compute in advance:


[math]\displaystyle{ \begin{array} {c} M \\ \left[ \begin{array} {c} 1 & 2 & 3 \\ 0 & {-3} & {-5} \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} = \begin{array} {ccc} M\mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & 2 & {1} & \;\;\;0 & 2 & 1 & 1 & \;\;0 \\ 0 & {-3} & {-3} & 3 & {-5} & {-2} & {-5} & 2 \\ \end{array} \right] \end{array} }[/math]


And so [math]\displaystyle{ M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} }[/math] would be:


[math]\displaystyle{ \begin{array} {ccc} M\mathrm{T}W \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & 2 & {1} & \;\;\;0 & 2 & 1 & 1 & \;\;0 \\ 0 & {-3} & {-3} & 3 & {-5} & {-2} & {-5} & 2 \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {c} 1 & 0 \\ \hline 2 & {-3} \\ \hline {1} & {-3} \\ \hline 0 & 3 \\ \hline 2 & {-5} \\ \hline 1 & {-2} \\ \hline 1 & {-5} \\ \hline 0 & 2 \\ \end{array} \right] \end{array} = \begin{array} {ccc} M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {c} 12 & {-26} \\ {-26} & 85 \\ \end{array} \right] \end{array} }[/math]


And finally, [math]\displaystyle{ \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} }[/math]:


[math]\displaystyle{ \begin{array} {ccc} \mathrm{T}W \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} \begin{array} {ccc} (M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {c} 1 & 0 \\ \hline 2 & {-3} \\ \hline {1} & {-3} \\ \hline 0 & 3 \\ \hline 2 & {-5} \\ \hline 1 & {-2} \\ \hline 1 & {-5} \\ \hline 0 & 2 \\ \end{array} \right] \end{array} = \begin{array} {ccc} \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {c} {-4} & 26 \\ 2 & {-5} \\ 4 & {-14} \\ \end{array} \right] \end{array} }[/math]


Now we just have to plug all that into our formula for [math]\displaystyle{ 𝒈 }[/math] (and [math]\displaystyle{ \textbf{λ} }[/math], though again, we don't really care what it comes out to):


[math]\displaystyle{ \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] = 𝒋 \left[ \begin{array} {c|c} \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & \mathrm{H} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \right]^{\large -1} }[/math]


So that's:


[math]\displaystyle{ \begin{align} \left[ \begin{array} {cc|c} g_1 & g_2 & λ_1 \\ \end{array} \right] &= \begin{array} {c} 𝒋 \\ \left[ \begin{array} {c} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {c} \begin{array} {c|c} \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & \mathrm{H} \\ \end{array} \\ \left[ \begin{array} {cc|c} {-4} & 26 & 1 \\ 2 & {-5} & 0 \\ 4 & {-14} & 0 \\ \end{array} \right] \end{array} \begin{array} {c} \begin{array} {c|c} M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} & M\mathrm{H} \\ \hline \quad (M\mathrm{H})^\mathsf{T} \quad & 0 \\ \end{array} \\ \left[ \begin{array} {cc|c} 12 & {-26} & 1 \\ {-26} & 85 & 0 \\ \hline 1 & 0 & 0 \\ \end{array} \right]^{\large -1} \end{array} \\ &= \left[ \begin{array} {cc|c} 1200.000 & 163.316 & {-4.627} \\ \end{array} \right] \end{align} }[/math]


So as expected, our [math]\displaystyle{ λ_1 }[/math] value came out negative, because of our sign-switching earlier. But what we're really interested in are the first two entries of that map, which are [math]\displaystyle{ g_1 }[/math] and [math]\displaystyle{ g_2 }[/math]. Our desired [math]\displaystyle{ 𝒈 }[/math] is {1200.000 163.316]. Huzzah!

For comparison's sake, we can repeat this, but without the unchanged octave:


[math]\displaystyle{ \begin{align} \left[ \begin{array} {c} g_1 & g_2 \\ \end{array} \right] &= \begin{array} {c} 𝒋 \\ \left[ \begin{array} {c} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {c} \mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {c} {-4} & 26 \\ 2 & {-5} \\ 4 & {-14} \\ \end{array} \right] \end{array} \begin{array} {c} M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T} \\ \left[ \begin{array} {cc} 12 & {-26} \\ {-26} & 85 \\ \end{array} \right]^{\large -1} \end{array} \\ &= \left[ \begin{array} {cc} 1198.857 & 162.966 \\ \end{array} \right] \end{align} }[/math]


And that's all there is to it.[10]
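
If you'd like to reproduce this hardcoded example on your own machine, here's a numpy sketch of the whole thing, using exactly the matrices given above (the variable names are just ad hoc):

<syntaxhighlight lang="python">
import numpy as np

j = np.array([[1200.000, 1901.955, 2786.314]])  # just-prime tuning map
H = np.array([[1], [0], [0]])                   # held-interval: the octave
M = np.array([[1, 2, 3], [0, -3, -5]])          # porcupine mapping
T = np.array([[ 1,  0, -1,  2, -1,  0, -2,  1], # 6-TILT target-interval list
              [ 0,  1,  1, -1,  0, -1,  0,  1],
              [ 0,  0,  0,  0,  1,  1,  1, -1]])
W = np.eye(8)                                   # unity weights

MTW = M @ T @ W

# augmented system: [g1 g2 | λ1] [[MTW(MTW)^T, MH], [(MH)^T, 0]] = [𝒋TW(MTW)^T | 𝒋H]
left  = np.block([[MTW @ MTW.T, M @ H],
                  [(M @ H).T,   np.zeros((1, 1))]])
right = np.hstack([j @ T @ W @ MTW.T, j @ H])
print(np.round(right @ np.linalg.inv(left), 3))  # ≈ [[1200.  163.316  -4.627]]

# without the held octave, it's just the plain pseudoinverse formula:
g = j @ T @ W @ MTW.T @ np.linalg.inv(MTW @ MTW.T)
print(np.round(g, 3))                            # ≈ [[1198.857  162.966]]
</syntaxhighlight>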

For all-interval tuning schemes

So far we've looked at how to use the linear algebra operation called the pseudoinverse to compute miniRMS tunings. We can use a variation of that approach to compute Euclideanized all-interval tunings. So where miniRMS tuning schemes are those where the optimization power [math]\displaystyle{ p }[/math] is equal to [math]\displaystyle{ 2 }[/math], all-interval minimax-ES tuning schemes are those where the dual norm power [math]\displaystyle{ \text{dual}(q) }[/math] is equal to [math]\displaystyle{ 2 }[/math].

Setup

The pseudoinverse of a matrix [math]\displaystyle{ A }[/math] is notated as [math]\displaystyle{ A^{+} }[/math], and for convenience, here's its equation again:


[math]\displaystyle{ A^{+} = A^\mathsf{T}(AA^\mathsf{T})^{-1} }[/math]
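
As a quick aside, for any matrix with full row rank this explicit formula agrees with a library implementation of the pseudoinverse; here's a throwaway numpy check on a random matrix (nothing temperament-specific about it):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(2, 8))   # any full-row-rank matrix will do

explicit = A.T @ np.linalg.inv(A @ A.T)
print(np.allclose(explicit, np.linalg.pinv(A)))   # True
</syntaxhighlight>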


For ordinary tunings, we find [math]\displaystyle{ G }[/math] to be:


[math]\displaystyle{ G = \mathrm{T}W(M\mathrm{T}W)^{+} = \mathrm{T}W(M\mathrm{T}W)^\mathsf{T}(M\mathrm{T}W(M\mathrm{T}W)^\mathsf{T})^{-1} }[/math]


So for all-interval tunings, we simply substitute in our all-interval analogous objects, and find it to be:


[math]\displaystyle{ G = \mathrm{T}_{\text{p}}S_{\text{p}}(M\mathrm{T}_{\text{p}}S_{\text{p}})^{+} = \mathrm{T}_{\text{p}}S_{\text{p}}(M\mathrm{T}_{\text{p}}S_{\text{p}})^\mathsf{T}(M\mathrm{T}_{\text{p}}S_{\text{p}}(M\mathrm{T}_{\text{p}}S_{\text{p}})^\mathsf{T})^{-1} }[/math]


That's a lot of [math]\displaystyle{ \mathrm{T}_{\text{p}} }[/math], though, and we know those are equal to [math]\displaystyle{ I }[/math], so let's eliminate them:


[math]\displaystyle{ G = S_{\text{p}}(MS_{\text{p}})^{+} = S_{\text{p}}(MS_{\text{p}})^\mathsf{T}(MS_{\text{p}}(MS_{\text{p}})^\mathsf{T})^{-1} }[/math]


Example

So suppose we want the minimax-ES tuning of meantone temperament, where [math]\displaystyle{ M }[/math] = [{1 1 0] {0 1 4]} and [math]\displaystyle{ C_{\text{p}} = L }[/math]. Basically we just need to compute [math]\displaystyle{ MS_{\text{p}} }[/math]:


[math]\displaystyle{ \begin{array}{c} M \\ \left[ \begin{array} {r} 1 & 1 & 0 \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} \begin{array}{c} S_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 \\ 0 & 0 & \frac{1}{\log_2(5)} \\ \end{array} \right] \end{array} = \begin{array}{c} MS_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & \frac{1}{\log_2(3)} & 0 \\ 0 & \frac{1}{\log_2(3)} & \frac{4}{\log_2(5)} \\ \end{array} \right] \end{array} }[/math]


And plug that in a few times, two of them transposed:


[math]\displaystyle{ G = \begin{array}{c} S_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 \\ 0 & 0 & \frac{1}{\log_2(5)} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 \\ \frac{1}{\log_2(3)} & \frac{1}{\log_2(3)} \\ 0 & \frac{4}{\log_2(5)} \\ \end{array} \right] \end{array} \Huge ( \normalsize \begin{array}{c} MS_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & \frac{1}{\log_2(3)} & 0 \\ 0 & \frac{1}{\log_2(3)} & \frac{4}{\log_2(5)} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 \\ \frac{1}{\log_2(3)} & \frac{1}{\log_2(3)} \\ 0 & \frac{4}{\log_2(5)} \\ \end{array} \right] \end{array} \Huge )^{\Large -1} \normalsize }[/math]


Work that out and you get (at this point we'll convert to decimal form):


[math]\displaystyle{ G = \left[ \begin{array} {r} 0.740 & {-0.088} \\ 0.260 & 0.088\\ {-0.065} & 0.228\\ \end{array} \right] }[/math]


And when we multiply that by [math]\displaystyle{ 𝒋 }[/math], we get the generator tuning map [math]\displaystyle{ 𝒈 }[/math] for the minimax-ES tuning of meantone, ⟨1201.397 697.049].
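
If you'd like to verify this result outside the RTT library, the whole computation can be sketched with raw Wolfram built-ins. The variable names here are ad hoc, not the library's:

In:  j  = {1200.000, 1901.955, 2786.314};
     m  = {{1, 1, 0}, {0, 1, 4}};
     sp = DiagonalMatrix[{1/Log2[2], 1/Log2[3], 1/Log2[5]}];
     g  = sp.Transpose[m.sp].Inverse[m.sp.Transpose[m.sp]];
     j.g  (* should come out near {1201.397, 697.049}, matching the result above *)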

With alternative complexities

The following examples all pick up from a shared setup here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Computing all-interval tuning schemes with alternative complexities.

For all the complexities used here (or at least the first several, more basic ones), our formula will be:


[math]\displaystyle{ G = S_{\text{p}}(MS_{\text{p}})^{+} = S_{\text{p}}(MS_{\text{p}})^\mathsf{T}(MS_{\text{p}}(MS_{\text{p}})^\mathsf{T})^{-1} }[/math]


Minimax-E-S

This example specifically picks up from the setup laid out here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Log-product2. Plugging [math]\displaystyle{ L^{-1} }[/math] into our pseudoinverse method for [math]\displaystyle{ S_{\text{p}} }[/math] we find:


[math]\displaystyle{ G = L^{-1}(ML^{-1})^\mathsf{T}(ML^{-1}(ML^{-1})^\mathsf{T})^{-1} }[/math]


We have already computed [math]\displaystyle{ ML^{-1} }[/math], so we plug that in a few times, two of them transposed:


[math]\displaystyle{ G = \begin{array}{c} L^{-1} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 \\ 0 & 0 & \frac{1}{\log_2(5)} \\ \end{array} \right] \end{array} \begin{array}{c} (ML^{-1})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} \\ \end{array} \right] \end{array} \Huge ( \normalsize \begin{array}{c} ML^{-1} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & \frac{2}{\log_2(3)} & \frac{3}{\log_2(5)} \\ 0 & \frac{-3}{\log_2(3)} & \frac{-5}{\log_2(5)} \\ \end{array} \right] \end{array} \begin{array}{c} (ML^{-1})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} \\ \end{array} \right] \end{array} \Huge )^{\Large -1} \normalsize }[/math]


Work that out and you get (at this point we'll convert to decimal form):


[math]\displaystyle{ G = \left[ \begin{array} {r} 0.991 & 0.623 \\ 0.044 & {-0.117} \\ {-0.027} & {-0.129}\\ \end{array} \right] }[/math]


And when we multiply that by [math]\displaystyle{ 𝒋 }[/math], we get the generator tuning map [math]\displaystyle{ 𝒈 }[/math] for the minimax-ES tuning of porcupine, ⟨1199.562 163.891].

This too can be computed easily with the Wolfram Library:

In:  optimizeGeneratorTuningMap["[⟨1 2 3] ⟨0 -3 -5]]", "minimax-ES"] 
Out: {1199.562 163.891] 

Minimax-E-sopfr-S

This example specifically picks up from the setup laid out here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Sum-of-prime-factors-with-repetition2. Plugging [math]\displaystyle{ \text{diag}(𝒑)^{-1} }[/math] into our pseudoinverse method for [math]\displaystyle{ S_{\text{p}} }[/math] we find:


[math]\displaystyle{ G = \text{diag}(𝒑)^{-1}(M\text{diag}(𝒑)^{-1})^\mathsf{T}(M\text{diag}(𝒑)^{-1}(M\text{diag}(𝒑)^{-1})^\mathsf{T})^{-1} }[/math]


We already have [math]\displaystyle{ M\text{diag}(𝒑)^{-1} }[/math] computed, so we plug that in a few times, two of them transposed:


[math]\displaystyle{ G = \begin{array}{c} \text{diag}(𝒑)^{-1} \\ \left[ \begin{array} {r} \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{3} & 0 \\ 0 & 0 & \frac{1}{5} \\ \end{array} \right] \end{array} \begin{array}{c} (M\text{diag}(𝒑)^{-1})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{2} & 0 \\ \frac{2}{3} & \frac{-3}{3} \\ \frac{3}{5} & \frac{-5}{5} \\ \end{array} \right] \end{array} \Huge ( \normalsize \begin{array}{c} M\text{diag}(𝒑)^{-1} \\ \left[ \begin{array} {r} \frac{1}{2} & \frac{2}{3} & \frac{3}{5} \\ 0 & \frac{-3}{3} & \frac{-5}{5} \\ \end{array} \right] \end{array} \begin{array}{c} (M\text{diag}(𝒑)^{-1})^\mathsf{T} \\ \left[ \begin{array} {r} \frac{1}{2} & 0 \\ \frac{2}{3} & \frac{-3}{3} \\ \frac{3}{5} & \frac{-5}{5} \\ \end{array} \right] \end{array} \Huge )^{\Large -1} \normalsize }[/math]


Work that out and you get:


[math]\displaystyle{ G = \left[ \begin{array} {r} \frac{225}{227} & \frac{285}{454} \\ \frac{10}{227} & \frac{-63}{454} \\ \frac{-6}{227} & \frac{-53}{454} \\ \end{array} \right] }[/math]


And when we multiply that by [math]\displaystyle{ 𝒋 }[/math], we get the generator tuning map [math]\displaystyle{ 𝒈 }[/math] for the minimax-E-sopfr-S tuning of porcupine, ⟨1199.567 164.102].
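
Because the weights here are rational, exact arithmetic reproduces those fractions directly. Here's a quick sketch with raw Wolfram built-ins (ad hoc variable names, not the library's):

In:  m  = {{1, 2, 3}, {0, -3, -5}};
     sp = DiagonalMatrix[{1/2, 1/3, 1/5}];  (* this is diag(p)^-1 *)
     sp.Transpose[m.sp].Inverse[m.sp.Transpose[m.sp]]
Out: {{225/227, 285/454}, {10/227, -63/454}, {-6/227, -53/454}}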

This too can be computed easily with the Wolfram Library:

In:  optimizeGeneratorTuningMap["[⟨1 2 3] ⟨0 -3 -5]]", "minimax-E-sopfr-S"] 
Out: {1199.567 164.102] 

Minimax-E-copfr-S

This example specifically picks up from the setup laid out here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Count-of-prime-factors-with-repetition2. Plugging [math]\displaystyle{ I }[/math] into our pseudoinverse method for [math]\displaystyle{ S_{\text{p}} }[/math] we find:


[math]\displaystyle{ G = I(MI)^\mathsf{T}(MI(MI)^\mathsf{T})^{-1} = M^\mathsf{T}(MM^\mathsf{T})^{-1} = M^{+} }[/math]


That's right: our answer is simply the pseudoinverse of the mapping.


[math]\displaystyle{ G = \begin{array}{c} M^\mathsf{T} \\ \left[ \begin{array} {r} 1 & 0 \\ 2 & {-3} \\ 3 & {-5} \\ \end{array} \right] \end{array} \Huge ( \normalsize \begin{array}{c} M \\ \left[ \begin{array} {r} 1 & 2 & 3 \\ 0 & {-3} & {-5} \\ \end{array} \right] \end{array} \begin{array}{c} M^\mathsf{T} \\ \left[ \begin{array} {r} 1 & 0 \\ 2 & {-3} \\ 3 & {-5} \\ \end{array} \right] \end{array} \Huge )^{\Large -1} \normalsize }[/math]


Work that out and you get:


[math]\displaystyle{ G = \left[ \begin{array} {r} \frac{34}{35} & \frac{3}{5} \\ \frac{1}{7} & 0 \\ \frac{-3}{35} & \frac{-1}{5} \\ \end{array} \right] }[/math]


And when we multiply that by [math]\displaystyle{ 𝒋 }[/math], we get the generator tuning map [math]\displaystyle{ 𝒈 }[/math] for the minimax-E-copfr-S tuning of porcupine, ⟨1198.595 162.737].
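
Since the weight matrix is just the identity here, this one can be sanity-checked with a single built-in call. Again, this is just a sketch with raw built-ins, not the library's API:

In:  m = {{1, 2, 3}, {0, -3, -5}};
     PseudoInverse[m]  (* this is G *)
Out: {{34/35, 3/5}, {1/7, 0}, {-3/35, -1/5}}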

This too can be computed easily with the Wolfram Library:

In:  optimizeGeneratorTuningMap["[⟨1 2 3] ⟨0 -3 -5]]", "minimax-E-copfr-S"] 
Out: {1198.595 162.737] 

Minimax-E-lils-S

This example specifically picks up from the setup laid out here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Log-integer-limit-squared2.

As for the minimax-E-lils-S tuning, we use the pseudoinverse method, but with the same augmented matrices as for the minimax-lils-S tuning discussed later in this article. Well, we've established our [math]\displaystyle{ MS_{\text{p}} }[/math] equivalent, but we still need an equivalent for [math]\displaystyle{ S_{\text{p}} }[/math] alone. This is [math]\displaystyle{ L^{-1} }[/math], but with an extra 1 appended to the reciprocal prime logs before they are diagonalized:


[math]\displaystyle{ \begin{array} {c} \text{equiv. of} \; S_{\text{p}} \\ \left[ \begin{array} {ccc|c} \frac{1}{\log_2(2)} & 0 & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{1}{\log_2(3)} & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & 0 & \frac{1}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \end{array} \right] \end{array} }[/math]


So plugging in to


[math]\displaystyle{ G = S_{\text{p}}(MS_{\text{p}})^\mathsf{T}(MS_{\text{p}}(MS_{\text{p}})^\mathsf{T})^{-1} }[/math]


We get:


[math]\displaystyle{ G = \begin{array}{c} S_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & 0 & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{1}{\log_2(3)} & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & 0 & \frac{1}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {rr|r} \frac{1}{\log_2(2)} & 0 & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{1} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} \Huge ( \normalsize \begin{array}{c} MS_{\text{p}} \\ \left[ \begin{array} {r} \frac{1}{\log_2(2)} & \frac{2}{\log_2(3)} & \frac{3}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{-3}{\log_2(3)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {rr|r} \frac{1}{\log_2(2)} & 0 & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{1} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} \Huge )^{\Large -1} \normalsize }[/math]


Work that out and you get (at this point we'll convert to decimal form):


[math]\displaystyle{ G = \left[ \begin{array} {rr|r} 0.991 & 0.623 & \style{background-color:#FFF200;padding:5px}{0.000} \\ 0.044 & {-0.117} & \style{background-color:#FFF200;padding:5px}{-0.002} \\ {-0.027} & {-0.129} & \style{background-color:#FFF200;padding:5px}{0.001} \\ \hline \style{background-color:#FFF200;padding:5px}{1.000} & \style{background-color:#FFF200;padding:5px}{0.137} & \style{background-color:#FFF200;padding:5px}{-1.000} \\ \end{array} \right] }[/math]


(Yet again, compare with the result for minimax-ES; same but augmented.)


And when we multiply that by the augmented version of our [math]\displaystyle{ 𝒋 }[/math], we get the generator tuning map [math]\displaystyle{ 𝒈 }[/math] for the minimax-E-lils-S tuning of porcupine, ⟨1199.544 163.888 0.018]. Well, that last entry is only the [math]\displaystyle{ g_{\text{augmented}} }[/math] result, which is junk, so we throw that part away.
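
And here's the corresponding raw-built-in sketch: the same formula as before, only with the augmented matrices (ad hoc variable names again; the extra row, column, and entries are the lils augmentation):

In:  msp = {{1/Log2[2], 2/Log2[3], 3/Log2[5], 0},
            {0, -3/Log2[3], -5/Log2[5], 0},
            {1, 1, 1, -1}};                                       (* equiv. of MSp *)
     sp  = DiagonalMatrix[{1/Log2[2], 1/Log2[3], 1/Log2[5], 1}];  (* equiv. of Sp *)
     j   = {1200.000, 1901.955, 2786.314, 0};                     (* augmented just tuning map *)
     j.sp.Transpose[msp].Inverse[msp.Transpose[msp]]
     (* should come out near {1199.544, 163.888, 0.018}; the last entry is the junk we discard *)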

This too can be computed easily with the Wolfram Library:

In:  optimizeGeneratorTuningMap["[⟨1 2 3] ⟨0 -3 -5]]", "minimax-E-lils-S"]
Out: {1199.544 163.888] 

Minimax-E-lols-S

This example specifically picks up from the setup laid out here: Dave Keenan & Douglas Blumeyer's guide to RTT: alternative complexities#Log-odd-limit-squared2. We use the pseudoinverse method, with our same [math]\displaystyle{ MS_{\text{p}} }[/math] and [math]\displaystyle{ S_{\text{p}} }[/math] equivalents as from the minimax-E-lils-S examples:


[math]\displaystyle{ \begin{array}{c} \text{equiv. of} \; MS_{\text{p}} \\ \left[ \begin{array} {rrr|r} \frac{1}{\log_2(2)} & \frac{2}{\log_2(3)} & \frac{3}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{-3}{\log_2(3)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} }[/math]


[math]\displaystyle{ \begin{array} {c} \text{equiv. of} \; S_{\text{p}} \\ \left[ \begin{array} {ccc|c} \frac{1}{\log_2(2)} & 0 & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{1}{\log_2(3)} & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & 0 & \frac{1}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \end{array} \right] \end{array} }[/math]


And we have our [math]\displaystyle{ \mathrm{U} }[/math] = [1 0 0⟩, being the octave, but it's augmented to [1 0 0 1⟩, that last entry being its size. So this whole thing is blue on account of having to do with the held-interval augmentation, but its last entry is green because it's also yellow from the lils augmentation (blue and yellow making green):


[math]\displaystyle{ \begin{array} {c} \mathrm{U} \\ \left[ \begin{array} {c} \style{background-color:#00AEEF;padding:5px}{1} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#8DC73E;padding:5px}{1} \\ \end{array} \right] \end{array} }[/math]


And so our [math]\displaystyle{ M\mathrm{U} }[/math] we can think of as our held-interval having been mapped. For this we must ask ourselves "what is [math]\displaystyle{ M }[/math]"? We know what [math]\displaystyle{ MS_{\text{p}} }[/math] is but not really [math]\displaystyle{ M }[/math] itself, i.e. in terms of its augmentation status. So, the present author is not sure, but is going with this: [1 0 0⟩ would normally map to [1 0⟩ in this temperament, and the third entry it needs in order to fit into the block matrices we're about to build would be mapped by the mapping's junk row, so why not just make it 0. So that gives us:


[math]\displaystyle{ \begin{array} {c} M\mathrm{U} \\ \left[ \begin{array} {c} \style{background-color:#00AEEF;padding:5px}{1} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#8DC73E;padding:5px}{0} \\ \end{array} \right] \end{array} }[/math]


Ah, and [math]\displaystyle{ 𝒋 }[/math] is augmented with a 0 for the lils-stuff that is just junk. Might as well:


[math]\displaystyle{ \begin{array} {c} 𝒋 \\ \left[ \begin{array} {c} 1200 & 1901.955 & 2786.314 & \style{background-color:#FFF200;padding:5px}{0} \\ \end{array} \right] \end{array} }[/math]


Now we need to plug this into the variation on the pseudoinverse formula that accounts for held-intervals:


[math]\displaystyle{ \left[ \begin{array} {cc|c|c} g_1 & g_2 & \style{background-color:#FFF200;padding:5px}{g_{\text{augmented}}} & \style{background-color:#00AEEF;padding:5px}{λ_1} \\ \end{array} \right] = 𝒋 \left[ \begin{array} {c|c} S_{\text{p}}(MS_{\text{p}})^\mathsf{T} & \style{background-color:#00AEEF;padding:5px}{U} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ MS_{\text{p}}(MS_{\text{p}})^\mathsf{T} & \style{background-color:#00AEEF;padding:5px}{𝑀U} \\ \hline \quad \style{background-color:#00AEEF;padding:5px}{(𝑀U)}^\mathsf{T} \quad & \style{background-color:#00AEEF;padding:5px}{0} \\ \end{array} \right]^{\large -1} }[/math]


So let's just start plugging in!


[math]\displaystyle{ \small \left[ \begin{array} {cc|c|c} g_1 & g_2 & \style{background-color:#FFF200;padding:5px}{g_{\text{augmented}}} & \style{background-color:#00AEEF;padding:5px}{λ_1} \\ \end{array} \right] = \begin{array} {c} 𝒋 \\ \left[ \begin{array} {c} 1200 & 1901.955 & 2786.314 & \style{background-color:#FFF200;padding:5px}{0} \\ \end{array} \right] \end{array} \left[ \begin{array} {c|c} \begin{array} {c} \text{equiv. of} \; S_{\text{p}} \\ \left[ \begin{array} {ccc|c} \frac{1}{\log_2(2)} & 0 & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{1}{\log_2(3)} & 0 & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & 0 & \frac{1}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {rr|r} \frac{1}{\log_2(2)} & 0 & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{1} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} & \begin{array} \mathrm{U} \\ \left[ \begin{array} {c} \style{background-color:#00AEEF;padding:5px}{1} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#8DC73E;padding:5px}{1} \\ \end{array} \right] \end{array} \\ \end{array} \right] \left[ \begin{array} {c|c} \\ \begin{array}{c} \text{equiv. of} \; MS_{\text{p}} \\ \left[ \begin{array} {rrr|r} \frac{1}{\log_2(2)} & \frac{2}{\log_2(3)} & \frac{3}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ 0 & \frac{-3}{\log_2(3)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{0} \\ \hline \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} \begin{array}{c} (MS_{\text{p}})^\mathsf{T} \\ \left[ \begin{array} {rr|r} \frac{1}{\log_2(2)} & 0 & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{2}{\log_2(3)} & \frac{-3}{\log_2(3)} & \style{background-color:#FFF200;padding:5px}{1} \\ \frac{3}{\log_2(5)} & \frac{-5}{\log_2(5)} & \style{background-color:#FFF200;padding:5px}{1} \\ \hline \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} \\ \end{array} \right] \end{array} & \begin{array} {c} M\mathrm{U} \\ \left[ \begin{array} {c} \style{background-color:#00AEEF;padding:5px}{1} \\ \style{background-color:#00AEEF;padding:5px}{0} \\ \style{background-color:#8DC73E;padding:5px}{0} \\ \end{array} \right] \end{array} \\ \hline \begin{array}{c} (M\mathrm{U})^\mathsf{T} \\ \left[ \begin{array} {r} \style{background-color:#00AEEF;padding:5px}{1} & \style{background-color:#00AEEF;padding:5px}{0} & \style{background-color:#8DC73E;padding:5px}{0} \\ \end{array} \right] \end{array} & \style{background-color:#00AEEF;padding:5px}{0} \\ \end{array} \right]^{\large -1} }[/math]


Now if you crunch all that on the right, you get ⟨1200 164.062 -0.211 -0.229]. So we can throw away both the lambda that helped us hold our octave unchanged and the augmented generator that helped us account for the sizes of our intervals. So we're left with our held-octave minimax-E-lils-S tuning.
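
If you'd like to check this outside the library, the bordered system above can be sketched with raw Wolfram built-ins; ArrayFlatten just glues the blocks together, and all the variable names are ad hoc:

In:  msp = {{1/Log2[2], 2/Log2[3], 3/Log2[5], 0},
            {0, -3/Log2[3], -5/Log2[5], 0},
            {1, 1, 1, -1}};                                       (* equiv. of MSp *)
     sp  = DiagonalMatrix[{1/Log2[2], 1/Log2[3], 1/Log2[5], 1}];  (* equiv. of Sp *)
     u   = {{1}, {0}, {0}, {1}};  mu = {{1}, {0}, {0}};           (* held octave, and its mapped form *)
     j   = {1200, 1901.955, 2786.314, 0};
     left  = ArrayFlatten[{{sp.Transpose[msp], u}}];
     block = ArrayFlatten[{{msp.Transpose[msp], mu}, {Transpose[mu], 0}}];
     j.left.Inverse[block]
     (* should come out near {1200, 164.062, -0.211, -0.229}; we keep only the first two entries *)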

This too can be computed by the Wolfram Library:

In:  optimizeGeneratorTuningMap["[⟨1 2 3] ⟨0 -3 -5]]", "held-octave minimax-E-lils-S"] 
Out: {1200 164.062] 

Zero-damage method

The second optimization power we'll take a look at is [math]\displaystyle{ p = 1 }[/math], for miniaverage tuning schemes.

Note that miniaverage tunings have not been advocated by tuning theorists thus far. We've included this section largely in order to complete the set of methods with exact solutions, one for each of the key optimization powers [math]\displaystyle{ 1 }[/math], [math]\displaystyle{ 2 }[/math], and [math]\displaystyle{ ∞ }[/math].[11] So, you may prefer to skip ahead to the next section if you're feeling more practically minded. However, the method for [math]\displaystyle{ p = ∞ }[/math] is related but more complicated, and its explanation builds upon this method's explanation, so it may still be worth it to work through this one first.

The high-level summary here is that we're going to collect every tuning in which one target-interval per generator is tuned pure simultaneously, i.e. as many target-intervals tuned pure at once as there are generators. Then we will check the damage each of those tunings causes, and choose whichever of them causes the least.

The zero-damage point set

The method for finding the miniaverage leverages the fact that the sum graph changes slope wherever a target-interval is tuned pure. The minimum must be found among the points where [math]\displaystyle{ r }[/math] target-intervals are all tuned pure at once, where [math]\displaystyle{ r }[/math] is the rank of the temperament. This is because this is the maximum number of linearly independent intervals that could be pure at once, given only [math]\displaystyle{ r }[/math] generators to work with. You can imagine that for any point you could find where only [math]\displaystyle{ r - 1 }[/math] intervals were pure at once, that point would be found on a line along which all [math]\displaystyle{ r - 1 }[/math] of those intervals remain pure, but if you follow it far enough in one direction, you'll reach a point where one additional interval is also pure.

These points taken together are known as the zero-damage point set. This is the first of two methods we'll look at in this article which make use of a point set. The other is the method for finding the minimax, which uses a different point set called the "coinciding-damage point set"; this method is slightly trickier than the miniaverage one, though, and so we'll be looking at it next, right after we've covered the miniaverage method here.

So, in essence, this method works by narrowing the infinite space of tuning possibilities down to a finite set of points to check. We gather these zero-damage points, find the damage (specifically the sum of damages to the target-intervals, AKA the power sum where [math]\displaystyle{ p = 1 }[/math]) at each point, and then choose the one with the minimum damage out of those. And that'll be our miniaverage tuning (unless there's a tie, but more on that later).

Gather and process zero-damage points

Let's practice this method by working through an example. For our target-interval list, we can use our recommended scheme, the truncated integer limit triangle (or "TILT" for short), colorized here so we'll be able to visualize their combinations better in the upcoming step. This is the 6-TILT, our default target list for 5-limit temperaments.


[math]\displaystyle{ \mathrm{T} = \begin{array} {c} \ \ \begin{array} {c} \textbf{i}_1 & \ \ \ \textbf{i}_2 & \ \ \ \textbf{i}_3 & \ \ \ \textbf{i}_4 & \ \ \ \textbf{i}_5 & \ \ \ \textbf{i}_6 & \ \ \ \textbf{i}_7 & \ \ \ \textbf{i}_8 \\ \frac21 & \ \ \ \frac31 & \ \ \ \frac32 & \ \ \ \frac43 & \ \ \ \frac52 & \ \ \ \frac53 & \ \ \ \frac54 & \ \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#98CC70;padding:5px}{2} & \style{background-color:#3FBC9D;padding:5px}{-1} & \style{background-color:#41B0E4;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{-2} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{-1} & \style{background-color:#3FBC9D;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{-1} & \style{background-color:#7977B8;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} }[/math]


And let's use a classic example for our temperament: meantone.

Unchanged-interval bases

We can compute ahead of time how many points we should find in our zero-damage point set, because it's simply the number of ways to choose [math]\displaystyle{ r }[/math] of our target-intervals. With meantone being a rank-2 temperament and the 6-TILT containing 8 target-intervals, that's [math]\displaystyle{ {{8}\choose{2}} = 28 }[/math] points (8 choose 2 is 28).

Each of these 28 points may be represented by an unchanged-interval basis, symbolized as [math]\displaystyle{ \mathrm{U} }[/math]. An unchanged-interval basis is simply a matrix where each column is a prime-count vector representing a different interval that the tuning of this temperament should leave unchanged. So for example, the matrix [[-1 1 0⟩ [0 -1 1⟩] tells us that [math]\displaystyle{ \frac32 }[/math] = [-1 1 0⟩ and [math]\displaystyle{ \frac53 }[/math] = [0 -1 1⟩ are to be left unchanged. (The "basis" part of the name tells us that furthermore every linear combination of these vectors is also left unchanged, such as 2×[-1 1 0⟩ + -1×[0 -1 1⟩ = [-2 3 -1⟩, AKA [math]\displaystyle{ \frac{27}{20} }[/math]. It also technically tells us that none of the vectors is already a linear combination of the others, i.e. that it is full-column-rank; this may not be true of all of these matrices we're assembling using this automatic procedure, but that's okay because any of these that aren't truly bases will be eliminated for that reason in the next step.)

Note that this unchanged-interval basis [math]\displaystyle{ \mathrm{U} }[/math] is different from our held-interval basis [math]\displaystyle{ \mathrm{H} }[/math]. There are a couple of main differences:

  1. We didn't ask for these unchanged-interval bases [math]\displaystyle{ \mathrm{U} }[/math]; they're just coming up as part of this algorithm.
  2. These unchanged-interval bases completely specify the tuning. A held-interval basis [math]\displaystyle{ \mathrm{H} }[/math] has shape [math]\displaystyle{ (d, h) }[/math] where [math]\displaystyle{ h \leq r }[/math], but an unchanged-interval basis [math]\displaystyle{ \mathrm{U} }[/math] always has shape [math]\displaystyle{ (d, r) }[/math]. (Remember, [math]\displaystyle{ r }[/math] is the rank of the temperament, or in other words, the count of generators.)

So here's the full list of 28 unchanged-interval bases corresponding to the zero-damage points for any 5-limit rank-2 temperament (meantone or otherwise), given the 6-TILT as its target-interval set. Use the colorization to better understand the nature of these combinations:


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(1,2)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#FDBC42;padding:5px}{0} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,3)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac32 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{-1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,4)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac43 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{2} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{-1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,5)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#3FBC9D;padding:5px}{-1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{0} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,6)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,7)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(1,8)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#F69289;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#F69289;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(2,3)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac32 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{-1} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{1} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2,4)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac43 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{2} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{-1} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2,5)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{-1} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#3FBC9D;padding:5px}{0} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2,6)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2,7)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2,8)} \\ \ \ \begin{array} {rrr} \frac31 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#FDBC42;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#FDBC42;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(3,4)} \\ \ \ \begin{array} {rrr} \frac32 & \ \ \frac43 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#98CC70;padding:5px}{2} \\ \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{-1} \\ \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(3,5)} \\ \ \ \begin{array} {rrr} \frac32 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#3FBC9D;padding:5px}{-1} \\ \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#3FBC9D;padding:5px}{0} \\ \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(3,6)} \\ \ \ \begin{array} {rrr} \frac32 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(3,7)} \\ \ \ \begin{array} {rrr} \frac32 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(3,8)} \\ \ \ \begin{array} {rrr} \frac32 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#FFF200;padding:5px}{-1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#FFF200;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#FFF200;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(4,5)} \\ \ \ \begin{array} {rrr} \frac43 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{2} & \style{background-color:#3FBC9D;padding:5px}{-1} \\ \style{background-color:#98CC70;padding:5px}{-1} & \style{background-color:#3FBC9D;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(4,6)} \\ \ \ \begin{array} {rrr} \frac43 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{2} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{-1} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(4,7)} \\ \ \ \begin{array} {rrr} \frac43 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{2} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#98CC70;padding:5px}{-1} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(4,8)} \\ \ \ \begin{array} {rrr} \frac43 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{2} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#98CC70;padding:5px}{-1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(5,6)} \\ \ \ \begin{array} {rrr} \frac52 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#3FBC9D;padding:5px}{-1} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#3FBC9D;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#3FBC9D;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(5,7)} \\ \ \ \begin{array} {rrr} \frac52 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#3FBC9D;padding:5px}{-1} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#3FBC9D;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#3FBC9D;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(5,8)} \\ \ \ \begin{array} {rrr} \frac52 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#3FBC9D;padding:5px}{-1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#3FBC9D;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#3FBC9D;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(6,7)} \\ \ \ \begin{array} {rrr} \frac53 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#41B0E4;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#41B0E4;padding:5px}{-1} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#41B0E4;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(6,8)} \\ \ \ \begin{array} {rrr} \frac53 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#41B0E4;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#41B0E4;padding:5px}{-1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#41B0E4;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} , }[/math]


[math]\displaystyle{ \begin{array} {c} \mathrm{U}_{(7,8)} \\ \ \ \begin{array} {rrr} \frac54 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#7977B8;padding:5px}{-2} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#7977B8;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#7977B8;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} }[/math]


Canonicalize and filter deficient matrices

But many of these unchanged-interval bases are actually redundant with each other, by which we mean that they correspond to the same tuning. Said another way, some of these unchanged-interval bases are different bases for the same set of unchanged-intervals.

In order to identify such redundancies, we will put all of our unchanged-interval bases into their canonical form, following the canonicalization process that has already been described for comma bases; like comma bases, these are bases, they are tall matrices (more rows than columns), and their columns represent intervals. Putting matrices into canonical form is a way to determine whether, for some definition of "same", they represent the same information. So here's what they look like in that form (no more color from here on out; the point about combinations has been made):


[math]\displaystyle{ \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} , \\[35pt] \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \\[35pt] \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ 
\begin{array} {rrr} \frac34 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \\[35pt] \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac58 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-3} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Note, for example, that our matrix representing [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac43 }[/math] (the 14th one here) has been simplified to a matrix representing [math]\displaystyle{ \frac21 }[/math] and [math]\displaystyle{ \frac31 }[/math]; this is as if to say: why define the problem as tuning [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac43 }[/math] pure, when there are only two distinct prime factors between these two intervals? We may as well just use our two generators to make both of those basis primes pure. In fact, any combination of intervals that includes no prime 5 here will have been simplified to this same unchanged-interval basis.

Also note that many intervals are now subunison (less than [math]\displaystyle{ \frac11 }[/math], with a denominator greater than the numerator; for example [math]\displaystyle{ \frac34 }[/math]). While this may be unnatural for musicians to think about, it's just the way the canonicalization math works out, and is irrelevant to tuning, because any damage to an interval will be the same as to its reciprocal.

In some cases at this point, we would eliminate some unchanged-interval bases: those that, through the process of canonicalization, were simplified to fewer than [math]\displaystyle{ r }[/math] intervals, i.e. lost one or more columns. In this example, that has not occurred to any of our matrices; in order for it to have occurred, our target-interval set would have needed to include linearly dependent intervals. For example, the intervals [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac94 }[/math] are linearly dependent, and we see them in the 10-TILT that's the default for a 7-limit temperament. So in that case, the unchanged-interval bases that result from combinations of such pairs of intervals will be eliminated. This captures the fact that if you were to purely tune the interval which the others are multiples of, all the others would also be purely tuned, so this is not truly a combination of distinct intervals to purely tune.

De-dupe

And we also see that our [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac65 }[/math] matrix has been changed to [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac54 }[/math]. This may be a less obvious simplification, but it does illuminate how tuning [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac65 }[/math] pure is no different from tuning [math]\displaystyle{ \frac32 }[/math] and [math]\displaystyle{ \frac54 }[/math] pure.

And so now it's time to actually eliminate those redundancies!


[math]\displaystyle{ \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac32 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-1} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-1} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-2} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac34 & \ \ \frac58 \\ \end{array} \\ \left[ \begin{array} {r|r} {-2} & {-3} \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Counting only 11 matrices still remaining, that means we must have eliminated 17 of them as redundant from our original set of 28.
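
If you'd like to automate the gathering, canonicalizing, and de-duping, here's a rough sketch with raw Wolfram built-ins. It uses RowReduce as a stand-in for the full canonicalization procedure: two combinations are redundant (different bases for the same set of unchanged-intervals) exactly when their intervals span the same subspace, and reduced row echelon form is a canonical representative of that subspace. The vectors are given as rows here for convenience, and the variable names are ad hoc:

In:  t = {{1, 0, 0}, {0, 1, 0}, {-1, 1, 0}, {2, -1, 0},
          {-1, 0, 1}, {0, -1, 1}, {-2, 0, 1}, {1, 1, -1}};  (* the 6-TILT, one vector per row *)
     pairs = Subsets[t, {2}];                               (* every way to choose r = 2 of them *)
     full  = Select[pairs, MatrixRank[#] == 2 &];           (* the deficiency filter; nothing is dropped here *)
     {Length[pairs], Length[DeleteDuplicates[RowReduce /@ full]]}
Out: {28, 11}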

Convert to generators

Now we just need to convert each of these unchanged-interval bases [math]\displaystyle{ \mathrm{U}_{(i,j)} }[/math] to a corresponding generator embedding [math]\displaystyle{ G }[/math]. To do this, we use the formula [math]\displaystyle{ G = \mathrm{U}(M\mathrm{U})^{-1} }[/math], where [math]\displaystyle{ M }[/math] is the temperament mapping (the derivation of this formula, and examples of working through this calculation, are both described later in this article here: #Only unchanged-intervals method).[12]


[math]\displaystyle{ \scriptsize \begin{array} {c} \ \ \begin{array} {rrr} \frac{2}{1} & \frac{3}{2} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & {-1} \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{2}{1} & \sqrt[4]{5} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \sqrt[3]{\frac{10}{3}} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & \frac13 \\ 0 & {-\frac13} \\ 0 & \frac13 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \sqrt[5]{\frac{162}{5}} & \sqrt[5]{\frac{15}{2}} \\ \end{array} \\ \left[ \begin{array} {rrr} \frac15 & {-\frac15} \\ \frac45 & \frac15 \\ {-\frac15} & \frac15 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{3}{\sqrt[4]{5}} & \sqrt[4]{5} \\ \end{array} \\ \left[ \begin{array} {rrr} 0 & 0 \\ 1 & 0 \\ {-\frac14} & \frac14 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \sqrt[6]{\frac{324}{5}} & \sqrt[6]{\frac{45}{4}} \\ \end{array} \\ \left[ \begin{array} {rrr} \frac13 & {-\frac13} \\ \frac23 & \frac13 \\ {-\frac16} & \frac16 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{81}{40} & \frac{3}{2} \\ \end{array} \\ \left[ \begin{array} {rrr} {-3} & {-1} \\ 4 & 1 \\ {-1} & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{9}{2\sqrt[2]{5}} & \frac{3}{2} \\ \end{array} \\ \left[ \begin{array} {rrr} {-1} & {-1} \\ 2 & 1 \\ {-\frac12} & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \sqrt[3]{\frac{640}{81}} & \sqrt[3]{\frac{10}{3}} \\ \end{array} \\ \left[ \begin{array} {rrr} \frac73 & \frac13 \\ {-\frac43} & {-\frac13} \\ \frac13 & \frac13 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{8\sqrt[2]{5}}{9} & \frac{2\sqrt[2]{5}}{3} \\ \end{array} \\ \left[ \begin{array} {rrr} 3 & 1 \\ {-2} & {-1} \\ \frac12 & \frac12 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{160}{81} & \frac{40}{27} \\ \end{array} \\ \left[ \begin{array} {rrr} 5 & 3 \\ {-4} & {-3} \\ 1 & 1 \\ \end{array} \right] \end{array} }[/math]


Note that every one of those unusual looking values above — whether it be [math]\displaystyle{ \frac21 }[/math], [math]\displaystyle{ \frac{81}{40} }[/math], [math]\displaystyle{ \frac{8\sqrt[2]{5}}{9} }[/math], or otherwise in the first column — or [math]\displaystyle{ \frac32 }[/math], [math]\displaystyle{ \frac{40}{27} }[/math], [math]\displaystyle{ \sqrt[3]{\frac{10}{3}} }[/math], or otherwise in the second column — is an approximation of [math]\displaystyle{ \frac21 }[/math] or [math]\displaystyle{ \frac32 }[/math], respectively.
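
Here's what that conversion looks like for one of them, sketched with raw built-ins (not the library); this is the basis for [math]\displaystyle{ \frac21 }[/math] and [math]\displaystyle{ \frac53 }[/math], whose second generator came out above as [math]\displaystyle{ \sqrt[3]{\frac{10}{3}} }[/math]:

In:  m = {{1, 1, 0}, {0, 1, 4}};    (* meantone *)
     u = {{1, 0}, {0, -1}, {0, 1}};  (* columns are 2/1 and 5/3 *)
     u.Inverse[m.u]
Out: {{1, 1/3}, {0, -1/3}, {0, 1/3}}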

At this point, the only inputs affecting our results have been [math]\displaystyle{ M }[/math] and [math]\displaystyle{ \mathrm{T} }[/math]: [math]\displaystyle{ M }[/math] appears in our formula for [math]\displaystyle{ G }[/math], and our target-interval set [math]\displaystyle{ \mathrm{T} }[/math] was our source of intervals for our set of unchanged-interval bases. Notably, [math]\displaystyle{ W }[/math] is missing from that list of inputs. So far, then, it doesn't matter what our damage weight slope is (or what complexity function is used for it, if other than log-product complexity); this list of candidate [math]\displaystyle{ G }[/math]'s is valid for any [math]\displaystyle{ W }[/math]. But don't worry; [math]\displaystyle{ W }[/math] will definitely affect the results, and it comes into play in the very next step.

Find damages at points

As the next step, we find the [math]\displaystyle{ 1 }[/math]-sum of the damages to the target-interval set for each of those tunings. We'll work through one example. Let's just grab that third [math]\displaystyle{ G }[/math], then, the one with [math]\displaystyle{ \frac21 }[/math] and [math]\displaystyle{ \sqrt[3]{\frac{10}{3}} }[/math].

This is one way to write the formula for the damages of a tuning of a temperament, in weighted cents. You can see the close resemblance to the expression shared earlier in the #Basic algebraic setup section:


[math]\displaystyle{ \textbf{d} = |\,𝒋GM\mathrm{T}W - 𝒋G_{\text{j}}M_{\text{j}}\mathrm{T}W\,| }[/math]


As discussed in Dave Keenan & Douglas Blumeyer's guide to RTT: tuning fundamentals#Absolute errors, these vertical bars mean to take the absolute value of each entry of this vector, not to take its magnitude.

As discussed elsewhere, we can simplify this to:


[math]\displaystyle{ \textbf{d} = |\,𝒋(GM - G_{\text{j}}M_{\text{j}})\mathrm{T}W\,| }[/math]


So here's that. Since we've gone with simplicity-weight damage here, we'll be using [math]\displaystyle{ S }[/math] to represent our simplicity-weight matrix rather than the generic [math]\displaystyle{ W }[/math] for weight matrix:


[math]\displaystyle{ \textbf{d} = \Huge | \scriptsize \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} ( \begin{array} {ccc} G \\ \left[ \begin{array} {rrr} 1 & \frac13 \\ 0 & {-\frac13} \\ 0 & \frac13 \\ \end{array} \right] \end{array} \begin{array} {ccc} M \\ \left[ \begin{array} {rrr} 1 & 1 & 0 \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} - \begin{array} {ccc} I \\ \left[ \begin{array} {rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \end{array} ) \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} \Huge | }[/math]


Let's start chipping away at this from the left. As our first act, let's consolidate [math]\displaystyle{ 𝒋 }[/math]:


[math]\displaystyle{ \textbf{d} = \Huge | \scriptsize \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} ( \begin{array} {ccc} G \\ \left[ \begin{array} {rrr} 1 & \frac13 \\ 0 & {-\frac13} \\ 0 & \frac13 \\ \end{array} \right] \end{array} \begin{array} {ccc} M \\ \left[ \begin{array} {rrr} 1 & 1 & 0 \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} - \begin{array} {ccc} I \\ \left[ \begin{array} {rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \end{array} ) \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} \Huge | }[/math]


Distribute the [math]\displaystyle{ 𝒋 }[/math]. We find [math]\displaystyle{ 𝒋GM = 𝒕 }[/math], the tempered-prime tuning map, and [math]\displaystyle{ 𝒋G_{\text{j}}M_{\text{j}} = 𝒋 }[/math], the just-prime tuning map.


[math]\displaystyle{ \textbf{d} = \Huge | \scriptsize ( \begin{array} {ccc} 𝒕 \\ \left[ \begin{array} {rrr} 1200.000 & 1894.786 & 2779.144 \\ \end{array} \right] \end{array} - \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} ) \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} \Huge | }[/math]


And now we can replace [math]\displaystyle{ 𝒕 - 𝒋 }[/math] with a single variable [math]\displaystyle{ 𝒓 }[/math], the retuning map, which, unsurprisingly, is just the map that tells us by how much to retune (mistune) each of the primes (this object will come up a lot more when working with all-interval tuning schemes).


[math]\displaystyle{ \textbf{d} = \Huge | \scriptsize \begin{array} {ccc} 𝒓 \\ \left[ \begin{array} {rrr} 0.000 & {-7.169} & {-7.169} \\ \end{array} \right] \end{array} \begin{array} {ccc} \mathrm{T} \\ \left[ \begin{array} {r|r|r|r|r|r|r|r} \;\;1 & \;\;\;0 & {-1} & 2 & {-1} & 0 & {-2} & 1 \\ 0 & 1 & 1 & {-1} & 0 & {-1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & {-1} \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} \Huge | }[/math]


And multiplying that by our [math]\displaystyle{ \mathrm{T} }[/math] gives us [math]\displaystyle{ \textbf{e} }[/math], the target-interval error list:


[math]\displaystyle{ \textbf{d} = \Huge | \scriptsize \begin{array} {ccc} \textbf{e} \\ \left[ \begin{array} {rrr} 0.000 & {-7.169} & {-7.169} & 7.169 & {-7.169} & 0.000 & {-7.169} & 0.000 \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} \Huge | }[/math]


Our weights are all positive, so the absolute value bars only really need to apply to the errors. Taking the absolute value of each error gives us [math]\displaystyle{ |\textbf{e}|S }[/math]:


[math]\displaystyle{ \textbf{d} = \scriptsize \begin{array} {ccc} |\textbf{e}| \\ \left[ \begin{array} {rrr} |0.000| & |{-7.169}| & |{-7.169}| & |7.169| & |{-7.169}| & |0.000| & |{-7.169}| & |0.000| \\ \end{array} \right] \end{array} \begin{array} {ccc} S \\ \left[ \begin{array} {rrr} \frac{1}{\log_2(2)} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{\log_2(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\log_2(6)} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\log_2(12)} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\log_2(10)} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(15)} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(20)} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{\log_2(30)} \\ \end{array} \right] \end{array} }[/math]


And now we multiply that by the weights to get the damages, [math]\displaystyle{ \textbf{d} }[/math].


[math]\displaystyle{ \textbf{d} = \scriptsize \left[ \begin{array} {rrr} 0.000 & 4.523 & 2.773 & 2.000 & 2.158 & 0.000 & 1.659 & 0.000 \\ \end{array} \right] }[/math]


And finally since this tuning scheme is all about the sum of damages, we're actually looking for [math]\displaystyle{ \llzigzag \textbf{d} \rrzigzag _1 }[/math]. So we total these up, and get our final answer: 0.000 + 4.523 + 2.773 + 2.000 + 2.158 + 0.000 + 1.659 + 0.000 = 13.114. And that's in units of simplicity-weighted cents, ¢(S), by the way.
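If a more concrete restatement helps, here is a minimal sketch of that same damage computation, written in Python with NumPy rather than in the Wolfram Language of D&D's RTT library (so none of that library's function names are assumed here). The matrices are simply copied from the worked example above.

<syntaxhighlight lang="python">
import numpy as np

log2 = np.log2
j = 1200 * np.array([1.0, log2(3), log2(5)])  # just-prime tuning map 𝒋 (¢)
M = np.array([[1, 1, 0],
              [0, 1, 4]])                     # meantone mapping M
G = np.array([[1,  1/3],
              [0, -1/3],
              [0,  1/3]])                     # candidate generator embedding G

# 6-TILT target-interval vectors as columns: 2/1 3/1 3/2 4/3 5/2 5/3 5/4 6/5
T = np.array([[1, 0, -1,  2, -1,  0, -2,  1],
              [0, 1,  1, -1,  0, -1,  0,  1],
              [0, 0,  0,  0,  1,  1,  1, -1]])

# simplicity weights: reciprocals of log-product complexity, matching S above
S = np.diag(1 / log2([2, 3, 6, 12, 10, 15, 20, 30]))

r = j @ G @ M - j         # retuning map 𝒓 = 𝒕 − 𝒋
e = r @ T                 # target-interval error list 𝐞
d = np.abs(e) @ S         # damage list 𝐝
print(np.round(d, 3))     # ≈ [0, 4.523, 2.773, 2.000, 2.158, 0, 1.659, 0]
print(round(d.sum(), 2))  # ≈ 13.11 ¢(S), i.e. the total reported above, up to rounding
</syntaxhighlight>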

Choose the winner

Now, if we repeat that entire damage calculation process for every one of the eleven tunings we identified as candidates for the miniaverage, then we'd have found the following list of tuning damages: 21.338, 9.444, 13.114, 10.461, 15.658, 10.615, 50.433, 26.527, 25.404, 33.910, and 80.393. So 13.114 isn't bad, but it's apparently not the best we can do. That honor goes to the second tuning there, which has only 9.444 ¢(S) total damage.

Lo and behold, if we cross reference that with our list of [math]\displaystyle{ G }[/math] candidates from earlier, the second one is quarter-comma meantone, the tuning where the fifth is exactly the fourth root of five:


[math]\displaystyle{ G = \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] }[/math]


Often people will prefer to have the tuning in terms of the cents sizes of the generators, which is our generator tuning map [math]\displaystyle{ 𝒈 }[/math][13]; but again, we can find that simply as [math]\displaystyle{ 𝒋G }[/math]:


[math]\displaystyle{ 𝒈 = \begin{array} {ccc} 𝒋 \\ \left[ \begin{array} {rrr} 1200.000 & 1901.955 & 2786.314 \\ \end{array} \right] \end{array} \begin{array} {ccc} G \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] \end{array} }[/math]


And that works out to ⟨1200.000 696.578].
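As a quick sanity check of that last conversion, here's the same [math]\displaystyle{ 𝒈 = 𝒋G }[/math] product as a tiny Python/NumPy sketch (again just an illustration, not the RTT library itself):

<syntaxhighlight lang="python">
import numpy as np

j = 1200 * np.array([1.0, np.log2(3), np.log2(5)])  # just-prime tuning map 𝒋 (¢)
G = np.array([[1, 0   ],
              [0, 0   ],
              [0, 0.25]])                            # winning (quarter-comma) embedding
g = j @ G                                            # generator tuning map 𝒈 = 𝒋G
print(np.round(g, 3))                                # [1200.     696.578]
</syntaxhighlight>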

Tie-breaking

With the 6-TILT miniaverage tuning of meantone (with simplicity-weight damage), we've solved for a unique tuning, given by [math]\displaystyle{ G }[/math], that miniaverages the damage for this temperament [math]\displaystyle{ M }[/math].

Sometimes, though, we have a tie between tunings for least average damage. For example, if we had done a unity-weight tuning, in which case [math]\displaystyle{ W = I }[/math], and included the interval [math]\displaystyle{ \frac85 }[/math] in our set, we would have found that quarter-comma meantone tied with another tuning, one with generators of [math]\displaystyle{ \sqrt[5]{\frac{2560}{81}} }[/math] and [math]\displaystyle{ \sqrt[5]{\frac{200}{27}} }[/math], which are approximately 1195.699 ¢ and 693.352 ¢.

In this case, we fall back to our general method, which is equipped to find the true optimum somewhere in between these two extreme ends of goodness, albeit as an approximate solution.[14] This method is discussed here: power limit method. Or, if you'd like a refresher on how to think about non-unique tunings, please see Dave Keenan & Douglas Blumeyer's guide to RTT: tuning fundamentals#Non-unique tunings.

We note that there may be a way to find an exact solution to a nested miniaverage, in a similar fashion to the nested minimax discussed in the coinciding-damage method section below, but it raises some conceptual issues about what a nested miniaverage even means.[15] We have done some pondering of this problem but it remains open; we didn't prioritize solving it, on account of the fact that nobody uses miniaverage tunings anyway.

With held-intervals

The zero-damage method is easily modified to handle held-intervals along with target-intervals.[16] In short: without held-intervals, we assemble our set of unchanged-interval bases [math]\displaystyle{ \mathrm{U}_1 }[/math] through [math]\displaystyle{ \mathrm{U}_n }[/math] (where [math]\displaystyle{ n = {{k}\choose{r}} }[/math]), corresponding to the zero-damage points, by finding every combination of [math]\displaystyle{ r }[/math] different ones of our [math]\displaystyle{ k }[/math] target-intervals (one for each generator to be responsible for tuning exactly). With held-intervals, we must instead first reserve [math]\displaystyle{ h }[/math] (the held-interval count) columns of each [math]\displaystyle{ \mathrm{U}_n }[/math] for the held-intervals, leaving only the remaining [math]\displaystyle{ r - h }[/math] columns to be assembled from the target-intervals as before. So we'll only have [math]\displaystyle{ {{k}\choose{r - h}} }[/math] candidate tunings / zero-damage points / unchanged-interval bases in this case.

In other words, if [math]\displaystyle{ \mathrm{U}_n }[/math] is one of the unchanged-interval bases characterizing a candidate miniaverage tuning, then it must contain [math]\displaystyle{ \mathrm{H} }[/math] itself, the held-interval basis, which does not yet fully characterize our tuning, leaving some wiggle room (otherwise we'd just use the "only held-intervals" approach, discussed later).

For example, if seeking a held-octave miniaverage tuning of a 5-limit, rank-2 temperament with the 6-TILT as our target-interval set, then [math]\displaystyle{ h = 1 }[/math] (only the octave), [math]\displaystyle{ k = 8 }[/math] (there are 8 target-intervals in the 6-TILT), and [math]\displaystyle{ r = 2 }[/math] (the meaning of "rank-2"). So we're looking at [math]\displaystyle{ {{k}\choose{r - h}} = {{(8)}\choose{(2) - (1)}} = {{8}\choose{1}} = 8 }[/math] unchanged-interval bases. That's significantly fewer than the [math]\displaystyle{ {{8}\choose{2}} = 28 }[/math] we had to slog through when [math]\displaystyle{ h = 0 }[/math] in the earlier example, so this will be much faster to compute. All we're doing here, really, is checking each possible tuning where we pair one of our target-intervals with the octave as our unchanged-interval basis.
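If it helps, here's a small Python sketch of that counting argument; the interval names and vectors are just the 6-TILT columns of [math]\displaystyle{ \mathrm{T} }[/math] from earlier (the vectors are included only for reference), and the code does nothing but enumerate the candidate bases.

<syntaxhighlight lang="python">
from itertools import combinations
from math import comb

held = {'2/1': (1, 0, 0)}         # held-interval basis H: just the octave
targets = {                       # 6-TILT target-intervals (prime-count vectors, for reference)
    '2/1': (1, 0, 0), '3/1': (0, 1, 0), '3/2': (-1, 1, 0), '4/3': (2, -1, 0),
    '5/2': (-1, 0, 1), '5/3': (0, -1, 1), '5/4': (-2, 0, 1), '6/5': (1, 1, -1),
}
r, h, k = 2, len(held), len(targets)   # rank 2, 1 held-interval, 8 target-intervals

# each candidate unchanged-interval basis: all held-intervals plus (r - h) target-intervals
bases = [list(held) + list(combo) for combo in combinations(targets, r - h)]
print(len(bases), comb(k, r - h))      # 8 8
print(bases[2])                        # ['2/1', '3/2']
</syntaxhighlight>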

So, with our held-interval basis [math]\displaystyle{ \mathrm{H} }[/math] (colorized to grey to help visualize its presence in the upcoming steps):


[math]\displaystyle{ \begin{array} {c} \mathrm{H} \\ \ \ \begin{array} {rrr} \frac21 \\ \end{array} \\ \left[ \begin{array} {rrr} \style{background-color:#D3D3D3;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} \\ \end{array} \right] \end{array} }[/math]


We have the unchanged-interval bases for our zero-damage points:


[math]\displaystyle{ \small \begin{array} {c} \mathrm{U}_{(1)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac21 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#F69289;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#F69289;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#F69289;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(2)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#FDBC42;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#FDBC42;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(3)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac32 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#FFF200;padding:5px}{-1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#FFF200;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(4)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac43 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{2} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{-1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(5)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac52 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#3FBC9D;padding:5px}{-1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#3FBC9D;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(6)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#41B0E4;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{-1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#41B0E4;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(7)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac54 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#7977B8;padding:5px}{-2} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{0} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#7977B8;padding:5px}{1} \\ \end{array} \right] \end{array} , \begin{array} {c} \mathrm{U}_{(8)} \\ \ \ \begin{array} {rrr} \frac21 & \ \ \frac65 \\ \end{array} \\ \left[ \begin{array} {r|r} \style{background-color:#D3D3D3;padding:5px}{1} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{1} \\ \style{background-color:#D3D3D3;padding:5px}{0} & \style{background-color:#D883B7;padding:5px}{-1} \\ \end{array} \right] \end{array} }[/math]


(Note that [math]\displaystyle{ \mathrm{U}_{(1)} }[/math] here pairs [math]\displaystyle{ \frac21 }[/math] with [math]\displaystyle{ \frac21 }[/math]. That's because the octave happens to appear both in our held-interval basis [math]\displaystyle{ \mathrm{H} }[/math] and our target-interval list [math]\displaystyle{ \mathrm{T} }[/math]. We could have chosen to remove [math]\displaystyle{ \frac21 }[/math] from [math]\displaystyle{ \mathrm{T} }[/math] upon adding it to [math]\displaystyle{ \mathrm{H} }[/math], because once you're insisting a particular interval takes no damage there's no sense also including it in a list of intervals to minimize damage to. But we chose to leave [math]\displaystyle{ \mathrm{T} }[/math] alone to make our points above more clearly, i.e. with [math]\displaystyle{ k }[/math] remaining equal to [math]\displaystyle{ 8 }[/math].)[17]

Now we canonicalize (no need for color anymore; the point has been made about the combinations of target-intervals with held-intervals):


[math]\displaystyle{ \begin{array} {c} \ \ \begin{array} {rrr} \frac21 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 \\ 0 \\ 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Note that [math]\displaystyle{ \mathrm{U}_1 }[/math], the one which had two copies of the octave, has been canonicalized down to a single column, because its vectors are obviously not linearly independent. So it will be filtered out in the next step. Actually, since that's the only eliminated point, let's go ahead and do the next step too, which is deduping; we have a lot of dupes:


[math]\displaystyle{ \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac53 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & {-1} \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]


Now convert each [math]\displaystyle{ \mathrm{U}_i }[/math] to a [math]\displaystyle{ G_i }[/math]:


[math]\displaystyle{ \begin{array} {c} \ \ \begin{array} {rrr} \frac{2}{1} & \frac{3}{2} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & {-1} \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{2}{1} & \sqrt[4]{5} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \sqrt[3]{\frac{10}{3}} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & \frac13 \\ 0 & {-\frac13} \\ 0 & \frac13 \\ \end{array} \right] \end{array} }[/math]


And convert those to generator tuning maps: ⟨1200 701.955], ⟨1200 696.578], and ⟨1200 694.786]. Note that every one of these has a pure-octave period. Then check the damage sums: 353.942 ¢(U), 89.083 ¢(U), and 110.390 ¢(U), respectively. So that tells us that we want the middle result of these three, ⟨1200 696.578], as the minimization of the [math]\displaystyle{ 1 }[/math]-mean of unity-weight damage to the 6-TILT, when we're constrained to the octave being unchanged.
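Here's a sketch of those two conversion steps in Python with SymPy, used here only so the generator embeddings stay as exact fractions; it covers the [math]\displaystyle{ G = \mathrm{U}(M\mathrm{U})^{-1} }[/math] and [math]\displaystyle{ 𝒈 = 𝒋G }[/math] steps, not the damage-sum comparison.

<syntaxhighlight lang="python">
from sympy import Matrix, log

M = Matrix([[1, 1, 0],
            [0, 1, 4]])                              # meantone mapping M
j = Matrix([[1200, 1200*log(3, 2), 1200*log(5, 2)]]) # just-prime tuning map 𝒋

U_bases = {                                          # the three deduped unchanged-interval bases
    '2/1 & 3/1': Matrix([[1, 0], [0,  1], [0, 0]]),
    '2/1 & 5/1': Matrix([[1, 0], [0,  0], [0, 1]]),
    '2/1 & 5/3': Matrix([[1, 0], [0, -1], [0, 1]]),
}

for name, U in U_bases.items():
    G = U * (M * U).inv()        # generator embedding: G = U(MU)^-1
    g = (j * G).evalf(7)         # generator tuning map: 𝒈 = 𝒋G
    print(name, G.T.tolist(), g.tolist())
# generator tuning maps come out as roughly [1200, 701.955], [1200, 696.578], [1200, 694.786]
</syntaxhighlight>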

For a rank-3 temperament, with 2 held-intervals, we'd again have 8 choose 1 = 8 tunings to check. With 1 held-interval, we'd have 8 choose 2 = 28 tunings to check.

For all-interval tuning schemes

We can adapt the zero-damage method to compute all-interval tuning schemes where the dual norm power [math]\displaystyle{ \text{dual}(q) }[/math] is equal to [math]\displaystyle{ 1 }[/math].

Maxization

Per the heading of this section, we might call these "minimax-MS" schemes, where the 'M' here indicates that their interval complexity functions have been "maxized", as opposed to "Euclideanized"; that is, the power and matching root from their norm or summation form has been changed to [math]\displaystyle{ ∞ }[/math] instead of to [math]\displaystyle{ 2 }[/math]. "Maxization" can be thought of as a reference to the fact that distance measured by [math]\displaystyle{ ∞ }[/math]-norms (maxes, remember) resembles distance traveled by "Max the magician" to get from point A to point B; he can teleport through all dimensions except the one he needs to travel furthest in, i.e. the maximum distance he had to go in any one dimension, is the defining distance. (To complete the set, the [math]\displaystyle{ 1 }[/math]-norms could be referred to as "taxicabized", referencing that this is the type of distance a taxicab on a grid of streets would travel… though would these tunings really be "-ized" if this is the logical starting point?)

And to be clear, the [math]\displaystyle{ \textbf{i} }[/math]-norm is maxized here — has norm power [math]\displaystyle{ ∞ }[/math] — because the norm power on the retuning magnitude is [math]\displaystyle{ 1 }[/math], and these norm powers must be duals.

Tuning schemes such as these are not very popular, because where Euclideanizing [math]\displaystyle{ \text{lp-C}() }[/math] already makes tunings less psychoacoustically plausible, maxizing it makes tunings even less plausible.

Example

Let's compute the minimax-MS tuning of meantone temperament. We begin by assembling our list of unchanged-interval bases. This list will be much shorter than it was with ordinary tuning schemes, because its size increases combinatorially with the count of target-intervals, and here we have only three (proxy) target-intervals, the primes of this 5-limit temperament, giving just [math]\displaystyle{ {{3}\choose{2}} = 3 }[/math] bases.


[math]\displaystyle{ \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac31 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 1 & 0 \\ 0 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac31 & \ \ \frac51 \\ \end{array} \\ \left[ \begin{array} {r|r} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ \end{array} \right] \end{array} }[/math]

None of the canonicalization, deficient-matrix filtering, or deduping steps will have any effect for all-interval tuning computations. Any combination from the set of prime intervals will already be in canonical form, full-column-rank, and distinct from any other combination. Easy peasy.

So now we convert to generators, using the [math]\displaystyle{ G = \mathrm{U}(M\mathrm{U})^{-1} }[/math] trick:


[math]\displaystyle{ \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \frac32 \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & {-1} \\ 0 & 1 \\ 0 & 0 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac21 & \ \ \sqrt[4]{5} \\ \end{array} \\ \left[ \begin{array} {rrr} 1 & 0 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] \end{array} , \begin{array} {c} \ \ \begin{array} {rrr} \frac{3}{\sqrt[4]{5}} & \ \ \sqrt[4]{5} \\ \end{array} \\ \left[ \begin{array} {rrr} 0 & 0 \\ 1 & 0 \\ {-\frac14} & \frac14 \\ \end{array} \right] \end{array} }[/math]


So these are our candidate generator embeddings. In other words, if we seek to minimize the [math]\displaystyle{ 1 }[/math]-norm of the retuning map for meantone temperament, these are 3 pairs of generators we should check. Though remember we can simplify to checking the [math]\displaystyle{ 1 }[/math]-sum, which is just another way of saying the sum of the retunings. So each of these generator pairs corresponds to a pair of primes being tuned pure, because these are the tunings where the sum of retunings is minimized.

If we want primes 2 and 3 to both be pure, we use generators of [math]\displaystyle{ \frac21 }[/math] and [math]\displaystyle{ \frac32 }[/math] (Pythagorean tuning). If we want primes 2 and 5 to be pure, we use generators of [math]\displaystyle{ \frac21 }[/math] and [math]\displaystyle{ \sqrt[4]{5} }[/math] (quarter-comma tuning). If we want primes 3 and 5 to be pure, we use generators [math]\displaystyle{ \frac{3}{\sqrt[4]{5}} ≈ 2.006 }[/math] and [math]\displaystyle{ \sqrt[4]{5} }[/math] (apparently named "quarter-comma 3eantone" tuning).

We note that at the analogous point in the zero-damage method for ordinary tunings, we pointed out that the choice of [math]\displaystyle{ W }[/math] was irrelevant up to this point; similarly, here, the choice of [math]\displaystyle{ S }[/math] has thus far been irrelevant, though it will certainly affect things in the next step.

To decide between these candidates, we check each of them for the magnitude of the error on the primes.

  • Pythagorean tuning causes a magnitude of 9.262 ¢/oct of error (all on prime 5).
  • Quarter-comma tuning causes a magnitude of 3.393 ¢/oct of error (all on prime 3).
  • Quarter-comma 3eantone tuning causes a magnitude of 5.377 ¢/oct of error (all on prime 2).

And so quarter-comma tuning is our winner with the least retuning magnitude. That's the minimax-MS tuning of meantone.
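Here's a sketch of that final check in Python/NumPy, assuming (as the ¢/oct unit suggests) that each prime's error is scaled by the reciprocal of its log-product complexity, [math]\displaystyle{ \log_2{p} }[/math], before taking the [math]\displaystyle{ 1 }[/math]-sum; the generator embeddings are the ones found just above.

<syntaxhighlight lang="python">
import numpy as np

log2 = np.log2
j = 1200 * np.array([1.0, log2(3), log2(5)])   # just-prime tuning map 𝒋 (¢)
M = np.array([[1, 1, 0],
              [0, 1, 4]])                      # meantone mapping M

candidates = {                                 # the three candidate generator embeddings
    'Pythagorean':            np.array([[1, -1], [0, 1], [0,     0   ]]),
    'quarter-comma':          np.array([[1,  0], [0, 0], [0,     0.25]]),
    'quarter-comma 3eantone': np.array([[0,  0], [1, 0], [-0.25, 0.25]]),
}

for name, G in candidates.items():
    r = j @ G @ M - j                                # retuning map 𝒓 (¢)
    magnitude = np.sum(np.abs(r) / log2([2, 3, 5]))  # 1-sum of scaled prime errors (¢/oct)
    print(name, round(magnitude, 3))
# Pythagorean 9.262, quarter-comma 3.393, quarter-comma 3eantone 5.377
</syntaxhighlight>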

With alternative complexities

No examples will be given here, on account of the lack of popularity of these tunings.

Coinciding-damage method

(WIP)

  1. Gene Ward Smith discovering this relationship: https://yahootuninggroupsultimatebackup.github.io/tuning-math/topicId_16172#16172
  2. The actual answer is more like 100.236. The result here is due to compounding rounding errors that I was too lazy to account for when preparing these materials. Sorry about that. ~Douglas
  3. Ideally we'd've consistently applied the Fraktur-styling effect to each of these letters, changing no other properties, i.e. ended up with an uppercase italic M and lowercase bold italic j and t, but unfortunately no consistent effect was available using Unicode and the wiki's [math]\displaystyle{ \LaTeX }[/math] abilities, at least none that also satisfactorily captured the compound aspect of what these things represent.
  4. Perhaps we could nail this down by re-running this process in recognition of the fact that these matrices are shorthand for an underlying system of equations, and that the derivative of [math]\displaystyle{ 𝒈 }[/math] is, in fact, its gradient, or in other words, the vector of partial derivatives with respect to each of its entries (as discussed in more detail in the later section, #Multiple derivatives).
  5. If you don't dig it, please consider alternative attempts to explain these ideas here: User:Sintel/Generator_optimization#Constraints, here: Constrained_tuning/Analytical_solution_to_constrained_Euclidean_tunings, and here: Target tuning#Least squares tunings
  6. This is a different lambda to the one conventionally used for eigenvalues, or as we call them, scaling factors. This lambda refers to Lagrange, the mathematician who developed this technique.
  7. To help develop your intuition for these sorts of problems, we recommend Grant Sanderson's series of videos for Khan Academy's YouTube channel, about Lagrange multipliers for constrained optimizations: https://www.youtube.com/playlist?list=PLCg2-CTYVrQvNGLbd-FN70UxWZSeKP4wV
  8. See https://en.m.wikipedia.org/wiki/Lagrange_multiplier#Multiple_constraints for more information.
  9. [math]\displaystyle{ \begin{align} \begin{array} {c} 𝔐 \\ \left[ \begin{array} {c} 𝕞_{11} & 𝕞_{12} \\ 𝕞_{21} & 𝕞_{22} \\ \end{array} \right] \end{array} \begin{array} {c} 𝔐^\mathsf{T} \\ \left[ \begin{array} {c} 𝕞_{11} & 𝕞_{21} \\ 𝕞_{12} & 𝕞_{22} \\ \end{array} \right] \end{array} &= \\[12pt] \begin{array} {c} 𝔐𝔐^\mathsf{T} \\ \left[ \begin{array} {c} 𝕞_{11}^2 + 𝕞_{12}^2 & 𝕞_{11}𝕞_{21} + 𝕞_{12}𝕞_{22} \\ 𝕞_{11}𝕞_{21} + 𝕞_{12}𝕞_{22} & 𝕞_{21}^2 + 𝕞_{22}^2 \\ \end{array} \right] \end{array} &∴ \\[12pt] (𝔐𝔐^\mathsf{T})_{12} = (𝔐𝔐^\mathsf{T})_{21} \end{align} }[/math]
  10. Writes the present author, Douglas Blumeyer, who is relieved to have completely demystified this process for himself, after being daunted by it for over a year, then struggling for a solid week to assemble it from the hints left by the better-educated tuning theorists who came before me.
  11. Another reason we wrote up the method for this optimization power is that it was low-hanging fruit: it was already described on the wiki, on the Target tunings page, where it is presented as a method for finding "minimax" tunings, not miniaverage tunings. This is somewhat misleading, because while this method works for any miniaverage tuning scheme, it only works for some minimax tuning schemes, under very specific conditions (which that page does meet, and so it's not outright wrong). The conditions are: unity-weight damage (check), and all members of the target-interval set expressible as products of other members (check, due to their choice of target-interval set, closely related to a tonality diamond, plus octaves are constrained to be unchanged). Here's why these are the two necessary conditions for this miniaverage method to work for a minimax tuning scheme. When solving for the minimax, you actually want the tunings where the target-intervals' damages equal each other, not where they are zero; these zero-damage tunings will only match the tunings where other intervals' damages equal each other in the case where two intervals' damages being equal implies that another target's damage is zero, because that other target is the product of those first two. And the unity-weight damage requirement ensures that the slopes of each target's hyper-V are all the same, because otherwise the points where two damages are equal like this will no longer line up directly over the point where the third target's damage is zero. For more information on this problem, please see the discussion page for the problematic wiki page, where we are currently requesting the page be updated accordingly.
  12. Note that this technique for converting zero-damage points to generator tunings is much simpler than the technique described on the Target tunings page. The Target tunings page uses eigendecomposition, which unnecessarily requires you to find the commas for the temperament, compute a full projection matrix [math]\displaystyle{ P }[/math], and then when you need to spit a generator tuning map [math]\displaystyle{ 𝒈 }[/math] out at the end, requires the computation of a generator preimage transversal to do so (moreover, it doesn't explain or even mention eigendecomposition; it assumes the reader knows how and when to do them, cutting off at the point of listing the eigenvectors — a big thanks to Sintel for unpacking the thought process in that article for us). The technique described here skips the commas, computing the generator embedding [math]\displaystyle{ G }[/math] directly rather than via [math]\displaystyle{ P = GM }[/math], and then when you need to spit a generator tuning map out at the end, it's just [math]\displaystyle{ 𝒈 = 𝒋G }[/math], which is much simpler than the generator preimage transversal computation.

    The Target tunings approach and this approach are quite similar conceptually. Here's the Target tunings approach:

    [math]\displaystyle{ \scriptsize \begin{array} {c} P \\ \left[ \begin{array} {rrr} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & \frac14 & 1 \\ \end{array} \right] \end{array} = \\ \scriptsize \begin{array} {c} \mathrm{V} \\ \left[ \begin{array} {r|r|r} \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#F2B2B4;padding:5px}{-4} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#F2B2B4;padding:5px}{4} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#F2B2B4;padding:5px}{-1} \\ \end{array} \right] \end{array} \begin{array} {c} \textit{Λ} \\ \left[ \begin{array} {rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ \end{array} \right] \end{array} \begin{array} {c} \mathrm{V} \\ \left[ \begin{array} {r|r|r} \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#F2B2B4;padding:5px}{-4} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#F2B2B4;padding:5px}{4} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#F2B2B4;padding:5px}{-1} \\ \end{array} \right]^{{\Large-1}} \end{array} }[/math]

    And the technique demonstrated here looks like this:

    [math]\displaystyle{ \scriptsize \begin{array} {c} G \\ \left[ \begin{array} {rrr} 1 & 1 \\ 0 & 0 \\ 0 & \frac14 \\ \end{array} \right] \end{array} = \\ \scriptsize \begin{array} {c} \mathrm{U} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{1} \\ \end{array} \right] \end{array} \Large ( \scriptsize \begin{array} {c} M \\ \left[ \begin{array} {rrr} 1 & 0 & {-4} \\ 0 & 1 & 4 \\ \end{array} \right] \end{array} \begin{array} {c} \mathrm{U} \\ \left[ \begin{array} {r|r} \style{background-color:#98CC70;padding:5px}{1} & \style{background-color:#98CC70;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{0} \\ \style{background-color:#98CC70;padding:5px}{0} & \style{background-color:#98CC70;padding:5px}{1} \\ \end{array} \right] \end{array} \Large )^{-1} \scriptsize }[/math]

    So in the Target tunings approach, [math]\displaystyle{ P }[/math] is the projection matrix, [math]\displaystyle{ \mathrm{V} }[/math] is a matrix consisting of a list of unrotated vectors — both ones with scaling factor 1 (unchanged-intervals) and those with scaling factor 0 (commas) — and [math]\displaystyle{ \textit{Λ} }[/math] is a diagonal scaling factors matrix, where you can see along the main diagonal we have 1's paired with the unrotated vectors for unchanged-intervals and 0's paired with the unrotated vectors for commas.

    In our approach, we instead solve for [math]\displaystyle{ G }[/math] by leaving the commas out of the equation, and simply using the mapping [math]\displaystyle{ M }[/math] instead.

    In addition to being much more straightforward and easier to understand, our technique gives the same results and reduces computation time by about a third (it took the computation of a miniaverage-U tuning for a rank-3, 11-limit temperament with 15 target-intervals from 12 seconds down to 8).
  13. Technically this gives us the tunings of the generators, in ¢/g.
  14. The article for the minimax tuning scheme, Target tunings, suggests that you fall back to the miniRMS method to tie-break between these, but that sort of misses the point of the problem. The two tied points are on extreme opposite ends of the slice of good solutions, and the optimum solution lies somewhere in between them. We don't want the tie-break to choose one or the other extreme; we want to find a better solution somewhere in between them.
  15. There does not seem to be any consensus about how to identify a true optimum in the case of multiple solutions when [math]\displaystyle{ p=1 }[/math]. See https://en.wikipedia.org/wiki/Least_absolute_deviations#Properties, https://www.researchgate.net/publication/223752233_Dealing_with_the_multiplicity_of_solutions_of_the_l1_and_l_regression_models, and https://stats.stackexchange.com/questions/275931/is-it-possible-to-force-least-absolute-deviations-lad-regression-to-return-the.
  16. In fact, the Target tunings page of the wiki uses this more complicated approach in order to realize pure octaves, and so the authors of this page had to reverse engineer from it how to make it work without any held-intervals.
  17. Held-intervals should generally be removed if they also appear in the target-interval list [math]\displaystyle{ \mathrm{T} }[/math]. If these intervals are not removed, the correct tuning can still be computed; however, during optimization, effort will have been wasted on minimizing damage to these intervals, because their damage would have been held to 0 by other means anyway. In general, it should be more computationally efficient to remove these intervals from [math]\displaystyle{ \mathrm{T} }[/math] in advance, rather than submit them to the optimization procedures as-is. Duplication of intervals between these two sets will most likely occur when using a target-interval set scheme (such as a TILT or OLD) that automatically chooses the target-interval set.