User:FloraC/Hard problems of harmony and psychoacoustically supported optimization

From Xenharmonic Wiki

In the study of tuning optimization, we find two blocking issues that deserve the title "hard problems of harmony". Phrased in a catchy way, they are:

  1. Is compositeness heard?
  2. Are divisive ratios more important than multiplicative ratios?[1]

In fact, both can be modeled as parameters of the norm used in optimization. The first problem is about the weight, and the second about the skew. The order of the norm is the third parameter. It is a little too abstract to be phrased as a "hard problem", but we must still consider it along with the first two. Being independent of specific temperaments, these three parameters are genuine metaproblems of tuning optimization, and well worth a dive.

But let us remember that a metaproblem of tuning optimization is a problem of harmony – of temporal interactions of sound. Any attempted solution that does not touch on the psychoacoustic side of harmony would be baseless. So that is where we start.

Chapter I. Harmonic Rootedness

There are two main categories of rootedness: chordal rootedness and tonal rootedness.

Chordal rootedness is said with respect to an individual chord. A chord with chordal rootedness is dubbed a rooted chord. Such a chord feels distinctly resolved and self-contained: it speaks of itself as a center of reference, as a result of the alignment of the chord's formal root with its virtual fundamental. Let us clarify these terms before we proceed.

A formal root is simply the pitch that is denoted in terms of ratios as unity. Determination of a chord's formal root is part of the traditional pedagogy of harmony, but it is hard to pin down as it depends on many factors: the specific chord structure, the texture and perhaps the genre of the piece. So it is often found heuristically. The bass supported by the perfect fifth is typically the best candidate for a formal root in a positive, octave-equivalent context, but such a structure is lacking in many chords, especially in nontertian harmony. Another reasonable strategy in the same context is to always place the formal root on the bass. In any case, this root is considered to be the starting point of a chord, on top of which the other notes are built.

By contrast, the chord's actual root is what we know as the virtual fundamental. This is the particular pitch on which a chord appears to be a single harmonic note after the phenomenon of timbral fusion, so analytically it emerges at the GCD of all the ratios of the chord. This root, being simultaneously "virtual" and "actual", presents an interesting case on how we think about it. Its virtuality is reflected by the fact that the pitch need not be present – neither as a note nor even as an energy stream in the spectrogram. Its actuality is evidenced by the fact that we literally hear it due to the gestalt of harmonic series.

As such, a chord is rooted if the GCD of all the ratios is unity, either up to octave equivalence or not. In the case of dyads, Aura calls it connectivity[2].

Take the just major triad as an example: 1–3–5. The formal root is by definition 1; the virtual fundamental here is their GCD, again 1. Since they line up perfectly, we call it a rooted chord. If we examine the octave-reduced version of the same chord: 1–5/4–3/2, the virtual fundamental becomes 1/4, so that is only a rooted chord on the condition of octave equivalence.

So far we have talked about otonal rootedness, while there is also utonal rootedness. The just minor triad, 1–1/3–1/5, is a utonally rooted chord, whose virtual fundamental emerges at the LCM and will be exposed in a subharmonic timbre. Matching otonally rooted chords with harmonic timbre, or utonally rooted chords with subharmonic timbre, is the state we may describe as "present", "explicit", or "revealed", which is the essential feeling of the major tonality. The opposite state is then "absent", "implicit" or "covered", which is the essential feeling of the minor tonality, as discussed in There Is Not a Third Side of the River.

1–7/6–3/2 is an example of a nonrooted chord. The otonal virtual fundamental is 1/6. That means the chord suggests an otonal root that is 6/1 below the bass. For the utonal virtual fundamental, we rewrite the chord as 1–7/9–2/3, and find it to be 14/1. That means the chord suggests a utonal root that is 14/1 above the highest note.
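The virtual fundamentals quoted so far can be checked mechanically. The following is a minimal sketch, assuming chords are given as lists of exact ratios; the function names `otonal_fundamental` and `utonal_fundamental` are illustrative and not part of any established library. It uses the fact that the GCD of reduced fractions is the GCD of the numerators over the LCM of the denominators, and vice versa for the LCM.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def otonal_fundamental(chord):
    """GCD of the chord's ratios: gcd of numerators over lcm of denominators."""
    ratios = [Fraction(r) for r in chord]
    numerator = reduce(gcd, [r.numerator for r in ratios])
    denominator = reduce(lcm, [r.denominator for r in ratios])
    return Fraction(numerator, denominator)


def utonal_fundamental(chord):
    """LCM of the chord's ratios: lcm of numerators over gcd of denominators."""
    ratios = [Fraction(r) for r in chord]
    numerator = reduce(lcm, [r.numerator for r in ratios])
    denominator = reduce(gcd, [r.denominator for r in ratios])
    return Fraction(numerator, denominator)


print(otonal_fundamental(["1", "3", "5"]))      # 1    (rooted)
print(otonal_fundamental(["1", "5/4", "3/2"]))  # 1/4  (rooted up to octaves)
print(utonal_fundamental(["1", "1/3", "1/5"]))  # 1    (utonally rooted)
print(otonal_fundamental(["1", "7/6", "3/2"]))  # 1/6  (nonrooted)
print(utonal_fundamental(["1", "7/9", "2/3"]))  # 14   (nonrooted)
```

A chord is rooted in the sense used here when the result is 1, or a power of 2 if octave equivalence is assumed.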

Tonal rootedness is said with respect to a tonal musical passage featuring a chord progression, and is closely related to functional harmony. There was a myth that, in tonal harmony of common practice, all the chords in a passage would have a common root. They said a passage in C major would typically be rooted on Bb or F. That was incorrect and was probably a consequence of not distinguishing chordal and tonal rootedness. In fact, we just showed that each chord has its own root. A C major triad is rooted on C, a G major triad is rooted on G, and so on[3]. A chord progression of C–F–G–C, for example, does not set the whole passage's root to F just because it involves a chord rooted on F. Quite the opposite, the chordal root of F is there to strengthen the tonal root of C, namely the tonic.

In general, the tonal root simply corresponds to the tonic, decided by a series of harmonic functions, which in turn is decided by a series of formal roots. It is decided by the formal roots of chords, not their actual roots, because the formal root is the perceived starting point of any chord. Using a nonrooted dominant chord, for example, does not disqualify it from being a dominant. Actually, its dominant function will be even stronger if the suggestive nature of the virtual fundamental is put to good use – to suggest the tonal root in this case.

Chapter II. Divisive and Multiplicative Ratios

Divisive ratios and multiplicative ratios are always said relative to each other. If a divisive ratio is of the form n/d, where n and d are integers, then a multiplicative ratio is of the form nd. For example, 5/3 is a divisive ratio; 15/1 is a multiplicative ratio. The question is, thus, whether ratios of the form n/d are more important than those of the form nd.

The problem is hard because it is not clear what is implied by importance and what context it can be applied to. Of course, importance means simplicity. But simplicity of ratios is used in two major contexts: chord construction and tuning optimization, and they correspond to distinct psychoacoustic effects. Chord construction has to do with the revelation of harmonic identities due to timbral fusion to a virtual fundamental as discussed in the last chapter, whereas tuning optimization has to do with percept formation and excitation and, ultimately, with the minimization of mistuned beating. These are fundamentally different effects – this essay takes the liberty of being the first to treat them separately.

The odd-limit tonality diamond fully favors divisive ratios over multiplicative ones, as the odd limit of a ratio is equal to the exponentiation of the Kees height, a norm in a lattice skewed towards divisive ratios by 1/12 turn. It is useful in just chord construction. Consider the just major triad again. While 5/1 and 3/1 are the only ratios used to build the chord, the interval between them – 5/3 – is a real, played interval, unlike the multiplicative ratio 15/1, which is not played, only present in the harmonics. Likewise, using any harmonics as components of a just chord causes all the ratios between them to be played, and thus to be emergent. Unless we stick to bare dyads, nothing could be more appropriate than adopting a metric that favors divisive ratios, especially the tonality diamonds.

Such a metric will guide us to psychoacoustically supported chord construction, which spontaneously obeys the rules of primodalism as well. In a primodal chord like 1–6/5–7/5–8/5, notice how all the notes consolidate each other through the bond to a common virtual fundamental of 1/5. It is not rooted, but its harmonic identity is evident. Now consider 1–6/5–10/7–8/5, where we put 10/7 in place of 7/5, and it immediately becomes a 25-odd-limit chord. The virtual fundamental is pushed even lower to 1/35 as the denominators "fight" each other. As a result, a chord like this has little harmonic identity to listen to, and little utility to be concerned with.
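The odd-limit figures in this comparison can be reproduced with a short sketch. It assumes the usual definition of the odd limit of a ratio (the larger of the odd parts of its numerator and denominator) and takes the odd limit of a chord to be the largest odd limit among all the dyads it contains; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def odd_part(n):
    """Strip all factors of 2 from a positive integer."""
    while n % 2 == 0:
        n //= 2
    return n


def odd_limit(ratio):
    """Odd limit of a single ratio."""
    r = Fraction(ratio)
    return max(odd_part(r.numerator), odd_part(r.denominator))


def chord_odd_limit(chord):
    """Largest odd limit over every pair of notes in the chord."""
    notes = [Fraction(x) for x in chord]
    return max(odd_limit(b / a) for a, b in combinations(notes, 2))


print(chord_odd_limit(["1", "6/5", "7/5", "8/5"]))   # 7
print(chord_odd_limit(["1", "6/5", "10/7", "8/5"]))  # 25
```

The jump to 25 comes from the dyads 25/21 (between 6/5 and 10/7) and 28/25 (between 10/7 and 8/5), which only appear once 10/7 replaces 7/5.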

Figure 1: Frequency spectrum of just major triad in semisine wave

The same cannot be assumed for tuning optimization, since that is a vastly different scenario. In a just major triad, the 15th harmonic exists in three ways: as the harmonic of the root, of the 3rd harmonic, and of the 5th harmonic. Figure 1 is the frequency spectrum of the triad played in the semisine waveform, which has been proposed as the standard ear-training waveform in Proposed Standard Ear-Training Waveform.

If we play such a triad in a tempered tuning profile, the quality of the chord is determined by how the three components said above line up. In a tuning profile characterized by the error map

$$ \mathcal{E}_1 = \left\langle \begin{matrix} 0 & -\varepsilon & +\varepsilon \end{matrix} \right] $$

the ~15/1 will be a combination of harmonics with pitch errors of −ε, 0, and +ε. In addition, the harmonic itself can be played as a dyad and its pitch error is 0.

Now consider the contrasting profile

$$ \mathcal{E}_2 = \left\langle \begin{matrix} 0 & -\varepsilon & -\varepsilon \end{matrix} \right] $$

the ~15/1 will be a combination of harmonics with pitch errors of −ε, −ε, and 0, but the played harmonic is at −2ε. So we see this harmonic will get pretty off the track whenever played.

Regarding 5/3, it is the opposite situation. Ɛ2 comes out superior to Ɛ1 as it perfectly hits 5/3 whereas Ɛ1's ~5/3 is off by +2ε.
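The bookkeeping for the two profiles can be written out as a dot product between an error map and a monzo. This is a minimal sketch with ε set to 1 unit for readability; the function name `interval_error` is an illustrative choice.

```python
def interval_error(error_map, monzo):
    """Pitch error of an interval: error map applied to its prime-exponent vector."""
    return sum(e * m for e, m in zip(error_map, monzo))


eps = 1  # stands in for the symbolic error epsilon

E1 = [0, -eps, +eps]  # primes 3 and 5 err in opposite directions
E2 = [0, -eps, -eps]  # primes 3 and 5 err in the same direction

MONZO_15_1 = [0, 1, 1]   # 15/1 = 3 * 5
MONZO_5_3 = [0, -1, 1]   # 5/3  = 5 / 3

for name, E in (("E1", E1), ("E2", E2)):
    print(name, "15/1:", interval_error(E, MONZO_15_1),
          " 5/3:", interval_error(E, MONZO_5_3))
# E1 15/1: 0  5/3: 2
# E2 15/1: -2  5/3: 0
```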

However, the beating occurs at ~15/1 and multiples thereof, not at ~5/3. The ~5/3, played as a nonrooted dyad, is free from a real reference point (harmonic series) for it to beat against, so it lacks relevance in tuning optimization. The only scenario to account for its accuracy is where it is played on the chord's formal root, in which case its 3rd harmonic beats against the formal root's 5th harmonic, for example. That is still not a good argument for its relative importance since we would have manipulated the chord structure just in order to obtain this result. A chord with ~15/1 played on the formal root would call for an accurate ~15/1 and then neutralize the demand for an accurate ~5/3 as previously posed. For example, the demands posed by the just major sixth chord 1–5/4–3/2–5/3 and by the just major seventh chord 1–5/4–3/2–15/8 cancel each other out up to octave equivalence. More generally, for any chord featuring a divisive ratio on the formal root, there is a counterpart featuring a multiplicative ratio alike.

We should also note that the just minor triad is of equal complexity to the just major triad by the principle of invertibility. The just major triad is sometimes considered to be more important by being isodifferential and thus having a common beating rate. The just minor triad is also isodifferential, though not with respect to frequency but to its inverse, the length of a virtual vibrating string. Optimizing for the just minor triad requires us to put it in the context of negative harmony. Starting at the top and stepping downwards, the optimization targets are first 1/3 and then 1/5, which are analytically equivalent to 3/1 and 5/1 respectively in positive harmony.

All of this suggests that divisive ratios and multiplicative ratios are of practically equal importance in tuning optimization.

Chapter III. Power in Proportion

The first ever attempt at a systematic tuning solution was Paul Erlich's TOP tuning[4]. This tuning was elegantly explained in his Middle Path paper in the case of nullity-1 (i.e. single-comma temperaments)[5]. In this tuning, every prime makes an effort in the right direction to close out the comma. To illustrate, consider 5-limit meantone, and to simplify it even more, let us start with the constrained equilateral-optimal tuning (CEOP tuning) instead since its effect is the easiest to observe. The CEOP tuning of 5-limit meantone is given in terms of the projection map P as

$$ P = \begin{bmatrix} 1 & 4/5 & -4/5 \\ 0 & 1/5 & 4/5 \\ 0 & 1/5 & 4/5 \end{bmatrix} $$

Denoting the just tuning map in cents by TJ, the error map Ɛ is

$$ \begin{align} \mathcal{E} &= T_J(P - I) \\ &= \left\langle \begin{matrix} 0 & -4.3013 & +4.3013 \end{matrix} \right] \end{align} $$

That is the 1/5-comma tuning, in which harmonics 3 and 5 have errors of equal magnitude and opposite sign.
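The step from the projection map to the error map can be reproduced numerically. The sketch below assumes the convention visible in the matrix above, where column j of P holds the monzo that prime j is projected to, and evaluates T_J(P − I) as a row vector times a matrix. The names `JUST_MAP` and `error_map` are illustrative.

```python
from fractions import Fraction as F
from math import log2

PRIMES = [2, 3, 5]
JUST_MAP = [1200 * log2(q) for q in PRIMES]  # T_J in cents


def error_map(P):
    """Row vector T_J (P - I) for a square projection map P."""
    n = len(P)
    return [
        sum(JUST_MAP[i] * (P[i][j] - (1 if i == j else 0)) for i in range(n))
        for j in range(n)
    ]


# CEOP projection map of 5-limit meantone, copied from the text
P_CEOP = [
    [F(1), F(4, 5), F(-4, 5)],
    [F(0), F(1, 5), F(4, 5)],
    [F(0), F(1, 5), F(4, 5)],
]

errors = error_map(P_CEOP)
print([round(e, 4) for e in errors])  # approximately [0, -4.3013, +4.3013]
```

The same function should apply unchanged to any other projection map written in this convention.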

TOP tuning works on the same principle, except that harmonic 2 is no longer constrained to be pure and that the allowed error of each prime q is log2 (q) times that of prime 2. The TOP error map of 5-limit meantone is

$$ \mathcal{E} = \left\langle \begin{matrix} +1.6985 & -2.6921 & +3.9439 \end{matrix} \right] $$
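For a single comma, the rule just stated determines the error map in closed form: every prime appearing in the comma takes an error proportional to its log2, signed so as to shrink the comma. The sketch below is written under that assumption and is not a general TOP solver; primes absent from the comma are simply left at zero error here, and the function name is illustrative.

```python
from math import log2


def top_error_map_single_comma(primes, comma):
    """Errors in cents with |e_q| proportional to log2(q), closing out the comma."""
    just = [1200 * log2(q) for q in primes]
    comma_cents = sum(m * j for m, j in zip(comma, just))
    # Total "length" of the comma: sum of |exponent| * size of each prime
    total = sum(abs(m) * j for m, j in zip(comma, just))
    k = comma_cents / total  # error per cent of prime size
    sign = lambda m: (m > 0) - (m < 0)
    return [-sign(m) * k * j for m, j in zip(comma, just)]


# 81/80 = 2^-4 * 3^4 * 5^-1
top_errors = top_error_map_single_comma([2, 3, 5], [-4, 4, -1])
print([round(e, 2) for e in top_errors])  # roughly [+1.70, -2.69, +3.94]
```

Each prime moves in the direction that shrinks the comma, and the ratio of any error to that of prime 2 is log2 (q), as described above.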

The Euclidean norm we are covering next differs in a very important way. In Manhattan tunings such as TOP, each prime makes equal effort to close out the comma, whereas in Euclidean tunings, how much tempering load is assigned to each prime is proportional to how efficient the prime is to close out the comma.

The effect is most clearly observed in the 5-limit CEE tuning (constrained equilateral-Euclidean tuning), where 2 is constrained to pure and only 3 and 5 placed equally distant in the lattice are under consideration. Here is the CEE projection map of 5-limit meantone:

$$ P = \begin{bmatrix} 1 & 16/17 & -4/17 \\ 0 & 1/17 & 4/17 \\ 0 & 4/17 & 16/17 \end{bmatrix} $$

The error map is

$$ \begin{align} \mathcal{E} &= T_J(P - I) \\ &= \left\langle \begin{matrix} 0 & -5.0603 & +1.2651 \end{matrix} \right] \end{align} $$

That is the 4/17-comma tuning, in which prime 3 gets four times the error of prime 5 in the opposite direction. Notice how prime 3 contributes more to close out the comma since it is better at doing it, and the amount of load assigned to it is exactly 4 times that to prime 5.
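The proportional load sharing can be made explicit for any single-comma temperament with a pure octave. The sketch below assumes the objective is the sum of squared weighted prime errors, with the weights read as importance ratings; under that assumption each prime's error is proportional to its comma exponent divided by its squared weight. The function name and argument layout are illustrative.

```python
from math import log2


def constrained_euclidean_errors(primes, comma, weights):
    """Least-squares prime errors (cents) for one comma, first prime held pure.

    Minimizes sum((w_i * e_i)**2) subject to the comma being tempered out.
    Weights of the unconstrained primes must be positive.
    """
    just = [1200 * log2(q) for q in primes]
    comma_cents = sum(m * j for m, j in zip(comma, just))
    # Efficiency of each prime at closing the comma; the octave takes no load
    efficiency = [0.0] + [m / w ** 2 for m, w in zip(comma[1:], weights[1:])]
    denom = sum(m * f for m, f in zip(comma, efficiency))
    return [-comma_cents * f / denom for f in efficiency]


# Equilateral weight, 5-limit meantone (81/80 tempered out)
cee_errors = constrained_euclidean_errors([2, 3, 5], [-4, 4, -1], [1, 1, 1])
print([round(e, 4) for e in cee_errors])  # approximately [0, -5.0603, +1.2651]
print(cee_errors[1] / cee_errors[2])      # close to -4, matching the 4:1 load
```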

Understanding Euclidean norms makes Chebyshevian norms much easier to grasp. In short, Chebyshevian norms take it to the other extreme, and can be described as demonstrating the "collapsing effect". In the case of nullity-1, it means the most efficient prime gets all the tempering load and other primes get no load at all. The CEC (constrained equilateral-Chebyshevian) tuning of 5-limit meantone is

$$ P = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 1/4 & 1 \end{bmatrix} $$

The error map is

$$ \begin{align} \mathcal{E} &= T_J(P - I) \\ &= \left\langle \begin{matrix} 0 & -5.3766 & 0 \end{matrix} \right] \end{align} $$

That is our familiar 1/4-comma tuning. It is surprising that no interest has yet developed in tunings by the Chebyshevian norm. Compared to the 4/17-comma tuning by the Euclidean norm, the 1/4-comma tuning by the Chebyshevian norm removes all errors in prime 5 at the cost of just a little bit more in prime 3.
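The collapsing effect for a single comma can be sketched in a few lines. The assumption here is that a Chebyshevian norm on intervals corresponds to minimizing the sum of absolute weighted prime errors, so that the whole comma is absorbed by the one prime that closes it most cheaply. Ties between primes are not handled, and the function name is illustrative.

```python
from math import log2


def constrained_collapsed_errors(primes, comma, weights):
    """Put the entire tempering load on the most efficient non-octave prime."""
    just = [1200 * log2(q) for q in primes]
    comma_cents = sum(m * j for m, j in zip(comma, just))
    # Comma closed per unit of weighted error, for each non-octave prime
    best = max(range(1, len(primes)), key=lambda i: abs(comma[i]) / weights[i])
    errors = [0.0] * len(primes)
    errors[best] = -comma_cents / comma[best]
    return errors


cec_errors = constrained_collapsed_errors([2, 3, 5], [-4, 4, -1], [1, 1, 1])
print([round(e, 4) for e in cec_errors])  # approximately [0, -5.3766, 0]
```

With the equilateral weight, prime 3 appears four times in 81/80 against once for prime 5, so it takes the whole comma in four equal parts.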

To evaluate, tuning by the Euclidean norm turns out advantageous not only because it is easy to compute (Euclidean being the only order of norms with analytical solutions) but also because it is theoretically nice, as the more capable are tasked to do proportionately more. Both Manhattan and Chebyshevian tunings show discontinuities when the complexities of the primes are at certain extreme points, and things start to break down as we approach them. Manhattan tunings show strange behaviors when some primes are orders of magnitude more complex than the rest. Chebyshevian tunings are just as strange when all primes have near-equal complexities.

Chapter IV. Art of Compromise

Tempering is the ultimate art of compromise, a global, millennia-old puzzle, for a coarse tuning of 12 equal temperament was actually given in the ancient Chinese book Huai Nan Zi (c. 122 BC) – not that the concept of equal temperament was laid out in any way, but they wanted twelve Pythagorean fifths to close off at the octave![6] So what about this essay? Most likely it will not end the debate but invite more. It is high time we confronted the last hard problem: compositeness of the harmonics.

If we play the 15th harmonic, does it somehow suggest 3 and 5? It seems that even if we do not hear 15 as composite, we may perceive its compositeness in some other way, making it conceptually reducible, and thus simpler than its neighboring prime harmonics. Yet the problem definitely does not end there. Sensing compositeness sounds like a reasonable assertion, but does it make composite intervals more important, or less? Does it make composite intervals deserve more care, or less? That is essentially equivalent to asking if complexity needs more care, or less.

On one hand, we want the majority of chords to be in tune, so obviously the most common intervals should get the best care. The question is then what probability distribution is followed without knowing in advance what kind of harmony will be used. A chi distribution would certainly make sense if we were to talk about randomly generated "tonal" music with no regard for psychoacoustics, since each voice's number of generator steps from the tonic would be supposed to follow a normal distribution. In a world with human beings, and with harmonic clarity rather than the abstract number of generator steps playing the predominant role in forming tonality, the right assumption is definitely not that, but that commonness is inversely related to complexity. The metric can be taken as the inner product of a uniform distribution and the inverse complexity, and if the uniform distribution is replaced with something that favors structurally tonal music such as a chi distribution, we obtain a commonness curve that is biased towards simplicity more heavily than many would expect.

Try thinking of it this way: one could spend their life making music of only plain octaves and fifths without being bored at all. That was what happened in many cultures around the world and no one seemed to have a problem. It is actually our expression of harmonic feelings in intricate multidigit ratios that is the more peculiar endeavor.

On the other hand, it is argued that complex intervals need relatively more care since it is harder to capture their identities. It is believed that complex intervals have a smaller range of tolerance in which their identities will be revealed, which is fairly easy to understand.

The Tenney weight is the weight that strikes a perfect balance between those considerations. In fact, it is the only weight in which tunings on composite subgroups coincide with tunings on prime subgroups, meaning that optimizing a temperament on 2.3.5 or 2.9.5 will render the same result for all the intervals they share. The reason is that each prime q in the prime list Q has an importance rating of 1/log2 (q), represented by the matrix

$$ W = \operatorname {diag} \left( 1/\log_2 (Q) \right) $$

Thus, 9 has exactly twice the tolerance of 3, just as a temperament will tune it: if 3 has +1 cent of error, 9 must have +2 cents since 9 counts as two steps along the path of 3. In general, the Tenney weight makes the complexity curve of all integer harmonics a smooth one. The same is not true for any other weights, such as the equilateral weight we are addressing next.
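The subgroup claim can be tried out on a small case. The sketch below tempers out 81/80 with a pure octave, once on 2.3.5 and once on 2.9.5, using a single-comma least-squares closed form (an assumption of this example, with weights read as importance ratings). Under the Tenney weight the error of 9 should come out the same either way, while under the equilateral weight it should not. All names are illustrative.

```python
from math import log2


def ce_errors(basis, comma, importance):
    """Pure-octave least-squares errors (cents) for one comma on a given basis."""
    just = [1200 * log2(q) for q in basis]
    c = sum(m * j for m, j in zip(comma, just))
    eff = [0.0] + [m / w ** 2 for m, w in zip(comma[1:], importance[1:])]
    denom = sum(m * f for m, f in zip(comma, eff))
    return [-c * f / denom for f in eff]


def tenney(basis):
    return [1 / log2(q) for q in basis]


def equilateral(basis):
    return [1.0] * len(basis)


COMMA_235 = [-4, 4, -1]  # 81/80 written on 2.3.5
COMMA_295 = [-4, 2, -1]  # 81/80 written on 2.9.5

# Error of 9: twice the error of 3 on 2.3.5, or read directly on 2.9.5
tenney_9_via_3 = 2 * ce_errors([2, 3, 5], COMMA_235, tenney([2, 3, 5]))[1]
tenney_9_direct = ce_errors([2, 9, 5], COMMA_295, tenney([2, 9, 5]))[1]
equil_9_via_3 = 2 * ce_errors([2, 3, 5], COMMA_235, equilateral([2, 3, 5]))[1]
equil_9_direct = ce_errors([2, 9, 5], COMMA_295, equilateral([2, 9, 5]))[1]

print("Tenney:     ", tenney_9_via_3, tenney_9_direct)  # agree
print("equilateral:", equil_9_via_3, equil_9_direct)    # disagree
```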

The equilateral weight puts all prime harmonics on a sort of equal footing regardless of their inherent difference, by which is meant, for example, prime 2 lies as far from the origin as prime 13. It is represented by an identity matrix:

$$ W = I $$

That is pretty wrong if viewed from the universe of Tenney weight, as it makes harmonic 8 three times as distant as harmonic 13, with three times the allowed error. One can immediately see the bumps in the complexity curve of integer harmonics. Nonetheless, it reasonably holds its own as it demands the same absolute tolerance for all primes. It only highlights higher primes in a mild manner if the standard is, as they argued, a diminishing tolerance.

The Wilson weight does the opposite to the equilateral weight, as it puts 1/q importance rating to the prime q, represented by the matrix

$$ W = \operatorname {diag} (1/Q) $$

That is also very useful because the tuning will converge even if higher and higher prime harmonics join the party. Theoretically, it can be used to define a kind of "no-limit" tuning as well as other related measurements, which is not possible with Tenney or equilateral weights since they would oscillate forever.

The situation here is rather like the order of norms assessed in the last chapter. Tuning by the Tenney weight is clearly a generalist's choice, making fewer assumptions and bearing the unique property of a smooth complexity curve for all integer harmonics. The equilateral and Wilson weights are defensible in their own ways.

We should also note that the equilateral and Wilson weights render rational tuning maps in terms of monzo lists (i.e. projection maps), whereas Tenney's is transcendental due to the nature of the logarithm. By the law that algebraic matrices only have algebraic eigenvectors, both equilateral and Wilson tunings make certain intervals pure and certain other intervals equally off from pure, conveying a sense of good compromise. Again, in the universe of Tenney weight the appeal can be logically dismissed as follows: due to the nature of the logarithm, no interval should be tuned pure in the first place except when explicitly told to be (i.e. constrained). Those who commit to Tenney tunings should remind themselves of that whenever in doubt. Otherwise, if one is indeed after purely tuned intervals, there are many hidden options besides equilateral and Wilson.

Consider an algebraic approximation to Tenney, that is, a tuning with pure intervals but not changing the bias much towards lower or higher harmonics. Notice the CTE tuning of 5-limit meantone is suspiciously close to 2/9-comma. Let us make up a Euclidean norm that results in exactly 2/9-comma. It turns out the weight must be

$$ W = \operatorname {diag} \left( \left\langle \begin{matrix} w_1 & \sqrt{2} & 1 \end{matrix} \right] \right) $$

with w1 being free since the octave is constrained. This weight always yields rational projection maps for some reason. It can be used to tune all 5-limit temperaments alike, and the weight ratio between 5 and 3 is 1/sqrt (2), very close to log5 (3) in Tenney. In general, the weight ratio between q and 3 should be close to logq (3) and the exact values are left to readers to experiment with.
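That this weight gives exactly 2/9-comma meantone can be verified with a Lagrange multiplier. The derivation below is a sketch under the assumption that the objective is the squared weighted error with a pure octave; the symbols e_3 and e_5 (errors of primes 3 and 5), c (the size of 81/80) and λ (the multiplier) are introduced here for illustration.

$$ \begin{align} \text{minimize} \quad & (\sqrt{2}\, e_3)^2 + e_5^2 \quad \text{subject to} \quad 4 e_3 - e_5 = -c \\ 4 e_3 &= 4 \lambda, \quad 2 e_5 = -\lambda \\ e_3 &= \lambda, \quad e_5 = -\lambda / 2 \\ 4 \lambda + \lambda / 2 &= -c \implies e_3 = -\tfrac{2}{9} c, \quad e_5 = +\tfrac{1}{9} c \end{align} $$

Prime 3 is thus flattened by 2/9 of the comma, and prime 5 sharpened by 1/9 of it.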

Chapter V. Towards an Optimization Strategy

Incorporating all that has been discussed above, I recommend CTE tuning as the best general-purpose reference solution for everyone, whereas my hemi-idiosyncratic answer to tuning optimization is based on a meticulously engineered weight function, which happens to be an unskewed version of the Hahn distance. Let us dub this the Canou[n] weight, and the tuning using this weight the CCnE tuning (for constrained Canou[n]-Euclidean tuning).

In this weight, the n is a positive integer determining the highest relevant harmonic. The weight of any prime harmonic equals its maximum number of stacks without exceeding the n-integer-limit. Different values of n can alter the relative weights of the primes.

To illustrate, let us set n = 9, or 9-integer-limit. Harmonic 2 can be stacked thrice, giving 8. Stacking it four times would give 16, exceeding 9. Its weight is thus 3. Harmonic 3 can be stacked twice, giving 9. Stacking it three times would give 27, exceeding 9. Its weight is thus 2. Both 5 and 7 have unit weight since they can only be stacked once in the integer limit. 11 and beyond have zero weight because they cannot be stacked at all. If optimization is to be carried out for a 13-limit temperament then we have the weights 3, 2, 1, 1, 0, 0 for primes 2 to 13. The weights are different if n = 7, or 7-integer-limit, for example. The weight of 2 is 2, of 3, 5 and 7 is unity, and of 11 and 13 zero, giving 2, 1, 1, 1, 0, 0 for primes 2 to 13.

The Canou[n] weight matrix is given as

$$ W = \operatorname {diag} \left( \operatorname {floor} \left( \log_Q (n) \right) \right) $$

which indicates that the prime q in Q has the weight equal to floor (logq (n)).
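The weight is easy to compute by counting stacks directly, which avoids floating-point trouble with the logarithm at exact powers. A minimal sketch, with illustrative function names:

```python
def canou_weight(q, n):
    """Largest k with q**k <= n, i.e. floor(log_q(n)), using integers only."""
    k, power = 0, q
    while power <= n:
        k += 1
        power *= q
    return k


def canou_weights(primes, n):
    """Diagonal of the Canou[n] weight matrix for the given prime list."""
    return [canou_weight(q, n) for q in primes]


PRIMES_13 = [2, 3, 5, 7, 11, 13]
print(canou_weights(PRIMES_13, 9))  # [3, 2, 1, 1, 0, 0]
print(canou_weights(PRIMES_13, 7))  # [2, 1, 1, 1, 0, 0]
```

A prime with zero weight is ignored by the optimization altogether, which is how the n-integer-limit acts as a cutoff.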

The Tenney weight is a special case of the Canou[n] weight, where n → infinity. The only thing that sets Canou[n] apart from Tenney is the floor function (since logQ (n) = log2 (n)/log2 (Q) and log2 (n) is a constant), and its effect converges to zero as n gets sufficiently large. Conceptualizing the Tenney weight in this way is not recommended, though, because Tenney's is characteristically transcendental whereas all the other Canou[n] weights are algebraic.

That defines the CnC, CnE, and CnOP tunings, but if we constrain the octave to be pure, it does not matter how many times the octave is stacked, making the integer limit equivalent to the largest odd limit not exceeding it. The proposed convention is to always use the largest number n if multiple consecutive choices of n will give the same CCnE tuning. For example, CC13E, CC14E, CC15E, and CC16E are all equivalent and one should always write CC16E.
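The naming convention can be automated. With the octave pure, the weights of the odd primes only change when n reaches a power of an odd prime, so the preferred label is found by raising n until the next integer is such a power. The sketch below rests on that observation; the function names are illustrative.

```python
def is_odd_prime_power(m):
    """True if m = p**k for an odd prime p and k >= 1."""
    if m < 3 or m % 2 == 0:
        return False
    p = 3
    while p * p <= m:
        if m % p == 0:
            # p is the smallest prime factor; m must be a pure power of it
            while m % p == 0:
                m //= p
            return m == 1
        p += 2
    return True  # no factor found, so m itself is an odd prime


def canonical_n(n):
    """Largest n' >= n giving the same odd-prime Canou weights as n."""
    while not is_odd_prime_power(n + 1):
        n += 1
    return n


print([canonical_n(n) for n in (13, 14, 15, 16)])  # [16, 16, 16, 16]
print(canonical_n(24))                             # 24
```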

Specifically designed to my taste, another special case of note is setting n = 24, or Canou[24]. This shall be the default n. The entries are 4, 2, 1, 1, 1, 1, 1, 1, 1 for primes 2 to 23, and primes beyond 23 are never optimized for. The octave does not matter, so you can see that its only difference from the equilateral weight is that not 3 but 9 is treated as a prime, meaning every two steps along the path of 3 count as one.

Let us tune some temperaments!

Temperament | Error Map (CTE) | Error Map (CC24E)
Meantone, 5-limit | ⟨0 −4.7407 +2.5436] | ⟨0 −4.3013 +4.3013]
Meantone, 7-limit | ⟨0 −5.0029 +1.4948 +0.6955] | ⟨0 −4.9439 +1.7308 +1.2853]
Superpyth, 2.3.7 subgroup | ⟨0 +7.6398 +11.9845] | ⟨0 +6.8160 +13.6320]
Superpyth, 7-limit | ⟨0 +7.6357 +0.0023 +11.9928] | ⟨0 +7.5618 −0.6629 +12.1406]
Sensamagic, 11-limit | ⟨0 +1.8187 −0.7280 −2.1970 +0.2160] | ⟨0 +1.6927 −0.9366 −2.3952 +0.5219]
Marvel (hecate), 13-limit | ⟨0 −0.3917 −3.2984 +0.3314 −1.9273 −0.3053] | ⟨0 −0.3922 −3.5071 −0.0870 −1.3006 +0.1105]
Pele, 13-limit | ⟨0 +1.4848 +0.5796 −2.5714 +1.1774 +1.6483] | ⟨0 +1.5434 +0.7961 −2.7066 +0.8077 +1.1028]
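The meantone rows of the table can be reproduced with a one-variable least-squares fit. The sketch below is limited to rank-2 temperaments whose period is a pure octave, and it assumes the usual meantone mapping with the tempered 3/1 as generator (period row ⟨1 0 −4 −13], generator row ⟨0 1 4 10]), which is not spelled out in the text. Weights are read as importance ratings, and all function names are illustrative.

```python
from math import log2


def canou_weight(q, n):
    """floor(log_q(n)) computed with integers."""
    k, power = 0, q
    while power <= n:
        k, power = k + 1, power * q
    return k


def rank2_pure_octave_errors(primes, period_row, gen_row, importance):
    """Prime errors (cents) after a weighted least-squares fit of the generator."""
    just = [1200 * log2(q) for q in primes]
    # Each prime i asks for gen_row[i] * g = just[i] - 1200 * period_row[i]
    num = sum(w * w * b * (j - 1200 * a)
              for w, a, b, j in zip(importance, period_row, gen_row, just))
    den = sum(w * w * b * b for w, b in zip(importance, gen_row))
    g = num / den
    return [1200 * a + b * g - j for a, b, j in zip(period_row, gen_row, just)]


PRIMES = [2, 3, 5, 7]
PERIOD = [1, 0, -4, -13]  # assumed meantone mapping, octave row
GEN = [0, 1, 4, 10]       # assumed meantone mapping, generator ~3/1

tenney = [1 / log2(q) for q in PRIMES]
canou24 = [canou_weight(q, 24) for q in PRIMES]

cte_5 = rank2_pure_octave_errors(PRIMES[:3], PERIOD[:3], GEN[:3], tenney[:3])
cc24e_5 = rank2_pure_octave_errors(PRIMES[:3], PERIOD[:3], GEN[:3], canou24[:3])
cte_7 = rank2_pure_octave_errors(PRIMES, PERIOD, GEN, tenney)
cc24e_7 = rank2_pure_octave_errors(PRIMES, PERIOD, GEN, canou24)

for label, row in (("CTE 5", cte_5), ("CC24E 5", cc24e_5),
                   ("CTE 7", cte_7), ("CC24E 7", cc24e_7)):
    print(label, [round(e, 4) for e in row])
```

The printed rows should agree with the meantone entries of the table to within rounding. The other temperaments would need their own mappings, and those of higher rank a more general solver.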

Notes

  1. Prior to this material, the two problems are often said in the other order, but this essay inverts them since weight is usually considered before skew in tuning optimization.
  2. Aura's Ideas on Consonance. Aura. Xenharmonic Wiki.
  3. "The Major Mode and the Diatonic Chords", Theory of Harmony. Arnold Schoenberg, translated by Roy E. Carter. University of California Press.
  4. "All-Interval Tuning Schemes", Dave Keenan & Douglas Blumeyer's Guide to RTT. Dave Keenan and Douglas Blumeyer. Xenharmonic Wiki.
  5. "A Middle Path between Just Intonation and the Equal Temperaments – Part 1", Xenharmonikôn, An Informal Journal of Experimental Music. Paul Erlich.
  6. "Prince Chu Tsai-Yü's Life and Work: A Re-Evaluation of His Contribution to Equal Temperament Theory", Ethnomusicology. Fritz A. Kuttner.

Release Notes

© 2023–2024 Flora Canou

Version Stable 4

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.