User:FloraC/Hard problems of harmony and psychoacoustically supported optimization
This page is a work in progress.
In the study of tuning optimization, we find two blocking issues that deserve the title "hard problems of harmony". Phrased in a catchy way, they are:
- Is compositeness heard?
- Are divisive ratios more important than multiplicative ratios?[1]
In fact, both can be modeled as parameters of the norm used in optimization. The first problem is about the weight, and the second about the skew. The order of the norm is the third parameter. It is not phrased as a "hard problem" since it is a little abstract, but we must still consider it along with the first two. Being independent of specific temperaments, these three parameters are genuine metaproblems of tuning optimization, and well worth a dive.
But let us remember that a metaproblem of tuning optimization is a problem of harmony, that is, of temporal interactions of sound. Any attempted solution that does not touch on the psychoacoustic side of harmony would be baseless. So that is where we start.
Chapter I. Harmonic Rootedness
There are two main categories of rootedness: chordal rootedness and tonal rootedness.
Chordal rootedness is said with respect to an individual chord. A chord with chordal rootedness is dubbed a rooted chord. Such a chord feels distinctly resolved and self-contained: it presents itself as a center of reference, a result of the alignment of the chord's formal root with its virtual fundamental. Let us clarify these terms before we proceed.
A formal root is simply the pitch that is denoted in terms of ratios as unity. Determination of a chord's formal root is part of the traditional pedagogy of harmony, but it is hard to pin down as it depends on many factors: the specific chord structure, the texture and perhaps the genre of the piece. So it is often found heuristically. The bass supported by the perfect fifth is typically the best candidate for a formal root in a positive, octave-equivalent context, but such a structure is lacking in many chords, especially in nontertian harmony. Another reasonable strategy in the same context is to always place the formal root on the bass. In any case, this root is considered to be the starting point of a chord, on top of which the other notes are built.
By contrast, the chord's actual root is what we know as the virtual fundamental. This is the particular pitch on which a chord appears to be a single harmonic note after the phenomenon of timbral fusion, so analytically it emerges at the GCD of all the ratios of the chord. This root, being simultaneously "virtual" and "actual", is an interesting case. Its virtuality is reflected by the fact that the pitch need not be present, neither as a note nor even as an energy stream in the spectrogram. Its actuality is evidenced by the fact that we literally hear it due to the gestalt of the harmonic series.
As such, a chord is rooted if the GCD of all the ratios is unity, either up to octave equivalence or not. In the case of dyads, Aura calls it connectivity[2].
Take the just major triad as an example: 1–3–5. The formal root is by definition 1; the virtual fundamental here is their GCD, again 1. Since they line up perfectly, we call it a rooted chord. If we examine the octave-reduced version of the same chord: 1–5/4–3/2, the virtual fundamental becomes 1/4, so that is only a rooted chord on the condition of octave equivalence.
So far we have talked about otonal rootedness, while there is also utonal rootedness. The just minor triad, 1–1/3–1/5, is a utonally rooted chord, whose virtual fundamental emerges at the LCM and will be exposed in a subharmonic timbre. Matching otonally rooted chords with harmonic timbre, or utonally rooted chords with subharmonic timbre, is the state we may describe as "present", "explicit", or "revealed", which is the essential feeling of the major tonality. The opposite state is then "absent", "implicit" or "covered", which is the essential feeling of the minor tonality, as discussed in There Is Not a Third Side of the River.
1–7/6–3/2 is an example of a nonrooted chord. The otonal virtual fundamental is 1/6. That means the chord suggests an otonal root that is 6/1 below the bass. For the utonal virtual fundamental, we rewrite the chord as 1–7/9–2/3, and find it to be 14/1. That means the chord suggests a utonal root that is 14/1 above the highest note.
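The otonal and utonal virtual fundamentals quoted above can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the usual definitions of GCD and LCM for fractions in lowest terms; the function names `fraction_gcd` and `fraction_lcm` are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def _lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def fraction_gcd(ratios):
    """Largest rational g such that every ratio is an integer multiple of g.

    For fractions in lowest terms this is gcd(numerators) / lcm(denominators).
    """
    fracs = [Fraction(r) for r in ratios]
    num = reduce(gcd, [f.numerator for f in fracs])
    den = reduce(_lcm, [f.denominator for f in fracs])
    return Fraction(num, den)


def fraction_lcm(ratios):
    """Smallest rational m that is an integer multiple of every ratio.

    For fractions in lowest terms this is lcm(numerators) / gcd(denominators).
    """
    fracs = [Fraction(r) for r in ratios]
    num = reduce(_lcm, [f.numerator for f in fracs])
    den = reduce(gcd, [f.denominator for f in fracs])
    return Fraction(num, den)


# Otonal virtual fundamentals (GCD of the chord's ratios)
print(fraction_gcd([1, 3, 5]))                                  # 1
print(fraction_gcd([1, Fraction(5, 4), Fraction(3, 2)]))        # 1/4
print(fraction_gcd([1, Fraction(7, 6), Fraction(3, 2)]))        # 1/6

# Utonal virtual fundamental (LCM), chord written downward from its top note
print(fraction_lcm([1, Fraction(7, 9), Fraction(2, 3)]))        # 14
```

A chord is then otonally rooted when `fraction_gcd` returns 1 (or a power of 2, if octave equivalence is assumed), and utonally rooted when `fraction_lcm` does.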
Tonal rootedness is said with respect to a tonal musical passage featuring a chord progression, and is closely related to functional harmony. There was a myth that, in tonal harmony of common practice, all the chords in a passage would have a common root. It was said that a passage in C major would typically be rooted on Bb or F. That was incorrect and was probably a consequence of not distinguishing chordal and tonal rootedness. In fact, we just showed that each chord has its own root. A C major triad is rooted on C, a G major triad is rooted on G, and so on[3]. A chord progression of C–F–G–C, for example, does not set the whole passage's root to F just because it involves a chord rooted on F. Quite the opposite, the chordal root of F is there to strengthen the tonal root of C, namely the tonic.
In general, the tonal root simply corresponds to the tonic, decided by a series of harmonic functions, which in turn are decided by a series of formal roots. The reason it is decided by the formal roots of chords, not the actual roots, is that the formal root is the perceived starting point of any chord. Using a nonrooted chord as the dominant, for example, does not disqualify it from the dominant function. Actually, its dominant function will be even stronger if the suggestive nature of the virtual fundamental is put to good use, in this case to suggest the tonal root.
Chapter II. Divisive and Multiplicative Ratios
Divisive ratios and multiplicative ratios are always said relative to each other. If a divisive ratio is of the form n/d, where n and d are integers, then the corresponding multiplicative ratio is of the form nd. For example, 5/3 is a divisive ratio; 15/1 is a multiplicative ratio. The question is, thus, whether ratios of the form n/d are more important than those of the form nd.
The problem is hard because it is not clear what is implied by importance and what context it can be applied to. Of course, importance means simplicity. But simplicity of ratios is used in two major contexts, chord construction and tuning optimization, and they correspond to distinct psychoacoustic effects. Chord construction has to do with the revelation of harmonic identities due to timbral fusion to a virtual fundamental as discussed above, whereas tuning optimization has to do with percept formation and excitation and, ultimately, with the minimization of mistuned beating. These are fundamentally different effects, and this essay takes the liberty of being the first to treat them separately.
The odd-limit tonality diamond fully favors divisive ratios over multiplicative ones, as the odd limit of a ratio is equal to the exponentiation of the Kees height, a norm in a lattice skewed towards divisive ratios by 1/12 turn. It is useful in just chord construction. Consider the just major triad again. While 5/1 and 3/1 are the only ratios used to build the chord, the interval between them, 5/3, is a real, played interval, unlike the multiplicative ratio 15/1, which is not played, only present in the harmonics. Likewise, using any harmonics as components of a just chord causes all the ratios between them to be played, and thus to be emergent. Unless we stick to bare dyads, nothing could be more appropriate than adopting a metric that favors divisive ratios, especially the tonality diamonds.
Such a metric will guide us to psychoacoustically supported chord construction, which spontaneously obeys the rules of primodalism as well. In a primodal chord like 1–6/5–7/5–8/5, notice how all the notes consolidate each other through the bond to a common virtual fundamental of 1/5. It is not rooted, but its harmonic identity is evident. Now consider 1–6/5–10/7–8/5, where we put 10/7 in place of 7/5, and it immediately becomes a 25-odd-limit chord. The virtual fundamental is pushed even lower to 1/35 as the denominators "fight" each other. As a result, a chord like this has little harmonic identity to listen to, and little utility to be concerned with.
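The odd-limit figures in this chapter can be reproduced with a few lines of exact arithmetic. This is a minimal sketch, assuming the common definition of the odd limit of a ratio as the larger of the odd parts of its numerator and denominator, and taking the odd limit of a chord to be the largest odd limit among all of its dyads; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def odd_part(n):
    """Strip all factors of 2 from a positive integer."""
    while n % 2 == 0:
        n //= 2
    return n


def odd_limit(ratio):
    """Odd limit of a single ratio: the larger odd part of numerator and denominator."""
    r = Fraction(ratio)
    return max(odd_part(r.numerator), odd_part(r.denominator))


def chord_odd_limit(chord):
    """Odd limit of a chord: the largest odd limit over all pairs of notes."""
    return max(odd_limit(Fraction(b) / Fraction(a)) for a, b in combinations(chord, 2))


# A divisive ratio is much simpler than its multiplicative counterpart here.
print(odd_limit(Fraction(5, 3)), odd_limit(Fraction(15, 1)))   # 5 15

primodal = [Fraction(1), Fraction(6, 5), Fraction(7, 5), Fraction(8, 5)]
altered = [Fraction(1), Fraction(6, 5), Fraction(10, 7), Fraction(8, 5)]
print(chord_odd_limit(primodal))   # 7
print(chord_odd_limit(altered))    # 25
```

In the altered chord the jump to 25 comes from the dyads 25/21 (between 6/5 and 10/7) and 28/25 (between 10/7 and 8/5), which are played intervals even though neither was chosen on purpose.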

The same cannot be assumed for tuning optimization, since that is a vastly different scenario. In a just major triad, the 15th harmonic exists in three ways: as the harmonic of the root, of the 3rd harmonic, and of the 5th harmonic. Figure 1 is the frequency spectrum of the triad played in the semisine waveform, which has been proposed as the standard ear-training waveform in Proposed Standard Ear-Training Waveform.
If we play such a triad in a tempered tuning profile, the quality of the chord is determined by how the three components mentioned above line up. In a tuning profile characterized by the mistuning map
$$ E_1 = \left\langle \begin{matrix} 0 & -\epsilon & +\epsilon \end{matrix} \right] $$
the ~15/1 will be a combination of harmonics with pitch errors of -ε, 0, and +ε. In addition, the harmonic itself can be played as a dyad and its pitch error is 0.
Now consider the contrasting profile
$$ E_2 = \left\langle \begin{matrix} 0 & -\epsilon & -\epsilon \end{matrix} \right] $$
Here the ~15/1 will be a combination of harmonics with pitch errors of -ε, -ε, and 0, but the played harmonic is at -2ε. So we see this harmonic will be well off the mark whenever it is played.
Regarding 5/3, it is the opposite situation. E2 comes out superior to E1 as it perfectly hits 5/3 whereas E1's ~5/3 is off by +2ε.
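The error bookkeeping for the two profiles can be checked by applying each mistuning map to the prime-exponent vector (monzo) of the interval in question. The sketch below assumes the prime basis 2.3.5 and sets ε to 1 so that results read directly as multiples of ε; the names `pitch_error`, `E1` and `E2` are illustrative only.

```python
def pitch_error(mistuning_map, monzo):
    """Error of an interval: dot product of the mistuning map with its monzo."""
    return sum(e * m for e, m in zip(mistuning_map, monzo))


EPS = 1  # one unit of epsilon

E1 = (0, -EPS, +EPS)   # prime 3 flat, prime 5 sharp
E2 = (0, -EPS, -EPS)   # primes 3 and 5 both flat

RATIO_15_1 = (0, 1, 1)    # 15/1 = 3 * 5
RATIO_5_3 = (0, -1, 1)    # 5/3  = 5 / 3

print(pitch_error(E1, RATIO_15_1), pitch_error(E1, RATIO_5_3))   # 0 2
print(pitch_error(E2, RATIO_15_1), pitch_error(E2, RATIO_5_3))   # -2 0
```

The two profiles are mirror images: whichever of ~15/1 and ~5/3 one profile tunes purely, the other profile mistunes by 2ε.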
However, the beating occurs at ~15/1 and multiples thereof, not at ~5/3. The ~5/3, played as a nonrooted dyad, is free from a real reference point (e.g. harmonic series) for it to beat against, so it lacks relevance in tuning optimization. The only scenario to account for its accuracy is where it is played on the chord's formal root, in which case its 3rd harmonic beats against the formal root's 5th harmonic, for example. That is still not a good argument for its relative importance since we would have manipulated the chord structure just in order to obtain this result. A chord with ~15/1 played on the formal root would call for an accurate ~15/1 and then neutralize the demand for an accurate ~5/3 as previously posed. For example, the just major sixth chord 1–5/4–3/2–5/3 and the just major seventh chord 1–5/4–3/2–15/8 cancel each other out up to octave equivalence. More generally, for any chord featuring a divisive ratio on the formal root, there is a counterpart featuring a multiplicative ratio alike.
We should also note that the just minor triad is of the same complexity as the just major triad by the principle of invertibility. The just major triad is sometimes considered to be more important by being isodifferential and thus having a common beating rate. The just minor triad is also isodifferential, though not with respect to frequency but to its inverse, the length of a virtual vibrating string. Optimizing for the just minor triad requires us to put it in the context of negative harmony. Starting from the top and stepping downwards, the optimization targets are first 1/3 and then 1/5, which are analytically equivalent to 3/1 and 5/1 respectively in positive harmony.
All of this suggests practically equal importance of divisive ratios and multiplicative ratios in tuning optimization.
Chapter III. Power in Proportion
The first ever attempt at a systematic tuning solution was Paul Erlich's TOP tuning[4]. This tuning was elegantly explained in his Middle Path paper in the case of nullity-1 (i.e. single-comma temperaments)[5]. In this tuning, every prime makes an effort in the right direction to close out the comma. To illustrate, consider 5-limit meantone, and to simplify it even more, let us start with the constrained equilateral-optimal tuning (CEOP tuning) instead since its effect is the easiest to observe. The CEOP tuning of 5-limit meantone is given in terms of the projection map P as
$$ P = \begin{bmatrix} 1 & 4/5 & -4/5 \\ 0 & 1/5 & 4/5 \\ 0 & 1/5 & 4/5 \end{bmatrix} $$
Denoting the just tuning map in cents by J, the mistuning map E is
$$ \begin{align} E &= J(P - I) \\ &= \left\langle \begin{matrix} 0 & -4.3013 & +4.3013 \end{matrix} \right] \end{align} $$
That is the 1/5-comma tuning, in which harmonics 3 and 5 have errors of equal magnitude and opposite sign.
TOP tuning works the same in principle, except that harmonic 2 is no longer constrained to be pure and that the allowed error of a prime q is log₂(q) times that of prime 2. The TOP mistuning map of 5-limit meantone is
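The step from the projection map to the mistuning map is a plain row-vector by matrix product, which can be verified numerically. This is a minimal sketch using only the standard library; it assumes J = 1200·log₂ of the primes 2, 3, 5, and the function name `mistuning_map` is a hypothetical helper.

```python
from math import log2

PRIMES = (2, 3, 5)
J = [1200 * log2(p) for p in PRIMES]   # just tuning map in cents


def mistuning_map(P):
    """Return E = J(P - I) for a square projection map P given as nested lists."""
    n = len(J)
    return [
        sum(J[i] * (P[i][j] - (1 if i == j else 0)) for i in range(n))
        for j in range(n)
    ]


# CEOP projection map of 5-limit meantone, copied from the text
P_CEOP = [
    [1, 4 / 5, -4 / 5],
    [0, 1 / 5, 4 / 5],
    [0, 1 / 5, 4 / 5],
]

E = mistuning_map(P_CEOP)
print([round(e, 4) for e in E])   # errors of primes 2, 3, 5 in cents
```

The same function can be applied unchanged to any other 3×3 projection map over the same primes.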
$$ E = \left\langle \begin{matrix} +1.6985 & -2.6921 & +3.9439 \end{matrix} \right] $$
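Two properties of this map can be checked directly: the errors divided by log₂ of their primes should all have the same magnitude, and the syntonic comma 81/80 should be tempered to zero. The sketch below copies the rounded error values from the map above and assumes the monzo [-4 4 -1⟩ for 81/80; variable names are illustrative.

```python
from math import log2

PRIMES = (2, 3, 5)
E_TOP = (1.6985, -2.6921, 3.9439)    # cents, rounded values from the text
SYNTONIC_COMMA = (-4, 4, -1)         # monzo of 81/80

# Error of each prime relative to its allowance log2(q)
weighted = [e / log2(p) for e, p in zip(E_TOP, PRIMES)]

# Tempered size of the comma = just size + error contributed by the map
just_comma = 1200 * log2(81 / 80)
comma_error = sum(e * m for e, m in zip(E_TOP, SYNTONIC_COMMA))
tempered_comma = just_comma + comma_error

print([round(w, 3) for w in weighted])   # equal magnitudes, alternating signs
print(round(tempered_comma, 3))          # approximately zero
```

The signs alternate because each prime is pushed in the direction that shrinks the comma, given the sign of its exponent in 81/80.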
The Euclidean norm we are covering next differs in a very important way. In Manhattan tunings such as TOP, each prime makes equal effort to close out the comma, whereas in Euclidean tunings, the tempering load assigned to each prime is proportional to how efficient the prime is at closing out the comma.
The effect is most clearly observed in the 5-limit CEE tuning (constrained equilateral-Euclidean tuning), where 2 is constrained to pure and only 3 and 5 placed equally distant in the lattice are under consideration. Here is the CEE projection map of 5-limit meantone:
$$ P = \begin{bmatrix} 1 & 16/17 & -4/17 \\ 0 & 1/17 & 4/17 \\ 0 & 4/17 & 16/17 \end{bmatrix} $$
The mistuning map is
$$ \begin{align} E &= J(P - I) \\ &= \left\langle \begin{matrix} 0 & -5.0603 & +1.2651 \end{matrix} \right] \end{align} $$
That is the 4/17-comma tuning, in which prime 3 gets four times the error of prime 5 in the opposite direction. Notice how prime 3 contributes more to close out the comma since it is better at doing it, and the amount of load assigned to it is exactly 4 times that to prime 5.
Understanding Euclidean norms makes Chebyshevian norms much easier to grasp. In short, Chebyshevian norms take it to the other extreme, and can be described as demonstrating the "collapsing effect". In the case of nullity-1, it means the most efficient prime gets all the tempering load and the other primes get no load at all. The CEC (constrained equilateral-Chebyshevian) tuning of 5-limit meantone is
$$ P = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 1/4 & 1 \end{bmatrix} $$
The mistuning map is
$$ \begin{align} E &= J(P - I) \\ &= \left\langle \begin{matrix} 0 & -5.3766 & 0 \end{matrix} \right] \end{align} $$
That is our familiar 1/4-comma tuning. It is surprising that no interest has yet developed in tunings by the Chebyshevian norm. Compared to the 4/17-comma tuning by the Euclidean norm, the 1/4-comma tuning by the Chebyshevian norm removes all error in prime 5 at the cost of just a little more in prime 3.
To evaluate, the Euclidean tuning turns out advantageous not only because it is easy to compute (the Euclidean norm being the only norm with analytical solutions) but because it is theoretically nice, as the more capable are tasked to do proportionately more. Both the Manhattan norm and the Chebyshevian norm show discontinuities when the complexities of the primes are at certain extreme points, and things start to break down as we approach them. The Manhattan norm feels strange when some primes are orders of magnitude more complex than the rest. The Chebyshevian norm feels just as strange when all primes have near-equal complexities.
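The three constrained tunings of this chapter can be reproduced from closed forms for the single-comma case. The sketch below is this example's own derivation, not taken from the cited sources. It assumes pure octaves, equal weights on primes 3 and 5, and this essay's naming, in which the Manhattan, Euclidean and Chebyshevian labels refer to the norm on intervals, so that the prime errors end up being minimized in the maximum, Euclidean and sum-of-absolute-values sense respectively. The function name `single_comma_errors` is hypothetical.

```python
from math import log2


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def single_comma_errors(comma, comma_cents, scheme):
    """Prime errors e (in cents) satisfying sum(a * e) = -comma_cents.

    comma: exponents of the comma on the non-octave primes, e.g. (4, -1).
    scheme: "manhattan" (equal load), "euclidean" (proportional load)
            or "chebyshevian" (all load on the most efficient prime).
    """
    if scheme == "manhattan":
        load = comma_cents / sum(abs(a) for a in comma)
        return [-load * sign(a) for a in comma]
    if scheme == "euclidean":
        scale = comma_cents / sum(a * a for a in comma)
        return [-scale * a for a in comma]
    if scheme == "chebyshevian":
        # The first prime with the largest exponent takes the whole load.
        k = max(range(len(comma)), key=lambda i: abs(comma[i]))
        return [-comma_cents / comma[i] if i == k else 0.0 for i in range(len(comma))]
    raise ValueError("unknown scheme: " + scheme)


COMMA_3_5 = (4, -1)                      # 81/80 restricted to primes 3 and 5
COMMA_CENTS = 1200 * log2(81 / 80)

for scheme in ("manhattan", "euclidean", "chebyshevian"):
    errors = single_comma_errors(COMMA_3_5, COMMA_CENTS, scheme)
    print(scheme, [round(e, 4) for e in errors])
```

Run on the syntonic comma, the three schemes give the 1/5-comma, 4/17-comma and 1/4-comma errors for primes 3 and 5 respectively, which makes the progression from equal load through proportional load to collapsed load easy to see side by side.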
Chapter IV. Art of Compromise
Chapter V. Towards an Optimization Strategy
Notes
- [1] Prior to this material, the two problems were often stated in the other order, but this essay inverts them since the weight is usually considered before the skew in tuning optimization.
- [2] Aura's Ideas on Consonance. Aura. Xenharmonic Wiki.
- [3] "The Major Mode and the Diatonic Chords", Theory of Harmony. Arnold Schoenberg, translated by Roy E. Carter. University of California Press.
- [4] "All-Interval Tuning Schemes", Dave Keenan & Douglas Blumeyer's Guide to RTT. Dave Keenan and Douglas Blumeyer. Xenharmonic Wiki.
- [5] A Middle Path between Just Intonation and the Equal Temperaments – Part 1. Paul Erlich.