Chord complexity

From Xenharmonic Wiki

Much of tuning theory involves looking at intervals. However, when intervals are combined into chords, they can sometimes form synergies that aren't immediately apparent if one is only focused on dyads. Fortunately, many of the metrics we use to evaluate intervals generalize very easily to larger chords, and we will look at some in this article.

Basics

Summary

In this article we derive a fairly simple set of expressions which evaluate what we call the "simple" chord complexity (or "otonalness") of a chord. These generalize the familiar expressions for both the Benedetti/Tenney height and the Weil height of dyads. These expressions are as follows for chord [math]x_1:x_2:\ldots:x_N[/math]:

Benedetti height: [math]\displaystyle B_s(x_1, x_2, \ldots, x_N) = \frac{(x_1 \cdot x_2 \cdot \ldots \cdot x_N)^{1/N}}{N^{1/s}}[/math]

Weil height: [math]\displaystyle W_s(x_1, x_2, \ldots, x_N) = \frac{\max(x_1, x_2, \ldots, x_N)}{N^{1/s}}[/math]

Both the geometric mean and the maximum have a long "folklore" history of being used to evaluate the complexity of a chord; such expressions routinely show up in the computation of Harmonic Entropy, for instance. Our expressions are the same, except that they multiply the result by an extra normalizing term of [math]1/N^{1/s}[/math]. This normalizing term doesn't affect the rankings for chords of the same size, but it does affect how chords of different sizes scale in complexity relative to one another. There is one free parameter [math]s[/math] which can be used to adjust this scaling between chords of different sizes; we suggest [math]s=1[/math] as a good default value. We also note that we recover the usual raw geometric mean and maximum as [math]s \to \infty[/math].
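As a concrete reference, these two expressions can be sketched in a few lines of Python (the function names here are our own, not established notation):

```python
from math import prod

def benedetti_simple(chord, s=1.0):
    """Generalized Benedetti height B_s: the geometric mean of the
    chord's coefficients, normalized by N^(1/s)."""
    n = len(chord)
    return prod(chord) ** (1 / n) / n ** (1 / s)

def weil_simple(chord, s=1.0):
    """Generalized Weil height W_s: the maximum coefficient,
    normalized by N^(1/s)."""
    n = len(chord)
    return max(chord) / n ** (1 / s)

# With s=1, a stack of low harmonics ranks as very simple:
print(benedetti_simple([1, 2, 3, 4, 5]))  # about 0.521
print(weil_simple([1, 2, 3, 4, 5]))       # exactly 1.0
```

Lower values mean a simpler, "more otonal" chord.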

In this article we derive these expressions rigorously, as a slight adjustment or "span-correction" of a slightly different metric which satisfies certain axioms regarding simple chord complexity.

The Psychoacoustics of a Dyad

Consonance and dissonance are rather tricky and elusive phenomena to model, in part because the terms don't unambiguously refer to one thing. David Huron, for instance, lists at least 14 different types of dissonance, some of which are psychoacoustic, some of which depend on some kind of larger musical or "tonal" setting, and some of which are clearly dependent on learned expectations. It is thus very likely that consonance is a multidimensional quantity that cannot be represented by a single scalar value.

When we are only looking at dyads, many of the purely psychoacoustic qualities associated with consonance above simplify to the same basic metric, which is that they are strongest for dyads that are close to simple frequency ratios. In general, for some ratio n/d, these qualities tend to decrease as n and d increase, unless n/d is a complex ratio that happens to also be very close to a simple ratio. In that situation, the perception of the complex ratio starts to be eclipsed by the perception of it as a slightly-detuned version of the nearby simpler ratio.

If we don't care about modeling the latter effect, and only care about modeling the complexity of a ratio directly, then for n/d, any function of n and d that is monotonically increasing in either variable will do. The height functions on this Wiki are some simple examples of this. The two most commonly used are the Benedetti height/Tenney height of n*d and log(n*d), and the Weil height of max(n,d) or log(max(n,d)), which have the useful property that their logarithmic versions are norms on the space of monzos (in particular, the first is a type of L1 norm).
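For example, a minimal sketch of these dyadic height functions (any logarithm base gives the same rankings; we use the natural log here):

```python
from math import log

def benedetti_height(n, d):
    return n * d

def tenney_height(n, d):
    return log(n * d)

def weil_height(n, d):
    return max(n, d)

# 3/2 ranks as simpler than 45/32 under all of these:
assert benedetti_height(3, 2) < benedetti_height(45, 32)
assert tenney_height(3, 2) < tenney_height(45, 32)
assert weil_height(3, 2) < weil_height(45, 32)
```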

Note that the Benedetti height and Tenney height are basically the same thing; it is fairly common when talking about height functions to equivocate between the logarithmic and non-logarithmic versions of the same function, as they rank rationals the same either way. We will sometimes equivocate between the two names, but in general the name "Benedetti height" has been given to the non-logarithmic version and the name "Tenney height" to the logarithmic version.

If we do care about modeling the aforementioned detuning effect, then Harmonic entropy is one way to model this, which has a free parameter determining how "tolerant" the listener's auditory system is to perceiving slightly detuned versions of simple ratios as slightly-off versions of themselves, rather than perceiving them as other, more complex ratios. Tenney and Weil height can also be used to seed the Harmonic entropy calculation to begin with, so that they can be thought of as "primitives" from which increasingly sophisticated models can be built.

Some Caveats in Expanding to Chords of Arbitrary Size

When we look at chords of arbitrary size, it is clear that some meaningful analogue of this property holds. For instance, for some triad a:b:c, if a, b, and c are all small, the chord will tend to exhibit qualities like virtual fundamental generation, timbral fusion, a general sense of "crunchiness" or "periodicity buzz," etc, at least as a general rule of thumb. We will simply call this sensation "justiness", as an informal and subjective term for the general quality that just intonation chords have.

It is clear right away that this "justiness" is not quite so simple as it is for dyads. For instance, we can look at the chords 16:20:24:30:36:45:54 and 15:19:23:29:35:44:53.[1] The former is basically a stack of three 4:5:6 chords on top of one another, and thus has lots of simple subdyads, subtriads, subtetrads, etc, whereas the latter has been formed by simply subtracting 1 from each note in the former chord. Thus, the latter is "simpler" from the standpoint of heptadic complexity, but doesn't have many simple subchords at all. And at least to the ears of this author, the former seems to clearly sound much "justier" than the latter - and in a very immediate way - even though the latter is less complex from a "heptadic" standpoint.
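The claim about heptadic complexity is easy to check numerically; here is a sketch, taking the geometric mean of the coefficients as a stand-in for the heptadic complexity of the chord as a whole:

```python
from math import prod

def geomean(chord):
    return prod(chord) ** (1 / len(chord))

stacked = [16, 20, 24, 30, 36, 45, 54]  # three 4:5:6 chords stacked
shifted = [15, 19, 23, 29, 35, 44, 53]  # each note lowered by 1

# The shifted chord has strictly smaller coefficients, so as a single
# heptad it is "simpler", despite having no simple subchords at all:
assert geomean(shifted) < geomean(stacked)
```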

In addition, it is clear that this sensation of justiness has many different sub-aspects, many of which do not evolve in the same way as the combined complexity of chord grows. Terms like "periodicity buzz," "roughness," "combination tones," "virtual fundamentals," etc, all refer to different aspects of justiness, some of which involve primarily looking at subdyads, or isoharmonic chords, etc, or may not require the chord to be strictly "just" at all (such as the Mt. Meru scales). Or, if we are looking at JI chords, we may be evaluating something mathematical about the chord beyond just the complexity of the entire chord at once, or even its subchords, for some of these qualities. Thus, it is clear that justiness is a multidimensional quantity, with several different metrics simultaneously being used to evaluate different aspects of the consonance of a chord.

On the other hand, although combined chord complexity isn't everything, it is certainly something. It can be thought of as quantifying something like the "otonalness" of the chord, measuring how well the entire thing fuses into one sound. Furthermore, even if this doesn't perfectly predict every possible aspect of justiness, it's clear enough that these qualities at least tend to exist for simple JI chords, even if they also sometimes exist for other chords. Lastly, once we've worked through the mechanics of this simple complexity, we can perhaps build on it to create recursive metrics that also look at the subchords of a chord, and so on. So we will start there.

N-adic Simple Complexity/Otonalness

Suppose that [math]x_1 \colon x_2 \colon x_3 \colon \ldots \colon x_N[/math] is some [math]N[/math]-note JI chord. Then some function [math]f(x_1, x_2, \ldots, x_N)[/math] can be said to measure the simple complexity or otonalness of our N-note chord if it is monotonically increasing in each variable separately. We will sometimes use the notation [math]\mathbf{X} = x_1 \colon x_2 \colon x_3 \colon \ldots \colon x_N[/math], so that our simple complexity function can be written [math]f(\mathbf{X})[/math].

We may also look at functions [math]f[/math] which are defined for chords with a variable number of notes. This can also be thought of as a family of functions [math]f_1, f_2, f_3, \ldots[/math], each giving the simple complexity for chords of size 1, 2, 3, etc.

Then the function [math]f_N[/math] is called a simple complexity on N-ads or N-adic simple complexity (dyadic, triadic, tetradic, etc).

The entire family of functions, which we can think of as just one function, is called a simple complexity on all chords or *-adic simple complexity.

Comparing Chords of Different Sizes

The actual details of how to generalize certain dyadic complexities to larger chords are rather interesting. For instance, the Benedetti height of a ratio [math]n/d[/math] is defined as [math]nd[/math]. Clearly, for larger chords, we can look at the product [math]x_1 \cdot x_2 \cdot \ldots \cdot x_N[/math]. As long as we are looking only at N-ads, this is a perfectly reasonable way to measure the otonalness or simple complexity of only those N-ads, but as soon as we want to compare chords of different sizes, we have to know how to scale things: do we take our product to a power, or divide by something, etc?

One interesting observation that we can use to develop the right behavior is that increasing the number of notes sometimes increases some of these psychoacoustic effects. One good example is to take the dyad 11:13 and extend it to the chord 11:13:15:17:19:21:23:25. To the ears of this author, the latter is noticeably crunchier than the former. One can also try 13:15 and 13:15:17:19:21:23:25:27. Note that we make no judgment on the absolute objective crunchiness of 11:13 and 13:15 to begin with; we only note that, whatever it is, it is apparently increased by extending the chord in this way.

Now, of course, we have cheated somewhat - note that we have extended the chord in such a way that the differences between each frequency ratio are 2, making this an isoharmonic chord, which are known to strongly exhibit "periodicity buzz." Still, though, this general principle seems to hold to some degree, even if some of the notes are moved around by 1 here and there to form a non-isoharmonic chord, and it works well enough as a basic guiding principle to be viewed as significant, at least in the view of this author.

So we would at least like some kind of reasonable starting point in modeling this phenomenon, so that we can compare chords of different sizes.

A Simplified, But Useful Criterion

One possible way forward is to imagine the incoming JI chord as a set of upper harmonics of some fundamental frequency - the GCD of the notes of the chord - and to quantify how strongly the chord matches that virtual fundamental. We can make some very basic assumptions:

1. Given some fundamental frequency [math]f[/math], an N-note chord built from very high harmonics of [math]f[/math] will be a weaker match than an N-note chord built from lower harmonics of [math]f[/math]. In other words, 4:5:6 matches "1" better than 5:6:7. This is just a restatement of our definition of the simple complexity above.

2. Given some fundamental frequency [math]f[/math] and a chord built from the harmonics of [math]f[/math], adding another note from the harmonics of [math]f[/math] always increases the strength of the match to [math]f[/math]. In other words, 4:5:6:7 matches "1" better than 4:5:6.

The second proposition is the interesting one. It means that the chord 1:2 evokes "1" less than 1:2:3, which is less than 1:2:3:4, and so on, so that the chord 1:2:3:4:... evokes the frequency "1" most strongly.

Strictly speaking, this is most true if the notes of the chord are played with sine waves, with the volume decreasing as you get higher into the harmonic series. In that situation, the chord 1:2:3:4:5:6:7:... is basically something like a sawtooth wave. It isn't quite so apparent that if you instead have all harmonics at equal volume, the resulting "delta comb" should really be viewed as more "consonant" than a sine wave in an absolute sense. This is even more true if, instead of sine waves, all of the notes are being played with some arbitrary harmonic timbre! Still, we view the basic spirit of this as a "good-enough" rule of thumb which is simple enough to be worth modeling. (As we will see, we will depart from strict adherence to this criterion anyway.)

Dirichlet Complexity

One simple function which meets both of our criteria is to assign the [math]n[/math]'th harmonic a weight of [math]1/n^s[/math], where the exponent [math]s[/math] is called the rolloff, and then sum the weights to get a strength for the overall chord. Thus, we have

[math]\displaystyle f_s(x_1, x_2, \ldots, x_N) = \frac{1}{x_1^s} + \frac{1}{x_2^s} + ... + \frac{1}{x_N^s}[/math]

We also note that this function is even defined for infinite chords as long as [math]s \gt 1[/math]:

[math]\displaystyle f_s(x_1, x_2, \ldots) = \frac{1}{x_1^s} + \frac{1}{x_2^s} + ...[/math]

This type of infinite series can be thought of as a type of general Dirichlet series, with the caveat that the numerators are all equal to 1 and we only care about real values of [math]s[/math].

Now, we note that this function is inverted, so that it is a simplicity rather than a complexity. To correct this, we simply take the reciprocal:

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N) = \frac{1}{\frac{1}{x_1^s} + \frac{1}{x_2^s} + ... + \frac{1}{x_N^s}}[/math]

We call [math]D_s(\mathbf{C})[/math] the Dirichlet complexity of our chord [math]\mathbf{C}[/math], with the free parameter [math]s[/math] choosing the rolloff. In general, we will view [math]s = 1[/math] as a decent choice, given that we typically only care about finite chords.
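A minimal Python sketch of this definition (the function name is our own):

```python
def dirichlet_complexity(chord, s=1.0):
    """Reciprocal of the summed weights x^(-s) of the chord's notes."""
    return 1 / sum(x ** -s for x in chord)

# 4:5:6 ranks as simpler (lower complexity) than 5:6:7:
assert dirichlet_complexity([4, 5, 6]) < dirichlet_complexity([5, 6, 7])
# Adding another harmonic lowers the complexity further (criterion 2):
assert dirichlet_complexity([4, 5, 6, 7]) < dirichlet_complexity([4, 5, 6])
```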


Asymptotic Relationship to Tenney Height and Weil Height

One interesting thing is that the Tenney height and Weil height arise as limiting cases of the Dirichlet complexity. That is, at least if we are only focusing on chords of some particular size N, we have the following results:

  • [math]D_s(x_1,x_2,\ldots,x_N) \sim x_1 \cdot x_2 \cdot \ldots \cdot x_N[/math] as [math]s \to 0[/math]
  • [math]D_s(x_1,x_2,\ldots,x_N) \sim \max(x_1, x_2, \ldots, x_N)[/math] as [math]s \to -\infty[/math]
  • [math]D_s(x_1,x_2,\ldots,x_N) \sim \min(x_1, x_2, \ldots, x_N)[/math] as [math]s \to \infty[/math]

where the symbol [math]A \sim B[/math] is to be interpreted as "[math]A[/math] ranks chords the same as [math]B[/math]", meaning we only care about the result up to monotonic transformations.
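These limits are easy to check numerically. The quantity [math](N \cdot D_s)^{1/s}[/math] equals the power mean [math]M_{-s}[/math] of the coefficients, so pushing [math]s[/math] toward 0 or toward large magnitudes recovers the three cases. A sketch:

```python
from math import prod

def dirichlet_complexity(chord, s):
    return 1 / sum(x ** -s for x in chord)

def power_mean_form(chord, s):
    """(N * D_s)^(1/s), which equals the power mean M_{-s}."""
    n = len(chord)
    return (n * dirichlet_complexity(chord, s)) ** (1 / s)

chord = [4, 5, 6]
# s -> 0 recovers the geometric mean (the Tenney/Benedetti ranking):
assert abs(power_mean_form(chord, 1e-6) - prod(chord) ** (1 / 3)) < 1e-3
# s -> -infinity recovers the maximum (the Weil ranking):
assert abs(power_mean_form(chord, -200) - max(chord)) < 0.05
# s -> +infinity recovers the minimum:
assert abs(power_mean_form(chord, 200) - min(chord)) < 0.05
```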

To see this, the first thing we should note is that the Dirichlet complexity can be viewed in terms of the power mean of the coefficients of the chord. The power mean is defined

[math]\displaystyle M_p(x_1, x_2, \ldots, x_N) = \left(\frac{1}{N} \left(x_1^p + x_2^p + ... + x_N^p \right) \right)^{(1/p)}[/math]

Thus, we can define the Dirichlet complexity in terms of the power mean:

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N) = \frac{1}{N \cdot M_{-s}(x_1, x_2, \ldots, x_N)^{-s}} = \frac{1}{N} \cdot M_{-s}(x_1, x_2, \ldots, x_N)^{s}[/math]

We also note that, as long as we only care about chords of some particular size [math]N[/math], it makes no difference if we multiply by [math]N[/math] and raise the entire thing to the power of [math]1/s[/math], as neither affects the chord rankings within each chord size. So we get

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N)^{1/s} \cdot N^{1/s} = M_{-s}(x_1, x_2, \ldots, x_N)[/math]

And now we need only use the well-known fact that the power mean tends to the geometric mean as [math]p \to 0[/math], the minimum as [math]p \to -\infty[/math], and the maximum as [math]p \to \infty[/math]. Since we have flipped the sign, using [math]M_{-s}[/math] rather than [math]M_{s}[/math], we get the aforementioned result, but with [math]s \to \infty[/math] giving the minimum and [math]s \to -\infty[/math] giving the maximum.

Since for dyads, at least in terms of relative rankings, the geometric mean is equivalent to the Tenney Height, and the maximum the Weil Height, we have our result.

Note that some version of this also holds when comparing chords of different sizes, at least for [math]s \to \pm \infty[/math]. To see this, we note that we can still raise things to the power of [math]1/s[/math] without affecting the result, but we can no longer multiply by [math]N[/math], as that now affects the rankings. So we still have the identity

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N)^{1/s} = \frac{1}{N^{1/s}} M_{-s}(x_1, x_2, \ldots, x_N)[/math]

As [math]s \to \pm \infty[/math], that [math]\frac{1}{N^{1/s}}[/math] term tends to 1, so that it cancels out and we are simply left with the minimum and maximum of the chords.

For [math]s \to 0^+[/math], on the other hand, the [math]\frac{1}{N^{1/s}}[/math] term tends to zero, and what we are left with is a ranking which is basically equivalent to the geometric mean within each chord size, but where all triads are ranked better than all dyads, all tetrads better than all triads, etc. It turns out, however, that we have another useful relationship to the Tenney height, which we will look at next.

The Perils of Span: A Better Metric

One very important observation about the Dirichlet complexity is that the free parameter [math]s[/math] basically determines how much we care about the "span" of the chord, meaning the sizes in cents (or octaves, or whatever) of the subdyads of the chord.

In particular, as [math]s[/math] increases, chords with a larger span are actually rewarded with a lower complexity ranking. This may not always be desirable, and we can look at "span-corrected" versions to compensate for this.

To see the problem, let's compare 6/5 and 30/1 with [math]s=1[/math]. We get [math]D_1(6/5) = 2.727[/math] and [math]D_1(30/1) = 0.968[/math]. Thus, we can see that 6/5 is almost three times as complex as 30/1 under this metric, which is... somewhat strange.

On the one hand, this really does follow from our original assumptions, because in 30/1, the fundamental "1" is literally played explicitly in the dyad. So of course it quite strongly evokes that fundamental much more than in 6/5, which is what these rankings reflect.

But on the other hand, 30/1 takes up almost five octaves, or almost half of the range of human hearing. So even if it strongly evokes some fundamental, it isn't all that useful of a musical interval simply because it is so extremely large.

We can also note that [math]D_0(6/5) = D_0(30/1)[/math], since we've shown that [math]s=0[/math] asymptotically ranks things identically to the Tenney height, which ranks them equal. This is already rather questionable, although typically good enough for most purposes; worse, we can see that increasing the value of [math]s[/math] actually prioritizes the larger interval!

We will see below that we can "span-correct" our Dirichlet complexity to get a much simpler and better-behaved metric, which retains a useful and sensible way to rank chords of different sizes. The result will no longer strictly follow our second criterion, in which adding a note to a chord always increases its strength, but we still view that criterion as a useful guiding principle which is "approximately followed" in a way that is good enough (or maybe better).

Generalized Tenney and Weil Heights: Span-Corrected Dirichlet Complexity

Generalized Tenney Height

In general, we will derive the following expressions for the Generalized Benedetti Height or Generalized Tenney Height of any chord, in such a way that chords of different sizes can be reasonably compared:

[math]\displaystyle B_s(x_1, x_2, \ldots, x_N) = \frac{(x_1 \cdot x_2 \cdot \ldots \cdot x_N)^{1/N}}{N^{1/s}}[/math]

where the above is the "Benedetti" version. The numerator is the geometric mean, and the denominator normalizes by the size of the chord. The logarithmic "Tenney" version is as follows:

[math]\displaystyle T_s(x_1, x_2, \ldots, x_N) = \frac{1}{N} \log(x_1 \cdot x_2 \cdot \ldots \cdot x_N) - \frac{1}{s}\log(N)[/math]

In both cases, the free parameter [math]s[/math], which is derived from the original expression, now only determines the way that differently-sized chords scale relative to one another. The results for some value of [math]s[/math], when comparing chords of different sizes, will closely resemble the relative scaling of chord sizes in the Dirichlet complexity of equal value [math]s[/math], but without the caveats regarding span.

The value [math]s=1[/math] is the largest value for which the complexity still decreases from 1 to 1:2 to 1:2:3 and so on. Interestingly, with [math]s=1[/math], the infinite chord 1:2:3:4:5:... has a complexity of [math]1/e[/math]. Note that if we have [math]s \lt 0[/math], we end up reversing the span-correction.

We view [math]s=1[/math] as a very good default value for this metric.

Generalized Weil Height

We will likewise derive the same expressions for the Generalized Weil Height of any chord:

[math]\displaystyle W_s(x_1, x_2, \ldots, x_N) = \frac{\max(x_1, x_2, \ldots, x_N)}{N^{1/s}}[/math]

or the logarithmic version

[math]\displaystyle \log W_s(x_1, x_2, \ldots, x_N) = \log \max(x_1, x_2, \ldots, x_N) - \frac{1}{s} \log N[/math]

where the [math]s[/math] parameter has the same interpretation as the above.

Interestingly, the Generalized Weil Height for all harmonics from [math]1[/math] to [math]N[/math] has the following form

[math]\displaystyle W_s(1, 2, \ldots, N) = N/N^{1/s}[/math]

so that, in fact, if we set [math]s=1[/math], the Generalized Weil Heights of the first [math]N[/math] harmonics are all equal to [math]1[/math]! This is a unique and interesting property, and we will talk about it in the section called The Bar below.
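This property is trivial to verify; a sketch:

```python
def weil_simple(chord, s=1.0):
    return max(chord) / len(chord) ** (1 / s)

# With s=1, every chord of the first N harmonics sits exactly at 1:
for n in range(1, 10):
    assert weil_simple(list(range(1, n + 1))) == 1.0
```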

Generalized Tenney-Weil Height

For dyads, note that we have the following property for Weil height:

[math]\log \max(n,d) = \frac{1}{2}\log(n\cdot d) + \frac{1}{2} \left|\log(n/d)\right|[/math]

The first term on the right hand side is half the Tenney height, and the second term is half the span. As a result, we can see that the Weil height is equal, up to an overall factor of 2 which doesn't affect rankings, to the Tenney height plus the span, so that it can already be viewed as an alteration of the Tenney height with even greater emphasis placed on small intervals.

We have the following generalization for larger chords, where we assume without loss of generality that we have [math]x_1 \leq x_2 \leq \ldots \leq x_N[/math]:

[math]\displaystyle \log W_s(x_1, x_2, \ldots, x_N) = \log B_s(x_1, x_2, \ldots, x_N) + \frac{1}{N}\log x_N/x_1 + \frac{1}{N}\log x_N/x_2 + \ldots + \frac{1}{N}\log x_N/x_{N-1}[/math]

Note that the latter terms are the spans of the upper subdyads of the chord. Thus, we have the same basic principle.
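This decomposition can be verified numerically; here is a sketch with [math]s=1[/math] (function names are our own):

```python
from math import log, prod

def log_benedetti(chord, s=1.0):
    n = len(chord)
    return log(prod(chord)) / n - log(n) / s

def log_weil(chord, s=1.0):
    n = len(chord)
    return log(max(chord)) - log(n) / s

chord = sorted([4, 5, 6])
top = chord[-1]
# Sum of the spans of the upper subdyads, divided by N:
span_terms = sum(log(top / x) for x in chord[:-1]) / len(chord)
assert abs(log_weil(chord) - (log_benedetti(chord) + span_terms)) < 1e-12
```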

We can use this to define the Tenney-Weil Height (or Benedetti-Weil height, if you prefer) for the chord, with a free parameter [math]k[/math] interpolating between the two (and thus determining just how much we care about the span). We will define this as follows:

[math]\displaystyle TW_s(x_1, x_2, \ldots, x_N) = (1-k) \log B_s(x_1, x_2, \ldots, x_N) + (k) \log W_s(x_1, x_2, \ldots, x_N)[/math]

so that for [math]k=0[/math] we get the Tenney height, for [math]k=1[/math] we get the Weil height (with even greater emphasis on smaller intervals), and for other values of [math]k[/math] we can get values in between or beyond.

Note this new free parameter [math]k[/math] is still independent of the free parameter [math]s[/math], which chooses how chords of different sizes scale relative to one another (for which [math]s=1[/math] is still a decent default value).


Derivation for Dyads

To start, let's look at some dyad [math]a:b[/math], where we can assume without loss of generality that [math]a \leq b[/math]. The Dirichlet complexity of the dyad, then, is

[math]\displaystyle D_s(a, b) = \frac{1}{1/a^s + 1/b^s} = \frac{(ab)^s}{a^s + b^s}[/math]

We can multiply both numerator and denominator by [math](ab)^{-s/2}[/math] to get

[math]\displaystyle D_s(a, b) = \frac{(ab)^{s/2}}{(a/b)^{s/2} + (b/a)^{s/2}}[/math]

That denominator can be rewritten in terms of the [math]\cosh[/math] function as follows:

[math]\displaystyle (a/b)^{s/2} + (b/a)^{s/2} = 2 \cosh(s/2 \log(b/a))[/math]

Now, we note that [math]\log(b/a)[/math] can basically be thought of as a function of the span of the dyad. The span in cents would be [math]\text{cents}(b/a) = 1200\log_2(b/a)[/math], so we have [math]\log(b/a) = \text{cents}(b/a) \log(2)/1200[/math].[2] Thus, the above expression is a monotonic function purely in terms of the span. Putting it all together, we have

[math]\displaystyle D_s(a, b) = \frac{(ab)^{s/2}}{2 \cosh(s/2 \log(b/a))}[/math]

The numerator is the Benedetti height raised to the power of s, but the denominator is an exponentially increasing monotonic function of the span! This is the basic issue: that intervals are literally being rewarded as the span increases.

It is relatively easy to see that this denominator will be minimized when b/a = 1/1, meaning the span is zero, so that the cosh term equals 1 and the denominator is 2. This is the basic behavior for relatively small intervals, which is what we want. For relatively large intervals, on the other hand, the entire thing tends to [math]\min(a,b)^s[/math] - and since we have the identity [math]\log \min(a,b) = 1/2 \log(a \cdot b) - 1/2 |\log(b/a)|[/math], meaning we are subtracting the span from the Tenney height, this is definitely not what we want.
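Both the cosh rewrite and the large-interval limit can be checked numerically; a sketch:

```python
from math import cosh, log

def dirichlet_dyad(a, b, s=1.0):
    return 1 / (a ** -s + b ** -s)

def dirichlet_dyad_cosh(a, b, s=1.0):
    return (a * b) ** (s / 2) / (2 * cosh((s / 2) * log(b / a)))

# The two forms agree:
for a, b in [(6, 5), (30, 1), (3, 2)]:
    assert abs(dirichlet_dyad(a, b) - dirichlet_dyad_cosh(a, b)) < 1e-9

# For very wide dyads the result tends to min(a, b)^s:
assert abs(dirichlet_dyad(1000, 1) - 1.0) < 1e-2
```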

So what we will do is simply modify our formula so that the behavior for relatively small intervals is preserved across the entire interval spectrum, thus "span-correcting" our original formula. Doing so, we simply keep the numerator (the "complexity" part) the same, while pretending that we have always plugged 1/1 into the denominator. Thus, we simply get

[math]\displaystyle \frac{(ab)^{s/2}}{2}[/math]

We are mostly done here, although the result is a bit nicer if we raise the whole thing to the power of [math]1/s[/math], which doesn't affect the results in any way. If we do, we get our original result for dyads:

[math]\displaystyle B_s(a, b) = \frac{(ab)^{1/2}}{2^{1/s}}[/math]

Now, as we will see, this basic principle also works very well for general N-ads, giving us a formula with the same basic properties as the original when comparing between chords of different sizes, while also span-correcting for very large intervals in the same way.

Derivation for General N-ads

Although it is somewhat tedious, one can derive similar expressions to the above for any N-adic Dirichlet complexity, so that we can split the expression into a function of the Tenney height being divided by some monotonic function of the span of the subdyads of our chord. Then we can proceed to span-correct in a similar way as before.

However, there is a very simple and elegant proof - one so simple that it seems almost tautological - which can prove our statement for arbitrary N-ads, both for Weil and Tenney height. To see this, we will look at our original definition of Dirichlet Complexity:

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N) = \frac{1}{\frac{1}{x_1^s} + \frac{1}{x_2^s} + \ldots + \frac{1}{x_N^s}}[/math]

Since this is basically just a harmonic mean divided by [math]N[/math], we can rewrite as

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N) = \frac{1}{N} \cdot \text{harmean}(x_1^s, x_2^s, \ldots, x_N^s)[/math]

where [math]\text{harmean}[/math] refers to the harmonic mean. Then, although it seems somewhat spurious to do so, we can rewrite this as

[math]\displaystyle D_s(x_1, x_2, \ldots, x_N) = \frac{1}{N} \cdot \frac{\text{geomean}(x_1^s, x_2^s, \ldots, x_N^s)}{\text{geomean}(x_1^s, x_2^s, \ldots, x_N^s)/\text{harmean}(x_1^s, x_2^s, \ldots, x_N^s)}[/math]

where [math]\text{geomean}[/math] likewise refers to the geometric mean. This, of course, trivially cancels out to give us our original expression, although we may wonder why we have introduced the geometric mean at all.

The answer is that we have the amazing property that the harmonic mean of a set of numbers is always less than or equal to the geometric mean, with equality only if all of the numbers are equal. As a result, we can use this to show that the denominator of this expression, or geomean/harmean, happens to already be a function of only the spans of the subdyads of the chords!

To see this, we can easily note that if we multiply all of the coefficients in our original chord by two, for instance, going from 4:5:6 to 8:10:12, the geomean/harmean quotient remains unaltered, as the multiplier cancels out - so the only thing we care about is the general shape of the chord, not the absolute values of the coefficients (unlike with Tenney height, for instance). We also note that this quotient is minimized, with value exactly 1, precisely when all of the numbers are equal - and if they are, that means that the chord we are evaluating is 1:1:1:...:1, which is of minimum span. Once the numbers become unequal, the geometric mean becomes larger than the harmonic mean, so this quotient becomes larger than 1. As a result, it is easy to see that this quotient is a monotonically increasing function of the spans of the subdyads of the chord. And since we are dividing by this quotient, this means we are, once again, dividing by some monotonically increasing function of the span.
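These properties of the geomean/harmean quotient - transposition invariance, and a minimum of 1 at zero span - are easy to confirm; a sketch:

```python
from math import prod

def geomean(xs):
    return prod(xs) ** (1 / len(xs))

def harmean(xs):
    return len(xs) / sum(1 / x for x in xs)

def span_quotient(chord, s=1.0):
    powered = [x ** s for x in chord]
    return geomean(powered) / harmean(powered)

# Transposing the chord leaves the quotient unchanged:
assert abs(span_quotient([4, 5, 6]) - span_quotient([8, 10, 12])) < 1e-12
# It equals 1 exactly for a zero-span chord, and exceeds 1 otherwise:
assert abs(span_quotient([3, 3, 3]) - 1.0) < 1e-12
assert span_quotient([4, 5, 6]) > 1.0
```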

We can then do the same span-correction procedure as before, where we want the behavior for "small" intervals to be exhibited across the entire spectrum, but with the same properties in the way we compare chords of different sizes. So if we simply just "pretend" 1:1:1:...:1 is being plugged into the denominator no matter what, we get

[math]\displaystyle \frac{\text{geomean}(x_1^s, x_2^s, \ldots, x_N^s)}{N}[/math]

and once again, raising to the power of [math](1/s)[/math] to get a slightly nicer looking result, we get our original expression for the generalized Benedetti Height:

[math]\displaystyle B_s(x_1, x_2, \ldots, x_N) = \frac{\text{geomean}(x_1, x_2, \ldots, x_N)}{N^{1/s}}[/math]

Now, as a last note, we can easily see that our choice of the geometric mean above was somewhat arbitrary. The main important point is that this quotient of two power means gives us something which already is, perhaps non-obviously, already a non-trivial function of the spans of the subdyads of the chord. But, we could have chosen any mean which has the property that it is always greater than or equal to the harmonic mean. For instance, the maximum function, which can be viewed as the power mean as [math]p \to \infty[/math], also has the same property. If we did the above with the max function instead, we'd have instead gotten our expression for the generalized Weil height:

[math]\displaystyle W_s(x_1, x_2, \ldots, x_N) = \frac{\text{max}(x_1, x_2, \ldots, x_N)}{N^{1/s}}[/math]

These are both useful because their dyadic logarithmic versions are norms on monzo space, which makes them easy to prove theorems about. And in general, since the Weil height can itself be looked at as a version of the Benedetti height with even greater span-correction, we can take some average of both to determine how much we care about the span. The easiest way is to take a weighted average of the logarithmic versions of these two height functions, which corresponds to a weighted geometric mean of the non-logarithmic versions, giving us our original expression for the Tenney-Weil height.

Examples

Given all of this, it may be useful to see some examples. Let's look at the set of all chords of 1-5 notes (in lowest terms) with coefficients of at most 5, just to see what we get. Chords higher on the list are "stronger", or less complex. This is the Benedetti Height with s=1:

'''Benedetti Height, s=1'''
1:2:3:4:5 - 0.52103
1:2:3:4 - 0.55334
1:2:3:5 - 0.58509
1:2:3 - 0.60571
1:2:4:5 - 0.62872
1:2:4 - 0.66667
1:3:4:5 - 0.69579
1:2 - 0.70711
1:2:5 - 0.71814
1:3:4 - 0.76314
1:3:5 - 0.82207
2:3:4:5 - 0.82744
1:3 - 0.86603
1:4:5 - 0.90481
2:3:4 - 0.9615
1 - 1
1:4 - 1
2:3:5 - 1.0357
1:5 - 1.118
2:4:5 - 1.14
2:3 - 1.2247
3:4:5 - 1.305
2:5 - 1.5811
3:4 - 1.7321
3:5 - 1.9365
4:5 - 2.2361

So we can see that the "strongest" chord is 1:2:3:4:5, followed by 1:2:3:4 and 1:2:3:5, then 1:2:3, and so on. We can see that 3:4:5 is ranked somewhere near 2:3 and 2:5, and likewise that 2:4:5 is near 1:5 and 2:3 and so on. Increasing the max coefficient to 7, not shown here for brevity, we get that 5:6:7 is between 3:5 and 4:5, and 4:6:7 is between 3:4 and 2:7, which seems as sensible as anything else.

Note that the span-correction has caused the results not to strictly follow our second criterion from before. For instance, 1:2 is ranked slightly higher than 1:2:5. This is in part because of the span-correction lowering the rank of 1:2:5 due to the large interval of 5/1 on the outside. Still, it serves as a basic guiding principle for these chords.
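For the curious, the ranking above can be reproduced mechanically. A sketch in Python, assuming (as the list implies) that only chords in lowest terms - those with no common factor among their coefficients - are included:

```python
from math import gcd, prod
from itertools import combinations

def benedetti_height(chord, s=1.0):
    """Span-corrected Benedetti height: geometric mean over N^(1/s)."""
    n = len(chord)
    return prod(chord) ** (1.0 / n) / n ** (1.0 / s)

# All chords of 1-5 notes with coefficients at most 5, kept only when in
# lowest terms, sorted from strongest (lowest value) to weakest.
chords = [c for size in range(1, 6)
          for c in combinations(range(1, 6), size)
          if gcd(*c) == 1]
chords.sort(key=benedetti_height)

for c in chords[:4]:
    print(":".join(map(str, c)), round(benedetti_height(c), 5))
# 1:2:3:4:5 0.52103
# 1:2:3:4 0.55334
# 1:2:3:5 0.58509
# 1:2:3 0.60571
```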

If we do this instead with the Weil height, we get something which looks slightly tidier at first glance, but which on further inspection has some strange features. Let's take a look:

'''Weil Height, s=1'''
1:2:3:4:5 - 1
1:2:3:4 - 1
1:2:3 - 1
1:2 - 1
1 - 1
1:2:3:5 - 1.25
1:2:4:5 - 1.25
1:3:4:5 - 1.25
2:3:4:5 - 1.25
1:2:4 - 1.3333
1:3:4 - 1.3333
2:3:4 - 1.3333
1:3 - 1.5
2:3 - 1.5
1:2:5 - 1.6667
1:3:5 - 1.6667
1:4:5 - 1.6667
2:3:5 - 1.6667
2:4:5 - 1.6667
3:4:5 - 1.6667
1:4 - 2
3:4 - 2
1:5 - 2.5
2:5 - 2.5
3:5 - 2.5
4:5 - 2.5

This looks slightly neater at first because many similar chords of similar size are ranked together - but in this situation, the problem is that it is ranking many of these chords the same. So we have 1:2:5 and 3:4:5 ranked the same - either a blessing or a curse, depending on how you look at it (the latter is smaller in span, but more complex, and it balances out).

We are also now pretty far from our original criterion in that 1:2:3:5 is now ranked lower than 1, because we care about the span so much that adding harmonics is penalized just because the intervals are large. Again, a blessing or a curse, depending on what you are going for.

We can always interpolate between the two as well; if we go with the Tenney-Weil height for k=0.5, and again s=1, we get something intermediate:

'''Benedetti-Weil Height, k=0.5, s=1'''
1:2:3:4:5 - 0.72183
1:2:3:4 - 0.74387
1:2:3 - 0.77827
1:2 - 0.8409
1:2:3:5 - 0.8552
1:2:4:5 - 0.88651
1:3:4:5 - 0.9326
1:2:4 - 0.94281
1 - 1
1:3:4 - 1.0087
2:3:4:5 - 1.017
1:2:5 - 1.094
2:3:4 - 1.1323
1:3 - 1.1398
1:3:5 - 1.1705
1:4:5 - 1.228
2:3:5 - 1.3139
2:3 - 1.3554
2:4:5 - 1.3784
1:4 - 1.4142
3:4:5 - 1.4748
1:5 - 1.6719
3:4 - 1.8612
2:5 - 1.9882
3:5 - 2.2003
4:5 - 2.3644

So we again get something sensible, with results intermediate to the Tenney and Weil results. My view is that the Tenney one is probably just as valid in most cases, and perhaps somewhat simpler, but it's up to you.

Let's focus on just the Tenney height, and look at how the value of s changes things. If we instead set s to 0.5, we get the following:

'''Benedetti height, s=0.5'''
1:2:3:4:5 - 0.10421
1:2:3:4 - 0.13834
1:2:3:5 - 0.14627
1:2:4:5 - 0.15718
1:3:4:5 - 0.17395
1:2:3 - 0.2019
2:3:4:5 - 0.20686
1:2:4 - 0.22222
1:2:5 - 0.23938
1:3:4 - 0.25438
1:3:5 - 0.27402
1:4:5 - 0.3016
2:3:4 - 0.3205
2:3:5 - 0.34525
1:2 - 0.35355
2:4:5 - 0.37999
1:3 - 0.43301
3:4:5 - 0.43499
1:4 - 0.5
1:5 - 0.55902
2:3 - 0.61237
2:5 - 0.79057
3:4 - 0.86603
3:5 - 0.96825
1 - 1
4:5 - 1.118

We can see as we move s towards 0 that the larger chords tend to slide towards the top. We have one pentad, then mostly tetrads, then mostly triads, and then mostly dyads, with some very slight intermingling at the transition points. As s moves closer to 0 this behavior increases.

Likewise, if we make s larger and move it towards infinity, the Tenney height tends toward just the geometric mean, without being divided by any normalizing term. Then we get this:

'''Benedetti height, s=Inf'''
1 - 1
1:2 - 1.4142
1:3 - 1.7321
1:2:3 - 1.8171
1:4 - 2
1:2:4 - 2
1:2:5 - 2.1544
1:2:3:4 - 2.2134
1:5 - 2.2361
1:3:4 - 2.2894
1:2:3:5 - 2.3403
2:3 - 2.4495
1:3:5 - 2.4662
1:2:4:5 - 2.5149
1:2:3:4:5 - 2.6052
1:4:5 - 2.7144
1:3:4:5 - 2.7832
2:3:4 - 2.8845
2:3:5 - 3.1072
2:5 - 3.1623
2:3:4:5 - 3.3098
2:4:5 - 3.42
3:4 - 3.4641
3:5 - 3.873
3:4:5 - 3.9149
4:5 - 4.4721

Now there is plenty of intermingling, but it isn't at all apparent that the results are sensible. For instance, 2:5 is stronger than 2:3:4:5 even though the latter is fleshing out the chord with unambiguously simpler dyads, 1:4 is way ahead of 1:2:3:4, etc. The problem is that the bar is too high for tetrads; it simply is too hard for a tetrad to compete with smaller chords, since the geometric mean always tends to increase when adding notes, and we don't have this magic normalizing factor to decrease it anymore. In fact, we can get some interesting things by looking at this notion of "the bar" for a particular metric, which we will do below.

The viewpoint of this author is that the most sensible results are when we have s=1, with Tenney height preferred to Weil height. The all-around best seems to be the Tenney-Weil height with k somewhere near 0.5, although having k=0 is probably good enough and slightly simpler to work with, particularly since it reduces to an L1 norm for dyads. We will do an extended listing of both, this time with chords of at most four notes and with coefficients up to 7. First the Tenney height:

'''Benedetti height, s=1, tetrads with max-coefficient=7'''
1:2:3:4 - 0.55334
1:2:3:5 - 0.58509
1:2:3 - 0.60571
1:2:3:6 - 0.61237
1:2:4:5 - 0.62872
1:2:3:7 - 0.63643
1:2:4:6 - 0.65804
1:2:4 - 0.66667
1:2:4:7 - 0.68389
1:3:4:5 - 0.69579
1:2:5:6 - 0.69579
1:2 - 0.70711
1:2:5 - 0.71814
1:2:5:7 - 0.72313
1:3:4:6 - 0.72824
1:3:4:7 - 0.75685
1:2:6:7 - 0.75685
1:3:4 - 0.76314
1:2:6 - 0.76314
1:3:5:6 - 0.77002
1:3:5:7 - 0.80027
1:2:7 - 0.80338
1:3:5 - 0.82207
2:3:4:5 - 0.82744
1:4:5:6 - 0.82744
1:3:6:7 - 0.83759
1:4:5:7 - 0.85995
2:3:4:6 - 0.86603
1:3 - 0.86603
1:3:6 - 0.87358
2:3:4:7 - 0.90005
1:4:6:7 - 0.90005
1:4:5 - 0.90481
2:3:5:6 - 0.91571
1:3:7 - 0.91964
2:3:5:7 - 0.95169
1:5:6:7 - 0.95169
2:3:4 - 0.9615
1:4:6 - 0.9615
2:4:5:6 - 0.98399
2:3:6:7 - 0.99607
1 - 1
1:4 - 1
1:4:7 - 1.0122
2:4:5:7 - 1.0227
2:3:5 - 1.0357
1:5:6 - 1.0357
2:4:6:7 - 1.0703
3:4:5:6 - 1.089
1:5:7 - 1.0904
2:3:6 - 1.1006
1:5 - 1.118
3:4:5:7 - 1.1318
2:5:6:7 - 1.1318
2:4:5 - 1.14
2:3:7 - 1.1587
1:6:7 - 1.1587
3:4:6:7 - 1.1845
2:3 - 1.2247
1:6 - 1.2247
3:5:6:7 - 1.2525
2:4:7 - 1.2753
3:4:5 - 1.305
2:5:6 - 1.305
1:7 - 1.3229
4:5:6:7 - 1.3459
2:5:7 - 1.3738
3:4:6 - 1.3867
3:4:7 - 1.4598
2:6:7 - 1.4598
3:5:6 - 1.4938
3:5:7 - 1.5726
2:5 - 1.5811
4:5:6 - 1.6441
3:6:7 - 1.6711
4:5:7 - 1.7308
3:4 - 1.7321
4:6:7 - 1.8393
2:7 - 1.8708
3:5 - 1.9365
5:6:7 - 1.9813
4:5 - 2.2361
3:7 - 2.2913
4:7 - 2.6458
5:6 - 2.7386
5:7 - 2.958
6:7 - 3.2404

The results seem reasonably sensible to me, although with a few little caveats here and there - we have 1:7 ranked above 4:5:6:7, partly because there is no notion of octave-equivalence involved, and partly because Tenney height may not be prioritizing small-span intervals quite enough. But this is at least ballpark-sensible. We can tweak it slightly by looking at the Tenney-Weil norm with k=0.5 and s=1:

'''Benedetti-Weil height, s=1, k=0.5, tetrads with max-coefficient=7'''
1:2:3:4 - 0.74387
1:2:3 - 0.77827
1:2 - 0.8409
1:2:3:5 - 0.8552
1:2:4:5 - 0.88651
1:3:4:5 - 0.9326
1:2:4 - 0.94281
1:2:3:6 - 0.95841
1:2:4:6 - 0.99351
1 - 1
1:3:4 - 1.0087
2:3:4:5 - 1.017
1:2:5:6 - 1.0216
1:3:4:6 - 1.0452
1:2:3:7 - 1.0553
1:3:5:6 - 1.0747
1:2:4:7 - 1.094
1:2:5 - 1.094
1:4:5:6 - 1.1141
1:2:5:7 - 1.1249
2:3:4 - 1.1323
1:3 - 1.1398
2:3:4:6 - 1.1398
1:3:4:7 - 1.1509
1:2:6:7 - 1.1509
1:3:5 - 1.1705
2:3:5:6 - 1.172
1:3:5:7 - 1.1834
1:3:6:7 - 1.2107
2:4:5:6 - 1.2149
1:4:5:7 - 1.2267
1:4:5 - 1.228
1:2:6 - 1.2354
2:3:4:7 - 1.255
1:4:6:7 - 1.255
3:4:5:6 - 1.2781
2:3:5:7 - 1.2905
1:5:6:7 - 1.2905
2:3:5 - 1.3139
2:3:6:7 - 1.3203
1:3:6 - 1.3218
2:4:5:7 - 1.3378
2:3 - 1.3554
2:4:6:7 - 1.3686
1:2:7 - 1.3691
2:4:5 - 1.3784
1:4:6 - 1.3867
3:4:5:7 - 1.4073
2:5:6:7 - 1.4073
1:4 - 1.4142
1:5:6 - 1.4393
3:4:6:7 - 1.4398
1:3:7 - 1.4649
3:4:5 - 1.4748
3:5:6:7 - 1.4805
2:3:6 - 1.4837
4:5:6:7 - 1.5347
1:4:7 - 1.5368
1:5:7 - 1.595
2:5:6 - 1.6155
2:3:7 - 1.6443
1:6:7 - 1.6443
3:4:6 - 1.6654
1:5 - 1.6719
2:4:7 - 1.725
3:5:6 - 1.7285
2:5:7 - 1.7904
4:5:6 - 1.8134
3:4:7 - 1.8456
2:6:7 - 1.8456
3:4 - 1.8612
3:5:7 - 1.9155
1:6 - 1.9168
3:6:7 - 1.9746
2:5 - 1.9882
4:5:7 - 2.0096
4:6:7 - 2.0716
5:6:7 - 2.1501
1:7 - 2.1518
3:5 - 2.2003
4:5 - 2.3644
2:7 - 2.5589
3:7 - 2.8319
5:6 - 2.8663
4:7 - 3.043
5:7 - 3.2176
6:7 - 3.3677

This is probably my favorite of all the lists, although I would expect the Tenney height one would be good enough for most things, even if not perfect.

Next we will look at some ways to better quantify what this value of s is doing, and what "the bar" is for tetrads to outperform triads and so on.

The Bar

Given all of this, we may want to make explicit how these metrics rank chords of different sizes. One way to do so is to start with the "monad" of 1, and then look at which dyads, triads, tetrads, etc have the same complexity as that monad. This sets a bar for how simple, in intuitive terms, a dyad, triad, etc. has to be to rank better than the monad, and we will indeed call this the bar (or a bar, since it is generally non-unique) for our chordal complexity metric.

For instance, given the generalized Tenney height with [math]s=1[/math], it is easy to see that the following chords all have the same complexity, which determines the bar for that metric. In fact, this is also the bar for the Weil height and all of the Tenney-Weil heights, for all values of [math]k[/math], as long as [math]s=1[/math]:

  • 1
  • 2:2
  • 3:3:3
  • 4:4:4:4
  • 5:5:5:5:5
  • 6:6:6:6:6:6

And so on. Note that in this situation we aren't treating e.g. 2:2 as equivalent to the reduced ratio 1:1, but as its own thing.

This is a useful way to see how the complexity scales as we increase the number of notes in our chord. Thus, if we are starting with the monad "1", any dyad simpler than 2:2 will be ranked stronger than 1, as will any triad simpler than 3:3:3, and so on.

So as you add another harmonic to some chord, you have a little bit of leeway. If the new note is simple enough, the ranking will increase. But, if the new harmonic is very complex, then it is also very large, and that span-compensation starts to kick in, and the ranking will decrease.

Thus, for s=1, we have an increasingly simple set of chords from 1 -> 1:2 -> 1:2:3 -> 1:2:3:4 and so on, which we view as an interesting feature of this system, and a relatively simple "bar" which scales chords in a reasonably sensible way.

Note that for the Weil height, on the other hand, we get some additional ways to express this bar, because we have that the Weil height of 6:6:6:6:6:6, 1:2:3:4:5:6, and 1:1:1:1:1:6 are all the same thing. This may seem strange, but it's simply what results from taking the max of the elements in the ratio. One way to look at it is that the spans of the subdyads of 1:1:1:1:1:6 are much larger than those of 6:6:6:6:6:6 - you have five 6/1 dyads in the first chord, for instance, whereas the second chord is all unisons - and with the max function, these things simply balance out with the decreased complexity and they are ranked the same.

So with the Weil height, for s=1, we could have also written the bar like this:

  • 1
  • 1:2
  • 1:2:3
  • 1:2:3:4
  • 1:2:3:4:5
  • 1:2:3:4:5:6

or even

  • 1
  • 1:2
  • 1:1:3
  • 1:1:1:4
  • 1:1:1:1:5
  • 1:1:1:1:1:6

This would appear to be raising the bar - when written this way, a large chord now has to be much simpler in order to have the same complexity as 1 - but because the other bar is also equally valid, it doesn't really make much difference either way. The Weil height simply ranks lots of things as equal in complexity, so we're really talking about a difference within the rankings of chords that are the same size, without really any significant change in large-scale behavior between chords (if the first effect is accounted for).

If we want that kind of change, we can change the value of s. For [math]s=1/2[/math], we get a slightly different bar for all of these heights:

  • 1
  • 4:4
  • 9:9:9
  • 16:16:16:16
  • 25:25:25:25:25
  • 36:36:36:36:36:36

Now the triads can be much more complex than before and still have lower complexity than the monad "1". Thus, the bar has been significantly lowered. Again, for the Weil height, we have that 36:36:36:36:36:36 is the same complexity as 1:1:1:1:1:36.
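Both bars are easy to check numerically. A small sketch, using the height definitions from earlier in the article:

```python
from math import prod

def benedetti_height(chord, s=1.0):
    n = len(chord)
    return prod(chord) ** (1.0 / n) / n ** (1.0 / s)

def weil_height(chord, s=1.0):
    return max(chord) / len(chord) ** (1.0 / s)

# s=1 bar: N copies of N scores exactly 1 under both heights,
# since geomean(N,...,N)/N = max(N,...,N)/N = 1.
for n in range(1, 7):
    assert abs(benedetti_height((n,) * n) - 1.0) < 1e-9
    assert abs(weil_height((n,) * n) - 1.0) < 1e-9

# For the Weil height the bar is non-unique: 1:2:3:4:5:6 and
# 1:1:1:1:1:6 also score exactly 1, as noted above.
assert weil_height((1, 2, 3, 4, 5, 6)) == 1.0
assert weil_height((1, 1, 1, 1, 1, 6)) == 1.0

# s=1/2 bar: N copies of N^2, since N^2 / N^(1/(1/2)) = N^2 / N^2 = 1.
for n in range(1, 7):
    assert abs(benedetti_height((n * n,) * n, s=0.5) - 1.0) < 1e-9
```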

N-adic Recursive Complexity

These metrics are fairly useful as a starting point - from some basic first principles we have derived a fairly neat way to normalize the Tenney or Weil heights to compare chords of different sizes in a reasonably sensible way. We would like to build on this to derive a better metric for large chords.

Let's again look at the example of 16:20:24:30:36:45:54 and 15:19:23:29:35:44:53, where the former exhibits more of the previously described "justy" quality than the latter. The main point is that even though the simple complexities above would rank the second chord better than the first, the first benefits from being just three 4:5:6's stacked on top of one another, so that everywhere you look there are simple subdyads, subtriads, etc. A metric of complexity which only looks at the entire chord, without the subchords, will not catch these kinds of things.

From a psychoacoustic standpoint, there are several things happening.

First, the presence of simpler subdyads means the first chord exhibits less psychoacoustic roughness than the second, so we should look at those.

Second, while the basic premise that the brain attempts to fit sounds to the harmonic series is fairly sound in some sense, in real life the brain does not literally attempt to fit the entire auditory signal into the perception of being one note. Rather, there are several pitched sound sources taking place at the same time, and the brain is attempting to sparsely locate several different pitched sounds at once. This is probably relevant to the perception of what are sometimes called "upper structure triads" in jazz, or upper structure subchords in general.

And of course, third, while psychoacoustic literature on this kind of thing is somewhat sparse, from a compositional and musical standpoint we can certainly note that these kinds of chords seem to sound very interesting, so we may as well look for them.

For these reasons, we want to build a better metric that also looks at the various subchords, each of which can be evaluated with the simple complexity we've built above. And since Tenney height seems to behave reasonably well, and is very easy to work with and compute, we can use that to form our composite metric. We will call such a metric a recursive complexity metric on chords.

Subcomplexities and Subfundamentals

There are two basic ways to proceed with this. The first, simpler method is to look at all of the subchords of our chord and evaluate the simple complexity of each. The result is a vector of subcomplexities for the chord. We can then integrate these into a single scalar complexity by taking some monotonic function of the resulting vector - such as a p-norm or power mean - to get a measure of the recursive complexity of a chord of size N. Then we can scale the result so that it is meaningful to compare chords of different sizes, as we did before.
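This first method can be sketched as follows. Two details are left open by the text and are assumptions here: subchords are reduced to lowest terms before scoring (so 16:20 is scored as 4:5), and the aggregation is a power mean with a free exponent p:

```python
from math import gcd, prod
from itertools import combinations

def benedetti_height(chord, s=1.0):
    n = len(chord)
    return prod(chord) ** (1.0 / n) / n ** (1.0 / s)

def subcomplexities(chord, s=1.0):
    """Simple complexities of every subchord of size >= 2, each reduced
    to lowest terms first (an assumption, not fixed by the text)."""
    out = []
    for size in range(2, len(chord) + 1):
        for sub in combinations(chord, size):
            g = gcd(*sub)
            out.append(benedetti_height(tuple(x // g for x in sub), s))
    return out

def recursive_complexity(chord, s=1.0, p=2.0):
    """Aggregate the subcomplexity vector with a power mean of exponent p."""
    v = subcomplexities(chord, s)
    return (sum(x ** p for x in v) / len(v)) ** (1.0 / p)

stacked = (16, 20, 24, 30, 36, 45, 54)  # three stacked 4:5:6 chords
shifted = (15, 19, 23, 29, 35, 44, 53)  # each note lowered by one
print(recursive_complexity(stacked), recursive_complexity(shifted))
```

With any reasonable p this ranks the stacked-4:5:6 chord as far less complex than its shifted neighbor, since nearly all of its subchords reduce to simple ratios.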

The other way to proceed is to look not just at the subcomplexities of the chords, but also to keep track of when two subchords evoke the same subfundamental. For instance, for the chord 4:5:7:9, all of the subdyads, subtriads, etc point to the same subfundamental (which would be "1"), except for the "sub-monads", which point to themselves. As a result, there are really five pitched sounds of interest here: the individual notes themselves, as atomic pitched sounds, and the virtual "1", which every possible subdyad, subtriad, etc identically points to. So our set of subfundamentals for this chord would be {1, 4, 5, 7, 9}.

On the other hand, if we also add the note "6" to the above chord, making 4:5:6:7:9, we get a new subfundamental at "2", to which the subdyad 4:6 points (as its second and third harmonics), as well as at "3", to which 6:9 points (again as its second and third harmonics), so that we now have {1, 2, 3, 4, 5, 6, 7, 9}. So in this method, we look at all of the subfundamentals evoked and assign a strength to each one. We end up with a vector of strengths for each harmonic from 1 to M, where M is the max coefficient of the chord, which we can then incorporate into a general score for the chord. We can also, if we care, look at how harmonically related the various subfundamentals are to one another.
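The subfundamental bookkeeping above can be sketched by taking the GCD of each subchord as its evoked fundamental (as in the 4:6 and 6:9 examples), together with the notes themselves as sub-monads:

```python
from math import gcd
from itertools import combinations

def subfundamentals(chord):
    """Fundamentals evoked by a chord: each note itself (the sub-monads),
    plus the GCD of every subchord of two or more notes."""
    funds = set(chord)
    for size in range(2, len(chord) + 1):
        for sub in combinations(chord, size):
            funds.add(gcd(*sub))
    return sorted(funds)

print(subfundamentals((4, 5, 7, 9)))     # [1, 4, 5, 7, 9]
print(subfundamentals((4, 5, 6, 7, 9)))  # [1, 2, 3, 4, 5, 6, 7, 9]
```

Assigning a strength to each subfundamental (rather than just collecting the set) is left open here, as in the text.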


TODO: add more later...

  1. TODO: add audio examples
  2. In fact, this can also be thought of as a representation of the span in terms of a different unit: rather than the typical cents or octaves, we are using "nepers", where one "neper" is equal to [math]1200\log_2(e) \approx 1731.234[/math] cents - perfectly legitimate, if a bit strange, and used rather frequently in the writings of the late Martin Gough.