Revision as of 03:38, 25 November 2018

Harmonic Entropy, sometimes abbreviated as "HE", is a simple model to quantify the extent to which musical chords exhibit various psychoacoustic effects, lumped together in a single construct called psychoacoustic concordance. It was invented by Paul Erlich and developed extensively on the Yahoo! tuning and harmonic_entropy lists. Various later contributions to the model have been made by Steve Martin, Mike Battaglia, Keenan Pepper, and others.

Background

The general workings of the human auditory system lead to a plethora of well-documented and sonically interesting phenomena that can occur when a musical chord is played:

The perception of partial timbral fusion of the chord into one complex sound
The appearance of a virtual fundamental pitch in the bass
Critical band effects, such as timbral beatlessness, compared to mistunings of the chord in the surrounding area
The appearance of a quick fluttering effect sometimes known as periodicity buzz

There has been much research in the literature specifically on the musical implications of critical band effects (e.g. Sethares's work), which are perhaps the psychoacoustic phenomena that readers are most familiar with. However, the modern xenharmonic community has displayed immense interest in exploring the other effects mentioned above as well, which have proven extremely important to the development of modern xenharmonic music.

These effects sometimes behave differently, and do not always appear strictly in tandem with one another. For instance, Paul Erlich has noted that most models for beatlessness measure 10:12:15 and 4:5:6 as being identical, whereas the latter exhibits more timbral fusion and a more salient virtual fundamental than the former.

However, suppose we want to come up with a combined measure for how often effects such as the above tend to occur. It is then useful to note that:

In general, effects such as these tend to appear most strongly for those chords with large subsets that correspond to simple chunks of the harmonic series
In general, the effects produced exhibit some degree of tolerance for mistuning

This enables us to speak of a general notion of the psychoacoustic concordance of an interval - the degree to which effects such as the above will appear when an arbitrary musical chord is played. Additionally, chords which are very inharmonic often exhibit a quality known as psychoacoustic discordance.

While psychoacoustic concordance is not a feature universal to all styles of music, it has been utilized significantly in Western music in the study of intonation. For instance, flexible-pitch ensembles operating within 12-EDO, such as barbershop quartets and string ensembles, will often adjust intonationally from the underlying 12-EDO reference to maximize the concordance of individual chords. Indeed, the entire history of Western tuning theory -- from meantone temperament, to the various Baroque well-temperaments, to 12-EDO itself, to the modern theory of regular temperament -- can be seen as an attempt to reason mathematically about how to generate manageable tuning systems that will maximize concordance and minimize discordance.

The Harmonic Entropy model is a simple way of quantifying how much an arbitrary chord will exhibit psychoacoustic concordance.

Concordance has often been confused with actual musical consonance, a confusion made more common by the psychoacoustics literature, which gives concordance the unfortunate name sensory consonance, a term most often used to refer to phenomena related to roughness and beatlessness specifically. This is not to be confused with the more familiar construct of tonal stability, typically just called "consonance" in Western common practice music theory and sometimes clarified as "musical consonance" in the music cognition literature. To make matters worse, the literature has also at times referred to concordance -- and not tonal stability -- as tonal consonance, often referring to phenomena related to virtual pitch integration, creating a complete terminological mess. As a result, the term "consonance" has been completely avoided in this article.

Basic Model: Shannon Entropy

The original Harmonic Entropy model limited itself to working with dyads. More recently, work by Steve Martin and others has extended this basic idea to higher-cardinality chords. This article will concern itself with dyads, as the dyadic case is still the most well-developed, and many of the ideas extend naturally to larger chords without need for much exposition.

The general idea of Harmonic Entropy is to first develop a discrete probability distribution quantifying how strongly an arbitrary incoming dyad "matches" every element in a set of basis rational intervals, and then seeing how evenly distributed the resulting probabilities are. If the distribution for some dyad is spread out very evenly, such that there is no clear "victor" basis interval that dominates the distribution, the dyad is considered to be more discordant; on the other extreme, if the distribution tends to concentrate on one or a small set of dyads, the dyad is considered to be more concordant.

A clear mathematical way of quantifying this "dispersion" is via the Shannon entropy of the probability distribution, which can be thought of as describing the "uncertainty" in the distribution. A distribution which has a very high probability of picking one outcome has low entropy and is not very uncertain, whereas a distribution which has the probability spread out on many outcomes is highly uncertain and has a high entropy.

Definitions

To formalize our notion of Shannon entropy, we will first describe the random variable [math]\displaystyle{ J }[/math], representing the JI "basis" interval that our incoming interval is being "matched" to, and the parameter [math]\displaystyle{ C }[/math], representing the "cents" of the incoming interval being played. For example, the parameter [math]\displaystyle{ C }[/math] would take values such as "400 cents," and the random variable [math]\displaystyle{ J }[/math] would take values in the set of basis ratios, such as "5/4" or "9/7."

So for example, if we want to express the probability that the incoming dyad "400 cents" is perceived as the JI basis interval "5/4," we would write that as the conditional probability [math]\displaystyle{ \newcommand{\cent}{\text{¢}} }[/math] [math]\displaystyle{ P(J=5/4|C=400\cent) }[/math]

Or, in general, if we want to write the conditional probability that some incoming dyad of [math]\displaystyle{ c }[/math] cents is perceived as the JI basis interval [math]\displaystyle{ j }[/math], we would write that as

[math]\displaystyle{ P(J=j|C=c) }[/math]

Note that at this point, we haven't yet specified what the particular probability distribution is. There are different ways to do this, which are described in more detail below. Generally, most approaches involve each JI interval's probability being assigned based on how close it is to [math]\displaystyle{ c }[/math] (closer dyads are given a larger probability), and how simple it is (simple dyads are given a higher probability, if distance is the same).

A noteworthy point is that we generally do not assume any probability distribution on [math]\displaystyle{ C }[/math]. This reflects that we do not make any assumptions at all about which notes or intervals are likely to be played to begin with. In other words, we are treating [math]\displaystyle{ C }[/math] more as a "parameter" rather than as a random variable.

Once we have decided on a probability distribution, we can finally evaluate the Shannon entropy. For a random variable [math]\displaystyle{ X }[/math], the Shannon entropy is defined as:

[math]\displaystyle{ H(X) = -\sum_{x \in X} P(X=x) \log_b P(X=x) }[/math]

where the different [math]\displaystyle{ x }[/math] are taken from the sample space of [math]\displaystyle{ X }[/math], and [math]\displaystyle{ b }[/math] is the base of the log. Different choices of [math]\displaystyle{ b }[/math] simply change the units in which entropy is given, the most common values being 2 and e, denoting "bits" and "nats". We will omit the base going forward, for simplicity.
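As a minimal sketch of this definition (the function name `shannon_entropy` and the two toy distributions are illustrative assumptions, not part of the model), the entropy of a finite distribution can be computed directly from the formula. A sharply peaked distribution gives a low value and a uniform one gives the maximum.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Shannon entropy of a discrete distribution given as a list of probabilities."""
    # Outcomes with zero probability contribute nothing, since p*log(p) -> 0 as p -> 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical toy distributions over four outcomes.
peaked = [0.97, 0.01, 0.01, 0.01]    # one clear "victor": low uncertainty
uniform = [0.25, 0.25, 0.25, 0.25]   # no victor: maximal uncertainty

print(shannon_entropy(peaked))           # entropy in nats (base e)
print(shannon_entropy(uniform, base=2))  # entropy in bits (base 2)
```

Changing `base` only rescales the result, which is why the base can be left unspecified without affecting where the minima and maxima fall.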

In our case, we want to find the entropy of the random variable [math]\displaystyle{ J }[/math] of JI intervals, given a particular choice of incoming dyad in cents. The corresponding quantity that we want is:

[math]\displaystyle{ H(J|C=c) = -\sum_{j \in J} P(J=j|C=c) \log P(J=j|C=c) }[/math]

Note that above, the summation is only taken on the [math]\displaystyle{ j }[/math] from the sample space of [math]\displaystyle{ J }[/math] (i.e. the set of JI basis intervals), whereas the parameter [math]\displaystyle{ c }[/math] is treated as constant within the summation (and is taken as the free parameter to the function).

Since the parameter [math]\displaystyle{ c }[/math] is the free parameter, sometimes the above is notated as

[math]\displaystyle{ \text{HE}(c) = H(J|C=c) }[/math]

which makes more explicit that [math]\displaystyle{ c }[/math] is the argument to the harmonic entropy function, which is equal to the entropy of [math]\displaystyle{ J }[/math], conditioned on [math]\displaystyle{ C=c }[/math].

Probability Distributions

In order to systematically assign a probability distribution to this dyad, we first start by defining a spreading function, denoted by [math]\displaystyle{ S(x) }[/math], that dictates how the dyad is "smeared" out in log-frequency space, representing how the auditory system allows for some tolerance for mistuning. The typical choice that we will assume here for a spreading function is a Gaussian distribution, with mean centered around the incoming dyad, and standard deviation typically taken as a free parameter in the system and denoted as [math]\displaystyle{ s }[/math].

A fairly typical choice of settings for a basic dyadic HE model would be:

The basis set is all those rationals bounded by some maximum Tenney height, with the bound typically notated as [math]\displaystyle{ N }[/math] and set to at least 10,000.
The spreading function is typically a Gaussian distribution with a frequency deviation of 1% either way, or about s=~17 cents.

Other spreading functions have also been explored, such as the use of the heavy-tailed Laplace distribution, sometimes described as the "Vos function" in Paul's writings. We will assume the Gaussian distribution as the spreading function for the remainder of this article, so that the spreading function for an incoming dyad [math]\displaystyle{ c }[/math] can be written as follows:

[math]\displaystyle{ S(x-c) = \frac{1}{s\sqrt{2\pi}} e^{-\frac{(x-c)^2}{2s^2}} }[/math]

where the notation [math]\displaystyle{ S(x-c) }[/math] is chosen to make clear that we are translating [math]\displaystyle{ S(x) }[/math] to be centered around the incoming dyad [math]\displaystyle{ c }[/math], which is now the mean of the Gaussian.

We assume here that the variable [math]\displaystyle{ x }[/math] is a dummy variable representing cents, and will adopt this convention for the remainder of the article.

In this notation, [math]\displaystyle{ s }[/math] becomes the standard deviation of the Gaussian, being an ASCII-friendly version of the more familiar symbol [math]\displaystyle{ \sigma }[/math] for representing the standard deviation. Note that in previous expositions on Harmonic Entropy, [math]\displaystyle{ s }[/math] was sometimes given in units representing a percentage of linear-frequency deviation; we allow [math]\displaystyle{ s }[/math] to stand for cents here to simplify the notation. To convert from a percentage to cents, the formula [math]\displaystyle{ \text{cents} = 1200\log_2(1+\text{percentage}) }[/math] can be used.
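The conversion and the Gaussian spreading function can be sketched as follows. The function names `percent_to_cents` and `spreading` are hypothetical, and the percentage is assumed to be passed as a fraction (0.01 for 1%).

```python
import math

def percent_to_cents(fraction):
    """Convert a linear-frequency deviation (0.01 meaning 1%) to cents."""
    return 1200 * math.log2(1 + fraction)

def spreading(x, c, s):
    """Gaussian spreading function S(x - c): mean c, standard deviation s, all in cents."""
    return math.exp(-((x - c) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

s = percent_to_cents(0.01)
print(round(s, 2))              # about 17.23 cents for a 1% deviation
print(spreading(386.3, 400, s)) # density at 386.3 cents for an incoming 400-cent dyad
```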

It is also common to use as a basis set all those rationals bounded by some maximum Weil height, with a typical cutoff for [math]\displaystyle{ N }[/math] set to at least 100. This has sometimes been referred to as seeding HE with the "Farey sequence of order [math]\displaystyle{ N }[/math]" and its reciprocals, so references in Paul's work to "Farey series HE" vs "Tenney series HE" are sometimes seen.

Lastly, the set of rationals is often chosen to be only those "reduced" rationals within the cutoff, such that [math]\displaystyle{ n/d }[/math] is in the set only if [math]\displaystyle{ n }[/math] and [math]\displaystyle{ d }[/math] are coprime. HE can also be formulated with unreduced rationals as well. Both methods tend to give similar results. In Paul's work, reduced rationals are most common, although the use of unreduced rationals may be useful in extending HE to the case where [math]\displaystyle{ N=\infty }[/math].
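A small sketch of how such basis sets might be enumerated is given below. It assumes, as the cutoffs and complexity functions used in this article suggest, that the Tenney bound means n·d ≤ N and the Weil bound means max(n, d) ≤ N; the function names are hypothetical.

```python
from math import gcd

def tenney_basis(N):
    """Reduced ratios n/d with n*d <= N, returned as (n, d) pairs."""
    return [(n, d)
            for n in range(1, N + 1)
            for d in range(1, N // n + 1)   # d <= N // n guarantees n*d <= N
            if gcd(n, d) == 1]              # keep only reduced ratios

def weil_basis(N):
    """Reduced ratios n/d with max(n, d) <= N, returned as (n, d) pairs."""
    return [(n, d)
            for n in range(1, N + 1)
            for d in range(1, N + 1)
            if gcd(n, d) == 1]

print(len(tenney_basis(6)), len(weil_basis(3)))
```

Dropping the `gcd` test would give the unreduced variant, in which 2/2, 4/2 and so on are counted as separate entries.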

Given a spreading function and set of basis rationals, there are two different procedures commonly used to assign probabilities to each rational. The first, the domain-integral approach, works for arbitrary nowhere dense sets of rationals without any further free parameters. The second, the complexity-normalization approach, has nice mathematical properties which sometimes make it easier to compute and which may lead to generalizations to infinite sets of rationals which are sometimes dense in the reals. It is conjectured that there are certain important limiting situations where the two converge; both are described in detail below.

Domain-Integral Probabilities

For discrete sets of JI basis ratios, the log-frequency spectrum can be divided up into domains assigned to each ratio. Each ratio is assigned a domain with lower bound equal to the mediant of itself and its nearest lower neighbor, and likewise with upper bound equal to the mediant of itself and its nearest upper neighbor. If no such neighbor exists, [math]\displaystyle{ \pm \infty }[/math] is used instead. Mathematically, this can be represented via the following expression:

[math]\displaystyle{ P(J=j|C=c) = \int_{\cent(j_l)}^{\cent(j_u)} S(x-c) dx }[/math]

where [math]\displaystyle{ S(x-c) }[/math] is the spreading function associated with c, [math]\displaystyle{ j_l }[/math] and [math]\displaystyle{ j_u }[/math] are the domain lower and upper bounds associated with JI basis ratio [math]\displaystyle{ j }[/math], and [math]\displaystyle{ \cent(f) = 1200\log_2(f) }[/math], or the "cents" function converting frequency ratios to cents. Typically, [math]\displaystyle{ j_l }[/math] is set equal to the mediant of [math]\displaystyle{ j }[/math] and its nearest lower neighbor (if it exists), or [math]\displaystyle{ -\infty }[/math] if not; likewise with [math]\displaystyle{ j_u }[/math] and its nearest upper neighbor.

This process can be summarized by the following picture, taken from William Sethares' paper on Harmonic Entropy:

Note the difference in terminology here - in this example, the [math]\displaystyle{ f_{j+n} }[/math] are the basis ratios, the [math]\displaystyle{ r_{j+n} }[/math] are the domains for each basis ratio, and the bounds for each domain are the mediants between each [math]\displaystyle{ f_{j+n} }[/math] and its nearest neighbor. The probability assigned to each basis ratio is then the area under the spreading function curve for each ratio's domain. The entropy of this probability distribution is then the Harmonic Entropy for that dyad.
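The procedure can be sketched in a few lines, under some stated assumptions: the basis set is the reduced ratios with n·d ≤ N (a small N = 100 is used only to keep the example fast), the spreading function is the Gaussian with s = 17 cents, and the integral over each domain is evaluated through the Gaussian cumulative distribution function. All function names are illustrative.

```python
import math
from math import gcd

def cents(ratio):
    """Convert a frequency ratio to cents."""
    return 1200 * math.log2(ratio)

def gaussian_cdf(x, mean, s):
    """Integral of the Gaussian spreading function from -infinity to x."""
    return 0.5 * (1 + math.erf((x - mean) / (s * math.sqrt(2))))

def domain_integral_probs(c, s=17.0, N=100):
    """P(J=j | C=c) for every reduced ratio n/d with n*d <= N."""
    basis = sorted(
        ((n, d) for n in range(1, N + 1) for d in range(1, N // n + 1) if gcd(n, d) == 1),
        key=lambda r: r[0] / r[1],
    )
    # Domain edges in cents: mediants of neighbouring ratios, open-ended at both extremes.
    edges = [-math.inf]
    for (n1, d1), (n2, d2) in zip(basis, basis[1:]):
        edges.append(cents((n1 + n2) / (d1 + d2)))
    edges.append(math.inf)
    # Probability of each ratio = area under the Gaussian over its own domain.
    return {
        ratio: gaussian_cdf(edges[i + 1], c, s) - gaussian_cdf(edges[i], c, s)
        for i, ratio in enumerate(basis)
    }

probs = domain_integral_probs(cents(3 / 2))
best = max(probs, key=probs.get)
entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
print(best, entropy)
```

Because the domains tile the whole line, the probabilities sum to 1 without any normalization step, which is why this approach needs no further free parameters.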

In the case where the set of basis rationals consists of a finite set bounded by Tenney or Weil height, the resulting set of widths is conjectured to have interesting mathematical properties, leading to mathematically nice conceptual simplifications of the model. These simplifications are explained below.

Complexity-Normalization Probabilities

It has been noted empirically by Paul Erlich that, given all those rationals with Tenney height under some cutoff [math]\displaystyle{ N }[/math] as a basis set, the domain widths for rationals sufficiently far from the cutoff seem to be proportional to [math]\displaystyle{ \frac{1}{\sqrt{nd}} }[/math].

While it's still an open conjecture that this pattern holds for arbitrarily large [math]\displaystyle{ N }[/math], the assumption is sometimes made that this is the case, and hence that for these basis ratio sets, [math]\displaystyle{ \frac{1}{\sqrt{nd}} }[/math] "approximations" to the width are sufficient to estimate domain-integral Harmonic Entropy.

This modifies the expression for the probabilities [math]\displaystyle{ P(J=j|C=c) }[/math] as follows, noting that for now the "probabilities" won't sum to 1:

[math]\displaystyle{ \hat P(J=j|C=c) = \frac{S(\cent(j)-c)}{\sqrt{j_n \cdot j_d}} }[/math]

where the [math]\displaystyle{ \hat P }[/math] notation now represents that these "probabilities" are unnormalized, and [math]\displaystyle{ j_n }[/math] and [math]\displaystyle{ j_d }[/math] are the numerator and denominator, respectively, of JI basis ratio [math]\displaystyle{ j }[/math]. Again, the set of basis rationals here is assumed to be all of those rationals of Tenney Height ≤ [math]\displaystyle{ N }[/math] for some [math]\displaystyle{ N }[/math].

A similar observation for the use of Weil-bounded subsets of the rationals suggests domain widths of [math]\displaystyle{ \frac{1}{\max(n,d)} }[/math], yielding instead the following formula:

[math]\displaystyle{ \hat P(J=j|C=c) = \frac{S(\cent(j)-c)}{\max(j_n, j_d)} }[/math]

where this time the set of basis rationals is assumed to be all of those of Weil Height ≤ [math]\displaystyle{ N }[/math] for some [math]\displaystyle{ N }[/math].

In both cases, the general approach is the same: the value of the spreading function, taken at the value of [math]\displaystyle{ \cent(j) }[/math], is divided by some sort of "complexity" function representing how much weight is given to that rational number. While the two complexity functions considered thus far were derived empirically by observing the asymptotic behavior of various height-bounded subsets of the rationals, we can generalize this for arbitrary basis sets of rationals and arbitrary complexities as follows:

[math]\displaystyle{ \hat P(J=j|C=c) = \frac{S(\cent(j)-c)}{\|j\|} }[/math]

where [math]\displaystyle{ \|j\| }[/math] denotes a complexity function mapping from rational numbers to positive reals.

As these "probabilities" don't sum to 1, the result is not a probability distribution at all, invalidating the use of the Shannon Entropy. To rectify this, the distribution is normalized so that the probabilities do sum to 1:

[math]\displaystyle{ P(J=j|C=c) = \frac{\hat P(J=j|C=c)}{\sum_{j \in J} \hat P(J=j|C=c)} }[/math]

which is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities. This definition of [math]\displaystyle{ P(J=j|C=c) }[/math] is then used directly to compute the entropy.
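Putting the pieces together, a minimal sketch of complexity-normalized Harmonic Entropy might look as follows. It assumes the Gaussian spreading function, reduced ratios with n·d ≤ N, and the [math]\displaystyle{ \sqrt{nd} }[/math] complexity; the cutoff N = 1000 is chosen only so the example runs quickly, and the function name is hypothetical.

```python
import math
from math import gcd

def harmonic_entropy(c, s=17.0, N=1000):
    """HE(c) in nats, using sqrt(n*d) complexity-normalization probabilities."""
    raw = []  # unnormalized "probabilities", one per basis ratio
    for n in range(1, N + 1):
        for d in range(1, N // n + 1):
            if gcd(n, d) != 1:
                continue
            x = 1200 * math.log2(n / d)  # position of the basis ratio in cents
            spread = math.exp(-((x - c) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
            raw.append(spread / math.sqrt(n * d))
    total = sum(raw)  # normalizing constant, so the probabilities sum to 1
    return -sum((w / total) * math.log(w / total) for w in raw if w > 0)

for width in (0, 50, 702, 750):
    print(width, harmonic_entropy(width))
```

Swapping `math.sqrt(n * d)` for `max(n, d)` (together with a Weil-bounded basis set) would give the other weighting described above.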

This approach to assigning probabilities to basis rationals is useful because it hypothetically makes it possible to consider the HE of sets of rationals which are dense in the reals, or even the entire set of positive rationals, although the best way to do this is a subject of ongoing research.

Examples

In all of these examples, the x-axis represents the width in cents of the dyad, and the y-axis represents discordance rather than concordance, measured in nats of Shannon entropy.

s=17, N<10000, sqrt(n*d) weights

This uses as a spreading function the Gaussian distribution with [math]\displaystyle{ s=~17\cent }[/math] (or a lin-frequency deviation of 1%). The basis set is all rationals of Tenney height less than 10,000. This uses the complexity-normalization approach, and the complexity function is [math]\displaystyle{ \sqrt{nd} }[/math]:

s=17, N<100, max(n,d) weights

This example uses the same spreading function and standard deviation, but this time the basis set is all rationals of Weil height less than 100. The complexity function here is [math]\displaystyle{ \max(n,d) }[/math]:

s=17, N<10000, sqrt(n*d) vs mediant-to-mediant weights

The following image (from Paul Erlich) compares the domain-integral and complexity-normalization approaches by overlaying the two curves on top of each other. In both cases, the spreading function is again a Gaussian with s=~17 cents, and the basis set is all those rationals with Tenney height ≤ 10000. It can be seen that the curves are extremely similar, and that the locations of the minima and maxima are largely preserved:

Harmonic Rényi Entropy

An extension to the base Harmonic Entropy model, proposed by Mike Battaglia, is to generalize the use of Shannon entropy by replacing it instead with Rényi entropy, a q-analog of Shannon's original entropy. This can be thought of as adding a second parameter, called [math]\displaystyle{ a }[/math], to the model, reflecting how "intelligent" the brain's "decoding" process is when determining the most likely JI interpretation of an ambiguous interval.

Definitions and Background

The Harmonic Rényi Entropy of order a of an incoming dyad can be defined as follows:

[math]\displaystyle{ \text{HE}_a(c) = H_a(J=j|C=c) = \frac{1}{1-a} \log \sum_{j \in J} P(J=j|C=c)^a }[/math]

Being a q-analog, it is noteworthy that Rényi entropy converges to Shannon entropy in the limit as [math]\displaystyle{ a \to 1 }[/math], a fact which can be verified using L'Hôpital's rule as found here.
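A short version of that limit argument is sketched here, writing [math]\displaystyle{ p_j }[/math] as shorthand (introduced only for this derivation) for [math]\displaystyle{ P(J=j|C=c) }[/math]. Because the [math]\displaystyle{ p_j }[/math] sum to 1, both [math]\displaystyle{ \log \sum_{j \in J} p_j^a }[/math] and [math]\displaystyle{ 1-a }[/math] tend to 0 as [math]\displaystyle{ a \to 1 }[/math], so L'Hôpital's rule applies. Differentiating the numerator and denominator with respect to [math]\displaystyle{ a }[/math] gives

[math]\displaystyle{ \lim_{a \to 1} \frac{\log \sum_{j \in J} p_j^a}{1-a} = \lim_{a \to 1} \frac{\left(\sum_{j \in J} p_j^a \log p_j\right) / \sum_{j \in J} p_j^a}{-1} = -\sum_{j \in J} p_j \log p_j }[/math]

which is the Shannon entropy of the same distribution.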

The Rényi entropy has found use in cryptography as a measure of the strength of a cryptographic code in the face of an intelligent attacker, an application for which Shannon entropy has long been known to be insufficient as described in this paper and this RFC. More precisely, the Rényi entropy of order [math]\displaystyle{ \infty }[/math], also called the min-entropy, is used to measure the strength of the randomness used to define a cryptographic secret against a "worst-case" attacker who has complete knowledge of the probability distribution from which cryptographic secrets are drawn.

In a musical context, by considering the incoming dyad as analogous to a cryptographic code which is attempting to be "cracked" by an intelligent auditory system, we can consider that the analogous "worst-case attacker" would be a "best-case auditory system" which has complete awareness of the probability distribution for any incoming dyad. This analogy would view such an auditory system as actively attempting to choose the most probable rational, rather than drawing a rational at random weighted by the distribution.

The use of [math]\displaystyle{ a=\infty }[/math] min-entropy would reflect this view. In contrast, the use of [math]\displaystyle{ a=1 }[/math] Shannon entropy reflects a much "dumber" process which performs no such analysis and perhaps doesn't even seek to "choose" any sort of "victor" rational at all. As the parameter a interpolates between these two options, it can be interpreted as the extent to which the rational-matching process for incoming dyads is considered to be "intelligent" and "active" in this way.

Some psychoacoustic effects naturally fit into this paradigm, such as the virtual pitch integration process, which actually does attempt to find a single victor when matching incoming chords with chunks of the harmonic series. Other psychoacoustic effects, such as that of beatlessness, may instead be better viewed as "dumb" processes whereby nothing in particular is being "chosen," but where a more uniform distribution of matching rational numbers for a dyad simply generates a more discordant sonic effect. Different values of a can differentiate between the predominance given to these two types of effect in the overall construct of psychoacoustic concordance.

Certain values of [math]\displaystyle{ a }[/math] reduce to simpler expressions and have special names, as given in the examples below.

Examples

a=0: Harmonic Hartley Entropy

[math]\displaystyle{ H_0(J|C=c) = \log |J| }[/math]

where [math]\displaystyle{ |J| }[/math] is the cardinality of the set of basis rationals. This assumes, in essence, an "infinitely dumb" auditory system which can do no better than picking a rational number from a uniform distribution completely at random. All dyads have the same Harmonic Hartley Entropy. The Hartley Entropy is sometimes called the "max-entropy," and is useful mainly as an upper bound on the other forms of entropy: all Rényi Entropies are always guaranteed to be less than or equal to the Hartley Entropy.

Harmonic Hartley Entropy (a=0) with the basis set all rationals with Tenney height ≤ 10000. Note that the choice of spreading function makes no difference in the end result at all.

a=1: Harmonic Shannon Entropy (Harmonic Entropy)

[math]\displaystyle{ H_1(J|C=c) = -\sum_{j \in J} P(J=j|C=c) \log P(J=j|C=c) }[/math]

This is Paul's original Harmonic Entropy. Within the cryptographic analogy, this can be thought of as an auditory system which simply selects a rational at random from the incoming distribution, weighted via the distribution itself.

Harmonic Shannon Entropy (a=1) with the basis set all rationals with Tenney height ≤ 10000, spreading function a Gaussian distribution with s=1% (~17 cents), and [math]\displaystyle{ \sqrt{nd} }[/math] complexity.

a=2: Harmonic Collision Entropy

[math]\displaystyle{ H_2(J=j|C=c) = -\log \sum_{j \in J} P(J=j|C=c)^2 = -\log P(J_1 = J_2|C=c) }[/math]

where [math]\displaystyle{ J_1 }[/math] and [math]\displaystyle{ J_2 }[/math] are two independent and identically distributed random variables of JI basis ratios, conditioned on the same incoming dyad [math]\displaystyle{ C=c }[/math], and the collision entropy is the same as the negative log of the probability that the two JI variables produce the same outcome.

Harmonic Collision Entropy (a=2) with the basis set all rationals with Tenney height ≤ 10000, spreading function a Gaussian distribution with s=1% (~17 cents), and [math]\displaystyle{ \sqrt{nd} }[/math] complexity.

a=∞: Harmonic Min-Entropy

[math]\displaystyle{ H_\infty(J=j|C=c) = -\log \max_{j \in J} P(J=j|C=c) }[/math]

This is the min-entropy, which simply takes the negative log of the largest probability in the distribution. This can be thought of as representing the "strength" of the incoming dyad against being "deciphered" by a "best-case" auditory system. The name "min-entropy" reflects that the [math]\displaystyle{ a=\infty }[/math] case is guaranteed to be a lower bound among all Rényi entropies.

Harmonic Rényi Entropy with a=7, with the high value of a being chosen to approximate min-entropy (a=∞). The basis set is still all rationals with Tenney height ≤ 10000, the spreading function a Gaussian distribution with s=1% (~17 cents), and the complexity function [math]\displaystyle{ \sqrt{nd} }[/math].
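The special cases above can be checked numerically with a small sketch. The function name `renyi_entropy` and the four-outcome toy distribution are illustrative assumptions; in the harmonic setting, `probs` would be the values [math]\displaystyle{ P(J=j|C=c) }[/math] for one incoming dyad.

```python
import math

def renyi_entropy(probs, a):
    """Rényi entropy of order a, in nats, of a discrete distribution."""
    probs = [p for p in probs if p > 0]
    if a == 1:                      # Shannon entropy (the a -> 1 limit)
        return -sum(p * math.log(p) for p in probs)
    if a == math.inf:               # min-entropy
        return -math.log(max(probs))
    return math.log(sum(p ** a for p in probs)) / (1 - a)

# Hypothetical toy distribution over four basis ratios.
p = [0.5, 0.25, 0.125, 0.125]
for order in (0, 1, 2, 7, math.inf):
    print(order, renyi_entropy(p, order))
```

The printed values decrease as the order grows, from the Hartley entropy at a=0 down to the min-entropy at a=∞, matching the upper-bound and lower-bound statements above.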

Convolution-Based Expression For Quickly Computing Rényi Entropy

Below is given a derivation that expresses Harmonic Rényi Entropy in terms of two simpler functions, each of which is a convolution product and hence can be computed quickly using the Fast Fourier Transform.

The below derivation depends on the use of complexity-normalization probabilities, although it may be possible to extend to domain-integral probabilities instead.

Preliminaries

The Harmonic Rényi Entropy is defined as

[math]\displaystyle{ \text{HE}_a(c) = H_a(J=j|C=c) = \frac{1}{1-a} \log \sum_{j \in J} P(J=j|C=c)^a }[/math]

As before, we can write [math]\displaystyle{ P(J=j|C=c) }[/math] as follows:

[math]\displaystyle{ P(J=j|C=c) = \frac{\hat P(J=j|C=c)}{\sum_{j \in J} \hat P(J=j|C=c)} }[/math]

where [math]\displaystyle{ \hat P(J=j|C=c) }[/math] is the "unnormalized" probability, and the denominator above is the sum of these unnormalized probabilities, so that all of the [math]\displaystyle{ P(J=j|C=c) }[/math] sum to 1.

To simplify notation, we first rewrite the denominator as a "normalization" function:

[math]\displaystyle{ \psi(c) = \sum_{j \in J} \hat P(J=j|C=c) }[/math]

and putting back into the original equation, we get

[math]\displaystyle{ H_a(J=j|C=c) = \frac{1}{1-a} \log \left( \sum_{j \in J} \left( \frac{\hat P(J=j|C=c)}{\psi(c)} \right)^a \right) }[/math]

Since [math]\displaystyle{ \psi(c) }[/math] is the same for each basis ratio, we can pull it out of the summation to obtain:

[math]\displaystyle{ H_a(J=j|C=c) = \frac{1}{1-a} \log \left( \frac{\sum_{j \in J} \hat P(J=j|C=c)^a}{\psi(c)^a} \right) }[/math]

To simplify notation further, we can also rewrite the numerator, the sum of "raw" (unnormalized) pseudo-probabilities, as a function:

[math]\displaystyle{ \rho_a(c) = \sum_{j \in J} \hat P(J=j|C=c)^a }[/math]

Finally, we put this all together to obtain a simplified version of the Harmonic Rényi Entropy equation:

[math]\displaystyle{ \text{HE}_a(c) = H_a(J=j|C=c) = \frac{1}{1-a} \log \left( \frac{\rho_a(c)}{\psi(c)^a} \right) }[/math]

We thus reduce the term inside the logarithm to the quotient of the functions [math]\displaystyle{ \rho_a(c) }[/math] and [math]\displaystyle{ \psi(c)^a }[/math]. Our aim is now to express each of the two functions [math]\displaystyle{ \rho_a(c) }[/math] and [math]\displaystyle{ \psi(c) }[/math] in terms of a convolution product.
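As a quick sanity check of this rearrangement, the sketch below computes the Rényi entropy both ways from the same list of unnormalized weights. The function names and the weights are made up for illustration; they stand in for the values [math]\displaystyle{ \hat P(J=j|C=c) }[/math] at one fixed [math]\displaystyle{ c }[/math].

```python
import math

def renyi_from_normalized(raw, a):
    """Normalize first, then apply the Rényi entropy definition (a != 1)."""
    total = sum(raw)
    return math.log(sum((w / total) ** a for w in raw)) / (1 - a)

def renyi_from_rho_psi(raw, a):
    """Use the simplified form log(rho_a / psi^a) / (1 - a) (a != 1)."""
    psi = sum(raw)                   # normalization function psi(c)
    rho = sum(w ** a for w in raw)   # rho_a(c): sum of raw weights to the a'th power
    return math.log(rho / psi ** a) / (1 - a)

raw = [0.408, 0.0715, 0.0607, 0.0183]  # hypothetical unnormalized weights
for a in (0.5, 2, 7):
    print(a, renyi_from_normalized(raw, a), renyi_from_rho_psi(raw, a))
```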

Convolution product for [math]\displaystyle{ \psi(c) }[/math]

[math]\displaystyle{ \psi(c) }[/math], the normalization function, is written as follows:

[math]\displaystyle{ \psi(c) = \sum_{j \in J} \hat P(J=j|C=c) }[/math]

Again, [math]\displaystyle{ \hat P(J=j|C=c) }[/math] is defined as follows:

[math]\displaystyle{ \hat P(J=j|C=c) = \frac{S(\cent(j)-c)}{\|j\|} }[/math]

We can rewrite the above equation as a convolution with a delta distribution:

[math]\displaystyle{ \hat P(J=j|C=c) = \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c) }[/math]

Putting this back into the original summation, we obtain

[math]\displaystyle{ \psi(c) = \sum_{j \in J} \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c) }[/math]

We note that the left factor in the convolution product is always the same [math]\displaystyle{ S(-c) }[/math], which is not dependent on [math]\displaystyle{ j }[/math] in any way. Since convolution distributes over addition, we can factor the [math]\displaystyle{ S }[/math] out of the summation to obtain

[math]\displaystyle{ \psi(c) = \left[S \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right)\right](-c) }[/math]

We can clean up this notation by defining the auxiliary distribution K:

[math]\displaystyle{ K(c) = \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right) }[/math]

Which leaves us with the final expression:

[math]\displaystyle{ \psi(c) = \left[S \ast K\right](-c) }[/math]

Convolution product for [math]\displaystyle{ \rho_a(c) }[/math]

The derivation for [math]\displaystyle{ \rho_a(c) }[/math] proceeds similarly. Recall the function is written as follows:

[math]\displaystyle{ \rho_a(c) = \sum_{j \in J} \hat P(J=j|C=c)^a }[/math]

The expression for each [math]\displaystyle{ \hat P(J=j|C=c)^a }[/math] is:

[math]\displaystyle{ \hat P(J=j|C=c)^a = \frac{S(\cent(j)-c)^a}{\|j\|^a} }[/math]

We can again express this as a convolution, this time of the function [math]\displaystyle{ S^a(-c) }[/math], meaning the spreading function S taken to the a'th power, and a delta distribution:

[math]\displaystyle{ \hat P(J=j|C=c)^a = \left(S^a \ast \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)(-c) }[/math]

Putting this back into the original summation and factoring as before, we obtain

[math]\displaystyle{ \rho_a(c) = \left[S^a \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)\right](-c) }[/math]

And again we clean up notation by defining the auxiliary distribution

[math]\displaystyle{ K^a(c) = \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right) }[/math]

so that

[math]\displaystyle{ \rho_a(c) = \left[S^a \ast K^a\right](-c) }[/math]

We have now succeeded in representing [math]\displaystyle{ \rho_a(c) }[/math] as a convolution.

Note that the function [math]\displaystyle{ K^a(c) }[/math] involves a slight abuse of notation, as it is not literally [math]\displaystyle{ K(c) }[/math] taken to the [math]\displaystyle{ a }[/math]'th power (as the square of the delta distribution is undefined). Rather, we are simply taking the weights of each delta distribution in the summation to the [math]\displaystyle{ a }[/math]'th power.

Round-up

Putting all of this together, we can rewrite the original expression for Harmonic Rényi Entropy as follows:

[math]\displaystyle{ \text{HE}_a(c) = H_a(J=j|C=c) = \frac{1}{1-a} \log \left( \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)} \right) }[/math]

where the expression

[math]\displaystyle{ \left[S \ast K\right]^a(-c) }[/math]

represents the convolution of [math]\displaystyle{ S }[/math] and [math]\displaystyle{ K }[/math], taken to the [math]\displaystyle{ a }[/math]'th power and evaluated at the flipped argument [math]\displaystyle{ -c }[/math]. Note that if [math]\displaystyle{ S(x) }[/math] is a symmetric (even) spreading function, and if for each ratio [math]\displaystyle{ n/d }[/math] in [math]\displaystyle{ J }[/math] the inverse [math]\displaystyle{ d/n }[/math] is also in [math]\displaystyle{ J }[/math] with the same complexity (as is the case for both [math]\displaystyle{ \sqrt{nd} }[/math] and [math]\displaystyle{ \max(n,d) }[/math]), then the above convolution is also symmetric, and we have

[math]\displaystyle{ \left[S \ast K\right]^a(-c) = \left[S \ast K\right]^a(c) }[/math]
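The symmetry condition can be checked numerically. The Python sketch below compares [math]\displaystyle{ \psi(c) }[/math] and [math]\displaystyle{ \psi(-c) }[/math] for a basis set closed under inversion and for a one-sided set containing only ratios of at least 1/1. The `one_sided` flag, the cutoff of 100, and the [math]\displaystyle{ \sqrt{nd} }[/math] complexity are illustrative assumptions.

```python
import math
from math import gcd

def gaussian(x, s=17.0):
    """Even spreading function: gaussian(x) == gaussian(-x)."""
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def basis(limit, one_sided=False):
    """Reduced n/d with n*d <= limit; one_sided=True keeps only ratios >= 1/1."""
    return [(1200 * math.log2(n / d), math.sqrt(n * d))
            for n in range(1, limit + 1)
            for d in range(1, limit // n + 1)
            if gcd(n, d) == 1 and (n >= d or not one_sided)]

def psi(c, J):
    """psi(c) = [S * K](-c), evaluated as the direct sum."""
    return sum(gaussian(cents - c) / norm for cents, norm in J)

J_sym = basis(100)                  # contains d/n whenever it contains n/d
J_one = basis(100, one_sided=True)  # no ratios below 1/1

print(psi(702.0, J_sym), psi(-702.0, J_sym))  # agree up to rounding
print(psi(702.0, J_one), psi(-702.0, J_one))  # differ: nothing lies near -702 cents
```

When the basis set is symmetric, only non-negative values of [math]\displaystyle{ c }[/math] need to be computed and the flip can be dropped.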

We have succeeded in representing Harmonic Rényi Entropy in terms of two convolution products, each of which can be computed in [math]\displaystyle{ O(N \log N) }[/math] time using the Fast Fourier Transform.
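The whole procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the 1-cent grid, the rounding of each ratio to the nearest grid point, the half-widths `R` and `W`, the small recursive radix-2 FFT, and all function names are choices made here for the example. It requires [math]\displaystyle{ a \neq 1 }[/math] and integer values of [math]\displaystyle{ c }[/math].

```python
import cmath
import math
from math import gcd

def fft(x, sign=-1):
    """Recursive radix-2 FFT; len(x) must be a power of two. sign=+1 is the unnormalized inverse."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], sign), fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + tw, even[k] - tw
    return out

def fft_convolve(a, b):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    m = len(a) + len(b) - 1
    size = 1
    while size < m:
        size *= 2
    fa = fft(list(a) + [0.0] * (size - len(a)))
    fb = fft(list(b) + [0.0] * (size - len(b)))
    prod = fft([p * q for p, q in zip(fa, fb)], sign=+1)
    return [v.real / size for v in prod[:m]]

R, W, S_DEV = 2400, 150, 17.0  # assumed grid half-width, kernel half-width, Gaussian s (cents)

def gaussian(x, s=S_DEV):
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def basis_bins(limit):
    """(rounded cents, sqrt(n*d)) for reduced n/d with n*d <= limit and |cents| <= R."""
    out = []
    for n in range(1, limit + 1):
        for d in range(1, limit // n + 1):
            if gcd(n, d) == 1:
                r = round(1200 * math.log2(n / d))
                if abs(r) <= R:
                    out.append((r, math.sqrt(n * d)))
    return out

def renyi_he_fft(J, a):
    """Return a function c -> HE_a(c) for integer c well inside the grid (a != 1)."""
    def conv(power):
        K = [0.0] * (2 * R + 1)
        for r, norm in J:
            K[R - r] += norm ** -power  # point mass at -cent(j), weight 1/||j||^power
        S = [gaussian(u) ** power for u in range(-W, W + 1)]
        return fft_convolve(S, K)       # entry m holds the value at x = m - W - R
    psi, rho = conv(1), conv(a)
    return lambda c: math.log(rho[-c + W + R] / psi[-c + W + R] ** a) / (1 - a)

def renyi_he_direct(J, a, c):
    """Same quantity by direct summation over the basis set, for comparison."""
    p = [gaussian(r - c) / norm for r, norm in J]
    return math.log(sum(q ** a for q in p) / sum(p) ** a) / (1 - a)

J = basis_bins(10000)
he2 = renyi_he_fft(J, 2)
for c in (0, 386, 650, 702):
    print(c, he2(c), renyi_he_direct(J, 2, c))
```

Once the two convolutions are computed, every grid point is available at once, whereas the direct sum has to revisit the whole basis set for each value of [math]\displaystyle{ c }[/math]. In practice a library FFT would replace the small recursive one used here.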

References

Paul Erlich article

William Sethares article

Harmonic entropy (TonalSoft encyclopedia)

Harmonic entropy group on Yahoo

Harmonic entropy graph calculator (JavaScript)

@@ Line 27: / Line 27: @@
 Concordance has often been confused with actual musical consonance, an unfortunate fact made more common by the psychoacoustics literature under the unfortunate name '''sensory consonance''', most often used to refer to phenomena related to roughness and beatlessness specifically. This is not to be confused with the more familiar construct of tonal stability, typically just called "consonance" in Western common practice music theory and sometimes clarified as "musical consonance" in the music cognition literature. To make matters worse, the literature has also at times referred to concordance -- and not tonal stability -- as '''tonal consonance''', often referring to phenomena related to virtual pitch integration, creating a complete terminological mess. As a result, the term "consonance" has been completely avoided in this article.
-=Basic Model=
+=Basic Model: Shannon Entropy=
 The original Harmonic Entropy model limited itself to working with dyads. More recently, work by Steve Martin and others has extended this basic idea to higher-cardinality chords. This article will concern itself with dyads, as the dyadic case is still the most well-developed, and many of the ideas extend naturally to larger chords without need for much exposition.
@@ Line 34: / Line 34: @@
 A clear mathematical way of quantifying this "dispersion" is via the [http://en.wikipedia.org/wiki/Entropy_(information_theory) Shannon entropy] of the probability distribution, which can be thought of as describing the "uncertainty" in the distribution. A distribution which has a very high probability of picking one outcome has low entropy and is not very uncertain, whereas a distribution which has the probability spread out on many outcomes is highly uncertain and has a high entropy.
+==Definitions==
 To formalize our notion of Shannon entropy, we will first describe the random variable <math>J</math>, representing the set of JI "basis" intervals that our incoming interval is being "matched" to, and the parameter <math>C</math>, representing the "cents" of the incoming interval being played. For example, the interval <math>C</math> would take values such as "400 cents," and the interval <math>J</math> would take values in the set of basis ratios, such as "5/4" or "9/7."
@@ Line 66: / Line 67: @@
 which makes more explicit that <math>c</math> is the argument to the harmonic entropy function, which is equal to the entropy of <math>J</math>, conditioned on <math>C=c</math>.
-=Probability Distributions=
+==Probability Distributions==
 In order to systematically assign a probability distribution to this dyad, we first start by defining a '''spreading function''', denoted by <math>S(x)</math>, that dictates how the dyad is "smeared" out in log-frequency space, representing how the auditory system allows for some tolerance for mistuning. The typical choice that we will assume here for a spreading function is a Gaussian distribution, with mean centered around the incoming dyad, and standard deviation typically taken as a free parameter in the system and denoted as <math>s</math>.
@@ Line 91: / Line 92: @@
 Given a spreading function and set of basis rationals, there are two different procedures commonly used to assign probabilities to each rational. The first, the '''domain-integral approach''', works for arbitrary nowhere dense sets of rationals without any further free parameters. The second, the '''complexity-normalization approach''', has nice mathematical properties which sometimes make it easier to compute and which may lead to generalizations to infinite sets of rationals which are sometimes dense in the reals. It is conjectured that there are certain important limiting situations where the two converge; both are described in detail below.
-==Domain-Integral Probabilities==
+===Domain-Integral Probabilities===
-For sets of JI basis rationals which are nowhere dense, and in particular for a finite set of basis rationals, the log-frequency spectrum can be divided up into '''domains''' assigned to each ratio. Each ratio is assigned a domain with lower bound equal to the mediant of itself and its nearest lower neighbor, and likewise with upper bound equal to the mediant of itself and its nearest upper neighbor. If no such neighbor exists, <math>\pm \infty</math> is used instead. Mathematically, this can be represented via the following expression:
+For discrete sets of JI basis ratios, the log-frequency spectrum can be divided up into '''domains''' assigned to each ratio. Each ratio is assigned a domain with lower bound equal to the mediant of itself and its nearest lower neighbor, and likewise with upper bound equal to the mediant of itself and its nearest upper neighbor. If no such neighbor exists, <math>\pm \infty</math> is used instead. Mathematically, this can be represented via the following expression:
@@ Line 107: / Line 108: @@
 In the case where the set of basis rationals consists of a finite set bounded by Tenney or Weil height, the resulting set of widths is conjectured to have interesting mathematical properties, leading to mathematically nice conceptual simplifications of the model. These simplifications are explained below.
-==Complexity-Normalization Probabilities==
+===Complexity-Normalization Probabilities===
 It has been noted empirically by Paul Erlich that, given all those rationals with Tenney height under some cutoff <math>N</math> as a basis set, that the domain widths for rationals sufficiently far from the cutoff seem to be proportional to <math>\frac{1}{\sqrt{nd}}</math>.
@@ Line 138: / Line 139: @@
 This approach to assigning probabilities to basis rationals is useful because it hypothetically makes it possible to consider the HE of sets of rationals which are dense in the reals, or even the entire set of positive rationals, although the best way to do this is a subject of ongoing research.
-=Examples=
+==Examples==
 In all of these examples, the x-axis represents the width in cents of the dyad, and the y-axis represents ''discordance'' rather than concordance, measured in nats of Shannon entropy.
+=== s=17, N<10000, sqrt(n*d) weights ===
 This uses as a spreading function the Gaussian distribution with <math>s=~17\cent</math> (or a lin-frequency deviation of 1%). The basis set is all rationals of Tenney height less than 10,000. This uses the complexity-normalization approach, and the complexity function is <math>\sqrt{nd}</math>:
 [[File:HE_Tenney_N_10000_s_17cents.png]]
+=== s=17, N<100, max(n,d) weights ===
 This example uses the same spreading function and standard deviation, but this time the basis set is all rationals of Weil height less than 100. The complexity function here is <math>\max(n,d)</math>:
 [[File:HE_Weil_N_100_s_17cents.png]]
+=== s=17, N<10000, sqrt(n*d) vs mediant-to-mediant weights ===
 The following image (from Paul Erlich) compares the domain-integral and complexity-normalization approaches by overlaying the two curves on top of each other. In both cases, the spreading function is again a Gaussian with s=~17 cents, and the basis set is all those rationals with Tenney height ≤ 10000. It can be seen that the curves are extremely similar, and that the locations of the minima and maxima are largely preserved:
@@ Line 156: / Line 160: @@
 =Harmonic Rényi Entropy=
-An extension to the base Harmonic Entropy model, proposed by Mike Battaglia, is to generalize the use of [http://en.wikipedia.org/wiki/Entropy_(information_theory) Shannon entropy] by replacing it instead with [http://en.wikipedia.org/wiki/R%C3%A9nyi_entropy Rényi entropy], a [http://en.wikipedia.org/wiki/Q-analog q-analog] of Shannon's original entropy. The '''Harmonic Rényi Entropy of order a''' of an incoming dyad can be defined as follows:
+An extension to the base Harmonic Entropy model, proposed by Mike Battaglia, is to generalize the use of [http://en.wikipedia.org/wiki/Entropy_(information_theory) Shannon entropy] by replacing it instead with [http://en.wikipedia.org/wiki/R%C3%A9nyi_entropy Rényi entropy], a [http://en.wikipedia.org/wiki/Q-analog q-analog] of Shannon's original entropy. This can be thought of as adding a second parameter, called <math>a</math>, to the model, reflecting how "intelligent" the brain's "decoding" process is when determining the most likely JI interpretation of an ambiguous interval.
+==Definitions and Background==
+The '''Harmonic Rényi Entropy of order a''' of an incoming dyad can be defined as follows:
 <math>\text{HE}_a(c) = H_a(J=j|C=c) = \frac{1}{1-a} \log \sum_{j \in J} P(J=j|C=c)^a</math>
@@ Line 170: / Line 178: @@
 Some psychoacoustic effects naturally fit into this paradigm, such as the virtual pitch integration process, which actually does attempt to find a single victor when matching incoming chords with chunks of the harmonic series. Other psychoacoustic effects, such as that of beatlessness, may instead be better viewed as "dumb" processes whereby nothing in particular is being "chosen," but where a more uniform distribution of matching rational numbers for a dyad simply generates a more discordant sonic effect. Different values of a can differentiate between the predominance given to these two types of effect in the overall construct of psychoacoustic concordance.
-Certain values of <math>a</math> reduce to simpler expressions and have special names.
+Certain values of <math>a</math> reduce to simpler expressions and have special names, as given in the examples below.
-==a=0: Harmonic Hartley Entropy==
+==Examples==
+===a=0: Harmonic Hartley Entropy===
 <math>H_0(J|C=c) = \log |J|</math>
@@ Line 181: / Line 190: @@
 ''Harmonic Hartley Entropy (a=0) with the basis set all rationals with Tenney height ≤ 10000. Note that the choice of spreading function makes no difference in the end result at all.''
-==a=1: Harmonic Shannon Entropy (Harmonic Entropy)==
+===a=1: Harmonic Shannon Entropy (Harmonic Entropy)===
 <math>H_1(J|C=c) = -\sum_{j \in J} P(J=j|C=c) \log P(J=j|C=c)</math>
@@ Line 190: / Line 199: @@
 ''Harmonic Shannon Entropy (a=1) with the basis set all rationals with Tenney height ≤ 10000, spreading function a Gaussian distribution with s=1% (~17 cents), and <math>\sqrt{nd}</math> complexity.''
-==a=2: Harmonic Collision Entropy==
+===a=2: Harmonic Collision Entropy===
 <math>H_2(J=j|C=c) = -\log \sum_{j \in J} P(J=j|C=c)^2 = -\log (J_1 = J_2|C=c)</math>
@@ Line 199: / Line 208: @@
 ''Harmonic Collision Entropy (a=2) with the basis set all rationals with Tenney height ≤ 10000, spreading function a Gaussian distribution with s=1% (~17 cents), and <math>\sqrt{nd}</math> complexity.''
-==a=∞: Harmonic Min-Entropy==
+===a=∞: Harmonic Min-Entropy===
 <math>H_\infty(J=j|C=c) = -\log \max_{j \in J} P(J=j|C=c)</math>
@@ Line 208: / Line 217: @@
 ''Harmonic Rényi Entropy with a=7, with the high value of a being chosen to approximate min-entropy (a=''∞''). The basis set is still all rationals with Tenney height ≤ 10000, the spreading function a Gaussian distribution with s=1% (~17 cents), and the complexity function <math>\sqrt{nd}</math>.''
-=Convolution-Based Expression For Quickly Computing Renyi Entropy=
+==Convolution-Based Expression For Quickly Computing Renyi Entropy==
 Below is given an derivation that expresses Harmonic Renyi Entropy in terms of two simpler functions, each of which is a convolution product and hence can be computed quickly using the Fast Fourier Transform.
 The below derivation depends on the use of complexity-normalization probabilities, although it may be possible to extend to domain-integral probabilities instead.
-==Preliminaries==
+===Preliminaries===
 The Harmonic Renyi Entropy is defined as
@@ Line 252: / Line 261: @@
 We thus reduce the term inside the logarithm to the quotient of the functions <math>\rho_a(c)</math> and <math>\psi(c)</math>. Our aim is now to express each of these two functions in terms of a convolution product.
-==Convolution product for <math>\psi(c)</math>==
+===Convolution product for <math>\psi(c)</math>===
 <math>\psi(c)</math>, the normalization function, is written as follows:
@@ Line 287: / Line 296: @@
 <math>\psi(c) = \left[S \ast K\right](-c)</math>
-==Convolution product for <math>\rho_a(c)</math>==
+===Convolution product for <math>\rho_a(c)</math>===
 The derivation for <math>\rho_a(c)</math> proceeds similarly. Recall the function is written as follows:
@@ Line 321: / Line 330: @@
 Note that the function <math>K^a(c)</math> involves a slight abuse of notation, as it is not literally <math>K(c)</math> taken to the <math>a</math>'th power (as the square of the delta distribution is undefined). Rather, we are simply taking the weights of each delta distribution in the summation to the <math>a</math>'th power.
-==Round-up==
+===Round-up===
 Taking all of this, we can rewrite the original expression for Harmonic Renyi Entropy as follows:

Harmonic entropy: Difference between revisions

Contents

Background

Basic Model: Shannon Entropy

Definitions