Harmonic entropy: Difference between revisions

Line 41:

So for example, if we want to express the probability that the incoming dyad "400 cents" is perceived as the JI basis interval "5/4," we would write that as the conditional probability

~~<math>~~\displaystyle \newcommand{\cent}{\text{¢}}~~</math>~~

$$\displaystyle \newcommand{\cent}{\text{¢}}$$

~~<math>~~\displaystyle P(J=5/4|C=400\cent)~~</math>~~

$$\displaystyle P(J=5/4|C=400\cent)$$

Or, in general, if we want to write the conditional probability that some incoming dyad of <math>c</math> cents is perceived as the JI basis interval <math>j</math>, we would write that as

~~<math>~~\displaystyle P(J=j|C=c)~~</math>~~

$$\displaystyle P(J=j|C=c)$$

which notationally, we will often abbreviate as

~~<math>~~\displaystyle P(j|c)~~</math>~~

$$\displaystyle P(j|c)$$

Note that at this point, we haven't yet specified what the particular probability distribution is. There are different ways to do this, which are described in more detail below. Generally, most approaches assign each JI interval a probability based on how close it is to <math>c</math> (closer intervals are given a larger probability) and how simple it is (at equal distance, simpler intervals are given a higher probability).

Line 58:

Once we have decided on a probability distribution, we can finally evaluate the Shannon entropy. For a random variable <math>X</math>, the Shannon entropy is defined as:

~~<math>~~\displaystyle H(X) = -\sum_{x \in X} P(x) \log_b P(x)~~</math>~~

$$\displaystyle H(X) = -\sum_{x \in X} P(x) \log_b P(x)$$

where the different <math>x</math> are taken from the sample space of <math>X</math>, and <math>b</math> is the base of the log. Different choices of <math>b</math> simply change the units in which entropy is given, the most common values being 2 and e, denoting "bits" and "nats". We will omit the base going forward, for simplicity.
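The definition above can be sketched in a few lines of Python. This is only an illustration: the function name <code>shannon_entropy</code> and the convention of skipping zero-probability outcomes are assumptions made here, not something taken from the article.

```python
import math

def shannon_entropy(probs, base=math.e):
    """H(X) = -sum_x P(x) log_b P(x); outcomes with P(x) = 0 contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
print(shannon_entropy(fair_coin, base=2))  # entropy in bits
print(shannon_entropy(fair_coin))          # entropy in nats
```

Changing <code>base</code> only rescales the result, which is why the base can be omitted without changing any comparisons between distributions.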

Line 64:

In our case, we want to find the entropy of the random variable <math>J</math> of JI intervals, given a particular choice of incoming dyad in cents. The corresponding quantity that we want is:

~~<math>~~\displaystyle H(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)~~</math>~~

$$\displaystyle H(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)$$

Note that above, the summation is only taken on the <math>j</math> from the sample space of <math>J</math> (i.e. the set of JI basis intervals), whereas the parameter <math>c</math> is treated as constant within the summation (and is taken as the free parameter to the function).

Line 70:

Since the parameter <math>c</math> is the free parameter, sometimes the above is notated as

~~<math>~~\displaystyle \text{HE}(c) = H(J|c)~~</math>~~

$$\displaystyle \text{HE}(c) = H(J|c)$$

which makes more explicit that <math>c</math> is the argument to the harmonic entropy function, which is equal to the entropy of <math>J</math>, conditioned on the incoming dyad of <math>c</math> cents.

Line 85:

Other spreading functions have also been explored, such as the use of the heavy-tailed [https://en.wikipedia.org/wiki/Laplace_distribution Laplace distribution], sometimes described as the "Vos function" in Paul's writings. These two functions are part of the [https://en.wikipedia.org/wiki/Generalized_normal_distribution Generalized normal distribution] family, which has a parameter not only for the variance but for the kurtosis. However, for simplicity, we will assume the Gaussian distribution as the spreading function for the remainder of this article, so that the spreading function for an incoming dyad <math>c</math> can be written as follows:

~~<math>~~\displaystyle S(x-c) = \frac{1}{s\sqrt{2\pi}} e^{-\frac{(x-c)^2}{2s^2}}~~</math>~~

$$\displaystyle S(x-c) = \frac{1}{s\sqrt{2\pi}} e^{-\frac{(x-c)^2}{2s^2}}$$

where the notation <math>S(x-c)</math> is chosen to make clear that we are translating <math>S(x)</math> to be centered around the incoming dyad <math>c</math>, which is now the mean of the Gaussian.
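As a minimal sketch, the Gaussian spreading function can be written as follows. The name <code>spreading</code> and the default value <code>s = 17.0</code> cents are assumptions chosen for illustration; the article leaves <math>s</math> as a free parameter.

```python
import math

def spreading(x, c, s=17.0):
    """Gaussian spreading function S(x - c): mean c, standard deviation s (both in cents)."""
    return math.exp(-((x - c) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

# The curve is centered on the incoming dyad and symmetric around it.
print(spreading(400.0, 400.0), spreading(390.0, 400.0), spreading(410.0, 400.0))
```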

Line 102:

For discrete sets of JI basis ratios, the log-frequency spectrum can be divided up into '''domains''' assigned to each ratio. Each ratio is assigned a domain with lower bound equal to the mediant of itself and its nearest lower neighbor, and likewise with upper bound equal to the mediant of itself and its nearest upper neighbor. If no such neighbor exists, <math>\pm \infty</math> is used instead. Mathematically, this can be represented via the following expression:

~~<math>~~\displaystyle P(j|c) = \int_{\cent(j_l)}^{\cent(j_u)} S(x-c) dx~~</math>~~

$$\displaystyle P(j|c) = \int_{\cent(j_l)}^{\cent(j_u)} S(x-c) dx$$

where <math>S(x-c)</math> is the spreading function associated with c, <math>j_l</math> and <math>j_u</math> are the domain lower and upper bounds associated with JI basis ratio <math>j</math>, and <math>\cent(f) = 1200\log_2(f)</math>, or the "cents" function converting frequency ratios to cents. Typically, <math>j_l</math> is set equal to the mediant of <math>j</math> and its nearest lower neighbor (if it exists), or <math>-\infty</math> if not; likewise with <math>j_u</math> and its nearest upper neighbor.
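The mediant-domain construction can be sketched as below, assuming a Gaussian spreading function so that the integral reduces to a difference of two values of the normal CDF. The helper names, the Tenney bound <code>N = 100</code>, and <code>s = 17.0</code> cents are all illustrative assumptions.

```python
import math
from fractions import Fraction

def cents(ratio):
    """The cents function: 1200 * log2(f)."""
    return 1200 * math.log2(float(ratio))

def tenney_basis(N):
    """All reduced ratios n/d with n*d <= N, sorted in ascending order."""
    return sorted({Fraction(n, d) for n in range(1, N + 1)
                   for d in range(1, N // n + 1) if math.gcd(n, d) == 1})

def mediant(a, b):
    return Fraction(a.numerator + b.numerator, a.denominator + b.denominator)

def gaussian_cdf(x, c, s):
    """Integral of the Gaussian spreading function from -infinity to x."""
    return 0.5 * (1 + math.erf((x - c) / (s * math.sqrt(2))))

def domain_probabilities(c, N=100, s=17.0):
    """P(j|c) as the Gaussian mass between the mediants bounding each ratio's domain."""
    basis = tenney_basis(N)
    probs = {}
    for i, j in enumerate(basis):
        # A missing neighbor means the bound is -infinity or +infinity.
        low = gaussian_cdf(cents(mediant(basis[i - 1], j)), c, s) if i > 0 else 0.0
        high = gaussian_cdf(cents(mediant(j, basis[i + 1])), c, s) if i + 1 < len(basis) else 1.0
        probs[j] = high - low
    return probs

probs = domain_probabilities(400.0)
print(probs[Fraction(5, 4)])  # P(J = 5/4 | C = 400 cents) under these assumptions
```

Because the domains tile the whole real line, these probabilities sum to 1 without any separate normalization step.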

Line 121:

This modifies the expression for the probabilities <math>P(j|c)</math> as follows, noting that for now the "probabilities" won't sum to 1:

~~<math>~~\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\sqrt{j_n \cdot j_d}}~~</math>~~

$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\sqrt{j_n \cdot j_d}}$$

where the <math>Q</math> notation now represents that these "probabilities" are unnormalized, and <math>j_n</math> and <math>j_d</math> are the numerator and denominator, respectively, of JI basis ratio <math>j</math>. Again, the set of basis rationals here is assumed to be all of those rationals of Tenney Height ≤ <math>N</math> for some <math>N</math>.

Line 127:

A similar observation for the use of Weil-bounded subsets of the rationals suggests domain widths of <math>\frac{1}{\max(n,d)}</math>, yielding instead the following formula:

~~<math>~~\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\max(j_n, j_d)}~~</math>~~

$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\max(j_n, j_d)}$$

where this time the set of basis rationals is assumed to be all of those of Weil Height ≤ <math>N</math> for some <math>N</math>.

Line 133:

In both cases, the general approach is the same: the value of the spreading function, taken at the value of <math>\cent(j)</math>, is divided by some sort of "weighting" (or sometimes, "complexity") function representing how much weight is given to that rational number. While the two weighting functions considered thus far were derived empirically by observing the asymptotic behavior of various height-bounded subsets of the rationals, we can generalize this for arbitrary basis sets of rationals and arbitrary weights as follows:

~~<math>~~\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}~~</math>~~

$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}$$

where <math>\|j\|</math> denotes a weighting function that maps from rational numbers to positive reals (it appears in the denominator, so it must be nonzero).

Line 139:

As these "probabilities" don't sum to 1, the result is not a probability distribution at all, invalidating the use of the Shannon Entropy. To rectify this, the distribution is normalized so that the probabilities do sum to 1:

~~<math>~~\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}~~</math>~~

$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$

which is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities. This definition of <math>P(j|c)</math> is then used directly to compute the entropy.
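Putting the weighting and normalization steps together gives a short sketch of the whole computation. It assumes Tenney weighting <math>\sqrt{j_n \cdot j_d}</math>, a Gaussian spreading function with <code>s = 17.0</code> cents, and a basis of Tenney height at most 100; those values and all function names are illustrative choices, not the article's.

```python
import math
from fractions import Fraction

def cents(ratio):
    return 1200 * math.log2(float(ratio))

def tenney_basis(N):
    """All reduced ratios n/d with n*d <= N."""
    return sorted({Fraction(n, d) for n in range(1, N + 1)
                   for d in range(1, N // n + 1) if math.gcd(n, d) == 1})

def spreading(x, s):
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def unnormalized(j, c, s):
    """Q(j|c) = S(cents(j) - c) / sqrt(j_n * j_d)."""
    return spreading(cents(j) - c, s) / math.sqrt(j.numerator * j.denominator)

def probabilities(c, basis, s=17.0):
    """P(j|c) = Q(j|c) / sum of all Q(j|c). Assumes c lies near at least one basis ratio."""
    q = [unnormalized(j, c, s) for j in basis]
    total = sum(q)
    return [x / total for x in q]

def harmonic_entropy(c, basis, s=17.0):
    # Terms that underflow to zero are skipped, since p * log(p) tends to 0.
    return -sum(p * math.log(p) for p in probabilities(c, basis, s) if p > 0)

basis = tenney_basis(100)
for c in (0.0, 386.3, 600.0, 702.0):
    print(c, harmonic_entropy(c, basis))
```

With these settings the entropy near simple ratios such as 1/1 and 3/2 comes out lower than the entropy near 600 cents, where several ratios of similar weight compete.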

Line 172:

The '''Harmonic Rényi Entropy of order a''' of an incoming dyad can be defined as follows:

~~<math>~~\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a~~</math>~~

$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$

Rényi entropy is a q-analog of Shannon entropy: it converges to Shannon entropy in the limit as <math>a \to 1</math>, a fact which can be verified using L'Hôpital's rule as found [http://www.sonycsl.co.jp/person/nielsen/Note-HopitalRuleShannonRenyiTsallis.pdf here].

Line 188:

=== Examples ===

==== a=0: Harmonic Hartley Entropy ====

~~<math>~~\displaystyle H_0(J|c) = \log |J|~~</math>~~

$$\displaystyle H_0(J|c) = \log |J|$$

where <math>|J|</math> is the cardinality of the set of basis rationals. This assumes, in essence, an "infinitely dumb" auditory system which can do no better than picking a rational number from a uniform distribution completely at random. All dyads have the same Harmonic Hartley Entropy. The Hartley Entropy is sometimes called the "max-entropy," and is useful mainly as an upper bound on the other forms of entropy: every Rényi Entropy of order <math>a \geq 0</math> is guaranteed to be no greater than the Hartley Entropy.

Line 197:

==== a=1: Harmonic Shannon Entropy (Harmonic Entropy) ====

~~<math>~~\displaystyle H_1(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)~~</math>~~

$$\displaystyle H_1(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)$$

This is Paul's original Harmonic Entropy. Within the cryptographic analogy, this can be thought of as an auditory system which simply selects a rational at random from the incoming distribution, weighted via the distribution itself.

Line 206:

==== a=2: Harmonic Collision Entropy ====

~~<math>~~\displaystyle H_2(J|c) = -\log \sum_{j \in J} P(j|c)^2 = -\log P(J_1 = J_2|c)~~</math>~~

$$\displaystyle H_2(J|c) = -\log \sum_{j \in J} P(j|c)^2 = -\log P(J_1 = J_2|c)$$

where <math>J_1</math> and <math>J_2</math> are two independent and identically distributed random variables of JI basis ratios, conditioned on the same incoming dyad <math>c</math>, and the collision entropy is the same as the negative log of the probability that the two JI variables produce the same outcome.

Line 215:

==== a=∞: Harmonic Min-Entropy ====

~~<math>~~\displaystyle H_\infty(J|c) = -\log \max_{j \in J} P(j|c)~~</math>~~

$$\displaystyle H_\infty(J|c) = -\log \max_{j \in J} P(j|c)$$

This is the min-entropy, which simply takes the negative log of the largest probability in the distribution. This can be thought of as representing the "strength" of the incoming dyad against being "deciphered" by a "best-case" auditory system. The name "min-entropy" reflects that the <math>a=\infty</math> case is guaranteed to be a lower bound among all Rényi entropies.
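The four special cases can be compared side by side with a small sketch. The function name <code>renyi_entropy</code> and the example distribution are hypothetical; the limiting cases <math>a=1</math> and <math>a=\infty</math> are handled explicitly because the general formula is undefined there.

```python
import math

def renyi_entropy(probs, a):
    """Renyi entropy of order a (natural log), with the a = 1 and a = infinity limits."""
    if a == 1:
        return -sum(p * math.log(p) for p in probs if p > 0)   # Shannon
    if math.isinf(a):
        return -math.log(max(probs))                          # min-entropy
    # For a = 0 every outcome contributes 1 (Python evaluates 0.0 ** 0 as 1.0),
    # so the result is log|J|, as in the Hartley case above.
    return math.log(sum(p ** a for p in probs)) / (1 - a)

probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical P(j|c) values
for a in (0, 1, 2, math.inf):
    print(a, renyi_entropy(probs, a))
```

On this example the values decrease as <math>a</math> grows, with the Hartley entropy largest and the min-entropy smallest.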

Line 231:

The Harmonic Rényi Entropy is defined as

~~<math>~~\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a~~</math>~~

$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$

As before, we can write <math>P(j|c)</math> as follows:

~~<math>~~\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}~~</math>~~

$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$

where <math>Q(j|c)</math> is the "unnormalized" probability, and the denominator above is the sum of these unnormalized probabilities, so that all of the <math>P(j|c)</math> sum to 1.

Line 243:

To simplify notation, we first rewrite the denominator as a "normalization" function:

~~<math>~~\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)~~</math>~~

$$\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)$$

and putting back into the original equation, we get

~~<math>~~\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \sum_{j \in J} \left( \frac{Q(j|c)}{\psi(c)} \right)^a \right)~~</math>~~

$$\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \sum_{j \in J} \left( \frac{Q(j|c)}{\psi(c)} \right)^a \right)$$

Since <math>\psi(c)</math> is the same for each basis ratio, we can pull it out of the summation to obtain:

~~<math>~~\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\sum_{j \in J} Q(j|c)^a}{\psi(c)^a} \right)~~</math>~~

$$\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\sum_{j \in J} Q(j|c)^a}{\psi(c)^a} \right)$$

To simplify notation further, we can also rewrite the numerator, the sum of "raw" (unnormalized) pseudo-probabilities, as a function:

~~<math>~~\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a~~</math>~~

$$\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a$$

Finally, we put this all together to obtain a simplified version of the Harmonic Rényi Entropy equation:

~~<math>~~\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\rho_a(c)}{\psi(c)^a} \right)~~</math>~~

$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\rho_a(c)}{\psi(c)^a} \right)$$
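As a quick numerical check of this simplification, the sketch below computes the entropy both from normalized probabilities and from the <math>\rho_a / \psi^a</math> form. The list <code>q</code> holds made-up unnormalized values, and both function names are assumptions for illustration.

```python
import math

def renyi_from_normalized(q, a):
    """Normalize first, then apply 1/(1-a) * log(sum of P^a). Requires a != 1."""
    psi = sum(q)
    return math.log(sum((x / psi) ** a for x in q)) / (1 - a)

def renyi_from_rho_psi(q, a):
    """Simplified form: 1/(1-a) * log(rho_a / psi^a). Requires a != 1."""
    psi = sum(q)                    # psi(c): sum of the unnormalized values
    rho = sum(x ** a for x in q)    # rho_a(c): sum of their a-th powers
    return math.log(rho / psi ** a) / (1 - a)

q = [0.8, 0.3, 0.05, 0.01]  # hypothetical Q(j|c) values for one fixed c
for a in (0.5, 2, 3):
    print(a, renyi_from_normalized(q, a), renyi_from_rho_psi(q, a))
```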

Line 270:

<math>\displaystyle \psi(c)</math>, the normalization function, is written as follows:

~~<math>~~\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)~~</math>~~

$$\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)$$

Again, <math>Q(j|c)</math> is defined as follows:

~~<math>~~\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}~~</math>~~

$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}$$

We can rewrite the above equation as a convolution with a delta distribution:

~~<math>~~\displaystyle Q(j|c) = \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)~~</math>~~

$$\displaystyle Q(j|c) = \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)$$

Putting this back into the original summation, we obtain

~~<math>~~\displaystyle \psi(c) = \sum_{j \in J} \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)~~</math>~~

$$\displaystyle \psi(c) = \sum_{j \in J} \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)$$

We note that the left factor in each convolution product is always the same function <math>S</math>, which does not depend on <math>j</math> in any way. Since convolution distributes over addition, we can factor the <math>S</math> out of the summation to obtain

~~<math>~~\displaystyle \psi(c) = \left[S \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right)\right](-c)~~</math>~~

$$\displaystyle \psi(c) = \left[S \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right)\right](-c)$$

We can clean up this notation by defining the auxiliary distribution K:

~~<math>~~\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}~~</math>~~

$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}$$

Which leaves us with the final expression:

~~<math>~~\displaystyle \psi(c) = \left[S \ast K\right](-c)~~</math>~~

$$\displaystyle \psi(c) = \left[S \ast K\right](-c)$$

==== Convolution product for <math>\rho_a(c)</math> ====

The derivation for <math>\rho_a(c)</math> proceeds similarly. Recall the function is written as follows:

~~<math>~~\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a~~</math>~~

$$\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a$$

The expression for each <math>Q(j|c)^a</math> is:

~~<math>~~\displaystyle Q(j|c)^a = \frac{S(\cent(j)-c)^a}{\|j\|^a}~~</math>~~

$$\displaystyle Q(j|c)^a = \frac{S(\cent(j)-c)^a}{\|j\|^a}$$

We can again express this as a convolution, this time of the function <math>S^a(-c)</math>, meaning the spreading function S taken to the a'th power, and a delta distribution:

~~<math>~~\displaystyle Q(j|c)^a = \left(S^a \ast \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)(-c)~~</math>~~

$$\displaystyle Q(j|c)^a = \left(S^a \ast \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)(-c)$$

Putting this back into the original summation and factoring as before, we obtain

~~<math>~~\displaystyle \rho_a(c) = \left[S^a \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)\right](-c)~~</math>~~

$$\displaystyle \rho_a(c) = \left[S^a \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)\right](-c)$$

And again we clean up notation by defining the auxiliary distribution

~~<math>~~\displaystyle K^a(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}~~</math>~~

$$\displaystyle K^a(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}$$

so that

~~<math>~~\displaystyle \rho_a(c) = \left[S^a \ast K^a\right](-c)~~</math>~~

$$\displaystyle \rho_a(c) = \left[S^a \ast K^a\right](-c)$$

We have now succeeded in representing <math>\rho_a(c)</math> as a convolution.
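The convolution form of <math>\psi(c)</math> can be illustrated on a 1-cent grid. This sketch makes several simplifying assumptions that are not in the article: ratio positions are rounded to the nearest cent, the Gaussian is truncated at 8 standard deviations, and, because the Gaussian is even, the impulses are placed at <math>+\cent(j)</math> and the result is read at <math>+c</math>, which avoids the two sign flips. Plain loops are used for clarity rather than an FFT.

```python
import math

S_DEV = 17.0             # assumed standard deviation in cents
HALF_WIDTH = 1200        # grid covers -1200..1200 cents in 1-cent steps
TAPS = int(8 * S_DEV)    # Gaussian kernel truncated at 8 standard deviations

def spreading(x):
    return math.exp(-x * x / (2 * S_DEV ** 2)) / (S_DEV * math.sqrt(2 * math.pi))

def impulse_train(N):
    """Grid version of K: rounded cents position -> summed weight 1/sqrt(n*d)."""
    weights = {}
    for n in range(1, N + 1):
        for d in range(1, N // n + 1):
            if math.gcd(n, d) != 1:
                continue
            pos = round(1200 * math.log2(n / d))
            if abs(pos) <= HALF_WIDTH:
                weights[pos] = weights.get(pos, 0.0) + 1 / math.sqrt(n * d)
    return weights

def psi_by_convolution(weights):
    """Discrete convolution of the sampled Gaussian with the weighted impulses."""
    kernel = [spreading(k) for k in range(-TAPS, TAPS + 1)]
    psi = [0.0] * (2 * HALF_WIDTH + 1)
    for pos, w in weights.items():
        for k, s_val in zip(range(-TAPS, TAPS + 1), kernel):
            idx = pos + k + HALF_WIDTH
            if 0 <= idx < len(psi):
                psi[idx] += w * s_val
    return psi

def psi_direct(c, weights):
    """Direct sum of Q(j|c) over the same rounded positions, for comparison."""
    return sum(w * spreading(pos - c) for pos, w in weights.items())

weights = impulse_train(100)
psi = psi_by_convolution(weights)
print(psi[702 + HALF_WIDTH], psi_direct(702, weights))
```

The same construction gives <math>\rho_a(c)</math> if the kernel and the weights are each raised to the <math>a</math>'th power first.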

Line 339:

Taking all of this, we can rewrite the original expression for Harmonic Rényi Entropy as follows:

~~<math>~~\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)} \right)~~</math>~~

$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)} \right)$$

where the expression

~~<math>~~\displaystyle \left[S \ast K\right]^a(-c)~~</math>~~

$$\displaystyle \left[S \ast K\right]^a(-c)$$

represents the convolution of <math>S</math> and <math>K</math>, taken to the <math>a</math>'th power, and flipped backwards. Note that if <math>S(x)</math> is a symmetrical (even) spreading function, and if for each ratio <math>n/d</math> in <math>J</math> the inverse <math>d/n</math> is also in <math>J</math>, then the above convolution will also be symmetrical, and we also have

~~<math>~~\displaystyle \left[S \ast K\right]^a(-c) = \left[S \ast K\right]^a(c)~~</math>~~

$$\displaystyle \left[S \ast K\right]^a(-c) = \left[S \ast K\right]^a(c)$$

We have succeeded in representing Harmonic Rényi Entropy in simple terms of two convolution products, each of which can be computed in <math>O(N \log N)</math> time.

Line 362:

In short, what we will show is that the Fourier Transform of this unnormalized Shannon Harmonic Entropy is given by

~~<math>~~|\zeta(0.5+it)|^2 \cdot \overline {\phi(t)}~~</math>~~

$$|\zeta(0.5+it)|^2 \cdot \overline {\phi(t)}$$

where <math>\phi(t)</math> is the characteristic function of the spreading distribution and <math>\overline {\phi(t)}</math> is its complex conjugate. Below we also give an expression for the Rényi entropy for arbitrary choice of the parameter <math>a</math>.

Line 397:

Let's start by recalling the original definition for Harmonic Rényi Entropy, using simple weighted probabilities:

~~<math>~~\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a~~</math>~~

$$\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$

Remember also that the definition of <math>P(j|c)</math> is as follows:

~~<math>~~\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}~~</math>~~

$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$

where the <math>Q(j|c)</math> is the "unnormalized probability" - the raw value of the spreading function, evaluated at the ratio in question, divided by the ratio's weighting. The above equation tells us that the normalized probability is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities.

Line 408:

Putting the two together, we get

~~<math>~~\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} \left( \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)} \right)^a~~</math>~~

$$\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} \left( \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)} \right)^a$$

Now, for us to define the unnormalized HE, we simply take the standard Rényi entropy equation, and replace the normalized probabilities with unnormalized ones, yielding

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} Q(j|c)^a~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} Q(j|c)^a$$

Using our convolution theorem from before, we can express the above as

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)$$

where, as before, <math>S^a</math> is our spreading function, taken to the <math>a</math>'th power, and <math>K^a</math> is our convolution kernel, with the weights on the delta functions taken to the <math>a</math>'th power as described previously.

Line 427:

Lastly, it so happens that it will be much easier to understand our analytic continuation if we look at the exponential of <math>(1-a)</math> times the UHE, rather than the UHE itself. The reasons for this will become clear later. If we do so, we get

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)$$

Note that this function is simply a monotonic transformation of the original, and so preserves the same concordance ranking on all intervals (in reversed order when <math>a > 1</math>, since <math>1-a</math> is then negative).

Line 435:

The definition for <math>K</math> is:

~~<math>~~\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}~~</math>~~

$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}$$

where <math>\|j\|</math> represents the weighting of the JI basis ratio <math>j</math>. In the particular case of Tenney weighting, we get:

~~<math>~~\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{(j_n \cdot j_d)^{0.5}}~~</math>~~

$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{(j_n \cdot j_d)^{0.5}}$$

where <math>j_n</math> and <math>j_d</math> are the numerator and denominator of <math>j</math>, respectively.

Line 446:

Although it may seem odd, we can take the Fourier transform of the above to obtain the following expression:

~~<math>~~\displaystyle \mathcal{F}\left\{K(c)\right\}(t) = \sum_{j \in J} \frac{e^{i t \cent(j)}}{(j_n \cdot j_d)^{0.5}}~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(c)\right\}(t) = \sum_{j \in J} \frac{e^{i t \cent(j)}}{(j_n \cdot j_d)^{0.5}}$$

Furthermore, for simplicity, we can change the units, so that rather than the argument being given in cents, it is given in "natural" units of "[https://en.wikipedia.org/wiki/Neper nepers]", a technique often used by Martin Gough in his work on [[Logarithmic_approximants|Logarithmic approximants]]. The representation of any interval in nepers is given by simply taking its natural logarithm. Doing so, by defining the change of variables <math>c = \frac{1200}{\log(2)}n</math>, we obtain

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{0.5}}~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{0.5}}$$

We can treat the presence of the logarithm within the exponential function as changing the base of the exponential, so that we get

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{(j_n/j_d)^{i t}}{(j_n \cdot j_d)^{0.5}}~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{(j_n/j_d)^{i t}}{(j_n \cdot j_d)^{0.5}}$$

We can also factor each term in the summation to obtain

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{{j_n}^{i t}}{j_n^{0.5}} \cdot \frac{{j_d}^{-i t}}{j_d^{0.5}} \right]~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{{j_n}^{i t}}{j_n^{0.5}} \cdot \frac{{j_d}^{-i t}}{j_d^{0.5}} \right]$$

which we can rewrite as

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{1}{{j_n}^{0.5 -i t}} \cdot \frac{1}{{j_d}^{0.5 + i t}} \right]~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{1}{{j_n}^{0.5 -i t}} \cdot \frac{1}{{j_d}^{0.5 + i t}} \right]$$

Line 471:

Bounding by <math>\max(n,d) \leq N</math> is the same as specifying that <math>j_n \leq N</math> and <math>j_d \leq N</math>. Doing so, we get

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{1\leq j_n, j_d\leq N} \left[ \frac{1}{{j_n}^{0.5 -i t}} \cdot \frac{1}{{j_d}^{0.5 + i t}} \right]~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{1\leq j_n, j_d\leq N} \left[ \frac{1}{{j_n}^{0.5 -i t}} \cdot \frac{1}{{j_d}^{0.5 + i t}} \right]$$

We can now factor the above product to obtain:

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^N \frac{1}{{j_n}^{0.5 -i t}} \right] \cdot \left[ \sum_{j_d=1}^N\frac{1}{{j_d}^{0.5 + i t}} \right]~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^N \frac{1}{{j_n}^{0.5 -i t}} \right] \cdot \left[ \sum_{j_d=1}^N\frac{1}{{j_d}^{0.5 + i t}} \right]$$

Now, we can see that as <math>N \to \infty</math> above, the summations do not converge. However, incredibly enough, each of the above expressions has a very well-known analytic continuation, which is the Riemann zeta function.
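The factorization of the finite double sum, and the growth of the partial sums with <math>N</math>, can both be checked numerically. In this sketch the function names and the test values <code>N = 50</code>, <code>t = 3.0</code> are arbitrary choices; as in the formula above, the double sum runs over all pairs <math>(j_n, j_d)</math> up to <math>N</math>.

```python
import cmath
import math

def double_sum(N, t):
    """Sum over all 1 <= n, d <= N of (n/d)^(i t) / sqrt(n * d)."""
    return sum(cmath.exp(1j * t * math.log(n / d)) / math.sqrt(n * d)
               for n in range(1, N + 1) for d in range(1, N + 1))

def partial_dirichlet(N, s):
    """Partial sum of n^(-s) for n = 1..N, with s real or complex."""
    return sum(n ** (-s) for n in range(1, N + 1))

N, t = 50, 3.0
lhs = double_sum(N, t)
rhs = partial_dirichlet(N, 0.5 - 1j * t) * partial_dirichlet(N, 0.5 + 1j * t)
print(lhs, rhs)

# At t = 0 the partial sums keep growing with N, illustrating the divergence.
for n_terms in (100, 10000):
    print(n_terms, partial_dirichlet(n_terms, 0.5))
```

The two factors are complex conjugates of each other, so their product is real and equals the squared modulus of either one.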

Line 483:

To perform the analytic continuation, we temporarily change the <math>0.5</math> in the denominator to some other weight <math>w> 1</math>. This is equivalent to changing our original <math>\sqrt{nd}</math> weighting to some other exponent, such as <math>(nd)^2</math> or <math>(nd)^{1.5}</math>. Doing this causes both of the summations above to converge, so that we obtain

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^\infty \frac{1}{{j_n}^{w -i t}} \right] \cdot \left[ \sum_{j_d=1}^\infty\frac{1}{{j_d}^{w + i t}} \right]~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^\infty \frac{1}{{j_n}^{w -i t}} \right] \cdot \left[ \sum_{j_d=1}^\infty\frac{1}{{j_d}^{w + i t}} \right]$$

It has been well known for more than a century that both of these summations converge to the Riemann zeta function, so that we get

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \zeta(w-i t) \cdot \zeta(w+i t)~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \zeta(w-i t) \cdot \zeta(w+i t)$$

Rewriting as a function of a complex variable <math>z = w + i t</math>, and noting that the zeta function obeys the property that <math>\zeta(\overline z)=\overline{\zeta(z)}</math>, where <math>\overline z</math> represents complex conjugation, we get

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \overline{\zeta(z)} \cdot \zeta(z) = |\zeta(z)|^2~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \overline{\zeta(z)} \cdot \zeta(z) = |\zeta(z)|^2$$

Line 499:

Furthermore, although the above series doesn't converge for <math>w = 0.5</math>, we can simply use the analytic continuation of the Riemann zeta function to obtain a meaningful function at that point, so that our original convolution kernel can be written as

~~<math>~~\displaystyle K(n) = \mathcal{F}^{-1}\left\{| \zeta(0.5+ i t) |^2\right\}(n)~~</math>~~

$$\displaystyle K(n) = \mathcal{F}^{-1}\left\{| \zeta(0.5+ i t) |^2\right\}(n)$$

which is the inverse Fourier transform of the squared absolute value of the Riemann zeta function, taken at the critical line.

Line 508:

It is likewise easy to show that the function <math>K^a(n)</math>, taken from the numerator of our original Harmonic Rényi Entropy convolution expression, can be expressed as

~~<math>~~\displaystyle K^a(n) = \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t) |^2\right\}(n)~~</math>~~

$$\displaystyle K^a(n) = \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t) |^2\right\}(n)$$

so that the choice of <math>a</math> simply changes our choice of vertical slice of the Riemann zeta function, as well as the shape of our spreading function (because it is also being raised to a power). If our spreading function is a Gaussian, then we simply get another Gaussian with a different standard deviation.
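The last remark can be made concrete: for <math>a > 0</math>, a Gaussian with standard deviation <math>s</math> raised to the <math>a</math>'th power is a constant multiple of a Gaussian with standard deviation <math>s/\sqrt{a}</math>. The sketch below checks this identity numerically; the function names and test values are illustrative assumptions.

```python
import math

def gaussian(x, s):
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def gaussian_power(x, s, a):
    """S(x)^a rewritten as a constant times a Gaussian of standard deviation s / sqrt(a)."""
    scale = (s * math.sqrt(2 * math.pi)) ** (1 - a) / math.sqrt(a)
    return scale * gaussian(x, s / math.sqrt(a))

s = 17.0  # assumed standard deviation in cents
for a in (0.5, 2, 3):
    print(a, gaussian(10.0, s) ** a, gaussian_power(10.0, s, a))
```

The constant factor does not depend on <math>x</math>, so only the width of the spreading function changes with <math>a</math>.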

Line 519:

Our original equation was

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast K^a \right)(-n)~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast K^a \right)(-n)$$

Using our expression for <math>K^a</math> as <math>N \to \infty</math>, we get

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t)|^2\right\} \right)(-n)~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t)|^2\right\} \right)(-n)$$

To simplify this, we will define an auxiliary notation for the zeta function as follows:

~~<math>~~\zeta_w(t) = \zeta(w + i t)~~</math>~~

$$\zeta_w(t) = \zeta(w + i t)$$

yielding the simplified expression:

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast \mathcal{F}^{-1}\left\{|\zeta_{0.5a}|^2\right\} \right)(-n)~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast \mathcal{F}^{-1}\left\{|\zeta_{0.5a}|^2\right\} \right)(-n)$$

We can simplify the expression of the above if we likewise take the Fourier transform of <math>S</math>. If we do, we obtain the [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) characteristic function] of the distribution, which is typically denoted by <math>\phi(t)</math>. We will use the following definitions:

~~<math>~~\displaystyle \phi(t) = \mathcal{F}\left\{S(n)\right\}(t)~~</math>~~

$$\displaystyle \phi(t) = \mathcal{F}\left\{S(n)\right\}(t)$$

~~<math>~~\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)~~</math>~~

$$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$

Doing so, and noting that convolution becomes multiplication in the Fourier domain, we get

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\phi_a \cdot |\zeta_{0.5a}|^2\right\}(-n)~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\phi_a \cdot |\zeta_{0.5a}|^2\right\}(-n)$$

Lastly, we note that for any real function <math>f(x)</math>, we have <math>\mathcal{F}\left\{f(-x)\right\} = \overline{\mathcal{F}\left\{f(x)\right\}}</math>, where the overline denotes complex conjugation. For simplicity's sake, we can write this as <math>\overline{\mathcal{F}\left\{f\right\}}</math>. Putting that all together, we get

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}$$

where we can drop the overline on <math>|\zeta_{0.5a}|^2</math> because it is purely real, and its complex conjugate is itself.

Line 584:

Let's go back to our original convolution expression for finite-<math>N</math> UHE:

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)$$

We will also, for now, use the notation

~~<math>~~U(c) = \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)~~</math>~~

$$U(c) = \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)$$

to get

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log U(c)~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log U(c)$$

Now let's consider the auxiliary function <math>\tilde{U}(c) = U(c) - U(0)</math>, which gives us a "shifted" version of the UHE which has the UHE of 1/1 normalized to 0. Then we can re-express the UHE as follows:

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left(U(0) + \tilde{U}(c) \right)~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left(U(0) + \tilde{U}(c) \right)$$

We can expand the above into a Taylor series as follows:

~~<math>~~\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \left(\log(U(0)) + \frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)~~</math>~~

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \left(\log(U(0)) + \frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)$$

Now, suppose we only care about the behavior of this function up to a constant vertical shift and scaling. Then we can drop the <math>\frac{1}{1-a}</math> term, since it's just a constant scaling, and also get rid of the <math>\log(U(0))</math> term, which simply subtracts a constant offset. This leaves us with

~~<math>~~\displaystyle \text{UHE}_a(c) \sim \left(\frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)~~</math>~~

$$\displaystyle \text{UHE}_a(c) \sim \left(\frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)$$

where <math>\sim</math> denotes the two sides are now "equivalent" up to a constant shifting and scaling.

Line 610:

Finally, we can go one step further and multiply the above by the constant <math>U(0)</math>. Doing so, we get

~~<math>~~\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)~~</math>~~

$$\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)$$

And we now have a function that is equivalent to our original, up to a constant shift and scaling, but which is fairly easy to analyze asymptotically.

Line 624:

Assuming this is true (which we will not prove here, but which seems self-evident given all of the other results), we can go back to our last expression for exp-UHE:

~~<math>~~\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)~~</math>~~

$$\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)$$

and see that as <math>N \to \infty</math>, the <math>U(0)</math> terms blow up, whereas the <math>\tilde{U}(c)</math> terms do not. As a result, all of the terms with <math>U(0)</math> in the denominator disappear, and we are left with:

~~<math>~~\displaystyle \text{UHE}_a(c) \sim \tilde{U}(c)~~</math>~~

$$\displaystyle \text{UHE}_a(c) \sim \tilde{U}(c)$$
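The limiting behavior can be illustrated numerically. The sketch below evaluates the shifted and scaled quantity <math>U(0) \cdot \left[\log(U(0) + \tilde{U}(c)) - \log U(0)\right]</math> for a fixed <math>\tilde{U}(c)</math> and increasingly large <math>U(0)</math>. The name `scaled_shifted_uhe` and the numbers used are assumptions chosen for illustration, and the sketch takes for granted (as the text does) that <math>\tilde{U}(c)</math> stays bounded while <math>U(0)</math> grows.

```python
import math

def scaled_shifted_uhe(u0, u_tilde):
    """U(0) * [log(U(0) + U~(c)) - log U(0)], using log1p for numerical accuracy."""
    return u0 * math.log1p(u_tilde / u0)

u_tilde = -2.0  # illustrative stand-in for a fixed value of U~(c)
for u0 in (1e1, 1e3, 1e6):
    # As u0 grows, the printed value approaches u_tilde.
    print(u0, scaled_shifted_uhe(u0, u_tilde))
```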

Putting this together from our prior realization that <math>\text{UHE}_a(c) \sim \log \tilde{U}(c)</math>, we get

Line 634:

before, we get

~~<math>~~\displaystyle \tilde{U}(c) \sim \log \tilde{U}(c)~~</math>~~

$$\displaystyle \tilde{U}(c) \sim \log \tilde{U}(c)$$

additionally, noting that <math>\log \tilde{U}(c) \sim \frac{1}{1-a} \log \tilde{U}(c) = \log \tilde{U}(c)^{\frac{1}{1-a}} \sim \tilde{U}(c)^{\frac{1}{1-a}} = \exp(\text{UHE}_a(c))</math>, we get

~~<math>~~\displaystyle \text{UHE}_a(c) \sim \exp(\text{UHE}_a(c))~~</math>~~

$$\displaystyle \text{UHE}_a(c) \sim \exp(\text{UHE}_a(c))$$

as <math>N \to \infty</math>, for <math>a \leq 2</math>.

Line 649:

On the surface, everything we did with the convolution theorem, and subsequent analytic continuation, should appear to work for normalized HE as well. For example, let's review our result for the exp of unnormalized HE:

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}$$

Our original expression for normalized HE was

~~<math>~~\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)}~~</math>~~

$$\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)}$$

Naively, you might expect we could simply apply the same analytic continuation technique to the numerator and denominator. If we do so, we would get

~~<math>~~\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}}{\mathcal{F}^{-1}\left\{\overline \phi \cdot |\zeta_{0.5}|^2\right\}^a}~~</math>~~

$$\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}}{\mathcal{F}^{-1}\left\{\overline \phi \cdot |\zeta_{0.5}|^2\right\}^a}$$

Line 695:

If we add this as a third parameter, called <math>w</math>, we can modify our definition of exp-UHE as follows:

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$

So that our vertical slice of the zeta function is given by <math>\Re(z) = w \cdot a</math>.

Line 705:

Let's go back to our three-parameter definition of exp-UHE above:

~~<math>~~\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}~~</math>~~

$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$

We can see that, in a sense, the need for both <math>a</math> and <math>w</math> is almost redundant. Their product specifies the vertical slice of the zeta function. If you set <math>w=0.5</math> and <math>a=1</math>, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set <math>w=0.25</math> and <math>a=2</math>, corresponding to the collision entropy with <math>\sqrt[4]{nd}</math> weighting: in both cases this is the critical line of the zeta function.

Line 711:

The only reason that these expressions are different is due to the <math>\phi_a</math> above. We had previously defined that as:

~~<math>~~\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)~~</math>~~

$$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$

that is, the Fourier transform of the spreading distribution after it has been raised to the power of <math>a</math>. So if you hold the product <math>w a</math> constant but change the balance of <math>w</math> and <math>a</math>, you will indeed get different results, simply because only the choice of <math>a</math> changes <math>\phi_a</math>.
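To make the dependence on <math>a</math> concrete, here is a hedged sketch for the Gaussian spreading function assumed throughout the article, with standard deviation <math>s</math> measured in nepers. Under that assumption, a standard Gaussian integral gives <math>\phi_a(t) = (s\sqrt{2\pi})^{1-a} \, a^{-1/2} \, e^{-s^2 t^2 / (2a)}</math>. The function names, the integration grid, and the sample parameter values are illustrative assumptions.

```python
import math

def phi_a_closed(t, a, s):
    """Closed form of the Fourier transform of S(x)**a for a Gaussian S with std s."""
    return (s * math.sqrt(2 * math.pi)) ** (1 - a) / math.sqrt(a) * math.exp(-(s * t) ** 2 / (2 * a))

def phi_a_numeric(t, a, s, half_width=12.0, steps=20001):
    """Riemann-sum approximation of the same transform on [-half_width*s, half_width*s].

    S is even, so only the cosine part of the transform contributes.
    """
    h = 2 * half_width * s / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = -half_width * s + k * h
        spread = math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
        total += spread ** a * math.cos(t * x)
    return total * h

s, t = 0.01, 50.0  # illustrative: s = 0.01 nepers is roughly 17 cents
for a in (1.0, 2.0):
    # Same t and s, different a: the two transforms differ in both scale and width.
    print(a, phi_a_closed(t, a, s), phi_a_numeric(t, a, s))
```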

Line 725:

In our derivation, we assumed the use of unreduced rationals. It turns out that with a minor adjustment, the same model gives us reduced rationals, up to a constant multiplicative scaling. Let's go back to our analytic continuation of the convolution kernel, for some arbitrary weighting:

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}$$

Now, suppose we want to analytically continue this so that the set <math>J</math> is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as <math>\frac{n}{d}</math>, but rather as <math>\frac{n'}{d'} \cdot \frac{c}{c}</math>, where <math>n'</math> and <math>d'</math> are coprime, and <math>c</math> is the GCD of <math>n</math> and <math>d</math>. For example, we would express <math>\frac{6}{4}</math> as <math>\frac{3}{2} \cdot \frac{2}{2}</math>. Doing so, and assuming that we denote the set of unreduced rationals by <math>\mathbb{U}</math>, we get the following equivalent expression of the same convolution kernel above:

~~<math>~~\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2~~</math>~~

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2$$

where the last equality is what we proved before. Note that this is the same exact function as before, just written such that the GCD of the unreduced fraction has an explicit term.
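The decomposition used above can be sketched in a few lines. The helper name `decompose` is hypothetical; it simply splits an unreduced fraction into its GCD and its coprime numerator and denominator, and the loop checks that the weighting denominator factors as described, for an arbitrarily chosen <math>w</math>.

```python
from math import gcd

def decompose(n, d):
    """Split the unreduced fraction n/d into (c, n', d') with n = c*n', d = c*d'."""
    c = gcd(n, d)
    return c, n // c, d // c

w = 2.0  # arbitrary illustrative weighting exponent
for n, d in [(6, 4), (3, 2), (10, 15)]:
    c, n_red, d_red = decompose(n, d)
    # The weight (n*d)^w equals (c^2 * n' * d')^w term by term.
    print((n, d), (c, n_red, d_red), (n * d) ** w, (c * c * n_red * d_red) ** w)
```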

Line 735:

The <math>\frac{j_c}{j_c}</math> in the log in the numerator cancels out, but in the denominator we have an extra factor of <math>{j_c}^2</math> to contend with. This yields

~~<math>~~\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]~~</math>~~

$$\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

Now, assuming we have <math>w>1</math> and everything is absolutely convergent, we can factor this into a product of series as follows:

~~<math>~~\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]~~</math>~~

$$\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

where the left summation now has <math>j_c \in \mathbb{N}^+</math>, the set of strictly positive natural numbers, and the right summation now has <math>j \in \mathbb{Q}</math>, the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the <math>j_c</math>.

Line 745:

Now, note that the left series is itself just another Dirichlet series, one that converges to <math>\zeta(2w)</math>. We have

~~<math>~~\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]~~</math>~~

$$\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by <math>\zeta(2w)</math>:

~~<math>~~\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}~~</math>~~

$$\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}$$

This function then becomes our new <math>\mathcal{F}\left\{K(n)\right\}</math>.
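As a rough numerical sanity check of this relationship, the sketch below compares truncated sums over unreduced and reduced fractions in the absolutely convergent region. It assumes <math>w = 3</math>, where <math>\zeta(6) = \pi^6/945</math> is known in closed form, and truncates both sums to numerators and denominators of at most <math>N</math>, so the agreement is only approximate. The name `kernel_ft` and all parameter values are illustrative assumptions.

```python
import cmath
import math
from math import gcd

def kernel_ft(w, t, N, reduced):
    """Truncated sum of e^{i t log(n/d)} / (n d)^w over 1 <= n, d <= N.

    With reduced=True, only coprime pairs (n, d) are included.
    """
    total = 0j
    for n in range(1, N + 1):
        for d in range(1, N + 1):
            if reduced and gcd(n, d) != 1:
                continue
            total += cmath.exp(1j * t * math.log(n / d)) / (n * d) ** w
    return total

w, t, N = 3.0, 1.5, 200
unreduced = kernel_ft(w, t, N, reduced=False)
reduced = kernel_ft(w, t, N, reduced=True)
zeta_2w = math.pi ** 6 / 945  # zeta(6), since 2w = 6

# The ratio of the two truncated sums should be close to zeta(2w).
print((unreduced / reduced).real, zeta_2w)
```

This check stays within <math>w > 1</math>, where the factorization is justified directly; the values of <math>w</math> used elsewhere in the article rely on analytic continuation, which this sketch does not test.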

@@ Line 41: / Line 41: @@
 So for example, if we want to express the probability that the incoming dyad "400 cents" is perceived as the JI basis interval "5/4," we would write that as the conditional probability
-<math>\displaystyle \newcommand{\cent}{\text{¢}}</math>
+$$\displaystyle \newcommand{\cent}{\text{¢}}$$
-<math>\displaystyle P(J=5/4|C=400\cent)</math>
+$$\displaystyle P(J=5/4|C=400\cent)$$
 Or, in general, if we want to write the conditional probability that some incoming dyad of <math>c</math> cents is perceived as the JI basis interval <math>j</math>, we would write that as
-<math>\displaystyle P(J=j|C=c)</math>
+$$\displaystyle P(J=j|C=c)$$
 which notationally, we will often abbreviate as
-<math>\displaystyle P(j|c)</math>
+$$\displaystyle P(j|c)$$
 Note that at this point, we haven't yet specified what the particular probability distribution is. There are different ways to do this, which are described in more detail below. Generally, most approaches involve each JI interval's probability being assigned based on how close it is to <math>c</math> (closer dyads are given a larger probability), and how simple it is (simple dyads are given a higher probability, if distance is the same).
@@ Line 58: / Line 58: @@
 Once we have decided on a probability distribution, we can finally evaluate the Shannon entropy. For a random variable <math>X</math>, the Shannon entropy is defined as:
-<math>\displaystyle H(X) = -\sum_{x \in X} P(x) \log_b P(x)</math>
+$$\displaystyle H(X) = -\sum_{x \in X} P(x) \log_b P(x)$$
 where the different <math>x</math> are taken from the sample space of <math>X</math>, and <math>b</math> is the base of the log. Different choices of <math>b</math> simply change the units in which entropy is given, the most common values being 2 and e, denoting "bits" and "nats". We will omit the base going forward, for simplicity.
@@ Line 64: / Line 64: @@
 In our case, we want to find the entropy of the random variable <math>J</math> of JI intervals, given a particular choice of incoming dyad in cents. The corresponding quantity that we want is:
-<math>\displaystyle H(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)</math>
+$$\displaystyle H(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)$$
 Note that above, the summation is only taken on the <math>j</math> from the sample space of <math>J</math> (i.e. the set of JI basis intervals), whereas the parameter <math>c</math> is treated as constant within the summation (and is taken as the free parameter to the function).
@@ Line 70: / Line 70: @@
 Since the parameter <math>c</math> is the free parameter, sometimes the above is notated as
-<math>\displaystyle \text{HE}(c) = H(J|c)</math>
+$$\displaystyle \text{HE}(c) = H(J|c)$$
 which makes more explicit that <math>c</math> is the argument to the harmonic entropy function, which is equal to the entropy of <math>J</math>, conditioned on the incoming dyad of <math>c</math> cents.
@@ Line 85: / Line 85: @@
 Other spreading functions have also been explored, such as the use of the heavy-tailed [https://en.wikipedia.org/wiki/Laplace_distribution Laplace distribution], sometimes described as the "Vos function" in Paul's writings. These two functions are part of the [https://en.wikipedia.org/wiki/Generalized_normal_distribution Generalized normal distribution] family, which has a parameter not only for the variance but for the kurtosis. However, for simplicity, we will assume the Gaussian distribution as the spreading function for the remainder of this article, so that the spreading function for an incoming dyad <math>c</math> can be written as follows:
-<math>\displaystyle S(x-c) = \frac{1}{s\sqrt{2\pi}} e^{-\frac{(x-c)^2}{2s^2}}</math>
+$$\displaystyle S(x-c) = \frac{1}{s\sqrt{2\pi}} e^{-\frac{(x-c)^2}{2s^2}}$$
 where the notation <math>S(x-c)</math> is chosen to make clear that we are translating <math>S(x)</math> to be centered around the incoming dyad <math>c</math>, which is now the mean of the Gaussian.
@@ Line 102: / Line 102: @@
 For discrete sets of JI basis ratios, the log-frequency spectrum can be divided up into '''domains''' assigned to each ratio. Each ratio is assigned a domain with lower bound equal to the mediant of itself and its nearest lower neighbor, and likewise with upper bound equal to the mediant of itself and its nearest upper neighbor. If no such neighbor exists, <math>\pm \infty</math> is used instead. Mathematically, this can be represented via the following expression:
-<math>\displaystyle P(j|c) = \int_{\cent(j_l)}^{\cent(j_u)} S(x-c) dx</math>
+$$\displaystyle P(j|c) = \int_{\cent(j_l)}^{\cent(j_u)} S(x-c) dx$$
 where <math>S(x-c)</math> is the spreading function associated with c, <math>j_l</math> and <math>j_u</math> are the domain lower and upper bounds associated with JI basis ratio <math>j</math>, and <math>\cent(f) = 1200\log_2(f)</math>, or the "cents" function converting frequency ratios to cents. Typically, <math>j_l</math> is set equal to the mediant of <math>j</math> and its nearest lower neighbor (if it exists), or <math>-\infty</math> if not; likewise with <math>j_u</math> and its nearest upper neighbor.
@@ Line 121: / Line 121: @@
 This modifies the expression for the probabilities <math>P(j|c)</math> as follows, noting that for now the "probabilities" won't sum to 1:
-<math>\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\sqrt{j_n \cdot j_d}}</math>
+$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\sqrt{j_n \cdot j_d}}$$
 where the <math>Q</math> notation now represents that these "probabilities" are unnormalized, and <math>j_n</math> and <math>j_d</math> are the numerator and denominator, respectively, of JI basis ratio <math>j</math>. Again, the set of basis rationals here is assumed to be all of those rationals of Tenney Height ≤ <math>N</math> for some <math>N</math>.
@@ Line 127: / Line 127: @@
 A similar observation for the use of Weil-bounded subsets of the rationals suggests domain widths of <math>\frac{1}{\max(n,d)}</math>, yielding instead the following formula:
-<math>\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\max(j_n, j_d)}</math>
+$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\max(j_n, j_d)}$$
 where this time the set of basis rationals is assumed to be all of those of Weil Height ≤ <math>N</math> for some <math>N</math>.
@@ Line 133: / Line 133: @@
 In both cases, the general approach is the same: the value of the spreading function, taken at the value of <math>\cent(j)</math>, is divided by some sort of "weighting" (or sometimes, "complexity") function representing how much weight is given to that rational number. While the two weighting functions considered thus far were derived empirically by observing the asymptotic behavior of various height-bounded subsets of the rationals, we can generalize this for arbitrary basis sets of rationals and arbitrary weights as follows:
-<math>\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}</math>
+$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}$$
 where <math>\|j\|</math> denotes a weighting function that maps from rational numbers to non-negative reals.
@@ Line 139: / Line 139: @@
 As these "probabilities" don't sum to 1, the result is not a probability distribution at all, invalidating the use of the Shannon Entropy. To rectify this, the distribution is normalized so that the probabilities do sum to 1:
-<math>\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}</math>
+$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$
 which is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities. This definition of <math>P(j|c)</math> is then used directly to compute the entropy.
@@ Line 172: / Line 172: @@
 The '''Harmonic Rényi Entropy of order a''' of an incoming dyad can be defined as follows:
-<math>\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a</math>
+$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$
 Being a q-analog, it is noteworthy that Rényi entropy converges to Shannon entropy in the limit as <math>a \to 1</math>, a fact which can be verified using L'Hôpital's rule as found [http://www.sonycsl.co.jp/person/nielsen/Note-HopitalRuleShannonRenyiTsallis.pdf here].
@@ Line 188: / Line 188: @@
 === Examples ===
 ==== a=0: Harmonic Hartley Entropy ====
-<math>\displaystyle H_0(J|c) = \log |J|</math>
+$$\displaystyle H_0(J|c) = \log |J|$$
 where <math>|J|</math> is the cardinality of the set of basis rationals. This assumes, in essence, an "infinitely dumb" auditory system which can do no better than picking a rational number from a uniform distribution completely at random. All dyads have the same Harmonic Hartley Entropy. The Hartley Entropy is sometimes called the "max-entropy," and is useful mainly as an upper bound on the other forms of entropy: every Rényi Entropy is guaranteed to be no greater than the Hartley Entropy.
@@ Line 197: / Line 197: @@
 ==== a=1: Harmonic Shannon Entropy (Harmonic Entropy) ====
-<math>\displaystyle H_1(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)</math>
+$$\displaystyle H_1(J|c) = -\sum_{j \in J} P(j|c) \log P(j|c)$$
 This is Paul's original Harmonic Entropy. Within the cryptographic analogy, this can be thought of as an auditory system which simply selects a rational at random from the incoming distribution, weighted via the distribution itself.
@@ Line 206: / Line 206: @@
 ==== a=2: Harmonic Collision Entropy ====
-<math>\displaystyle H_2(J|c) = -\log \sum_{j \in J} P(j|c)^2 = -\log P(J_1 = J_2|c)</math>
+$$\displaystyle H_2(J|c) = -\log \sum_{j \in J} P(j|c)^2 = -\log P(J_1 = J_2|c)$$
 where <math>J_1</math> and <math>J_2</math> are two independent and identically distributed random variables of JI basis ratios, conditioned on the same incoming dyad <math>c</math>, and the collision entropy is the same as the negative log of the probability that the two JI variables produce the same outcome.
@@ Line 215: / Line 215: @@
 ==== a=∞: Harmonic Min-Entropy ====
-<math>\displaystyle H_\infty(J|c) = -\log \max_{j \in J} P(j|c)</math>
+$$\displaystyle H_\infty(J|c) = -\log \max_{j \in J} P(j|c)$$
 This is the min-entropy, which simply takes the negative log of the largest probability in the distribution. This can be thought of as representing the "strength" of the incoming dyad from being "deciphered" by a "best-case" auditory system. The name "min-entropy" reflects that the <math>a=\infty</math> case is guaranteed to be a lower bound among all Rényi entropies.
@@ Line 231: / Line 231: @@
 The Harmonic Rényi Entropy is defined as
-<math>\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a</math>
+$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$
 As before, we can write <math>P(j|c)</math> as follows:
-<math>\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}</math>
+$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$
 where <math>Q(j|c)</math> is the "unnormalized" probability, and the denominator above is the sum of these unnormalized probabilities, so that all of the <math>P(j|c)</math> sum to 1.
@@ Line 243: / Line 243: @@
 To simplify notation, we first rewrite the denominator as a "normalization" function:
-<math>\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)</math>
+$$\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)$$
 and putting back into the original equation, we get
-<math>\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \sum_{j \in J} \left( \frac{Q(j|c)}{\psi(c)} \right)^a \right)</math>
+$$\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \sum_{j \in J} \left( \frac{Q(j|c)}{\psi(c)} \right)^a \right)$$
 Since <math>\psi(c)</math> is the same for each basis ratio, we can pull it out of the summation to obtain:
-<math>\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\sum_{j \in J} Q(j|c)^a}{\psi(c)^a} \right)</math>
+$$\displaystyle H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\sum_{j \in J} Q(j|c)^a}{\psi(c)^a} \right)$$
 To simplify notation further, we can also rewrite the numerator, the sum of "raw" (unnormalized) pseudo-probabilities, as a function:
-<math>\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a</math>
+$$\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a$$
 Finally, we put this all together to obtain a simplified version of the Harmonic Rényi Entropy equation:
-<math>\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\rho_a(c)}{\psi(c)^a} \right)</math>
+$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\rho_a(c)}{\psi(c)^a} \right)$$
@@ Line 270: / Line 270: @@
 <math>\displaystyle \psi(c)</math>, the normalization function, is written as follows:
-<math>\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)</math>
+$$\displaystyle \psi(c) = \sum_{j \in J} Q(j|c)$$
 Again, <math>Q(j|c)</math> is defined as follows:
-<math>\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}</math>
+$$\displaystyle Q(j|c) = \frac{S(\cent(j)-c)}{\|j\|}$$
 We can rewrite the above equation as a convolution with a delta distribution:
-<math>\displaystyle Q(j|c) = \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)</math>
+$$\displaystyle Q(j|c) = \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)$$
 Putting this back into the original summation, we obtain
-<math>\displaystyle \psi(c) = \sum_{j \in J} \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)</math>
+$$\displaystyle \psi(c) = \sum_{j \in J} \left(S \ast \frac{\delta_{-\cent(j)}}{\|j\|}\right)(-c)$$
 We note that the left factor in the convolution product is always the same <math>S(-c)</math>, which is not dependent on <math>j</math> in any way. Since convolution distributes over addition, we can factor the <math>S</math> out of the summation to obtain
-<math>\displaystyle \psi(c) = \left[S \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right)\right](-c)</math>
+$$\displaystyle \psi(c) = \left[S \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}\right)\right](-c)$$
 We can clean up this notation by defining the auxiliary distribution K:
-<math>\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}</math>
+$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}$$
 Which leaves us with the final expression:
-<math>\displaystyle \psi(c) = \left[S \ast K\right](-c)</math>
+$$\displaystyle \psi(c) = \left[S \ast K\right](-c)$$
 ==== Convolution product for <math>\rho_a(c)</math> ====
 The derivation for <math>\rho_a(c)</math> proceeds similarly. Recall the function is written as follows:
-<math>\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a</math>
+$$\displaystyle \rho_a(c) = \sum_{j \in J} Q(j|c)^a$$
 The expression for each <math>Q(j|c)^a</math> is:
-<math>\displaystyle Q(j|c)^a = \frac{S(\cent(j)-c)^a}{\|j\|^a}</math>
+$$\displaystyle Q(j|c)^a = \frac{S(\cent(j)-c)^a}{\|j\|^a}$$
 We can again express this as a convolution, this time of the function <math>S^a(-c)</math>, meaning the spreading function S taken to the a'th power, and a delta distribution:
-<math>\displaystyle Q(j|c)^a = \left(S^a \ast \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)(-c)</math>
+$$\displaystyle Q(j|c)^a = \left(S^a \ast \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)(-c)$$
 Putting this back into the original summation and factoring as before, we obtain
-<math>\displaystyle \rho_a(c) = \left[S^a \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)\right](-c)</math>
+$$\displaystyle \rho_a(c) = \left[S^a \ast \left(\sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}\right)\right](-c)$$
 And again we clean up notation by defining the auxiliary distribution
-<math>\displaystyle K^a(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}</math>
+$$\displaystyle K^a(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|^a}$$
 so that
-<math>\displaystyle \rho_a(c) = \left[S^a \ast K^a\right](-c)</math>
+$$\displaystyle \rho_a(c) = \left[S^a \ast K^a\right](-c)$$
 We have now succeeded in representing <math>\rho_a(c)</math> as a convolution.
@@ Line 339: / Line 339: @@
 Taking all of this, we can rewrite the original expression for Harmonic Rényi Entropy as follows:
-<math>\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)} \right)</math>
+$$\displaystyle \text{HE}_a(c) = H_a(J|c) = \frac{1}{1-a} \log \left( \frac{\left[S^a \ast K^a\right](-c)}{\left[S \ast K\right]^a(-c)} \right)$$
 where the expression
-<math>\displaystyle \left[S \ast K\right]^a(-c)</math>
+$$\displaystyle \left[S \ast K\right]^a(-c)$$
 represents the convolution of <math>S</math> and <math>K</math>, taken to the <math>a</math>'th power, and flipped backwards. Note that if <math>S(x)</math> is a symmetrical (even) spreading function, and if, for each ratio <math>n/d</math> in <math>J</math>, the inverse <math>d/n</math> is also in <math>J</math>, then the above convolution will also be symmetrical, and we also have
-<math>\displaystyle \left[S \ast K\right]^a(-c) = \left[S \ast K\right]^a(c)</math>
+$$\displaystyle \left[S \ast K\right]^a(-c) = \left[S \ast K\right]^a(c)$$
 We have succeeded in representing Harmonic Rényi Entropy in simple terms of two convolution products, each of which can be computed in <math>O(N \log N)</math> time.
@@ Line 362: / Line 362: @@
 In short, what we will show is that the Fourier Transform of this unnormalized Shannon Harmonic Entropy is given by
-<math>|\zeta(0.5+it)|^2 \cdot \overline {\phi(t)}</math>
+$$|\zeta(0.5+it)|^2 \cdot \overline {\phi(t)}$$
 where <math>\phi(t)</math> is the characteristic function of the spreading distribution and <math>\overline {\phi(t)}</math> is its complex conjugate. Below we also give an expression for the Rényi entropy for arbitrary choice of the parameter <math>a</math>.
@@ Line 397: / Line 397: @@
 Let's start by recalling the original definition for Harmonic Rényi Entropy, using simple weighted probabilities:
-<math>\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a</math>
+$$\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$
 Remember also that the definition of <math>P(j|c)</math> is as follows:
-<math>\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}</math>
+$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$
 where the <math>Q(j|c)</math> is the "unnormalized probability" - the raw value of the spreading function, evaluated at the ratio in question, divided by the ratio's weighting. The above equation tells us that the normalized probability is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities.
@@ Line 408: / Line 408: @@
 Putting the two together, we get
-<math>\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} \left( \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)} \right)^a</math>
+$$\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} \left( \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)} \right)^a$$
 Now, for us to define the unnormalized HE, we simply take the standard Rényi entropy equation, and replace the normalized probabilities with unnormalized ones, yielding
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} Q(j|c)^a</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} Q(j|c)^a$$
 Using our convolution theorem from before, we can express the above as
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)$$
 where, as before, <math>S^a</math> is our spreading function, taken to the <math>a</math>'th power, and <math>K^a</math> is our convolution kernel, with the weights on the delta functions taken to the <math>a</math>'th power as described previously.
@@ Line 427: / Line 427: @@
 Lastly, it so happens that it will be much easier to understand our analytic continuation if we look at the exponential of the UHE times <math>(1-a)</math>, rather than the UHE itself. The reasons for this will become clear later. If we do so, we get
-<math>\displaystyle \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)$$
 Note that this function is simply a monotonic transformation of the original, and so preserves the exact same concordance ranking on all intervals.
@@ Line 435: / Line 435: @@
 The definition for <math>K</math> is:
-<math>\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}</math>
+$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}$$
 where <math>\|j\|</math> represents the weighting of the JI basis ratio <math>j</math>. In the particular case of Tenney weighting, we get:
-<math>\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{(j_n \cdot j_d)^{0.5}}</math>
+$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{(j_n \cdot j_d)^{0.5}}$$
 where <math>j_n</math> and <math>j_d</math> are the numerator and denominator of <math>j</math>, respectively.
@@ Line 446: / Line 446: @@
 Although it may seem odd, we can take the Fourier transform of the above to obtain the following expression:
-<math>\displaystyle \mathcal{F}\left\{K(c)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \cent(j)}}{(j_n \cdot j_d)^{0.5}}</math>
+$$\displaystyle \mathcal{F}\left\{K(c)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \cent(j)}}{(j_n \cdot j_d)^{0.5}}$$
 Furthermore, for simplicity, we can change the units, so that rather than the argument being given in cents, it is given in "natural" units of "[https://en.wikipedia.org/wiki/Neper nepers]", a technique often used by Martin Gough in his work on [[Logarithmic_approximants|Logarithmic approximants]]. The representation of any interval in nepers is given by simply taking its natural logarithm. Doing so, by defining the change of variables <math>c = \frac{1200}{\log(2)}n</math>, we obtain
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \log (j_n/j_d)}}{(j_n \cdot j_d)^{0.5}}</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \log (j_n/j_d)}}{(j_n \cdot j_d)^{0.5}}$$
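The change of units can be sketched in a few lines. The helper names below are hypothetical; the only fact used is the change of variables <math>c = \frac{1200}{\log(2)}n</math>.

```python
from math import log

# Number of cents in one neper, from c = (1200 / log 2) * n.
CENTS_PER_NEPER = 1200.0 / log(2.0)

def ratio_to_nepers(numerator, denominator):
    """An interval in nepers is the natural log of its frequency ratio."""
    return log(numerator / denominator)

def ratio_to_cents(numerator, denominator):
    """The same interval measured in cents."""
    return CENTS_PER_NEPER * log(numerator / denominator)

def cents_to_nepers(cents):
    """Invert the change of variables: n = c * log(2) / 1200."""
    return cents / CENTS_PER_NEPER

octave_in_nepers = cents_to_nepers(1200.0)                 # equals log(2)
fifth_in_nepers = cents_to_nepers(ratio_to_cents(3, 2))    # equals log(3/2)
```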
 We can treat the presence of the logarithm within the exponential function as changing the base of the exponential, so that we get
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{(j_n/j_d)^{i  t}}{(j_n \cdot j_d)^{0.5}}</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{(j_n/j_d)^{i  t}}{(j_n \cdot j_d)^{0.5}}$$
 We can also factor each term in the summation to obtain
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{{j_n}^{i  t}}{j_n^{0.5}} \cdot \frac{{j_d}^{-i  t}}{j_d^{0.5}} \right]</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{{j_n}^{i  t}}{j_n^{0.5}} \cdot \frac{{j_d}^{-i  t}}{j_d^{0.5}} \right]$$
 which we can rewrite as
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{1}{{j_n}^{0.5 -i  t}} \cdot \frac{1}{{j_d}^{0.5 + i  t}} \right]</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \left[ \frac{1}{{j_n}^{0.5 -i  t}} \cdot \frac{1}{{j_d}^{0.5 + i  t}} \right]$$
@@ Line 471: / Line 471: @@
 Bounding by <math>\max(n,d) < N</math> is the same as specifying that <math>j_n < N</math> and <math>j_d < N</math>. Doing so, we get
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{1\leq j_n, j_d<N} \left[ \frac{1}{{j_n}^{0.5 -i  t}} \cdot \frac{1}{{j_d}^{0.5 + i  t}} \right]</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{1\leq j_n, j_d<N} \left[ \frac{1}{{j_n}^{0.5 -i  t}} \cdot \frac{1}{{j_d}^{0.5 + i  t}} \right]$$
 We can now factor the above product to obtain:
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^N \frac{1}{{j_n}^{0.5 -i  t}} \right] \cdot \left[ \sum_{j_d=1}^N\frac{1}{{j_d}^{0.5 + i  t}} \right]</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^N \frac{1}{{j_n}^{0.5 -i  t}} \right] \cdot \left[ \sum_{j_d=1}^N\frac{1}{{j_d}^{0.5 + i  t}} \right]$$
 Now, we can see that as <math>N \to \infty</math> above, the summations do not converge. However, incredibly enough, each of the above expressions has a very well-known analytic continuation, which is the Riemann zeta function.
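The factorization and the divergence can both be checked numerically. This is a minimal sketch with hypothetical function names, taking both indices from 1 to <math>N</math> as in the factored expression.

```python
def double_sum(N, t, w=0.5):
    """Sum over all unreduced pairs (n, d) of (n/d)^(it) / (n*d)^w."""
    total = 0j
    for n in range(1, N + 1):
        for d in range(1, N + 1):
            total += (n / d) ** (1j * t) / (n * d) ** w
    return total

def factored(N, t, w=0.5):
    """Product of the two single sums from the factored expression."""
    numerator_sum = sum(n ** -(w - 1j * t) for n in range(1, N + 1))
    denominator_sum = sum(d ** -(w + 1j * t) for d in range(1, N + 1))
    return numerator_sum * denominator_sum

# The two forms agree up to floating-point rounding.
gap = abs(double_sum(30, 2.0) - factored(30, 2.0))

# At t = 0 the value keeps growing with N, so there is no limit at w = 0.5.
growth = [factored(N, 0.0).real for N in (10, 100, 1000)]
```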
@@ Line 483: / Line 483: @@
 To perform the analytic continuation, we temporarily change the <math>0.5</math> in the denominator to some other weight <math>w> 1</math>. This is equivalent to changing our original <math>\sqrt{nd}</math> weighting to some other exponent, such as <math>(nd)^2</math> or <math>(nd)^{1.5}</math>. Doing this causes both of the summations above to converge, so that we obtain
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^\infty \frac{1}{{j_n}^{w -i  t}} \right] \cdot \left[ \sum_{j_d=1}^\infty\frac{1}{{j_d}^{w + i  t}} \right]</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^\infty \frac{1}{{j_n}^{w -i  t}} \right] \cdot \left[ \sum_{j_d=1}^\infty\frac{1}{{j_d}^{w + i  t}} \right]$$
 It has been well known for more than a century that both of these summations converge to the Riemann zeta function, so that we get
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \zeta(w-i t) \cdot \zeta(w+i t)</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \zeta(w-i t) \cdot \zeta(w+i t)$$
 Rewriting as a function of a complex variable <math>z = w + i t</math>, and noting that the zeta function obeys the property that <math>\zeta(\overline z)=\overline{\zeta(z)}</math>, where <math>\overline z</math> represents complex conjugation, we get
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \overline{\zeta(z)} \cdot \zeta(z) = |\zeta(z)|^2</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \overline{\zeta(z)} \cdot \zeta(z) = |\zeta(z)|^2$$
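For <math>w > 1</math> this identity can be sanity-checked with truncated Dirichlet series. The sketch below uses only the standard library; the cutoff `N` and the sample point are arbitrary choices, and the partial sums only approximate <math>\zeta</math>.

```python
from math import pi

def dirichlet_partial(z, N):
    """Partial sum of the Dirichlet series for zeta(z): sum of n^(-z), n = 1..N."""
    return sum(n ** -z for n in range(1, N + 1))

w, t, N = 2.0, 3.0, 10000
z = complex(w, t)

# zeta(w - it) * zeta(w + it), using partial sums in place of zeta.
product = dirichlet_partial(z.conjugate(), N) * dirichlet_partial(z, N)

# |zeta(z)|^2 computed from the same partial sum.
abs_squared = abs(dirichlet_partial(z, N)) ** 2

# At t = 0 the result should approach zeta(2)^2 = (pi^2 / 6)^2.
zeta2_squared = dirichlet_partial(complex(2.0, 0.0), N).real ** 2
```

The product of the two conjugate sums has a vanishing imaginary part and matches `abs_squared`, which is the content of the last equality.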
@@ Line 499: / Line 499: @@
 Furthermore, although the above series doesn't converge for <math>w = 0.5</math>, we can simply use the analytic continuation of the Riemann zeta function to obtain a meaningful function at that point, so that our original convolution kernel can be written as
-<math>\displaystyle K(n) = \mathcal{F}^{-1}\left\{| \zeta(0.5+ i t) |^2\right\}(n)</math>
+$$\displaystyle K(n) = \mathcal{F}^{-1}\left\{| \zeta(0.5+ i t) |^2\right\}(n)$$
 which is the inverse Fourier transform of the squared absolute value of the Riemann zeta function, taken at the critical line.
@@ Line 508: / Line 508: @@
 It is likewise easy to show that the function <math>K^a(n)</math>, taken from the numerator of our original Harmonic Rényi Entropy convolution expression, can be expressed as
-<math>\displaystyle K^a(n) = \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t) |^2\right\}(n)</math>
+$$\displaystyle K^a(n) = \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t) |^2\right\}(n)$$
 so that the choice of <math>a</math> simply changes our choice of vertical slice of the Riemann zeta function, as well as the shape of our spreading function (because it is also being raised to a power). If our spreading function is a Gaussian, then we simply get another Gaussian with a different standard deviation.
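The Gaussian case can be illustrated directly: raising a Gaussian of standard deviation <math>\sigma</math> to the power <math>a</math> gives, up to a constant factor, a Gaussian of standard deviation <math>\sigma/\sqrt{a}</math>. The numbers below are arbitrary and the function names are hypothetical.

```python
from math import exp, pi, sqrt

def gaussian(x, sigma):
    """Normalized Gaussian density with mean 0 and standard deviation sigma."""
    return exp(-x * x / (2.0 * sigma * sigma)) / (sigma * sqrt(2.0 * pi))

def powered_sigma(sigma, a):
    """Standard deviation of the Gaussian shape obtained from S(x)^a."""
    return sigma / sqrt(a)

sigma, a, x = 17.0, 2.0, 25.0
narrow = powered_sigma(sigma, a)

# Compare shapes by dividing out the peak value, which removes the
# constant factor that S^a picks up.
shape_of_power = gaussian(x, sigma) ** a / gaussian(0.0, sigma) ** a
shape_of_narrow = gaussian(x, narrow) / gaussian(0.0, narrow)
```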
@@ Line 519: / Line 519: @@
 Our original equation was
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast K^a \right)(-n)</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast K^a \right)(-n)$$
 Using our expression for <math>K^a</math> as <math>N \to \infty</math>, we get
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast  \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t)|^2\right\} \right)(-n)</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast  \mathcal{F}^{-1}\left\{|\zeta(0.5a+ i t)|^2\right\} \right)(-n)$$
 To simplify this, we will define an auxiliary notation for the zeta function as follows:
-<math>\zeta_w(t) = \zeta(w + i t)</math>
+$$\zeta_w(t) = \zeta(w + i t)$$
 yielding the simplified expression:
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast  \mathcal{F}^{-1}\left\{|\zeta_{0.5a}|^2\right\} \right)(-n)</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast  \mathcal{F}^{-1}\left\{|\zeta_{0.5a}|^2\right\} \right)(-n)$$
 We can simplify the expression of the above if we likewise take the Fourier transform of <math>S</math>. If we do, we obtain the [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) characteristic function] of the distribution, which is typically denoted by <math>\phi(t)</math>. We will use the following definitions:
-<math>\displaystyle \phi(t) = \mathcal{F}\left\{S(n)\right\}(t)</math>
+$$\displaystyle \phi(t) = \mathcal{F}\left\{S(n)\right\}(t)$$
-<math>\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)</math>
+$$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$
 Doing so, and noting that convolution becomes multiplication in the Fourier domain, we get
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\phi_a \cdot |\zeta_{0.5a}|^2\right\}(-n)</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\phi_a \cdot |\zeta_{0.5a}|^2\right\}(-n)$$
 Lastly, we note that for any real function <math>f(x)</math>, we have <math>\mathcal{F}\left\{f(-x)\right\} = \overline{\mathcal{F}\left\{f(x)\right\}}</math>, where the overline denotes complex conjugation. For simplicity's sake, we can write this as <math>\overline{\mathcal{F}\left\{f\right\}}</math>. Putting that all together, we get
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}$$
 where we can drop the overline on <math>|\zeta_{0.5a}|^2</math> because it is purely real, and its complex conjugate is itself.
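The reflection step, that reflecting a real function conjugates its Fourier transform, has a discrete analogue that is easy to check. The sketch below uses a naive discrete Fourier transform on made-up sample values; the property holds under either sign convention for the transform.

```python
import cmath

def dft(samples):
    """Naive discrete Fourier transform, O(N^2), standard library only."""
    N = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * cmath.pi * m * k / N) for k in range(N))
        for m in range(N)
    ]

def reflect(samples):
    """Circular reflection f(-x): index k is mapped to -k modulo N."""
    N = len(samples)
    return [samples[(-k) % N] for k in range(N)]

f = [0.0, 1.0, 4.0, 2.0, 0.5, 0.0, 0.0, 3.0]   # arbitrary real samples

transform_of_reflection = dft(reflect(f))
conjugate_of_transform = [value.conjugate() for value in dft(f)]

max_error = max(
    abs(p - q) for p, q in zip(transform_of_reflection, conjugate_of_transform)
)
```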
@@ Line 584: / Line 584: @@
 Let's go back to our original convolution expression for finite-<math>N</math> UHE:
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)$$
 We will also, for now, use the notation
-<math>U(c) = \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)</math>
+$$U(c) = \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)$$
 to get
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log U(c)</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log U(c)$$
 Now let's consider the auxiliary function <math>\tilde{U}(c) = U(c) - U(0)</math>, which gives us a "shifted" version of the UHE which has the UHE of 1/1 normalized to 0. Then we can re-express the UHE as follows:
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left(U(0) + \tilde{U}(c) \right)</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left(U(0) + \tilde{U}(c) \right)$$
 We can expand the above into a Taylor series as follows:
-<math>\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \left(\log(U(0)) + \frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)</math>
+$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \left(\log(U(0)) + \frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)$$
 Now, suppose we only care about the behavior of this function up to a constant vertical shift and scaling. Then we can drop the <math>\frac{1}{1-a}</math> term, since it's just a constant scaling, and also get rid of the <math>\log(U(0))</math> term, which simply subtracts a constant offset. This leaves us with
-<math>\displaystyle \text{UHE}_a(c) \sim \left(\frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)</math>
+$$\displaystyle \text{UHE}_a(c) \sim \left(\frac{\tilde{U}(c)}{U(0)} - \frac{\tilde{U}(c)^2}{2 U(0)^2} + \frac{\tilde{U}(c)^3}{3 U(0)^3} - ...\right)$$
 where <math>\sim</math> denotes the two sides are now "equivalent" up to a constant shifting and scaling.
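The Taylor expansion of the logarithm used here can be verified numerically. This is a sketch with made-up values for <math>U(0)</math> and <math>\tilde{U}(c)</math>; the series converges when <math>|\tilde{U}(c)| < U(0)</math>.

```python
from math import log

def log_series(u0, u_tilde, terms):
    """Truncated series for log(u0 + u_tilde), expanded around u0.

    log(u0) + x - x^2/2 + x^3/3 - ...  with x = u_tilde / u0.
    """
    x = u_tilde / u0
    total = log(u0)
    for k in range(1, terms + 1):
        total += (-1) ** (k + 1) * x ** k / k
    return total

u0, u_tilde = 5.0, -1.5          # illustrative values with |u_tilde| < u0
exact = log(u0 + u_tilde)
approx = log_series(u0, u_tilde, 40)
```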
@@ Line 610: / Line 610: @@
 Finally, we can go one step further and multiply the above by the constant <math>U(0)</math>. Doing so, we get
-<math>\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)</math>
+$$\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)$$
 And we now have a function that is equivalent to our original, up to a constant shift and scaling, but which is fairly easy to analyze asymptotically.
@@ Line 624: / Line 624: @@
 Assuming this is true (which we will not prove here, but which seems self-evident given all of the other results), we can go back to our last expression for exp-UHE:
-<math>\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)</math>
+$$\displaystyle \text{UHE}_a(c) \sim \left(\tilde{U}(c) - \frac{\tilde{U}(c)^2}{2 U(0)} + \frac{\tilde{U}(c)^3}{3 U(0)^2} - ...\right)$$
 and see that as <math>N \to \infty</math>, the <math>U(0)</math> terms blow up, whereas the <math>\tilde{U}(c)</math> terms do not. As a result, all of the terms with <math>U(0)</math> in the denominator disappear, and we are left with:
-<math>\displaystyle \text{UHE}_a(c) \sim \tilde{U}(c)</math>
+$$\displaystyle \text{UHE}_a(c) \sim \tilde{U}(c)$$
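The limiting step can be illustrated numerically. The sketch below assumes, as the text does, that <math>\tilde{U}(c)</math> stays bounded while <math>U(0)</math> grows; the particular numbers are made up, and the function name is hypothetical.

```python
from math import log1p

def scaled_shifted_log(u0, u_tilde):
    """U(0) * (log(U(0) + U~) - log U(0)).

    This is the closed form of the series
    U~ - U~^2 / (2 U(0)) + U~^3 / (3 U(0)^2) - ...
    """
    return u0 * log1p(u_tilde / u0)

u_tilde = -0.8   # held fixed while U(0) grows

# Distance between the full series and its first term U~.
gaps = [
    abs(scaled_shifted_log(u0, u_tilde) - u_tilde)
    for u0 in (10.0, 1e3, 1e5)
]
```

The gap shrinks roughly like <math>\tilde{U}(c)^2 / (2 U(0))</math>, the size of the first dropped term.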
 Putting this together from our prior realization that <math>\text{UHE}_a(c) \sim \log \tilde{U}(c)</math>, we get
@@ Line 634: / Line 634: @@
 before, we get
-<math>\displaystyle \tilde{U}(c) \sim \log \tilde{U}(c)</math>
+$$\displaystyle \tilde{U}(c) \sim \log \tilde{U}(c)$$
 additionally, noting that <math>\log \tilde{U}(c) \sim \frac{1}{1-a} \log \tilde{U}(c) = \log \tilde{U}(c)^{\frac{1}{1-a}} \sim \tilde{U}(c)^{\frac{1}{1-a}} = \exp(\text{UHE}_a(c))</math>, we get
-<math>\displaystyle \text{UHE}_a(c) \sim \exp(\text{UHE}_a(c))</math>
+$$\displaystyle \text{UHE}_a(c) \sim \exp(\text{UHE}_a(c))$$
 as <math>N \to \infty</math>, for <math>a \leq 2</math>.
@@ Line 649: / Line 649: @@
 On the surface, everything we did with the convolution theorem, and subsequent analytic continuation, should appear to work for normalized HE as well. For example, let's review our result for the exp of unnormalized HE:
-<math>\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}</math>
+$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}$$
 Our original expression for normalized HE was
-<math>\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\left[S^a \ast K^a\right](-n)}{\left[S \ast K\right]^a(-n)}</math>
+$$\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\left[S^a \ast K^a\right](-n)}{\left[S \ast K\right]^a(-n)}$$
 Naively, you might expect we could simply apply the same analytic continuation technique to the numerator and denominator. If we do so, we would get
-<math>\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}}{\mathcal{F}^{-1}\left\{\overline \phi \cdot |\zeta_{0.5}|^2\right\}^a}</math>
+$$\displaystyle \exp((1-a) \text{HE}_a(n)) = \frac{\mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}}{\mathcal{F}^{-1}\left\{\overline \phi \cdot |\zeta_{0.5}|^2\right\}^a}$$
@@ Line 695: / Line 695: @@
 If we add this as a third parameter, called <math>w</math>, we can modify our definition of exp-UHE as follows:
-<math>\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}</math>
+$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$
 So that our vertical slice of the zeta function is given by <math>\Re(z) = w \cdot a</math>.
@@ Line 705: / Line 705: @@
 Let's go back to our three-parameter definition of exp-UHE above:
-<math>\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}</math>
+$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$
 We can see that, in a sense, the need for both <math>a</math> and <math>w</math> is almost redundant. Their product specifies the vertical slice of the zeta function. If you set <math>w=0.5</math> and <math>a=1</math>, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set <math>w=0.25</math> and <math>a=2</math>, corresponding to the collision entropy with <math>\sqrt[4]{nd}</math> weighting: in both cases this is the critical line of the zeta function.
@@ Line 711: / Line 711: @@
 The only reason that these expressions are different is due to the <math>\phi_a</math> above. We had previously defined that as:
-<math>\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)</math>
+$$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$
 or, the Fourier transform of the spreading distribution after it has been raised to the power of <math>a</math>. So if you hold the product <math>w a</math> constant, but change the balance of <math>w</math> and <math>a</math>, you will indeed get different results, simply because only the choice of <math>a</math> changes <math>\phi_a</math>.
@@ Line 725: / Line 725: @@
 In our derivation, we assumed the use of unreduced rationals. It turns out that with a minor adjustment, the same model gives us reduced rationals, up to a constant multiplicative scaling. Let's go back to our analytic continuation of the convolution kernel, for some arbitrary weighting:
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}$$
 Now, suppose we want to analytically continue this so that the set <math>J</math> is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as <math>\frac{n}{d}</math>, but rather as <math>\frac{n'}{d'} \cdot \frac{c}{c}</math>, where <math>n'</math> and <math>d'</math> are coprime, and <math>c</math> is the gcd of <math>n</math> and <math>d</math>. For example, we would express <math>\frac{6}{4}</math> as <math>\frac{3}{2} \cdot \frac{2}{2}</math>. Doing so, and assuming that we denote the set of unreduced rationals by <math>\mathbb{U}</math>, we get the following equivalent expression of the same convolution kernel above:
-<math>\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2</math>
+$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2$$
 where the last equality is what we proved before. Note that this is the same exact function as before, just written such that the GCD of the unreduced fraction has an explicit term.
@@ Line 735: / Line 735: @@
 The <math>\frac{j_c}{j_c}</math> in the log in the numerator cancels out, but in the denominator we have an extra factor of <math>{j_c}^2</math> to contend with. This yields
-<math>\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]</math>
+$$\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
 Now, assuming we have <math>w>1</math> and everything is absolutely convergent, we can factor this into a product of series as follows:
-<math>\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]</math>
+$$\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
 where the left summation now has <math>j_c \in \mathbb{N}^+</math>, the set of strictly positive natural numbers, and the right summation now has <math>j \in \mathbb{Q}</math>, the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the <math>j_c</math>.
@@ Line 745: / Line 745: @@
 Now, note that the left series is, itself, just another Dirichlet series that converges to the zeta function. We have
-<math>\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]</math>
+$$\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
 and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by <math>\zeta(2w)</math>:
-<math>\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}</math>
+$$\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}$$
 This function then becomes our new <math>\mathcal{F}\left\{K(n)\right\}</math>.
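The <math>\zeta(2w)</math> factor can be checked with truncated sums. This is a rough numerical sketch rather than a proof: the cutoff `N`, the sample values of <math>w</math> and <math>t</math>, and the function name `kernel_sum` are all arbitrary choices, and truncation means the ratio only approximates <math>\zeta(2w)</math>.

```python
from math import gcd, pi

def kernel_sum(N, w, t, reduced):
    """Truncated sum of (n/d)^(it) / (n*d)^w over pairs with 1 <= n, d <= N.

    With reduced=True, only coprime pairs (reduced fractions) are kept.
    """
    total = 0j
    for n in range(1, N + 1):
        for d in range(1, N + 1):
            if reduced and gcd(n, d) != 1:
                continue
            total += (n / d) ** (1j * t) / (n * d) ** w
    return total

w, t, N = 2.0, 1.5, 300
unreduced_sum = kernel_sum(N, w, t, reduced=False)
reduced_sum = kernel_sum(N, w, t, reduced=True)

zeta_2w = pi ** 4 / 90                 # exact value of zeta(4), since 2w = 4
ratio = unreduced_sum / reduced_sum    # should be close to zeta(2w)
```

Both sums are real up to rounding, because swapping <math>n</math> and <math>d</math> conjugates each term.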