Harmonic entropy: Difference between revisions

Line 694:

However, to see why this doesn't work, let's compare the analytically continued version of the denominator (i.e. the normalization term) with the finite versions. Look at the following picture, which is a plot of ~~<math>\mathcal~~{F}^{-1}~~\left\~~{\overline ~~\phi \cdot~~ |~~\zeta_~~{0.5}|^2~~\right\~~}^a</~~math~~>:

However, to see why this doesn't work, let's compare the analytically continued version of the denominator (i.e. the normalization term) with the finite versions. Look at the following picture, which is a plot of {{nowrap|ℱ{{inv}}{{(}}{{overline|φ}} · {{!}}ζ<sub>0.5</sub>{{!}}<sup>2</sup>{{)}}<sup>''a''</sup>}}:

[[File:HE_normalization_terms.png|800px]]

This picture shows how the denominator changes as ''N'' increases: you can see that in general, the function is shifted upward, increasing without bound. The thin plots reflect this for N=1000, 5000, 10000, 50000, and 100000, where you can see them increasing.

This picture shows how the denominator changes as ''N'' increases: you can see that in general, the function is shifted upward, increasing without bound. The thin plots reflect this for {{nowrap|''N'' {{=}} 1000}}, 5000, 10000, 50000, and 100000, where you can see them increasing.

You will note that the denominator also looks exactly like unnormalized HE, just upside down. Normalized HE is the quotient of two functions that both look like this, which are slightly different. This quotient produces the usual HE curve, which is flipped upside down relative to the denominator, and which also increases without bound. That all these functions increase without bound is just another way to state that these things generally don't converge as ~~<math>~~N ~~\to \infty</math>~~.

You will note that the denominator also looks exactly like unnormalized HE, just upside down. Normalized HE is the quotient of two functions that both look like this, which are slightly different. This quotient produces the usual HE curve, which is flipped upside down relative to the denominator, and which also increases without bound. That all these functions increase without bound is just another way to state that these things generally don't converge as {{nowrap|''N'' → ∞}}.

However, look at what happens with our analytic continuation, which is given by the thicker blue line at the bottom. Despite our sequence of finite-''N'' denominator terms increasing on the y-axis, the analytically continued version suddenly "snaps" back to zero. Although the curve shape is roughly the same, the vertical offset is almost completely eliminated when the analytic continuation is done.

The problem here is that the original HE function was the quotient of two very large, strictly positive functions - the numerator and denominator. However, performing the analytic continuation on each separately has caused both to "snap" back to zero, so that the denominator, while retaining the same shape, now has points where it touches the x-axis. As a result, the quotient of the two will have poles where the denominator is zero.

The problem here is that the original HE function was the quotient of two very large, strictly positive functions: the numerator and denominator. However, performing the analytic continuation on each separately has caused both to "snap" back to zero, so that the denominator, while retaining the same shape, now has points where it touches the ''x''-axis. As a result, the quotient of the two will have poles where the denominator is zero.

The resulting quotient of analytically continued functions looks like this, and does not remotely resemble HE:

Line 716:

But in the case of "normalized HE," we analytically continued the Fourier transforms of the numerator and denominator, separately, transformed both out of the Fourier domain, and then took the quotient. Complex analysis ''really'' makes no guarantee on the behavior of the quotient of two Fourier transforms of the analytic continuations of holomorphic functions, and in this case the behavior is very strange. A different approach to analytically continuing the expression would be required.

This same principle explains why we plotted the exp of UHE, rather than UHE itself. Were we to take the log of finite UHE, we would be taking the log of a strictly positive function. However, the analytically continued exp-UHE snaps back to the x-axis, so that there are points where the function is zero or even negative. Taking the log of the analytically continued exp-UHE would yield a complex-valued function where it is negative, due to this snapping effect. However, looking at exp-UHE directly has no such problem.

This same principle explains why we plotted the exp of UHE, rather than UHE itself. Were we to take the log of finite UHE, we would be taking the log of a strictly positive function. However, the analytically continued exp-UHE snaps back to the ''x''-axis, so that there are points where the function is zero or even negative. Taking the log of the analytically continued exp-UHE would yield a complex-valued function where it is negative, due to this snapping effect. However, looking at exp-UHE directly has no such problem.

Finally, it is noteworthy that for ~~<math>~~a>2~~</math>~~, we end up looking at slices of the zeta function for which ~~<math>\~~Re(z)>1~~</math>~~. This is where our original unnormalized HE function should converge as ~~<math>~~N ~~\to \infty</math>~~, corresponding to the region where the Riemann zeta function Dirichlet series converges. For these values of ''a'', the exp-UHE ''is'' positive. So, we can take the log again and look at the usual UHE. This can be useful for plotting, since exp-UHE tends to "flatten" out the curve for high values of ''a'', whereas taking the log accentuates the minima and maxima (and more closely resembles the usual HRE).

Finally, it is noteworthy that for {{nowrap|''a'' > 2}}, we end up looking at slices of the zeta function for which {{nowrap|Re(''z'') > 1}}. This is where our original unnormalized HE function should converge as {{nowrap|''N'' → ∞}}, corresponding to the region where the Riemann zeta function Dirichlet series converges. For these values of ''a'', the exp-UHE ''is'' positive. So, we can take the log again and look at the usual UHE. This can be useful for plotting, since exp-UHE tends to "flatten" out the curve for high values of ''a'', whereas taking the log accentuates the minima and maxima (and more closely resembles the usual HRE).

=== Interpretation as a New Free Parameter: the Weighting Exponent ===

In our original derivation of the analytic continuation, we temporarily changed the weighting for rationals from ~~<math>~~(nd)^{0.5}</~~math~~> to some other ~~<math>~~(nd)^w</~~math~~>, with ~~<math>~~w > 1~~</math>~~, for the sake of obtaining a series that converges. We then changed the exponent back to ~~<math>~~0.5~~</math>~~.

In our original derivation of the analytic continuation, we temporarily changed the weighting for rationals from (''nd'')<sup>0.5</sup> to some other (''nd'')<sup>''w''</sup>, with {{nowrap|''w'' > 1}}, for the sake of obtaining a series that converges. We then changed the exponent back to 0.5.

This can be thought of as giving us another free parameter to HE, in addition to ''s'' and ''a'': the exponent for the weighting for each rational. That is, although Paul originally derived the ~~<math>~~(nd)^{0.5}</~~math~~> exponent empirically by studying the behavior of mediant-to-mediant HE for Tenney-bounded rationals, there is no reason we can't simply that exponent to something else. As shown before, so long as that exponent is greater than 1, unnormalized HE will converge in the limit as ~~<math>~~N ~~-> \infty</math>~~, and will converge to the same thing whether we are bounding ~~<math>~~nd < N~~</math>~~, ~~<math>\~~max(n,d) < N~~</math>~~, or anything else (see again [https://math.stackexchange.com/questions/2593993/convergence-of-product-of-series-to-zeta-function here]). We can then analytically continue to the case where ~~<math>~~w < 1~~</math>~~.

This can be thought of as giving us another free parameter to HE, in addition to ''s'' and ''a'': the exponent for the weighting for each rational. That is, although Paul originally derived the (''nd'')<sup>0.5</sup> exponent empirically by studying the behavior of mediant-to-mediant HE for Tenney-bounded rationals, there is no reason we can't simply change that exponent to something else. As shown before, so long as that exponent is greater than 1, unnormalized HE will converge in the limit as {{nowrap|''N'' → ∞}}, and will converge to the same thing whether we are bounding {{nowrap|''nd'' < ''N''}}, {{nowrap|max(''n'', ''d'') < ''N''}}, or anything else (see again [https://math.stackexchange.com/questions/2593993/convergence-of-product-of-series-to-zeta-function here]). We can then analytically continue to the case where {{nowrap|''w'' < 1}}.
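The claim that the choice of bound does not matter once the exponent exceeds 1 can be checked numerically in the simplest case. The following is only a sketch, assuming {{nowrap|''w'' {{=}} 2}} and looking at the kernel at {{nowrap|''t'' {{=}} 0}}, where every term is just 1/(''nd'')<sup>''w''</sup>; the function names are made up for this illustration. Both truncations should approach ζ(''w'')<sup>2</sup>.

```python
import math


def kernel_at_zero_product_bound(w, N):
    """Sum of 1/(n*d)**w over all unreduced pairs with n*d < N."""
    total = 0.0
    for n in range(1, N):
        # Largest d with n*d < N is (N - 1) // n.
        for d in range(1, (N - 1) // n + 1):
            total += (n * d) ** -w
    return total


def kernel_at_zero_max_bound(w, N):
    """Sum of 1/(n*d)**w over all unreduced pairs with max(n, d) < N.

    With this bound the double sum factors into the square of a single sum.
    """
    one_side = sum(n ** -w for n in range(1, N))
    return one_side ** 2


if __name__ == "__main__":
    w = 2.0
    limit = (math.pi ** 2 / 6) ** 2  # zeta(2)**2, the expected common limit
    for N in (10, 100, 1000):
        print(N, kernel_at_zero_product_bound(w, N), kernel_at_zero_max_bound(w, N), limit)
```

For small ''N'' the two truncations differ noticeably, but the gap closes as ''N'' grows, which is the behavior the paragraph above relies on.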

If we add this as a third parameter, called ''w'', we can modify our definition of exp-UHE as follows:

Line 729:

$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$

So that our vertical slice of the zeta function is given by $\Re(z) = ~~w\cdot \a$~~.

So that our vertical slice of the zeta function is given by {{nowrap|Re(''z'') {{=}} ''wa''}}.

=== Equivalence of the ~~Weighting Exponent~~ and ''a'' for ~~Generalized Normal Distributions~~ ===

=== Equivalence of the weighting exponent and ''a'' for generalized normal distributions ===

We get a very interesting result if our spreading distribution is a {{w|generalized normal distribution}}, which is a family that encompasses both the Gaussian and the Laplace distributions (sometimes referred to as the "Vos curve" in Paul's work).

We get a very interesting result if our spreading distribution is a ~~[https://en.wikipedia.org/wiki/Generalized_normal_distribution~~ generalized normal distribution], which a family that encompasses both the Gaussian and the Laplace distributions (sometimes referred to as the "Vos curve" in Paul's work).

Let's go back to our three-parameter definition of exp-UHE above:

Line 739:

Line 738:

$$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$

We can see that, in a sense, the need for both ''a'' and ''w'' is almost redundant. Their product specifies the vertical slice of the zeta function. If you set ~~<math>~~w=0.5~~</math>~~ and ~~<math>~~a=1~~</math>~~, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set ~~<math>~~w=0.25~~</math>~~ and ~~<math>a~~=2~~</math>~~, corresponding to the collision entropy with <math>^4\sqrt{nd}</math> weighting: in both cases this is the critical line of the zeta function.

We can see that, in a sense, the need for both ''a'' and ''w'' is almost redundant. Their product specifies the vertical slice of the zeta function. If you set {{nowrap|''w'' {{=}} 0.5}} and {{nowrap|''a'' {{=}} 1}}, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set {{nowrap|''w'' {{=}} 0.25}} and {{nowrap|''a'' {{=}} 2}}, corresponding to the collision entropy with <math>\sqrt[4]{nd}</math> weighting: in both cases this is the critical line of the zeta function.

The only reason that these expressions are different is due to the <~~math~~>~~\phi_a~~</~~math~~> above. We had previously defined that as:

The only reason that these expressions are different is due to the φ<sub>''a''</sub> above. We had previously defined that as:

$$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$

or, the Fourier transform of the spreading distribution, raised to the power of ''a''. So if you hold the product ~~<math>w a</math>~~ as constant, but change the balance of ''w'' and ''a'', you will indeed get different results, simply because only the choice of ''a'' changes the <~~math~~>~~\phi_a~~</~~math~~>.

or, the Fourier transform of the spreading distribution, raised to the power of ''a''. So if you hold the product ''wa'' as constant, but change the balance of ''w'' and ''a'', you will indeed get different results, simply because only the choice of ''a'' changes the φ<sub>''a''</sub>.

However, we get a very neat result if we are using the generalized normal distribution. In that case, if we take the generalized normal distribution to a power ''a'', we get another instance of the same generalized normal distribution. The difference is, the variance will be divided by <~~math~~>a^{\frac{1~~}{\beta~~}}</~~math~~>, where ~~<math>\beta</math>~~ is the shape parameter for the distribution (a value of 1 is the Laplace distribution, a value of 2 is the Gaussian distribution, etc). The whole distribution will also no longer have an integral of 1, since we have also raised the scaling coefficient to a power, but this won't change anything, as it just corresponds to a uniform scaling of the end result.

However, we get a very neat result if we are using the generalized normal distribution. In that case, if we take the generalized normal distribution to a power ''a'', we get another instance of the same generalized normal distribution. The difference is that the scale parameter, and hence the standard deviation, will be divided by ''a''<sup>{{frac|1|β}}</sup>, where β is the shape parameter for the distribution (a value of 1 is the Laplace distribution, a value of 2 is the Gaussian distribution, etc.). The whole distribution will also no longer have an integral of 1, since we have also raised the scaling coefficient to a power, but this won't change anything, as it just corresponds to a uniform scaling of the end result.

In practice, what this means is that if you are using one of the above distributions, and you change ''a'', this is ''equivalent'' to changing the weighting exponent ''w'', and tweaking the standard deviation ''s'' according to the above equation.

This gives us a very nice interpretation of our ''a'' coefficient from HRE: it basically represents the weighting exponent on the rationals, with a corresponding adjustment to the standard deviation. The collision entropy ~~<math>~~a=2~~</math>~~ with the standard weighting <math>\sqrt{nd}</math> is totally equivalent to the Shannon entropy ~~<math>~~a=1~~</math>~~ with the weighting ''nd'' on the rationals, so long as the value of ''s'' is adjusted according to the equation above. However, it should be noted that this definition only holds for the "unnormalized HRE" given above.

This gives us a very nice interpretation of our ''a'' coefficient from HRE: it basically represents the weighting exponent on the rationals, with a corresponding adjustment to the standard deviation. The collision entropy {{nowrap|''a'' {{=}} 2}} with the standard weighting <math>\sqrt{nd}</math> is totally equivalent to the Shannon entropy {{nowrap|''a'' {{=}} 1}} with the weighting ''nd'' on the rationals, so long as the value of ''s'' is adjusted according to the equation above. However, it should be noted that this definition only holds for the "unnormalized HRE" given above.
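The rescaling property used in this section can be checked directly. The sketch below assumes the usual parameterization of the generalized normal density, proportional to exp(−(|''x''|/α)<sup>β</sup>), with α the scale parameter (to which the standard deviation is proportional); the function names are invented for this illustration. Raising the density to the power ''a'' should give a constant multiple of the density with scale α/''a''<sup>1/β</sup>.

```python
import math


def gnd_pdf(x, alpha, beta):
    """Generalized normal density with scale alpha and shape beta."""
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x) / alpha) ** beta))


def rescaled_alpha(alpha, beta, a):
    """Scale of the distribution obtained by raising the density to the power a."""
    return alpha / a ** (1.0 / beta)


def power_to_rescaled_ratio(x, alpha, beta, a):
    """Ratio pdf(x)**a / pdf_rescaled(x); it should not depend on x."""
    numerator = gnd_pdf(x, alpha, beta) ** a
    denominator = gnd_pdf(x, rescaled_alpha(alpha, beta, a), beta)
    return numerator / denominator


if __name__ == "__main__":
    # beta = 1 is the Laplace case, beta = 2 the Gaussian case.
    for beta in (1.0, 2.0):
        ratios = [power_to_rescaled_ratio(x, 1.0, beta, 2.0) for x in (0.0, 0.5, 1.0, 2.0)]
        print(beta, ratios)
```

The printed ratios are the same at every ''x'' for a given β, which is the "uniform scaling of the end result" mentioned above.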

~~=== Reduced Rationals Only ===~~

=== Reduced rationals only ===

In our derivation, we assumed the use of unreduced rationals. It turns out that with a minor adjustment, the same model gives us reduced rationals, up to a constant multiplicative scaling. Let's go back to our analytic continuation of the convolution kernel, for some arbitrary weighting:

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}$$

Now, suppose we want to analytically continue this so that the set ''J'' is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as ~~<math>\frac~~{n}{d}~~</math>~~, but rather as ~~<math>\frac~~{n}{d} ~~\cdot \frac~~{c}{c}~~</math>~~, where ~~<math>~~n'~~</math>~~ and ~~<math>~~d'~~</math>~~ are coprime, and ''c'' is the gcd of both. For example, we would express ~~<math>\frac~~{6}{4}~~</math>~~ as ~~<math>\frac~~{3}{2} ~~\cdot \frac{2~~}{2}~~</math>~~. Doing so, and assuming that we denote the set of unreduced rationals by ~~<math>\mathbb{~~U~~}</math>~~, we get the following equivalent expression of the same convolution kernel above:

Now, suppose we want to analytically continue this so that the set ''J'' is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as {{sfrac|''n''|''d''}}, but rather as {{nowrap|{{sfrac|''n''{{'}}|''d''{{-'}}}} · {{sfrac|''c''|''c''}}}}, where ''n''{{'}} and ''d''{{-'}} are coprime, and ''c'' is the gcd of ''n'' and ''d''. For example, we would express {{sfrac|6|4}} as {{nowrap|{{sfrac|3|2}} · {{sfrac|2|2}}}}. Doing so, and assuming that we denote the set of unreduced rationals by ''U'', we get the following equivalent expression of the same convolution kernel above:

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2$$

Line 769:

Line 767:

$$\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

Now, assuming we have ~~<math>~~w>1~~</math>~~ and everything is absolutely convergent, we can factor this into a product of series as follows:

Now, assuming we have {{nowrap|''w'' > 1}} and everything is absolutely convergent, we can factor this into a product of series as follows:

$$\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

where the left summation now has <~~math~~>~~j_c \in \mathbb{N}^+~~</~~math~~>, the set of strictly positive rational numbers, and the right summation now has ~~<math>~~j ~~\in \mathbb{Q~~}~~</math>~~ the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the ''~~j_c~~''.

where the left summation now has {{nowrap|''j''<sub>''c''</sub> ∈ ℕ{{mpp}}}}, the set of strictly positive natural numbers, and the right summation now has {{nowrap|''j'' ∈ ℚ}}, the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the ''j''<sub>''c''</sub>.

Now, note that the left series is, itself, just another Dirichlet series that converges to the zeta function. We have

Line 779:

Line 777:

$$\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$

and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by ~~<math>\zeta~~(2w)~~</math>~~:

and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by ζ(2''w''):

$$\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}$$

This function then becomes our new ~~<math>\mathcal{F}\left\~~{K(n)~~\right\~~}~~</math>~~.

This function then becomes our new ℱ{''K''(''n'')}.

However, you will note that ~~<math>\zeta~~(2w)~~</math>~~ is a constant not depending at all on ''t''. As a result, the reduced rational kernel is exactly equal to the unreduced rational kernel, times a constant depending only on ''w''. This means that when we take the inverse Fourier transform and convolve, the result for exp-UHE will likewise be identical, scaled only by a constant.

However, you will note that ζ(2''w'') is a constant not depending at all on ''t''. As a result, the reduced rational kernel is exactly equal to the unreduced rational kernel, times a constant depending only on ''w''. This means that when we take the inverse Fourier transform and convolve, the result for exp-UHE will likewise be identical, scaled only by a constant.

As a result, we have shown that we get the same exact results for reduced and unreduced rationals, differing only by a multiplicative scaling.
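The factorization can be sanity-checked with truncated sums in the convergent region. This is only a sketch, assuming {{nowrap|''w'' {{=}} 2}} and a cutoff of max(''n'', ''d'') ≤ ''N''; the helper names are made up for this illustration, and agreement is only approximate because all three sums are truncated.

```python
import cmath
import math


def unreduced_kernel(w, t, N):
    """Sum over all pairs n, d <= N of exp(i t log(n/d)) / (n d)**w.

    This finite double sum equals |sum_{n <= N} n**-(w + i t)|**2 exactly.
    """
    partial = sum(cmath.exp(-complex(w, t) * math.log(n)) for n in range(1, N + 1))
    return abs(partial) ** 2


def reduced_kernel(w, t, N):
    """Same sum, restricted to coprime pairs (reduced fractions n/d)."""
    total = 0.0
    for n in range(1, N + 1):
        for d in range(1, N + 1):
            if math.gcd(n, d) == 1:
                # Terms for n/d and d/n are complex conjugates, so only the
                # cosine (real) part survives in the full sum.
                total += math.cos(t * math.log(n / d)) / (n * d) ** w
    return total


def zeta_partial(x, N):
    """Truncated Dirichlet series for zeta(x), valid for real x > 1."""
    return sum(k ** -x for k in range(1, N + 1))


if __name__ == "__main__":
    w, N = 2.0, 300
    for t in (0.0, 3.0, 14.1):
        lhs = unreduced_kernel(w, t, N)
        rhs = zeta_partial(2 * w, N) * reduced_kernel(w, t, N)
        print(t, lhs, rhs)
```

For each ''t'', the two printed values should agree up to the truncation error, illustrating that the reduced and unreduced kernels differ only by the constant ζ(2''w'').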

Lastly, you will note that for the special value ~~<math>~~w=0.5~~</math>~~, corresponding to the usual <math>\sqrt{nd}</math> weighting, we end up dividing by the term ~~<math>\zeta~~(1)~~</math>~~. This is the only pole in the zeta function, so we wind up dividing by infinity, making the entire function zero, as pointed out by Martin Gough. However, as we can get arbitrarily close to ~~<math>~~w=0.5~~</math>~~ and still exhibit the behavior that the unreduced and reduced functions are scaled versions of one another, we can simply use the unreduced version of exp-UHE for ~~<math>~~w=0.5~~</math>~~ and consider it equivalent to reduced exp-UHE in the limit.

Lastly, you will note that for the special value {{nowrap|''w'' {{=}} 0.5}}, corresponding to the usual <math>\sqrt{nd}</math> weighting, we end up dividing by the term ζ(1). This is the only pole in the zeta function, so we wind up dividing by infinity, making the entire function zero, as pointed out by Martin Gough. However, as we can get arbitrarily close to {{nowrap|''w'' {{=}} 0.5}} and still exhibit the behavior that the unreduced and reduced functions are scaled versions of one another, we can simply use the unreduced version of exp-UHE for {{nowrap|''w'' {{=}} 0.5}} and consider it equivalent to reduced exp-UHE in the limit.
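To see how the scaling constant 1/ζ(2''w'') collapses as ''w'' approaches 0.5 from above, one can evaluate ζ near its pole. The sketch below assumes a simple Euler–Maclaurin style approximation of ζ(''x'') for real {{nowrap|''x'' > 1}} (a partial sum plus the first two correction terms); it is not a high-precision implementation, and the function names are invented for this illustration.

```python
def zeta_approx(x, M=1000):
    """Approximate zeta(x) for real x > 1.

    Uses the partial sum up to M - 1 plus the leading Euler-Maclaurin
    corrections M**(1 - x) / (x - 1) and M**-x / 2.
    """
    head = sum(k ** -x for k in range(1, M))
    return head + M ** (1.0 - x) / (x - 1.0) + 0.5 * M ** -x


def reduced_scale(w):
    """Constant 1 / zeta(2w) relating the reduced kernel to the unreduced one."""
    return 1.0 / zeta_approx(2.0 * w)


if __name__ == "__main__":
    for w in (1.0, 0.75, 0.6, 0.51, 0.501):
        print(w, reduced_scale(w))
```

The printed constant shrinks toward zero as ''w'' nears 0.5, while staying nonzero for every {{nowrap|''w'' > 0.5}}, which is why the limit argument above still makes sense.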

== To Do ==

There are a number of things that need to be added to this article. Below are listed some for reference:

* 3HE, both for finite HE and for ~~<math>~~N ~~\to \infty</math>~~

* 3HE, both for finite HE and for {{nowrap|''N'' → ∞}}

* ~~write~~-up of fast computation for infinite zeta-UHE, perhaps with a zeta table

* Write-up of fast computation for infinite zeta-UHE, perhaps with a zeta table

* ~~addition~~ of many more pictures

* Addition of many more pictures

== References ==

@@ Line 694: / Line 694: @@
-However, to see why this doesn't work, let's compare the analytically continued version of the denominator (i.e. the normalization term) with the finite versions. Look at the following picture, which is a plot of <math>\mathcal{F}^{-1}\left\{\overline \phi \cdot |\zeta_{0.5}|^2\right\}^a</math>:
+However, to see why this doesn't work, let's compare the analytically continued version of the denominator (i.e. the normalization term) with the finite versions. Look at the following picture, which is a plot of {{nowrap|ℱ{{inv}}{{(}}{{overline|φ}} · {{!}}ζ<sub>0.5</sub>{{!}}<sup>2</sup>{{)}}<sup>''a''</sup>}}:
 [[File:HE_normalization_terms.png|800px]]
-This picture shows how the denominator changes as ''N'' increases: you can see that in general, the function is shifted upward, increasing without bound. The thin plots reflect this for N=1000, 5000, 10000, 50000, and 100000, where you can see them increasing.
+This picture shows how the denominator changes as ''N'' increases: you can see that in general, the function is shifted upward, increasing without bound. The thin plots reflect this for {{nowrap|''N'' {{=}} 1000}}, 5000, 10000, 50000, and 100000, where you can see them increasing.
-You will note that the denominator also looks exactly like unnormalized HE, just upside down. Normalized HE is the quotient of two functions that both look like this, which are slightly different. This quotient produces the usual HE curve, which is flipped upside down relative to the denominator, and which also increases without bound. That all these functions increase without bound is just another way to state that these things generally don't converge as <math>N \to \infty</math>.
+You will note that the denominator also looks exactly like unnormalized HE, just upside down. Normalized HE is the quotient of two functions that both look like this, which are slightly different. This quotient produces the usual HE curve, which is flipped upside down relative to the denominator, and which also increases without bound. That all these functions increase without bound is just another way to state that these things generally don't converge as {{nowrap|''N'' → ∞}}.
 However, look at what happens with our analytic continuation, which is given by the thicker blue line at the bottom. Despite our sequence of finite-''N'' denominator terms increasing on the y-axis, the analytically continued version suddenly "snaps" back to zero. Although the curve shape is roughly the same, the vertical offset is almost completely eliminated when the analytic continuation is done.
-The problem here is that the original HE function was the quotient of two very large, strictly positive functions - the numerator and denominator. However, performing the analytic continuation on each separately has caused both to "snap" back to zero, so that the denominator, while retaining the same shape, now has points where it touches the x-axis. As a result, the quotient of the two will have poles where the denominator is zero.
+The problem here is that the original HE function was the quotient of two very large, strictly positive functions: the numerator and denominator. However, performing the analytic continuation on each separately has caused both to "snap" back to zero, so that the denominator, while retaining the same shape, now has points where it touches the ''x''-axis. As a result, the quotient of the two will have poles where the denominator is zero.
 The resulting quotient of analytically continued functions looks like this, and does not remotely resemble HE:
@@ Line 716: / Line 716: @@
 But in the case of "normalized HE," we analytically continued the Fourier transforms of the numerator and denominator, separately, transformed both out of the Fourier domain, and then took the quotient. Complex analysis ''really'' makes no guarantee on the behavior of the quotient of two Fourier transforms of the analytic continuations of holomorphic functions, and in this case the behavior is very strange. A different approach to analytically continuing the expression would be required.
-This same principle explains why we plotted the exp of UHE, rather than UHE itself. Were we to take the log of finite UHE, we would be taking the log of a strictly positive function. However, the analytically continued exp-UHE snaps back to the x-axis, so that there are points where the function is zero or even negative. Taking the log of the analytically continued exp-UHE would yield a complex-valued function where it is negative, due to this snapping effect. However, looking at exp-UHE directly has no such problem.
+This same principle explains why we plotted the exp of UHE, rather than UHE itself. Were we to take the log of finite UHE, we would be taking the log of a strictly positive function. However, the analytically continued exp-UHE snaps back to the ''x''-axis, so that there are points where the function is zero or even negative. Taking the log of the analytically continued exp-UHE would yield a complex-valued function where it is negative, due to this snapping effect. However, looking at exp-UHE directly has no such problem.
-Finally, it is noteworthy that for <math>a>2</math>, we end up looking at slices of the zeta function for which <math>\Re(z)>1</math>. This is where our original unnormalized HE function should converge as <math>N \to \infty</math>, corresponding to the region where the Riemann zeta function Dirichlet series converges. For these values of ''a'', the exp-UHE ''is'' positive. So, we can take the log again and look at the usual UHE. This can be useful for plotting, since exp-UHE tends to "flatten" out the curve for high values of ''a'', whereas taking the log accentuates the minima and maxima (and more closely resembles the usual HRE).
+Finally, it is noteworthy that for {{nowrap|''a'' &gt; 2}}, we end up looking at slices of the zeta function for which {{nowrap|Re(''z'') &gt; 1}}. This is where our original unnormalized HE function should converge as {{nowrap|''N'' → ∞}}, corresponding to the region where the Riemann zeta function Dirichlet series converges. For these values of ''a'', the exp-UHE ''is'' positive. So, we can take the log again and look at the usual UHE. This can be useful for plotting, since exp-UHE tends to "flatten" out the curve for high values of ''a'', whereas taking the log accentuates the minima and maxima (and more closely resembles the usual HRE).
 === Interpretation as a New Free Parameter: the Weighting Exponent ===
-In our original derivation of the analytic continuation, we temporarily changed the weighting for rationals from <math>(nd)^{0.5}</math> to some other <math>(nd)^w</math>, with <math>w > 1</math>, for the sake of obtaining a series that converges. We then changed the exponent back to <math>0.5</math>.
+In our original derivation of the analytic continuation, we temporarily changed the weighting for rationals from (''nd'')<sup>0.5</sup> to some other (''nd'')<sup>''w''</sup>, with {{nowrap|''w'' &gt; 1}}, for the sake of obtaining a series that converges. We then changed the exponent back to 0.5.
-This can be thought of as giving us another free parameter to HE, in addition to ''s'' and ''a'': the exponent for the weighting for each rational. That is, although Paul originally derived the <math>(nd)^{0.5}</math> exponent empirically by studying the behavior of mediant-to-mediant HE for Tenney-bounded rationals, there is no reason we can't simply that exponent to something else. As shown before, so long as that exponent is greater than 1, unnormalized HE will converge in the limit as <math>N -> \infty</math>, and will converge to the same thing whether we are bounding <math>nd < N</math>, <math>\max(n,d) < N</math>, or anything else (see again [https://math.stackexchange.com/questions/2593993/convergence-of-product-of-series-to-zeta-function here]). We can then analytically continue to the case where <math>w < 1</math>.
+This can be thought of as giving us another free parameter to HE, in addition to ''s'' and ''a'': the exponent for the weighting for each rational. That is, although Paul originally derived the (''nd'')<sup>0.5</sup> exponent empirically by studying the behavior of mediant-to-mediant HE for Tenney-bounded rationals, there is no reason we can't simply change that exponent to something else. As shown before, so long as that exponent is greater than 1, unnormalized HE will converge in the limit as {{nowrap|''N'' → ∞}}, and will converge to the same thing whether we are bounding {{nowrap|''nd'' &lt; ''N''}}, {{nowrap|max(''n'', ''d'') &lt; ''N''}}, or anything else (see again [https://math.stackexchange.com/questions/2593993/convergence-of-product-of-series-to-zeta-function here]). We can then analytically continue to the case where {{nowrap|''w'' &lt; 1}}.
 If we add this as a third parameter, called ''w'' we can modify our definition of exp-UHE as follows:
@@ Line 729: / Line 729: @@
 $$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$
-So that our vertical slice of the zeta function is given by $\Re(z) = w\cdot \a$.
+So that our vertical slice of the zeta function is given by {{nowrap|Re(''z'') {{=}} ''wa''}}.
-=== Equivalence of the Weighting Exponent and ''a'' for Generalized Normal Distributions ===
+=== Equivalence of the weighting exponent and ''a'' for generalized normal distributions ===
+We get a very interesting result if our spreading distribution is a {{w|generalized normal distribution}}, which is a family that encompasses both the Gaussian and the Laplace distributions (sometimes referred to as the "Vos curve" in Paul's work).
-We get a very interesting result if our spreading distribution is a [https://en.wikipedia.org/wiki/Generalized_normal_distribution generalized normal distribution], which a family that encompasses both the Gaussian and the Laplace distributions (sometimes referred to as the "Vos curve" in Paul's work).
 Let's go back to our three-parameter definition of exp-UHE above:
@@ Line 739: / Line 738: @@
 $$\displaystyle \exp((1-a) \text{UHE}_{a,w}(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{w a}|^2\right\}$$
-We can see that, in a sense, the need for both ''a'' and ''w'' is almost redundant. Their product specifies the vertical slice of the zeta function. If you set <math>w=0.5</math> and <math>a=1</math>, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set <math>w=0.25</math> and <math>a=2</math>, corresponding to the collision entropy with <math>\sqrt[4]{nd}</math> weighting: in both cases this is the critical line of the zeta function.
+We can see that, in a sense, the need for both ''a'' and ''w'' is almost redundant. Their product specifies the vertical slice of the zeta function. If you set {{nowrap|''w'' {{=}} 0.5}} and {{nowrap|''a'' {{=}} 1}}, corresponding to the Shannon entropy with <math>\sqrt{nd}</math> weighting, you get the same vertical slice as if you set {{nowrap|''w'' {{=}} 0.25}} and {{nowrap|''a'' {{=}} 2}}, corresponding to the collision entropy with <math>\sqrt[4]{nd}</math> weighting: in both cases this is the critical line of the zeta function.
-The only reason that these expressions are different is due to the <math>\phi_a</math> above. We had previously defined that as:
+The only reason that these expressions are different is due to the φ<sub>''a''</sub> above. We had previously defined that as:
 $$\displaystyle \phi_a(t) = \mathcal{F}\left\{S(n)^a\right\}(t)$$
-or, the Fourier transform of the spreading distribution, raised to the power of ''a''. So if you hold the product <math>w a</math> as constant, but change the balance of ''w'' and ''a'', you will indeed get different results, simply because only the choice of ''a'' changes the <math>\phi_a</math>.
+or, the Fourier transform of the spreading distribution, raised to the power of ''a''. So if you hold the product ''wa'' as constant, but change the balance of ''w'' and ''a'', you will indeed get different results, simply because only the choice of ''a'' changes the φ<sub>''a''</sub>.
-However, we get a very neat result if we are using the generalized normal distribution. In that case, if we take the generalized normal distribution to a power ''a'', we get another instance of the same generalized normal distribution. The difference is that the scale parameter (and hence the standard deviation) will be divided by <math>a^{\frac{1}{\beta}}</math>, where <math>\beta</math> is the shape parameter for the distribution (a value of 1 is the Laplace distribution, a value of 2 is the Gaussian distribution, etc.). The whole distribution will also no longer have an integral of 1, since we have also raised the scaling coefficient to a power, but this won't change anything, as it just corresponds to a uniform scaling of the end result.
+However, we get a very neat result if we are using the generalized normal distribution. In that case, if we take the generalized normal distribution to a power ''a'', we get another instance of the same generalized normal distribution. The difference is that the scale parameter (and hence the standard deviation) will be divided by ''a''<sup>{{frac|1|β}}</sup>, where β is the shape parameter for the distribution (a value of 1 is the Laplace distribution, a value of 2 is the Gaussian distribution, etc.). The whole distribution will also no longer have an integral of 1, since we have also raised the scaling coefficient to a power, but this won't change anything, as it just corresponds to a uniform scaling of the end result.
 In practice, what this means is that if you are using one of the above distributions, and you change ''a'', this is ''equivalent'' to changing the weighting exponent ''w'', and tweaking the standard deviation ''s'' according to the above equation.
-This gives us a very nice interpretation of our ''a'' coefficient from HRE: it basically represents the weighting exponent on the rationals, with a corresponding adjustment to the standard deviation. The collision entropy <math>a=2</math> with the standard weighting <math>\sqrt{nd}</math> is totally equivalent to the Shannon entropy <math>a=1</math> with the weighting ''nd'' on the rationals, so long as the value of ''s'' is adjusted according to the equation above. However, it should be noted that this equivalence only holds for the "unnormalized HRE" given above.
+This gives us a very nice interpretation of our ''a'' coefficient from HRE: it basically represents the weighting exponent on the rationals, with a corresponding adjustment to the standard deviation. The collision entropy {{nowrap|''a'' {{=}} 2}} with the standard weighting <math>\sqrt{nd}</math> is totally equivalent to the Shannon entropy {{nowrap|''a'' {{=}} 1}} with the weighting ''nd'' on the rationals, so long as the value of ''s'' is adjusted according to the equation above. However, it should be noted that this equivalence only holds for the "unnormalized HRE" given above.
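The rescaling property used in this section can be checked numerically. The following is a minimal sketch, assuming the standard parameterization of the generalized normal density with a scale parameter and shape parameter β; the names <code>gnd_pdf</code>, <code>alpha</code> and <code>alpha_scaled</code> are illustrative and do not come from the article. It raises the density to the power ''a'' and compares it pointwise with a density whose scale has been divided by ''a''<sup>1/β</sup>; if the property holds, the ratio should be the same constant at every point.

```python
import math

def gnd_pdf(x, alpha, beta):
    """Generalized normal density with scale alpha and shape beta (assumed parameterization)."""
    coeff = beta / (2 * alpha * math.gamma(1 / beta))
    return coeff * math.exp(-((abs(x) / alpha) ** beta))

alpha, beta, a = 1.5, 2.0, 2.0           # beta = 2 is the Gaussian case
alpha_scaled = alpha / a ** (1 / beta)   # scale divided by a^(1/beta)

xs = [k / 10 for k in range(-30, 31)]
# Ratio of the powered density to the rescaled density at each sample point.
ratios = [gnd_pdf(x, alpha, beta) ** a / gnd_pdf(x, alpha_scaled, beta) for x in xs]

# The spread of the ratios should be at floating-point noise level.
print(max(ratios) - min(ratios))
```

The constant ratio is the "uniform scaling" mentioned above: the powered distribution no longer integrates to 1, but its shape matches the rescaled distribution.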
-=== Reduced Rationals Only ===
+=== Reduced rationals only ===
 In our derivation, we assumed the use of unreduced rationals. It turns out that with a minor adjustment, the same model gives us reduced rationals, up to a constant multiplicative scaling. Let's go back to our analytic continuation of the convolution kernel, for some arbitrary weighting:
 $$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i  t \log (j_n/j_d)}}{(j_n \cdot j_d)^{w}}$$
-Now, suppose we want to analytically continue this so that the set ''J'' is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as <math>\frac{n}{d}</math>, but rather as <math>\frac{n'}{d'} \cdot \frac{c}{c}</math>, where <math>n'</math> and <math>d'</math> are coprime, and ''c'' is the gcd of ''n'' and ''d''. For example, we would express <math>\frac{6}{4}</math> as <math>\frac{3}{2} \cdot \frac{2}{2}</math>. Doing so, and assuming that we denote the set of unreduced rationals by <math>\mathbb{U}</math>, we get the following equivalent expression of the same convolution kernel above:
+Now, suppose we want to analytically continue this so that the set ''J'' is the set of all reduced rational numbers. We can first do so by starting again with unreduced rationals, but expressing each rational not as {{sfrac|''n''|''d''}}, but rather as {{nowrap|{{sfrac|''n''{{'}}|''d''{{-'}}}} · {{sfrac|''c''|''c''}}}}, where ''n''{{'}} and ''d''{{-'}} are coprime, and ''c'' is the gcd of ''n'' and ''d''. For example, we would express {{sfrac|6|4}} as {{nowrap|{{sfrac|3|2}} · {{sfrac|2|2}}}}. Doing so, and assuming that we denote the set of unreduced rationals by ''U'', we get the following equivalent expression of the same convolution kernel above:
 $$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_c j_{n'}}{j_c j_{d'}})}}{(j_c j_{n'} \cdot j_c j_{d'})^{w}} = |\zeta(w+i t)|^2$$
@@ Line 769: / Line 767: @@
 $$\displaystyle |\zeta(w+i t)|^2 = \sum_{j \in \mathbb{U}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{({j_c}^2 \cdot j_{n'} j_{d'})^{w}} = \sum_{j \in \mathbb{U}} \left[ \frac{1}{{j_c}^{2w}} \cdot \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
-Now, assuming we have <math>w>1</math> and everything is absolutely convergent, we can factor this into a product of series as follows:
+Now, assuming we have {{nowrap|''w'' &gt; 1}} and everything is absolutely convergent, we can factor this into a product of series as follows:
 $$\displaystyle |\zeta(w+i t)|^2 = \left[ \sum_{j_c \in \mathbb{N}^+} \frac{1}{{j_c}^{2w}} \right] \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
-where the left summation now has <math>j_c \in \mathbb{N}^+</math>, the set of strictly positive natural numbers, and the right summation now has <math>j \in \mathbb{Q}</math>, the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the ''j_c''.
+where the left summation now has {{nowrap|''j''<sub>''c''</sub> ∈ ℕ{{mpp}}}}, the set of strictly positive natural numbers, and the right summation now has {{nowrap|''j'' ∈ ℚ}}, the set of reduced rationals. Note again that the product above yields all unreduced rationals, thanks to the ''j''<sub>''c''</sub>.
 Now, note that the left series is, itself, just another Dirichlet series that converges to the zeta function. We have
@@ Line 779: / Line 777: @@
 $$\displaystyle |\zeta(w+i t)|^2 = \zeta(2w) \cdot \left[ \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}} \right]$$
-and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by <math>\zeta(2w)</math>:
+and now we are done. The right series is the thing that we want, representing the Fourier transform of the convolution kernel where only reduced fractions are allowed. To get that, we simply divide the whole thing by ζ(2''w''):
 $$\displaystyle \frac{|\zeta(w+i t)|^2}{\zeta(2w)} = \sum_{j \in \mathbb{Q}} \frac{e^{i  t \log (\frac{j_{n'}}{j_{d'}})}}{(j_{n'} j_{d'})^{w}}$$
-This function then becomes our new <math>\mathcal{F}\left\{K(n)\right\}</math>.
+This function then becomes our new ℱ{''K''(''n'')}.
-However, you will note that <math>\zeta(2w)</math> is a constant not depending at all on ''t''. As a result, the reduced rational kernel is exactly equal to the unreduced rational kernel, times a constant depending only on ''w''. This means that when we take the inverse Fourier transform and convolve, the result for exp-UHE will likewise be identical, scaled only by a constant.
+However, you will note that ζ(2''w'') is a constant not depending at all on ''t''. As a result, the reduced rational kernel is exactly equal to the unreduced rational kernel, times a constant depending only on ''w''. This means that when we take the inverse Fourier transform and convolve, the result for exp-UHE will likewise be identical, scaled only by a constant.
 As a result, we have shown that we get the same exact results for reduced and unreduced rationals, differing only by a multiplicative scaling.
-Lastly, you will note that for the special value <math>w=0.5</math>, corresponding to the usual <math>\sqrt{nd}</math> weighting, we end up dividing by the term <math>\zeta(1)</math>. This is the only pole in the zeta function, so we wind up dividing by infinity, making the entire function zero, as pointed out by Martin Gough. However, as we can get arbitrarily close to <math>w=0.5</math> and still exhibit the behavior that the unreduced and reduced functions are scaled versions of one another, we can simply use the unreduced version of exp-UHE for <math>w=0.5</math> and consider it equivalent to reduced exp-UHE in the limit.
+Lastly, you will note that for the special value {{nowrap|''w'' {{=}} 0.5}}, corresponding to the usual <math>\sqrt{nd}</math> weighting, we end up dividing by the term ζ(1). This is the only pole in the zeta function, so we wind up dividing by infinity, making the entire function zero, as pointed out by Martin Gough. However, as we can get arbitrarily close to {{nowrap|''w'' {{=}} 0.5}} and still exhibit the behavior that the unreduced and reduced functions are scaled versions of one another, we can simply use the unreduced version of exp-UHE for {{nowrap|''w'' {{=}} 0.5}} and consider it equivalent to reduced exp-UHE in the limit.
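The factorization into ζ(2''w'') times the reduced-rational series can be sanity-checked with truncated sums in the absolutely convergent region. The sketch below is only an illustration under stated assumptions: it uses {{nowrap|''w'' {{=}} 2}} so that ζ(2''w'') = ζ(4) = π<sup>4</sup>/90 is available in closed form, truncates both sums at max(''n'', ''d'') ≤ ''N'', and uses the hypothetical helper name <code>kernel_sums</code>. Because of the truncation, the two sides agree only approximately.

```python
import cmath
import math

def kernel_sums(w, t, N):
    """Truncated kernel sums over pairs (n, d) with 1 <= n, d <= N.

    Returns (sum over all pairs, sum over coprime pairs only).
    """
    unreduced = 0j
    reduced = 0j
    for n in range(1, N + 1):
        for d in range(1, N + 1):
            term = cmath.exp(1j * t * math.log(n / d)) / (n * d) ** w
            unreduced += term
            if math.gcd(n, d) == 1:   # keep only reduced fractions n/d
                reduced += term
    return unreduced, reduced

w, t, N = 2.0, 3.0, 200
unreduced, reduced = kernel_sums(w, t, N)
zeta_2w = math.pi ** 4 / 90           # zeta(4), valid because w = 2

# The two printed values should be close; the gap shrinks as N grows.
print(abs(unreduced), abs(zeta_2w * reduced))
```

The same check cannot be run directly at {{nowrap|''w'' {{=}} 0.5}}, where the series no longer converge absolutely and ζ(2''w'') has its pole; that case relies on the analytic continuation argument above.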
 == To Do ==
 There are a number of things that need to be added to this article. Below are listed some for reference:
-* 3HE, both for finite HE and for <math>N \to \infty</math>
+* 3HE, both for finite HE and for {{nowrap|''N'' → ∞}}
-* write-up of fast computation for infinite zeta-UHE, perhaps with a zeta table
+* Write-up of fast computation for infinite zeta-UHE, perhaps with a zeta table
-* addition of many more pictures
+* Addition of many more pictures
 == References ==