Harmonic entropy: Difference between revisions
All of the models described above involve a finite set of rational numbers, bounded by requiring that some weighting function be less than some maximum value ''N''.

It so happens that we are more or less able to analytically continue this definition to the situation where {{nowrap|''N'' {{=}} ∞}}. More precisely, we are able to analytically continue the exponential of HE, which yields the same relative interval rankings as standard HE.

The only technical caveat is that we use the HE of the "unnormalized" probability distribution. However, in the limit of large ''N'', this appears to agree closely with the usual HE. We go into more detail about this below.

Our basic approach is: rather than weighting intervals by (''nd'')<sup>0.5</sup>, we choose a different exponent, such as (''nd'')<sup>2</sup>. For an exponent which is large enough (we will show that it must be greater than 1), HE does indeed converge as {{nowrap|''N'' → ∞}}, and we show that this yields an expression related to the [[The Riemann zeta function and tuning|Riemann zeta function]]. We can then use the analytic continuation of the zeta function to obtain an analytically continued curve for the (''nd'')<sup>0.5</sup> weighting, which we then show empirically does indeed appear to be what HE converges on for large values of ''N''.

In short, what we will show is that the Fourier transform of this unnormalized harmonic Shannon entropy is given by

$$|\zeta(0.5+it)|^2 \cdot \overline {\phi(t)}$$

where φ(''t'') is the characteristic function of the spreading distribution and {{overline|φ(''t'')}} is its complex conjugate. Below we also give an expression for the Rényi entropy for an arbitrary choice of the parameter ''a''.

This enables us to speak cognizantly of the harmonic entropy of an interval as measured against ''all'' rational numbers.
=== Background: Unnormalized entropy ===

Our derivation only analytically continues the entropy function for the "unnormalized" set of probabilities, which we previously wrote as ''Q''(''j''|''c''). For this definition to be philosophically perfect, we would want to analytically continue the entropy function for the normalized sense of probabilities, previously written as ''P''(''j''|''c'').

However, in practice, the "unnormalized entropy" appears to be an extremely good approximation to the normalized entropy for large values of ''N''. The resulting curve has approximately the same minima and maxima as HE, the same general shape, and for all intents and purposes looks exactly like HE, just shifted on the y-axis.

Here are some examples for different values of ''s''. All of these are Shannon HE (<math>a=1</math>), using <math>\sqrt{nd}</math> weights, with unreduced rationals (more on this below), with the bound that {{nowrap|''nd'' < 1,000,000}}, just with different values of ''s''. All have been scaled so that the minimum entropy is 0, and the maximum entropy is 1:
[[File:HE vs UHE s=0.5%.png|800px]]
[[File:HE vs UHE s=1.5%.png|800px]]
As you can see, the unnormalized version is extremely close to a linear function of the normalized one. A similar situation holds for larger values of ''a''. The Pearson correlation coefficient ρ is also given, and is typically very close to 1—for example, for {{nowrap|''s'' {{=}} 1%}}, it's equal to 0.99922. The correlation also seems to get better with increasing values of ''N'', such that the correlation for {{nowrap|''N'' {{=}} 1,000,000}} (shown above) is much better than the one for {{nowrap|''N'' {{=}} 10,000}} (not pictured).

In the above examples, note that there are slightly adjusted values of ''s'' (usually by < 1{{c}}) between the normalized and unnormalized comparisons for each plot. For example, in the plot for {{nowrap|''s'' {{=}} 1%}}, corresponding to 17.2264{{c}}, we compare to a slightly adjusted UHE of 16.4764{{c}}. This is because, empirically, sometimes a very slight adjustment corresponds to a better correlation coefficient, suggesting that the UHE may be equivalent to the HE with a minuscule adjustment in the value of ''s''.

It would be nice to show the exact relationship of unnormalized entropy to the normalized entropy in the limit of large ''N'', and whether the two converge to be exactly equal (perhaps given some minuscule adjustment in ''s'' or ''a''). However, we will leave this for future research, as well as the question of how to do an exact derivation of normalized HE.

For now, we will start with a derivation of the unnormalized entropy for {{nowrap|''N'' {{=}} ∞}}, as an interesting function worthy of study in its own right—not only because it looks exactly like HE, but because it leads to an expression for unnormalized HE in terms of the [[The Riemann zeta function and tuning|Riemann zeta function]].
=== Derivation ===
For now, our derivation is limited to the case of <math>\sqrt{nd}</math> Tenney-weighted rationals, although it may be possible to derive a similar result for max(''n'', ''d'') weighting as well.

Additionally, because it simplifies the derivation, we will use ''unreduced rationals'' in our basis set, meaning that we will even allow unreduced fractions such as 4/2 so long as {{nowrap|''nd'' < ''N''}} for our bound ''N''. The use of HE with unreduced rationals has previously been studied by Paul and shown to be not much different from HE with reduced ones. However, we will later show that unreduced and reduced rationals converge to the same thing in the limit as {{nowrap|''N'' → ∞}}, up to a constant multiplicative scaling.
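That constant scaling can be sketched numerically: every unreduced fraction is a ''k''-fold multiple of a reduced one, with {{nowrap|(''kn'')(''kd'') {{=}} ''k''<sup>2</sup>''nd''}}, so summing (''nd'')<sup>−''w''</sup> over unreduced fractions multiplies the reduced sum by ζ(2''w'') in the limit. A minimal Python check of this, where the bound and the weight exponent {{nowrap|''w'' {{=}} 1.5}} are arbitrary illustrative choices:

```python
from math import gcd

def tenney_sums(bound, w):
    """Sum (n*d)**-w over fractions n/d with n*d < bound,
    once over reduced fractions only and once over all (unreduced) ones."""
    reduced = unreduced = 0.0
    for n in range(1, bound):
        for d in range(1, bound // n + 1):
            if n * d >= bound:
                break
            term = (n * d) ** -w
            unreduced += term
            if gcd(n, d) == 1:
                reduced += term
    return reduced, unreduced

reduced, unreduced = tenney_sums(20000, 1.5)
zeta3 = sum(k ** -3.0 for k in range(1, 100000))   # zeta(2w) for w = 1.5
# the ratio of the two sums approaches zeta(2w) as the bound grows
assert abs(unreduced / reduced - zeta3) < 0.05
```

The same argument goes through for any convergent exponent {{nowrap|''w'' > 1}}, which is why reduced and unreduced bases give the same function up to scale.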
==== Definition of the unnormalized harmonic Rényi entropy ====

Let's start by recalling the original definition for harmonic Rényi entropy, using simple weighted probabilities:

$$\displaystyle \text{HE}_a(c) = \frac{1}{1-a} \log \sum_{j \in J} P(j|c)^a$$

Remember also that the definition of ''P''(''j''|''c'') is as follows:

$$\displaystyle P(j|c) = \frac{Q(j|c)}{\sum_{j \in J} Q(j|c)}$$

where ''Q''(''j''|''c'') is the "unnormalized probability"—the raw value of the spreading function, evaluated at the ratio in question, divided by the ratio's weighting. The above equation tells us that the normalized probability is equal to the unnormalized probability, divided by the sum of all unnormalized probabilities.
$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left( S^a \ast K^a \right)(-c)$$

where, as before, ''S''<sup>''a''</sup> is our spreading function, taken to the ''a''th power, and ''K''<sup>''a''</sup> is our convolution kernel, with the weights on the delta functions taken to the ''a''th power as described previously.

Note that if ''S'' is symmetric, as in the case of the Gaussian or Laplace distributions, then the inverted argument of (−''c'') on the end is redundant, and can be replaced by (''c'').

Lastly, it so happens that it will be much easier to understand our analytic continuation if we look at the exponential of the UHE times ({{nowrap|1 − ''a''}}), rather than the UHE itself. The reasons for this will become clear later. If we do so, we get

$$\displaystyle \exp((1-a) \text{UHE}_a(c)) = \left( S^a \ast K^a \right)(-c)$$

Note that this function is simply a monotonic transformation of the original, and so preserves the exact same concordance ranking on all intervals (though for {{nowrap|''a'' > 1}} it is order-reversing, flipping the curve upside down).
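As a sanity check on this convolution form, here is a sketch that evaluates {{nowrap|(''S''<sup>''a''</sup> ∗ ''K''<sup>''a''</sup>)}} once on a discretized cents grid and compares it against the direct sum over ratios. The grid resolution, bound, and parameters are all arbitrary illustrative choices:

```python
import numpy as np
from math import gcd, log2

# Discretize K^a as binned delta weights on a 1-cent grid, convolve with
# the sampled S^a, and compare with the direct sum over ratios.
BOUND, S_CENTS, A = 2000, 17.0, 2.0
ratios = [(n, d) for n in range(1, BOUND) for d in range(1, BOUND // n + 1)
          if n * d < BOUND and gcd(n, d) == 1]

lo, hi = -1200, 2400                            # kernel support, in cents
kernel = np.zeros(hi - lo)
for n, d in ratios:
    i = int(round(1200 * log2(n / d))) - lo     # nearest grid bin
    if 0 <= i < len(kernel):
        kernel[i] += (n * d) ** (-0.5 * A)      # K^a delta weights

offsets = np.arange(-201.0, 202.0)              # out to ~12 sigma
s_a = np.exp(-offsets ** 2 / (2 * S_CENTS ** 2)) ** A        # sampled S^a
conv = np.convolve(kernel, s_a, mode="same")    # (S^a * K^a) on the grid

def sum_qa(c):                                  # direct sum of Q(j|c)^a
    return sum(((n * d) ** -0.5 *
                np.exp(-(c - 1200 * log2(n / d)) ** 2 / (2 * S_CENTS ** 2))) ** A
               for n, d in ratios)

for c in (0, 702):                              # S symmetric, so (-c) = (c)
    assert abs(conv[c - lo] / sum_qa(c) - 1) < 0.1
uhe_unison = np.log(conv[0 - lo]) / (1 - A)     # UHE_a at the unison
```

The convolution route computes the whole curve at once, which is how finite-''N'' HE is usually evaluated in practice.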
==== Analytic continuation of the convolution kernel ====
$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{\|j\|}$$

where ‖''j''‖ represents the weighting of the JI basis ratio ''j''. In the particular case of Tenney weighting, we get:

$$\displaystyle K(c) = \sum_{j \in J} \frac{\delta_{-\cent(j)}}{(j_n \cdot j_d)^{0.5}}$$

where ''j''<sub>''n''</sub> and ''j''<sub>''d''</sub> are the numerator and denominator of ''j'', respectively.
$$\displaystyle \mathcal{F}\left\{K(c)\right\}(t) = \sum_{j \in J} \frac{e^{i t \cent(j)}}{(j_n \cdot j_d)^{0.5}}$$

Furthermore, for simplicity, we can change the units, so that rather than the argument being given in cents, it is given in "natural" units of "{{w|nepers}}", a technique often used by Martin Gough in his work on [[logarithmic approximants]]. The representation of any interval in nepers is given by simply taking its natural logarithm. Doing so, by defining the change of variables {{nowrap|''c'' {{=}} {{sfrac|1200|log(2)}}''n''}}, we obtain

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{j \in J} \frac{e^{i t \log (j_n/j_d)}}{(j_n \cdot j_d)^{0.5}}$$
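The change of units is a one-liner; as a quick check (the helper names here are ours, not the article's):

```python
from math import log, log2

def cents_to_nepers(c):
    # c = (1200 / log 2) * n, so n = c * log(2) / 1200
    return c * log(2) / 1200

def interval_in_nepers(ratio):
    return log(ratio)   # an interval's size in nepers is its natural log

# 3/2 is ~701.955 cents, or ~0.405465 nepers
assert abs(cents_to_nepers(1200 * log2(1.5)) - interval_in_nepers(1.5)) < 1e-12
```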
Now, we note our summation is currently written simply as <math>\sum_{j \in J}</math>. For a Tenney height weighting, we typically bound by <math>\sqrt{nd} < N</math> for some ''N''. However, although it is unusual, for the sake of simplifying the derivation, we will bound by {{nowrap|max(''n'', ''d'') < ''N''}} instead, despite the use of Tenney height for our weighting. This will not end up being much of a problem, as the two will converge on the same result anyway.

Bounding by {{nowrap|max(''n'', ''d'') < ''N''}} is the same as specifying that {{nowrap|''j''<sub>''n''</sub> < ''N''}} and {{nowrap|''j''<sub>''d''</sub> < ''N''}}. Doing so, we get

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \sum_{1\leq j_n, j_d<N} \left[ \frac{1}{{j_n}^{0.5 -i t}} \cdot \frac{1}{{j_d}^{0.5 + i t}} \right]$$
We can now factor the above double sum into a product, obtaining:

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^N \frac{1}{{j_n}^{0.5 -i t}} \right] \cdot \left[ \sum_{j_d=1}^N\frac{1}{{j_d}^{0.5 + i t}} \right]$$

Now, we can see that as {{nowrap|''N'' → ∞}} above, the summations do not converge. However, incredibly enough, each of the above expressions has a very well-known analytic continuation, which is the Riemann zeta function.

To perform the analytic continuation, we temporarily change the 0.5 in the denominator's exponent to some other weight {{nowrap|''w'' > 1}}. This is equivalent to changing our original <math>\sqrt{nd}</math> weighting to some other exponent, such as (''nd'')<sup>2</sup> or (''nd'')<sup>1.5</sup>. Doing this causes both of the summations above to converge, so that we obtain

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \left[ \sum_{j_n=1}^\infty \frac{1}{{j_n}^{w -i t}} \right] \cdot \left[ \sum_{j_d=1}^\infty\frac{1}{{j_d}^{w + i t}} \right]$$
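The factoring step is easy to spot-check numerically for a convergent exponent. The values of ''w'', ''t'', and the truncation point below are arbitrary illustrative choices:

```python
import numpy as np

w, t, N = 2.0, 1.0, 500
n = np.arange(1, N, dtype=float)
left = np.sum(n ** -(w - 1j * t))       # sum over numerators j_n
right = np.sum(n ** -(w + 1j * t))      # sum over denominators j_d

# The double sum over (j_n, j_d) factors into the product of the two
# one-dimensional sums, and those two factors are complex conjugates of
# each other, giving a squared modulus.
double = np.sum(np.outer(n ** -(w - 1j * t), n ** -(w + 1j * t)))
assert np.isclose(double, left * right)
assert np.isclose(left * right, abs(left) ** 2)
```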
Rewriting as a function of a complex variable {{nowrap|''z'' {{=}} ''w'' + ''it''}}, and noting that the zeta function obeys the property that {{nowrap|ζ({{overline|''z''}}) {{=}} {{overline|ζ(''z'')}}}}, where {{overline|''z''}} represents complex conjugation, we get

$$\displaystyle \mathcal{F}\left\{K(n)\right\}(t) = \overline{\zeta(z)} \cdot \zeta(z) = |\zeta(z)|^2$$

And we have now obtained a very interesting result: if we had instead gone with something like (''nd'')<sup>2</sup> weighting on rationals, rather than <math>\sqrt{nd}</math>, our HE setup ''would'' have converged as {{nowrap|''N'' → ∞}}, and our original HE convolution kernel would have been the Fourier transform of a particular vertical "slice" of the Riemann zeta function where {{nowrap|Re(''z'') {{=}} 2}}.

Furthermore, although the above series doesn't converge for {{nowrap|''w'' {{=}} 0.5}}, we can simply use the analytic continuation of the Riemann zeta function to obtain a meaningful function at that point, so that our original convolution kernel can be written as

$$\displaystyle K(n) = \mathcal{F}^{-1}\left\{| \zeta(0.5+ it) |^2\right\}(n)$$
which is the inverse Fourier transform of the squared absolute value of the Riemann zeta function, taken at the critical line.
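None of this requires heavy machinery to try out: the zeta function can be evaluated anywhere away from {{nowrap|''z'' {{=}} 1}} with, for example, the globally convergent Hasse/Knopp series. A sketch, where the term count is just an accuracy knob (note that the accuracy of this particular series degrades as |''t''| grows large):

```python
from math import comb, pi

def zeta(s, terms=80):
    """Riemann zeta via the globally convergent Hasse/Knopp series for the
    Dirichlet eta function (valid for complex s away from s = 1 and the
    other zeros of 1 - 2^(1-s))."""
    eta = 0.0
    for n in range(terms):
        inner = sum((-1) ** k * comb(n, k) * (k + 1) ** -s
                    for k in range(n + 1))
        eta += inner / 2 ** (n + 1)
    return eta / (1 - 2 ** (1 - s))

# sampling the spectrum |zeta(0.5 + it)|^2 on the critical line:
spectrum = [abs(zeta(0.5 + 1j * t)) ** 2 for t in (0.0, 5.0, 10.0)]
```

The dips of this spectrum toward 0 occur exactly at the nontrivial zeta zeros, which is what gives the kernel its structure.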
Lastly, to do some cleanup, we previously went with <math>\max(n,d) < N</math> bounds, rather than <math>\sqrt{nd} < N</math> bounds, despite using <math>\sqrt{nd}</math> weighting on ratios. However, it is easy to show that regardless of which bounds you use, both choices converge to the same function when {{nowrap|''w'' > 1}} in the limit as {{nowrap|''N'' → ∞}}. Since these series agree on this right half-plane of the Riemann zeta function, they share the same analytic continuation, so that we get the same result, despite using our technique to simplify the derivation. A good explanation of this can be found on StackExchange [https://math.stackexchange.com/questions/2593993/convergence-of-product-of-series-to-zeta-function here].

It is likewise easy to show that the function ''K''<sup>''a''</sup>(''n''), taken from the numerator of our original harmonic Rényi entropy convolution expression, can be expressed as

$$\displaystyle K^a(n) = \mathcal{F}^{-1}\left\{|\zeta(0.5a+ it) |^2\right\}(n)$$
so that the choice of ''a'' simply changes our choice of vertical slice of the Riemann zeta function, as well as the shape of our spreading function (because it is also being raised to a power). If our spreading function is a Gaussian, then we simply get another Gaussian with a different standard deviation.

==== Analytic continuation of unnormalized harmonic Rényi entropy ====

We can put this back into our equation for the unnormalized harmonic Rényi entropy. To do so, we will continue with our change of units from cents to nepers, corresponding to a change of our variable from ''c'' to ''n''. We will likewise assume the spreading probability distribution ''S'' has been scaled to reflect the new choice of units.
Using our expression for ''K''<sup>''a''</sup> as {{nowrap|''N'' → ∞}}, we get

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \left( S^a \ast \mathcal{F}^{-1}\left\{|\zeta(0.5a+ it)|^2\right\} \right)(-n)$$
We can simplify the above expression if we likewise take the Fourier transform of ''S''. If we do, we obtain the [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) characteristic function] of the distribution, which is typically denoted by φ(''t''). We will use the following definitions:

$$\displaystyle \phi(t) = \mathcal{F}\left\{S(n)\right\}(t)$$
Lastly, we note that for any real function ''f''(''x''), we have {{nowrap|ℱ{{(}}''f''(−''x''){{)}} {{=}} {{overline|ℱ{{(}}''f''(''x''){{)}}}}}}, the complex conjugate of ℱ{{(}}''f''(''x''){{)}}. Putting that all together, we get

$$\displaystyle \exp((1-a) \text{UHE}_a(n)) = \mathcal{F}^{-1}\left\{\overline \phi_a \cdot |\zeta_{0.5a}|^2\right\}$$

where we can drop the overline on |ζ<sub>0.5''a''</sub>|<sup>2</sup> because it is purely real, and its complex conjugate is itself.
=== Examples ===

It is very easy to see empirically that our expression does seem to be the thing that UHE converges on in the limit of large ''N''. Here are some examples for different values of ''s'' and ''a'', showing that as ''N'' increases, UHE converges on our analytically continued "zeta HE."

In all these examples, the left plot is a series of plots of UHE at different ''N'', with the zeta HE being a slightly thicker line in the background. The right plot is the largest ''N'' plotted against zeta HE, being almost a perfect line.

Note also that we don't plot the UHE directly, but rather {{nowrap|''e''<sup>(1 − ''a'')UHE<sub>''a''</sub></sup>}}, as described previously. The units have also been converted back to cents. Each HE function has been scaled so that the minimum entropy is 0 and the maximum entropy is 1.

Lastly, note that, due to our multiplication of UHE by {{nowrap|1 − ''a''}} above, the UHE would be flipped upside down relative to what we're used to, with higher values corresponding to more concordant intervals. In the pictures below we have flipped it back, for consistency with the earlier pictures.
==== ''s'' {{=}} 0.5%, ''a'' {{=}} 1.00001 ====

[[File:ExpUHE vs zeta s=0.5%.png|800px]]

==== ''s'' {{=}} 1%, ''a'' {{=}} 1.00001 ====

[[File:ExpUHE vs zeta s=1%.png|800px]]

==== ''s'' {{=}} 1.5%, ''a'' {{=}} 1.00001 ====

[[File:ExpUHE vs zeta s=1.5%.png|800px]]
Note that in all these plots, the value of ''a'' is chosen to be 1.00001 rather than exactly 1, so as to prevent the ({{nowrap|1 − ''a''}}) term from becoming 0. Similar results are seen for other choices of ''a'':

==== ''s'' {{=}} 1%, ''a'' {{=}} 2.2 ====

[[File:ExpUHE vs zeta s=1% a=2.2.png|800px]]

Note that you have to be careful if you choose to compute the analytically continued {{nowrap|''a'' {{=}} 2}} numerically: this corresponds to the vertical slice of the zeta function at {{nowrap|Re(''z'') {{=}} 1}}, where there is a pole.
=== Apparent equivalence of exp-UHE and UHE for ''a'' ≤ 2 ===

''Note: this section is for future research; some of it needs to be put on more rigorous footing, but we've left it as it's certainly interesting.''
To simplify notation, let U(''c'') stand for the sum of unnormalized probabilities raised to the ''a''th power, {{nowrap|(''S''<sup>''a''</sup> ∗ ''K''<sup>''a''</sup>)(−''c'')}}, so that

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log U(c)$$

Now let's consider the auxiliary function {{nowrap|Ũ(''c'') {{=}} U(''c'') − U(0)}}, which gives us a "shifted" version of U(''c'') whose value at the unison 1/1 is normalized to 0. Then we can re-express the UHE as follows:

$$\displaystyle \text{UHE}_a(c) = \frac{1}{1-a} \log \left(U(0) + \tilde{U}(c) \right)$$

Lastly, suppose we only care about the entropy function up to a vertical shift and scaling: in other words, we want to declare two functions ''f''(''x'') and ''g''(''x'') to be '''linearly equivalent''', and write {{nowrap|''f''(''x'') ≈ ''g''(''x'')}}, if for some constants ''m'', ''b'' that don't depend on ''x'', we have {{nowrap|''f''(''x'') {{=}} ''m g''(''x'') + ''b''}}. This means we want to view two entropy functions as equivalent if one is just a scaled and shifted version of the other, so that when "normalizing" them (so that the entropy goes from 0 to 1), we get identical functions. Then we have all of the following relationships:

$$\displaystyle \text{UHE}_a(c) \approx \log U(c) \approx \log \left( U(c)^{\frac{1}{1-a}} \right)$$

$$U(c) \approx \tilde{U}(c)$$

where we have just dropped the constant factor of {{sfrac|1|1 − ''a''}} and the constant vertical shift of U(0), which doesn't depend on ''c''.

Now, the main thing is that, if we are in the region where {{nowrap|''a'' ≤ 2}}, then this is also the region where the U(0) term goes to infinity as ''N'' increases: the entropy doesn't converge. And in general, we have the asymptotic expansion
$$\displaystyle \log (k + x) = \log k + \frac{x}{k} - \frac{x^2}{2k^2} + \cdots$$
and, for large ''k'', '''as long as''' {{nowrap|''x'' ≪ ''k''}}, the higher-order terms become negligible. This means, for all ''c'', we would need to show that {{nowrap|Ũ(''c'') ≪ U(0)}} as {{nowrap|''N'' → ∞}}. We would then be able to rewrite the above as

$$\displaystyle \text{UHE}_a(c) \sim \frac{1}{1-a} \left (\log (U(0)) + \frac{\tilde{U}(c)}{U(0)} \right)$$

This means that, as {{nowrap|''N'' → ∞}}, we would also get the following linear equivalence:

$$\displaystyle \text{UHE}_a(c) \approx \tilde{U}(c)$$

Putting this with our earlier result that {{nowrap|U(''c'') ≈ Ũ(''c'')}}, we get

$$\displaystyle \text{UHE}_a(c) \approx U(c) \approx \log U(c)$$
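The key approximation here, {{nowrap|log(''k'' + ''x'') ≈ log ''k'' + ''x''/''k''}} for {{nowrap|''x'' ≪ ''k''}}, is easy to check with illustrative numbers, with ''k'' standing in for the divergent U(0) and ''x'' for the bounded Ũ(''c''):

```python
from math import log

k = 1e6                        # stands in for U(0), which grows with N
for x in (0.0, 1.0, 50.0):     # stands in for U~(c), which stays bounded
    exact = log(k + x)
    linear = log(k) + x / k
    # the error is bounded by the next Taylor term, x^2 / (2 k^2)
    assert abs(exact - linear) <= (x / k) ** 2
```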
Now consider the exponential of the UHE itself. Exponentiating our definition of UHE<sub>''a''</sub> gives

$$\displaystyle \exp \left( \text{UHE}_a(c) \right) = U(c)^{\frac{1}{1-a}}$$

We can perform the same substitution of {{nowrap|U(''c'') {{=}} U(0) + Ũ(''c'')}} again to get

$$\displaystyle \exp \left( \text{UHE}_a(c) \right) = \left( U(0) + \tilde{U}(c) \right)^{\frac{1}{1-a}}$$
Given that {{nowrap|''a'' ≠ 1}}, we have a Taylor expansion of {{nowrap|(''k'' + ''x'')<sup>{{sfrac|1|1 − ''a''}}</sup>}} around {{nowrap|''x'' {{=}} 0}} of the form
$$\displaystyle (k + x)^{\frac{1}{1-a}} = k^{\frac{1}{1-a}} \left( 1 + \frac{x}{(1-a)k} + \cdots \right)$$
where we have factored out the leading term for clarity. Now, again, we have that if {{nowrap|''x'' ≪ ''k''}}—meaning that {{nowrap|Ũ(''c'') ≪ U(0)}}—then the higher-order terms become negligible, and we simply have the asymptotic relationship of
$$\displaystyle (k + x)^{\frac{1}{1-a}} \sim k^{\frac{1}{1-a}} \left( 1 + \frac{x}{(1-a)k} \right)$$
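As with the logarithm, the leading-order behavior is easy to check with illustrative numbers:

```python
a = 1.5                                   # any a != 1 works here
p = 1 / (1 - a)                           # the exponent 1/(1-a), here -2
k, x = 1e6, 50.0                          # k ~ U(0), x ~ U~(c), with x << k
exact = (k + x) ** p
linear = k ** p * (1 + x / ((1 - a) * k))
assert abs(exact / linear - 1) < 1e-7     # higher-order terms are negligible
```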
So that, overall, as {{nowrap|''N'' → ∞}}, the following equivalence holds:
$$\displaystyle \exp \left( \text{UHE}_a(c) \right) \approx \tilde{U}(c)$$
And since we have already shown that {{nowrap|UHE<sub>''a''</sub>(''c'') ≈ Ũ(''c'')}}, we have
$$\displaystyle \exp \left( \text{UHE}_a(c) \right) \approx \text{UHE}_a(c)$$
Now, the only missing piece needed for all of this is to show that we really do have {{nowrap|Ũ(''c'') ≪ U(0)}} in the region of interest. For now, absent mathematical proof, we will simply plot the behavior for the Shannon entropy as {{nowrap|''N'' → ∞}}.

What we see is that, while the function diverges, it diverges in a certain "uniform" sense. That is, as ''N'' increases, a constant vertical offset is added to U(''c''), so that the function blows up to infinity. However, if this vertical offset is corrected for, for example by subtracting U(0), the resulting curve doesn't seem to grow at all, but rather shrinks in height slightly until it seems to converge. We would like to prove this formally, but for now, we can at least see this from the following plot:

[[File:ExpUHE-asymptotic-growth.png|800px]]

In other words, we can see that as ''N'' increases, the growth rate of U(0) dwarfs that of Ũ(''c''), which does not seem to grow at all.
So, this is a fairly weak conjecture to make, given that empirical evidence suggests something much stronger—that not only does it grow more slowly, but that it seems to not grow at all—it converges! In particular, it seems to converge on our analytic continuation from before. However, a strict proof of any of these things would be nice.

Note again that this does not hold for {{nowrap|''a'' > 2}}, where the graph does display a very large difference between UHE and exp-UHE.

=== Why not normalized HE? ===

On the surface, everything we did with the convolution theorem, and subsequent analytic continuation, should appear to work for normalized HE as well. For example, let's review our result for the exp of unnormalized HE: