N2D3P9: Difference between revisions

Revision as of 07:16, 28 August 2020

[math]\displaystyle{ \text{N2D3P9} }[/math], or Entoo-Deethree-Peenine, is a fictional character in the Star Wars franchise. In an alternative timeline, the young Anakin Skywalker assembles the droid N2D3P9 from the parts of three other droids: R2D2, C3P0 and NR-N99. We're only joking, but we hope this helps with remembering and pronouncing the name.

[math]\displaystyle{ \text{N2D3P9} }[/math] is a mathematical function which was developed to help in designing the Sagittal microtonal notation. Given a pitch ratio [math]\displaystyle{ \frac{n}{d} }[/math], [math]\displaystyle{ \text{N2D3P9} }[/math] estimates its rank in popularity among all rational pitches in musical use. A low value of [math]\displaystyle{ \text{N2D3P9} }[/math] indicates that the ratio is used often, and so should have a simple accidental symbol, while a high value indicates that the ratio is used rarely and so can have a more complex symbol if necessary. It may also be useful in designing rational scales or tunings. The name "N2D3P9" is an abbreviation of key components of its formula, as described below.

Formula

Before describing how to calculate [math]\displaystyle{ \text{N2D3P9} }[/math], we define three simpler terms that are used in its formula:

2,3-free ratios, which are also known as "5-rough" ratios. Because factors of [math]\displaystyle{ 2 }[/math] and [math]\displaystyle{ 3 }[/math] in pitch ratios are already notated by changing octaves or moving along the chain of fifths (... B♭♭ F♭ C♭ G♭ D♭ A♭ E♭ B♭ F C G D A E B F♯ C♯ G♯ D♯ A♯ E♯ B♯ Fx ...), N2D3P9 only operates on ratios that have had their factors of [math]\displaystyle{ 2 }[/math] and [math]\displaystyle{ 3 }[/math] removed. For example, the following ratios contain various numbers of factors of [math]\displaystyle{ 2 }[/math] and [math]\displaystyle{ 3 }[/math]: [math]\displaystyle{ \frac{16}{15}, \frac{10}{9}, \frac{6}{5}, \frac{5}{4}, \frac{27}{20}, \frac{45}{32}, \frac{64}{45}, \frac{40}{27}, \frac{8}{5}, \frac{5}{3}, \frac{9}{5}, \frac{15}{8} }[/math]. When their factors of [math]\displaystyle{ 2 }[/math] and [math]\displaystyle{ 3 }[/math] are removed, they all reduce to [math]\displaystyle{ \frac{1}{5} }[/math] or [math]\displaystyle{ \frac{5}{1} }[/math], so they can all be notated using the same microtonal accidental, pointing either up or down, combined with different letters and sharps or flats. We say that [math]\displaystyle{ \frac{1}{5} }[/math] or [math]\displaystyle{ \frac{5}{1} }[/math] is the 2,3-removed or 2,3-free form of these pitch ratios. Because [math]\displaystyle{ \frac{1}{5} }[/math] and [math]\displaystyle{ \frac{5}{1} }[/math] use the same accidental pointing either up or down, and because N2D3P9 only operates on ratios whose numerator is larger than their denominator (superunison ratios), [math]\displaystyle{ \frac{5}{1} }[/math] can represent this entire 2,3-equivalent pitch ratio class, or 2,3-equivalence-class, for the purpose of notation design.
The copfr function, which stands for "Count Of Prime Factors with Repeats". It applies to any positive integer. For example, [math]\displaystyle{ 175 }[/math] has the prime factorization [math]\displaystyle{ 5 × 5 × 7 }[/math], which has 3 factors counting the repeated [math]\displaystyle{ 5 }[/math], so [math]\displaystyle{ \text{copfr}(175) = 3 }[/math]. By the same count, [math]\displaystyle{ \text{copfr}(1) = 0 }[/math]. [math]\displaystyle{ \text{copfr} }[/math] is also called the "big omega" function, [math]\displaystyle{ Ω }[/math].
The prime-limit function, also known as [math]\displaystyle{ \text{gpf} }[/math] (greatest prime factor). For example, [math]\displaystyle{ \text{prime-limit}(175) = 7 }[/math]. Some authors leave [math]\displaystyle{ \text{prime-limit}(1) }[/math] undefined; we avoid the question because we define [math]\displaystyle{ \text{N2D3P9}(\frac{1}{1}) }[/math] ≡ [math]\displaystyle{ \text{N2D3P9}(\frac{3}{1}) = 1 }[/math]. This is because the ratios in the equivalence class represented by the 2,3-removed [math]\displaystyle{ \frac{1}{1} }[/math] actually have a prime limit of 3.
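The three terms above can be sketched in Python as follows. This is only an illustration: the function names (two_three_free, prime_factors, copfr, prime_limit) are chosen here and are not taken from any Sagittal code, and trial division is simply the easiest factorization to read.

```python
from fractions import Fraction

def two_three_free(ratio):
    """Strip all factors of 2 and 3 and return the superunison form."""
    n, d = ratio.numerator, ratio.denominator
    for p in (2, 3):
        while n % p == 0:
            n //= p
        while d % p == 0:
            d //= p
    # Fraction already keeps n and d coprime; put the larger part on top.
    return Fraction(max(n, d), min(n, d))

def prime_factors(k):
    """Prime factors of a positive integer, with repeats (empty for 1)."""
    factors, p = [], 2
    while p * p <= k:
        while k % p == 0:
            factors.append(p)
            k //= p
        p += 1
    if k > 1:
        factors.append(k)
    return factors

def copfr(k):
    """Count of prime factors with repeats (big omega)."""
    return len(prime_factors(k))

def prime_limit(k):
    """Greatest prime factor; this sketch raises ValueError for k = 1."""
    return max(prime_factors(k))

examples = [Fraction(16, 15), Fraction(10, 9), Fraction(45, 32), Fraction(9, 5)]
print([two_three_free(r) for r in examples])  # every entry is Fraction(5, 1)
print(copfr(175), prime_limit(175))           # 3 7
```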

Now we can give the formula for [math]\displaystyle{ \text{N2D3P9} }[/math]([math]\displaystyle{ \frac{n}{d} }[/math]) as: $$ \begin{cases} \large{\text{N2D3P9}(\frac{n}{d})=\frac{n}{2^{\text{copfr}(n)}}×\frac{d}{3^{\text{copfr}(d)}}×\frac{\text{prime-limit}(nd)}{9}}, \\ \small{\text{ where }n\text{ and }d\text{ are 2,3-free positive integers and }n>d.} \\ \large{\text{N2D3P9}(\frac{1}{1})=1} \\ \end{cases} $$ Note that where

[math]\displaystyle{ n = 5^{n_5}×7^{n_7}×11^{n_{11}}×... }[/math]

we have

[math]\displaystyle{ 2^{\text{copfr}(n)}=2^{n_5}×2^{n_7}×2^{n_{11}}×... }[/math]

and so

[math]\displaystyle{ \frac{n}{2^{\text{copfr}(n)}}=(\frac{5}{2})^{n_5}×(\frac{7}{2})^{n_7}×(\frac{11}{2})^{n_{11}}×... }[/math]

and similarly

[math]\displaystyle{ \frac{d}{3^{\text{copfr}(d)}}=(\frac{5}{3})^{d_5}×(\frac{7}{3})^{d_7}×(\frac{11}{3})^{d_{11}}×... }[/math]

These can be described respectively as "product of half prime factors of the numerator (with repeats)" and "product of one-third prime factors of the denominator (with repeats)". So we can describe the procedure for calculating [math]\displaystyle{ \text{N2D3P9} }[/math]([math]\displaystyle{ \frac{n}{d} }[/math]) as:

Take the prime factorization of the numerator and divide all the primes by 2, then multiply it out again. Do the same with the denominator but divide the primes by 3 instead of 2. Multiply these two results together then multiply by the prime limit of the ratio and divide by 9.

[math]\displaystyle{ \text{N2D3P9} }[/math] can also be written as: $$\text{N2D3P9}(\frac{n}{d})=\frac{nd⋅\text{gpf}(nd)}{2^{Ω(n)}3^{Ω(d) + 2}} $$ where [math]\displaystyle{ nd }[/math] is established in music theory as a ratio's "product complexity" or Benedetti height.
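As a check on the two forms of the formula, here is a minimal Python sketch that computes N2D3P9 both by the step-by-step procedure and by the single-fraction form. The function names are illustrative, and the code assumes its inputs are already 2,3-free, coprime and superunison; it does not validate them.

```python
from fractions import Fraction

def prime_factors(k):
    """Prime factors of a positive integer, with repeats (empty for 1)."""
    factors, p = [], 2
    while p * p <= k:
        while k % p == 0:
            factors.append(p)
            k //= p
        p += 1
    if k > 1:
        factors.append(k)
    return factors

def n2d3p9(n, d=1):
    """Procedure form: halve numerator primes, third denominator primes."""
    if n == 1 and d == 1:
        return Fraction(1)  # defined separately in the formula
    value = Fraction(1)
    for p in prime_factors(n):
        value *= Fraction(p, 2)
    for p in prime_factors(d):
        value *= Fraction(p, 3)
    return value * Fraction(max(prime_factors(n * d)), 9)

def n2d3p9_closed_form(n, d=1):
    """Single-fraction form: n*d*gpf(nd) / (2^omega(n) * 3^(omega(d) + 2))."""
    omega_n = len(prime_factors(n))
    omega_d = len(prime_factors(d))
    gpf = max(prime_factors(n * d))
    return Fraction(n * d * gpf, 2 ** omega_n * 3 ** (omega_d + 2))

print(round(float(n2d3p9(77, 5)), 2))              # 39.21
print(n2d3p9(77, 5) == n2d3p9_closed_form(77, 5))  # True
```

Exact fractions are used rather than floats so that tied values, such as those of 77/5, 55/7 and 35/11 in the table, compare as equal.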

The division by 9 does not affect the ranking, but it has the convenient effect that [math]\displaystyle{ \text{N2D3P9} }[/math] values are almost the same as the ranks they produce when applied to all 2,3-free superunison ratios. Putting it another way, there are approximately [math]\displaystyle{ N }[/math] 2,3-free pitch ratios with [math]\displaystyle{ \text{N2D3P9}≤N }[/math]. For example, [math]\displaystyle{ \text{N2D3P9}(\frac{77}{5}) = \frac{7}{2} × \frac{11}{2} × \frac{5}{3} × \frac{11}{9} ≈ 39 }[/math], suggesting there are approximately 38 other 2,3-free pitch ratios more popular than [math]\displaystyle{ \frac{77}{5} }[/math]. There are actually about 4% fewer than that on average. In this case there are 36.
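The claim that N2D3P9 values approximate ranks can be checked by brute force. The sketch below counts the 2,3-free superunison ratios whose N2D3P9 is strictly lower than that of a given ratio. The names are illustrative, and the search limit is an assumption of this sketch rather than part of the original work: it relies on every prime p ≥ 5 satisfying p/2 ≥ p^(1 − log₅2), so that the numerator of any qualifying ratio cannot exceed (9X/5)^(1/(1 − log₅2)) for a target value X.

```python
from fractions import Fraction
from functools import lru_cache
from math import gcd, log

@lru_cache(maxsize=None)
def omega_and_gpf(k):
    """Return (prime factor count with repeats, greatest prime factor) of k."""
    count, greatest, p = 0, 1, 2
    while p * p <= k:
        while k % p == 0:
            count, greatest, k = count + 1, p, k // p
        p += 1
    if k > 1:
        count, greatest = count + 1, k
    return count, greatest

def n2d3p9(n, d=1):
    """N2D3P9 via the single-fraction form; inputs assumed 2,3-free and coprime."""
    if n == 1 and d == 1:
        return Fraction(1)
    omega_n, gpf_n = omega_and_gpf(n)
    omega_d, gpf_d = omega_and_gpf(d)
    return Fraction(n * d * max(gpf_n, gpf_d), 2 ** omega_n * 3 ** (omega_d + 2))

def count_lower_valued(n, d=1):
    """Count 2,3-free superunison ratios with N2D3P9 strictly below that of n/d."""
    target = n2d3p9(n, d)
    exponent = 1 - log(2) / log(5)  # assumed bound, see lead-in
    limit = int((9 * float(target) / 5) ** (1 / exponent)) + 1
    candidates = [k for k in range(1, limit + 1) if k % 2 and k % 3]
    count = 1 if target > 1 else 0  # the unison 1/1 has N2D3P9 = 1
    for i, num in enumerate(candidates):
        for den in candidates[:i]:  # only denominators smaller than num
            if gcd(num, den) == 1 and n2d3p9(num, den) < target:
                count += 1
    return count

print(count_lower_valued(77, 5))  # 36
```

The ratios 55/7 and 35/11 are not counted here because they tie exactly with 77/5.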

Justification

Why should we believe that [math]\displaystyle{ \text{N2D3P9} }[/math] accurately ranks the popularity of 2,3-equivalent pitch classes?

[math]\displaystyle{ \text{N2D3P9} }[/math] was developed (or discovered) rather late in the development of Sagittal notation. The Sagittal designers previously relied on actual ratio usage data from the Huygens-Fokker Foundation's scale archive, kindly provided by Manuel Op de Coul.

All scales in the archive were treated equally, as there was no information about their relative importance. Each occurrence of a pitch ratio in a scale was counted as one vote for that ratio. Then the ratios were grouped into 2,3-equivalent pitch classes and a single figure obtained for each 2,3-free superunison ratio (representing the class). There were 29,403 votes, allocated to 820 2,3-free ratios.
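The vote counting described above can be sketched as follows. The two scales are hypothetical stand-ins, not data from the archive, and the helper names are chosen here for illustration. Whether the original count treated a scale's unison in any special way is not stated, so this sketch simply counts every listed pitch once.

```python
from collections import Counter
from fractions import Fraction

def two_three_free_class(ratio):
    """Reduce a pitch ratio to its 2,3-free superunison representative."""
    n, d = ratio.numerator, ratio.denominator
    for p in (2, 3):
        while n % p == 0:
            n //= p
        while d % p == 0:
            d //= p
    return Fraction(max(n, d), min(n, d))

def tally_votes(scales):
    """One vote per occurrence of a pitch, grouped by 2,3-equivalence class."""
    votes = Counter()
    for scale in scales:
        for pitch in scale:
            votes[two_three_free_class(pitch)] += 1
    return votes

# Hypothetical scales, used only to show the grouping.
hypothetical_scales = [
    [Fraction(9, 8), Fraction(5, 4), Fraction(3, 2), Fraction(15, 8)],
    [Fraction(7, 6), Fraction(5, 3), Fraction(7, 4)],
]

votes = tally_votes(hypothetical_scales)
for ratio_class, count in votes.most_common():
    print(f"{ratio_class.numerator}/{ratio_class.denominator}: {count}")
```

In this toy input, 5/4, 15/8 and 5/3 all vote for the class 5/1, while 9/8 and 3/2 vote for 1/1.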

Like the frequency of use of letters in an alphabet, when sorted in order of decreasing popularity, the ratios obeyed an approximate Zipf's law distribution, with the Nth most popular ratio having votes proportional to approximately [math]\displaystyle{ \frac{1}{N^{1.37}} }[/math]. This meant that about half the ratios had only one vote each, and three quarters of them had 3 votes or fewer. Such low numbers of votes meant that the data on the less popular ratios was vulnerable to "historical noise". In other words, the position of such a ratio in the list might not be a good predictor of its relative frequency of use in the future.

In the early stages of Sagittal design, when allocating symbols for the most popular ratios, the designers could rely on the Scala archive data, but when they moved on to less popular ratios they needed some "less noisy" way to rank them.

Blumeyer and Keenan found that [math]\displaystyle{ \text{N2D3P9} }[/math] is a psychoacoustically plausible function of a ratio's prime factorization that:

ranks 10 of the 11 most popular ratios in exactly the same way as the archive data, and
ranks all 820 ratios in a way that has a low sum of squared errors in their ranks, relative to the archive data, and
is sufficiently simple, having only two parameters, that it cannot be overfitting the data, and should therefore serve to average out the historical noise in the ranking of the less popular ratios, including ratios that do not occur in the archive at all.

However, their approach was not able to consider all possible psychoacoustic reasons for a ratio's popularity. For example, [math]\displaystyle{ \text{N2D3P9} }[/math] does not evaluate whether some member of a 2,3-equivalence-class might be very close in pitch to some member of another 2,3-equivalence-class, such as [math]\displaystyle{ \frac{65}{64} }[/math] being very close to [math]\displaystyle{ \frac{1}{1} }[/math].

Development/Discovery

From May to August 2020, a collaborative effort to find such a function was carried out by members of the Sagittal forum, led by Sagittal co-creator Dave Keenan and Douglas Blumeyer. Many functions besides [math]\displaystyle{ \text{N2D3P9} }[/math] were considered before it was selected as the best function for its purpose.

Estimation of pitch ratio popularity is possible because it correlates with numeric simplicity. [math]\displaystyle{ \text{N2D3P9} }[/math] is most useful when comparing ranks of more complex ratios, because usage data about such ratios is sparse. By fitting a function to the statistical usage data which is available for simpler ratios, [math]\displaystyle{ \text{N2D3P9} }[/math] enables the extension of the patterns found in these simpler ratios.

Rather than attempting to fit functions to the exact counts of votes for each ratio, the functions were fit to the rank indices of each ratio; in other words, a function only needed to sort the ratios in the same order as the actual data, and it was unimportant how close its estimate of the vote count at each rank position was. In technical parlance, the goal was to maximize Spearman's rank correlation coefficient between the estimated ranks and the actual ranks. For purposes of comparing competing functions, maximizing this coefficient could be simplified to minimizing the sum of squared differences between the ranks. But because fitting to the simpler ratios, which had more votes, is more important, a Zipf's-law weighting was applied to the ranks by taking their reciprocals before calculating their squared differences. A fractional ranking strategy was used to ensure that stretches of the data with tied vote counts did not distort the measurement.
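The error measure described above can be sketched in Python. This is one reading of the description, not the project's actual code: the function names are illustrative, and the weighting is taken literally as "reciprocal of each rank, then squared difference". The sample data is the first ten rows of the table on this page.

```python
def fractional_ranks(values, descending=False):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def weighted_rank_error(estimated_ranks, actual_ranks):
    """Sum of squared differences between reciprocal ranks (assumed weighting)."""
    return sum((1 / e - 1 / a) ** 2 for e, a in zip(estimated_ranks, actual_ranks))

# First ten rows of the table: N2D3P9 values and Scala archive occurrences.
n2d3p9_values = [1, 1.39, 2.72, 3.47, 4.54, 6.72, 6.81, 8.68, 9.39, 9.53]
occurrences = [7624, 5371, 3016, 1610, 1318, 1002, 875, 492, 447, 463]

estimated = fractional_ranks(n2d3p9_values)               # low value = popular
actual = fractional_ranks(occurrences, descending=True)   # many votes = popular
print(weighted_rank_error(estimated, actual))
```

Only the ninth and tenth rows disagree in this sample, and because their reciprocal ranks are small, the swap contributes very little to the total; the same swap near the top of the list would cost far more.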

The overall strategy, then, was to minimize this weighted rank error while also minimizing the complexity of the function, to avoid overfitting. An earlier notational popularity ranking function for 2,3-removed ratios that had been used by the creators of Sagittal was [math]\displaystyle{ \text{sopfr} }[/math] (sum of prime factors with repetition). It does a remarkably good job of estimating the rank of pitch ratios given how simple it is. However, the weighted sum of squared errors that [math]\displaystyle{ \text{sopfr} }[/math] gives for the Scala stats is about 0.026, while [math]\displaystyle{ \text{N2D3P9} }[/math] reduces that to about 0.010. Functions giving sums of squares as low as 0.008 were found; however, these functions were so complex that they were probably fitting to noise in the Scala stats instead of to the true nature of musical pitch. An informal “chunk” metric was devised to compare function complexity in terms of ability to fit to the data, with considered functions ranging from one chunk ([math]\displaystyle{ \text{sopfr} }[/math]) to eight chunks; the winning function [math]\displaystyle{ \text{N2D3P9} }[/math] has five chunks.

Several techniques were used to find and decide on [math]\displaystyle{ \text{N2D3P9} }[/math] as the best 2,3-removed-ratio notational-popularity rank-estimation function. The investigation began from observed shortcomings of [math]\displaystyle{ \text{sopfr} }[/math]: it fails to differentiate balanced ratios from their imbalanced equivalents, such as [math]\displaystyle{ \frac{11}{5} }[/math] versus [math]\displaystyle{ \frac{55}{1} }[/math], and it fails to differentiate ratios with different prime limits, such as [math]\displaystyle{ \frac{13}{5} }[/math] and [math]\displaystyle{ \frac{11}{7} }[/math], even though those pairs of ratios have remarkably different actual ranks in the Scala stats. Psychoacoustic plausibility of functions was used as a top-down guide for experimentation. Optimization tools such as Excel's Evolutionary Solver were used to navigate toward ideal values for each parameter. The approach that was finally successful was a brute-force approach implemented by Douglas Blumeyer, whereby nearly 2 billion functions combined out of constituent "submetrics" were checked automatically. In the end, one of the functions on the short-list generated from the brute-force checker was recognized as being re-writable in a much simpler form with parameter values rounded to whole numbers without doing much damage to its sum-of-squares, and thus [math]\displaystyle{ \text{N2D3P9} }[/math] was born.
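The shortcomings of sopfr mentioned above are easy to reproduce. The sketch below assumes that sopfr of a ratio means the sum of the prime factors of both numerator and denominator; the helper name ratio_sopfr is introduced here for illustration only.

```python
def sopfr(k):
    """Sum of the prime factors of a positive integer, with repetition."""
    total, p = 0, 2
    while p * p <= k:
        while k % p == 0:
            total += p
            k //= p
        p += 1
    return total + (k if k > 1 else 0)

def ratio_sopfr(n, d=1):
    """Assumed extension to ratios: add the sopfr of numerator and denominator."""
    return sopfr(n) + sopfr(d)

# Pairs that sopfr cannot tell apart.
for first, second in [((11, 5), (55, 1)), ((13, 5), (11, 7))]:
    print(first, ratio_sopfr(*first), second, ratio_sopfr(*second))
```

Both members of each pair receive the same sopfr (16 for the first pair, 18 for the second), whereas the table on this page gives them different N2D3P9 values (11.2 versus 16.81, and 15.65 versus 15.69) and different Scala archive ranks (11 versus 24, and 16 versus 12).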

After deciding upon [math]\displaystyle{ \text{N2D3P9} }[/math], the Sagittal forum members checked the ratios for the existing Sagittal symbols against it, to see how well they'd been served by the Scala archive stats and the earlier [math]\displaystyle{ \text{sopfr} }[/math] metric. Each symbol in Sagittal's JI notations has a default value, or primary comma, which allows it to exactly notate ratios in a 2,3-equivalence-class, and based on [math]\displaystyle{ \text{N2D3P9} }[/math], it was found that only a couple of these commas should be changed (these were among the rarest-used symbols in Sagittal). This was as expected; [math]\displaystyle{ \text{N2D3P9} }[/math] was developed primarily in order to add new symbols to Sagittal, to enable it to exactly notate even rarer JI pitches than it already does.

Table of Top 100 (2,3-equivalent) Pitch Ratio Classes by N2D3P9


2,3-equivalent pitch ratio class	N2D3P9	N2D3P9 rank	Scala archive rank	Scala archive occurrences
1/1	1	1	1	7624
5/1	1.39	2	2	5371
7/1	2.72	3	3	3016
25/1	3.47	4	4	1610
7/5	4.54	5	5	1318
11/1	6.72	6	6	1002
35/1	6.81	7	7	875
125/1	8.68	8	8	492
13/1	9.39	9	10	447
49/1	9.53	10	9	463
11/5	11.2	11	11	339
25/7	11.34	12	14	312
13/5	15.65	13	16	205
11/7	15.69	14	12	324
49/5	15.88	15	15	246
17/1	16.06	16	13	318
55/1	16.81	17	24	119
175/1	17.01	18	17	168
19/1	20.06	19	18	166
625/1	21.7	20	21	143
13/7	21.91	21	20	145
65/1	23.47	22	50	40
77/1	23.53	23	25	111
245/1	23.82	24	19	165
49/25	26.47	25	23	134
17/5	26.76	26	26	108
25/11	28.01	27	47	42
125/7	28.36	28	33	62
23/1	29.39	29	22	136
91/1	32.86	30	57	30
343/1	33.35	31	31	70
19/5	33.43	32	27	97
13/11	34.43	33	29	89
121/1	36.97	34	42.5	46
17/7	37.46	35	40	50
25/13	39.12	36	52.5	34
77/5	39.21	38	28	92
55/7	39.21	38	34	61
35/11	39.21	38	35.5	55
85/1	40.14	40	78	20
275/1	42.01	41	147	7
875/1	42.53	42	76	21
29/1	46.72	43	32	67
19/7	46.8	44	37.5	52
23/5	48.98	45	44	45
95/1	50.14	46	72	23
143/1	51.64	47	66	26
31/1	53.39	48	30	80
3125/1	54.25	49	52.5	34
91/5	54.77	51	68	25
65/7	54.77	51	102.5	11
35/13	54.77	51	102.5	11
49/11	54.9	53	54	33
343/5	55.58	54	55.5	31
119/1	56.19	55	252.5	3
325/1	58.68	56	604.5	1
385/1	58.82	57	37.5	52
17/11	58.87	58	35.5	55
1225/1	59.55	59	41	47
169/1	61.03	60	86	14
121/5	61.62	61	147	7
77/25	65.35	62	63	27
125/49	66.17	63	63	27
25/17	66.9	64	134.5	8
23/7	68.57	65	47	42
17/13	69.57	66	47	42
125/11	70.02	67	147	7
133/1	70.19	68	329	2
625/7	70.89	69	113	10
115/1	73.47	70	604.5	1
19/11	73.54	71	55.5	31
37/1	76.06	72	42.5	46
49/13	76.68	73	147	7
29/5	77.87	74	59.5	28
455/1	82.15	75	186.5	5
539/1	82.35	76	186.5	5
1715/1	83.37	77	76	21
25/19	83.56	78	79.5	19
143/5	86.06	80	217.5	4
65/11	86.06	80	164	6
55/13	86.06	80	90	13
121/7	86.27	82	164	6
19/13	86.91	83	45	44
187/1	88.31	84	217.5	4
31/5	88.98	85	68	25
91/25	91.28	86	329	2
55/49	91.5	87	39	51
605/1	92.43	88	94.5	12
343/25	92.63	89	68	25
41/1	93.39	90	72	23
119/5	93.66	92	123.5	9
85/7	93.66	92	-	0
35/17	93.66	92	217.5	4
125/13	97.8	94	604.5	1
275/7	98.03	95.5	102.5	11
175/11	98.03	95.5	147	7
425/1	100.35	97	329	2
169/5	101.71	98	329	2
121/25	102.7	99	217.5	4
43/1	102.72	100	58	29
