User:Frostburn/Theory From First Principles

From Xenharmonic Wiki
Revision as of 03:54, 24 November 2023 by Frostburn (talk | contribs) (Various tweaks, define the JIP and make a bad pun.)

Just using Xen Wiki as a notepad, don't mind me.

I'm currently working on the grammar for Scale Workshop 3. It will naturally include monzos so I'm writing stuff down as a thinking aid. I already implemented vals because of these notes.

Time Domain

We begin our journey in the time domain, where one second (1 s) passes for every 9192631770 oscillations of the radiation emitted by caesium-133 during the unperturbed ground-state hyperfine transition.

Frequency Domain

We invert time to arrive in the frequency domain, where oscillations are measured in repetitions per second, i.e. hertz (Hz = s⁻¹).

Scalar Domain

Frequencies are scalar multiples of each other. Positive rational scalars are of particular interest in music, such as the ratio 3/2 between 300 Hz and 200 Hz, the just perfect fifth.

Pitch Domain

When we take the logarithm of a positive rational scalar its factors separate into a sum e.g. [math]\displaystyle{ \log(15/8) = \log(3) + \log(5) - 3\log(2) }[/math].

Adding Geometry

By the fundamental theorem of arithmetic, the logarithms of primes are linearly independent over [math]\displaystyle{ \mathbb{Q} }[/math], so we can interpret [math]\displaystyle{ \log(2), \log(3), \ldots }[/math] as orthogonal basis vectors. We write [math]\displaystyle{ e_p }[/math] in place of [math]\displaystyle{ \log(p) }[/math] and enforce orthogonality:

[math]\displaystyle{ e_p \cdot e_q = 0, p \ne q, p, q \in \mathbb{P} }[/math]

To make things slightly more formal we define the right-facing arrow function

[math]\displaystyle{ \overrightarrow{2^x 3^y 5^z \ldots} \mapsto x e_2 + y e_3 + z e_5 \ldots, x, y, z \in \mathbb{Q} }[/math]

which takes objects from the scalar domain to the geometric pitch domain.

We denote the inverse of the arrow function with [math]\displaystyle{ \mathrm{ratio} }[/math] i.e. it turns prime count vectors into ratios:

[math]\displaystyle{ \mathrm{ratio}(\overrightarrow{p/q}) = p/q }[/math]
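As a minimal sketch of the arrow function and its inverse, the Python below factors a positive rational into prime exponents and multiplies them back together. The names `arrow`, `ratio` and `PRIMES` are made up for this illustration, the prime list is an assumed 13-limit, and only integer exponents are handled even though the definition above allows rational ones.

```python
from fractions import Fraction

PRIMES = (2, 3, 5, 7, 11, 13)  # assumed prime limit for this sketch


def arrow(value):
    """Map a positive rational to its prime exponent vector (monzo)."""
    value = Fraction(value)
    if value <= 0:
        raise ValueError("only positive rationals have a monzo")
    numerator, denominator = value.numerator, value.denominator
    exponents = []
    for p in PRIMES:
        count = 0
        while numerator % p == 0:
            numerator //= p
            count += 1
        while denominator % p == 0:
            denominator //= p
            count -= 1
        exponents.append(count)
    if numerator != 1 or denominator != 1:
        raise ValueError("value has prime factors outside PRIMES")
    return exponents


def ratio(monzo):
    """Inverse of arrow: multiply the primes raised to their exponents."""
    result = Fraction(1)
    for p, exponent in zip(PRIMES, monzo):
        result *= Fraction(p) ** exponent
    return result


print(arrow(Fraction(15, 8)))      # [-3, 1, 1, 0, 0, 0]
print(ratio([-3, 1, 1, 0, 0, 0]))  # 15/8
```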

Pitch is measured in cents (¢) which we define to be the vector quantity [math]\displaystyle{ ¢ := e_2 / 1200 }[/math] i.e. [math]\displaystyle{ \mathrm{ratio}(¢) = 2^{\frac{1}{1200}} \approx 1.0005777895 }[/math] .

We also define the backslash function [math]\displaystyle{ \backslash d \mapsto e_2 / d }[/math] .

In combination with implicit scalar multiplication, the similarity with Scale Workshop's N-of-EDO notation is unmistakable, e.g. [math]\displaystyle{ 7 \backslash 12 = 700 ¢ }[/math] .
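The cent and backslash definitions can be checked numerically. This sketch assumes a pitch along the octave direction is stored as a plain number of cents. The helper names are hypothetical and not taken from Scale Workshop.

```python
import math

OCTAVE_IN_CENTS = 1200.0


def backslash(steps, divisions):
    """Size in cents of `steps` steps of `divisions` equal parts of the octave."""
    return steps * OCTAVE_IN_CENTS / divisions


def cents_to_ratio(cents):
    """Frequency ratio of an interval given in cents."""
    return 2.0 ** (cents / OCTAVE_IN_CENTS)


def ratio_to_cents(value):
    """Size in cents of a frequency ratio."""
    return OCTAVE_IN_CENTS * math.log2(value)


print(backslash(7, 12))     # 700.0
print(cents_to_ratio(1.0))  # roughly 1.0005777895
print(ratio_to_cents(1.5))  # roughly 701.955, the just perfect fifth
```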

Expanding geometry

It's instructive to construct [math]\displaystyle{ \mathrm{ratio} }[/math] more explicitly by considering the inverses of the basis vectors

[math]\displaystyle{ \begin{align} e^p :&= e_p ^ {-1},\\ e_p \cdot e^p &= 1 \end{align} }[/math]

Note that we're leaving the metric unspecified i.e. what [math]\displaystyle{ e_p \cdot e_p }[/math] equals. For the structure of the space we're constructing a positive metric suffices i.e. [math]\displaystyle{ e_p \cdot e_p = w_p^2 > 0, w_p \in \mathbb{R} }[/math]. We now have:

[math]\displaystyle{ \mathrm{ratio}(x) = \prod_{p \in \mathbb{P}} p^{x \cdot e^p} }[/math]

i.e. the superscript vectors act as measuring sticks telling us how much of each prime there is in a vector.

Let's coin a new unit called jorp (think Europe) [math]\displaystyle{ € = ¢^{-1} = 1200 \cdot \overrightarrow{2}^{-1} }[/math] which measures intervals with a resolution of 1200 ticks per octave.

One conceptually important measuring stick is the one constructed from the logarithms of the primes:

[math]\displaystyle{ \mathsf{JIP} = \sum_{p \in \mathbb{P}} e^p \log p }[/math]

I say conceptually because this just intonation point violates linear independence over [math]\displaystyle{ \mathbb{Q} }[/math] and doesn't strictly belong in the space we're constructing. Anyway:

[math]\displaystyle{ \begin{align} \mathsf{JIP} \cdot x &= \log(\mathrm{ratio}(x))\\ \mathrm{ratio}(x) &= \exp(\mathsf{JIP} \cdot x). \end{align} }[/math]
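A quick numerical check of these two identities, as a sketch restricted to an assumed 5-limit basis. The `dot` helper simply pairs the components of a measuring stick with those of a prime count vector.

```python
import math

PRIMES = (2, 3, 5)                   # assumed basis for this sketch
JIP = [math.log(p) for p in PRIMES]  # natural logarithms of the primes


def dot(covector, vector):
    """Pair a measuring stick with a prime count vector."""
    return sum(c * v for c, v in zip(covector, vector))


monzo = [-3, 1, 1]  # prime count vector of 15/8
log_ratio = dot(JIP, monzo)

print(log_ratio, math.log(15 / 8))  # the two values agree up to rounding
print(math.exp(log_ratio))          # roughly 1.875
```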

However, rational approximations to the [math]\displaystyle{ \mathsf{JIP} }[/math] do belong in our space. Equal temperaments can be represented by such approximations, called vals, which we define as

[math]\displaystyle{ \mathrm{val}(n; a, b, \ldots, z) := n \overrightarrow{a}^{-1} + \lfloor n \log_a(b) + \frac{1}{2}\rfloor \overrightarrow{b}^{-1} + \ldots + \lfloor n \log_a(z) + \frac{1}{2}\rfloor \overrightarrow{z}^{-1} }[/math] .

Usually the basis is obvious from context e.g. [math]\displaystyle{ a = 2, b = 3, c = 5 }[/math]. In these cases we use a left-facing arrow e.g.

[math]\displaystyle{ \overleftarrow{12} := \mathrm{val}(12; 2, 3, 5) = 12 e^2 + 19 e^3 + 28 e^5 =: \langle 12, 19, 28 \rbrack }[/math]

We can use these new objects to calculate how many steps of 12edo a tempered interval spans e.g.

[math]\displaystyle{ \overleftarrow{12} \cdot \overrightarrow{15/8} = \langle 12, 19, 28 \vert -3, 1, 1 \rangle = 11 }[/math]

The actual pitch is obtained by sandwiching the interval between the val and the step size:

[math]\displaystyle{ \overleftarrow{12} \cdot \overrightarrow{15/8} \backslash 12 = 1100 ¢ }[/math] .
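The val definition and the 12edo example can be reproduced with a few lines of Python. This is only a sketch: `val` returns the step counts for the given basis elements in order, and the function names are illustrative rather than taken from Scale Workshop.

```python
import math


def val(n, basis):
    """Round n * log_a(b) to the nearest integer for each basis element b.

    The first basis element a is the equave and always maps to n steps.
    """
    a = basis[0]
    return [n] + [math.floor(n * math.log(b, a) + 0.5) for b in basis[1:]]


def dot(covector, vector):
    """Pair a val with a prime count vector."""
    return sum(c * v for c, v in zip(covector, vector))


val_12 = val(12, (2, 3, 5))
major_seventh = [-3, 1, 1]  # prime count vector of 15/8
steps = dot(val_12, major_seventh)
cents = steps * 1200 / 12   # steps \ 12 expressed in cents

print(val_12, steps, cents)  # [12, 19, 28] 11 1100.0
```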

The geometric inverses are mainly relevant for subgroup temperaments. Consider Barbados:

[math]\displaystyle{ \overleftarrow{5} := \mathrm{val}(5; 2, 3, 13/5) = 5 \cdot \overrightarrow{2}^{-1} + 8 \cdot \overrightarrow{3}^{-1} + 7 \cdot \overrightarrow{13/5}^{-1} = 5 e^2 + 8 e^3 - \frac{7}{2}e^5 + \frac{7}{2}e^{13} }[/math]

We can verify that the comma 676/675 indeed vanishes using this val:

[math]\displaystyle{ \overleftarrow{5} \cdot \overrightarrow{676/675} = \langle 5, 8, -\frac{7}{2}, 0, 0, \frac{7}{2} \vert 2, -3, -2, 0, 0, 2 \rangle = 0 }[/math]

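Note that above I've implicitly used a convenient metric to carry out the calculations, which is fine due to the new basis still being orthogonal. Explicitly we'd have [math]\displaystyle{ \overrightarrow{13/5}^{-1} = (w_{13} \hat{m} - w_5 \hat{k}) / (w_5^2 + w_{13}^2) }[/math], where [math]\displaystyle{ \hat{k} }[/math] and [math]\displaystyle{ \hat{m} }[/math] are the unit directions of [math]\displaystyle{ e_5 }[/math] and [math]\displaystyle{ e_{13} }[/math], and the weights decide how much 13/5 "leans" toward 5/1 or 13/1.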
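The Barbados calculation can be redone with exact fractions. The sketch below assumes the convenient unit metric (every weight equal to 1), so the geometric inverse of a vector is the vector divided by its squared length. All names are illustrative.

```python
from fractions import Fraction

# Coordinates are exponents of the primes 2, 3, 5, 7, 11, 13.
E2 = [1, 0, 0, 0, 0, 0]
E3 = [0, 1, 0, 0, 0, 0]
THIRTEEN_OVER_FIVE = [0, 0, -1, 0, 0, 1]  # prime count vector of 13/5


def inverse(vector):
    """Geometric inverse v / (v . v), assuming the unit metric w_p = 1."""
    norm_squared = sum(c * c for c in vector)
    return [Fraction(c, norm_squared) for c in vector]


def dot(covector, vector):
    """Pair a val with a prime count vector."""
    return sum(c * v for c, v in zip(covector, vector))


# val(5; 2, 3, 13/5) maps the three basis elements to 5, 8 and 7 steps.
barbados = [
    5 * a + 8 * b + 7 * c
    for a, b, c in zip(inverse(E2), inverse(E3), inverse(THIRTEEN_OVER_FIVE))
]
comma = [2, -3, -2, 0, 0, 2]  # prime count vector of 676/675

print(barbados)
print(dot(barbados, comma))  # 0
```

Pairing `barbados` with `THIRTEEN_OVER_FIVE` gives 7, so the fractional components still count whole steps on the subgroup's own basis.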

On units

Scalars do not have units. That's what makes them scalars. Do pitches have units? Maybe they are like radians: unitless, yet it makes no sense to add them to other kinds of objects. Whatever the case may be, prime count vectors (i.e. monzos) have inverse units to vals. This should be enough to distinguish them during SW3 runtime and prevent vals from being interpreted as pitch or turned into frequencies.

Taking these considerations more seriously and remembering that cents are a vector quantity, we can try to figure out what units vals have: One cent is one hundredth of a semitone and one octave consists of twelve of these semitones. All vector quantities. Let's call the dimensionless version of a semitone a demitone. To reiterate: A cent is 1/100 demitones in the direction of [math]\displaystyle{ e_2 }[/math]. Let's call [math]\displaystyle{ \hat{i} }[/math] the direction of [math]\displaystyle{ e_2 }[/math] i.e. [math]\displaystyle{ e_2 = w_2 \hat{i} = 12 d \hat{i} }[/math], where [math]\displaystyle{ d }[/math] is the metric weight of a demitone. The basis vector itself has unit metric [math]\displaystyle{ \hat{i} \cdot \hat{i} = 1 }[/math].

A reciprocal cent satisfies [math]\displaystyle{ ¢^{-1} \cdot ¢ = 1 }[/math], so as per the usual definition of the geometric inverse of a vector we have [math]\displaystyle{ ¢^{-1} = ¢ / (¢ \cdot ¢) = \frac{1}{1200}e_2 / \left(\left(\frac{1}{1200}\right)^2 e_2 \cdot e_2\right) = 1200 w_2 \hat{i} / (w_2^2 \hat{i} \cdot \hat{i}) = \frac{1200}{w_2}\hat{i} }[/math].

Now we have [math]\displaystyle{ e^2 = e_2 / w_2^2 = \hat{i} / w_2 }[/math]. A reciprocal cent can now be expressed as [math]\displaystyle{ ¢^{-1} = 1200 e^2 = 100 d^{-1} \hat{i} }[/math] or 100 reciprocal demitones in the [math]\displaystyle{ \hat{i} }[/math] direction.
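The unit bookkeeping can be sanity-checked with exact arithmetic. In this sketch every vector points along the octave direction, so a single coefficient per vector is enough. The demitone weight `d` is an arbitrary positive value picked only for the check.

```python
from fractions import Fraction

d = Fraction(3, 7)  # arbitrary positive demitone weight (assumption)
w2 = 12 * d         # metric weight of the octave: e_2 = w2 * i_hat

# Coefficients along i_hat, using i_hat . i_hat = 1.
cent = w2 / 1200                     # the cent, e_2 / 1200
cent_inverse = cent / (cent * cent)  # geometric inverse v / (v . v)
e2_upper = w2 / (w2 * w2)            # e^2 = e_2 / w2^2

print(cent_inverse * cent)                           # 1
print(cent_inverse == 1200 * e2_upper == 100 / d)    # True
```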

Clifford algebra nonsense

Both 12edo and 7edo temper out the syntonic comma: [math]\displaystyle{ \overleftarrow{12} \cdot \overrightarrow{81/80} = 0 = \overleftarrow{7} \cdot \overrightarrow{81/80} }[/math] .

Therefore, so does any linear combination of them, e.g. [math]\displaystyle{ 2 \cdot \overleftarrow{12} + \overleftarrow{7} = \overleftarrow{31} }[/math]
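A small check of this claim, with the 5-limit vals written out by hand. The variable names are only for illustration.

```python
def dot(covector, vector):
    """Pair a val with a prime count vector."""
    return sum(c * v for c, v in zip(covector, vector))


syntonic_comma = [-4, 4, -1]  # prime count vector of 81/80
val_12 = [12, 19, 28]
val_7 = [7, 11, 16]

# Linear combination 2 * <12| + <7|, taken component by component.
val_31 = [2 * a + b for a, b in zip(val_12, val_7)]

print(val_31)  # [31, 49, 72]
print([dot(v, syntonic_comma) for v in (val_12, val_7, val_31)])  # [0, 0, 0]
```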

We can identify the plane spanned by [math]\displaystyle{ \overleftarrow{12} }[/math] and [math]\displaystyle{ \overleftarrow{7} }[/math] as the (5-limit) Meantone temperament. We can use wedges to represent it symbolically:

[math]\displaystyle{ \overleftarrow{12} \wedge \overleftarrow{7} = -4 e^3 \wedge e^5 + 4 e^5 \wedge e^2 - e^2 \wedge e^3 }[/math] ,

where the components are basis planes. E.g. [math]\displaystyle{ e^3 \wedge e^5 }[/math] is the plane where octaves are tempered out. The wedge of any vector with itself is zero i.e. you can't span a plane with only one direction. The wedge product is also antisymmetric and the planes come with signed weights but we mostly care about the orientation they represent. Note that [math]\displaystyle{ e^2 }[/math] represents the "line" where tritaves and pentaves are tempered out.
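With three basis vectors, the wedge of two vals has the same components as a cross product. The sketch below returns the coefficients in the basis plane order used above. The function name `wedge` is an illustrative choice.

```python
def wedge(u, v):
    """Coefficients of u ^ v on the basis planes (e^3^e^5, e^5^e^2, e^2^e^3).

    u and v hold the (e^2, e^3, e^5) components of two vals.
    """
    return [
        u[1] * v[2] - u[2] * v[1],  # e^3 ^ e^5
        u[2] * v[0] - u[0] * v[2],  # e^5 ^ e^2
        u[0] * v[1] - u[1] * v[0],  # e^2 ^ e^3
    ]


val_12 = [12, 19, 28]
val_7 = [7, 11, 16]

print(wedge(val_12, val_7))   # [-4, 4, -1]
print(wedge(val_12, val_12))  # [0, 0, 0], a vector wedged with itself
print(wedge(val_7, val_12))   # [4, -4, 1], antisymmetry flips the signs
```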

The largest possible wedge combines all of the basis vectors and represents just intonation i.e. no tempering whatsoever: [math]\displaystyle{ e^2 \wedge e^3 \wedge e^5 }[/math].

We can define the dual operator:

[math]\displaystyle{ \begin{align} \overline{e_2} &= e^3 \wedge e^5 \\ \overline{e_3} &= e^5 \wedge e^2 \\ \overline{e_5} &= e^2 \wedge e^3 , \\ \end{align} }[/math]

which is extended linearly to all vectors. It is obvious from the definitions that the Meantone temperament is represented by [math]\displaystyle{ \overline{\overrightarrow{81/80}} }[/math].

We define the vee operator [math]\displaystyle{ a \vee b := \overline{\overline{b} \wedge \overline{a}} }[/math], where the inner overlines are the inverses of the dual (always obvious from context). Wedges combine vals into temperaments that are closer to just intonation, while vees do progressive damage to just intonation. As an exercise, let's calculate the damage done by combining Meantone with Augmented:

[math]\displaystyle{ \begin{align} & \overline{\overrightarrow{81/80}} \vee \overline{\overrightarrow{128/125}} \\ & = \overline{-4 e_2 + 4 e_3 - e_5 } \vee \overline{7 e_2 - 3 e_5} \\ & = \overline{(7 e_2 - 3 e_5) \wedge (-4 e_2 + 4 e_3 - e_5)} \\ & = \overline{28 e_2 \wedge e_3 - 7 e_2 \wedge e_5 + 12 e_5 \wedge e_2 - 12 e_5 \wedge e_3} \\ & = \overline{12 e_3 \wedge e_5 + 19 e_5 \wedge e_2 + 28 e_2 \wedge e_3} \\ & = 12 e^2 + 19 e^3 + 28 e^5 \\ & = \overleftarrow{12} \end{align} }[/math]

Neat!
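The same exercise can be run numerically. In this sketch a temperament given as the dual of a comma is stored as that comma's prime count vector, so undoing the inner duals is a no-op and the outer dual just reads the plane coefficients off as val components. That shortcut is a simplification made here and only works with three primes.

```python
def wedge(u, v):
    """Coefficients of u ^ v on the planes (e_3^e_5, e_5^e_2, e_2^e_3)."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def vee(a, b):
    """a v b for temperaments stored as their commas: dual of (b ^ a)."""
    # The dual maps the planes e_3^e_5, e_5^e_2, e_2^e_3 to e^2, e^3, e^5,
    # so the coefficient list already holds the val components.
    return wedge(b, a)


meantone = [-4, 4, -1]  # prime count vector of 81/80
augmented = [7, 0, -3]  # prime count vector of 128/125

print(vee(meantone, augmented))  # [12, 19, 28]
```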

Exterior algebras do have a sense of lying-in-the-plane-of, but we need a metric to do projection and tuning, which we already touched upon in the units section. As far as data structures go, full Clifford algebras are memory-hungry. Not worth the complication in a general purpose tool like Scale Workshop.

To explore our space a bit longer, we can consider [math]\displaystyle{ e_2 }[/math] as an interval class consisting of all multiples of the octave. After all, we have [math]\displaystyle{ \overrightarrow{8} = 3 \cdot \overrightarrow{2} = 3 e_2 }[/math]. Wedging gives us larger classes. [math]\displaystyle{ e_2 \wedge e_3 }[/math] can be seen as representing Pythagorean just intonation. Unfortunately the wedges are not powerful enough to distinguish subflavors [math]\displaystyle{ \overrightarrow{2} \wedge \overrightarrow{9} = 2 \cdot \overrightarrow{2} \wedge \overrightarrow{3} = \overrightarrow{4} \wedge \overrightarrow{3} }[/math], but at least the "orientation" is correct. Previously we used the term "just intonation" to refer to lack of tempering, but it's also used to refer to an interval class such as 5-limit just intonation, representable by [math]\displaystyle{ e_2 \wedge e_3 \wedge e_5 }[/math].

One interesting pseudo-object in our space (let's call it [math]\displaystyle{ \mathbb{S} }[/math]) is the unison plane:

[math]\displaystyle{ \{x \in \mathbb{S} \mid \mathsf{JIP} \cdot x = 0\} = \{\overrightarrow{1}\} }[/math]

which strictly speaking consists of only a single element, namely the unison 1/1. However, all commas lie close to this plane, so it's informative to look at 3D subspaces of [math]\displaystyle{ \mathbb{S} }[/math] facing the unison plane to get a 2D view of all commas of interest. The null spaces of all reasonable temperaments are also closely aligned with this plane.
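To get a feel for how close commas sit to the unison plane, one can measure them with the JIP. The sketch below scales the JIP to cents, which is a choice made here for readability (the definition earlier uses natural logarithms). The comma list is just the three commas that appear in these notes.

```python
import math

PRIMES = (2, 3, 5, 7, 11, 13)  # assumed prime limit for this sketch
JIP_CENTS = [1200 * math.log2(p) for p in PRIMES]

COMMAS = {
    "81/80": [-4, 4, -1, 0, 0, 0],
    "128/125": [7, 0, -3, 0, 0, 0],
    "676/675": [2, -3, -2, 0, 0, 2],
}


def size_in_cents(monzo):
    """JIP . x in cents: the signed distance of x from the unison plane."""
    return sum(j * m for j, m in zip(JIP_CENTS, monzo))


for name, monzo in COMMAS.items():
    print(name, round(size_in_cents(monzo), 3))
```

All three come out well under a semitone, even though their prime count vectors have components as large as 7.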