User:Sintel/Generator optimization


Some derivations of optimal generators for 2-norm (Euclidean) tunings.

In what follows, [math]\displaystyle{ M }[/math] is an [math]\displaystyle{ n \times k }[/math] integer matrix that represents the temperament, where [math]\displaystyle{ n }[/math] is the rank of the temperament and [math]\displaystyle{ k }[/math] is the rank of the just intonation subgroup. [math]\displaystyle{ g }[/math] is the (unknown) row vector of generators with length [math]\displaystyle{ n }[/math], and [math]\displaystyle{ j }[/math] is a row vector of length [math]\displaystyle{ k }[/math] containing the log-primes (aka the JIP).

The goal is now to find [math]\displaystyle{ g }[/math] such that the temperament approximates JI:

[math]\displaystyle{ gM \approx j }[/math]

Solving the system

The simplest case involves solving a linear system. Since there are [math]\displaystyle{ n }[/math] unknowns, we can introduce [math]\displaystyle{ n }[/math] constraints, each forcing an interval of choice to be just.

Let [math]\displaystyle{ V }[/math] be a [math]\displaystyle{ k \times n }[/math] matrix, whose columns are the prime count vectors of the intervals we want to be just. We now want to solve:

[math]\displaystyle{ gMV = jV }[/math]

We can recognize the LHS as giving the size of the tempered intervals in [math]\displaystyle{ V }[/math], and the RHS as the size of the just intervals. Since [math]\displaystyle{ MV }[/math] is square ([math]\displaystyle{ n \times n }[/math]), we can simply invert it (provided the chosen intervals remain linearly independent after tempering) to obtain a solution:

[math]\displaystyle{ g = jV(MV)^{-1} }[/math]
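
For concreteness, here is a minimal numerical sketch of this formula (assuming NumPy; the 5-limit meantone mapping and the choice of a just 2/1 and 5/4 are illustrative assumptions, not part of the derivation). Forcing 5/4 to be just while keeping pure octaves yields quarter-comma meantone:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical example: 5-limit meantone with generators (octave, fifth).
# M maps the generators to the primes 2, 3, 5.
M = np.array([[1, 1, 0],
              [0, 1, 4]])

# The JIP: base-2 logarithms of the primes in the subgroup.
j = np.log2([2, 3, 5])

# Columns of V are the prime count vectors of the intervals
# we want to be just: here 2/1 and 5/4.
V = np.array([[1, -2],
              [0,  0],
              [0,  1]])

# g = jV (MV)^(-1)
g = j @ V @ np.linalg.inv(M @ V)
print(1200 * g)  # generator sizes in cents: [1200.0, ~696.58]
</syntaxhighlight>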

Least squares solution

If we want more than [math]\displaystyle{ n }[/math] intervals to be just, the system is overdetermined and in general has no exact solution. However, we can instead find an approximate solution that minimizes the squared error. If [math]\displaystyle{ V }[/math] is now a [math]\displaystyle{ k \times m }[/math] matrix (where [math]\displaystyle{ m \gt n }[/math]), then the error for these intervals is:

[math]\displaystyle{ gMV - jV }[/math]

Using [math]\displaystyle{ \| \cdot \| }[/math] to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem:

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|^2 }[/math]

Note that since we are working with row vectors, the norm is defined such that [math]\displaystyle{ \| x \|^2 = xx^{\mathsf T} }[/math].

Differentiating with respect to [math]\displaystyle{ g }[/math]:

[math]\displaystyle{ \displaystyle \begin{aligned} & \, \frac{\partial }{\partial g} \| gMV - jV \|^2 \\ = & \, \frac{\partial }{\partial g} (gMV - jV)(gMV - jV)^{\mathsf T} \\ = & \, \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\ = & \, 2gMVV^{\mathsf T}M^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T} = 0 \end{aligned} }[/math]

Now we solve for [math]\displaystyle{ g }[/math]:

[math]\displaystyle{ g M V V^{\mathsf T} M^{\mathsf T} = j V V^{\mathsf T} M^{\mathsf T} }[/math]
[math]\displaystyle{ g = j V V^{\mathsf T} M^{\mathsf T} \left(M V V^{\mathsf T}M^{\mathsf T} \right)^{-1} }[/math]

If we don't know which intervals to optimize for, a simple choice is to optimize for each prime in the subgroup. This is equivalent to picking the identity matrix [math]\displaystyle{ V = I_{k \times k} }[/math], so that the above expression simplifies to:

[math]\displaystyle{ g = j M^{\mathsf T} (M M^{\mathsf T})^{-1} }[/math]
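
As a sketch, the same hypothetical meantone example can be run through this closed form. Since the formula is exactly the normal-equations solution of [math]\displaystyle{ M^{\mathsf T} g^{\mathsf T} \approx j^{\mathsf T} }[/math], a generic least-squares solver gives the same answer:

<syntaxhighlight lang="python">
import numpy as np

# Same hypothetical meantone mapping as above.
M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])

# Closed form with V = I: g = j M^T (M M^T)^(-1)
g = j @ M.T @ np.linalg.inv(M @ M.T)

# Equivalently, solve the least-squares problem M^T g^T ≈ j^T directly.
g_lstsq, *_ = np.linalg.lstsq(M.T, j, rcond=None)

print(1200 * g)        # ≈ [1202.6, 696.7] cents (unweighted)
print(1200 * g_lstsq)  # same result
</syntaxhighlight>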

Weighted norm

It can be beneficial to introduce weights for each prime, via a (usually diagonal) weighting matrix [math]\displaystyle{ W }[/math]. For example, Tenney-Euclidean tuning uses [math]\displaystyle{ W = \operatorname{diag}(1/\log_2 p_i) }[/math], which prioritizes lower primes slightly. For technical reasons, the norm is taken with respect to [math]\displaystyle{ W^2 }[/math], so that [math]\displaystyle{ \| x \|_{W^2}^2 = xW^2x^{\mathsf T} }[/math].

This results in a weighted least squares problem:

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|_{W^2}^2 }[/math]

We can write this as an ordinary least squares problem by setting:

[math]\displaystyle{ \begin{gather} j' = jW \\ M' = MW \end{gather} }[/math]

Note that in Tenney-Euclidean tuning [math]\displaystyle{ j' = \begin{bmatrix} 1, & \dots, & 1 \end{bmatrix} }[/math].

We then solve

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gM'V - j'V \right\|^2 }[/math]

as above.
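
Here is a short sketch of this substitution for the running meantone example, assuming NumPy and the Tenney weights [math]\displaystyle{ W = \operatorname{diag}(1/\log_2 p_i) }[/math] given above:

<syntaxhighlight lang="python">
import numpy as np

# Same hypothetical meantone example, now with Tenney weights
# W = diag(1 / log2(p)), so that j' = jW is the all-ones vector.
M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])
W = np.diag(1 / j)

M_w = M @ W  # M' = MW
j_w = j @ W  # j' = jW = [1, 1, 1]

# TE-optimal generators, using the V = I closed form from above.
g = j_w @ M_w.T @ np.linalg.inv(M_w @ M_w.T)
print(1200 * g)  # TE meantone generators in cents
</syntaxhighlight>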

Constraints

If we want some intervals to be exactly just, while optimizing all the others, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem:

[math]\displaystyle{ \begin{align} \underset{g}{\text{minimize}} & \ \left\| gMW - jW \right\|^2 \\ \text{subject to} & \quad gMV - jV = 0 \\ \end{align} }[/math]

This problem is convex and always has a unique solution, so we can use the method of Lagrange multipliers.

Let's first simplify a bit:

[math]\displaystyle{ \begin{align} A &= (MW)^{\mathsf T} &b &= (jW)^{\mathsf T} \\ C &= (MV)^{\mathsf T} &d &= (jV)^{\mathsf T} \\ \end{align} }[/math]

The problem then becomes:

[math]\displaystyle{ \begin{align} \underset{g}{\text{minimize}} & \quad \left\| Ag^{\mathsf T} - b \right\|^2 \\ \text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d = 0 \\ \end{align} }[/math]

We can write the Lagrangian as:

[math]\displaystyle{ \mathcal{L}(g,\lambda) = \frac{1}{2} \left\| Ag^{\mathsf T} - b \right\|^2 + \lambda^{\mathsf T}(Cg^{\mathsf T} - d) }[/math]

where we introduced a vector of Lagrange multipliers [math]\displaystyle{ \lambda }[/math].

Taking derivatives with respect to [math]\displaystyle{ g }[/math] and [math]\displaystyle{ \lambda }[/math], we get a linear system:

[math]\displaystyle{ \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix} \begin{bmatrix} g^{\mathsf T} \\ \lambda \end{bmatrix} = \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} }[/math]

The left block matrix is square and invertible, so we can solve for [math]\displaystyle{ (g,\lambda) }[/math]:

[math]\displaystyle{ \begin{bmatrix} g^{\mathsf T} \\ \lambda \end{bmatrix} = \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} }[/math]

The Lagrange multipliers have no concrete meaning for the resulting tuning, so they can be ignored.
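
To make the procedure concrete, here is a sketch of the constrained solution for the running meantone example, holding the octave exactly just (again assuming NumPy; the mapping and the single octave constraint are illustrative):

<syntaxhighlight lang="python">
import numpy as np

# Same hypothetical meantone example: Tenney weights, with the
# octave (2/1) constrained to be exactly just (a CTE-style tuning).
M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])
W = np.diag(1 / j)

V = np.array([[1], [0], [0]])  # single constraint column: 2/1

A = (M @ W).T  # A = (MW)^T
b = (j @ W).T  # b = (jW)^T
C = (M @ V).T  # C = (MV)^T
d = (j @ V).T  # d = (jV)^T

n = A.shape[1]  # number of generators
m = C.shape[0]  # number of constraints

# Assemble and solve the KKT block system.
K = np.block([[A.T @ A, C.T],
              [C, np.zeros((m, m))]])
rhs = np.concatenate([A.T @ b, d])
sol = np.linalg.solve(K, rhs)

g = sol[:n]      # the Lagrange multipliers sol[n:] are discarded
print(1200 * g)  # octave exactly 1200; CTE fifth near 697.2 cents
</syntaxhighlight>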