User:Sintel/Generator optimization


Some derivations of optimal generators for 2-norm (Euclidean) tunings.

In what follows, [math]M[/math] is an [math]n \times k[/math] integer matrix that represents the temperament, where [math]n[/math] is the rank of the temperament and [math]k[/math] is the rank of the just intonation subgroup. [math]g[/math] is the (unknown) row vector of generators with length [math]n[/math], and [math]j[/math] is a row vector of length [math]k[/math] containing the log-primes (aka the JIP).

The goal is now to find [math]g[/math] such that the temperament approximates JI:

$$ gM \approx j $$
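For concreteness (an illustrative choice, not the only one), take meantone on the 2.3.5 subgroup, with generators an octave and a fifth: $$ M = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 4 \end{bmatrix}, \qquad j = \begin{bmatrix} \log_2 2, & \log_2 3, & \log_2 5 \end{bmatrix} \approx \begin{bmatrix} 1, & 1.585, & 2.322 \end{bmatrix} $$ Here [math]n = 2[/math] and [math]k = 3[/math]: [math]g_1[/math] should approximate an octave, [math]g_1 + g_2[/math] should approximate [math]\log_2 3[/math], and [math]4g_2[/math] should approximate [math]\log_2 5[/math]. The sketches below reuse this mapping.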

Solving the system

The simplest case involves solving a linear system. Since there are [math]n[/math] unknowns, we can introduce [math]n[/math] constraints, each forcing an interval of our choice to be just.

Let [math]V[/math] be a [math]k \times n[/math] matrix whose columns are the prime count vectors of the intervals we want to be just. We now want to solve: $$ gMV = jV $$ We can recognize the LHS as giving the sizes of the tempered intervals in [math]V[/math], and the RHS as the sizes of the just intervals. Since [math]MV[/math] is square ([math]n \times n[/math]), we can simply invert it (provided the chosen intervals are independent under the mapping) to obtain a solution: $$ g = jV(MV)^{-1} $$
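A minimal numpy sketch of this computation, using the meantone mapping above with the octave and 5/4 as the chosen just intervals (both illustrative choices):

```python
import numpy as np

# Meantone mapping on the 2.3.5 subgroup (generators: octave, fifth).
M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])  # the JIP

# Columns of V: prime count vectors of the intervals to keep just,
# here 2/1 = [1 0 0] and 5/4 = [-2 0 1].
V = np.array([[ 1, -2],
              [ 0,  0],
              [ 0,  1]])

# g = jV (MV)^{-1}
g = j @ V @ np.linalg.inv(M @ V)
print(g)  # [1.0, 0.5805...]: quarter-comma meantone (just octaves and major thirds)
```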

Least squares solution

If we want more than [math]n[/math] intervals to be just, the system is overdetermined, and in general no exact solution exists. However, we can instead obtain an approximate solution that minimizes the squared error. If [math]V[/math] is now a [math]k \times m[/math] matrix (where [math]m \gt n[/math]), then the error for these intervals is: $$ gMV - jV $$ Using [math]\left\| \cdot \right\|[/math] to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem: $$ \underset{g}{\text{minimize}} \ \| gMV - jV \|^2 $$ Note that since we are working with row vectors, the norm is defined such that [math]\left\| x \right\|^2 = xx^{\mathsf T}[/math].

Differentiating with respect to [math]g[/math]: $$ \begin{gather} & \frac{\partial }{\partial g} \| gMV - jV \|^2 \\ = & \frac{\partial }{\partial g} (gMV - jV)(gMV - jV)^{\mathsf T} \\ = & \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\ = & 2gMVV^{\mathsf T}M^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T} = 0 \end{gather} $$ Now we solve for [math]g[/math]: $$ \begin{gather} & gMVV^{\mathsf T}M^{\mathsf T} = jVV^{\mathsf T}M^{\mathsf T} \\ \Rightarrow & g = jVV^{\mathsf T}M^{\mathsf T} (MVV^{\mathsf T}M^{\mathsf T})^{-1} \end{gather} $$ If we don't know which intervals to optimize for, a simple choice is to optimize over each prime in the subgroup. This is equivalent to picking the identity matrix [math]V = I_{k \times k}[/math], so that the above expression simplifies to: $$ g = jM^{\mathsf T} (MM^{\mathsf T})^{-1} $$
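A minimal numpy sketch of this prime-based least squares solution (same illustrative meantone mapping as before):

```python
import numpy as np

M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])

# g = j M^T (M M^T)^{-1}: least squares over the primes (V = I).
g = j @ M.T @ np.linalg.inv(M @ M.T)

# Equivalent, and numerically more stable: the Moore-Penrose pseudoinverse.
g_pinv = j @ np.linalg.pinv(M)
assert np.allclose(g, g_pinv)
print(g)  # unweighted least-squares generators, in octaves
```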

Weighted norm

It can be beneficial to introduce weights for each prime, in the form of a (usually diagonal) weighting matrix [math]W[/math]. For example, in Tenney-Euclidean tuning, [math]W = \operatorname{diag}(1/\log_2 p_1, \dots, 1/\log_2 p_k)[/math], so that errors in lower primes are weighted more heavily. The weighted norm is defined with [math]W^2[/math], so that [math]\left\| x \right\|_{W^2}^2 = xW^2x^{\mathsf T}[/math]; this is the same as applying [math]W[/math] once to each vector and then taking the ordinary norm.

This results in a weighted least squares problem: $$ \underset{g}{\text{minimize}} \ \| gMV - jV \|_{W^2}^2 $$ We can write this as an ordinary least squares problem by setting: $$ \begin{gather} j' = jW \\ M' = MW \end{gather} $$ Note that in Tenney-Euclidean tuning [math] j' = \begin{bmatrix} 1, & \dots, & 1 \end{bmatrix}[/math].

We then solve $$ \underset{g}{\text{minimize}} \ \| gM'V - j'V \|^2 $$ as above.
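Sketching the Tenney-Euclidean case in numpy (the weights follow from requiring [math]jW = [1, \dots, 1][/math]; the mapping is the illustrative meantone one again):

```python
import numpy as np

M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])

W = np.diag(1 / j)   # Tenney-Euclidean weights: 1 / log2(p) per prime
M_w = M @ W          # M' = MW
j_w = j @ W          # j' = jW = [1, 1, 1]

# The weighted problem reduces to ordinary least squares on (M', j').
g = j_w @ np.linalg.pinv(M_w)
print(g)  # TE-optimal meantone generators, in octaves
```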

Constraints

If we want some intervals to be exactly just, while optimizing all the others, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem: $$ \begin{align} \underset{g}{\text{minimize}} & \ \| gMW - jW \|^2 \\ \text{subject to} & \quad gMV - jV = 0 \\ \end{align} $$

This problem is convex and has a unique solution, which we can find using the method of Lagrange multipliers.

Let's first simplify a bit:

$$ \begin{align} A &= (MW)^{\mathsf T} &b &= (jW)^{\mathsf T} \\ C &= (MV)^{\mathsf T} &d &= (jV)^{\mathsf T} \\ \end{align} $$

The problem then becomes:

$$ \begin{align} \underset{g}{\text{minimize}} & \quad \left\| Ag^{\mathsf T} - b \right\|^2 \\ \text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d = 0 \\ \end{align} $$

We can write the Lagrangian as: $$ \mathcal{L}(g,\lambda) = \frac{1}{2} \left\| Ag^{\mathsf T} - b \right\|^2 + \lambda^{\mathsf T}(Cg^{\mathsf T} - d) $$ where [math]\lambda[/math] is a vector of Lagrange multipliers, one per constraint. (The factor [math]\tfrac{1}{2}[/math] is just for convenience; it does not change the minimizer.)

Setting the partial derivatives of [math]\mathcal{L}[/math] with respect to [math]g[/math] and [math]\lambda[/math] to zero gives the stationarity conditions: $$ \begin{gather} A^{\mathsf T}Ag^{\mathsf T} + C^{\mathsf T}\lambda = A^{\mathsf T}b \\ Cg^{\mathsf T} = d \end{gather} $$ Stacking these gives a linear system: $$ \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix} \begin{bmatrix} g^{\mathsf T} \\ \lambda \end{bmatrix} = \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} $$ The block matrix on the left (the KKT matrix) is square, and it is invertible as long as [math]M[/math] has full rank and the constraints are independent ([math]C[/math] has full row rank), so we can solve for [math](g,\lambda)[/math]: $$ \begin{bmatrix} g^{\mathsf T} \\ \lambda \end{bmatrix} = \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} $$ The Lagrange multipliers have no concrete meaning for the resulting tuning, so they can be ignored.
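A sketch of the whole constrained solve in numpy (a CTE-style setup: Tenney-Euclidean weights with pure octaves as the single constraint; the meantone mapping and these choices are illustrative):

```python
import numpy as np

M = np.array([[1, 1, 0],
              [0, 1, 4]])
j = np.log2([2, 3, 5])
W = np.diag(1 / j)              # TE weights
V = np.array([[1, 0, 0]]).T     # constrain 2/1 (pure octaves)

A = (M @ W).T                   # k x n
b = (j @ W).T                   # length k
C = (M @ V).T                   # one row per constraint
d = (j @ V).T

n, m = M.shape[0], C.shape[0]
# KKT system: [[A^T A, C^T], [C, 0]] [g^T; lambda] = [A^T b; d]
K = np.block([[A.T @ A, C.T],
              [C, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([A.T @ b, d]))
g = sol[:n]                     # discard the multipliers sol[n:]
print(g)                        # octave is exactly 1; the fifth carries the optimization
```

The constraint is satisfied exactly: the first generator comes out as exactly 1 octave, while the remaining generator is optimized subject to that restriction.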