User:Sintel/Generator optimization

Some derivations on optimal generators for 2-norm (Euclidean) tunings.

In what follows, $M$ is an $n \times k$ integer matrix that represents the temperament, where $n$ is the rank of the temperament and $k$ is the rank of the just intonation subgroup. $g$ is the (unknown) row vector of generators with length $n$, and $j$ is a row vector of length $k$ containing the log-primes (aka the JIP).

The goal is now to find $g$ such that the temperament approximates JI:

$$gM \approx j$$

Solving the system

The simplest case involves solving a linear system. Since there are $n$ unknowns, we can introduce $n$ constraints, each forcing an interval of choice to be just.

Let $V$ be a $k \times n$ matrix, whose columns are the prime count vectors of the intervals we want to be just. We now want to solve: $$gMV = jV$$ We can recognize the LHS as giving the size of the tempered intervals in $V$, and the RHS as the size of the just intervals. Since $MV$ is square ($n \times n$), we can simply invert it to obtain a solution: $$g = jV(MV)^{-1}$$

Least squares solution

If we want more than $n$ intervals to be just, the result is an overconstrained system, and no solutions can be found. However, we can instead obtain an approximate solution that minimizes the squared error. If $V$ is now a $k \times m$ matrix (where $m \gt n$), then the error for these intervals is: $$gMV - jV$$ Using $\left\| \cdot \right\|$ to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem: $$\underset{g}{\text{minimize}} \ \| gMV - jV \|^2$$ Note that since we are working with row vectors, the norm is defined such that $\left\| x \right\|^2 = xx^{\mathsf T}$.

Differentiating with respect to $g$: $$\begin{gather} & \frac{\partial }{\partial g} \| gMV - jV \|^2 \\ = & \frac{\partial }{\partial g} (gMV - jV)(gMV - jV)^{\mathsf T} \\ = & \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\ = & 2gMVV^{\mathsf T}M^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T} = 0 \end{gather}$$ Now we solve for $g$: $$\begin{gather} & gMVV^{\mathsf T}M^{\mathsf T} = MVV^{\mathsf T}j^{\mathsf T} \\ \Rightarrow & g = jVV^{\mathsf T}M^{\mathsf T} (MVV^{\mathsf T}M^{\mathsf T})^{-1} \end{gather}$$ Now if we don't know which intervals to optimize for, a simple choice is just to optimize for each prime in the subgroup. This is equivalent to picking the identity matrix $V = I_{k \times k}$, so that the above expression simplifies to: $$g = jM^{\mathsf T} (MM^{\mathsf T})^{-1}$$

Weighted norm

It can be beneficial to introduce weights for each prime. For example, in Tenney-Euclidean tuning, lower primes are prioritized slightly. We can introduce a (usually diagonal) weighting matrix $W$. For technical reasons, the norm is taken for $W^2$ so that $\left\| x \right\|_{W^2}^2 = xW^2x^{\mathsf T}$.

This results in a weighted least squares problem: $$\underset{g}{\text{minimize}} \ \| gMV - jV \|_{W^2}^2$$ We can write this as an ordinary least squares problem by setting: $$\begin{gather} j' = jW \\ M' = MW \end{gather}$$ Note that in Tenney-Euclidean tuning $j' = \begin{bmatrix} 1, & \dots, & 1 \end{bmatrix}$.

We then solve $$\underset{g}{\text{minimize}} \ \| gM'V - j'V \|^2$$ as above.

Constraints

If we want some intervals to be exactly just, while optimizing all the others, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem: \begin{align} \underset{g}{\text{minimize}} & \ \| gMW - jW \|^2 \\ \text{subject to} & \quad gMV - jV = 0 \\ \end{align}

This problem is convex and always has a single solution, we can use the method of Lagrange multipliers.

Let's first simplify a bit:

\begin{align} A &= (MW)^{\mathsf T} &b &= (jW)^{\mathsf T} \\ C &= (MV)^{\mathsf T} &d &= (jV)^{\mathsf T} \\ \end{align}

The problem then becomes:

\begin{align} \underset{g}{\text{minimize}} & \quad \left\| Ag^{\mathsf T} - b \right\|^2 \\ \text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d = 0 \\ \end{align}

We can write the Lagrangian as: $$\begin{gather} \mathcal{L}(g,\lambda) =& \frac{1}{2} \left\| Ag^{\mathsf T} - b \right\|^2 + \lambda^{\mathsf T}(Cg^{\mathsf T} - d) \end{gather}$$ where we introduced a vector of Lagrange multipliers $\lambda$.

Taking derivatives with respect to $g$ and $\lambda$ we get a linear system: $$\begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix} \begin{bmatrix} g^{\mathsf T} \\ \lambda^{\mathsf T} \end{bmatrix} = \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix}$$ The left block matrix is square and invertible, so we can solve for $(g,\lambda)$: $$\begin{bmatrix} g^{\mathsf T} \\ \lambda^{\mathsf T} \end{bmatrix} = \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix}$$ The lagrange multipliers have no concrete meaning for the resulting tuning, so they can be ignored.