User:Sintel/Generator optimization: Difference between revisions

Latest revision as of 17:01, 26 April 2025

Some derivations on optimal generators for 2-norm (Euclidean) tunings.

In what follows, [math]\displaystyle{ M }[/math] is an [math]\displaystyle{ n \times k }[/math] integer matrix that represents the temperament, where [math]\displaystyle{ n }[/math] is the rank of the temperament and [math]\displaystyle{ k }[/math] is the rank of the just intonation subgroup. [math]\displaystyle{ g }[/math] is the (unknown) row vector of generators with length [math]\displaystyle{ n }[/math], and [math]\displaystyle{ j }[/math] is a row vector of length [math]\displaystyle{ k }[/math] containing the log-primes (aka the JIP).

The goal is now to find [math]\displaystyle{ g }[/math] such that the temperament approximates JI:

[math]\displaystyle{ gM \approx j }[/math]

Solving the system

The simplest case involves solving a linear system. Since there are [math]\displaystyle{ n }[/math] unknowns, we can introduce [math]\displaystyle{ n }[/math] constraints, each forcing an interval of choice to be just.

Let [math]\displaystyle{ V }[/math] be a [math]\displaystyle{ k \times n }[/math] matrix, whose columns are the prime count vectors of the intervals we want to be just. We now want to solve:

[math]\displaystyle{ gMV = jV }[/math]

We can recognize the LHS as giving the size of the tempered intervals in [math]\displaystyle{ V }[/math], and the RHS as the size of the just intervals. Since [math]\displaystyle{ MV }[/math] is square ([math]\displaystyle{ n \times n }[/math]), we can simply invert it to obtain a solution:

[math]\displaystyle{ g = jV(MV)^{-1} }[/math]

Least squares solution

If we want more than [math]\displaystyle{ n }[/math] intervals to be just, the result is an overconstrained system, and no solutions can be found. However, we can instead obtain an approximate solution that minimizes the squared error. If [math]\displaystyle{ V }[/math] is now a [math]\displaystyle{ k \times m }[/math] matrix (where [math]\displaystyle{ m \gt n }[/math]), then the error for these intervals is:

[math]\displaystyle{ gMV - jV }[/math]

Using [math]\displaystyle{ \| \cdot \| }[/math] to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem:

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|^2 }[/math]

Note that since we are working with row vectors, the norm is defined such that [math]\displaystyle{ \| x \|^2 = xx^{\mathsf T} }[/math].

Differentiating with respect to [math]\displaystyle{ g }[/math]:

[math]\displaystyle{ \displaystyle \begin{aligned} & \, \frac{\partial }{\partial g} \| gMV - jV \|^2 \\ = & \, \frac{\partial }{\partial g} (gMV - jV)(gMV - jV)^{\mathsf T} \\ = & \, \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\ = & \, 2gMVV^{\mathsf T}M^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T} = 0 \end{aligned} }[/math]

Now we solve for [math]\displaystyle{ g }[/math]:

[math]\displaystyle{ g M V V^{\mathsf T} M^{\mathsf T} = M V V^{\mathsf T} j^{\mathsf T} }[/math]

[math]\displaystyle{ g = j V V^{\mathsf T} M^{\mathsf T} \left(M V V^{\mathsf T}M^{\mathsf T} \right)^{-1} }[/math]

If we don't know which intervals to optimize for, a simple choice is just to optimize for each prime in the subgroup. This is equivalent to picking the identity matrix [math]\displaystyle{ V = I_{k \times k} }[/math], so that the above expression simplifies to:

[math]\displaystyle{ g = j M^{\mathsf T} (M M^{\mathsf T})^{-1} }[/math]

Weighted norm

It can be beneficial to introduce weights for each prime. For example, in Tenney-Euclidean tuning, lower primes are prioritized slightly. We can introduce a (usually diagonal) weighting matrix [math]\displaystyle{ W }[/math]. For technical reasons, the norm is taken for [math]\displaystyle{ W^2 }[/math] so that [math]\displaystyle{ \| x \|_{W^2}^2 = xW^2x^{\mathsf T} }[/math].

This results in a weighted least squares problem:

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|_{W^2}^2 }[/math]

We can write this as an ordinary least squares problem by setting:

[math]\displaystyle{ \begin{gather} j' = jW \\ M' = MW \end{gather} }[/math]

Note that in Tenney-Euclidean tuning [math]\displaystyle{ j' = \begin{bmatrix} 1, & \dots, & 1 \end{bmatrix} }[/math].

We then solve

[math]\displaystyle{ \underset{g}{\text{minimize}} \ \left\| gM'V - j'V \right\|^2 }[/math]

as above.

Constraints

If we want some intervals to be exactly just, while optimizing all the others, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem:

[math]\displaystyle{ \begin{align} \underset{g}{\text{minimize}} & \ \left\| gMW - jW \right\|^2 \\ \text{subject to} & \quad gMV - jV = 0 \\ \end{align} }[/math]

This problem is convex and always has a single solution, we can use the method of Lagrange multipliers.

Let's first simplify a bit:

[math]\displaystyle{ \begin{align} A &= (MW)^{\mathsf T} &b &= (jW)^{\mathsf T} \\ C &= (MV)^{\mathsf T} &d &= (jV)^{\mathsf T} \\ \end{align} }[/math]

The problem then becomes:

[math]\displaystyle{ \begin{align} \underset{g}{\text{minimize}} & \quad \left\| Ag^{\mathsf T} - b \right\|^2 \\ \text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d = 0 \\ \end{align} }[/math]

We can write the Lagrangian as:

[math]\displaystyle{ \mathcal{L}(g,\lambda) = \frac{1}{2} \left\| Ag^{\mathsf T} - b \right\|^2 + \lambda^{\mathsf T}(Cg^{\mathsf T} - d) }[/math]

where we introduced a vector of Lagrange multipliers [math]\displaystyle{ \lambda }[/math].

Taking derivatives with respect to [math]\displaystyle{ g }[/math] and [math]\displaystyle{ \lambda }[/math] we get a linear system:

[math]\displaystyle{ \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix} \begin{bmatrix} g^{\mathsf T} \\ \lambda^{\mathsf T} \end{bmatrix} = \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} }[/math]

The left block matrix is square and invertible, so we can solve for [math]\displaystyle{ (g,\lambda) }[/math]:

[math]\displaystyle{ \begin{bmatrix} g^{\mathsf T} \\ \lambda^{\mathsf T} \end{bmatrix} = \begin{bmatrix} A^{\mathsf T}A & C^{\mathsf T} \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} A^{\mathsf T} b\\ d \end{bmatrix} }[/math]

The lagrange multipliers have no concrete meaning for the resulting tuning, so they can be ignored.

@@ Line 1: / Line 1: @@
 Some derivations on optimal generators for 2-norm (Euclidean) tunings.
-In what follows, <math>M</math> is an <math>n \times k</math> integer matrix that represents the temperament, where <math>n</math> is the rank of the temperament and <math>k</math> is the rank of the just intonation subgroup. <math>g</math> is the (unknown) row vector of generators with length <math>n</math>, and <math>j</math> is a row vector of length <math>k</math> containing the log-primes (aka the JIP).
+In what follows, <math>M</math> is an <math>n \times k</math> integer matrix that represents the temperament, where <math>n</math> is the rank of the temperament and <math>k</math> is the rank of the just intonation subgroup.
+<math>g</math> is the (unknown) row vector of generators with length <math>n</math>, and <math>j</math> is a row vector of length <math>k</math> containing the log-primes (aka the JIP).
 The goal is now to find <math>g</math> such that the temperament approximates JI:
+:<math>
-$$
+	gM \approx j
-gM \approx j
+</math>
-$$
 == Solving the system ==
-The simplest case involves solving a linear system. Since there are <math>n</math> unknowns, we can introduce <math>n</math> constraints, each forcing an interval of choice to be just.
+The simplest case involves solving a linear system.
+Since there are <math>n</math> unknowns, we can introduce <math>n</math> constraints, each forcing an interval of choice to be just.
 Let <math>V</math> be a <math>k \times n</math> matrix, whose columns are the prime count vectors of the intervals we want to be just. We now want to solve:
-$$
+:<math>
-gMV = jV
+	gMV = jV
-$$
+</math>
-We can recognize the LHS as giving the size of the tempered intervals in <math>V</math>, and the RHS as the size of the just intervals. Since <math>MV</math> is square (<math>n \times n</math>), we can simply invert it to obtain a solution:
+We can recognize the LHS as giving the size of the tempered intervals in <math>V</math>, and the RHS as the size of the just intervals.
-$$
+Since <math>MV</math> is square (<math>n \times n</math>), we can simply invert it to obtain a solution:
-g = jV(MV)^{-1}
+:<math>
-$$
+	g = jV(MV)^{-1}
+</math>
 == Least squares solution ==
@@ Line 25: / Line 27: @@
 However, we can instead obtain an approximate solution that minimizes the squared error.
 If <math>V</math> is now a <math>k \times m</math> matrix (where <math>m \gt n</math>), then the error for these intervals is:
-$$
+:<math>
-gMV - jV
+	gMV - jV
-$$
+</math>
-Using <math>\left\| \cdot  \right\|</math> to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem:
-$$
+Using <math>\| \cdot \|</math> to denote the usual Euclidean norm, this becomes a standard ordinary least squares problem:
-\underset{g}{\text{minimize}}  \|   gMV - jV   \|^2
+:<math>
-$$
+	\underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|^2
-Note that since we are working with row vectors, the norm is defined such that <math>\left\| x  \right\|^2 = xx^{\mathsf T}</math>.
+</math>
+Note that since we are working with row vectors, the norm is defined such that <math>\| x \|^2 = xx^{\mathsf T}</math>.
 Differentiating with respect to <math>g</math>:
-$$
+:<math>\displaystyle
-\begin{gather}
+\begin{aligned}
-& \frac{\partial }{\partial g}  \|   gMV - jV   \|^2 \\
+	  & \, \frac{\partial }{\partial g}  \|   gMV - jV   \|^2 \\
-= & \frac{\partial }{\partial g}  (gMV - jV)(gMV - jV)^{\mathsf T}  \\
+	= & \, \frac{\partial }{\partial g}  (gMV - jV)(gMV - jV)^{\mathsf T}  \\
-= & \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T}  - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\
+	= & \, \frac{\partial }{\partial g} (gMVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} - 2jVV^{\mathsf T}M^{\mathsf T}g^{\mathsf T} + jVV^{\mathsf T}j^{\mathsf T}) \\
-= & 2gMVV^{\mathsf T}M^{\mathsf T}  - 2jVV^{\mathsf T}M^{\mathsf T} = 0
+	= & \, 2gMVV^{\mathsf T}M^{\mathsf T}  - 2jVV^{\mathsf T}M^{\mathsf T} = 0
-\end{gather}
+\end{aligned}
-$$
+</math>
 Now we solve for <math>g</math>:
-$$
-\begin{gather}
+:<math>
-& gMVV^{\mathsf T}M^{\mathsf T}  = MVV^{\mathsf T}j^{\mathsf T} \\
+	g M V V^{\mathsf T} M^{\mathsf T} = M V V^{\mathsf T} j^{\mathsf T}
-\Rightarrow & g  = jVV^{\mathsf T}M^{\mathsf T} (MVV^{\mathsf T}M^{\mathsf T})^{-1}
+</math>
-\end{gather}
-$$
+:<math>
-Now if we don't know which intervals to optimize for, a simple choice is just to optimize for each prime in the subgroup. This is equivalent to picking the identity matrix <math>V = I_{k \times k}</math>, so that the above expression simplifies to:
+	g  = j V V^{\mathsf T} M^{\mathsf T} \left(M V V^{\mathsf T}M^{\mathsf T} \right)^{-1}
-$$
+</math>
-g  = jM^{\mathsf T} (MM^{\mathsf T})^{-1}
-$$
+If we don't know which intervals to optimize for, a simple choice is just to optimize for each prime in the subgroup.
+This is equivalent to picking the identity matrix <math>V = I_{k \times k}</math>, so that the above expression simplifies to:
+:<math>
+	g  = j M^{\mathsf T} (M M^{\mathsf T})^{-1}
+</math>
 == Weighted norm ==
-It can be beneficial to introduce weights for each prime. For example, in Tenney-Euclidean tuning, lower primes are prioritized slightly. We can introduce a diagonal weighting matrix <math>W</math>. For technical reasons, the norm is taken for <math>W^2</math> so that <math>\left\| x  \right\|_{W^2}^2 = xW^2x^{\mathsf T}</math>.
+It can be beneficial to introduce weights for each prime. For example, in Tenney-Euclidean tuning, lower primes are prioritized slightly.
+We can introduce a (usually diagonal) weighting matrix <math>W</math>.
+For technical reasons, the norm is taken for <math>W^2</math> so that <math>\| x  \|_{W^2}^2 = xW^2x^{\mathsf T}</math>.
 This results in a weighted least squares problem:
-$$
+:<math>
-\underset{g}{\text{minimize}}  \|   gMV - jV   \|_{W^2}^2
+	\underset{g}{\text{minimize}} \ \left\| gMV - jV \right\|_{W^2}^2
-$$
+</math>
-We can write this in the same as an ordinary least squares problem by setting:
+We can write this as an ordinary least squares problem by setting:
-$$
+:<math>
 \begin{gather}
-j' = jW \\
+	j' = jW \\
-M' = MW
+	M' = MW
 \end{gather}
-$$
+</math>
-Note that in Tenney-Euclidean tuning <math> j' = \begin{bmatrix}
+Note that in Tenney-Euclidean tuning <math>
-, & \dots, & 1
+	j' =
-\end{bmatrix}</math>.
+	\begin{bmatrix}
+, & \dots, & 1
+	\end{bmatrix}
+</math>.
 We then solve
-$$
+:<math>
-\underset{g}{\text{minimize}}  \|   gM'V - j'V   \|^2
+\underset{g}{\text{minimize}} \ \left\| gM'V - j'V \right\|^2
-$$
+</math>
 as above.
 == Constraints ==
-If we want some intervals to be exactly just, while optimizing all the other, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem:
+If we want some intervals to be exactly just, while optimizing all the others, we can introduce linear constraints. A good example is CTE tuning, where octaves are always just. This gives the following optimization problem:
-$$
+:<math>
 \begin{align}
-\underset{g}{\text{minimize}}   & \quad \|   gMW - jW   \|^2   \\
+\underset{g}{\text{minimize}} & \ \left\| gMW - jW \right\|^2   \\
 \text{subject to} & \quad gMV - jV = 0 \\
 \end{align}
-$$
+</math>
 This problem is convex and always has a single solution, we can use the method of Lagrange multipliers.
@@ Line 93: / Line 105: @@
 Let's first simplify a bit:
-$$
+:<math>
 \begin{align}
-A &= (MW)^{\mathsf T}
+	A  &= (MW)^{\mathsf T}
-&b &= (jW)^{\mathsf T} \\
+	&b &= (jW)^{\mathsf T} \\
-C &= (MV)^{\mathsf T}
+	C  &= (MV)^{\mathsf T}
-&d &= (jV)^{\mathsf T} \\
+	&d &= (jV)^{\mathsf T} \\
 \end{align}
-$$
+</math>
 The problem then becomes:
-$$
+:<math>
 \begin{align}
-\underset{g}{\text{minimize}}   & \quad  \left\|   Ag^{\mathsf T} - b   \right\|^2   \\
+	\underset{g}{\text{minimize}}   & \quad  \left\|   Ag^{\mathsf T} - b   \right\|^2   \\
-\text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d  = 0 \\
+	\text{s.t.} & \quad \phantom{\|} Cg^{\mathsf T} - d  = 0 \\
 \end{align}
-$$
+</math>
 We can write the Lagrangian as:
-$$
+:<math>
-\begin{gather}
+	\mathcal{L}(g,\lambda) = \frac{1}{2} \left\| Ag^{\mathsf T} - b \right\|^2  + \lambda^{\mathsf T}(Cg^{\mathsf T} - d)
-\mathcal{L}(g,\lambda) =& \frac{1}{2} \left\|   Ag^{\mathsf T} - b   \right\|^2  + \lambda^{\mathsf T}(Cg^{\mathsf T} - d)
+</math>
-\end{gather}
-$$
 where we introduced a vector of Lagrange multipliers <math>\lambda</math>.
 Taking derivatives with respect to <math>g</math> and <math>\lambda</math> we get a linear system:
-$$
+:<math>
-\begin{bmatrix}
+	\begin{bmatrix}
-A^{\mathsf T}A & C^{\mathsf T} \\
+		A^{\mathsf T}A & C^{\mathsf T} \\
-C & 0
+		C & 0
-\end{bmatrix}
+	\end{bmatrix}
-\begin{bmatrix}
+	\begin{bmatrix}
-g^{\mathsf T}  \\
+		g^{\mathsf T}  \\
-\lambda^{\mathsf T}
+		\lambda^{\mathsf T}
-\end{bmatrix}
+	\end{bmatrix}
-=
+	=
-\begin{bmatrix}
+	\begin{bmatrix}
-A^{\mathsf T} b\\
+		A^{\mathsf T} b\\
-d
+		d
-\end{bmatrix}
+	\end{bmatrix}
-$$
+</math>
 The left block matrix is square and invertible, so we can solve for <math>(g,\lambda)</math>:
-$$
+:<math>
-\begin{bmatrix}
+	\begin{bmatrix}
-g^{\mathsf T}  \\
+		g^{\mathsf T}  \\
-\lambda^{\mathsf T}
+		\lambda^{\mathsf T}
-\end{bmatrix}
+	\end{bmatrix}
-=
+	=
-\begin{bmatrix}
+	\begin{bmatrix}
-A^{\mathsf T}A & C^{\mathsf T} \\
+		A^{\mathsf T}A & C^{\mathsf T} \\
-C & 0
+		C & 0
-\end{bmatrix}^{-1}
+	\end{bmatrix}^{-1}
-\begin{bmatrix}
+	\begin{bmatrix}
-A^{\mathsf T} b\\
+		A^{\mathsf T} b\\
-d
+	d
-\end{bmatrix}
+	\end{bmatrix}
-$$
+</math>
 The lagrange multipliers have no concrete meaning for the resulting tuning, so they can be ignored.

User:Sintel/Generator optimization: Difference between revisions

Latest revision as of 17:01, 26 April 2025

Contents

Solving the system

Least squares solution

Weighted norm

Constraints

Navigation menu

User:Sintel/Generator optimization: Difference between revisions

Latest revision as of 17:01, 26 April 2025

Solving the system

Least squares solution

Weighted norm

Constraints

Navigation menu

Search