User:Arseniiv/Fitting: Difference between revisions

(4 intermediate revisions by the same user not shown)

Line 15:

A step of N-edX is <math>S := X/N</math>, so now we round <math>I_k</math> to integer multiples <math>S</math> to get their equalized variants:

:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as ~~<math>(x + \frac12) \bmod 1 - \frac12</math> using floating-point remainder or~~ <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.

:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.

Then we can get errors of this approximation as <math>\Delta_k := I_k - E_k</math>. Note that always <math>-\frac S2 \le \Delta_k \le +\frac S2</math> but the largest negative error and the largest positive error might not be “balanced” and sum to zero, nor any other measure of well-fitting might be minimized: all because we fixed ourselves to necessarily fit one of the notes exactly (<math>I_0</math> being exactly zero). We can fix that by subtracting the average <math>d</math> of all <math>\Delta_k</math> that we are interested in (arithmetic mean, RMS or something else), or, in the simplest case of minimizing the largest absolute error attained, subtract <math>d := \frac12(\min_k \Delta_k + \max_k \Delta_k)</math>.

Then, <math>E_k</math> is still our true fit of <math>I_k</math> in case of our average being the last case, arithmetic mean, RMS and several other kinds of averagings because we’ll still have <math>-\frac S2 \le \Delta_k - d \le +\frac S2</math>.

=== To a periodic scale with specific step pattern ===

If we have a pattern of steps in mind, say sLMLsLMsML, we can try fitting a scale to a scale of this form by writing an error term for each note in the period of the new scale. These terms will be integer linear combinations of our chosen step sizes <math>s, M, L</math>, an overal scale shift <math>d</math> (see [[#To edX]]) and numbers (cent values of intervals).

Having all those errors in symbolic form, it’s particularly easy to minimize RMS error constrained by all steps of the scale constituting exactly the period of the old scale, using Lagrange multiplier method. We end up with a linear system comprising of equations in step sizes, <math>d</math> and a single Lagrange multiplier related to the periodicity constraint, which is then solved to get the optimal step sizes. All these computations can be done using <code>sympy</code> Python package. In the code below I also use <code>more_itertools</code> to make my life a tad easier.

==== Code ====

''NB. This code may later be lagging behind [https://gist.github.com/arseniiv/ab128927adc3955867c1cc1bfe6b4cd7 the version at Gist]. The version there is a self-contained module with an example like one in this article.''

Should work in Python 3.12 and several previous versions. Install fresh <code>more_itertools</code> and <code>sympy</code>.

<pre>

from typing import Sequence

from collections import defaultdict

from dataclasses import dataclass

from fractions import Fraction

from math import log2, isclose, sqrt

from more_itertools import difference

import sympy as sp

def _cents(ratio: float | str) -> float:

if isinstance(ratio, str):

ratio = float(Fraction(ratio))

return 1200. * log2(ratio)

def scale_from_str(intervals: str) -> tuple[float, ...]:

"""Make a scale of pitches in cents from intervals delimited by spaces."""

return tuple(map(_cents, intervals.split()))

def match_pattern(scale: Sequence[float], pattern: str) -> dict[str, float] | None:

"""Check if the scale matches a given pattern of steps like 'LMsMs'."""

if len(scale) != len(pattern):

raise ValueError('scale and pattern should have the same length')

step_sizes = defaultdict(set)

for letter, size in zip(pattern, difference(scale), strict=True):

step_sizes[letter].add(size)

res = dict()

for letter, sizes in step_sizes.items():

mean_size = sum(sizes) / len(sizes)

if any(not isclose(size, mean_size) for size in sizes):

return None

res[letter] = mean_size

return res

@dataclass

class ScaleFit:

"""Results of a scale fit. RMS error is what’s minimized."""

pattern: str

pitches: list[float]

step_sizes: dict[str, float]

errors: list[float]

rms_error: float

avg_error: float

max_error: float

def fit_scale(scale: Sequence[float], pattern: str) -> ScaleFit:

"""Fit this scale to a given pattern of steps like 'LMsMs'."""

SHIFT, LAGRANGE = 'SHIFT', 'LAMBDA'

assert set(pattern).isdisjoint([SHIFT, LAGRANGE])

vars_ = {letter: sp.Symbol(letter)

for letter in set(pattern) | {SHIFT, LAGRANGE}}

note_errors = [vars_[SHIFT]]

sym_pitches = [sp.sympify(0)]

for letter, pitch in zip(pattern, scale):

last_pitch = sym_pitches[-1] + vars_[letter]

sym_pitches.append(last_pitch)

note_errors.append(vars_[SHIFT] + last_pitch - pitch)

periodic_constraint = note_errors[-1] - note_errors[0]

note_errors.pop()

sym_pitches.pop(0)

assert len(note_errors) == len(scale)

error_sqr_sum = sum(error ** 2 for error in note_errors)

cost = error_sqr_sum - vars_[LAGRANGE] * periodic_constraint

equations = tuple(cost.diff(x) for x in vars_)

steps_sol = sp.solve(equations, vars_)

actual_errors = [float(error.subs(steps_sol)) for error in note_errors]

rms_error = sqrt(sum(error ** 2 for error in actual_errors)) / len(actual_errors)

avg_error = sum(abs(error) for error in actual_errors) / len(actual_errors)

max_error = max(abs(error) for error in actual_errors)

new_scale = [float(pitch.subs(steps_sol)) for pitch in sym_pitches]

assert match_pattern(new_scale, pattern)

return ScaleFit(

pattern=pattern,

pitches=new_scale,

step_sizes={letter: steps_sol[vars_[letter]] for letter in set(pattern)},

errors=actual_errors,

rms_error=rms_error,

avg_error=avg_error,

max_error=max_error)

</pre>

This can be used like so:

<pre>

from pprint import pprint # more readable output

initial_scale = scale_from_str('33/32 9/8 11/9 4/3 11/8 3/2 44/27 27/16 11/6 2/1')

initial_pattern = 'sLMLsLMzML'

assert match_pattern(initial_scale, initial_pattern) # just a sanity check

# equalize s and z small steps in the scale:

fit = fit_scale(initial_scale, initial_pattern.replace('z', 's'))

pprint(fit)

</pre>

which outputs:

<pre>

ScaleFit(pattern='sLMLsLMsML', # the new pattern we fit to

pitches=[56.16718090477973, # scale data

204.97122221148004,

350.39865289776657,

499.2026942044669,

555.3698751092467,

704.1739164159469,

849.6013471022335,

905.7685280070132,

1051.1959586932996,

1200.0],

step_sizes={'L': 148.804041306700, # each step size specifically

'M': 145.427430686286,

's': 56.1671809047797},

errors=[-2.0259663722433108, # signed errors for each note compared to the initial scale

0.8682713023922943,

-0.9647458915380938,

0.9647458915412628,

-0.8682713023889406,

2.0259663722465575,

0.1929491783161552,

2.1224409613957675,

-2.122440961392499,

-0.19294917831297198],

rms_error=0.4545581250243142, # RMS error (optimized for)

avg_error=1.2348747411767853, # arithmetic mean (absolute) error

max_error=2.1224409613957675) # maximum (absolute) error

</pre>

The scale can be printed to be used in an .scl file or similar contexts with this:

<pre>

print('\n'.join(map(str, fit.pitches)))

</pre>

== Notes ==

@@ Line 15: / Line 15: @@
 A step of N-edX is <math>S := X/N</math>, so now we round <math>I_k</math> to integer multiples <math>S</math> to get their equalized variants:
-:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as <math>(x + \frac12) \bmod 1 - \frac12</math> using floating-point remainder or <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.
+:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.
 Then we can get errors of this approximation as <math>\Delta_k := I_k - E_k</math>. Note that always <math>-\frac S2 \le \Delta_k \le +\frac S2</math> but the largest negative error and the largest positive error might not be “balanced” and sum to zero, nor any other measure of well-fitting might be minimized: all because we fixed ourselves to necessarily fit one of the notes exactly (<math>I_0</math> being exactly zero). We can fix that by subtracting the average <math>d</math> of all <math>\Delta_k</math> that we are interested in (arithmetic mean, RMS or something else), or, in the simplest case of minimizing the largest absolute error attained, subtract <math>d := \frac12(\min_k \Delta_k + \max_k \Delta_k)</math>.
 Then, <math>E_k</math> is still our true fit of <math>I_k</math> in case of our average being the last case, arithmetic mean, RMS and several other kinds of averagings because we’ll still have <math>-\frac S2 \le \Delta_k - d \le +\frac S2</math>.
+=== To a periodic scale with specific step pattern ===
+If we have a pattern of steps in mind, say sLMLsLMsML, we can try fitting a scale to a scale of this form by writing an error term for each note in the period of the new scale. These terms will be integer linear combinations of our chosen step sizes <math>s, M, L</math>, an overal scale shift <math>d</math> (see [[#To edX]]) and numbers (cent values of intervals).
+Having all those errors in symbolic form, it’s particularly easy to minimize RMS error constrained by all steps of the scale constituting exactly the period of the old scale, using Lagrange multiplier method. We end up with a linear system comprising of equations in step sizes, <math>d</math> and a single Lagrange multiplier related to the periodicity constraint, which is then solved to get the optimal step sizes. All these computations can be done using <code>sympy</code> Python package. In the code below I also use <code>more_itertools</code> to make my life a tad easier.
+==== Code ====
+''NB. This code may later be lagging behind [https://gist.github.com/arseniiv/ab128927adc3955867c1cc1bfe6b4cd7 the version at Gist]. The version there is a self-contained module with an example like one in this article.''
+Should work in Python 3.12 and several previous versions. Install fresh <code>more_itertools</code> and <code>sympy</code>.
+<pre>
+from typing import Sequence
+from collections import defaultdict
+from dataclasses import dataclass
+from fractions import Fraction
+from math import log2, isclose, sqrt
+from more_itertools import difference
+import sympy as sp
+def _cents(ratio: float | str) -> float:
+    if isinstance(ratio, str):
+        ratio = float(Fraction(ratio))
+    return 1200. * log2(ratio)
+def scale_from_str(intervals: str) -> tuple[float, ...]:
+    """Make a scale of pitches in cents from intervals delimited by spaces."""
+    return tuple(map(_cents, intervals.split()))
+def match_pattern(scale: Sequence[float], pattern: str) -> dict[str, float] | None:
+    """Check if the scale matches a given pattern of steps like 'LMsMs'."""
+    if len(scale) != len(pattern):
+        raise ValueError('scale and pattern should have the same length')
+    step_sizes = defaultdict(set)
+    for letter, size in zip(pattern, difference(scale), strict=True):
+        step_sizes[letter].add(size)
+    res = dict()
+    for letter, sizes in step_sizes.items():
+        mean_size = sum(sizes) / len(sizes)
+        if any(not isclose(size, mean_size) for size in sizes):
+            return None
+        res[letter] = mean_size
+    return res
+@dataclass
+class ScaleFit:
+    """Results of a scale fit. RMS error is what’s minimized."""
+    pattern: str
+    pitches: list[float]
+    step_sizes: dict[str, float]
+    errors: list[float]
+    rms_error: float
+    avg_error: float
+    max_error: float
+def fit_scale(scale: Sequence[float], pattern: str) -> ScaleFit:
+    """Fit this scale to a given pattern of steps like 'LMsMs'."""
+    SHIFT, LAGRANGE = 'SHIFT', 'LAMBDA'
+    assert set(pattern).isdisjoint([SHIFT, LAGRANGE])
+    vars_ = {letter: sp.Symbol(letter)
+             for letter in set(pattern) | {SHIFT, LAGRANGE}}
+    note_errors = [vars_[SHIFT]]
+    sym_pitches = [sp.sympify(0)]
+    for letter, pitch in zip(pattern, scale):
+        last_pitch = sym_pitches[-1] + vars_[letter]
+        sym_pitches.append(last_pitch)
+        note_errors.append(vars_[SHIFT] + last_pitch - pitch)
+    periodic_constraint = note_errors[-1] - note_errors[0]
+    note_errors.pop()
+    sym_pitches.pop(0)
+    assert len(note_errors) == len(scale)
+    error_sqr_sum = sum(error ** 2 for error in note_errors)
+    cost = error_sqr_sum - vars_[LAGRANGE] * periodic_constraint
+    equations = tuple(cost.diff(x) for x in vars_)
+    steps_sol = sp.solve(equations, vars_)
+    actual_errors = [float(error.subs(steps_sol)) for error in note_errors]
+    rms_error = sqrt(sum(error ** 2 for error in actual_errors)) / len(actual_errors)
+    avg_error = sum(abs(error) for error in actual_errors) / len(actual_errors)
+    max_error = max(abs(error) for error in actual_errors)
+    new_scale = [float(pitch.subs(steps_sol)) for pitch in sym_pitches]
+    assert match_pattern(new_scale, pattern)
+    return ScaleFit(
+        pattern=pattern,
+        pitches=new_scale,
+        step_sizes={letter: steps_sol[vars_[letter]] for letter in set(pattern)},
+        errors=actual_errors,
+        rms_error=rms_error,
+        avg_error=avg_error,
+        max_error=max_error)
+</pre>
+This can be used like so:
+<pre>
+from pprint import pprint  # more readable output
+initial_scale = scale_from_str('33/32 9/8 11/9 4/3 11/8 3/2 44/27 27/16 11/6 2/1')
+initial_pattern = 'sLMLsLMzML'
+assert match_pattern(initial_scale, initial_pattern)  # just a sanity check
+# equalize s and z small steps in the scale:
+fit = fit_scale(initial_scale, initial_pattern.replace('z', 's'))
+pprint(fit)
+</pre>
+which outputs:
+<pre>
+ScaleFit(pattern='sLMLsLMsML', # the new pattern we fit to
+         pitches=[56.16718090477973, # scale data
+.97122221148004,
+.39865289776657,
+.2026942044669,
+.3698751092467,
+.1739164159469,
+.6013471022335,
+.7685280070132,
+.1959586932996,
+.0],
+         step_sizes={'L': 148.804041306700, # each step size specifically
+                     'M': 145.427430686286,
+                     's': 56.1671809047797},
+         errors=[-2.0259663722433108, # signed errors for each note compared to the initial scale
+.8682713023922943,
+                 -0.9647458915380938,
+.9647458915412628,
+                 -0.8682713023889406,
+.0259663722465575,
+.1929491783161552,
+.1224409613957675,
+                 -2.122440961392499,
+                 -0.19294917831297198],
+         rms_error=0.4545581250243142, # RMS error (optimized for)
+         avg_error=1.2348747411767853, # arithmetic mean (absolute) error
+         max_error=2.1224409613957675) # maximum (absolute) error
+</pre>
+The scale can be printed to be used in an .scl file or similar contexts with this:
+<pre>
+print('\n'.join(map(str, fit.pitches)))
+</pre>
 == Notes ==
 <references />