User:Arseniiv/Fitting: Difference between revisions

Arseniiv (talk | contribs)
m fix a typo
Arseniiv (talk | contribs)
m Code: a link to a gist
 
(4 intermediate revisions by the same user not shown)
Line 15: Line 15:
A step of N-edX is <math>S := X/N</math>, so now we round <math>I_k</math> to integer multiples <math>S</math> to get their equalized variants:
A step of N-edX is <math>S := X/N</math>, so now we round <math>I_k</math> to integer multiples <math>S</math> to get their equalized variants:


:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as <math>(x + \frac12) \bmod 1 - \frac12</math> using floating-point remainder or <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.
:: <math>E_k := \mathsf{round}(s I_k) / s</math><ref>if there’s no <math>\mathsf{round}(x)</math> function in your environment, it can be defined as <math>\lfloor x + \frac12 \rfloor - \frac12</math> using floor <math>\lfloor\ldots\rfloor</math>; these agree up to cases when <math>x</math> is exactly a half-odd-integer</ref>.


Then we can get errors of this approximation as <math>\Delta_k := I_k - E_k</math>. Note that always <math>-\frac S2 \le \Delta_k \le +\frac S2</math> but the largest negative error and the largest positive error might not be “balanced” and sum to zero, nor any other measure of well-fitting might be minimized: all because we fixed ourselves to necessarily fit one of the notes exactly (<math>I_0</math> being exactly zero). We can fix that by subtracting the average <math>d</math> of all <math>\Delta_k</math> that we are interested in (arithmetic mean, RMS or something else), or, in the simplest case of minimizing the largest absolute error attained, subtract <math>d := \frac12(\min_k \Delta_k + \max_k \Delta_k)</math>.
Then we can get errors of this approximation as <math>\Delta_k := I_k - E_k</math>. Note that always <math>-\frac S2 \le \Delta_k \le +\frac S2</math> but the largest negative error and the largest positive error might not be “balanced” and sum to zero, nor any other measure of well-fitting might be minimized: all because we fixed ourselves to necessarily fit one of the notes exactly (<math>I_0</math> being exactly zero). We can fix that by subtracting the average <math>d</math> of all <math>\Delta_k</math> that we are interested in (arithmetic mean, RMS or something else), or, in the simplest case of minimizing the largest absolute error attained, subtract <math>d := \frac12(\min_k \Delta_k + \max_k \Delta_k)</math>.


Then, <math>E_k</math> is still our true fit of <math>I_k</math> in case of our average being the last case, arithmetic mean, RMS and several other kinds of averagings because we’ll still have <math>-\frac S2 \le \Delta_k - d \le +\frac S2</math>.
Then, <math>E_k</math> is still our true fit of <math>I_k</math> in case of our average being the last case, arithmetic mean, RMS and several other kinds of averagings because we’ll still have <math>-\frac S2 \le \Delta_k - d \le +\frac S2</math>.
=== To a periodic scale with specific step pattern ===
If we have a pattern of steps in mind, say sLMLsLMsML, we can try fitting a scale to a scale of this form by writing an error term for each note in the period of the new scale. These terms will be integer linear combinations of our chosen step sizes <math>s, M, L</math>, an overal scale shift <math>d</math> (see [[#To edX]]) and numbers (cent values of intervals).
Having all those errors in symbolic form, it’s particularly easy to minimize RMS error constrained by all steps of the scale constituting exactly the period of the old scale, using Lagrange multiplier method. We end up with a linear system comprising of equations in step sizes, <math>d</math> and a single Lagrange multiplier related to the periodicity constraint, which is then solved to get the optimal step sizes. All these computations can be done using <code>sympy</code> Python package. In the code below I also use <code>more_itertools</code> to make my life a tad easier.
==== Code ====
''NB. This code may later be lagging behind [https://gist.github.com/arseniiv/ab128927adc3955867c1cc1bfe6b4cd7 the version at Gist]. The version there is a self-contained module with an example like one in this article.''
Should work in Python 3.12 and several previous versions. Install fresh <code>more_itertools</code> and <code>sympy</code>.
<pre>
from typing import Sequence
from collections import defaultdict
from dataclasses import dataclass
from fractions import Fraction
from math import log2, isclose, sqrt
from more_itertools import difference
import sympy as sp
def _cents(ratio: float | str) -> float:
    if isinstance(ratio, str):
        ratio = float(Fraction(ratio))
    return 1200. * log2(ratio)
def scale_from_str(intervals: str) -> tuple[float, ...]:
    """Make a scale of pitches in cents from intervals delimited by spaces."""
    return tuple(map(_cents, intervals.split()))
def match_pattern(scale: Sequence[float], pattern: str) -> dict[str, float] | None:
    """Check if the scale matches a given pattern of steps like 'LMsMs'."""
    if len(scale) != len(pattern):
        raise ValueError('scale and pattern should have the same length')
    step_sizes = defaultdict(set)
    for letter, size in zip(pattern, difference(scale), strict=True):
        step_sizes[letter].add(size)
    res = dict()
    for letter, sizes in step_sizes.items():
        mean_size = sum(sizes) / len(sizes)
        if any(not isclose(size, mean_size) for size in sizes):
            return None
        res[letter] = mean_size
    return res
@dataclass
class ScaleFit:
    """Results of a scale fit. RMS error is what’s minimized."""
    pattern: str
    pitches: list[float]
    step_sizes: dict[str, float]
    errors: list[float]
    rms_error: float
    avg_error: float
    max_error: float
def fit_scale(scale: Sequence[float], pattern: str) -> ScaleFit:
    """Fit this scale to a given pattern of steps like 'LMsMs'."""
    SHIFT, LAGRANGE = 'SHIFT', 'LAMBDA'
    assert set(pattern).isdisjoint([SHIFT, LAGRANGE])
    vars_ = {letter: sp.Symbol(letter)
            for letter in set(pattern) | {SHIFT, LAGRANGE}}
    note_errors = [vars_[SHIFT]]
    sym_pitches = [sp.sympify(0)]
    for letter, pitch in zip(pattern, scale):
        last_pitch = sym_pitches[-1] + vars_[letter]
        sym_pitches.append(last_pitch)
        note_errors.append(vars_[SHIFT] + last_pitch - pitch)
    periodic_constraint = note_errors[-1] - note_errors[0]
    note_errors.pop()
    sym_pitches.pop(0)
    assert len(note_errors) == len(scale)
    error_sqr_sum = sum(error ** 2 for error in note_errors)
    cost = error_sqr_sum - vars_[LAGRANGE] * periodic_constraint
    equations = tuple(cost.diff(x) for x in vars_)
    steps_sol = sp.solve(equations, vars_)
    actual_errors = [float(error.subs(steps_sol)) for error in note_errors]
    rms_error = sqrt(sum(error ** 2 for error in actual_errors)) / len(actual_errors)
    avg_error = sum(abs(error) for error in actual_errors) / len(actual_errors)
    max_error = max(abs(error) for error in actual_errors)
    new_scale = [float(pitch.subs(steps_sol)) for pitch in sym_pitches]
    assert match_pattern(new_scale, pattern)
    return ScaleFit(
        pattern=pattern,
        pitches=new_scale,
        step_sizes={letter: steps_sol[vars_[letter]] for letter in set(pattern)},
        errors=actual_errors,
        rms_error=rms_error,
        avg_error=avg_error,
        max_error=max_error)
</pre>
This can be used like so:
<pre>
from pprint import pprint  # more readable output
initial_scale = scale_from_str('33/32 9/8 11/9 4/3 11/8 3/2 44/27 27/16 11/6 2/1')
initial_pattern = 'sLMLsLMzML'
assert match_pattern(initial_scale, initial_pattern)  # just a sanity check
# equalize s and z small steps in the scale:
fit = fit_scale(initial_scale, initial_pattern.replace('z', 's'))
pprint(fit)
</pre>
which outputs:
<pre>
ScaleFit(pattern='sLMLsLMsML', # the new pattern we fit to
        pitches=[56.16718090477973, # scale data
                  204.97122221148004,
                  350.39865289776657,
                  499.2026942044669,
                  555.3698751092467,
                  704.1739164159469,
                  849.6013471022335,
                  905.7685280070132,
                  1051.1959586932996,
                  1200.0],
        step_sizes={'L': 148.804041306700, # each step size specifically
                    'M': 145.427430686286,
                    's': 56.1671809047797},
        errors=[-2.0259663722433108, # signed errors for each note compared to the initial scale
                0.8682713023922943,
                -0.9647458915380938,
                0.9647458915412628,
                -0.8682713023889406,
                2.0259663722465575,
                0.1929491783161552,
                2.1224409613957675,
                -2.122440961392499,
                -0.19294917831297198],
        rms_error=0.4545581250243142, # RMS error (optimized for)
        avg_error=1.2348747411767853, # arithmetic mean (absolute) error
        max_error=2.1224409613957675) # maximum (absolute) error
</pre>
The scale can be printed to be used in an .scl file or similar contexts with this:
<pre>
print('\n'.join(map(str, fit.pitches)))
</pre>


== Notes ==
== Notes ==


<references />
<references />