Curve fitting

Curve fitting in ERLabPy largely relies on lmfit, a flexible curve fitting library for Python, and xarray-lmfit, a compatibility layer between xarray objects and lmfit models.

ERLabPy also provides optional integration of lmfit models with iminuit , which is a Python interface to the Minuit C++ library developed at CERN.

Note

If you are new to lmfit or xarray-lmfit, visit the lmfit documentation and the xarray-lmfit user guide first!

In this tutorial, we begin with some convenient functions that ERLabPy provides for common tasks such as Fermi edge fitting. Next, we will introduce some models that are available in ERLabPy. Finally, we will show how to use iminuit with lmfit models.

import lmfit
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

import erlab.analysis as era
import erlab.plotting as eplt
from erlab.io.exampledata import generate_gold_edge

Fermi edge fitting

Functions related to the Fermi edge are available in erlab.analysis.gold. To fit a polynomial to a Fermi edge, you can use erlab.analysis.gold.poly().

Hint

The interactive Fermi edge fitting tool can be used to generate the code below interactively.

gold = generate_gold_edge(temp=100, seed=1)

result = era.gold.poly(
    gold,
    angle_range=(-15, 15),
    eV_range=(-0.2, 0.2),
    temp=100.0,
    vary_temp=False,
    bkg_slope=False,
    degree=2,
    plot=True,
)
../_images/93823118a721bb550a927baa69643500f7c6738dca7275eb240a7efc72759b12.svg

The resulting polynomial can be used to correct the Fermi edge with erlab.analysis.gold.correct_with_edge():

corrected = era.gold.correct_with_edge(gold, result)

corrected.qplot(cmap="Greys")  # Plot the corrected data
eplt.fermiline()  # Annotate the Fermi level
<matplotlib.lines.Line2D at 0x7e321eab65d0>
../_images/0c94ffb61cc47a7196e1a256bc275665e89fd1cf1b39a3860fa56de2f2dbdefb.svg

Pre-defined models

Creating composite models with different prefixes every time can be cumbersome, so ERLabPy provides some pre-defined models in erlab.analysis.fit.models.

Before fitting, let us generate a Gaussian peak on a linear background:

# Generate toy data
x = np.linspace(0, 10, 50)
y = -0.1 * x + 2 + 3 * np.exp(-((x - 5) ** 2) / (2 * 1**2))

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(x, 0.3)
y = rng.normal(y, yerr)

# Plot the data
plt.errorbar(x, y, yerr, fmt="o")
<ErrorbarContainer object of 3 artists>
../_images/1b5a3e19ad1b488aed5f2b5678b717afdb5dd80e79072d2efb5eb2c601e6fbd3.svg

Fitting multiple peaks

One example is MultiPeakModel, which works like a composite model of multiple Gaussian or Lorentzian peaks.

By supplying keyword arguments, you can specify the number of peaks, their shapes, whether to multiply with a Fermi-Dirac distribution, the shape of the background, and whether to convolve the result with experimental resolution.

For a detailed explanation of all the arguments, see its documentation.

The model can be constructed as follows:

model = era.fit.models.MultiPeakModel(
    npeaks=1, peak_shapes=["gaussian"], fd=False, background="linear", convolve=False
)
params = model.make_params(p0_center=5.0, p0_width=0.2, p0_height=3.0)
params

We can now fit the model to the toy data:

result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
/home/docs/checkouts/readthedocs.org/user_builds/erlabpy/envs/stable/lib/python3.13/site-packages/IPython/core/events.py:100: UserWarning: There are no gridspecs with layoutgrids. Possibly did not call parent GridSpec with the "figure" keyword
  func(*args, **kwargs)
/home/docs/checkouts/readthedocs.org/user_builds/erlabpy/envs/stable/lib/python3.13/site-packages/IPython/core/pylabtools.py:170: UserWarning: There are no gridspecs with layoutgrids. Possibly did not call parent GridSpec with the "figure" keyword
  fig.canvas.print_figure(bytes_io, **kw)
../_images/28a72a402858729e463fa54a7ef63b198fc83b6cab94ff16ae23f3f7ecd60ca2.svg

We can also plot components.

comps = result.eval_components()
plt.errorbar(x, y, yerr, fmt="o", zorder=-1, alpha=0.3)
plt.plot(x, result.eval(), label="Best fit")
plt.plot(x, comps["1Peak_p0"], "--", label="Peak")
plt.plot(x, comps["1Peak_linear"], "--", label="Background")
plt.legend()
<matplotlib.legend.Legend at 0x7e321c2daad0>
../_images/621af714c22bf78d832540f81686844354b589b48c8916a84a98021e48342a9a.svg

Now, let us try fitting MDCs cut from simulated data with multiple Lorentzian peaks, convolved with a common instrumental resolution.

from erlab.io.exampledata import generate_data

data = generate_data(seed=1).T
cut = data.qsel(ky=0.3)
cut.qplot(colorbar=True)
<matplotlib.image.AxesImage at 0x7e321c1779d0>
../_images/0ef5beef8f1e19fcd48c6d65d50e4d19bee7f7ac23584dbe7674aa6648cfba72.svg
mdc = cut.qsel(eV=0.0)
mdc.qplot()
[<matplotlib.lines.Line2D at 0x7e321c1eed50>]
../_images/6777fc9f631ce7db571920d9ff7bc38e8dbc4705dd901df1bbf60cd2bfa05846.svg

First, we define the model and set the initial parameters.

model = era.fit.models.MultiPeakModel(
    npeaks=2, peak_shapes=["lorentzian"], fd=False, background="linear", convolve=True
)

params = model.make_params(
    p0_height=800.0,
    p0_center=-0.5,
    p0_width=0.03,
    p1_height=800.0,
    p1_center=0.5,
    p1_width=0.03,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)
params

Then, we can fit the model to the data using xarray.DataArray.xlm.modelfit() from xarray-lmfit:

result = mdc.xlm.modelfit("kx", model=model, params=params, guess=True)
_ = result.modelfit_results.item().plot()
/home/docs/checkouts/readthedocs.org/user_builds/erlabpy/envs/stable/lib/python3.13/site-packages/IPython/core/events.py:100: UserWarning: There are no gridspecs with layoutgrids. Possibly did not call parent GridSpec with the "figure" keyword
  func(*args, **kwargs)
/home/docs/checkouts/readthedocs.org/user_builds/erlabpy/envs/stable/lib/python3.13/site-packages/IPython/core/pylabtools.py:170: UserWarning: There are no gridspecs with layoutgrids. Possibly did not call parent GridSpec with the "figure" keyword
  fig.canvas.print_figure(bytes_io, **kw)
../_images/bc139249a6d37b591cbc5d4b3832d82c13f01bf08e6f89c3d1416628ba9827b2.svg

Fitting across multiple dimensions

Note

There is a dedicated module for Fermi edge fitting and correction, described [here](fermi edge fitting). The following example is for illustrative purposes.

Suppose you have to fit a single model to multiple data points across some dimension, or even multiple dimensions. xarray-lmfit can handle this with ease.

Let’s demonstrate this with a simulated cut that represents a curved Fermi edge at 100 K, with an energy broadening of 20 meV.

from erlab.io.exampledata import generate_gold_edge

gold = generate_gold_edge(temp=100, Eres=0.02, seed=1)
gold.qplot(cmap="Greys")
<matplotlib.image.AxesImage at 0x7e32161f3890>
../_images/5857eb9e55d761cc36b28b6c6e91c4bd6dc4387db963a22f2293c82cbff6e0bd.svg

We first select ± 0.2 eV around the Fermi level and fit the model across the energy axis for every EDC.

gold_selected = gold.sel(eV=slice(-0.2, 0.2))

model = era.fit.models.FermiEdgeModel()

params = {
    "temp": {"value": 100.0, "vary": False},
    "back1": {"value": 0.0, "vary": False},
}

result_ds = gold_selected.xlm.modelfit("eV", model, params=params, guess=True)
result_ds
<xarray.Dataset> Size: 360kB
Dimensions:                (alpha: 200, param: 7, cov_i: 7, cov_j: 7,
                            fit_stat: 9, eV: 75)
Coordinates:
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
  * param                  (param) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_i                  (cov_i) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_j                  (cov_j) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * fit_stat               (fit_stat) <U8 288B 'nfev' 'nvarys' ... 'rsquared'
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
Data variables:
    modelfit_results       (alpha) object 2kB <lmfit.model.ModelResult object...
    modelfit_coefficients  (alpha, param) float64 11kB -0.02484 100.0 ... -77.56
    modelfit_stderr        (alpha, param) float64 11kB nan nan ... 2.057 15.11
    modelfit_covariance    (alpha, cov_i, cov_j) float64 78kB nan nan ... 228.4
    modelfit_stats         (alpha, fit_stat) float64 14kB 45.0 5.0 ... 0.9873
    modelfit_data          (alpha, eV) float64 120kB 65.86 55.5 ... 1.385 0.3062
    modelfit_best_fit      (alpha, eV) float64 120kB 62.03 61.72 ... 0.6527

Notice how the data variables in the resulting Dataset now depend on the coordinate alpha. Let’s plot the center of the edge as a function of angle!

gold.qplot(cmap="Greys")
plt.errorbar(
    gold_selected.alpha,
    result_ds.modelfit_coefficients.sel(param="center"),
    result_ds.modelfit_stderr.sel(param="center"),
    fmt=".",
)
<ErrorbarContainer object of 3 artists>
../_images/4007f1ac13beefa231dc7440190cf15813bda339a9ffb01a9cb34e71ac114c72.svg

Parallelization

You can achieve parallel fitting by leveraging Dask.

Note

If you are new to Dask, please check out the xarray documentation and tutorial on using Dask with xarray.

Before fitting the model, you need to ensure that your data is properly chunked. You can do this by using xarray.DataArray.chunk():

gold_chunked = gold_selected.chunk({"alpha": 20})
gold_chunked
<xarray.DataArray (eV: 75, alpha: 200)> Size: 120kB
dask.array<xarray-<this-array>, shape=(75, 200), dtype=float64, chunksize=(75, 20), chunktype=numpy.ndarray>
Coordinates:
  * eV       (eV) float64 600B -0.1977 -0.1923 -0.187 ... 0.1876 0.193 0.1983
  * alpha    (alpha) float64 2kB -15.0 -14.85 -14.7 -14.55 ... 14.7 14.85 15.0
Attributes:
    sample_temp:  100

We have created chunks along the alpha dimension, allowing for parallel processing.

When we call xarray.DataArray.xlm.modelfit() with chunked data, the fitting is not performed immediately. Instead, it returns a lazy dask array that represents the computation graph:

result_ds = gold_chunked.xlm.modelfit("eV", model, params=params, guess=True)
result_ds
<xarray.Dataset> Size: 360kB
Dimensions:                (alpha: 200, param: 7, cov_i: 7, cov_j: 7,
                            fit_stat: 9, eV: 75)
Coordinates:
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
  * param                  (param) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_i                  (cov_i) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_j                  (cov_j) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * fit_stat               (fit_stat) <U8 288B 'nfev' 'nvarys' ... 'rsquared'
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
Data variables:
    modelfit_results       (alpha) object 2kB dask.array<chunksize=(20,), meta=np.ndarray>
    modelfit_coefficients  (alpha, param) float64 11kB dask.array<chunksize=(20, 7), meta=np.ndarray>
    modelfit_stderr        (alpha, param) float64 11kB dask.array<chunksize=(20, 7), meta=np.ndarray>
    modelfit_covariance    (alpha, cov_i, cov_j) float64 78kB dask.array<chunksize=(20, 7, 7), meta=np.ndarray>
    modelfit_stats         (alpha, fit_stat) float64 14kB dask.array<chunksize=(20, 9), meta=np.ndarray>
    modelfit_data          (alpha, eV) float64 120kB dask.array<chunksize=(20, 75), meta=np.ndarray>
    modelfit_best_fit      (alpha, eV) float64 120kB dask.array<chunksize=(20, 75), meta=np.ndarray>

We can compute the result by calling the compute method on the lazy dask array, which will trigger the actual computation:

result_ds = result_ds.compute()
result_ds
<xarray.Dataset> Size: 360kB
Dimensions:                (alpha: 200, param: 7, cov_i: 7, cov_j: 7,
                            fit_stat: 9, eV: 75)
Coordinates:
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
  * param                  (param) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_i                  (cov_i) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_j                  (cov_j) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * fit_stat               (fit_stat) <U8 288B 'nfev' 'nvarys' ... 'rsquared'
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
Data variables:
    modelfit_results       (alpha) object 2kB <lmfit.model.ModelResult object...
    modelfit_coefficients  (alpha, param) float64 11kB -0.02484 100.0 ... -77.56
    modelfit_stderr        (alpha, param) float64 11kB nan nan ... 2.057 15.11
    modelfit_covariance    (alpha, cov_i, cov_j) float64 78kB nan nan ... 228.4
    modelfit_stats         (alpha, fit_stat) float64 14kB 45.0 5.0 ... 0.9873
    modelfit_data          (alpha, eV) float64 120kB 65.86 55.5 ... 1.385 0.3062
    modelfit_best_fit      (alpha, eV) float64 120kB 62.03 61.72 ... 0.6527

Fitting multidimensional models

Fitting is not limited to 1D models. The following example demonstrates a global fit to the cut with a multidimensional model. First, we normalize the data with the averaged intensity of each EDC and then fit the data to FermiEdge2dModel.

gold_norm = gold_selected / gold_selected.mean("eV")
result_2d = gold_norm.T.xlm.modelfit(
    coords=["eV", "alpha"],
    model=era.fit.models.FermiEdge2dModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_2d
<xarray.Dataset> Size: 244kB
Dimensions:                (param: 8, cov_i: 8, cov_j: 8, fit_stat: 9, eV: 75,
                            alpha: 200)
Coordinates:
  * param                  (param) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
  * cov_i                  (cov_i) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
  * cov_j                  (cov_j) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
  * fit_stat               (fit_stat) <U8 288B 'nfev' 'nvarys' ... 'rsquared'
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
Data variables:
    modelfit_results       object 8B <lmfit.model.ModelResult object at 0x7e3...
    modelfit_coefficients  (param) float64 64B 0.03855 1.225e-05 ... 0.01538
    modelfit_stderr        (param) float64 64B 0.0001811 1.198e-05 ... 0.0009251
    modelfit_covariance    (cov_i, cov_j) float64 512B 3.279e-08 ... 8.558e-07
    modelfit_stats         (fit_stat) float64 72B 82.0 7.0 ... -5.884e+04 0.9741
    modelfit_data          (eV, alpha) float64 120kB 2.598 2.471 ... 0.01212
    modelfit_best_fit      (eV, alpha) float64 120kB 1.972 1.972 ... 0.02022

Let’s plot the fit results and the residuals.

best_fit = result_2d.modelfit_best_fit.transpose(*gold_norm.dims)

fig, axs = eplt.plot_slices(
    [gold_norm, best_fit, best_fit - gold_norm],
    figsize=(4, 5),
    cmap=["viridis", "viridis", "bwr"],
    norm=[plt.Normalize(), plt.Normalize(), eplt.CenteredPowerNorm(1.0, vcenter=0)],
    colorbar="all",
    hide_colorbar_ticks=False,
    colorbar_kw={"width": 7},
)
eplt.set_titles(axs, ["Data", "FermiEdge2dModel", "Residuals"])
../_images/20b96e71e71e5adb8260347e605ae71a58a5d9d046b0745d8aaa0cda1232fd49.svg

Various tips on curve fitting

Visualizing fits

Note

If you are viewing this documentation online, the plots will not be interactive. Run the code locally to try it out.

If hvplot is installed, we can visualize the fit results interactively with the xarray.Dataset.qshow() accessor.

To plot the data with the fit and fit components:

result_ds.qshow(plot_components=True)

To plot each parameter as a function of the coordinate:

result_ds.qshow.params()

Also, the interactive curve fitting tool supports loading data from fit result Datasets.

Saving and loading fits

See the xarray-lmfit documentation for details on saving and loading fit results.

GUI equivalent

Use ftool, goldtool, or restool when you want to tune fit windows, model options, and guesses interactively.

Using iminuit

Note

This part requires the optional iminuit dependency.

iminuit is a powerful Python interface to the Minuit C++ library developed at CERN. To learn more, see the iminuit documentation.

ERLabPy provides a thin wrapper around iminuit.Minuit that allows you to use lmfit models with iminuit. The example below conducts the same fit as the previous one, but using iminuit.

model = era.fit.models.MultiPeakModel(
    npeaks=2, peak_shapes=["lorentzian"], fd=False, convolve=True
)

m = era.fit.minuit.Minuit.from_lmfit(
    model,
    mdc,
    mdc.kx,
    p0_center=-0.5,
    p1_center=0.5,
    p0_width=0.03,
    p1_width=0.03,
    p0_height=1000,
    p1_height=1000,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)

m.migrad()
m.minos()
m.hesse()
Migrad
FCN = 938.1 (χ²/ndof = 3.9) Nfcn = 1413
EDM = 0.00013 (Goal: 0.0002) time = 0.2 sec
Valid Minimum Below EDM threshold (goal x 10)
No parameters at limit Below call limit
Hesse ok Covariance accurate
Name Value Hesse Error Minos Error- Minos Error+ Limit- Limit+ Fixed
0 const_bkg 0.87 0.08 -0.08 0.08
1 lin_bkg 0.0 0.1 yes
2 resolution 26.89e-3 0.12e-3 -0.12e-3 0.12e-3 0
3 p0_center -518.935e-3 0.017e-3 -0.017e-3 0.017e-3
4 p0_width 30.38e-3 0.11e-3 -0.11e-3 0.11e-3 0
5 p0_height 1.0804e3 0.0031e3 -0.0031e3 0.0031e3 0
6 p1_center 519.207e-3 0.017e-3 -0.017e-3 0.017e-3
7 p1_width 30.32e-3 0.11e-3 -0.11e-3 0.11e-3 0
8 p1_height 1.0912e3 0.0031e3 -0.0031e3 0.0031e3 0
const_bkg resolution p0_center p0_width p0_height p1_center p1_width p1_height
Error -0.08 0.08 -0.12e-3 0.12e-3 -0.017e-3 0.017e-3 -0.11e-3 0.11e-3 -3.1 3.1 -0.017e-3 0.017e-3 -0.11e-3 0.11e-3 -3.1 3.1
Valid True True True True True True True True True True True True True True True True
At Limit False False False False False False False False False False False False False False False False
Max FCN False False False False False False False False False False False False False False False False
New Min False False False False False False False False False False False False False False False False
const_bkg lin_bkg resolution p0_center p0_width p0_height p1_center p1_width p1_height
const_bkg 0.00666 0.000 4.119e-6 (0.431) -0.54e-9 -4.642e-6 (-0.505) 0.112 (0.443) -5.03e-9 (-0.004) -4.638e-6 (-0.504) 0.113 (0.443)
lin_bkg 0.000 0 0 0 0 0 0 0 0
resolution 4.119e-6 (0.431) 0 1.37e-08 0.01e-9 (0.005) -0.012e-6 (-0.878) 327.631e-6 (0.904) -0.03e-9 (-0.014) -0.012e-6 (-0.881) 332.441e-6 (0.906)
p0_center -0.54e-9 0 0.01e-9 (0.005) 2.97e-10 -0 63.40e-9 (0.001) -0 -0.01e-9 (-0.004) 241.84e-9 (0.004)
p0_width -4.642e-6 (-0.505) 0 -0.012e-6 (-0.878) -0 1.27e-08 -339.761e-6 (-0.974) 0.02e-9 (0.012) 0.010e-6 (0.792) -283.558e-6 (-0.803)
p0_height 0.112 (0.443) 0 327.631e-6 (0.904) 63.40e-9 (0.001) -339.761e-6 (-0.974) 9.57 -658.23e-9 (-0.012) -280.112e-6 (-0.804) 8 (0.822)
p1_center -5.03e-9 (-0.004) 0 -0.03e-9 (-0.014) -0 0.02e-9 (0.012) -658.23e-9 (-0.012) 2.92e-10 0.02e-9 (0.008) -507.69e-9 (-0.009)
p1_width -4.638e-6 (-0.504) 0 -0.012e-6 (-0.881) -0.01e-9 (-0.004) 0.010e-6 (0.792) -280.112e-6 (-0.804) 0.02e-9 (0.008) 1.27e-08 -344.031e-6 (-0.975)
p1_height 0.113 (0.443) 0 332.441e-6 (0.906) 241.84e-9 (0.004) -283.558e-6 (-0.803) 8 (0.822) -507.69e-9 (-0.009) -344.031e-6 (-0.975) 9.81
../_images/5b3de728d4db1ffad97852ee12b55089287b69ab088bd2eaddbba48d785efe12.svg

You can also use the interactive fitting interface provided by iminuit.

Note

  • Requires ipywidgets to be installed.

  • If you are viewing this documentation online, changing the sliders won’t change the plot. run the code locally to try it out.

m.interactive()