ERLabPy

Date: May 16, 2024


ERLabPy is a library that provides tools and utilities to handle, manipulate, and visualize data from condensed matter physics experiments, with a focus on angle-resolved photoemission spectroscopy (ARPES).

ERLabPy is built on top of the popular scientific Python libraries numpy, scipy, and xarray, and is designed to be easy to use and to integrate with existing scientific Python workflows so that you can quickly get started with your data analysis. The package is still under development, so if you have any questions or suggestions, please feel free to contact us. We hope you find ERLabPy useful for your research!

Getting started

The getting started guide provides installation instructions and an overview of the dependencies.

User guide

The user guide provides some tutorials and examples on how to use ERLabPy.

API reference

The reference guide provides detailed information about the API, including descriptions of most available methods and parameters.

Contributing guide

The contributing guide contains information on how to contribute to the project.

ImageTool screenshot

Getting Started

Installing

The recommended way to install ERLabPy is via conda. If you do not have conda installed, follow the installation instructions. Once you have a working conda environment, you can install ERLabPy with the conda command line tool:

conda install -c conda-forge erlab

Add any optional dependencies you want to install to the command above.

Hint

If you are using macOS, you might experience degraded performance with the default BLAS and LAPACK libraries. For Apple Silicon Macs, use Accelerate:

conda install "libblas=*=*accelerate"

For Intel Macs, use MKL:

conda install "libblas=*=*mkl"

To prevent conda from switching back to the default libraries, see the conda-forge documentation.

If you don’t use conda, you can install ERLabPy with pip:

python -m pip install erlab

Optional dependency groups can be installed with the following commands:

python -m pip install erlab[viz]       # Install optional dependencies for visualization
python -m pip install erlab[perf]      # Install optional dependencies for performance
python -m pip install erlab[misc]      # Install miscellaneous optional dependencies
python -m pip install erlab[complete]  # Install all optional dependencies

If you wish to install ERLabPy from source, see the Contributing Guide.

Dependencies

ERLabPy depends on many different Python libraries. Some key packages and links to their documentation are listed below as a reference. In particular, this documentation assumes basic familiarity with the first four packages, which will be sufficient for most use cases.

Package     Used in
numpy       Computation and array manipulation, linear algebra
xarray      Data storage and manipulation
matplotlib  Plotting
scipy       Linear algebra, signal processing, and image processing
lmfit       Optimization problems including curve fitting
pyqtgraph   Interactive plotting (i.e., ImageTool)

ERLabPy also requires a Qt library such as PyQt5, PyQt6, PySide2, or PySide6. To ensure compatibility and keep the namespace clean, ERLabPy imports Qt bindings from qtpy, which will automatically select the appropriate library based on what is installed.
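
For example, you can check which binding qtpy has selected (a minimal sketch; qtpy exposes the name of the chosen binding as API_NAME):

import qtpy

print(qtpy.API_NAME)  # e.g. "PyQt6", depending on which binding is installed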

See the User Guide to start using ERLabPy!

Optional dependencies

The following packages are optional dependencies that are not installed by default. They are only used in specific functions, or are not used at all but are listed for convenience.

Package                         Description
csaps                           Multidimensional smoothing splines
ipywidgets                      Interactive widgets
hvplot and bokeh                Interactive plotting
cmasher, cmocean, and colorcet  More colormaps!
numbagg and bottleneck          Fast multidimensional aggregation, accelerates xarray

For a full list of dependencies and optional dependencies, take a look at the [project] and [project.optional-dependencies] sections in pyproject.toml:

dependencies = [
    "h5netcdf>=1.2.0",
    "igor2>=0.5.6",
    "joblib>=1.3.2",
    "lmfit>=1.2.0,!=1.3.0",
    "matplotlib>=3.8.0",
    "numba>=0.59.0",
    "numpy>=1.26.0",
    "pyperclip>=1.8.2",
    "pyqtgraph>=0.13.1",
    "qtawesome>=1.3.1",
    "qtpy>=2.4.1",
    "scipy>=1.12.0",
    "tqdm>=4.66.2",
    "uncertainties>=3.1.4",
    "varname>=0.13.0",
    "xarray>=2024.02.0",
]

[project.optional-dependencies]
complete = ["erlab[viz,perf,misc,dev]"]
viz = ["cmasher", "cmocean", "colorcet", "hvplot", "ipywidgets"]
perf = ["numbagg>=0.8.1", "bottleneck>=1.3.8"]
misc = ["iminuit>=2.25.2", "csaps>=1.1.0", "dask>=2024.4.1"]
dev = [
    "mypy",
    "pre-commit",
    "pytest-cov",
    "pytest-qt",
    "pytest-xdist",
    "pytest",
    "python-semantic-release",
    "ruff",
]
docs = [
    "sphinx",
    "sphinx-autodoc-typehints",
    "sphinx-copybutton",
    "sphinxcontrib-bibtex",
    "sphinx-qt-documentation",
    "pybtex",
    "nbsphinx",
    "furo",
    "sphinx-design",
]

Notes on compatibility

  • ERLabPy is tested on Python 3.11 and 3.12. It is not guaranteed to work on older versions of Python.

  • There are some known compatibility issues with PyQt5 and PySide2, so it is recommended to use the newer PyQt6 or PySide6 if possible.

  • If you encounter any unexpected behavior while using IPython’s autoreload extension, try excluding the following modules:

    %aimport -erlab.io.dataloader -erlab.accessors
    

User Guide

Work in Progress

The user guide is incomplete. For the full list of packages and modules provided by ERLabPy, see API Reference.

Introduction

Welcome to the ERLabPy user guide! This guide provides an overview of ERLabPy and its core features.

If you are new to programming with Python, Scientific Python Lectures is a great place to start.

Data structures in ERLabPy are represented using xarray [1], which provides a powerful data structure for working with multi-dimensional arrays. Visit the xarray tutorial and the xarray user guide to get familiar with xarray.

Reading and writing data

In ERLabPy, most data are represented as xarray.Dataset or xarray.DataArray objects. An xarray.DataArray is similar to a wave in Igor Pro, but much more flexible. As opposed to the maximum of 4 dimensions in Igor, an xarray.DataArray can have as many dimensions as you want (up to 64). Another advantage is that the coordinates of the dimensions do not have to be evenly spaced. In fact, they are not limited to numbers but can be any type of data, such as date and time representations.
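
For example, a DataArray can use timestamps as coordinates (a minimal sketch with made-up values):

import xarray as xr
import pandas as pd

# Coordinates do not have to be evenly spaced numbers; here they are timestamps
times = pd.to_datetime(["2024-05-16 10:00", "2024-05-16 10:07", "2024-05-16 10:30"])
temperature = xr.DataArray([20.0, 19.4, 18.9], coords={"time": times}, dims="time")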

This guide will introduce you to reading and writing data from and to various file formats, and to implementing a custom reader for an experimental setup.

Skip to the corresponding section for guides on loading ARPES data.

Reading data

Python has a wide range of libraries to read and write data. Here, we will focus on working with Igor Pro and xarray objects.

From Igor Pro

Warning

Loading waves from complex .pxp files may fail or produce unexpected results. It is recommended to export the waves to a .ibw file to load them in ERLabPy. If you encounter any problems, please let us know by opening an issue.

ERLabPy can read .ibw, .pxt, .pxp, and HDF5 files exported from Igor Pro using the functions in erlab.io.igor. For convenience, the most commonly used of these functions are also available directly in the erlab.io namespace.

Note

Internally, the igor2 package is used to read the data.

From arbitrary formats

Spreadsheet data can be read using pandas.read_csv() or pandas.read_excel(). The resulting DataFrame can be converted to an xarray object using pandas.DataFrame.to_xarray() or xarray.Dataset.from_dataframe().
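
For example, a CSV file can be read and converted in a couple of lines (a minimal sketch; the file name and column names are hypothetical):

import pandas as pd

# Read a CSV file with columns "angle" and "intensity", indexed by "angle"
df = pd.read_csv("spectrum.csv", index_col="angle")

# Convert to an xarray.Dataset and extract a DataArray
darr = df.to_xarray()["intensity"]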

When reading HDF5 files with arbitrary groups and metadata, you must first explore the group structure using h5netcdf or h5py. Loading a specific HDF5 group into an xarray object can be done using xarray.open_dataset() or xarray.open_mfdataset() by supplying the group argument. For an example that handles complex HDF5 groups, see the implementation of erlab.io.plugins.ssrl52.SSRL52Loader.load_single().
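
For instance, you can first list the groups with h5py and then read one of them (a sketch; the file and group names are hypothetical, and some files may require additional backend arguments):

import h5py
import xarray as xr

# Explore the group structure of the file
with h5py.File("measurement.h5", "r") as f:
    print(list(f.keys()))

# Load a specific group into an xarray.Dataset
ds = xr.open_dataset("measurement.h5", group="scan_001", engine="h5netcdf")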

FITS files can be read with astropy.

Writing data

Since the state and variables of a Python interpreter are not saved, it is important to save your data in a format that can be easily read and written.

While it is possible to save and load entire Python interpreter sessions using pickle or the more versatile dill, it is out of the scope of this guide. Instead, we recommend saving your data in a format that is easy to read and write, such as HDF5 or NetCDF. These formats are supported by many programming languages and are optimized for fast read and write operations.

To save and load xarray objects, see the xarray documentation on I/O operations. ERLabPy offers convenience functions load_hdf5 and save_as_hdf5 for loading and saving xarray objects from and to HDF5 files, and save_as_netcdf for saving to NetCDF files.
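
For example, an xarray object can be round-tripped through an HDF5 file like this (a minimal sketch; darr is any xarray.DataArray and the file name is arbitrary):

import erlab.io

erlab.io.save_as_hdf5(darr, filename="my_data.h5")  # write to HDF5
darr_loaded = erlab.io.load_hdf5("my_data.h5")  # read it back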

To Igor Pro

As an experimental feature, save_as_hdf5 can save certain xarray.DataArrays in a format that is compatible with the Igor Pro HDF5 loader. An accompanying Igor procedure is available in the repository. If loading in Igor Pro fails, try saving again with all attributes removed.

Alternatively, igorwriter can be used to write numpy arrays to .ibw and .itx files directly.

Loading ARPES data

Warning

ERLabPy is still in development and the API may change. Some major changes regarding data loading and handling are planned:

  • The xarray datatree structure will enable much more intuitive and powerful data handling. Once the feature gets incorporated into xarray, ERLabPy will be updated to use it.

  • A universal translation layer between true data header attributes and human-readable representations will be implemented. This will allow for more consistent and user-friendly data handling.

ERLabPy’s data loading framework consists of various plugins, or loaders, each designed to load data from a different beamline or laboratory. Each loader is a class that has a load method which takes a file path or sequence number and returns data.

Let’s see the list of loaders available by default:

[1]:
import erlab.io

erlab.io.loaders
[1]:
Name    Aliases                         Loader class
ssrl    ssrl52, bl5-2                   erlab.io.plugins.ssrl52.SSRL52Loader
merlin  ALS_BL4, als_bl4, BL403, bl403  erlab.io.plugins.merlin.BL403Loader
da30    DA30                            erlab.io.plugins.da30.DA30Loader
kriss   KRISS                           erlab.io.plugins.kriss.KRISSLoader

You can access each loader by its name or alias, either as an attribute or as an item. For example, to access the loader for the ALS beamline 4.0.3, you can use any of the following:

[3]:
erlab.io.loaders["merlin"]
erlab.io.loaders["bl403"]
erlab.io.loaders.merlin
erlab.io.loaders.bl403
[3]:
<erlab.io.plugins.merlin.BL403Loader at 0x7f463ccea390>

Data loading is done by calling the load method of the loader. It requires an identifier parameter, which can be a path to a file or a sequence number. It also accepts a data_dir parameter, which specifies the directory where the data is stored.

  • If identifier is a sequence number, data_dir must be provided.

  • If identifier is a string and data_dir is provided, the path is constructed by joining data_dir and identifier.

  • If identifier is a string and data_dir is not provided, identifier should be a valid path to a file.

Suppose we have data from the ALS beamline 4.0.3 stored as /path/to/data/f_001.pxt, /path/to/data/f_002.pxt, etc. To load f_001.pxt, all three of the following are valid:

loader = erlab.io.loaders["merlin"]

loader.load("/path/to/data/f_001.pxt")
loader.load("f_001.pxt", data_dir="/path/to/data")
loader.load(1, data_dir="/path/to/data")
Setting the default loader and data directory

In practice, a loader and a single directory will be used repeatedly in a session to load different data from the same experiment.

Instead of explicitly specifying the loader and directory each time, a default loader and data directory can be set with erlab.io.set_loader() and erlab.io.set_data_dir(). All subsequent calls to the shortcut function erlab.io.load() will use the specified loader and data directory.

erlab.io.set_loader("merlin")
erlab.io.set_data_dir("/path/to/data")
data_1 = erlab.io.load(1)
data_2 = erlab.io.load(2)

The loader and data directory can also be controlled with a context manager:

with erlab.io.loader_context("merlin", data_dir="/path/to/data"):
    data_1 = erlab.io.load(1)
Data across multiple files

For setups like the ALS beamline 4.0.3, some scans are stored over multiple files like f_003_S001.pxt, f_003_S002.pxt, and so on. In this case, the loader will automatically concatenate all files in the same scan. For example, all of the following will return the same concatenated data:

erlab.io.load(3)
erlab.io.load("f_003_S001.pxt")
erlab.io.load("f_003_S002.pxt")
...

If you want to cherry-pick a single file, you can pass single=True to load:

erlab.io.load("f_003_S001.pxt", single=True)
Handling multiple data directories

If you call erlab.io.set_loader() or erlab.io.set_data_dir() multiple times, the last call will override the previous ones. While this is useful for changing the loader or data directory, it makes data loading dependent on execution order. This may lead to unexpected behavior.

If you plan to use multiple loaders or data directories in the same session, it is recommended to use the context manager. If you have to load data from multiple directories multiple times, it may be convenient to define functions that set the loader and data directory and call erlab.io.load() with the appropriate arguments. For example:

def load1(identifier):
    with erlab.io.loader_context("merlin", data_dir="/path/to/data1"):
        return erlab.io.load(identifier)
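
If a second experiment lives in a different directory or uses a different loader, it can be wrapped in the same way (the loader name and path below are hypothetical):

def load2(identifier):
    with erlab.io.loader_context("ssrl", data_dir="/path/to/data2"):
        return erlab.io.load(identifier)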
Summarizing data

Some loaders have generate_summary implemented, which generates a pandas.DataFrame containing an overview of the data in a given directory. The generated summary can be viewed as a table with the summarize method. If ipywidgets is installed, an interactive widget is also displayed. This is useful for quickly skimming through the data.

Just like load, summarize can also be accessed with the shortcut function erlab.io.summarize(). For example, to display a summary of the data available in the directory /path/to/data using the 'merlin' loader:

erlab.io.set_loader("merlin")
erlab.io.set_data_dir("/path/to/data")
erlab.io.summarize()

To see what the generated summary looks like, see the example below.

Implementing a data loader plugin

It is easy to add new loaders to the framework. Any subclass of LoaderBase is automatically registered as a loader! The class must have a valid name attribute which is used to access the loader.

If the name attribute is prefixed with an underscore, the registration is skipped. This is useful for base classes that are not meant to be used directly.

Data loading flow

ARPES data from a single experiment are usually stored in one folder, with files that look like file_0001.h5, file_0002.h5, etc. If the naming scheme does not deviate from this pattern, only two methods need to be implemented: identify and load_single. The following flowchart shows the process of loading data from a single scan, given either a file path or a sequence number:

_images/flowchart_single.pdf

Here, identify is given an integer sequence number (identifier) and the path to the data folder (data_dir), and returns the full path to the corresponding data file.

The method load_single is given the full path to a single data file and must return the data as an xarray.DataArray or xarray.Dataset. If the data cannot be combined into a single object, the method can also return a list of xarray.DataArray objects.

If only all data formats were as simple as this! Unfortunately, there are some setups where data for a single scan is saved over multiple files. In this case, the files will look like file_0001_0001.h5, file_0001_0002.h5, etc. For these kinds of setups, an additional method infer_index must be implemented. The following flowchart shows the process of loading data from multiple files:

_images/flowchart_multiple.pdf

In this case, the method identify should resolve all files that belong to the given sequence number, and return a list of file paths along with a dictionary of corresponding coordinates.

The method infer_index is given a bare file name (without the extension and path) and must return the sequence number of the scan. For example, given the file name file_0003_0123, the method should return 3.
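
A bare-bones implementation for the hypothetical naming scheme above might look like this (a sketch; the regular expression assumes names like file_0003 or file_0003_0123):

import re


def infer_index(self, name):
    # Extract the scan number from names like "file_0003" or "file_0003_0123"
    match = re.match(r"file_(\d{4})(?:_\d{4})?", name)
    if match is None:
        return None, None
    return int(match.group(1)), {}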

Conventions

There are some rules that loaded ARPES data must follow to ensure that analysis procedures such as momentum conversion and fitting work seamlessly:

  • The experimental geometry should be stored in the 'configuration' attribute as an integer. See Nomenclature and AxesConfiguration for more information.

  • All standard angle coordinates must follow the naming conventions in Nomenclature.

  • The sample temperature, if available, should be stored in the 'temp_sample' attribute.

  • The sample work function, if available, should be stored in the 'sample_workfunction' attribute.

  • Energies should be given in electronvolts.

  • Angles should be given in degrees.

  • Temperatures should be given in Kelvins.

By default, all loaders perform a basic check for a subset of these rules using validate and raise a warning if any are missing. This behavior can be controlled with the loader class attributes skip_validate and strict_validation.
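
As a rough illustration, a DataArray that follows these conventions could carry attributes like the following, where data is any loaded xarray.DataArray (the values are hypothetical):

data = data.assign_attrs(
    {
        "configuration": 1,  # experimental geometry, see AxesConfiguration
        "temp_sample": 20.0,  # sample temperature in K
        "sample_workfunction": 4.3,  # sample work function in eV
    }
)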

A minimal example

Consider a setup that saves data into a .csv file named data_0001.csv and so on. A bare minimum implementation of a loader for the setup will look something like this:

[4]:
import os

import pandas as pd
from erlab.io.dataloader import LoaderBase


class MyLoader(LoaderBase):
    name = "my_loader"
    aliases = None
    name_map = {}
    coordinate_attrs = {}
    additional_attrs = {"information": "any metadata you want to load with the data"}
    skip_validate = False
    always_single = True

    def identify(self, num, data_dir):
        file = os.path.join(data_dir, f"data_{str(num).zfill(4)}.csv")
        return [file], {}

    def load_single(self, file_path):
        return pd.read_csv(file_path).to_xarray()
[5]:
erlab.io.loaders
[5]:
Name       Aliases                         Loader class
ssrl       ssrl52, bl5-2                   erlab.io.plugins.ssrl52.SSRL52Loader
merlin     ALS_BL4, als_bl4, BL403, bl403  erlab.io.plugins.merlin.BL403Loader
da30       DA30                            erlab.io.plugins.da30.DA30Loader
kriss      KRISS                           erlab.io.plugins.kriss.KRISSLoader
my_loader                                  __main__.MyLoader
[6]:
erlab.io.loaders["my_loader"]
[6]:
<__main__.MyLoader at 0x7f462c5399d0>

We can see that the loader has been registered.

A complex example

Next, let’s try to write a more realistic loader for a hypothetical setup that saves data as HDF5 files with the following naming scheme: data_001.h5, data_002.h5, and so on. Scans spanning multiple files are named like data_001_S001.h5, data_001_S002.h5, etc., with the scan axis information stored in a separate file named data_001_axis.csv.

Let us first generate a data directory and place some synthetic data in it. Before saving, we rename and set some attributes that resemble real ARPES data.

[7]:
import csv
import datetime
import os
import tempfile

import erlab.io
import numpy as np
from erlab.io.exampledata import generate_data_angles


def make_data(beta=5.0, temp=20.0, hv=50.0, bandshift=0.0):
    data = generate_data_angles(
        shape=(250, 1, 300),
        angrange={"alpha": (-15, 15), "beta": (beta, beta)},
        hv=hv,
        configuration=1,
        temp=temp,
        bandshift=bandshift,
        assign_attributes=False,
        seed=1,
    ).T

    # Rename coordinates. The loader must rename them back to the original names.
    data = data.rename(
        {
            "alpha": "ThetaX",
            "beta": "Polar",
            "eV": "BindingEnergy",
            "hv": "PhotonEnergy",
            "xi": "Tilt",
            "delta": "Azimuth",
        }
    )

    # Assign some attributes that real data would have
    data = data.assign_attrs(
        {
            "LensMode": "Angular30",  # Lens mode of the analyzer
            "SpectrumType": "Fixed",  # Acquisition mode of the analyzer
            "PassEnergy": 10,  # Pass energy of the analyzer
            "UndPol": 0,  # Undulator polarization
            "DateTime": datetime.datetime.now().isoformat(),  # Acquisition time
            "TB": temp,
            "X": 0.0,
            "Y": 0.0,
            "Z": 0.0,
        }
    )
    return data


# Create a temporary directory
tmp_dir = tempfile.TemporaryDirectory()

# Define coordinates for the scan
beta_coords = np.linspace(2, 7, 10)

# Generate and save cuts with different beta values
for i, beta in enumerate(beta_coords):
    erlab.io.save_as_hdf5(
        make_data(beta=beta, temp=20.0, hv=50.0),
        filename=f"{tmp_dir.name}/data_001_S{str(i+1).zfill(3)}.h5",
        igor_compat=False,
    )

# Write scan coordinates to a csv file
with open(f"{tmp_dir.name}/data_001_axis.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Index", "Polar"])

    for i, beta in enumerate(beta_coords):
        writer.writerow([i + 1, beta])

# Generate some cuts with different band shifts
for i in range(4):
    erlab.io.save_as_hdf5(
        make_data(beta=5.0, temp=20.0, hv=50.0, bandshift=-i * 0.05),
        filename=f"{tmp_dir.name}/data_{str(i+2).zfill(3)}.h5",
        igor_compat=False,
    )

# List the generated files
sorted(os.listdir(tmp_dir.name))
[7]:
['data_001_S001.h5',
 'data_001_S002.h5',
 'data_001_S003.h5',
 'data_001_S004.h5',
 'data_001_S005.h5',
 'data_001_S006.h5',
 'data_001_S007.h5',
 'data_001_S008.h5',
 'data_001_S009.h5',
 'data_001_S010.h5',
 'data_001_axis.csv',
 'data_002.h5',
 'data_003.h5',
 'data_004.h5',
 'data_005.h5']

Now that we have the data, let’s implement the loader. The biggest difference from the previous example is that we need to handle multiple files for a single scan in identify. Also, we have to implement infer_index to extract the scan number from the file name.

[8]:
import glob
import re

import pandas as pd
from erlab.io.dataloader import LoaderBase


class ExampleLoader(LoaderBase):
    name = "example"

    aliases = ["Ex"]

    name_map = {
        "eV": "BindingEnergy",
        "alpha": "ThetaX",
        "beta": [
            "Polar",
            "Polar Compens",
        ],  # Can have multiple names assigned to the same name
        "delta": "Azimuth",
        "xi": "Tilt",
        "x": "X",
        "y": "Y",
        "z": "Z",
        "hv": "PhotonEnergy",
        "polarization": "UndPol",
        "temp_sample": "TB",
    }

    coordinate_attrs: tuple[str, ...] = (
        "beta",
        "delta",
        "xi",
        "hv",
        "x",
        "y",
        "z",
        "polarization",
        "photon_flux",
    )
    # Attributes to be used as coordinates. Place all attributes that we don't want to
    # lose when merging multiple file scans here.

    additional_attrs = {
        "configuration": 1,  # Experimental geometry. Required for momentum conversion
        "sample_workfunction": 4.3,
    }
    # Any additional metadata you want to add to the data. Note that attributes defined
    # here will not be transformed into coordinates. If you wish to promote some fixed
    # attributes to coordinates, add them to additional_coords.

    additional_coords = {}
    # Additional non-dimension coordinates to be added to the data, for instance the
    # photon energy for lab-based ARPES.

    skip_validate = False

    always_single = False

    def identify(self, num, data_dir):
        coord_dict = {}

        # Look for scans with data_###_S###.h5, and sort them
        files = glob.glob(f"data_{str(num).zfill(3)}_S*.h5", root_dir=data_dir)
        files.sort()

        if len(files) == 0:
            # If no files found, look for data_###.h5
            files = glob.glob(f"data_{str(num).zfill(3)}.h5", root_dir=data_dir)

        else:
            # If files found, extract coordinate values from the filenames
            axis_file = f"{data_dir}/data_{str(num).zfill(3)}_axis.csv"
            with open(axis_file) as f:
                header = f.readline().strip().split(",")

            # Load the coordinates from the csv file
            coord_arr = np.loadtxt(axis_file, delimiter=",", skiprows=1)

            # Each header entry will contain a dimension name
            for i, hdr in enumerate(header[1:]):
                key = self.name_map_reversed.get(hdr, hdr)
                coord_dict[key] = coord_arr[: len(files), i + 1].astype(np.float64)

        if len(files) == 0:
            # If no files found up to this point, raise an error
            raise FileNotFoundError(f"No files found for scan {num} in {data_dir}")

        # Files must be full paths
        files = [os.path.join(data_dir, f) for f in files]

        return files, coord_dict

    def load_single(self, file_path):
        data = erlab.io.load_hdf5(file_path)

        # To prevent conflicts when merging multiple scans, we rename the coordinates
        # prior to concatenation
        return self.process_keys(data)

    def infer_index(self, name):
        # Get the scan number from file name
        try:
            scan_num: str = re.match(r".*?(\d{3})(?:_S\d{3})?", name).group(1)
        except (AttributeError, IndexError):
            return None, None

        if scan_num.isdigit():
            # The second return value, a dictionary, is reserved for more complex
            # setups. See tips below for a brief explanation.
            return int(scan_num), {}
        else:
            return None, None
[9]:
erlab.io.loaders
[9]:
Name       Aliases                         Loader class
ssrl       ssrl52, bl5-2                   erlab.io.plugins.ssrl52.SSRL52Loader
merlin     ALS_BL4, als_bl4, BL403, bl403  erlab.io.plugins.merlin.BL403Loader
da30       DA30                            erlab.io.plugins.da30.DA30Loader
kriss      KRISS                           erlab.io.plugins.kriss.KRISSLoader
my_loader                                  __main__.MyLoader
example    Ex                              __main__.ExampleLoader

We can see that the example loader has been registered. Let’s test the loader by loading and plotting some data.

[10]:
erlab.io.set_loader("example")
erlab.io.set_data_dir(tmp_dir.name)
erlab.io.load(1)
[10]:
<xarray.DataArray (beta: 10, eV: 300, alpha: 250)> Size: 6MB
54.6 50.96 51.68 60.83 63.45 55.18 ... 3.076 0.9879 0.8741 2.515 1.359 1.262
Coordinates:
  * alpha         (alpha) float64 2kB -15.0 -14.88 -14.76 ... 14.76 14.88 15.0
  * beta          (beta) float64 80B 2.0 2.556 3.111 3.667 ... 5.889 6.444 7.0
  * eV            (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12
    xi            float64 8B 0.0
    delta         float64 8B 0.0
    hv            float64 8B 50.0
    x             float64 8B 0.0
    y             float64 8B 0.0
    z             float64 8B 0.0
    polarization  int64 8B 0
Attributes:
    LensMode:             Angular30
    SpectrumType:         Fixed
    PassEnergy:           10
    DateTime:             2024-05-16T02:22:00.471983
    TB:                   20.0
    temp_sample:          20.0
    configuration:        1
    sample_workfunction:  4.3
    data_loader_name:     example
[11]:
erlab.io.load(5).qplot()
[11]:
<matplotlib.image.AxesImage at 0x7f462c26a9d0>
_images/user-guide_io_16_1.svg

Brilliant! We now have a working loader for our hypothetical setup. However, we can’t use erlab.io.summarize() with our loader since we haven’t implemented generate_summary.

This method should return a pandas.DataFrame with the index containing file names. The only requirement for the DataFrame is that it should include a column named 'Path' that contains the paths to the data files. Other than that, the DataFrame can contain any metadata you wish to display in the summary. Let’s implement it in a subclass of the example loader:

[12]:
class ExampleLoaderComplete(ExampleLoader):
    name = "example_complete"
    aliases = ["ExC"]

    def generate_summary(self, data_dir):
        # Get all valid data files in directory
        files = {}
        for path in erlab.io.utilities.get_files(data_dir, extensions=[".h5"]):
            # Base name
            data_name = os.path.splitext(os.path.basename(path))[0]

            # If multiple scans, strip the _S### part
            name_match = re.match(r"(.*?_\d{3})_(?:_S\d{3})?", data_name)
            if name_match is not None:
                data_name = name_match.group(1)

            files[data_name] = path

        # Map dataframe column names to data attributes
        attrs_mapping = {
            "Lens Mode": "LensMode",
            "Scan Type": "SpectrumType",
            "T(K)": "temp_sample",
            "Pass E": "PassEnergy",
            "Polarization": "polarization",
            "hv": "hv",
            "x": "x",
            "y": "y",
            "z": "z",
            "polar": "beta",
            "tilt": "xi",
            "azi": "delta",
        }
        column_names = ["File Name", "Path", "Time", "Type", *attrs_mapping.keys()]

        data_info = []

        processed_indices = set()
        for name, path in files.items():
            # Skip already processed multi-file scans
            index, _ = self.infer_index(name)
            if index in processed_indices:
                continue
            elif index is not None:
                processed_indices.add(index)

            # Load data
            data = self.load(path)

            # Determine type of scan
            data_type = "core"
            if "alpha" in data.dims:
                data_type = "cut"
            if "beta" in data.dims:
                data_type = "map"
            if "hv" in data.dims:
                data_type = "hvdep"

            data_info.append(
                [
                    name,
                    path,
                    datetime.datetime.fromisoformat(data.attrs["DateTime"]),
                    data_type,
                ]
            )

            for k, v in attrs_mapping.items():
                # Try to get the attribute from the data, then from the coordinates
                try:
                    val = data.attrs[v]
                except KeyError:
                    try:
                        val = data.coords[v].values
                        if val.size == 1:
                            val = val.item()
                    except KeyError:
                        val = ""

                # Convert polarization values to human readable form
                if k == "Polarization":
                    if np.iterable(val):
                        val = np.asarray(val).astype(int)
                    else:
                        val = [round(val)]
                    val = [{0: "LH", 2: "LV", -1: "RC", 1: "LC"}.get(v, v) for v in val]
                    if len(val) == 1:
                        val = val[0]

                data_info[-1].append(val)

            del data

        # Sort by time and set index
        return (
            pd.DataFrame(data_info, columns=column_names)
            .sort_values("Time")
            .set_index("File Name")
        )

erlab.io.loaders
[12]:
Name              Aliases                         Loader class
ssrl              ssrl52, bl5-2                   erlab.io.plugins.ssrl52.SSRL52Loader
merlin            ALS_BL4, als_bl4, BL403, bl403  erlab.io.plugins.merlin.BL403Loader
da30              DA30                            erlab.io.plugins.da30.DA30Loader
kriss             KRISS                           erlab.io.plugins.kriss.KRISSLoader
my_loader                                         __main__.MyLoader
example           Ex                              __main__.ExampleLoader
example_complete  ExC                             __main__.ExampleLoaderComplete

The implementation looks complicated, but most of the code is boilerplate and the actual logic is quite simple. You get a list of file names and paths to generate a summary for, define the DataFrame columns and corresponding attributes, and then load the data one by one and extract the metadata. Let’s see what the resulting summary looks like.

Note

  • If ipywidgets is not installed, only the DataFrame will be displayed.

  • If you are viewing this documentation online, the summary will not be interactive. Run the code locally to try it out.

[13]:
erlab.io.set_loader("example_complete")
erlab.io.summarize()
File Name  Type  Lens Mode  Scan Type  T(K)  Pass E  Polarization  hv  x  y  z  polar             tilt  azi
data_001   map   Angular30  Fixed      20    10      LH            50  0  0  0  2→7 (0.5556, 10)  0     0
data_002   cut   Angular30  Fixed      20    10      LH            50  0  0  0  5                 0     0
data_003   cut   Angular30  Fixed      20    10      LH            50  0  0  0  5                 0     0
data_004   cut   Angular30  Fixed      20    10      LH            50  0  0  0  5                 0     0
data_005   cut   Angular30  Fixed      20    10      LH            50  0  0  0  5                 0     0
[13]:

Each cell in the summary table is formatted with formatter. If additional formatting that cannot be achieved within generate_summary is needed, formatter can be overridden in the subclass.

Tips
  • The data loading framework is designed to be simple and flexible, but it may not cover all possible setups. If you encounter a setup that cannot be loaded with the existing loaders, please let us know by opening an issue!

  • Before implementing a loader, see erlab.io.dataloader for descriptions about each attribute, and the values and types of the expected outputs. The implementation of existing loaders in the erlab.io.plugins module is a good starting point; see the source code on github.

  • If you have implemented a new loader or have improved an existing one, consider contributing it to the ERLabPy project by opening a pull request. We are always looking for new loaders to support more experimental setups! See more about contributing in the Contributing Guide.

  • If you wish to add post-processing steps that apply to all data loaded by a loader, such as fixing the sign of the binding energy coordinates, you can override post_process, which by default handles coordinate and attribute renaming. This method is called after the data is loaded and can be used to modify the data before it is returned.

  • For complex data structures, constructing a full path from just the sequence number and the data directory can be difficult. In this case, identify can be implemented to take additional keyword arguments. All keyword arguments passed to load are passed on to identify!

    For instance, consider data with different prefixes like A_001.h5, A_002.h5, B_001.h5, etc. stored in the same directory. The sequence number alone is not enough to construct the full path. In this case, identify can be implemented to take an additional prefix argument which eliminates the ambiguity (see the sketch after these tips). Then, A_001.h5 can be loaded with erlab.io.load(1, prefix="A").

    If there are multiple file scans in this setup like A_001_S001.h5, A_001_S002.h5, etc., we would want to pass the prefix parameter to load from an identifier given as a file name. This is where the second return value of infer_index comes in handy, where you can return a dictionary which is passed to load.
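
A sketch of what identify could look like for the prefixed setup described above (the naming scheme and the prefix argument are hypothetical):

import glob
import os


def identify(self, num, data_dir, prefix="A"):
    # Collect all files belonging to the scan, e.g. A_001.h5 or A_001_S001.h5, A_001_S002.h5, ...
    files = sorted(glob.glob(f"{prefix}_{str(num).zfill(3)}*.h5", root_dir=data_dir))
    return [os.path.join(data_dir, f) for f in files], {}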

Selecting and indexing data

In most cases, the powerful data manipulation and indexing methods provided by xarray are sufficient. On this page, some frequently used xarray features are summarized, along with some utilities provided by this package. Refer to the xarray user guide for more information.

First, let us import some example data: a simple tight binding simulation of graphene.

[1]:
import xarray as xr

xr.set_options(display_expand_data=False)
[1]:
<xarray.core.options.set_options at 0x7fad49282e10>
[2]:
from erlab.io.exampledata import generate_data

dat = generate_data(seed=1).T
[3]:
dat
[3]:
<xarray.DataArray (eV: 300, ky: 250, kx: 250)> Size: 150MB
0.5243 1.033 0.6037 1.048 0.4388 ... 0.0003526 5.536e-06 2.813e-07 6.99e-08
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09

We can see that the generated data is a three-dimensional xarray.DataArray. Now, let’s extract a cut along \(k_y = 0.3\).

[4]:
dat.sel(ky=0.3, method="nearest")
[4]:
<xarray.DataArray (eV: 300, kx: 250)> Size: 600kB
1.535 1.377 0.9181 0.4302 0.5897 ... 1.171e-06 8.757e-06 0.0002878 0.001415
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    ky       float64 8B 0.2967
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09

How about the Fermi surface?

[5]:
dat.sel(eV=0.0, method="nearest")
[5]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.3501 0.1119 0.1255 0.1379 0.05128 ... 0.5261 0.2332 0.1398 0.1466 0.1662
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    eV       float64 8B -0.000301

In many scenarios, it is necessary to perform integration across multiple array slices. This can be done by slicing and averaging. The following code integrates the intensity over a window of 50 meV centered at \(E_F\).

[6]:
dat.sel(eV=slice(-0.025, 0.025)).mean("eV")
[6]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89

However, doing this every time is cumbersome, and we have lost the coordinate eV. ERLabPy provides a callable accessor qsel to streamline this process.

[7]:
dat.qsel(eV=0.0, eV_width=0.05)
[7]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    eV       float64 8B 0.000602

Note that the averaged coordinate is automatically added to the data array. This is useful for plotting and further analysis.

If the width is not specified, qsel behaves like passing method="nearest" to sel. If a slice is given instead of a single value, no integration is performed. All of these methods can be combined:

[8]:
dat.qsel(kx=slice(-0.3, 0.3), ky=0.3, eV=0.0, eV_width=0.05)
[8]:
<xarray.DataArray (kx: 84)> Size: 672B
0.3407 0.3622 0.3589 0.3659 0.2786 0.3363 ... 0.3541 0.318 0.3214 0.305 0.2766
Coordinates:
  * kx       (kx) float64 672B -0.2967 -0.2895 -0.2824 ... 0.2824 0.2895 0.2967
    ky       float64 8B 0.2967
    eV       float64 8B 0.000602
Masking

In some cases, it is necessary to mask the data. Although basic masks are supported by xarray, ERLabPy provides a way to mask data with arbitrary polygons.

Work in Progress

This part of the user guide is still under construction. For now, see erlab.analysis.mask. For the full list of packages and modules provided by ERLabPy, see API Reference.

Interpolation

In addition to the powerful interpolation methods provided by xarray, ERLabPy provides a convenient way to interpolate data along an arbitrary path.

Consider a Γ-M-K-Γ high symmetry path given as a list of kx and ky coordinates:

[9]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
import numpy as np

a = 6.97
kx = [0, 2 * np.pi / (a * np.sqrt(3)), 2 * np.pi / (a * np.sqrt(3)), 0]
ky = [0, 0, 2 * np.pi / (a * 3), 0]


dat.qsel(eV=-0.2).qplot(aspect="equal", cmap="Greys")
plt.plot(kx, ky, "o-")
[9]:
[<matplotlib.lines.Line2D at 0x7fad27b68210>]
_images/user-guide_indexing_15_1.svg

To interpolate the data along this path with a step of 0.01 Å\(^{-1}\), we can use the following code:

[10]:
import erlab.analysis as era

dat_sliced = era.interpolate.slice_along_path(
    dat, vertices={"kx": kx, "ky": ky}, step_size=0.01
)
dat_sliced
[10]:
<xarray.DataArray (eV: 300, path: 140)> Size: 336kB
0.07295 0.1004 0.4831 0.6724 0.1885 ... 1.159e-13 1.01e-07 0.00131 0.138 0.1486
Coordinates:
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
    kx       (path) float64 1kB 0.0 0.01021 0.02041 ... 0.01764 0.008821 0.0
    ky       (path) float64 1kB 0.0 0.0 0.0 0.0 ... 0.01528 0.01019 0.005093 0.0
  * path     (path) float64 1kB 0.0 0.01021 0.02041 ... 1.402 1.412 1.422

We can see that the data has been interpolated along the path. The new coordinate path contains the distance along the path, and the dimensions kx and ky are now expressed in terms of path.

The distance along the path can be calculated as the sum of the distances between consecutive points in the path.

[11]:
dat_sliced.plot(cmap="Greys")
eplt.fermiline()

# Distance between each pair of consecutive points
distances = np.linalg.norm(np.diff(np.vstack([kx, ky]), axis=-1), axis=0)
seg_coords = np.concatenate(([0], np.cumsum(distances)))

plt.xticks(seg_coords, labels=["Γ", "M", "K", "Γ"])
plt.xlim(0, seg_coords[-1])
for seg in seg_coords[1:-1]:
    plt.axvline(seg, ls="--", c="k", lw=1)
_images/user-guide_indexing_19_0.svg

You will learn more about plotting in the next section.

Plotting

ERLabPy provides a number of plotting functions to help visualize data and create publication quality figures.

Importing

The key module to plotting is erlab.plotting.erplot, which serves as an interface to various plotting functions in ERLabPy, similar to how matplotlib.pyplot works. To import it, use the following code:

[1]:
import matplotlib.pyplot as plt
import erlab.plotting.erplot as eplt

First, let us generate some example data from a simple tight binding model of graphene. A rigid shift of 200 meV has been applied so that the Dirac cone is visible.

[3]:
from erlab.io.exampledata import generate_data

dat = generate_data(bandshift=-0.2, seed=1).T
[4]:
dat
[4]:
<xarray.DataArray (eV: 300, ky: 250, kx: 250)> Size: 150MB
0.1517 0.1233 0.3382 0.3101 0.3476 0.8233 ... 0.117 0.116 0.4569 0.1373 0.1174
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09

We can see that the generated data is a three-dimensional xarray.DataArray. Now, let’s extract a cut along \(k_y = 0.3\).

[5]:
cut = dat.qsel(ky=0.3)
cut
[5]:
<xarray.DataArray (eV: 300, kx: 250)> Size: 600kB
0.7814 0.3649 0.2043 0.2815 0.7351 ... 0.07689 0.01882 0.0002918 2.866e-07
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    ky       float64 8B 0.2967
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
Plotting 2D data

The fastest way to plot a 2D array like this is to use plot_array. Each axis is automatically labeled.

[6]:
eplt.plot_array(cut)
[6]:
<matplotlib.image.AxesImage at 0x7f458fc59d50>
_images/user-guide_plotting_9_1.svg

plot_array takes many arguments that can customize the look of your plot. The following is an example of some of the functionality provided. For all arguments, see the API reference.

[7]:
eplt.plot_array(
    cut, cmap="Greys", gamma=0.5, colorbar=True, colorbar_kw=dict(width=10, ticks=[])
)
[7]:
<matplotlib.image.AxesImage at 0x7f458f883b50>
_images/user-guide_plotting_11_1.svg

plot_array can also be accessed (for 2D data) through the qplot accessor.

[8]:
cut.qplot(cmap="Greys", gamma=0.5)
[8]:
<matplotlib.image.AxesImage at 0x7f458f7a1f90>
_images/user-guide_plotting_13_1.svg
[9]:
eplt.plot_array(cut, cmap="Greys", gamma=0.5)

eplt.fermiline()
eplt.mark_points([-0.525, 0.525], ["K", "K"], fontsize=10, pad=(0, 10))
eplt.nice_colorbar(width=10, ticks=[])
[9]:
<matplotlib.colorbar.Colorbar at 0x7f458deafe10>
_images/user-guide_plotting_14_1.svg

In the cell above, we added some annotations: a line indicating the Fermi level, labels for the high symmetry points, and a colorbar. Unlike the previous example, the colorbar was added after plotting. Adding elements separately like this, instead of using keyword arguments, can make the code more readable in complex plots.

Slices

What if we want to plot multiple slices at once? We should create subplots to place the slices. plt.subplots is very useful in managing multiple axes and figures. If you are unfamiliar with the syntax, visit the relevant matplotlib documentation.

Suppose we want to plot constant energy surfaces at specific binding energies, say, at [-0.4, -0.2, 0.0]. We could create three subplots and iterate over the axes.

[10]:
energies = [-0.4, -0.2, 0.0]

fig, axs = plt.subplots(1, 3, layout="compressed", sharey=True)
for energy, ax in zip(energies, axs):
    const_energy_surface = dat.qsel(eV=energy)
    eplt.plot_array(const_energy_surface, ax=ax, gamma=0.5, aspect="equal")
_images/user-guide_plotting_18_0.svg

Here, we plotted each constant energy surface with plot_array. To remove the duplicated y axis labels and add some annotations, we can use clean_labels and label_subplot_properties:

[11]:
fig, axs = plt.subplots(1, 3, layout="compressed", sharey=True)
for energy, ax in zip(energies, axs):
    const_energy_surface = dat.qsel(eV=energy)
    eplt.plot_array(const_energy_surface, ax=ax, gamma=0.5, aspect="equal")

eplt.clean_labels(axs)  # removes shared y labels
eplt.label_subplot_properties(axs, values={"Eb": energies})  # annotates energy
_images/user-guide_plotting_20_0.svg

Not bad. However, plotting multiple slices across multiple datasets quickly gets cumbersome. Luckily, ERLabPy provides a function that automates the subplot creation, slicing, and annotation for you: plot_slices, which reduces the same code to a one-liner. See the API reference for a full description of all possible arguments.

[12]:
fig, axs = eplt.plot_slices([dat], eV=[-0.4, -0.2, 0.0], gamma=0.5, axis="image")
_images/user-guide_plotting_22_0.svg

We can also plot the data integrated over an energy window, in this case with a width of 200 meV by adding the eV_width argument:

[13]:
fig, axs = eplt.plot_slices(
    [dat], eV=[-0.4, -0.2, 0.0], eV_width=0.2, gamma=0.5, axis="image"
)
_images/user-guide_plotting_24_0.svg

Cuts along constant \(k_y\) can be plotted analogously.

[14]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
_images/user-guide_plotting_26_0.svg

Here, we notice that the first two plots slice through regions with less spectral weight, so the colors across the three subplots are not on the same scale. This may be misleading when the relative intensity across different slices is important. Luckily, we have a function that can unify the color limits across multiple axes.

The same effect can be achieved by passing same_limits=True to plot_slices.

[15]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
eplt.unify_clim(axs)
_images/user-guide_plotting_28_0.svg

We can also choose a reference axis to get the color limits from.

[16]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
eplt.unify_clim(axs, target=axs.flat[1])
_images/user-guide_plotting_30_0.svg

What if we want to plot constant energy surfaces and cuts in the same figure? We can create the subplots first and then utilize the axes argument of plot_slices.

[17]:
fig, axs = plt.subplots(2, 3, layout="compressed", sharex=True, sharey="row")
eplt.plot_slices([dat], eV=[-0.4, -0.2, 0.0], gamma=0.5, axes=axs[0, :], axis="image")
eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, axes=axs[1, :])
eplt.clean_labels(axs)
_images/user-guide_plotting_32_0.svg
2D colormaps

2D colormaps visualize two quantities in a single image by mapping one to the lightness of the color and the other to the hue. This is useful when visualizing dichroic or spin-resolved ARPES data [2].

Let us begin with the simulated constant energy contours of graphene, 0.3 eV below and above the Fermi level.

[18]:
dat0, dat1 = generate_data(
    shape=(250, 250, 2), Erange=(-0.3, 0.3), temp=0.0, seed=1, count=1e6
).T

_, axs = eplt.plot_slices(
    [dat0, dat1],
    order="F",
    subplot_kw={"layout": "compressed", "sharey": "row"},
    axis="scaled",
    label=True,
)
# eplt.label_subplot_properties(axs, values=dict(Eb=[-0.3, 0.3]))
_images/user-guide_plotting_34_0.svg

Suppose we want to visualize the sum and the normalized difference between the two. The simplest way is to plot them side by side.

[19]:
dat_sum = dat0 + dat1
dat_ndiff = (dat0 - dat1) / dat_sum

eplt.plot_slices(
    [dat_sum, dat_ndiff],
    order="F",
    subplot_kw={"layout": "compressed", "sharey": "row"},
    cmap=["viridis", "bwr"],
    axis="scaled",
)
eplt.proportional_colorbar()
[19]:
<matplotlib.colorbar.Colorbar at 0x7f458d6621d0>
_images/user-guide_plotting_36_1.svg

The difference array is noisy for small values of the sum. We can plot using a 2D colormap, where dat_ndiff is mapped to the color along the colormap and dat_sum is mapped to the lightness of the colormap.

[20]:
eplt.plot_array_2d(dat_sum, dat_ndiff)
[20]:
(<matplotlib.image.AxesImage at 0x7f458d36c990>,
 <matplotlib.colorbar.Colorbar at 0x7f458d36e110>)
_images/user-guide_plotting_38_1.svg

The color normalization for each axis can be set independently with lnorm and cnorm. The appearance of the colorbar axes can be customized with the returned Colorbar object.

[21]:
_, cb = eplt.plot_array_2d(
    dat_sum,
    dat_ndiff,
    lnorm=eplt.InversePowerNorm(0.5),
    cnorm=eplt.CenteredInversePowerNorm(0.7, vcenter=0.0, halfrange=1.0),
)
cb.ax.set_xticks(cb.ax.get_xlim())
cb.ax.set_xticklabels(["Min", "Max"])
[21]:
[Text(1.922459256430518e-08, 0, 'Min'), Text(218.36113081275943, 0, 'Max')]
_images/user-guide_plotting_40_1.svg
Styling figures

You can control the look and feel of matplotlib figures with style sheets and rcParams. In addition to the options provided by matplotlib, ERLabPy provides some style sheets that are listed below. Note that style sheets that change the default font require the font to be installed on the system. To see how each one looks, try running the code provided by matplotlib.

Style Name  Description
khan        Personal preferences of the package author.
fira        Changes the default font to Fira Sans.
firalight   Changes the default font to Fira Sans Light.
times       Changes the default font to Times New Roman.
nature      Changes the default font to Arial, and tweaks some aspects such as padding and default figure size.

[22]:
with plt.style.context(["nature"]):
    eplt.plot_array(cut, cmap="Greys", gamma=0.5)
_images/user-guide_plotting_43_1.svg
Tips

Although matplotlib is a powerful library, it is heavy and slow, and better suited for static plots. For interactive plots, libraries such as Plotly or Bokeh are popular.

The hvplot library is a high-level plotting library that provides a simple interface to Bokeh, Plotly, and Matplotlib. It is particularly useful for interactive plots and can be used with xarray objects. Here are some examples that use the Bokeh backend:

[23]:
import hvplot.xarray

cut.hvplot(x="kx", y="eV", cmap="Greys", aspect=1.5)
[23]:
[24]:
dat.hvplot(x="kx", y="ky", cmap="Greys", aspect="equal", widget_location="bottom")
[24]:

Note

If you are viewing this documentation online, the slider above will not work. To see the interactive plot, you can run the notebook locally after installing hvplot.

For more information, see the hvplot documentation.

Momentum conversion

Momentum conversion in ERLabPy is exact with no small angle approximation, but is also very fast, thanks to the numba-accelerated trilinear interpolation in erlab.analysis.interpolate.

Nomenclature

Momentum conversion in ERLabPy follows the nomenclature from Ishida and Shin [3]. All experimental geometries are classified into four types. The definitions of the angles differ for each geometry.

For instance, imagine a typical Type 1 setup with a vertical slit that acquires maps by rotating about the z axis in the lab frame. In this case, the polar angle (rotation about z) is \(β\), and the tilt angle is \(ξ\).

In all cases, \(δ\) is the azimuthal angle that indicates in-plane rotation, and \(α\) is the angle along the slit.

[1]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt

Let’s generate some example data, this time in angle coordinates.

[3]:
from erlab.io.exampledata import generate_data_angles

dat = generate_data_angles(shape=(200, 60, 300), assign_attributes=True, seed=1).T
dat
[3]:
<xarray.DataArray (eV: 300, beta: 60, alpha: 200)> Size: 29MB
6.269 9.361 7.431 4.192 4.509 ... 0.1411 0.002556 8.644e-06 6.478e-09 3.813e-13
Coordinates:
  * alpha    (alpha) float64 2kB -15.0 -14.85 -14.7 -14.55 ... 14.7 14.85 15.0
  * beta     (beta) float64 480B -15.0 -14.49 -13.98 -13.47 ... 13.98 14.49 15.0
  * eV       (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12
    xi       float64 8B 0.0
    delta    float64 8B 0.0
    hv       float64 8B 50.0
Attributes:
    configuration:        1
    temp_sample:          20.0
    sample_workfunction:  4.5

Let us define a 2D cut from the map data we just generated.

[4]:
cut = dat.sel(beta=10.0, method="nearest")
eplt.plot_array(cut)
[4]:
<matplotlib.image.AxesImage at 0x7f4b5ebd3250>
_images/user-guide_kconv_6_1.svg

Although the functions for momentum conversion are implemented in erlab.analysis.kspace, the actual conversion is performed using an xarray accessor. Let’s see how it works.

Converting to momentum space

Momentum conversion is done by the convert method of the kspace accessor. The bounds and resolution are automatically determined from the data if no input is provided. The method returns a new DataArray in momentum space.

[5]:
dat_kconv = dat.kspace.convert()
dat_kconv
Estimating bounds and resolution
Calculating destination coordinates
Converting ('eV', 'alpha', 'beta')  ->  ('eV', 'kx', 'ky')
Interpolated in 2.088 s
[5]:
<xarray.DataArray (eV: 300, kx: 310, ky: 310)> Size: 231MB
nan nan nan nan nan nan nan nan ... 2.316e-06 9.819e-07 nan nan nan nan nan nan
Coordinates:
    xi       float64 8B 0.0
    delta    float64 8B 0.0
    hv       float64 8B 50.0
  * eV       (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12
  * kx       (kx) float64 2kB -0.8956 -0.8898 -0.884 ... 0.884 0.8898 0.8956
  * ky       (ky) float64 2kB -0.8956 -0.8898 -0.884 ... 0.884 0.8898 0.8956
Attributes:
    configuration:        1
    temp_sample:          20.0
    sample_workfunction:  4.5
    delta_offset:         0.0
    xi_offset:            0.0
    beta_offset:          0.0

Let us plot the original and converted data side by side.

[6]:
fig, axs = plt.subplots(1, 2, layout="compressed")
eplt.plot_array(dat.sel(eV=-0.3, method="nearest"), ax=axs[0], aspect="equal")
eplt.plot_array(dat_kconv.sel(eV=-0.3, method="nearest"), ax=axs[1], aspect="equal")
[6]:
<matplotlib.image.AxesImage at 0x7f4b5ca34bd0>
_images/user-guide_kconv_10_1.svg
Setting parameters

The parameters needed for momentum conversion are the experimental configuration, the inner potential \(V_0\) (for photon energy dependent data), the work function, and the angle offsets. These parameters are all stored as data attributes. The kspace accessor provides a convenient way to access and modify them. See configuration, inner_potential, work_function, and offsets for more information.
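
For example, the inner potential and work function can be set through the accessor before converting (a sketch; the values are illustrative):

dat.kspace.inner_potential = 10.0  # inner potential in eV
dat.kspace.work_function = 4.5  # work function in eV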

First, let’s check the angle offsets.

[7]:
dat.kspace.offsets
[7]:
delta0.0
xi0.0
beta0.0

Since we haven’t set any offsets, they are all zero. Let’s set the azimuthal offset to 60 degrees and the polar offset to 30 degrees and see what happens.

[8]:
dat.kspace.offsets.update(delta=60.0, beta=30.0)
[8]:
delta60.0
xi0.0
beta30.0
[9]:
dat_kconv = dat.kspace.convert()
dat_kconv
Estimating bounds and resolution
Calculating destination coordinates
Converting ('eV', 'alpha', 'beta')  ->  ('eV', 'kx', 'ky')
Interpolated in 0.937 s
[9]:
<xarray.DataArray (eV: 300, kx: 380, ky: 398)> Size: 363MB
nan nan nan nan nan nan nan nan nan nan ... nan nan nan nan nan nan nan nan nan
Coordinates:
    xi       float64 8B 0.0
    delta    float64 8B 0.0
    hv       float64 8B 50.0
  * eV       (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12
  * kx       (kx) float64 3kB -2.495 -2.489 -2.483 ... -0.3111 -0.3053 -0.2995
  * ky       (ky) float64 3kB -0.3431 -0.3373 -0.3315 ... 1.946 1.952 1.957
Attributes:
    configuration:        1
    temp_sample:          20.0
    sample_workfunction:  4.5
    delta_offset:         60.0
    xi_offset:            0.0
    beta_offset:          30.0

Plotting the converted data again, we can see the effect of angle offsets on the conversion.

[10]:
fig, axs = plt.subplots(1, 2, layout="compressed")
eplt.plot_array(dat.sel(eV=-0.3, method="nearest"), ax=axs[0], aspect="equal")
eplt.plot_array(dat_kconv.sel(eV=-0.3, method="nearest"), ax=axs[1], aspect="equal")
[10]:
<matplotlib.image.AxesImage at 0x7f4b2eb60790>
_images/user-guide_kconv_17_1.svg
Interactive conversion

For three-dimensional momentum conversion, such as maps or photon energy scans, an interactive window can be opened where you can adjust the parameters and see their effect right away.

The GUI is divided into two tabs.

KspaceTool 1 KspaceTool 1

The first tab is for setting momentum conversion parameters. The image is updated in real time as you change the parameters. Clicking the “Copy code” button will copy the code for conversion to the clipboard. The “Open in ImageTool” button performs a full three-dimensional conversion and opens the result in the ImageTool.

KspaceTool 2 KspaceTool 2

The second tab provides visualization options. You can overlay Brillouin zones and high symmetry points on the result, adjust colors, and apply binning.

There are two ways to invoke the GUI. The first one is to call the interactive method on the accessor:

data.kspace.interactive()

The second option is to invoke the GUI directly with the erlab.interactive.ktool function. The difference is that the latter will automatically determine the name of the input data and apply it to the generated code that is copied to the clipboard.

import erlab.interactive as eri

eri.ktool(data)

Curve fitting

Curve fitting in ERLabPy largely relies on lmfit. Along with some convenient models for common fitting tasks, ERLabPy provides a powerful accessor that streamlines curve fitting on multidimensional xarray objects.

ERLabPy also provides optional integration of lmfit models with iminuit, which is a Python interface to the Minuit C++ library developed at CERN.

In this tutorial, we will start with the basics of curve fitting using lmfit, introduce some models that are available in ERLabPy, and demonstrate curve fitting with the modelfit accessor to fit multidimensional xarray objects. Finally, we will show how to use iminuit with lmfit models.

Basic fitting with lmfit
[1]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from erlab.io.exampledata import generate_data

Let’s start by defining a model function and the data to fit.

[3]:
def poly1(x, a, b):
    return a * x + b


# Generate some toy data
x = np.linspace(0, 10, 20)
y = poly1(x, 1, 2)

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(1)
yerr = np.full_like(x, 0.5)
y = rng.normal(y, yerr)
Fitting a simple function

An lmfit model can be created by passing the model function to the lmfit.Model class. By default, the first argument of the function is treated as the independent variable.

[4]:
import lmfit

model = lmfit.Model(poly1)
params = model.make_params(a=1.0, b=2.0)
result = model.fit(y, x=x, params=params, weights=1 / yerr)

result.plot()
result
[4]:

Fit Result

Model: Model(poly1)

_images/user-guide_curve-fitting_6_1.svg

By passing dictionaries to make_params, we can set not only the initial values but also bounds and other attributes of the parameters.

[5]:
model = lmfit.Model(poly1)
params = model.make_params(
    a={"value": 1.0, "min": 0.0},
    b={"value": 2.0, "vary": False},
)
result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
_images/user-guide_curve-fitting_8_0.svg

result is an lmfit.model.ModelResult object that contains the results of the fit. The best-fit parameters can be accessed through the result.params attribute.

Note

Since all weights are equal in this case, the weighting has little effect on the fit results. However, if we are confident that we have a good estimate of yerr, we can pass scale_covar=False to the fit method so that the reported uncertainties reflect the given absolute errors.
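For instance, assuming yerr holds trustworthy absolute uncertainties:

result = model.fit(y, x=x, params=params, weights=1 / yerr, scale_covar=False)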

[6]:
result.params
[6]:
[7]:
result.params["a"].value, result.params["a"].stderr
[7]:
(0.9938903037183116, 0.011256084507121714)

The parameters can also be retrieved in a form that allows easy error propagation calculation, enabled by the uncertainties package.

[8]:
a_uvar = result.uvars["a"]
print(a_uvar)
print(a_uvar**2)
0.994+/-0.011
0.988+/-0.022
Fitting with composite models

Before fitting, let us generate a Gaussian peak on a linear background.

[9]:
# Generate toy data
x = np.linspace(0, 10, 50)
y = -0.1 * x + 2 + 3 * np.exp(-((x - 5) ** 2) / (2 * 1**2))

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(x, 0.3)
y = rng.normal(y, yerr)

# Plot the data
plt.errorbar(x, y, yerr, fmt="o")
[9]:
<ErrorbarContainer object of 3 artists>
_images/user-guide_curve-fitting_15_1.svg

A composite model can be created by adding multiple models together.

[10]:
from lmfit.models import GaussianModel, LinearModel

model = GaussianModel() + LinearModel()
params = model.make_params(slope=-0.1, center=5.0, sigma={"value": 0.1, "min": 0})
params
[10]:
[11]:
result = model.fit(y, x=x, params=params, weights=1 / yerr)
result.plot()
result
[11]:

Fit Result

Model: (Model(gaussian) + Model(linear))

_images/user-guide_curve-fitting_18_1.svg

How about multiple Gaussian peaks? Since the parameter names overlap between the models, we must use the prefix argument to distinguish between them.

[12]:
model = GaussianModel(prefix="p0_") + GaussianModel(prefix="p1_") + LinearModel()
model.make_params()
[12]:

For more information, see the lmfit documentation.

Fitting with pre-defined models

Creating composite models with different prefixes every time can be cumbersome, so ERLabPy provides some pre-defined models in erlab.analysis.fit.models.

Fitting multiple peaks

One example is MultiPeakModel, which is a composite model of multiple Gaussian or Lorentzian peaks on a linear background. By supplying keyword arguments, you can specify the number of peaks, their shapes, whether to multiply with a Fermi-Dirac distribution, and whether to convolve the result with experimental resolution.

[13]:
from erlab.analysis.fit.models import MultiPeakModel

model = MultiPeakModel(npeaks=1, peak_shapes=["gaussian"], fd=False, convolve=False)
params = model.make_params(p0_center=5.0, p0_width=0.2, p0_height=3.0)
params
[13]:
[14]:
result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
_images/user-guide_curve-fitting_23_0.svg

We can also plot components.

[15]:
comps = result.eval_components()
plt.errorbar(x, y, yerr, fmt="o", zorder=-1, alpha=0.3)
plt.plot(x, result.eval(), label="Best fit")
plt.plot(x, comps["1Peak_p0"], "--", label="Peak")
plt.plot(x, comps["1Peak_bkg"], "--", label="Background")
plt.legend()
[15]:
<matplotlib.legend.Legend at 0x7f3d52022d90>
_images/user-guide_curve-fitting_25_1.svg

Now, let us try fitting an MDC taken from simulated data with multiple Lorentzian peaks convolved with a common instrumental resolution.

[16]:
data = generate_data(bandshift=-0.2, count=5e8, seed=1).T
cut = data.qsel(ky=0.3)
cut.qplot(colorbar=True)
[16]:
<matplotlib.image.AxesImage at 0x7f3d520979d0>
_images/user-guide_curve-fitting_27_1.svg
[17]:
mdc = cut.qsel(eV=0.0)
mdc.qplot()
[17]:
[<matplotlib.lines.Line2D at 0x7f3d51dbcb50>]
_images/user-guide_curve-fitting_28_1.svg

First, we define the model and set the initial parameters.

[18]:
from erlab.analysis.fit.models import MultiPeakModel

model = MultiPeakModel(npeaks=4, peak_shapes=["lorentzian"], fd=False, convolve=True)

params = model.make_params(
    p0_center=-0.6,
    p1_center=-0.45,
    p2_center=0.45,
    p3_center=0.6,
    p0_width=0.02,
    p1_width=0.02,
    p2_width=0.02,
    p3_width=0.02,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)
params
[18]:

Then, we can fit the model to the data:

[19]:
result = model.fit(mdc, x=mdc.kx, params=params)
result.plot()
result
[19]:

Fit Result

Model: Model(4Peak)

_images/user-guide_curve-fitting_32_1.svg
Fitting xarray objects

ERLabPy provides accessors for xarray objects that allow you to fit data with lmfit models: xarray.DataArray.modelfit or xarray.Dataset.modelfit, depending on whether you want to fit a single DataArray or multiple DataArrays in a Dataset. The syntax is similar to xarray.DataArray.curvefit() and xarray.Dataset.curvefit(). The accessor returns an xarray.Dataset with the best-fit parameters and the fit statistics. The example below illustrates how to use the accessor to conduct the same fit as in the previous example.

[20]:
result_ds = mdc.modelfit("kx", model, params=params)
result_ds
[20]:
<xarray.Dataset> Size: 14kB
Dimensions:                (param: 23, cov_i: 23, cov_j: 23, fit_stat: 8,
                            kx: 250)
Coordinates:
    ky                     float64 8B 0.2967
    eV                     float64 8B -0.000301
  * kx                     (kx) float64 2kB -0.89 -0.8829 ... 0.8829 0.89
  * param                  (param) <U12 1kB 'p0_center' ... 'p3_amplitude'
  * fit_stat               (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic'
  * cov_i                  (cov_i) <U12 1kB 'p0_center' ... 'p3_amplitude'
  * cov_j                  (cov_j) <U12 1kB 'p0_center' ... 'p3_amplitude'
Data variables:
    modelfit_results       object 8B <lmfit.model.ModelResult object at 0x7f3...
    modelfit_coefficients  (param) float64 184B -0.5965 0.02403 ... 44.26
    modelfit_stderr        (param) float64 184B 2.54e-05 0.0001685 ... 0.08273
    modelfit_covariance    (cov_i, cov_j) float64 4kB 6.451e-10 ... nan
    modelfit_stats         (fit_stat) float64 64B 305.0 14.0 ... 232.5 281.8
    modelfit_data          (kx) float64 2kB 2.337 2.061 2.908 ... 3.202 4.133
    modelfit_best_fit      (kx) float64 2kB 2.336 2.437 2.545 ... 2.495 2.392

modelfit_results contains the underlying lmfit.model.ModelResult objects. The coefficients and their errors are stored in modelfit_coefficients and modelfit_stderr, respectively. The goodness-of-fit statistics are stored in modelfit_stats.
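For example, the underlying fit object and a single best-fit value can be pulled out of the result above like this (a small illustrative snippet):

fit_result = result_ds.modelfit_results.item()  # the underlying lmfit.model.ModelResult
p0_center = result_ds.modelfit_coefficients.sel(param="p0_center").item()
print(fit_result.success, p0_center)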

Fitting across multiple dimensions

The accessor comes in handy when you have to fit the same model repeatedly across arbitrary dimensions of your data. Let’s demonstrate this with a simulated cut that represents a curved Fermi edge at 100 K, with an energy broadening of 20 meV.

[21]:
from erlab.io.exampledata import generate_gold_edge
from erlab.analysis.fit.models import FermiEdgeModel

gold = generate_gold_edge(temp=100, Eres=0.02, count=5e5, seed=1)
gold.qplot(cmap="Greys")
[21]:
<matplotlib.image.AxesImage at 0x7f3d51609690>
_images/user-guide_curve-fitting_36_1.svg

We first select ± 0.2 eV around the Fermi level and fit the model across the energy axis for every EDC.

[22]:
gold_selected = gold.sel(eV=slice(-0.2, 0.2))
result_ds = gold_selected.modelfit(
    coords="eV",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_ds
[22]:
<xarray.Dataset> Size: 358kB
Dimensions:                (alpha: 200, param: 7, cov_i: 7, cov_j: 7,
                            fit_stat: 8, eV: 75)
Coordinates:
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
  * param                  (param) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * fit_stat               (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic'
  * cov_i                  (cov_i) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
  * cov_j                  (cov_j) <U10 280B 'center' 'temp' ... 'dos0' 'dos1'
Data variables:
    modelfit_results       (alpha) object 2kB <lmfit.model.ModelResult object...
    modelfit_coefficients  (alpha, param) float64 11kB -0.01774 100.0 ... -3.53
    modelfit_stderr        (alpha, param) float64 11kB 0.005598 0.0 ... 4.111
    modelfit_covariance    (alpha, cov_i, cov_j) float64 78kB 3.134e-05 ... 16.9
    modelfit_stats         (alpha, fit_stat) float64 13kB 106.0 6.0 ... -26.34
    modelfit_data          (alpha, eV) float64 120kB 5.1 7.021 7.453 ... 0.0 0.0
    modelfit_best_fit      (alpha, eV) float64 120kB 6.642 6.565 ... -0.0711

Let’s plot the fitted parameters as a function of angle!

[23]:
gold.qplot(cmap="Greys")
plt.errorbar(
    gold_selected.alpha,
    result_ds.modelfit_coefficients.sel(param="center"),
    result_ds.modelfit_stderr.sel(param="center"),
    fmt=".",
)
[23]:
<ErrorbarContainer object of 3 artists>
_images/user-guide_curve-fitting_40_1.svg

We can also conduct a global fit to all the EDCs with a multidimensional model. First, we normalize the data with the averaged intensity of each EDC and then fit the data to FermiEdge2dModel.

[24]:
from erlab.analysis.fit.models import FermiEdge2dModel

gold_norm = gold_selected / gold_selected.mean("eV")
result_ds = gold_norm.T.modelfit(
    coords=["eV", "alpha"],
    model=FermiEdge2dModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_ds
[24]:
<xarray.Dataset> Size: 244kB
Dimensions:                (param: 8, cov_i: 8, cov_j: 8, fit_stat: 8,
                            alpha: 200, eV: 75)
Coordinates:
  * alpha                  (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0
  * eV                     (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983
  * param                  (param) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
  * fit_stat               (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic'
  * cov_i                  (cov_i) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
  * cov_j                  (cov_j) <U10 320B 'c0' 'c1' ... 'offset' 'resolution'
Data variables:
    modelfit_results       object 8B <lmfit.model.ModelResult object at 0x7f3...
    modelfit_coefficients  (param) float64 64B 0.03604 8.09e-06 ... 0.04345
    modelfit_stderr        (param) float64 64B 0.0004463 2.749e-05 ... 0.001243
    modelfit_covariance    (cov_i, cov_j) float64 512B 1.992e-07 ... 1.545e-06
    modelfit_stats         (fit_stat) float64 64B 82.0 7.0 ... -3.98e+04
    modelfit_data          (alpha, eV) float64 120kB 2.027 2.791 ... 0.0 0.0
    modelfit_best_fit      (alpha, eV) float64 120kB 1.965 1.957 ... 0.02488

Let’s plot the fit results and the residuals.

[25]:
best_fit = result_ds.modelfit_best_fit.transpose(*gold_norm.dims)

fig, axs = eplt.plot_slices(
    [gold_norm, best_fit, best_fit - gold_norm],
    figsize=(4, 5),
    cmap=["viridis", "viridis", "bwr"],
    norm=[plt.Normalize(), plt.Normalize(), eplt.CenteredPowerNorm(1.0, vcenter=0)],
    colorbar="all",
    hide_colorbar_ticks=False,
    colorbar_kw={"width": 7},
)
eplt.set_titles(axs, ["Data", "FermiEdge2dModel", "Residuals"])
_images/user-guide_curve-fitting_45_0.svg
Providing multiple initial guesses

You can also provide different initial guesses and bounds for different coordinates. To demonstrate, let’s create some data containing multiple MDCs, each with a different peak position.

[26]:
# Define angle coordinates for 2D data
alpha = np.linspace(-5.0, 5.0, 100)
beta = np.linspace(-1.0, 1.0, 3)

# Center of the peaks along beta
center = np.array([-2.0, 0.0, 2.0])[:, np.newaxis]

# Gaussian peak on a linear background
y = -0.1 * alpha + 2 + 3 * np.exp(-((alpha - center) ** 2) / (2 * 1**2))

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(y, 0.1)
y = rng.normal(y, yerr)

# Transform to DataArray
darr = xr.DataArray(y, dims=["beta", "alpha"], coords={"beta": beta, "alpha": alpha})
darr.qplot()
[26]:
<matplotlib.image.AxesImage at 0x7f3d5080f450>
_images/user-guide_curve-fitting_47_1.svg

We can provide different initial guesses for the peak positions along the beta dimension by passing a dictionary of DataArrays to the params argument.

[27]:
result_ds = darr.modelfit(
    coords="alpha",
    model=GaussianModel() + LinearModel(),
    params={
        "center": xr.DataArray([-2, 0, 2], coords=[darr.beta]),
        "slope": -0.1,
    },
)
result_ds
[27]:
<xarray.Dataset> Size: 7kB
Dimensions:                (beta: 3, param: 5, cov_i: 5, cov_j: 5, fit_stat: 8,
                            alpha: 100)
Coordinates:
  * beta                   (beta) float64 24B -1.0 0.0 1.0
  * alpha                  (alpha) float64 800B -5.0 -4.899 -4.798 ... 4.899 5.0
  * param                  (param) <U9 180B 'amplitude' 'center' ... 'intercept'
  * fit_stat               (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic'
  * cov_i                  (cov_i) <U9 180B 'amplitude' 'center' ... 'intercept'
  * cov_j                  (cov_j) <U9 180B 'amplitude' 'center' ... 'intercept'
Data variables:
    modelfit_results       (beta) object 24B <lmfit.model.ModelResult object ...
    modelfit_coefficients  (beta, param) float64 120B 7.324 -2.0 ... 1.993
    modelfit_stderr        (beta, param) float64 120B 0.1349 0.011 ... 0.01735
    modelfit_covariance    (beta, cov_i, cov_j) float64 600B 0.01819 ... 0.00...
    modelfit_stats         (beta, fit_stat) float64 192B 31.0 5.0 ... -450.2
    modelfit_data          (beta, alpha) float64 2kB 2.453 2.402 ... 1.404 1.64
    modelfit_best_fit      (beta, alpha) float64 2kB 2.553 2.553 ... 1.526 1.504

Let’s overlay the fitted peak positions on the data.

[28]:
result_ds.modelfit_data.qplot()
result_center = result_ds.sel(param="center")
plt.plot(result_center.modelfit_coefficients, result_center.beta, '.-')
[28]:
[<matplotlib.lines.Line2D at 0x7f3d509da2d0>]
_images/user-guide_curve-fitting_51_1.svg

The same can be done with all parameter attributes that can be passed to lmfit.create_params() (e.g., vary, min, max, etc.). For example:

[29]:
result_ds = darr.modelfit(
    coords="alpha",
    model=GaussianModel() + LinearModel(),
    params={
        "center": {
            "value": xr.DataArray([-2, 0, 2], coords=[darr.beta]),
            "min": -5.0,
            "max": xr.DataArray([0, 2, 5], coords=[darr.beta]),
        },
        "slope": -0.1,
    },
)
result_ds
[29]:
<xarray.Dataset> Size: 7kB
Dimensions:                (beta: 3, param: 5, cov_i: 5, cov_j: 5, fit_stat: 8,
                            alpha: 100)
Coordinates:
  * beta                   (beta) float64 24B -1.0 0.0 1.0
  * alpha                  (alpha) float64 800B -5.0 -4.899 -4.798 ... 4.899 5.0
  * param                  (param) <U9 180B 'amplitude' 'center' ... 'intercept'
  * fit_stat               (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic'
  * cov_i                  (cov_i) <U9 180B 'amplitude' 'center' ... 'intercept'
  * cov_j                  (cov_j) <U9 180B 'amplitude' 'center' ... 'intercept'
Data variables:
    modelfit_results       (beta) object 24B <lmfit.model.ModelResult object ...
    modelfit_coefficients  (beta, param) float64 120B 7.324 -2.0 ... 1.993
    modelfit_stderr        (beta, param) float64 120B 0.1349 0.011 ... 0.01735
    modelfit_covariance    (beta, cov_i, cov_j) float64 600B 0.01819 ... 0.00...
    modelfit_stats         (beta, fit_stat) float64 192B 31.0 5.0 ... -450.2
    modelfit_data          (beta, alpha) float64 2kB 2.453 2.402 ... 1.404 1.64
    modelfit_best_fit      (beta, alpha) float64 2kB 2.553 2.553 ... 1.526 1.504
Visualizing fits

If hvplot is installed, we can visualize the fit results interactively with the qshow accessor.

Note

If you are viewing this documentation online, the plot will not be interactive. Run the code locally to try it out.

[30]:
result_ds.qshow(plot_components=True)
[30]:

Note

There is a dedicated module for Fermi edge fitting and correction, described here.

Parallelization

The accessors are tightly integrated with xarray, so passing a dask array will parallelize the fitting process. See Parallel Computing with Dask for more information.
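As a minimal sketch of dask-backed fitting, reusing gold_selected and FermiEdgeModel from the examples above (assumes dask is installed; the chunk size is arbitrary):

result_lazy = gold_selected.chunk({"alpha": 20}).modelfit(
    coords="eV",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_ds = result_lazy.compute()  # the fits are evaluated in parallel, chunk by chunk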

For non-dask objects, you can achieve joblib-based parallelization:

  • For non-dask Datasets, basic parallelization across multiple data variables can be achieved with the parallel argument to xarray.Dataset.modelfit.

  • For parallelizing fits on a single DataArray along a dimension with many points, the xarray.DataArray.parallel_fit accessor can be used. This method is similar to xarray.DataArray.modelfit, but requires the name of the dimension to parallelize over instead of the coordinates to fit along. For example, to parallelize the fit in the previous example, you can use the following code:

    gold_selected.parallel_fit(
        dim="alpha",
        model=FermiEdgeModel(),
        params={"temp": {"value": 100.0, "vary": False}},
        guess=True,
    )
    

    Note

    • The initial run will take a long time due to the overhead of creating parallel workers. Subsequent calls will run faster, since joblib’s default backend will try to reuse the workers.

    • The accessor has some intrinsic overhead due to post-processing. If you need the best performance, handle the parallelization yourself with joblib and lmfit.Model.fit.

Saving and loading fits

Apart from the modelfit_results variable, which holds lmfit.model.ModelResult objects, the resulting dataset contains no Python objects, so it can be saved and loaded reliably as a netCDF file using xarray.Dataset.to_netcdf() and xarray.open_dataset(), respectively. To exclude the ModelResult objects from the output, use the output_result argument.
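A minimal sketch, reusing gold_selected and FermiEdgeModel from the examples above (the filename is a placeholder):

result_numeric = gold_selected.modelfit(
    coords="eV",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
    output_result=False,  # keep only numeric variables so the dataset is netCDF-serializable
)
result_numeric.to_netcdf("gold_edge_fit.nc")
reloaded = xr.open_dataset("gold_edge_fit.nc")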

Warning

Saving full model results that include the model functions can be difficult. Instead of saving the fit results, it is recommended to save the code that can reproduce the fit. See the relevant lmfit documentation for more information.

Fermi edge fitting

Functions related to the Fermi edge are available in erlab.analysis.gold. To fit a polynomial to a Fermi edge, you can use erlab.analysis.gold.poly().

[32]:
import erlab.analysis as era
import erlab.plotting.erplot as eplt

gold = generate_gold_edge(temp=100, seed=1)
result = era.gold.poly(
    gold,
    angle_range=(-15, 15),
    eV_range=(-0.2, 0.2),
    temp=100.0,
    vary_temp=False,
    degree=2,
    plot=True,
)
_images/user-guide_curve-fitting_58_1.svg

The resulting polynomial can be used to correct the Fermi edge with erlab.analysis.gold.correct_with_edge().

[33]:
era.correct_with_edge(gold, result).qplot(cmap="Greys")
eplt.fermiline()
[33]:
<matplotlib.lines.Line2D at 0x7f3d49fa2150>
_images/user-guide_curve-fitting_60_1.svg

Also check out the interactive Fermi edge fitting tool, erlab.interactive.goldtool().

Using iminuit

Note

This part requires iminuit.

iminuit is a powerful Python interface to the Minuit C++ library developed at CERN. To learn more, see the iminuit documentation.

ERLabPy provides a thin wrapper around iminuit.Minuit that allows you to use lmfit models with iminuit. The example below conducts the same fit as the previous one, but using iminuit.

[34]:
from erlab.analysis.fit.minuit import Minuit
from erlab.analysis.fit.models import MultiPeakModel

model = MultiPeakModel(npeaks=4, peak_shapes=["lorentzian"], fd=False, convolve=True)

m = Minuit.from_lmfit(
    model,
    mdc,
    mdc.kx,
    p0_center=-0.6,
    p1_center=-0.45,
    p2_center=0.45,
    p3_center=0.6,
    p0_width=0.02,
    p1_width=0.02,
    p2_width=0.02,
    p3_width=0.02,
    p0_height=1500,
    p1_height=50,
    p2_height=50,
    p3_height=1500,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)

m.migrad()
m.minos()
m.hesse()
[34]:
Migrad
FCN = 566.5 (χ²/ndof = 2.4)    Nfcn = 5120
EDM = 4.13e-06 (Goal: 0.0002)
Valid Minimum    Below EDM threshold (goal x 10)
No parameters at limit    Below call limit
Hesse ok    Covariance accurate

     Name         Value        Hesse Error  Minos Error-  Minos Error+  Limit-  Fixed
 0   p0_center    -596.505e-3  0.016e-3     -0.016e-3     0.016e-3
 1   p0_width     24.03e-3     0.11e-3      -0.11e-3      0.11e-3       0
 2   p0_height    1.161e3      0.004e3      -0.004e3      0.004e3       0
 3   p1_center    -444.89e-3   0.30e-3      -0.30e-3      0.30e-3
 4   p1_width     18e-3        1e-3         -1e-3         1e-3          0
 5   p1_height    68.5         2.8          -2.7          2.9           0
 6   p2_center    443.53e-3    0.34e-3      -0.34e-3      0.34e-3
 7   p2_width     0.0237       0.0010       -0.0010       0.0010        0
 8   p2_height    57.1         1.8          -1.7          1.8           0
 9   p3_center    596.065e-3   0.016e-3     -0.016e-3     0.017e-3
10   p3_width     24.44e-3     0.11e-3      -0.11e-3      0.11e-3       0
11   p3_height    1.153e3      0.004e3      -0.004e3      0.004e3       0
12   lin_bkg      0.0          0.1                                              yes
13   const_bkg    0.27         0.09         -0.09         0.09
14   resolution   26.34e-3     0.10e-3      -0.10e-3      0.10e-3       0

(The Minos error summary and the parameter covariance matrix are not reproduced here.)
_images/user-guide_curve-fitting_62_1.svg

You can also use the interactive fitting interface provided by iminuit.

Note

  • Requires ipywidgets to be installed.

  • If you are viewing this documentation online, changing the sliders won’t change the plot. Run the code locally to try it out.

[35]:
m.interactive()
[35]:

Using ImageTool

Imagetool Imagetool

Inspired by Image Tool for Igor Pro written by the Advanced Light Source at Lawrence Berkeley National Laboratory, ImageTool is a simple tool for interactively exploring images.

Features include:

  • Zooming and panning

  • Extremely fast and smooth data exploration

  • Real-time binning across multiple dimensions

  • Multiple cursors!

  • Easy and intuitive plot size adjustment with splitters

  • Advanced colormap control

ImageTool can be used to display image-like xarray.DataArrays ranging from 2 to 4 dimensions. If a coordinate of the input data happens to be non-uniform, it will automatically be converted to an index array so that the data can be displayed.

There are two main ways to invoke the ImageTool. The first is to call the itool convenience function, which will create a new ImageTool instance and handle the event loop execution:

import erlab.interactive as eri
eri.itool(data)

Another way is to use the qshow accessor:

data.qshow()
Tips
  • If you don’t know what a button does, many buttons have tooltips that will appear when you hover over them.

  • Right-clicking on each plot will bring up a context menu with various options. One useful option is Copy selection code, which copies code that can be quickly pasted into a Python script or Jupyter notebook to reproduce the sliced data. You can also save the data corresponding to each slice as an HDF5 file.

  • ImageTool is also very extensible. At our home lab, we use a modified version of ImageTool to plot data as it is being collected in real-time!

Keyboard shortcuts

Hints for most keyboard shortcuts are displayed in the menu bar. Listed here are some shortcuts that are not found in the menu bar. Mac users must replace Ctrl with ⌘ (Command) and Alt with ⌥ (Option).

Shortcut                              Description
LMB Drag                              Pan around
RMB Drag                              Zoom and scale
Ctrl+LMB Drag                         Move current cursor around
Ctrl+Alt+LMB Drag                     Move all cursors around
Alt while dragging a cursor line      Make all cursor lines move together

Further reading

API Reference

ERLabPy is organized into multiple subpackages and submodules classified by their functionality. The following table lists the subpackages and submodules of ERLabPy.

Subpackages

Subpackage           Description
erlab.analysis       Routines for analyzing ARPES data.
erlab.io             Reading and writing data.
erlab.plotting       Functions related to static plotting with matplotlib.
erlab.interactive    Interactive tools and widgets based on Qt and pyqtgraph.
erlab.accessors      xarray accessors. You will not need to import this module directly.

Analysis (erlab.analysis)

Various functions for data analysis.

Modules

fit

Utilities for curve fitting.

mask

Functions related to masking.

correlation

Macros for correlation analysis.

gold

Fermi edge fitting.

image

Various image processing functions including tools for visualizing dispersive features.

interpolate

Utilities for interpolation.

kspace

Momentum conversion functions.

transform

Transformations.

utilities

erlab.analysis.correct_with_edge(darr, modelresult, shift_coords=True, plot=False, plot_kw=None, **shift_kwargs)[source]

Corrects the given data array darr with the given values or fit result.

Parameters:
  • darr (DataArray) – The input data array to be corrected.

  • modelresult (ModelResult | ndarray[Any, dtype[floating]] | Callable) – The model result that contains the Fermi edge information. It can be an instance of lmfit.model.ModelResult, a numpy array containing the edge position at each angle, or a callable function that takes an array of angles and returns the corresponding energy value.

  • shift_coords (bool) – If True, the coordinates of the output data will be changed so that the output contains all the values of the original data. If False, the coordinates and shape of the original data will be retained, and only the data will be shifted. Defaults to True.

  • plot (bool) – Whether to plot the original and corrected data arrays. Defaults to False.

  • plot_kw (dict | None) – Additional keyword arguments for the plot. Defaults to None.

  • **shift_kwargs – Additional keyword arguments to erlab.analysis.utilities.shift.

Returns:

corrected – The edge corrected data.

Return type:

xarray.DataArray

erlab.analysis.mask_with_hex_bz(kxymap, a=3.54, rotate=0.0, invert=False)[source]

Return map masked with a hexagonal BZ.

erlab.analysis.mask_with_polygon(arr, vertices, dims=('kx', 'ky'), invert=False)[source]
erlab.analysis.polygon_mask(vertices, x, y, invert=False)[source]

Create a mask based on a polygon defined by its vertices.

Parameters:
  • vertices (ndarray[Any, dtype[float64]]) – The vertices of the polygon. The shape should be (N, 2), where N is the number of vertices.

  • x (ndarray[Any, dtype[float64]]) – The x-coordinates of the grid points.

  • y (ndarray[Any, dtype[float64]]) – The y-coordinates of the grid points.

  • invert (bool) – If True, invert the mask (i.e., set True where the polygon is outside and False where it is inside). Default is False.

Returns:

mask – The mask array with shape (len(x), len(y)). The mask contains True where the polygon is inside and False where it is outside (or vice versa if invert is True).

Return type:

ndarray

Note

This function uses the erlab.analysis.mask.polygon.bounded_side_bool to determine whether a point is inside or outside the polygon.

Example

>>> vertices = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.8], [0.8, 0.2]])
>>> x = np.linspace(0, 1, 5)
>>> y = np.linspace(0, 1, 5)
>>> polygon_mask(vertices, x, y)
array([[False, False, False, False, False],
       [False,  True,  True,  True, False],
       [False,  True,  True,  True, False],
       [False,  True,  True,  True, False],
       [False, False, False, False, False]])
erlab.analysis.polygon_mask_points(vertices, x, y, invert=False)[source]

Compute a mask indicating whether points are inside or outside a polygon.

Parameters:
  • vertices (ndarray[Any, dtype[float64]]) – The vertices of the polygon. The shape should be (N, 2), where N is the number of vertices.

  • x (ndarray[Any, dtype[float64]]) – The x-coordinates of the points.

  • y (ndarray[Any, dtype[float64]]) – The y-coordinates of the points.

  • invert (bool) – If True, invert the mask (i.e., set True where the polygon is outside and False where it is inside). Default is False.

Returns:

mask – A boolean array of shape (len(x),) indicating whether each point is inside or outside the polygon.

Return type:

ndarray

Raises:

ValueError – If the lengths of x and y are not equal.

Notes

This function uses the erlab.analysis.mask.polygon.bounded_side_bool to determine whether a point is inside or outside the polygon.

erlab.analysis.rotateinplane(data, rotate, **interp_kwargs)[source]
erlab.analysis.rotatestackinplane(data, rotate, **interp_kwargs)[source]
erlab.analysis.shift(darr, shift, along, shift_coords=False, **shift_kwargs)[source]

Shifts the values of a DataArray along a single dimension.

The shift is applied using scipy.ndimage.shift with the specified keyword arguments. Linear interpolation is used by default.

Parameters:
  • darr (DataArray) – The array to shift.

  • shift (float | DataArray) – The amount of shift to be applied along the specified dimension. If shift is a DataArray, different shifts can be applied to different coordinates. The dimensions of shift must be a subset of the dimensions of darr. For more information, see the note below. If shift is a float, the same shift is applied to all values along dimension along. This is equivalent to providing a 0-dimensional DataArray.

  • along (str) – Name of the dimension along which the shift is applied.

  • shift_coords (bool) – If True, the coordinates of the output data will be changed so that the output contains all the values of the original data. If False, the coordinates and shape of the original data will be retained, and only the data will be shifted. Defaults to False.

  • **shift_kwargs – Additional keyword arguments passed onto scipy.ndimage.shift. Default values of cval and order are set to np.nan and 1 respectively.

Returns:

The shifted DataArray.

Return type:

xarray.DataArray

Note

  • All dimensions in shift must be a dimension in darr.

  • The shift array values are divided by the step size along the along dimension.

  • NaN values in shift are treated as zero.

Example

>>> import xarray as xr
>>> darr = xr.DataArray(
...     np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]).astype(float), dims=["x", "y"]
... )
>>> shift_arr = xr.DataArray([1, 0, 2], dims=["x"])
>>> shifted = erlab.analysis.utilities.shift(darr, shift_arr, along="y")
>>> print(shifted)
<xarray.DataArray (x: 3, y: 3)> Size: 72B
nan 1.0 2.0 4.0 5.0 6.0 nan nan 7.0
Dimensions without coordinates: x, y
erlab.analysis.slice_along_path(darr, vertices, step_size=None, dim_name='path', interp_kwargs=None, **vertices_kwargs)[source]

Interpolate a DataArray along a path defined by a sequence of vertices.

Parameters:
  • darr (DataArray) – The data array to interpolate.

  • vertices (Mapping[Hashable, Sequence]) – Dictionary specifying the vertices of the path along which to interpolate the DataArray. The keys of the dictionary should correspond to the dimensions of the DataArray along which to interpolate.

  • step_size (float | None) – The step size to use for the interpolation. This determines the number of points along the path at which the data array will be interpolated. If None, the step size is determined automatically as the smallest step size of the coordinates along the dimensions of the vertices if all coordinates are evenly spaced. If there exists a dimension where the coordinates are not evenly spaced, step_size must be specified.

  • dim_name (str) – The name of the new dimension that corresponds to the distance along the interpolated path. Default is “path”.

  • interp_kwargs (dict | None) – Additional keyword arguments passed to xarray.DataArray.interp.

  • **vertices_kwargs – The keyword arguments form of vertices. One of vertices or vertices_kwargs must be provided.

Returns:

interpolated – New dataarray on the new coordinate.

Return type:

DataArray

Examples

>>> import numpy as np
>>> import xarray as xr
>>> from erlab.analysis.interpolate import slice_along_path
>>> x = np.linspace(0, 10, 11)
>>> y = np.linspace(0, 10, 11)
>>> z = np.linspace(0, 10, 11)
>>> data = np.random.rand(11, 11, 11)
>>> darr = xr.DataArray(data, coords={"x": x, "y": y, "z": z}, dims=["x", "y", "z"])
>>> vertices = {"x": [0, 5, 10], "y": [0, 5, 10], "z": [0, 5, 10]}
>>> interp = slice_along_path(darr, vertices)

Data IO (erlab.io)

Read & write ARPES data.

This module provides functions that enable loading various files such as HDF5 files, Igor Pro files, and ARPES data from different beamlines and laboratories.

Modules

plugins

Data loading plugins.

dataloader

Base functionality for implementing data loaders.

utilities

igor

exampledata

Generates simple simulated ARPES data for testing purposes.

characterization

Data import for characterization experiments.

For a single session, it is very common to use only one type of loader for a single folder with all your data. Hence, the module provides a way to set a default loader for a session. This is done using the set_loader() function. The same can be done for the data directory using the set_data_dir() function.
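For example, a typical session might start like this (the loader name, directory, and scan number below are placeholders):

>>> erlab.io.set_loader("merlin")
>>> erlab.io.set_data_dir("/path/to/data")
>>> dat = erlab.io.load(1)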

For instructions on how to write a custom loader, see erlab.io.dataloader.

Examples

  • View all registered loaders:

    >>> erlab.io.loaders
    
  • Load data by explicitly specifying the loader:

    >>> dat = erlab.io.loaders["merlin"].load(...)
    
erlab.io.load(data_dir=None, **kwargs)[source]

Load ARPES data.

Parameters:
  • identifier – Value that identifies a scan uniquely. If a string or path-like object is given, it is assumed to be the path to the data file. If an integer is given, it is assumed to be a number that specifies the scan number, and is used to automatically determine the path to the data file(s).

  • data_dir (str | os.PathLike | None) – Where to look for the data. If None, the default data directory will be used.

  • single – For some setups, data for a single scan is saved over multiple files. This argument is only used for such setups. When identifier is resolved to a single file within a multiple file scan, the default behavior when single is False is to return a single concatenated array that contains data from all files in the same scan. If single is set to True, only the data from the file given is returned. This argument is ignored when identifier is a number.

  • **kwargs – Additional keyword arguments are passed to identify.

Returns:

The loaded data.

Return type:

xarray.DataArray or xarray.Dataset or list of xarray.DataArray

erlab.io.load_experiment(filename, folder=None, *, prefix=None, ignore=None, recursive=False, **kwargs)[source]

Load waves from an igor experiment (pxp) file.

Parameters:
  • filename (str | PathLike) – The experiment file.

  • folder (str | None) – Target folder within the experiment, given as a slash-separated string. If None, defaults to the root.

  • prefix (str | None) – If given, only include waves with names that start with the given string.

  • ignore (list[str] | None) – List of wave names to ignore.

  • recursive (bool) – If True, includes waves in child directories.

  • **kwargs – Extra arguments to load_wave().

Returns:

Dataset containing the waves.

Return type:

xarray.Dataset

erlab.io.load_hdf5(filename, **kwargs)[source]

Load data from an HDF5 file saved with save_as_hdf5.

This is a thin wrapper around xarray.load_dataarray and xarray.load_dataset.

Parameters:
Returns:

The loaded data.

Return type:

xarray.DataArray or xarray.Dataset

erlab.io.load_wave(wave, data_dir=None)[source]

Load a wave from Igor binary format.

Parameters:
  • wave (dict | WaveRecord | str | PathLike) – The wave to load. It can be provided as a dictionary, an instance of igor2.record.WaveRecord, or a string representing the path to the wave file.

  • data_dir (str | PathLike | None) – The directory where the wave file is located. This parameter is only used if wave is a string or PathLike object. If None, wave must be a valid path.

Returns:

The loaded wave.

Return type:

xarray.DataArray

Raises:
  • ValueError – If the wave file cannot be found or loaded.

  • TypeError – If the wave argument is of an unsupported type.

erlab.io.loader_context(loader=None, data_dir=None)[source]

Context manager for the current data loader and data directory.

Parameters:
  • loader (str, optional) – The name or alias of the loader to use in the context.

  • data_dir (str or os.PathLike, optional) – The data directory to use in the context.

Examples

  • Load data within a context manager:

    >>> with erlab.io.loader_context("merlin"):
    ...     dat_merlin = erlab.io.load(...)
    
  • Load data with different loaders and directories:

    >>> erlab.io.set_loader("ssrl52", data_dir="/path/to/dir1")
    >>> dat_ssrl_1 = erlab.io.load(...)
    >>> with erlab.io.loader_context("merlin", data_dir="/path/to/dir2"):
    ...     dat_merlin = erlab.io.load(...)
    >>> dat_ssrl_2 = erlab.io.load(...)
    
erlab.io.open_hdf5(filename, **kwargs)[source]

Open data from an HDF5 file saved with save_as_hdf5.

This is a thin wrapper around xarray.open_dataarray and xarray.open_dataset.

Parameters:
Returns:

The opened data.

Return type:

xarray.DataArray or xarray.Dataset

erlab.io.save_as_hdf5(data, filename, igor_compat=True, **kwargs)[source]

Save data in HDF5 format.

Parameters:
erlab.io.save_as_netcdf(data, filename, **kwargs)[source]

Save data in netCDF4 format.

Discards invalid netCDF4 attributes and produces a warning.

Parameters:
erlab.io.set_data_dir(data_dir)[source]

Set the default data directory for the data loader.

All subsequent calls to load will use the data_dir set here unless specified.

Parameters:

data_dir (str | PathLike | None) – The path to a directory.

Note

This will only affect load. If the loader’s load method is called directly, it will not use the default data directory.

erlab.io.set_loader(loader)[source]

Set the current data loader.

All subsequent calls to load will use the loader set here.

Parameters:

loader (str | LoaderBase | None) – The loader to set. It can be either a string representing the name or alias of the loader, or a valid loader class.

Example

>>> erlab.io.set_loader("merlin")
>>> dat_merlin_1 = erlab.io.load(...)
>>> dat_merlin_2 = erlab.io.load(...)
erlab.io.summarize(usecache=True, *, cache=True, display=True, **kwargs)[source]

Summarize the data in the given directory.

Takes a path to a directory and summarizes the data in the directory to a table, much like a log file. This is useful for quickly inspecting the contents of a directory.

The dataframe is formatted using the style from get_styler and displayed in the IPython shell. Results are cached in a pickle file in the directory.

Parameters:
  • data_dir – Directory to summarize.

  • usecache (bool) – Whether to use the cached summary if available. If False, the summary will be regenerated. The cache will be updated if cache is True.

  • cache (bool) – Whether to cache the summary in a pickle file in the directory. If False, no cache will be created or updated. Note that existing cache files will not be deleted, and will be used if usecache is True.

  • display (bool) – Whether to display the formatted dataframe using the IPython shell. If False, the dataframe will be returned without formatting. If True but the IPython shell is not detected, the dataframe styler will be returned.

  • **kwargs – Additional keyword arguments to be passed to generate_summary.

Returns:

df – Summary of the data in the directory.

  • If display is False, the summary DataFrame is returned.

  • If display is True and the IPython shell is detected, the summary will be displayed, and None will be returned.

    • If ipywidgets is installed, an interactive widget will be returned instead of None.

  • If display is True but the IPython shell is not detected, the styler for the summary DataFrame will be returned.

Return type:

pandas.DataFrame or pandas.io.formats.style.Styler or None

erlab.io.loaders

Global instance of LoaderRegistry.

Plotting (erlab.plotting)

Everything related to plotting.

Modules

annotations

Plot annotations.

atoms

Plot atoms.

bz

Utilities for plotting Brillouin zones.

colors

Utilities related to manipulating colors.

erplot

Convenient access to various plotting functions.

general

General plotting utilities.

plot3d

Extensions to mplot3d.

Functions

load_igor_ct(fname, name)

Load an Igor CT wave file (.ibw) and register it as a matplotlib colormap.

Interactive (erlab.interactive)

Interactive plotting based on Qt and pyqtgraph.

Interactive tools

imagetool

Interactive data browser.

bzplot

colors

Functions for manipulating colors in Qt.

curvefittingtool

fermiedge

kspace

Interactive momentum conversion tool.

masktool

derivative

Interactive tool for visualizing dispersive data.

utilities

Various helper functions and extensions to pyqtgraph.

erlab.interactive.dtool(data, data_name=None, *, execute=None)[source]
erlab.interactive.goldtool(data, data_corr=None, *, data_name=None, **kwargs)[source]

Interactive gold edge fitting.

Parameters:
  • data (DataArray) – The data to perform Fermi edge fitting on.

  • data_corr (DataArray | None) – The data to correct with the edge. Defaults to data.

  • data_name (str | None) – Name of the data used in generating the code snippet copied to the clipboard. Overrides automatic detection.

  • **kwargs – Arguments passed onto erlab.interactive.utilities.AnalysisWindow.

erlab.interactive.itool(data, link=False, link_colors=True, execute=None, **kwargs)[source]

Create and display an ImageTool window.

Parameters:
  • data (Collection[xr.DataArray | npt.NDArray] | xr.DataArray | npt.NDArray | xr.Dataset) – Array-like object or a sequence of such object with 2 to 4 dimensions. See notes.

  • link (bool) – Whether to enable linking between multiple ImageTool windows, by default False.

  • link_colors (bool) – Whether to link the color maps between multiple linked ImageTool windows, by default True.

  • execute (bool | None) – Whether to execute the Qt event loop and display the window, by default None. If None, the execution is determined based on the current IPython shell.

  • **kwargs – Additional keyword arguments to be passed onto the underlying slicer area. For a full list of supported arguments, see the erlab.interactive.imagetool.core.ImageSlicerArea documentation.

Returns:

The created ImageTool window(s).

Return type:

ImageTool or list of ImageTool

Notes

  • If data is a sequence of valid data, multiple ImageTool windows will be created and displayed.

  • If data is a Dataset, each DataArray in the Dataset will be displayed in a separate ImageTool window. Data variables with 2 to 4 dimensions are considered as valid. Other variables are ignored.

  • If link is True, the ImageTool windows will be synchronized.

Examples

>>> itool(data, cmap="gray", gamma=0.5)
>>> itool(data_list, link=True)
erlab.interactive.ktool(data, *, data_name=None, execute=None)[source]

Interactive momentum conversion tool.

Accessors (erlab.accessors)

Some xarray accessors for convenient data analysis and visualization.

Modules
class erlab.accessors.InteractiveDataArrayAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.qshow accessor for interactive visualization.

__call__(*args, **kwargs)[source]

Visualize the data interactively.

Chooses the appropriate interactive visualization method based on the number of dimensions in the data.

Parameters:
  • *args – Positional arguments passed onto the interactive visualization function.

  • **kwargs – Keyword arguments passed onto the interactive visualization function.

hvplot(*args, **kwargs)[source]

hvplot-based interactive visualization.

Parameters:
  • *args – Positional arguments passed onto DataArray.hvplot().

  • **kwargs – Keyword arguments passed onto DataArray.hvplot().

Raises:

ImportError – If hvplot is not installed.

itool(*args, **kwargs)[source]

Shortcut for itool.

Parameters:
  • *args – Positional arguments passed onto itool.

  • **kwargs – Keyword arguments passed onto itool.

class erlab.accessors.InteractiveDatasetAccessor(xarray_obj)[source]

Bases: ERLabDatasetAccessor

xarray.Dataset.qshow accessor for interactive visualization.

__call__(*args, **kwargs)[source]

Visualize the data interactively.

Chooses the appropriate interactive visualization method based on the data variables.

Parameters:
  • *args – Positional arguments passed onto the interactive visualization function.

  • **kwargs – Keyword arguments passed onto the interactive visualization function.

fit(plot_components=False)[source]

Interactive visualization of fit results.

Parameters:

plot_components (bool) – If True, plot the components of the fit. Default is False. Requires the Dataset to have a modelfit_results variable.

Returns:

A panel containing the interactive visualization.

Return type:

panel.Column

hvplot(*args, **kwargs)[source]

hvplot-based interactive visualization.

Parameters:
  • *args – Positional arguments passed onto Dataset.hvplot().

  • **kwargs – Keyword arguments passed onto Dataset.hvplot().

Raises:

ImportError – If hvplot is not installed.

itool(*args, **kwargs)[source]

Shortcut for itool.

Parameters:
  • *args – Positional arguments passed onto itool.

  • **kwargs – Keyword arguments passed onto itool.

class erlab.accessors.ModelFitDataArrayAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.modelfit accessor for fitting lmfit models.

__call__(*args, **kwargs)[source]

Curve fitting optimization for arbitrary models.

Wraps lmfit.Model.fit with xarray.apply_ufunc().

Parameters:
  • coords (Hashable, xarray.DataArray, or Sequence of Hashable or xarray.DataArray) – Independent coordinate(s) over which to perform the curve fitting. Must share at least one dimension with the calling object. When fitting multi-dimensional functions, supply coords as a sequence in the same order as arguments in func. To fit along existing dimensions of the calling object, coords can also be specified as a str or sequence of strs.

  • model (lmfit.Model) – A model object to fit to the data. The model must be an instance of lmfit.Model.

  • reduce_dims (str, Iterable of Hashable or None, optional) – Additional dimension(s) over which to aggregate while fitting. For example, calling ds.modelfit(coords='time', reduce_dims=['lat', 'lon'], ...) will aggregate all lat and lon points and fit the specified function along the time dimension.

  • skipna (bool, default: True) – Whether to skip missing values when fitting. Default is True.

  • params (lmfit.Parameters, dict-like, or xarray.DataArray, optional) – Optional input parameters to the fit. If a lmfit.Parameters object, it will be used for all fits. If a dict-like object, it must look like the keyword arguments to lmfit.create_params. Additionally, each value of the dictionary may also be a DataArray, which will be broadcasted appropriately. If a DataArray, each entry must be a dictionary-like object, a lmfit.Parameters object, or a JSON string created with lmfit.Parameters.dumps. If given a Dataset, the name of the data variables in the Dataset must match the name of the data variables in the calling object, and each data variable will be used as the parameters for the corresponding data variable. If none or only some parameters are passed, the rest will be assigned initial values and bounds with lmfit.Model.make_params, or guessed with lmfit.Model.guess if guess is True.

  • guess (bool, default: False) – Whether to guess the initial parameters with lmfit.Model.guess. For composite models, the parameters will be guessed for each component.

  • errors ({"raise", "ignore"}, default: "raise") – If 'raise', any errors from the lmfit.Model.fit optimization will raise an exception. If 'ignore', the return values for the coordinates where the fitting failed will be NaN.

  • parallel (bool, optional) – Whether to parallelize the fits over the data variables. If not provided, parallelization is only applied for non-dask Datasets with more than 200 data variables.

  • parallel_kw (dict, optional) – Additional keyword arguments to pass to the parallelization backend joblib.Parallel if parallel is True.

  • progress (bool, default: False) – Whether to show a progress bar for fitting over data variables. Only useful if there are multiple data variables to fit.

  • output_result (bool, default: True) – Whether to include the full lmfit.model.ModelResult object in the output dataset. If True, the result will be stored in a variable named modelfit_results.

  • **kwargs (optional) – Additional keyword arguments to passed to lmfit.Model.fit.

Returns:

curvefit_results – A single dataset which contains:

modelfit_results

The full lmfit.model.ModelResult object from the fit. Only included if output_result is True.

modelfit_coefficients

The coefficients of the best fit.

modelfit_stderr

The standard errors of the coefficients.

modelfit_covariance

The covariance matrix of the coefficients. Note that elements corresponding to non varying parameters are set to NaN, and the actual size of the covariance matrix may be smaller than the array.

modelfit_stats

Statistics from the fit. See lmfit.minimize.

modelfit_data

Data used for the fit.

modelfit_best_fit

The best fit data of the fit.

Return type:

xarray.Dataset

class erlab.accessors.ModelFitDatasetAccessor(xarray_obj)[source]

Bases: ERLabDatasetAccessor

xarray.Dataset.modelfit accessor for fitting lmfit models.

__call__(coords, model, reduce_dims=None, skipna=True, params=None, guess=False, errors='raise', parallel=None, parallel_kw=None, progress=False, output_result=True, **kwargs)[source]

Curve fitting optimization for arbitrary models.

Wraps lmfit.Model.fit with xarray.apply_ufunc().

Parameters:
  • coords (Hashable, xarray.DataArray, or Sequence of Hashable or xarray.DataArray) – Independent coordinate(s) over which to perform the curve fitting. Must share at least one dimension with the calling object. When fitting multi-dimensional functions, supply coords as a sequence in the same order as arguments in func. To fit along existing dimensions of the calling object, coords can also be specified as a str or sequence of strs.

  • model (lmfit.Model) – A model object to fit to the data. The model must be an instance of lmfit.Model.

  • reduce_dims (str, Iterable of Hashable or None, optional) – Additional dimension(s) over which to aggregate while fitting. For example, calling ds.modelfit(coords='time', reduce_dims=['lat', 'lon'], ...) will aggregate all lat and lon points and fit the specified function along the time dimension.

  • skipna (bool, default: True) – Whether to skip missing values when fitting. Default is True.

  • params (lmfit.Parameters, dict-like, or xarray.DataArray, optional) – Optional input parameters to the fit. If a lmfit.Parameters object, it will be used for all fits. If a dict-like object, it must look like the keyword arguments to lmfit.create_params. Additionally, each value of the dictionary may also be a DataArray, which will be broadcasted appropriately. If a DataArray, each entry must be a dictionary-like object, a lmfit.Parameters object, or a JSON string created with lmfit.Parameters.dumps. If given a Dataset, the name of the data variables in the Dataset must match the name of the data variables in the calling object, and each data variable will be used as the parameters for the corresponding data variable. If none or only some parameters are passed, the rest will be assigned initial values and bounds with lmfit.Model.make_params, or guessed with lmfit.Model.guess if guess is True.

  • guess (bool, default: False) – Whether to guess the initial parameters with lmfit.Model.guess. For composite models, the parameters will be guessed for each component.

  • errors ({"raise", "ignore"}, default: "raise") – If 'raise', any errors from the lmfit.Model.fit optimization will raise an exception. If 'ignore', the return values for the coordinates where the fitting failed will be NaN.

  • parallel (bool, optional) – Whether to parallelize the fits over the data variables. If not provided, parallelization is only applied for non-dask Datasets with more than 200 data variables.

  • parallel_kw (dict, optional) – Additional keyword arguments to pass to the parallelization backend joblib.Parallel if parallel is True.

  • progress (bool, default: False) – Whether to show a progress bar for fitting over data variables. Only useful if there are multiple data variables to fit.

  • output_result (bool, default: True) – Whether to include the full lmfit.model.ModelResult object in the output dataset. If True, the result will be stored in a variable named [var]_modelfit_results.

  • **kwargs (optional) – Additional keyword arguments to be passed to lmfit.Model.fit.

Returns:

curvefit_results – A single dataset which contains:

[var]_modelfit_results

The full lmfit.model.ModelResult object from the fit. Only included if output_result is True.

[var]_modelfit_coefficients

The coefficients of the best fit.

[var]_modelfit_stderr

The standard errors of the coefficients.

[var]_modelfit_covariance

The covariance matrix of the coefficients. Note that elements corresponding to non-varying parameters are set to NaN, so the actual covariance matrix may be smaller than the stored array.

[var]_modelfit_stats

Statistics from the fit. See lmfit.minimize.

[var]_modelfit_data

Data used for the fit.

[var]_modelfit_best_fit

The best fit data of the fit.

Return type:

xarray.Dataset
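For a Dataset, each data variable is fitted independently and the output names are prefixed with the variable name. A minimal sketch (hypothetical data; lmfit.models.LinearModel ships with lmfit, and importing erlab.accessors is assumed to register the accessor):

import lmfit.models
import numpy as np
import xarray as xr

import erlab.accessors  # noqa: F401  (assumed to register the .modelfit accessor)

x = np.linspace(0.0, 1.0, 50)
ds = xr.Dataset(
    {"cut1": ("x", 2.0 * x + 1.0), "cut2": ("x", -0.5 * x + 3.0)},
    coords={"x": x},
)

result = ds.modelfit("x", lmfit.models.LinearModel())
# Output variables are prefixed per data variable, e.g. "cut1_modelfit_coefficients"
print(list(result.data_vars))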

class erlab.accessors.MomentumAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.kspace accessor for momentum conversion related utilities.

This class provides convenient access to various momentum-related properties of a data object. It allows getting and setting properties such as configuration, inner potential, work function, angle resolution, slit axis, momentum axes, angle parameters, and offsets.

convert(bounds=None, resolution=None, *, silent=False, **coords)[source]

Convert to momentum space.

Parameters:
  • bounds (dict[str, tuple[float, float]] | None) – A dictionary specifying the bounds for each coordinate axis. The keys are the names of the axes, and the values are tuples representing the lower and upper bounds of the axis. If not provided, the bounds will be estimated based on the data.

  • resolution (dict[str, float] | None) – A dictionary specifying the resolution for each momentum axis. The keys are the names of the axes, and the values are floats representing the desired resolution of the axis. If not provided, the resolution will be estimated based on the data. For in-plane momentum, the resolution is estimated from the angle resolution and kinetic energy. For out-of-plane momentum, two values are calculated. One is based on the number of photon energy points, and the other is estimated as the inverse of the photoelectron inelastic mean free path given by the universal curve. The resolution is estimated as the smaller of the two values.

  • silent (bool) – If True, suppress informational printing. Default is False.

  • **coords – Array-like keyword arguments that specify the coordinate array for each momentum axis. If provided, the bounds and resolution will be ignored.

Returns:

The converted data.

Return type:

xarray.DataArray

Note

This method converts the data to a new coordinate system specified by the provided bounds and resolution. It uses interpolation to map the data from the original coordinate system to the new one.

The converted data is returned as a DataArray object with updated coordinates and dimensions.

Examples

Set parameters and convert with automatic bounds and resolution:

data.kspace.offsets = {"delta": 0.1, "xi": 0.0, "beta": 0.3}
data.kspace.work_function = 4.3
data.kspace.inner_potential = 12.0
converted_data = data.kspace.convert()

Convert with specified bounds and resolution:

bounds = {"kx": (0.0, 1.0), "ky": (-1.0, 1.0)}
resolution = {"kx": 0.01, "ky": 0.01}
converted_data = data.kspace.convert(bounds, resolution)
convert_coords()[source]

Convert the coordinates to momentum space.

Assigns new exact momentum coordinates to the data. This is useful when you want to work with momentum coordinates but don’t want to interpolate the data.

Returns:

The DataArray with transformed coordinates.

Return type:

xarray.DataArray

estimate_bounds()[source]

Estimate the bounds of the data in momentum space.

Returns:

bounds – A dictionary containing the estimated bounds for each parameter. The keys of the dictionary are ‘kx’, ‘ky’, and ‘kz’ (for \(hν\)-dependent data). The values are tuples representing the minimum and maximum values.

Return type:

dict[str, tuple[float, float]]

estimate_resolution(axis, lims=None, from_numpoints=False)[source]

Estimate the resolution for a given momentum axis.

Parameters:
  • axis (Literal['kx', 'ky', 'kz']) – Axis to estimate the resolution for.

  • lims (tuple[float, float] | None) – The limits of the axis, used when from_numpoints is True. If not provided, reasonable limits will be calculated by estimate_bounds(). Default is None.

  • from_numpoints (bool) – If True, estimate the resolution from the number of points along the relevant axis. If False, estimate the resolution based on the data. Default is False.

Returns:

The estimated resolution.

Return type:

float

Raises:

ValueError – If no photon energy axis is found in data for axis 'kz'.

hv_to_kz(hv)[source]

Return \(k_z\) for a given photon energy.

Useful when creating overlays on \(hν\)-dependent data.

Parameters:

hv (float) – Photon energy in eV.

Note

This will be inexact for hv-dependent cuts that do not pass through the BZ center, since the exact angle values, i.e., the exact momentum perpendicular to the slit, are lost during momentum conversion.
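For example (hypothetical photon energy; darr is assumed to be photon-energy-dependent data with its momentum conversion parameters already set):

kz_at_100 = darr.kspace.hv_to_kz(100.0)  # k_z reached at hv = 100 eV, e.g. for plot overlays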

interactive(**kwargs)[source]

Open the interactive momentum space conversion tool.

property alpha: DataArray
property angle_params: dict[str, float]

Parameters passed to erlab.analysis.kspace.get_kconv_func().

property angle_resolution: float

Retrieve the angular resolution of the data in degrees.

Checks for the angle_resolution attribute of the data. If not found, a default value of 0.1° is silently assumed.

This property is used in best_kp_resolution upon estimating momentum step sizes through estimate_resolution.

property best_kp_resolution: float

Estimate the minimum in-plane momentum resolution.

The resolution is estimated with the kinetic energy and angular resolution:

\[\Delta k_{\parallel} \sim \sqrt{2 m_e E_k/\hbar^2} \cos(\alpha) \Delta\alpha\]
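As a back-of-the-envelope illustration of this estimate (hypothetical numbers; not necessarily how the property is evaluated internally, but it uses the documented constant erlab.constants.rel_kconv, and the angular factors must be in radians):

import numpy as np

import erlab.constants

e_kin = 100.0  # kinetic energy in eV (hypothetical)
angle = 10.0   # emission angle in degrees (hypothetical)
dangle = 0.1   # angular resolution in degrees

# Δk ~ sqrt(2 m_e E_k)/ħ · cos(α) · Δα; the prefactor gives k in Å⁻¹ for E_k in eV
dk = erlab.constants.rel_kconv * np.sqrt(e_kin) * np.cos(np.deg2rad(angle)) * np.deg2rad(dangle)
print(round(dk, 4))  # ≈ 0.0088 Å⁻¹
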
property best_kz_resolution: float

Estimate the minimum out-of-plane momentum resolution.

The resolution is estimated based on the mean free path [8] and the kinetic energy.

\[\Delta k_z \sim 1/\lambda\]
property beta: DataArray
property binding_energy: DataArray
property configuration: AxesConfiguration

Return the experimental configuration.

For a properly implemented data loader, the configuration attribute must be set on data import. See erlab.analysis.kspace.AxesConfiguration for details.

property has_beta: bool

Check if the coordinate array for \(β\) has more than one element.

Returns:

Returns True if the size of the coordinate array for \(β\) is greater than 1, False otherwise.

Return type:

bool

Note

This may be True for data that are not maps. For instance, \(hν\)-dependent cuts with an in-plane momentum offset may have a \(hν\)-dependent \(β\) offset.

property has_eV: bool

Return True if object has an energy axis.

property has_hv: bool

Return True for photon energy dependent data.

property hv: DataArray
property inner_potential: float

Inner potential of the sample in eV.

The inner potential is stored in the inner_potential attribute of the data. If the inner potential is not set, a warning is issued and a default value of 10.0 eV is assumed.

Note

This property provides a setter method that takes a float value and sets the data attribute accordingly.

Example

>>> data.kspace.inner_potential = 13.0
>>> data.kspace.inner_potential
13.0
property kinetic_energy: DataArray
property momentum_axes: tuple[Literal['kx', 'ky', 'kz'], ...]

Return the momentum axes of the data after conversion.

Returns:

For photon energy dependent scans, it returns the slit axis and 'kz'. For maps, it returns 'kx' and 'ky'. Otherwise, it returns only the slit axis.

Return type:

tuple

property offsets: OffsetView

Angle offsets used in momentum conversion.

Returns:

A mapping between valid offset keys and their corresponding offsets.

Return type:

OffsetView

Examples

  • View all offsets

    >>> data.kspace.offsets
    {'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
    
  • View single offset

    >>> data.kspace.offsets["beta"]
    0.0
    
  • Offsets to dictionary

    >>> dict(data.kspace.offsets)
    {'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
    
  • Set single offset

    >>> data.kspace.offsets["beta"] = 3.0
    >>> data.kspace.offsets
    {'delta': 0.0, 'xi': 0.0, 'beta': 3.0}
    
  • Overwrite offsets with dictionary

    >>> data.kspace.offsets = dict(delta=1.5, xi=2.7)
    >>> data.kspace.offsets
    {'delta': 1.5, 'xi': 2.7, 'beta': 0.0}
    
  • Update offsets

    >>> data.kspace.offsets.update(beta=0.1, xi=0.0)
    {'delta': 1.5, 'xi': 0.0, 'beta': 0.1}
    
  • Reset all offsets

    >>> data.kspace.offsets.reset()
    {'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
    
property other_axis: Literal['kx', 'ky']

Return the momentum axis perpendicular to the slit.

Returns:

Returns 'ky' for type 1 configurations, 'kx' otherwise.

Return type:

str

property slit_axis: Literal['kx', 'ky']

Return the momentum axis parallel to the slit.

Returns:

Returns 'kx' for type 1 configurations, 'ky' otherwise.

Return type:

str

property valid_offset_keys: tuple[str, ...]

Get valid offset angles based on the experimental configuration.

Returns:

A tuple containing the valid offset keys. For configurations with a deflector, returns ("delta", "chi", "xi"). Otherwise, returns ("delta", "xi", "beta").

Return type:

tuple

property work_function: float

Work function of the sample in eV.

The work function is stored in the sample_workfunction attribute of the data. If not found, a warning is issued and a default value of 4.5 eV is assumed.

Note

This property provides a setter method that takes a float value and sets the data attribute accordingly.

Example

>>> data.kspace.work_function = 4.5
>>> data.kspace.work_function
4.5
class erlab.accessors.OffsetView(xarray_obj)[source]

Bases: object

A class representing an offset view for an xarray.DataArray.

This class provides a convenient way to access and manipulate angle offsets associated with the given data.

Parameters:

xarray_obj (DataArray) – The xarray.DataArray for which the offset view is created.

__len__() -> int

Returns the number of valid offset keys.

__iter__() -> Iterator[str, float][source]

Returns an iterator over the valid offset keys and their corresponding values.

__getitem__(key: str) -> float[source]

Returns the offset value associated with the given key.

__setitem__(key: str, value: float) -> None[source]

Sets the offset value for the given key.

__eq__(other: object) -> bool[source]

Compares the offset view with another object for equality. True if the dictionary representation is equal, False otherwise.

__repr__() -> str[source]

Returns a string representation of the offset view.

_repr_html_() -> str[source]

Returns an HTML representation of the offset view.

items()[source]

Return a view of the offset view as a list of (key, value) pairs.

reset()[source]

Reset all angle offsets to zero.

update(other=None, **kwargs)[source]

Update the offset view with the provided key-value pairs.

class erlab.accessors.ParallelFitDataArrayAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.parallel_fit accessor for fitting lmfit models in parallel.

__call__(dim, model, **kwargs)[source]

Fit the specified model to the data along the given dimension.

Parameters:
  • dim (str) – The name of the dimension along which to fit the model.

  • model (lmfit.Model) – The model to fit.

  • **kwargs (dict) – Additional keyword arguments to be passed to xarray.Dataset.modelfit.

Returns:

curvefit_results – The dataset containing the results of the fit. See xarray.DataArray.modelfit for details.

Return type:

xarray.Dataset

class erlab.accessors.PlotAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.qplot accessor for plotting data.

__call__(*args, **kwargs)[source]

Plot the data.

If a 2D data array is provided, it is plotted using plot_array. Otherwise, it is equivalent to calling xarray.DataArray.plot().

Parameters:
  • *args – Positional arguments to be passed to the plotting function.

  • **kwargs – Keyword arguments to be passed to the plotting function.
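For example (darr is assumed to be a 2D DataArray; cmap is a hypothetical keyword argument forwarded to the underlying plotting function):

import matplotlib.pyplot as plt

darr.qplot(cmap="Greys")  # 2D data is dispatched to plot_array
plt.show()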

class erlab.accessors.SelectionAccessor(xarray_obj)[source]

Bases: ERLabDataArrayAccessor

xarray.DataArray.qsel accessor for convenient selection and averaging.

__call__(indexers=None, *, verbose=False, **indexers_kwargs)[source]

Select and average data along specified dimensions.

Parameters:
  • indexers (Mapping[Hashable, float | slice] | None) –

    Dictionary specifying the dimensions and their values or slices. Position along a dimension can be specified in three ways:

    • As a scalar value: alpha=-1.2

      If no width is specified, the data is selected at the nearest value. It is equivalent to xarray.DataArray.sel with method='nearest'.

    • As a value and width: alpha=5, alpha_width=0.5

      The data is averaged over a slice of width alpha_width, centered at alpha.

    • As a slice: alpha=slice(-10, 10)

      The data is selected over the specified slice. No averaging is performed.

    One of indexers or indexers_kwargs must be provided.

  • verbose (bool) – If True, print information about the selected data and averaging process. Default is False.

  • **indexers_kwargs – The keyword arguments form of indexers. One of indexers or indexers_kwargs must be provided.

Returns:

The selected and averaged data.

Return type:

xarray.DataArray

Raises:

ValueError – If a specified dimension is not present in the data.
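Putting the three selection modes together, a sketch assuming a DataArray darr with an alpha dimension:

cut = darr.qsel(alpha=-1.2)                # nearest-value selection
edc = darr.qsel(alpha=5, alpha_width=0.5)  # average over a window of width 0.5 centered at 5
sub = darr.qsel(alpha=slice(-10, 10))      # plain slice, no averaging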

Submodules

Submodule

Description

erlab.lattice

Tools for working with real and reciprocal lattices.

erlab.constants

Physical constants and functions for unit conversion.

erlab.parallel

Helpers for parallel processing.

Lattices (erlab.lattice)

Tools related to the real and reciprocal lattice.

Functions

angle_between(v1, v2)

Return the angle between two vectors.

abc2avec(a, b, c, alpha, beta, gamma)

Construct lattice vectors from lattice parameters.

avec2abc(avec)

Determine lattice parameters from lattice vectors.

to_reciprocal(avec)

Construct the reciprocal lattice vectors from real lattice vectors.

to_real(bvec)

Construct the real lattice vectors from reciprocal lattice vectors.

erlab.lattice.abc2avec(a, b, c, alpha, beta, gamma)[source]

Construct lattice vectors from lattice parameters.

erlab.lattice.angle_between(v1, v2)[source]

Return the angle between two vectors.

Parameters:
  • v1 (ndarray[Any, dtype[float64]]) – 1D array of length 3, specifying a vector.

  • v2 (ndarray[Any, dtype[float64]]) – 1D array of length 3, specifying a vector.

Returns:

The angle in degrees.

Return type:

float

erlab.lattice.avec2abc(avec)[source]

Determine lattice parameters from lattice vectors.

erlab.lattice.to_real(bvec)[source]

Construct the real lattice vectors from reciprocal lattice vectors.

Parameters:

bvec (ndarray[Any, dtype[float64]]) – Reciprocal lattice vectors, given as a 3 by 3 numpy array with each basis vector given in each row.

Returns:

The real lattice vectors.

Return type:

avec

erlab.lattice.to_reciprocal(avec)[source]

Construct the reciprocal lattice vectors from real lattice vectors.

Parameters:

avec (ndarray[Any, dtype[float64]]) – Real lattice vectors, given as a 3 by 3 numpy array with each basis vector given in each row.

Returns:

The reciprocal lattice vectors.

Return type:

bvec
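A quick illustration (the units here are assumptions: lattice constants in angstroms, angles in degrees, and the conventional factor of 2π in the reciprocal lattice):

import numpy as np

import erlab.lattice

# Simple cubic lattice with a = 3.5 Å (hypothetical)
avec = erlab.lattice.abc2avec(3.5, 3.5, 3.5, 90, 90, 90)
bvec = erlab.lattice.to_reciprocal(avec)

print(np.linalg.norm(bvec[0]))                        # 2π/a ≈ 1.795 under the assumed convention
print(erlab.lattice.angle_between(avec[0], avec[1]))  # 90.0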

Constants (erlab.constants)

Physical constants and unit conversion.

Module Attributes

e

Elementary charge \(e\) (C)

c

Speed of light \(c\) (m/s)

m_e

Electron mass \(m_e\) (kg) (±0.000 000 0028e-31)

mcsq

Electron rest energy \(m_e c^2\) (J) (±0.000 000 0025e-14)

mcsq_eV

Electron rest energy \(m_e c^2\) (eV) (±0.000 15)

h

Planck constant \(h\) (J·s)

hc

\(hc\) (J·m)

hbar

Dirac constant \(\hbar\) (J·s)

hbarc

\(\hbar c\) (J·m)

hbarsq

\(\hbar^2\) (J²·s²)

h_eV

Planck constant \(h\) (eV·s)

hc_eV

\(hc\) (eV·m)

hbar_eV

Dirac constant \(\hbar\) (eV·s)

hbarsq_eV

\(\hbar^2\) (eV²·s²)

rel_eV_nm

\(hc\) (eV·nm), Used in energy-wavelength conversion

rel_kconv

\(\frac{\sqrt{2 m_e}}{\hbar}\), Used in momentum conversion

rel_kzconv

\(\frac{\hbar^2}{2 m_e}\), Used in momentum conversion

kb

Boltzmann constant \(k_B\) (J/K)

kb_eV

Boltzmann constant \(k_B\) (eV/K)

Functions

conv_eV_nm(value)

Convert between energy and wavelength.

conv_watt_photons(value, wavelength)

Convert Watts to photons per second.

erlab.constants.conv_eV_nm(value)[source]

Convert between energy and wavelength.

erlab.constants.conv_watt_photons(value, wavelength)[source]

Convert Watts to photons per second.

Parameters:
  • value (float) – Power in Watts.

  • wavelength (float) – Wavelength in nanometers.

Returns:

power – Power in photons per second.

Return type:

float
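A quick numerical check using the constants documented below (the physical relations are standard; only the behaviour of conv_eV_nm is taken from its one-line description above):

import erlab.constants as C

# Energy–wavelength conversion: E [eV] = rel_eV_nm / λ [nm], and vice versa
print(C.conv_eV_nm(532))   # ≈ 2.33 (eV for 532 nm, or nm for 532 eV — the relation is symmetric)
print(C.rel_eV_nm / 532)   # the same relation written out explicitly

# 1 mW of 532 nm light carries P·λ/(hc) ≈ 2.7e15 photons per second
print(1e-3 * 532e-9 / C.hc)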

erlab.constants.c: float = 299792458.0

Speed of light \(c\) (m/s)

erlab.constants.e: float = 1.602176634e-19

Elementary charge \(e\) (C)

erlab.constants.h: float = 6.62607015e-34

Planck constant \(h\) (J·s)

erlab.constants.h_eV: float = 4.135667696923859e-15

Planck constant \(h\) (eV·s)

erlab.constants.hbar: float = 1.0545718176461565e-34

Dirac constant \(\hbar\) (J·s)

erlab.constants.hbar_eV: float = 6.582119569509067e-16

Dirac constant \(\hbar\) (eV·s)

erlab.constants.hbarc: float = 3.1615267734966903e-26

\(\hbar c\) (J·m)

erlab.constants.hbarsq: float = 1.1121217185735183e-68

\(\hbar^2\) (J²·s²)

erlab.constants.hbarsq_eV: float = 4.332429802731423e-31

\(\hbar^2\) (eV²·s²)

erlab.constants.hc: float = 1.9864458571489286e-25

\(hc\) (J·m)

erlab.constants.hc_eV: float = 1.2398419843320028e-06

\(hc\) (eV·m)

erlab.constants.kb: float = 1.380649e-23

Boltzmann constant \(k_B\) (J/K)

erlab.constants.kb_eV: float = 8.617333262145179e-05

Boltzmann constant \(k_B\) (eV/K)

erlab.constants.m_e: float = 9.1093837015e-31

Electron mass \(m_e\) (kg) (±0.000 000 0028e-31)

erlab.constants.m_n: float = 1.67492749804e-27

Neutron mass \(m_n\) (kg) (±0.000 000 000 95e-27)

erlab.constants.mcsq: float = 8.1871057769e-14

Electron rest energy \(m_e c^2\) (J) (±0.000 000 0025e-14)

erlab.constants.mcsq_eV: float = 510998.95

Electron rest energy \(m_e c^2\) (eV) (±0.000 15)

erlab.constants.rel_eV_nm: float = 1239.8419843320028

\(hc\) (eV·nm), Used in energy-wavelength conversion

erlab.constants.rel_kconv: float = 0.512316721967493

\(\frac{\sqrt{2 m_e}}{\hbar}\), Used in momentum conversion

erlab.constants.rel_kzconv: float = 3.8099821161548606

\(\frac{\hbar^2}{2 m_e}\), Used in momentum conversion

Parallel processing (erlab.parallel)

Helper functions for parallel processing.

Functions

joblib_progress([file])

Patches joblib to report into a tqdm progress bar.

joblib_progress_qt(signal)

Context manager for interactive windows.

erlab.parallel.joblib_progress(file=None, **kwargs)[source]

Patches joblib to report into a tqdm progress bar.
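A minimal sketch, assuming joblib_progress is used as a context manager (as is typical for this joblib/tqdm patching pattern) and that extra keyword arguments are forwarded to the tqdm bar; both points are assumptions:

import joblib

from erlab.parallel import joblib_progress

def slow_square(x):
    return x**2

# Progress of the joblib workers is reported through a tqdm progress bar
with joblib_progress(desc="Squaring"):
    results = joblib.Parallel(n_jobs=2)(
        joblib.delayed(slow_square)(i) for i in range(100)
    )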

erlab.parallel.joblib_progress_qt(signal)[source]

Context manager for interactive windows.

The number of completed tasks is emitted by the given signal.

Contributing Guide

Note

Parts of this document are based on Contributing to pandas and Contributing to xarray.

We welcome your enthusiasm! All contributions, including bug reports, bug fixes, documentation improvements, enhancement suggestions, and other ideas are welcome.

If you have any questions, feel free to ask us! The recommended place to ask questions is GitHub Discussions.

Bug reports and enhancement requests

If you find a bug in the code or documentation, do not hesitate to submit a ticket to the Issue Tracker. You are also welcome to post feature requests or pull requests.

When reporting a bug, see this stackoverflow article for tips on writing a good bug report, and this article on minimal bug reports.

Creating a development environment

First, you will need to install git and conda (or mamba).

Installing git

Below are some quick instructions for installing git on various operating systems. For more detailed instructions, see the git installation guide.

  • macOS (Intel & ARM): get Xcode Command Line Tools by running in your terminal window:

    xcode-select --install
    
  • Windows 10 1709 (build 16299) or later: run in command prompt or PowerShell:

    winget install --id Git.Git -e --source winget
    

If you are new to contributing to projects through forking on GitHub, take a look at the GitHub documentation for contributing to projects. GitHub provides a quick tutorial using a test repository that may help you become more familiar with forking a repository, cloning a fork, creating a feature branch, pushing changes and making pull requests.

The GitHub documentation on forks and pull requests is a useful resource for learning more about this workflow.

Cloning the repository

  1. Create an account on GitHub if you do not already have one.

  2. You will need your own copy of erlabpy (aka fork) to work on the code. Go to the erlabpy repository and hit the Fork button near the top of the page. This creates a copy of the code under your account on the GitHub server.

  3. Clone your fork to your machine:

    git clone https://github.com/your-user-name/erlabpy.git
    cd erlabpy
    git remote add upstream https://github.com/kmnhan/erlabpy.git
    

    This creates the directory erlabpy and connects your repository to the upstream (main project) erlabpy repository.

Installing conda

Before starting any development, you’ll need to create an isolated environment under a package manager like conda. If you don’t have conda installed, install conda or install mamba.

Hint

  • When using conda, miniconda is recommended to save disk space.

  • Mamba is a faster alternative to conda with additional features.

  • Installing miniforge will install both conda and mamba, and is recommended.

Editable installation from source

An editable installation allows you to make changes to the code and see the changes reflected in the package without having to reinstall it. Before installing:

  1. Create and activate a mamba (or conda) environment.

    Note

    Replace <envname> with the environment name you prefer.

    Hint

    If using conda, replace mamba with conda.

    mamba env create -f environment.yml -n <envname>
    mamba activate <envname>
    
  2. Install the repository.

    Note

    The editable_mode=compat setting enables static analysis tools to work with the package. See this issue for more information.

    pip install -e ".[dev]" --config-settings editable_mode=compat
    

Updating the editable installation

  • For minor updates with editable installs, it is sufficient to just update the main branch.

  • When there are changes to the dependencies, you should also update the environment:

    Hint

    If using conda, replace mamba with conda.

    mamba env update -f environment.yml -n <envname>
    
  • In case of major changes, it is recommended to rebuild the package.

    mamba activate <envname>
    pip install -e . --force-reinstall --no-deps --config-settings editable_mode=compat
    

Development workflow

Before starting any development, make sure you have created a local development environment.

Update the main branch

Before starting a new set of changes, fetch all changes from upstream/main, and start a new feature branch from that. From time to time you should fetch the upstream changes from GitHub:

git fetch upstream
git merge upstream/main

This will combine your commits with the latest erlabpy git main. If this leads to merge conflicts, you must resolve these before submitting your pull request. Remember to follow the commit message guidelines. If you have uncommitted changes, you will need to git stash them prior to updating. This will effectively store your changes, which can be reapplied after updating with git stash apply.

Create a new feature branch

Create a branch to save your changes, even before you start making changes. You want your main branch to contain only production-ready code:

git checkout -b shiny-new-feature

This changes your working directory to the shiny-new-feature branch. Keep any changes in this branch specific to one bug or feature so it is clear what the branch brings to erlabpy. You can have many “shiny-new-features” and switch in between them using the git checkout command.

Generally, you will want to keep your feature branches on your public GitHub fork of erlabpy. To do this, you git push this new branch up to your GitHub repo. Generally (if you followed the instructions in these pages, and by default), git will have a link to your fork of the GitHub repo, called origin. You push up to your own fork with:

git push origin shiny-new-feature

In git >= 1.7 you can ensure that the link is correctly set by using the --set-upstream option:

git push --set-upstream origin shiny-new-feature

From now on git will know that shiny-new-feature is related to the shiny-new-feature branch in the GitHub repo.

The editing workflow

  1. Make some changes. Make sure to follow the code standards and the documentation standards.

  2. See which files have changed with git status. You’ll see a listing like this one:

    # On branch shiny-new-feature
    # Changed but not updated:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #  modified:   README
    
  3. Check what the actual changes are with git diff.

  4. Build the documentation for documentation changes. See the documentation section for more information.

Commit and push your changes

  1. To commit all modified files into the local copy of your repo, do git commit -am 'A commit message'. Note that erlabpy uses python-semantic-release for versioning, so the commit message must follow the Conventional Commits standard; an example is given after this list. This will automatically determine the version number for the next release.

  2. To push the changes up to your forked repo on GitHub, do a git push.
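For example, a commit message in the Conventional Commits format consists of a type, an optional scope, and a short description (the scope and description below are hypothetical):

git commit -m "fix(io): handle scans with missing coordinate attributes"

Under typical semantic-release rules, fix commits trigger a patch release and feat commits trigger a minor release.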

Open a pull request

When you’re ready or need feedback on your code, open a Pull Request (PR) so that we can review it and eventually merge your suggested changes into the main branch. Pull requests on GitHub are the mechanism for contributing to the code and documentation.

Enter a title for the set of changes with some explanation of what you’ve done. Mention anything you’d like particular attention for - such as a complicated change or some code you are not happy with. If you don’t think your request is ready to be merged, just say so in your pull request message and use the “Draft PR” feature of GitHub. This is a good way of getting some preliminary code review.

Code standards

  • Import sorting, formatting, and linting are enforced with Ruff.

  • If you wish to contribute, using pre-commit is recommended. This will ensure that your code is properly formatted before you commit it. A pre-commit configuration file for ruff is included in the repository.

  • When writing code that uses Qt, please adhere to the following rules:

    • Import all Qt bindings from qtpy, and only import the top level modules:

      from qtpy import QtWidgets, QtCore, QtGui
      
    • Use fully qualified enum names from Qt6 instead of the short-form enums from Qt5, i.e., QtCore.Qt.CheckState.Checked instead of QtCore.Qt.Checked.

    • Use the signal and slot syntax from PySide6 (QtCore.Signal and QtCore.Slot instead of QtCore.pyqtSignal and QtCore.pyqtSlot).

    • When using Qt Designer, place .ui files in the same directory as the Python file that uses them. The files must be imported using the loadUiType function from qtpy.uic. For example:

      import os

      from qtpy import uic
      
      class MyWidget(*uic.loadUiType(os.path.join(os.path.dirname(__file__), "mywidget.ui"))):
          def __init__(self):
              super().__init__()
              self.setupUi(self)
      
  • Please try to add type annotations to your code. This will help with code completion and static analysis.

  • We are in the process of adding type annotations to the codebase, and most of it should pass mypy except for the io and interactive modules.

Documentation

The documentation is written in reStructuredText, which is almost like writing in plain English, and built using Sphinx. The Sphinx Documentation has an excellent introduction to reST. Review the Sphinx docs to perform more complex changes to the documentation as well.

Some other important things to know about the docs:

  • The documentation consists of two parts: the docstrings in the code itself and the docs in this folder erlabpy/docs/source/.

    The docstrings are meant to provide a clear explanation of the usage of the individual functions, while the documentation in this folder consists of tutorial-like overviews per topic together with some other information.

  • The docstrings follow the NumPy Docstring Standard, which is used widely in the Scientific Python community. This standard specifies the format of the different sections of the docstring. Refer to the documentation for the Numpy docstring format and the Sphinx examples for detailed explanation and examples, or look at some of the existing functions to extend it in a similar manner.

  • The documentation is automatically updated by Read the Docs when a new commit is pushed to main.

  • Type annotations that follow PEP 484 are recommended in the code, which are automatically included in the documentation. Hence, you may omit the type information for well-annotated functions.

  • We aim to follow the recommendations from the Python documentation and the Sphinx reStructuredText documentation for section markup characters,

    • * with overline, for chapters

    • =, for heading

    • -, for sections

    • ~, for subsections

    • **text**, for bold text

Building the documentation locally

Check whether all documentation dependencies are installed with

pip install -r docs/requirements.txt

or

mamba env update -f docs/environment.yml -n <envname>

then build the documentation by running:

cd docs/
make clean
make html

Then you can find the HTML output files in the folder erlabpy/docs/build/html/.

To see what the documentation now looks like with your changes, you can view the HTML build locally by opening the files in your local browser. For example, if you normally use Google Chrome as your browser, you could enter:

google-chrome build/html/index.html

in the terminal, running from within the docs/ folder. You should now see a new tab pop open in your local browser showing the documentation. The different pages of this local build of the documentation are linked together, so you can browse the whole documentation by following links the same way you would on the hosted website.

References

[1]

S. Hoyer and J. Hamman, Xarray: N-D labeled arrays and datasets in Python, J. Open Res. Softw. (2017).

[2]

C. Tusche, A. Krasyuk, and J. Kirschner, Spin resolved bandstructure imaging with a high resolution momentum microscope, Ultramicroscopy 159, 520–529 (2015).

[3]

Y. Ishida and S. Shin, Functions to map photoelectron distributions in a variety of setups in angle-resolved photoemission spectroscopy, Rev. Sci. Instrum. 89, 043903 (2018).

[4]

R. C. Dynes, V. Narayanamurti, and J. P. Garno, Direct measurement of quasiparticle-lifetime broadening in a strong-coupled superconductor, Phys. Rev. Lett. 41, 1509–1512 (1978).

[5]

S. Schirra, How reliable are practical point-in-polygon strategies?, in Algorithms - ESA 2008, edited by D. Halperin and K. Mehlhorn (Springer Berlin Heidelberg, Berlin, Heidelberg, 2008), pp. 744–755, doi:10.1007/978-3-540-87744-8_62.

[6]

P. Zhang, P. Richard, T. Qian, Y.-M. Xu, X. Dai, and H. Ding, A precise method for visualizing dispersive features in image plots, Rev. Sci. Instrum. 82, 043712 (2011).

[7]

Y. He, Y. Wang, and Z.-X. Shen, Visualizing dispersive features in 2d image via minimum gradient method, Rev. Sci. Instrum. 88, 073903 (2017).

[8]

M. P. Seah and W. A. Dench, Quantitative electron spectroscopy of surfaces: a standard data base for electron inelastic mean free paths in solids, Surf. Interface Anal. 1, 2–11 (1979).