ERLabPy¶
Date: May 16, 2024
Download documentation: Zipped HTML, PDF
A library that provides a set of tools and utilities to handle, manipulate, and visualize data from condensed matter physics experiments, with a focus on angle-resolved photoemission spectroscopy (ARPES).
ERLabPy is built on top of the popular scientific Python libraries numpy, scipy, and xarray, and is designed to be easy to use and integrate with existing scientific Python workflows so that you can quickly get started with your data analysis. The package is still under development, so if you have any questions or suggestions, please feel free to contact us. We hope you find ERLabPy useful for your research!


Getting Started¶
Installing¶
The recommended way to install ERLabPy is via conda. If you do not have conda installed, follow the installation instructions. Once you have a working conda environment, you can install ERLabPy with the conda command line tool:
conda install -c conda-forge erlab
Add any optional dependencies you want to install to the command above.
Hint
If you are using macOS, you might experience degraded performance with the default BLAS and LAPACK libraries. For Apple Silicon Macs, use Accelerate:
conda install "libblas=*=*accelerate"
For Intel Macs, use MKL:
conda install "libblas=*=*mkl"
To prevent conda from switching back to the default libraries, see the conda-forge documentation.
If you don’t use conda, you can install ERLabPy with pip:
python -m pip install erlab
Optional dependency groups can be installed with the following commands:
python -m pip install erlab[viz] # Install optional dependencies for visualization
python -m pip install erlab[perf] # Install optional dependencies for performance
python -m pip install erlab[misc] # Install miscellaneous optional dependencies
python -m pip install erlab[complete] # Install all optional dependencies
If you wish to install ERLabPy from source, see the Contributing Guide.
Dependencies¶
ERLabPy is installed alongside many other Python libraries. Some key packages are listed below for reference. In particular, this documentation assumes basic familiarity with the first four packages, which will be sufficient for most use cases.
Package | Used in |
---|---|
numpy | Computation and array manipulation, linear algebra |
xarray | Data storage and manipulation |
matplotlib | Plotting |
scipy | Linear algebra, signal processing, and image processing |
lmfit | Optimization problems including curve fitting |
pyqtgraph | Interactive plotting (i.e., imagetool) |
ERLabPy also requires a Qt library such as PyQt5, PyQt6, PySide2, or PySide6. To ensure compatibility and keep the namespace clean, ERLabPy imports Qt bindings from qtpy, which will automatically select the appropriate library based on what is installed.
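For instance, you can check which binding qtpy has picked up in your environment; this is a quick sanity check and not specific to ERLabPy:
import qtpy

# qtpy selects the first available binding, honoring the QT_API environment variable
print(qtpy.API_NAME)  # e.g. "PyQt6" or "PySide6"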
See the User Guide to start using ERLabPy!
Optional dependencies¶
The following packages are optional dependencies that are not installed by default. They are only used by specific functions, or are not used at all and are listed just for convenience.
Package | Description |
---|---|
csaps | Multidimensional smoothing splines |
ipywidgets | Interactive widgets |
hvplot | Interactive plotting |
cmasher, cmocean, colorcet | More colormaps! |
numbagg and bottleneck | Fast multidimensional aggregation, accelerates xarray |
For a full list of dependencies and optional dependencies, take a look at the [project] and [project.optional-dependencies] sections in pyproject.toml:
dependencies = [
"h5netcdf>=1.2.0",
"igor2>=0.5.6",
"joblib>=1.3.2",
"lmfit>=1.2.0,!=1.3.0",
"matplotlib>=3.8.0",
"numba>=0.59.0",
"numpy>=1.26.0",
"pyperclip>=1.8.2",
"pyqtgraph>=0.13.1",
"qtawesome>=1.3.1",
"qtpy>=2.4.1",
"scipy>=1.12.0",
"tqdm>=4.66.2",
"uncertainties>=3.1.4",
"varname>=0.13.0",
"xarray>=2024.02.0",
]
[project.optional-dependencies]
complete = ["erlab[viz,perf,misc,dev]"]
viz = ["cmasher", "cmocean", "colorcet", "hvplot", "ipywidgets"]
perf = ["numbagg>=0.8.1", "bottleneck>=1.3.8"]
misc = ["iminuit>=2.25.2", "csaps>=1.1.0", "dask>=2024.4.1"]
dev = [
"mypy",
"pre-commit",
"pytest-cov",
"pytest-qt",
"pytest-xdist",
"pytest",
"python-semantic-release",
"ruff",
]
docs = [
"sphinx",
"sphinx-autodoc-typehints",
"sphinx-copybutton",
"sphinxcontrib-bibtex",
"sphinx-qt-documentation",
"pybtex",
"nbsphinx",
"furo",
"sphinx-design",
]
Notes on compatibility¶
ERLabPy is tested on Python 3.11 and 3.12. It is not guaranteed to work on older versions of Python.
There are some known compatibility issues with PyQt5 and PySide2, so it is recommended to use the newer PyQt6 or PySide6 if possible.
If you encounter any unexpected behavior while using IPython’s autoreload extension, try excluding the following modules:
%aimport -erlab.io.dataloader -erlab.accessors
User Guide¶
Work in Progress
The user guide is incomplete. For the full list of packages and modules provided by ERLabPy, see API Reference.
Introduction¶
Welcome to the ERLabPy user guide! This guide provides an overview of ERLabPy and its core features.
If you are new to programming with python, Scientific Python Lectures is a great place to start.
Data structures in ERLabPy are represented using xarray[1], which provides a powerful data structure for working with multi-dimensional arrays. Visit the xarray tutorial and the xarray user guide to get familiar with xarray.
Reading and writing data¶
In ERLabPy, most data are represented as xarray.Dataset or xarray.DataArray objects. An xarray.DataArray is similar to a wave in Igor Pro, but much more flexible. As opposed to the maximum of 4 dimensions in Igor, an xarray.DataArray can have as many dimensions as you want (up to 64). Another advantage is that the coordinates of the dimensions do not have to be evenly spaced. In fact, they are not limited to numbers but can be any type of data, such as date and time representations.
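For instance, the following minimal snippet builds an xarray.DataArray indexed by unevenly spaced timestamps, something that has no direct counterpart in Igor (the values here are made up):
import numpy as np
import pandas as pd
import xarray as xr

# A one-dimensional DataArray with an unevenly spaced datetime coordinate
times = pd.to_datetime(["2024-05-01", "2024-05-03", "2024-05-10"])
da = xr.DataArray(np.array([1.0, 2.5, 0.7]), coords={"time": times}, dims="time")
print(da)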
This guide will introduce you to reading and writing data from and to various file formats, and how to implement a custom reader for an experimental setup.
Skip to the corresponding section for guides on loading ARPES data.
Reading data¶
Python has a wide range of libraries to read and write data. Here, we will focus on working with Igor Pro and xarray objects.
From Igor Pro¶
Warning
Loading waves from complex .pxp
files may fail or produce unexpected results. It
is recommended to export the waves to a .ibw
file to load them in ERLabPy. If you
encounter any problems, please let us know by opening an issue.
ERLabPy can read .ibw, .pxt, .pxp, and HDF5 files exported from Igor Pro using the functions in erlab.io.igor; see the API reference for the function that handles each file extension. For easy access, the most commonly used of these functions are also available in the erlab.io namespace.
Note
Internally, the igor2 package is used to read the data.
From arbitrary formats¶
Spreadsheet data can be read using pandas.read_csv()
or pandas.read_excel()
.
The resulting DataFrame can be converted to an xarray object using
pandas.DataFrame.to_xarray()
or xarray.Dataset.from_dataframe()
.
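A minimal sketch of this workflow, assuming a hypothetical spreadsheet file:
import pandas as pd

# Read the spreadsheet into a DataFrame, then convert it to an xarray Dataset
df = pd.read_csv("/path/to/spectrum.csv", index_col=0)
ds = df.to_xarray()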
When reading HDF5 files with arbitrary groups and metadata, you must first explore the
group structure using h5netcdf or h5py. Loading a specific HDF5 group into an xarray object can be
done using xarray.open_dataset()
or xarray.open_mfdataset()
by supplying the
group
argument. For an example that handles complex HDF5 groups, see the
implementation of erlab.io.plugins.ssrl52.SSRL52Loader.load_single()
.
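A rough sketch of this approach, with a hypothetical file path and group name:
import h5py
import xarray as xr

# Explore the group structure first
with h5py.File("/path/to/data.h5", "r") as f:
    f.visit(print)  # prints the name of every group and dataset

# Then load the group of interest into an xarray Dataset
ds = xr.open_dataset("/path/to/data.h5", group="scan_001", engine="h5netcdf")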
FITS files can be read with astropy.
Writing data¶
Since the state and variables of a Python interpreter are not saved, it is important to save your data in a format that can be easily read and written.
While it is possible to save and load entire Python interpreter sessions using pickle or the more versatile dill, it is out of the scope of this guide. Instead, we recommend saving your data in a format that is easy to read and write, such as HDF5 or NetCDF. These formats are supported by many programming languages and are optimized for fast read and write operations.
To save and load xarray
objects, see the xarray
documentation on I/O
operations. ERLabPy offers
convenience functions load_hdf5
and save_as_hdf5
for loading and saving xarray objects from and to HDF5 files, and save_as_netcdf
for saving to NetCDF files.
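As a minimal sketch, assuming data is an xarray.DataArray already in memory, the file paths are hypothetical, and the keyword names follow the save_as_hdf5 call shown later in this guide:
import erlab.io

# Save to HDF5 and read it back
erlab.io.save_as_hdf5(data, filename="/path/to/processed.h5")
reloaded = erlab.io.load_hdf5("/path/to/processed.h5")

# Save to NetCDF instead
erlab.io.save_as_netcdf(data, filename="/path/to/processed.nc")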
To Igor Pro¶
As an experimental feature, save_as_hdf5 can save certain xarray.DataArray objects in a format that is compatible with the Igor Pro HDF5 loader. An accompanying Igor procedure is available in the repository. If loading in Igor Pro fails, try saving again with all attributes removed.
Alternatively, igorwriter can be used to write
numpy arrays to .ibw
and .itx
files directly.
Loading ARPES data¶
Warning
ERLabPy is still in development and the API may change. Some major changes regarding data loading and handling are planned:
The xarray datatree structure will enable much more intuitive and powerful data handling. Once the feature gets incorporated into xarray, ERLabPy will be updated to use it.
A universal translation layer between true data header attributes and human-readable representations will be implemented. This will allow for more consistent and user-friendly data handling.
ERLabPy’s data loading framework consists of various plugins, or loaders, each
designed to load data from a different beamline or laboratory. Each loader is a class
that has a load
method which takes a file path or sequence number and returns data.
Let’s see the list of loaders available by default:
[1]:
import erlab.io
erlab.io.loaders
[1]:
Name | Aliases | Loader class |
---|---|---|
ssrl | ssrl52, bl5-2 | erlab.io.plugins.ssrl52.SSRL52Loader |
merlin | ALS_BL4, als_bl4, BL403, bl403 | erlab.io.plugins.merlin.BL403Loader |
da30 | DA30 | erlab.io.plugins.da30.DA30Loader |
kriss | KRISS | erlab.io.plugins.kriss.KRISSLoader |
You can access each loader by its name or alias, either as an attribute or as an item. For example, to access the loader for the ALS beamline 4.0.3, you can use any of the following:
[3]:
erlab.io.loaders["merlin"]
erlab.io.loaders["bl403"]
erlab.io.loaders.merlin
erlab.io.loaders.bl403
[3]:
<erlab.io.plugins.merlin.BL403Loader at 0x7f463ccea390>
Data loading is done by calling the load method of the loader. It requires an identifier parameter, which can be a path to a file or a sequence number. It also accepts a data_dir parameter, which specifies the directory where the data is stored.
If identifier is a sequence number, data_dir must be provided.
If identifier is a string and data_dir is provided, the path is constructed by joining data_dir and identifier.
If identifier is a string and data_dir is not provided, identifier should be a valid path to a file.
Suppose we have data from the ALS beamline 4.0.3 stored as /path/to/data/f_001.pxt, /path/to/data/f_002.pxt, etc. To load f_001.pxt, all three of the following are valid:
loader = erlab.io.loaders["merlin"]
loader.load("/path/to/data/f_001.pxt")
loader.load("f_001.pxt", data_dir="/path/to/data")
loader.load(1, data_dir="/path/to/data")
Setting the default loader and data directory¶
In practice, a loader and a single directory will be used repeatedly in a session to load different data from the same experiment.
Instead of explicitly specifying the loader and directory each time, a default loader
and data directory can be set with erlab.io.set_loader()
and
erlab.io.set_data_dir()
. All subsequent calls to the shortcut function
erlab.io.load()
will use the specified loader and data directory.
erlab.io.set_loader("merlin")
erlab.io.set_data_dir("/path/to/data")
data_1 = erlab.io.load(1)
data_2 = erlab.io.load(2)
The loader and data directory can also be controlled with a context manager:
with erlab.io.loader_context("merlin", data_dir="/path/to/data"):
    data_1 = erlab.io.load(1)
Data across multiple files¶
For setups like the ALS beamline 4.0.3, some scans are stored over multiple files like
f_003_S001.pxt
, f_003_S002.pxt
, and so on. In this case, the loader will
automatically concatenate all files in the same scan. For example, all of the
following will return the same concatenated data:
erlab.io.load(3)
erlab.io.load("f_003_S001.pxt")
erlab.io.load("f_003_S002.pxt")
...
If you want to cherry-pick a single file, you can pass single=True to load:
erlab.io.load("f_003_S001.pxt", single=True)
Handling multiple data directories¶
If you call erlab.io.set_loader()
or erlab.io.set_data_dir()
multiple times,
the last call will override the previous ones. While this is useful for changing the
loader or data directory, it makes data loading dependent on execution order. This may
lead to unexpected behavior.
If you plan to use multiple loaders or data directories in the same session, it is
recommended to use the context manager. If you have to load data from multiple
directories multiple times, it may be convenient to define functions that set the loader
and data directory and call erlab.io.load()
with the appropriate arguments. For example:
def load1(identifier):
    with erlab.io.loader_context("merlin", data_dir="/path/to/data1"):
        return erlab.io.load(identifier)
Summarizing data¶
Some loaders have generate_summary
implemented, which generates a
pandas.DataFrame
containing an overview of the data in a given directory. The
generated summary can be viewed as a table with the summarize
method. If ipywidgets
is installed, an
interactive widget is also displayed. This is useful for quickly skimming through the
data.
Just like load
, summarize
can also be accessed with the shortcut
function erlab.io.summarize()
. For example, to display a summary of the data
available in the directory /path/to/data
using the 'merlin'
loader:
erlab.io.set_loader("merlin")
erlab.io.set_data_dir("/path/to/data")
erlab.io.summarize()
To see what the generated summary looks like, see the example below.
Implementing a data loader plugin¶
It is easy to add new loaders to the framework. Any subclass of LoaderBase
is automatically registered as a loader! The class
must have a valid name
attribute which is
used to access the loader.
If the name
attribute is prefixed with an
underscore, the registration is skipped. This is useful for base classes that are not
meant to be used directly.
Data loading flow¶
ARPES data from a single experiment are usually stored in one folder, with files that
look like file_0001.h5
, file_0002.h5
, etc. If the naming scheme does not deviate
from this pattern, only two methods need to be implemented: identify
and load_single
. The following flowchart shows the process
of loading data from a single scan, given either a file path or a sequence number:
Here, identify is given an integer sequence number (identifier) and the path to the data folder (data_dir), and returns the full path to the corresponding data file.
The method load_single is given a full path to a single data file and must return the data as an xarray.DataArray or an xarray.Dataset. If the data cannot be combined into a single object, the method can also return a list of xarray.DataArray objects.
If only all data formats were as simple as this! Unfortunately, there are some setups
where data for a single scan is saved over multiple files. In this case, the files will
look like file_0001_0001.h5
, file_0001_0002.h5
, etc. For these kinds of setups,
an additional method infer_index
must be implemented. The following flowchart shows the process of loading data from
multiple files:
In this case, the method identify should resolve all files that belong to the given sequence number, and return a list of file paths along with a dictionary of corresponding coordinates.
The method infer_index is given a bare file name (without the extension and path) and must return the sequence number of the scan. For example, given the file name file_0003_0123, the method should return 3.
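A quick standalone sketch of this kind of extraction (the pattern below is only an illustration, not the one used by any built-in loader):
import re

name = "file_0003_0123"  # bare file name without path or extension
match = re.match(r"file_(\d{4})", name)
if match is not None:
    print(int(match.group(1)))  # prints 3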
Conventions¶
There are some rules that loaded ARPES data must follow to ensure that analysis procedures such as momentum conversion and fitting work seamlessly (a short example follows this list):
The experimental geometry should be stored in the 'configuration' attribute as an integer. See Nomenclature and AxesConfiguration for more information.
All standard angle coordinates must follow the naming conventions in Nomenclature.
The sample temperature, if available, should be stored in the 'temp_sample' attribute.
The sample work function, if available, should be stored in the 'sample_workfunction' attribute.
Energies should be given in electronvolts.
Angles should be given in degrees.
Temperatures should be given in Kelvins.
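As a small illustration with made-up values, a cut following these conventions might carry coordinates and attributes like this:
import numpy as np
import xarray as xr

cut = xr.DataArray(
    np.zeros((10, 20)),
    coords={"alpha": np.linspace(-15, 15, 10), "eV": np.linspace(-0.5, 0.1, 20)},
    dims=("alpha", "eV"),  # angles in degrees, energies in eV
).assign_attrs(
    {
        "configuration": 1,  # experimental geometry (see Nomenclature)
        "temp_sample": 20.0,  # sample temperature in K
        "sample_workfunction": 4.3,  # work function in eV
    }
)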
By default, all loaders perform a basic check for a subset of these rules using validate and will raise a warning if some are missing. This behavior can be controlled with the loader class attributes skip_validate and strict_validation.
A minimal example¶
Consider a setup that saves data into a .csv
file named data_0001.csv
and so on.
A bare minimum implementation of a loader for the setup will look something like this:
[4]:
import os
import pandas as pd
from erlab.io.dataloader import LoaderBase
class MyLoader(LoaderBase):
    name = "my_loader"
    aliases = None
    name_map = {}
    coordinate_attrs = {}
    additional_attrs = {"information": "any metadata you want to load with the data"}
    skip_validate = False
    always_single = True

    def identify(self, num, data_dir):
        file = os.path.join(data_dir, f"data_{str(num).zfill(4)}.csv")
        return [file], {}

    def load_single(self, file_path):
        return pd.read_csv(file_path).to_xarray()
[5]:
erlab.io.loaders
[5]:
Name | Aliases | Loader class |
---|---|---|
ssrl | ssrl52, bl5-2 | erlab.io.plugins.ssrl52.SSRL52Loader |
merlin | ALS_BL4, als_bl4, BL403, bl403 | erlab.io.plugins.merlin.BL403Loader |
da30 | DA30 | erlab.io.plugins.da30.DA30Loader |
kriss | KRISS | erlab.io.plugins.kriss.KRISSLoader |
my_loader | | __main__.MyLoader |
[6]:
erlab.io.loaders["my_loader"]
[6]:
<__main__.MyLoader at 0x7f462c5399d0>
We can see that the loader has been registered.
A complex example¶
Next, let’s try to write a more realistic loader for a hypothetical setup that saves
data as HDF5 files with the following naming scheme: data_001.h5
, data_002.h5
,
and so on, with multiple scans named like data_001_S001.h5
, data_001_S002.h5
,
etc. with the scan axis information stored in a separate file named
data_001_axis.csv
.
Let us first generate a data directory and place some synthetic data in it. Before saving, we rename and set some attributes that resemble real ARPES data.
[7]:
import csv
import datetime
import os
import tempfile
import erlab.io
import numpy as np
from erlab.io.exampledata import generate_data_angles
def make_data(beta=5.0, temp=20.0, hv=50.0, bandshift=0.0):
    data = generate_data_angles(
        shape=(250, 1, 300),
        angrange={"alpha": (-15, 15), "beta": (beta, beta)},
        hv=hv,
        configuration=1,
        temp=temp,
        bandshift=bandshift,
        assign_attributes=False,
        seed=1,
    ).T

    # Rename coordinates. The loader must rename them back to the original names.
    data = data.rename(
        {
            "alpha": "ThetaX",
            "beta": "Polar",
            "eV": "BindingEnergy",
            "hv": "PhotonEnergy",
            "xi": "Tilt",
            "delta": "Azimuth",
        }
    )

    # Assign some attributes that real data would have
    data = data.assign_attrs(
        {
            "LensMode": "Angular30",  # Lens mode of the analyzer
            "SpectrumType": "Fixed",  # Acquisition mode of the analyzer
            "PassEnergy": 10,  # Pass energy of the analyzer
            "UndPol": 0,  # Undulator polarization
            "DateTime": datetime.datetime.now().isoformat(),  # Acquisition time
            "TB": temp,
            "X": 0.0,
            "Y": 0.0,
            "Z": 0.0,
        }
    )
    return data


# Create a temporary directory
tmp_dir = tempfile.TemporaryDirectory()

# Define coordinates for the scan
beta_coords = np.linspace(2, 7, 10)

# Generate and save cuts with different beta values
for i, beta in enumerate(beta_coords):
    erlab.io.save_as_hdf5(
        make_data(beta=beta, temp=20.0, hv=50.0),
        filename=f"{tmp_dir.name}/data_001_S{str(i+1).zfill(3)}.h5",
        igor_compat=False,
    )

# Write scan coordinates to a csv file
with open(f"{tmp_dir.name}/data_001_axis.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Index", "Polar"])
    for i, beta in enumerate(beta_coords):
        writer.writerow([i + 1, beta])

# Generate some cuts with different band shifts
for i in range(4):
    erlab.io.save_as_hdf5(
        make_data(beta=5.0, temp=20.0, hv=50.0, bandshift=-i * 0.05),
        filename=f"{tmp_dir.name}/data_{str(i+2).zfill(3)}.h5",
        igor_compat=False,
    )

# List the generated files
sorted(os.listdir(tmp_dir.name))
[7]:
['data_001_S001.h5',
'data_001_S002.h5',
'data_001_S003.h5',
'data_001_S004.h5',
'data_001_S005.h5',
'data_001_S006.h5',
'data_001_S007.h5',
'data_001_S008.h5',
'data_001_S009.h5',
'data_001_S010.h5',
'data_001_axis.csv',
'data_002.h5',
'data_003.h5',
'data_004.h5',
'data_005.h5']
Now that we have the data, let’s implement the loader. The biggest difference from the
previous example is that we need to handle multiple files for a single scan in
identify
. Also, we have to implement
infer_index
to extract the scan
number from the file name.
[8]:
import glob
import re
import pandas as pd
from erlab.io.dataloader import LoaderBase
class ExampleLoader(LoaderBase):
    name = "example"
    aliases = ["Ex"]

    name_map = {
        "eV": "BindingEnergy",
        "alpha": "ThetaX",
        "beta": [
            "Polar",
            "Polar Compens",
        ],  # Can have multiple names assigned to the same name
        "delta": "Azimuth",
        "xi": "Tilt",
        "x": "X",
        "y": "Y",
        "z": "Z",
        "hv": "PhotonEnergy",
        "polarization": "UndPol",
        "temp_sample": "TB",
    }

    coordinate_attrs: tuple[str, ...] = (
        "beta",
        "delta",
        "xi",
        "hv",
        "x",
        "y",
        "z",
        "polarization",
        "photon_flux",
    )
    # Attributes to be used as coordinates. Place all attributes that we don't want to
    # lose when merging multiple file scans here.

    additional_attrs = {
        "configuration": 1,  # Experimental geometry. Required for momentum conversion
        "sample_workfunction": 4.3,
    }
    # Any additional metadata you want to add to the data. Note that attributes defined
    # here will not be transformed into coordinates. If you wish to promote some fixed
    # attributes to coordinates, add them to additional_coords.

    additional_coords = {}
    # Additional non-dimension coordinates to be added to the data, for instance the
    # photon energy for lab-based ARPES.

    skip_validate = False
    always_single = False

    def identify(self, num, data_dir):
        coord_dict = {}

        # Look for scans with data_###_S###.h5, and sort them
        files = glob.glob(f"data_{str(num).zfill(3)}_S*.h5", root_dir=data_dir)
        files.sort()

        if len(files) == 0:
            # If no files found, look for data_###.h5
            files = glob.glob(f"data_{str(num).zfill(3)}.h5", root_dir=data_dir)
        else:
            # If files found, extract coordinate values from the filenames
            axis_file = f"{data_dir}/data_{str(num).zfill(3)}_axis.csv"
            with open(axis_file) as f:
                header = f.readline().strip().split(",")

            # Load the coordinates from the csv file
            coord_arr = np.loadtxt(axis_file, delimiter=",", skiprows=1)

            # Each header entry will contain a dimension name
            for i, hdr in enumerate(header[1:]):
                key = self.name_map_reversed.get(hdr, hdr)
                coord_dict[key] = coord_arr[: len(files), i + 1].astype(np.float64)

        if len(files) == 0:
            # If no files found up to this point, raise an error
            raise FileNotFoundError(f"No files found for scan {num} in {data_dir}")

        # Files must be full paths
        files = [os.path.join(data_dir, f) for f in files]

        return files, coord_dict

    def load_single(self, file_path):
        data = erlab.io.load_hdf5(file_path)

        # To prevent conflicts when merging multiple scans, we rename the coordinates
        # prior to concatenation
        return self.process_keys(data)

    def infer_index(self, name):
        # Get the scan number from file name
        try:
            scan_num: str = re.match(r".*?(\d{3})(?:_S\d{3})?", name).group(1)
        except (AttributeError, IndexError):
            return None, None

        if scan_num.isdigit():
            # The second return value, a dictionary, is reserved for more complex
            # setups. See tips below for a brief explanation.
            return int(scan_num), {}
        else:
            return None, None
[9]:
erlab.io.loaders
[9]:
Name | Aliases | Loader class |
---|---|---|
ssrl | ssrl52, bl5-2 | erlab.io.plugins.ssrl52.SSRL52Loader |
merlin | ALS_BL4, als_bl4, BL403, bl403 | erlab.io.plugins.merlin.BL403Loader |
da30 | DA30 | erlab.io.plugins.da30.DA30Loader |
kriss | KRISS | erlab.io.plugins.kriss.KRISSLoader |
my_loader | | __main__.MyLoader |
example | Ex | __main__.ExampleLoader |
We can see that the example
loader has been registered. Let’s test the loader by
loading and plotting some data.
[10]:
erlab.io.set_loader("example")
erlab.io.set_data_dir(tmp_dir.name)
erlab.io.load(1)
[10]:
<xarray.DataArray (beta: 10, eV: 300, alpha: 250)> Size: 6MB 54.6 50.96 51.68 60.83 63.45 55.18 ... 3.076 0.9879 0.8741 2.515 1.359 1.262 Coordinates: * alpha (alpha) float64 2kB -15.0 -14.88 -14.76 ... 14.76 14.88 15.0 * beta (beta) float64 80B 2.0 2.556 3.111 3.667 ... 5.889 6.444 7.0 * eV (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12 xi float64 8B 0.0 delta float64 8B 0.0 hv float64 8B 50.0 x float64 8B 0.0 y float64 8B 0.0 z float64 8B 0.0 polarization int64 8B 0 Attributes: LensMode: Angular30 SpectrumType: Fixed PassEnergy: 10 DateTime: 2024-05-16T02:22:00.471983 TB: 20.0 temp_sample: 20.0 configuration: 1 sample_workfunction: 4.3 data_loader_name: example
[11]:
erlab.io.load(5).qplot()
[11]:
<matplotlib.image.AxesImage at 0x7f462c26a9d0>
Brilliant! We now have a working loader for our hypothetical setup. However, we can’t use erlab.io.summarize()
with our loader since we haven’t implemented generate_summary
.
This method should return a pandas.DataFrame
with the index containing file
names. The only requirement for the DataFrame is that it should include a column named
'Path'
that contains the paths to the data files. Other than that, the DataFrame can
contain any metadata you wish to display in the summary. Let’s implement it in a
subclass of the example
loader:
[12]:
class ExampleLoaderComplete(ExampleLoader):
    name = "example_complete"
    aliases = ["ExC"]

    def generate_summary(self, data_dir):
        # Get all valid data files in directory
        files = {}
        for path in erlab.io.utilities.get_files(data_dir, extensions=[".h5"]):
            # Base name
            data_name = os.path.splitext(os.path.basename(path))[0]

            # If multiple scans, strip the _S### part
            name_match = re.match(r"(.*?_\d{3})_(?:_S\d{3})?", data_name)
            if name_match is not None:
                data_name = name_match.group(1)

            files[data_name] = path

        # Map dataframe column names to data attributes
        attrs_mapping = {
            "Lens Mode": "LensMode",
            "Scan Type": "SpectrumType",
            "T(K)": "temp_sample",
            "Pass E": "PassEnergy",
            "Polarization": "polarization",
            "hv": "hv",
            "x": "x",
            "y": "y",
            "z": "z",
            "polar": "beta",
            "tilt": "xi",
            "azi": "delta",
        }
        column_names = ["File Name", "Path", "Time", "Type", *attrs_mapping.keys()]

        data_info = []
        processed_indices = set()

        for name, path in files.items():
            # Skip already processed multi-file scans
            index, _ = self.infer_index(name)
            if index in processed_indices:
                continue
            elif index is not None:
                processed_indices.add(index)

            # Load data
            data = self.load(path)

            # Determine type of scan
            data_type = "core"
            if "alpha" in data.dims:
                data_type = "cut"
            if "beta" in data.dims:
                data_type = "map"
            if "hv" in data.dims:
                data_type = "hvdep"

            data_info.append(
                [
                    name,
                    path,
                    datetime.datetime.fromisoformat(data.attrs["DateTime"]),
                    data_type,
                ]
            )

            for k, v in attrs_mapping.items():
                # Try to get the attribute from the data, then from the coordinates
                try:
                    val = data.attrs[v]
                except KeyError:
                    try:
                        val = data.coords[v].values
                        if val.size == 1:
                            val = val.item()
                    except KeyError:
                        val = ""

                # Convert polarization values to human readable form
                if k == "Polarization":
                    if np.iterable(val):
                        val = np.asarray(val).astype(int)
                    else:
                        val = [round(val)]
                    val = [{0: "LH", 2: "LV", -1: "RC", 1: "LC"}.get(v, v) for v in val]
                    if len(val) == 1:
                        val = val[0]

                data_info[-1].append(val)

            del data

        # Sort by time and set index
        return (
            pd.DataFrame(data_info, columns=column_names)
            .sort_values("Time")
            .set_index("File Name")
        )

erlab.io.loaders
[12]:
Name | Aliases | Loader class |
---|---|---|
ssrl | ssrl52, bl5-2 | erlab.io.plugins.ssrl52.SSRL52Loader |
merlin | ALS_BL4, als_bl4, BL403, bl403 | erlab.io.plugins.merlin.BL403Loader |
da30 | DA30 | erlab.io.plugins.da30.DA30Loader |
kriss | KRISS | erlab.io.plugins.kriss.KRISSLoader |
my_loader | | __main__.MyLoader |
example | Ex | __main__.ExampleLoader |
example_complete | ExC | __main__.ExampleLoaderComplete |
The implementation looks complicated, but most of the code is boilerplate, and the actual logic is quite simple. You get a list of file names and paths to generate a summary for, define the DataFrame columns and corresponding attributes, and then load the data one by one and extract the metadata. Let's see what the resulting summary looks like.
Note
If ipywidgets is not installed, only the DataFrame will be displayed.
If you are viewing this documentation online, the summary will not be interactive. Run the code locally to try it out.
[13]:
erlab.io.set_loader("example_complete")
erlab.io.summarize()
Type | Lens Mode | Scan Type | T(K) | Pass E | Polarization | hv | x | y | z | polar | tilt | azi | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
File Name | |||||||||||||
data_001 | map | Angular30 | Fixed | 20 | 10 | LH | 50 | 0 | 0 | 0 | 2→7 (0.5556, 10) | 0 | 0 |
data_002 | cut | Angular30 | Fixed | 20 | 10 | LH | 50 | 0 | 0 | 0 | 5 | 0 | 0 |
data_003 | cut | Angular30 | Fixed | 20 | 10 | LH | 50 | 0 | 0 | 0 | 5 | 0 | 0 |
data_004 | cut | Angular30 | Fixed | 20 | 10 | LH | 50 | 0 | 0 | 0 | 5 | 0 | 0 |
data_005 | cut | Angular30 | Fixed | 20 | 10 | LH | 50 | 0 | 0 | 0 | 5 | 0 | 0 |
[13]:
Each cell in the summary table is formatted with formatter. If additional formatting that cannot be achieved within generate_summary is needed, formatter can be overridden in the subclass.
Tips¶
The data loading framework is designed to be simple and flexible, but it may not cover all possible setups. If you encounter a setup that cannot be loaded with the existing loaders, please let us know by opening an issue!
Before implementing a loader, see erlab.io.dataloader for descriptions of each attribute, and the values and types of the expected outputs. The implementation of the existing loaders in the erlab.io.plugins module is a good starting point; see the source code on GitHub.

If you have implemented a new loader or have improved an existing one, consider contributing it to the ERLabPy project by opening a pull request. We are always looking for new loaders to support more experimental setups! See more about contributing in the Contributing Guide.

If you wish to add post-processing steps that are applicable to all data loaded by a given loader, such as fixing the sign of the binding energy coordinates, you can override post_process, which by default handles coordinate and attribute renaming. This method is called after the data is loaded and can be used to modify the data before it is returned.

For complex data structures, constructing a full path from just the sequence number and the data directory can be difficult. In this case, identify can be implemented to take additional keyword arguments. All keyword arguments passed to load are passed on to identify!

For instance, consider data with different prefixes like A_001.h5, A_002.h5, B_001.h5, etc. stored in the same directory. The sequence number alone is not enough to construct the full path. In this case, identify can be implemented to take an additional prefix argument which eliminates the ambiguity. Then, A_001.h5 can be loaded with erlab.io.load(1, prefix="A"); a sketch of such a loader follows this list.

If there are multiple file scans in this setup, like A_001_S001.h5, A_001_S002.h5, etc., we would want to pass the prefix parameter to load from an identifier given as a file name. This is where the second return value of infer_index comes in handy: you can return a dictionary which is passed on to load.
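For example, a rough sketch of the prefix idea above might look like the following (the class name, file pattern, and defaults are made up for illustration):
import glob
import os

import erlab.io
from erlab.io.dataloader import LoaderBase


class PrefixedLoader(LoaderBase):
    name = "prefixed_example"
    aliases = None
    always_single = True
    skip_validate = True

    def identify(self, num, data_dir, prefix="A"):
        # Keyword arguments passed to load are forwarded here, so
        # erlab.io.load(1, prefix="B") would resolve B_001.h5
        files = glob.glob(f"{prefix}_{str(num).zfill(3)}.h5", root_dir=data_dir)
        return [os.path.join(data_dir, f) for f in sorted(files)], {}

    def load_single(self, file_path):
        return erlab.io.load_hdf5(file_path)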
Selecting and indexing data¶
In most cases, the powerful data manipulation and indexing methods provided by xarray
are sufficient. In this page, some frequently used xarray
features are summarized in addition to some utilities provided by this package. Refer to the xarray user guide for more information.
First, let us import some example data: a simple tight binding simulation of graphene.
[1]:
import xarray as xr
xr.set_options(display_expand_data=False)
[1]:
<xarray.core.options.set_options at 0x7fad49282e10>
[2]:
from erlab.io.exampledata import generate_data
dat = generate_data(seed=1).T
[3]:
dat
[3]:
<xarray.DataArray (eV: 300, ky: 250, kx: 250)> Size: 150MB 0.5243 1.033 0.6037 1.048 0.4388 ... 0.0003526 5.536e-06 2.813e-07 6.99e-08 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * ky (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * eV (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
We can see that the generated data is a three-dimensional xarray.DataArray
. Now, let’s extract a cut along \(k_y = 0.3\).
[4]:
dat.sel(ky=0.3, method="nearest")
[4]:
<xarray.DataArray (eV: 300, kx: 250)> Size: 600kB 1.535 1.377 0.9181 0.4302 0.5897 ... 1.171e-06 8.757e-06 0.0002878 0.001415 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 ky float64 8B 0.2967 * eV (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
How about the Fermi surface?
[5]:
dat.sel(eV=0.0, method="nearest")
[5]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB 0.3501 0.1119 0.1255 0.1379 0.05128 ... 0.5261 0.2332 0.1398 0.1466 0.1662 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * ky (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 eV float64 8B -0.000301
In many scenarios, it is necessary to perform integration across multiple array slices. This can be done by slicing and averaging. The following code integrates the intensity over a window of 50 meV centered at \(E_F\).
[6]:
dat.sel(eV=slice(-0.025, 0.025)).mean("eV")
[6]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB 0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * ky (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
However, doing this every time is cumbersome, and we have lost the coordinate eV. ERLabPy provides a callable accessor qsel
to streamline this process.
[7]:
dat.qsel(eV=0.0, eV_width=0.05)
[7]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB 0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * ky (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 eV float64 8B 0.000602
Note that the averaged coordinate is automatically added to the data array. This is useful for plotting and further analysis.
If the width is not specified, qsel behaves like passing method='nearest' to sel. If a slice is given instead of a single value, no integration is performed. All of these methods can be combined:
[8]:
dat.qsel(kx=slice(-0.3, 0.3), ky=0.3, eV=0.0, eV_width=0.05)
[8]:
<xarray.DataArray (kx: 84)> Size: 672B 0.3407 0.3622 0.3589 0.3659 0.2786 0.3363 ... 0.3541 0.318 0.3214 0.305 0.2766 Coordinates: * kx (kx) float64 672B -0.2967 -0.2895 -0.2824 ... 0.2824 0.2895 0.2967 ky float64 8B 0.2967 eV float64 8B 0.000602
Masking¶
In some cases, it is necessary to mask the data. Although basic masks are supported by xarray
, ERLabPy provides a way to mask data with arbitrary polygons.
Work in Progress
This part of the user guide is still under construction. For now, see
erlab.analysis.mask
. For the full list of packages and modules provided by
ERLabPy, see API Reference.
Interpolation¶
In addition to the powerful interpolation methods provided by
xarray
, ERLabPy provides a convenient way to interpolate data along an arbitrary
path.
Consider a Γ-M-K-Γ high symmetry path given as a list of kx and ky coordinates:
[9]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
import numpy as np
a = 6.97
kx = [0, 2 * np.pi / (a * np.sqrt(3)), 2 * np.pi / (a * np.sqrt(3)), 0]
ky = [0, 0, 2 * np.pi / (a * 3), 0]
dat.qsel(eV=-0.2).qplot(aspect="equal", cmap="Greys")
plt.plot(kx, ky, "o-")
[9]:
[<matplotlib.lines.Line2D at 0x7fad27b68210>]
To interpolate the data along this path with a step of 0.01 Å\(^{-1}\), we can use the following code:
[10]:
import erlab.analysis as era
dat_sliced = era.interpolate.slice_along_path(
dat, vertices={"kx": kx, "ky": ky}, step_size=0.01
)
dat_sliced
[10]:
<xarray.DataArray (eV: 300, path: 140)> Size: 336kB 0.07295 0.1004 0.4831 0.6724 0.1885 ... 1.159e-13 1.01e-07 0.00131 0.138 0.1486 Coordinates: * eV (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09 kx (path) float64 1kB 0.0 0.01021 0.02041 ... 0.01764 0.008821 0.0 ky (path) float64 1kB 0.0 0.0 0.0 0.0 ... 0.01528 0.01019 0.005093 0.0 * path (path) float64 1kB 0.0 0.01021 0.02041 ... 1.402 1.412 1.422
We can see that the data has been interpolated along the path. The new coordinate path
contains the distance along the path, and the dimensions kx
and ky
are now expressed in terms of path
.
The distance along the path can be calculated as the sum of the distances between consecutive points in the path.
[11]:
dat_sliced.plot(cmap="Greys")
eplt.fermiline()
# Distance between each pair of consecutive points
distances = np.linalg.norm(np.diff(np.vstack([kx, ky]), axis=-1), axis=0)
seg_coords = np.concatenate(([0], np.cumsum(distances)))
plt.xticks(seg_coords, labels=["Γ", "M", "K", "Γ"])
plt.xlim(0, seg_coords[-1])
for seg in seg_coords[1:-1]:
    plt.axvline(seg, ls="--", c="k", lw=1)
You will learn more about plotting in the next section.
Plotting¶
ERLabPy provides a number of plotting functions to help visualize data and create publication quality figures.
Importing¶
The key module to plotting is erlab.plotting.erplot
, which serves as an
interface to various plotting functions in ERLabPy, similar to how
matplotlib.pyplot
works. To import it, use the following code:
[1]:
import matplotlib.pyplot as plt
import erlab.plotting.erplot as eplt
First, let us generate some example data from a simple tight binding model of graphene. A rigid shift of 200 meV has been applied so that the Dirac cone is visible.
[3]:
from erlab.io.exampledata import generate_data
dat = generate_data(bandshift=-0.2, seed=1).T
[4]:
dat
[4]:
<xarray.DataArray (eV: 300, ky: 250, kx: 250)> Size: 150MB 0.1517 0.1233 0.3382 0.3101 0.3476 0.8233 ... 0.117 0.116 0.4569 0.1373 0.1174 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * ky (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 * eV (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
We can see that the generated data is a three-dimensional xarray.DataArray
. Now, let’s extract a cut along \(k_y = 0.3\).
[5]:
cut = dat.qsel(ky=0.3)
cut
[5]:
<xarray.DataArray (eV: 300, kx: 250)> Size: 600kB 0.7814 0.3649 0.2043 0.2815 0.7351 ... 0.07689 0.01882 0.0002918 2.866e-07 Coordinates: * kx (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89 ky float64 8B 0.2967 * eV (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
Plotting 2D data¶
The fastest way to plot a 2D array like this is to use plot_array
. Each axis is automatically labeled.
[6]:
eplt.plot_array(cut)
[6]:
<matplotlib.image.AxesImage at 0x7f458fc59d50>
plot_array
takes many arguments that can customize the look of your plot. The following is an example of some of the functionality provided. For all arguments, see the API reference.
[7]:
eplt.plot_array(
cut, cmap="Greys", gamma=0.5, colorbar=True, colorbar_kw=dict(width=10, ticks=[])
)
[7]:
<matplotlib.image.AxesImage at 0x7f458f883b50>
plot_array
can also be accessed (for 2D data) through the qplot
accessor.
[8]:
cut.qplot(cmap="Greys", gamma=0.5)
[8]:
<matplotlib.image.AxesImage at 0x7f458f7a1f90>
[9]:
eplt.plot_array(cut, cmap="Greys", gamma=0.5)
eplt.fermiline()
eplt.mark_points([-0.525, 0.525], ["K", "K"], fontsize=10, pad=(0, 10))
eplt.nice_colorbar(width=10, ticks=[])
[9]:
<matplotlib.colorbar.Colorbar at 0x7f458deafe10>
In the code above, we added some annotations: a line indicating the Fermi level, labels for the high symmetry points, and a colorbar. Unlike the previous example, the colorbar was added after plotting. Adding elements separately like this, instead of using keyword arguments, can make the code more readable in complex plots.
Slices¶
What if we want to plot multiple slices at once? We should create subplots to place the slices. plt.subplots is very useful in managing multiple axes and figures. If you are unfamiliar with the syntax, visit the relevant matplotlib documentation.
Suppose we want to plot constant energy surfaces at specific binding energies, say, at [-0.4, -0.2, 0.0]
. We could create three subplots and iterate over the axes.
[10]:
energies = [-0.4, -0.2, 0.0]
fig, axs = plt.subplots(1, 3, layout="compressed", sharey=True)
for energy, ax in zip(energies, axs):
    const_energy_surface = dat.qsel(eV=energy)
    eplt.plot_array(const_energy_surface, ax=ax, gamma=0.5, aspect="equal")
Here, we plotted each constant energy surface with plot_array
. To remove the duplicated y axis labels and add
some annotations, we can use clean_labels
and label_subplot_properties
:
[11]:
fig, axs = plt.subplots(1, 3, layout="compressed", sharey=True)
for energy, ax in zip(energies, axs):
    const_energy_surface = dat.qsel(eV=energy)
    eplt.plot_array(const_energy_surface, ax=ax, gamma=0.5, aspect="equal")
eplt.clean_labels(axs) # removes shared y labels
eplt.label_subplot_properties(axs, values={"Eb": energies}) # annotates energy
Not bad. However, when it gets to multiple slices along multiple datasets, it
gets cumbersome. Luckily, ERLabPy provides a function that automates the subplot
creation, slicing, and annotation for you:
plot_slices
, which reduces the same code to a
one-liner. See the API reference for a full description of
all possible arguments.
[12]:
fig, axs = eplt.plot_slices([dat], eV=[-0.4, -0.2, 0.0], gamma=0.5, axis="image")
We can also plot the data integrated over an energy window, in this case with a width of 200 meV by adding the eV_width
argument:
[13]:
fig, axs = eplt.plot_slices(
[dat], eV=[-0.4, -0.2, 0.0], eV_width=0.2, gamma=0.5, axis="image"
)
Cuts along constant \(k_y\) can be plotted analogously.
[14]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
Here, we notice that the first two plots slice through regions with less spectral weight, so the colors across the three subplots are not on the same scale. This may be misleading in cases where the intensity across different slices is important. Luckily, we have a function that can unify the color limits across multiple axes.
The same effect can be achieved by passing same_limits=True to plot_slices.
[15]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
eplt.unify_clim(axs)
We can also choose a reference axis to get the color limits from.
[16]:
fig, axs = eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, figsize=(6, 2))
eplt.unify_clim(axs, target=axs.flat[1])
What if we want to plot constant energy surfaces and cuts in the same figure? We can create the subplots first and then utilize the axes
argument of plot_slices
.
[17]:
fig, axs = plt.subplots(2, 3, layout="compressed", sharex=True, sharey="row")
eplt.plot_slices([dat], eV=[-0.4, -0.2, 0.0], gamma=0.5, axes=axs[0, :], axis="image")
eplt.plot_slices([dat], ky=[0.0, 0.1, 0.3], gamma=0.5, axes=axs[1, :])
eplt.clean_labels(axs)
2D colormaps¶
2D colormaps are a method to visualize two quantities in a single image by mapping one to the lightness of the color and the other to the hue. This is useful when visualizing dichroic or spin-resolved ARPES data [2].
Let us begin with the simulated constant energy contours of graphene, 0.3 eV below and above the Fermi level.
[18]:
dat0, dat1 = generate_data(
shape=(250, 250, 2), Erange=(-0.3, 0.3), temp=0.0, seed=1, count=1e6
).T
_, axs = eplt.plot_slices(
[dat0, dat1],
order="F",
subplot_kw={"layout": "compressed", "sharey": "row"},
axis="scaled",
label=True,
)
# eplt.label_subplot_properties(axs, values=dict(Eb=[-0.3, 0.3]))
Suppose we want to visualize the sum and the normalized difference between the two. The simplest way is to plot them side by side.
[19]:
dat_sum = dat0 + dat1
dat_ndiff = (dat0 - dat1) / dat_sum
eplt.plot_slices(
[dat_sum, dat_ndiff],
order="F",
subplot_kw={"layout": "compressed", "sharey": "row"},
cmap=["viridis", "bwr"],
axis="scaled",
)
eplt.proportional_colorbar()
[19]:
<matplotlib.colorbar.Colorbar at 0x7f458d6621d0>
The difference array is noisy for small values of the sum. We can instead plot with a 2D colormap, where dat_ndiff is mapped to the color along the colormap and dat_sum is mapped to the lightness of the colormap.
[20]:
eplt.plot_array_2d(dat_sum, dat_ndiff)
[20]:
(<matplotlib.image.AxesImage at 0x7f458d36c990>,
<matplotlib.colorbar.Colorbar at 0x7f458d36e110>)
The color normalization for each axis can be set independently with lnorm
and cnorm
. The appearance of the colorbar axes can be customized with the returned Colorbar
object.
[21]:
_, cb = eplt.plot_array_2d(
dat_sum,
dat_ndiff,
lnorm=eplt.InversePowerNorm(0.5),
cnorm=eplt.CenteredInversePowerNorm(0.7, vcenter=0.0, halfrange=1.0),
)
cb.ax.set_xticks(cb.ax.get_xlim())
cb.ax.set_xticklabels(["Min", "Max"])
[21]:
[Text(1.922459256430518e-08, 0, 'Min'), Text(218.36113081275943, 0, 'Max')]
Styling figures¶
You can control the look and feel of matplotlib figures with style sheets and rcParams. In addition to the options provided by matplotlib, ERLabPy provides some style sheets that are listed below. Note that style sheets that change the default font require the font to be installed on the system. To see how each one looks, try running the code provided by matplotlib.
Style Name | Description |
---|---|
khan | Personal preferences of the package author. |
fira | Changes the default font to Fira Sans. |
firalight | Changes the default font to Fira Sans Light. |
times | Changes the default font to Times New Roman. |
nature | Changes the default font to Arial, and tweaks some aspects such as padding and default figure size. |
[22]:
with plt.style.context(["nature"]):
    eplt.plot_array(cut, cmap="Greys", gamma=0.5)
Tips¶
Although matplotlib is a powerful library, it is heavy and slow, and better suited for static plots. For interactive plots, libraries such as Plotly or Bokeh are popular.
The hvplot library is a high-level plotting library that provides a simple interface to Bokeh, Plotly, and Matplotlib. It is particularly useful for interactive plots and can be used with xarray objects. Here are some examples that use the Bokeh backend:
[23]:
import hvplot.xarray
cut.hvplot(x="kx", y="eV", cmap="Greys", aspect=1.5)
[23]:
[24]:
dat.hvplot(x="kx", y="ky", cmap="Greys", aspect="equal", widget_location="bottom")
[24]:
Note
If you are viewing this documentation online, the slider above will not work. To see the interactive plot, you can run the notebook locally after installing hvplot.
For more information, see the hvplot documentation.
Momentum conversion¶
Momentum conversion in ERLabPy is exact with no small angle approximation, but is also
very fast, thanks to the numba-accelerated trilinear interpolation in
erlab.analysis.interpolate
.
Nomenclature¶
Momentum conversion in ERLabPy follows the nomenclature from Ishida and Shin [3]. All experimental geometries are classified into four types, and the definition of the angles differs for each geometry.
For instance, imagine a typical Type 1 setup with a vertical slit that acquires maps by rotating about the z axis in the lab frame. In this case, the polar angle (rotation about z) is \(β\), and the tilt angle is \(ξ\).
In all cases, \(δ\) is the azimuthal angle that indicates in-plane rotation, and \(α\) is the angle along the slit.
[1]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
Let’s generate some example data, this time in angle coordinates.
[3]:
from erlab.io.exampledata import generate_data_angles
dat = generate_data_angles(shape=(200, 60, 300), assign_attributes=True, seed=1).T
dat
[3]:
<xarray.DataArray (eV: 300, beta: 60, alpha: 200)> Size: 29MB 6.269 9.361 7.431 4.192 4.509 ... 0.1411 0.002556 8.644e-06 6.478e-09 3.813e-13 Coordinates: * alpha (alpha) float64 2kB -15.0 -14.85 -14.7 -14.55 ... 14.7 14.85 15.0 * beta (beta) float64 480B -15.0 -14.49 -13.98 -13.47 ... 13.98 14.49 15.0 * eV (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12 xi float64 8B 0.0 delta float64 8B 0.0 hv float64 8B 50.0 Attributes: configuration: 1 temp_sample: 20.0 sample_workfunction: 4.5
Let us define a 2D cut from the map data we just generated.
[4]:
cut = dat.sel(beta=10.0, method="nearest")
eplt.plot_array(cut)
[4]:
<matplotlib.image.AxesImage at 0x7f4b5ebd3250>
Although the functions for momentum conversion are implemented in erlab.analysis.kspace
, the actual conversion is performed using an xarray accessor. Let’s see how it works.
Converting to momentum space¶
Momentum conversion is done by the convert
method of the kspace
accessor. The bounds and resolution are automatically determined from the data if no input is provided. The method returns a new DataArray in momentum space.
[5]:
dat_kconv = dat.kspace.convert()
dat_kconv
Estimating bounds and resolution
Calculating destination coordinates
Converting ('eV', 'alpha', 'beta') -> ('eV', 'kx', 'ky')
Interpolated in 2.088 s
[5]:
<xarray.DataArray (eV: 300, kx: 310, ky: 310)> Size: 231MB nan nan nan nan nan nan nan nan ... 2.316e-06 9.819e-07 nan nan nan nan nan nan Coordinates: xi float64 8B 0.0 delta float64 8B 0.0 hv float64 8B 50.0 * eV (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12 * kx (kx) float64 2kB -0.8956 -0.8898 -0.884 ... 0.884 0.8898 0.8956 * ky (ky) float64 2kB -0.8956 -0.8898 -0.884 ... 0.884 0.8898 0.8956 Attributes: configuration: 1 temp_sample: 20.0 sample_workfunction: 4.5 delta_offset: 0.0 xi_offset: 0.0 beta_offset: 0.0
Let us plot the original and converted data side by side.
[6]:
fig, axs = plt.subplots(1, 2, layout="compressed")
eplt.plot_array(dat.sel(eV=-0.3, method="nearest"), ax=axs[0], aspect="equal")
eplt.plot_array(dat_kconv.sel(eV=-0.3, method="nearest"), ax=axs[1], aspect="equal")
[6]:
<matplotlib.image.AxesImage at 0x7f4b5ca34bd0>
Setting parameters¶
Parameters that are needed for momentum conversion are the information about the
experimental configuration, the inner potential \(V_0\) (for photon energy dependent
data), work function, and angle offsets. These parameters are all stored as data
attributes. The kspace
accessor provides a convenient way to access and modify these
parameters. See configuration
, inner_potential
, work_function
, and offsets
for more information.
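For instance, assuming the accessor exposes these as settable attributes (see the linked attribute documentation for the exact interface), adjusting the inner potential and work function before converting might look like this:
dat.kspace.inner_potential = 10.0  # inner potential in eV, made-up value
dat.kspace.work_function = 4.5  # work function in eV, made-up value
dat_kconv = dat.kspace.convert()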
First, let’s check the angle offsets.
[7]:
dat.kspace.offsets
[7]:
delta | 0.0 |
xi | 0.0 |
beta | 0.0 |
Since we haven’t set any offsets, they are all zero. We will set the azimuthal angle to 60 degrees and the polar offset to 30 degrees and see what happens.
[8]:
dat.kspace.offsets.update(delta=60.0, beta=30.0)
[8]:
delta | 60.0 |
xi | 0.0 |
beta | 30.0 |
[9]:
dat_kconv = dat.kspace.convert()
dat_kconv
Estimating bounds and resolution
Calculating destination coordinates
Converting ('eV', 'alpha', 'beta') -> ('eV', 'kx', 'ky')
Interpolated in 0.937 s
[9]:
<xarray.DataArray (eV: 300, kx: 380, ky: 398)> Size: 363MB nan nan nan nan nan nan nan nan nan nan ... nan nan nan nan nan nan nan nan nan Coordinates: xi float64 8B 0.0 delta float64 8B 0.0 hv float64 8B 50.0 * eV (eV) float64 2kB -0.45 -0.4481 -0.4462 ... 0.1162 0.1181 0.12 * kx (kx) float64 3kB -2.495 -2.489 -2.483 ... -0.3111 -0.3053 -0.2995 * ky (ky) float64 3kB -0.3431 -0.3373 -0.3315 ... 1.946 1.952 1.957 Attributes: configuration: 1 temp_sample: 20.0 sample_workfunction: 4.5 delta_offset: 60.0 xi_offset: 0.0 beta_offset: 30.0
Plotting the converted data again, we can see the effect of angle offsets on the conversion.
[10]:
fig, axs = plt.subplots(1, 2, layout="compressed")
eplt.plot_array(dat.sel(eV=-0.3, method="nearest"), ax=axs[0], aspect="equal")
eplt.plot_array(dat_kconv.sel(eV=-0.3, method="nearest"), ax=axs[1], aspect="equal")
[10]:
<matplotlib.image.AxesImage at 0x7f4b2eb60790>
Interactive conversion¶
For three dimensional momentum conversion like maps or photon energy scans, an interactive window can be opened where you can adjust the parameters and see the effect right away.
The GUI is divided into two tabs.
The first tab is for setting momentum conversion parameters. The image is updated in real time as you change the parameters. Clicking the “Copy code” button will copy the code for conversion to the clipboard. The “Open in ImageTool” button performs a full three-dimensional conversion and opens the result in the ImageTool.
The second tab provides visualization options. You can overlay Brillouin zones and high symmetry points on the result, adjust colors, and apply binning.
There are two ways to invoke the GUI. The first one is to call the interactive
method on the accessor:
data.kspace.interactive()
The second option is to invoke the GUI directly with the erlab.interactive.ktool
function. The difference is that the latter will automatically determine the name of the input data and apply it to the generated code that is copied to the clipboard.
import erlab.interactive as eri
eri.ktool(data)
Curve fitting¶
Curve fitting in ERLabPy largely relies on lmfit. Along with some convenient models for common fitting tasks, ERLabPy provides a powerful accessor that streamlines curve fitting on multidimensional xarray objects.
ERLabPy also provides optional integration of lmfit models with iminuit, which is a Python interface to the Minuit C++ library developed at CERN.
In this tutorial, we will start with the basics of curve fitting using lmfit, introduce
some models that are available in ERLabPy, and demonstrate curve fitting with the
modelfit
accessor to
fit multidimensional xarray objects. Finally, we will show how to use iminuit with lmfit models.
Basic fitting with lmfit¶
[1]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from erlab.io.exampledata import generate_data
Let’s start by defining a model function and the data to fit.
[3]:
def poly1(x, a, b):
    return a * x + b
# Generate some toy data
x = np.linspace(0, 10, 20)
y = poly1(x, 1, 2)
# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(1)
yerr = np.full_like(x, 0.5)
y = rng.normal(y, yerr)
Fitting a simple function¶
An lmfit model can be created by calling the lmfit.Model class with the model function (and, optionally, the names of the independent variables) as arguments.
[4]:
import lmfit
model = lmfit.Model(poly1)
params = model.make_params(a=1.0, b=2.0)
result = model.fit(y, x=x, params=params, weights=1 / yerr)
result.plot()
result
[4]:
Fit Result
Model: Model(poly1)
fitting method | leastsq |
# function evals | 7 |
# data points | 20 |
# variables | 2 |
chi-square | 5.86642273 |
reduced chi-square | 0.32591237 |
Akaike info crit. | -20.5297448 |
Bayesian info crit. | -18.5382803 |
R-squared | 0.99155992 |
name | value | standard error | relative error | initial value | min | max | vary |
---|---|---|---|---|---|---|---|
a | 0.96713173 | 0.02103116 | (2.17%) | 1.0 | -inf | inf | True |
b | 2.18308498 | 0.12301076 | (5.63%) | 2.0 | -inf | inf | True |
Parameter1 | Parameter 2 | Correlation |
---|---|---|
a | b | -0.8549 |
By passing dictionaries to make_params
, we can set the initial values of the parameters and also set the bounds for the parameters.
[5]:
model = lmfit.Model(poly1)
params = model.make_params(
a={"value": 1.0, "min": 0.0},
b={"value": 2.0, "vary": False},
)
result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
result is an lmfit.model.ModelResult object that contains the results of the fit. The best-fit parameters can be accessed through the result.params attribute.
Note
Since all weights are the same in this case, the weighting has little effect on the fit results. However, if we are confident that we have a good estimate of yerr, we can pass scale_covar=False to the fit method to obtain accurate uncertainties.
[6]:
result.params
[6]:
name | value | standard error | relative error | initial value | min | max | vary |
---|---|---|---|---|---|---|---|
a | 0.99389030 | 0.01125608 | (1.13%) | 1.0 | 0.00000000 | inf | True |
b | 2.00000000 | 0.00000000 | (0.00%) | 2.0 | -inf | inf | False |
[7]:
result.params["a"].value, result.params["a"].stderr
[7]:
(0.9938903037183116, 0.011256084507121714)
The parameters can also be retrieved in a form that allows easy error propagation calculation, enabled by the uncertainties package.
[8]:
a_uvar = result.uvars["a"]
print(a_uvar)
print(a_uvar**2)
0.994+/-0.011
0.988+/-0.022
Fitting with composite models¶
Before fitting, let us generate a Gaussian peak on a linear background.
[9]:
# Generate toy data
x = np.linspace(0, 10, 50)
y = -0.1 * x + 2 + 3 * np.exp(-((x - 5) ** 2) / (2 * 1**2))
# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(x, 0.3)
y = rng.normal(y, yerr)
# Plot the data
plt.errorbar(x, y, yerr, fmt="o")
[9]:
<ErrorbarContainer object of 3 artists>
A composite model can be created by adding multiple models together.
[10]:
from lmfit.models import GaussianModel, LinearModel
model = GaussianModel() + LinearModel()
params = model.make_params(slope=-0.1, center=5.0, sigma={"value": 0.1, "min": 0})
params
[10]:
name | value | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|
amplitude | 1.00000000 | None | -inf | inf | True | |
center | 5.00000000 | 5.0 | -inf | inf | True | |
sigma | 0.10000000 | 0.1 | 0.00000000 | inf | True | |
slope | -0.10000000 | -0.1 | -inf | inf | True | |
intercept | 0.00000000 | None | -inf | inf | True | |
fwhm | 0.23548200 | None | -inf | inf | False | 2.3548200*sigma |
height | 3.98942300 | None | -inf | inf | False | 0.3989423*amplitude/max(1e-15, sigma) |
[11]:
result = model.fit(y, x=x, params=params, weights=1 / yerr)
result.plot()
result
[11]:
Fit Result
Model: (Model(gaussian) + Model(linear))
fitting method | leastsq |
# function evals | 43 |
# data points | 50 |
# variables | 5 |
chi-square | 35.1640791 |
reduced chi-square | 0.78142398 |
Akaike info crit. | -7.59989616 |
Bayesian info crit. | 1.96021887 |
R-squared | 0.94677517 |
name | value | standard error | relative error | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|---|---|
amplitude | 7.16590435 | 0.37616176 | (5.25%) | 1.0 | -inf | inf | True | |
center | 4.99152727 | 0.04220625 | (0.85%) | 5.0 | -inf | inf | True | |
sigma | 0.94300480 | 0.04687009 | (4.97%) | 0.1 | 0.00000000 | inf | True | |
slope | -0.11380431 | 0.01318512 | (11.59%) | -0.1 | -inf | inf | True | |
intercept | 2.00633810 | 0.08441486 | (4.21%) | 0.0 | -inf | inf | True | |
fwhm | 2.22060656 | 0.11037063 | (4.97%) | 0.23548200000000002 | -inf | inf | False | 2.3548200*sigma |
height | 3.03156714 | 0.11942779 | (3.94%) | 3.989423 | -inf | inf | False | 0.3989423*amplitude/max(1e-15, sigma) |
Parameter1 | Parameter 2 | Correlation |
---|---|---|
slope | intercept | -0.7822 |
amplitude | sigma | +0.7041 |
amplitude | intercept | -0.4390 |
sigma | intercept | -0.3091 |
center | slope | -0.2592 |
center | intercept | +0.2027 |
How about multiple Gaussian peaks? Since the parameter names overlap between the models, we must use the prefix argument to distinguish between them.
[12]:
model = GaussianModel(prefix="p0_") + GaussianModel(prefix="p1_") + LinearModel()
model.make_params()
[12]:
name | value | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|
p0_amplitude | 1.00000000 | None | -inf | inf | True | |
p0_center | 0.00000000 | None | -inf | inf | True | |
p0_sigma | 1.00000000 | None | 0.00000000 | inf | True | |
p1_amplitude | 1.00000000 | None | -inf | inf | True | |
p1_center | 0.00000000 | None | -inf | inf | True | |
p1_sigma | 1.00000000 | None | 0.00000000 | inf | True | |
slope | 1.00000000 | None | -inf | inf | True | |
intercept | 0.00000000 | None | -inf | inf | True | |
p0_fwhm | 2.35482000 | None | -inf | inf | False | 2.3548200*p0_sigma |
p0_height | 0.39894230 | None | -inf | inf | False | 0.3989423*p0_amplitude/max(1e-15, p0_sigma) |
p1_fwhm | 2.35482000 | None | -inf | inf | False | 2.3548200*p1_sigma |
p1_height | 0.39894230 | None | -inf | inf | False | 0.3989423*p1_amplitude/max(1e-15, p1_sigma) |
For more information, see the lmfit documentation.
Fitting with pre-defined models¶
Creating composite models with different prefixes every time can be cumbersome, so ERLabPy provides some pre-defined models in erlab.analysis.fit.models.
Fitting multiple peaks¶
One example is MultiPeakModel, which is a composite model of multiple Gaussian or Lorentzian peaks on a linear background. By supplying keyword arguments, you can specify the number of peaks, their shapes, whether to multiply with a Fermi-Dirac distribution, and whether to convolve the result with the experimental resolution.
[13]:
from erlab.analysis.fit.models import MultiPeakModel
model = MultiPeakModel(npeaks=1, peak_shapes=["gaussian"], fd=False, convolve=False)
params = model.make_params(p0_center=5.0, p0_width=0.2, p0_height=3.0)
params
[13]:
name | value | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|
p0_center | 5.00000000 | 5.0 | -inf | inf | True | |
p0_width | 0.20000000 | 0.2 | 0.00000000 | inf | True | |
p0_height | 3.00000000 | 3.0 | 0.00000000 | inf | True | |
lin_bkg | 0.00000000 | None | -inf | inf | True | |
const_bkg | 0.00000000 | None | -inf | inf | True | |
p0_sigma | 0.08493218 | None | -inf | inf | False | p0_width / (2 * sqrt(2 * log(2))) |
p0_amplitude | 0.10164911 | None | -inf | inf | False | p0_height * p0_sigma / sqrt(2 * pi) |
[14]:
result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
We can also plot components.
[15]:
comps = result.eval_components()
plt.errorbar(x, y, yerr, fmt="o", zorder=-1, alpha=0.3)
plt.plot(x, result.eval(), label="Best fit")
plt.plot(x, comps["1Peak_p0"], "--", label="Peak")
plt.plot(x, comps["1Peak_bkg"], "--", label="Background")
plt.legend()
[15]:
<matplotlib.legend.Legend at 0x7f3d52022d90>
Now, let us try fitting MDCs cut from simulated data with multiple Lorentzian peaks, convolved with a common instrumental resolution.
[16]:
data = generate_data(bandshift=-0.2, count=5e8, seed=1).T
cut = data.qsel(ky=0.3)
cut.qplot(colorbar=True)
[16]:
<matplotlib.image.AxesImage at 0x7f3d520979d0>
[17]:
mdc = cut.qsel(eV=0.0)
mdc.qplot()
[17]:
[<matplotlib.lines.Line2D at 0x7f3d51dbcb50>]
First, we define the model and set the initial parameters.
[18]:
from erlab.analysis.fit.models import MultiPeakModel
model = MultiPeakModel(npeaks=4, peak_shapes=["lorentzian"], fd=False, convolve=True)
params = model.make_params(
p0_center=-0.6,
p1_center=-0.45,
p2_center=0.45,
p3_center=0.6,
p0_width=0.02,
p1_width=0.02,
p2_width=0.02,
p3_width=0.02,
lin_bkg={"value": 0.0, "vary": False},
const_bkg=0.0,
resolution=0.03,
)
params
[18]:
name | value | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|
p0_center | -0.60000000 | -0.6 | -inf | inf | True | |
p0_width | 0.02000000 | 0.02 | 0.00000000 | inf | True | |
p0_height | -inf | None | 0.00000000 | inf | True | |
p1_center | -0.45000000 | -0.45 | -inf | inf | True | |
p1_width | 0.02000000 | 0.02 | 0.00000000 | inf | True | |
p1_height | -inf | None | 0.00000000 | inf | True | |
p2_center | 0.45000000 | 0.45 | -inf | inf | True | |
p2_width | 0.02000000 | 0.02 | 0.00000000 | inf | True | |
p2_height | -inf | None | 0.00000000 | inf | True | |
p3_center | 0.60000000 | 0.6 | -inf | inf | True | |
p3_width | 0.02000000 | 0.02 | 0.00000000 | inf | True | |
p3_height | -inf | None | 0.00000000 | inf | True | |
lin_bkg | 0.00000000 | 0.0 | -inf | inf | False | |
const_bkg | 0.00000000 | 0.0 | -inf | inf | True | |
resolution | 0.03000000 | 0.03 | 0.00000000 | inf | True | |
p0_sigma | 0.01000000 | None | -inf | inf | False | p0_width / 2 |
p0_amplitude | -inf | None | -inf | inf | False | p0_height * p0_sigma * pi |
p1_sigma | 0.01000000 | None | -inf | inf | False | p1_width / 2 |
p1_amplitude | -inf | None | -inf | inf | False | p1_height * p1_sigma * pi |
p2_sigma | 0.01000000 | None | -inf | inf | False | p2_width / 2 |
p2_amplitude | -inf | None | -inf | inf | False | p2_height * p2_sigma * pi |
p3_sigma | 0.01000000 | None | -inf | inf | False | p3_width / 2 |
p3_amplitude | -inf | None | -inf | inf | False | p3_height * p3_sigma * pi |
Then, we can fit the model to the data:
[19]:
result = model.fit(mdc, x=mdc.kx, params=params)
result.plot()
result
[19]:
Fit Result
Model: Model(4Peak)
fitting method | leastsq |
# function evals | 305 |
# data points | 250 |
# variables | 14 |
chi-square | 566.514610 |
reduced chi-square | 2.40048564 |
Akaike info crit. | 232.510488 |
Bayesian info crit. | 281.810941 |
R-squared | 0.99988540 |
name | value | standard error | relative error | initial value | min | max | vary | expression |
---|---|---|---|---|---|---|---|---|
p0_center | -0.59650469 | 2.5399e-05 | (0.00%) | -0.6 | -inf | inf | True | |
p0_width | 0.02403002 | 1.6855e-04 | (0.70%) | 0.02 | 0.00000000 | inf | True | |
p0_height | 1161.03463 | 6.49245237 | (0.56%) | 0.0 | 0.00000000 | inf | True | |
p1_center | -0.44489439 | 4.6412e-04 | (0.10%) | -0.45 | -inf | inf | True | |
p1_width | 0.01804181 | 0.00148733 | (8.24%) | 0.02 | 0.00000000 | inf | True | |
p1_height | 68.4523785 | 4.18812978 | (6.12%) | 0.0 | 0.00000000 | inf | True | |
p2_center | 0.44353079 | 5.1714e-04 | (0.12%) | 0.45 | -inf | inf | True | |
p2_width | 0.02372127 | 0.00164247 | (6.92%) | 0.02 | 0.00000000 | inf | True | |
p2_height | 57.0667112 | 2.80977111 | (4.92%) | 0.0 | 0.00000000 | inf | True | |
p3_center | 0.59606494 | 2.5489e-05 | (0.00%) | 0.6 | -inf | inf | True | |
p3_width | 0.02443832 | 1.6821e-04 | (0.69%) | 0.02 | 0.00000000 | inf | True | |
p3_height | 1152.96218 | 6.30752762 | (0.55%) | 0.0 | 0.00000000 | inf | True | |
lin_bkg | 0.00000000 | 0.00000000 | 0.0 | -inf | inf | False | ||
const_bkg | 0.27454009 | 0.13298720 | (48.44%) | 0.0 | -inf | inf | True | |
resolution | 0.02634117 | 1.6289e-04 | (0.62%) | 0.03 | 0.00000000 | inf | True | |
p0_sigma | 0.01201501 | 8.4274e-05 | (0.70%) | 0.01 | -inf | inf | False | p0_width / 2 |
p0_amplitude | 43.8247214 | 0.08237775 | (0.19%) | 0.0 | -inf | inf | False | p0_height * p0_sigma * pi |
p1_sigma | 0.00902091 | 7.4366e-04 | (8.24%) | 0.01 | -inf | inf | False | p1_width / 2 |
p1_amplitude | 1.93994144 | 0.06140643 | (3.17%) | 0.0 | -inf | inf | False | p1_height * p1_sigma * pi |
p2_sigma | 0.01186064 | 8.2124e-04 | (6.92%) | 0.01 | -inf | inf | False | p2_width / 2 |
p2_amplitude | 2.12637915 | 0.06652226 | (3.13%) | 0.0 | -inf | inf | False | p2_height * p2_sigma * pi |
p3_sigma | 0.01221916 | 8.4107e-05 | (0.69%) | 0.01 | -inf | inf | False | p3_width / 2 |
p3_amplitude | 44.2594703 | 0.08273375 | (0.19%) | 0.0 | -inf | inf | False | p3_height * p3_sigma * pi |
Parameter1 | Parameter 2 | Correlation |
---|---|---|
p0_width | p0_height | -0.9807 |
p3_width | p3_height | -0.9801 |
p1_width | p1_height | -0.9455 |
p2_width | p2_height | -0.9151 |
p0_height | resolution | +0.9068 |
p3_height | resolution | +0.9054 |
p0_width | resolution | -0.8802 |
p3_width | resolution | -0.8785 |
p0_height | p3_height | +0.8238 |
p0_width | p3_height | -0.8029 |
p0_height | p3_width | -0.8028 |
p0_width | p3_width | +0.7866 |
p0_width | const_bkg | -0.4211 |
p3_width | const_bkg | -0.4178 |
p0_height | const_bkg | +0.3711 |
p3_height | const_bkg | +0.3683 |
const_bkg | resolution | +0.3524 |
p2_width | const_bkg | -0.2401 |
p1_width | const_bkg | -0.2398 |
p1_height | const_bkg | +0.1708 |
p2_height | const_bkg | +0.1437 |
Fitting xarray objects¶
ERLabPy provides accessors for xarray objects that allow you to fit data with lmfit models: xarray.DataArray.modelfit or xarray.Dataset.modelfit, depending on whether you want to fit a single DataArray or multiple DataArrays in a Dataset. The syntax is similar to xarray.DataArray.curvefit() and xarray.Dataset.curvefit(). The accessor returns an xarray.Dataset with the best-fit parameters and the fit statistics. The example below illustrates how to use the accessor to conduct the same fit as in the previous example.
[20]:
result_ds = mdc.modelfit("kx", model, params=params)
result_ds
[20]:
<xarray.Dataset> Size: 14kB Dimensions: (param: 23, cov_i: 23, cov_j: 23, fit_stat: 8, kx: 250) Coordinates: ky float64 8B 0.2967 eV float64 8B -0.000301 * kx (kx) float64 2kB -0.89 -0.8829 ... 0.8829 0.89 * param (param) <U12 1kB 'p0_center' ... 'p3_amplitude' * fit_stat (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic' * cov_i (cov_i) <U12 1kB 'p0_center' ... 'p3_amplitude' * cov_j (cov_j) <U12 1kB 'p0_center' ... 'p3_amplitude' Data variables: modelfit_results object 8B <lmfit.model.ModelResult object at 0x7f3... modelfit_coefficients (param) float64 184B -0.5965 0.02403 ... 44.26 modelfit_stderr (param) float64 184B 2.54e-05 0.0001685 ... 0.08273 modelfit_covariance (cov_i, cov_j) float64 4kB 6.451e-10 ... nan modelfit_stats (fit_stat) float64 64B 305.0 14.0 ... 232.5 281.8 modelfit_data (kx) float64 2kB 2.337 2.061 2.908 ... 3.202 4.133 modelfit_best_fit (kx) float64 2kB 2.336 2.437 2.545 ... 2.495 2.392
modelfit_results contains the underlying lmfit.model.ModelResult objects. The coefficients and their errors are stored in modelfit_coefficients and modelfit_stderr, respectively. The goodness-of-fit statistics are stored in modelfit_stats.
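For instance, a single best-fit value and its uncertainty can be extracted from the dataset like this (a small sketch using the parameter names from the fit above):
p0_center = result_ds.modelfit_coefficients.sel(param="p0_center").item()
p0_center_err = result_ds.modelfit_stderr.sel(param="p0_center").item()
print(f"p0_center = {p0_center:.4f} ± {p0_center_err:.4f}")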
Fitting across multiple dimensions¶
The accessor comes in handy when you have to fit a single model to multiple data points across arbitrary dimensions. Let’s demonstrate this with a simulated cut that represents a curved Fermi edge at 100 K, with an energy broadening of 20 meV.
[21]:
from erlab.io.exampledata import generate_gold_edge
from erlab.analysis.fit.models import FermiEdgeModel
gold = generate_gold_edge(temp=100, Eres=0.02, count=5e5, seed=1)
gold.qplot(cmap="Greys")
[21]:
<matplotlib.image.AxesImage at 0x7f3d51609690>
We first select ± 0.2 eV around the Fermi level and fit the model across the energy axis for every EDC.
[22]:
gold_selected = gold.sel(eV=slice(-0.2, 0.2))
result_ds = gold_selected.modelfit(
coords="eV",
model=FermiEdgeModel(),
params={"temp": {"value": 100.0, "vary": False}},
guess=True,
)
result_ds
[22]:
<xarray.Dataset> Size: 358kB Dimensions: (alpha: 200, param: 7, cov_i: 7, cov_j: 7, fit_stat: 8, eV: 75) Coordinates: * alpha (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0 * eV (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983 * param (param) <U10 280B 'center' 'temp' ... 'dos0' 'dos1' * fit_stat (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic' * cov_i (cov_i) <U10 280B 'center' 'temp' ... 'dos0' 'dos1' * cov_j (cov_j) <U10 280B 'center' 'temp' ... 'dos0' 'dos1' Data variables: modelfit_results (alpha) object 2kB <lmfit.model.ModelResult object... modelfit_coefficients (alpha, param) float64 11kB -0.01774 100.0 ... -3.53 modelfit_stderr (alpha, param) float64 11kB 0.005598 0.0 ... 4.111 modelfit_covariance (alpha, cov_i, cov_j) float64 78kB 3.134e-05 ... 16.9 modelfit_stats (alpha, fit_stat) float64 13kB 106.0 6.0 ... -26.34 modelfit_data (alpha, eV) float64 120kB 5.1 7.021 7.453 ... 0.0 0.0 modelfit_best_fit (alpha, eV) float64 120kB 6.642 6.565 ... -0.0711
Let’s plot the fitted parameters as a function of angle!
[23]:
gold.qplot(cmap="Greys")
plt.errorbar(
gold_selected.alpha,
result_ds.modelfit_coefficients.sel(param="center"),
result_ds.modelfit_stderr.sel(param="center"),
fmt=".",
)
[23]:
<ErrorbarContainer object of 3 artists>
We can also conduct a global fit to all the EDCs with a multidimensional model. First, we normalize the data with the averaged intensity of each EDC and then fit the data to FermiEdge2dModel.
[24]:
from erlab.analysis.fit.models import FermiEdge2dModel
gold_norm = gold_selected / gold_selected.mean("eV")
result_ds = gold_norm.T.modelfit(
coords=["eV", "alpha"],
model=FermiEdge2dModel(),
params={"temp": {"value": 100.0, "vary": False}},
guess=True,
)
result_ds
[24]:
<xarray.Dataset> Size: 244kB Dimensions: (param: 8, cov_i: 8, cov_j: 8, fit_stat: 8, alpha: 200, eV: 75) Coordinates: * alpha (alpha) float64 2kB -15.0 -14.85 -14.7 ... 14.85 15.0 * eV (eV) float64 600B -0.1977 -0.1923 ... 0.193 0.1983 * param (param) <U10 320B 'c0' 'c1' ... 'offset' 'resolution' * fit_stat (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic' * cov_i (cov_i) <U10 320B 'c0' 'c1' ... 'offset' 'resolution' * cov_j (cov_j) <U10 320B 'c0' 'c1' ... 'offset' 'resolution' Data variables: modelfit_results object 8B <lmfit.model.ModelResult object at 0x7f3... modelfit_coefficients (param) float64 64B 0.03604 8.09e-06 ... 0.04345 modelfit_stderr (param) float64 64B 0.0004463 2.749e-05 ... 0.001243 modelfit_covariance (cov_i, cov_j) float64 512B 1.992e-07 ... 1.545e-06 modelfit_stats (fit_stat) float64 64B 82.0 7.0 ... -3.98e+04 modelfit_data (alpha, eV) float64 120kB 2.027 2.791 ... 0.0 0.0 modelfit_best_fit (alpha, eV) float64 120kB 1.965 1.957 ... 0.02488
Let’s plot the fit results and the residuals.
[25]:
best_fit = result_ds.modelfit_best_fit.transpose(*gold_norm.dims)
fig, axs = eplt.plot_slices(
[gold_norm, best_fit, best_fit - gold_norm],
figsize=(4, 5),
cmap=["viridis", "viridis", "bwr"],
norm=[plt.Normalize(), plt.Normalize(), eplt.CenteredPowerNorm(1.0, vcenter=0)],
colorbar="all",
hide_colorbar_ticks=False,
colorbar_kw={"width": 7},
)
eplt.set_titles(axs, ["Data", "FermiEdge2dModel", "Residuals"])
Providing multiple initial guesses¶
You can also provide different initial guesses and bounds for different coordinates. To demonstrate, let’s create some data containing multiple MDCs, each with a different peak position.
[26]:
# Define angle coordinates for 2D data
alpha = np.linspace(-5.0, 5.0, 100)
beta = np.linspace(-1.0, 1.0, 3)
# Center of the peaks along beta
center = np.array([-2.0, 0.0, 2.0])[:, np.newaxis]
# Gaussian peak on a linear background
y = -0.1 * alpha + 2 + 3 * np.exp(-((alpha - center) ** 2) / (2 * 1**2))
# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(y, 0.1)
y = rng.normal(y, yerr)
# Transform to DataArray
darr = xr.DataArray(y, dims=["beta", "alpha"], coords={"beta": beta, "alpha": alpha})
darr.qplot()
[26]:
<matplotlib.image.AxesImage at 0x7f3d5080f450>
We can provide different initial guesses for the peak positions along the beta
dimension by passing a dictionary of DataArrays to the params
argument.
[27]:
result_ds = darr.modelfit(
coords="alpha",
model=GaussianModel() + LinearModel(),
params={
"center": xr.DataArray([-2, 0, 2], coords=[darr.beta]),
"slope": -0.1,
},
)
result_ds
[27]:
<xarray.Dataset> Size: 7kB Dimensions: (beta: 3, param: 5, cov_i: 5, cov_j: 5, fit_stat: 8, alpha: 100) Coordinates: * beta (beta) float64 24B -1.0 0.0 1.0 * alpha (alpha) float64 800B -5.0 -4.899 -4.798 ... 4.899 5.0 * param (param) <U9 180B 'amplitude' 'center' ... 'intercept' * fit_stat (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic' * cov_i (cov_i) <U9 180B 'amplitude' 'center' ... 'intercept' * cov_j (cov_j) <U9 180B 'amplitude' 'center' ... 'intercept' Data variables: modelfit_results (beta) object 24B <lmfit.model.ModelResult object ... modelfit_coefficients (beta, param) float64 120B 7.324 -2.0 ... 1.993 modelfit_stderr (beta, param) float64 120B 0.1349 0.011 ... 0.01735 modelfit_covariance (beta, cov_i, cov_j) float64 600B 0.01819 ... 0.00... modelfit_stats (beta, fit_stat) float64 192B 31.0 5.0 ... -450.2 modelfit_data (beta, alpha) float64 2kB 2.453 2.402 ... 1.404 1.64 modelfit_best_fit (beta, alpha) float64 2kB 2.553 2.553 ... 1.526 1.504
Let’s overlay the fitted peak positions on the data.
[28]:
result_ds.modelfit_data.qplot()
result_center = result_ds.sel(param="center")
plt.plot(result_center.modelfit_coefficients, result_center.beta, '.-')
[28]:
[<matplotlib.lines.Line2D at 0x7f3d509da2d0>]
The same can be done with all parameter attributes that can be passed to lmfit.create_params() (e.g., vary, min, max, etc.). For example:
[29]:
result_ds = darr.modelfit(
coords="alpha",
model=GaussianModel() + LinearModel(),
params={
"center": {
"value": xr.DataArray([-2, 0, 2], coords=[darr.beta]),
"min": -5.0,
"max": xr.DataArray([0, 2, 5], coords=[darr.beta]),
},
"slope": -0.1,
},
)
result_ds
[29]:
<xarray.Dataset> Size: 7kB Dimensions: (beta: 3, param: 5, cov_i: 5, cov_j: 5, fit_stat: 8, alpha: 100) Coordinates: * beta (beta) float64 24B -1.0 0.0 1.0 * alpha (alpha) float64 800B -5.0 -4.899 -4.798 ... 4.899 5.0 * param (param) <U9 180B 'amplitude' 'center' ... 'intercept' * fit_stat (fit_stat) <U6 192B 'nfev' 'nvarys' ... 'aic' 'bic' * cov_i (cov_i) <U9 180B 'amplitude' 'center' ... 'intercept' * cov_j (cov_j) <U9 180B 'amplitude' 'center' ... 'intercept' Data variables: modelfit_results (beta) object 24B <lmfit.model.ModelResult object ... modelfit_coefficients (beta, param) float64 120B 7.324 -2.0 ... 1.993 modelfit_stderr (beta, param) float64 120B 0.1349 0.011 ... 0.01735 modelfit_covariance (beta, cov_i, cov_j) float64 600B 0.01819 ... 0.00... modelfit_stats (beta, fit_stat) float64 192B 31.0 5.0 ... -450.2 modelfit_data (beta, alpha) float64 2kB 2.453 2.402 ... 1.404 1.64 modelfit_best_fit (beta, alpha) float64 2kB 2.553 2.553 ... 1.526 1.504
Visualizing fits¶
If hvplot is installed, we can visualize the fit results interactively with the qshow accessor.
Note
If you are viewing this documentation online, the plot will not be interactive. Run the code locally to try it out.
[30]:
result_ds.qshow(plot_components=True)
[30]:
Note
There is a dedicated module for Fermi edge fitting and correction, described here.
Parallelization¶
The accessors are tightly integrated with xarray, so passing a dask array will parallelize the fitting process. See Parallel Computing with Dask for more information.
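For instance, chunking the gold data from the previous section along the angle dimension makes the accessor operate lazily, and the fits run in parallel when the result is computed (a minimal sketch that assumes dask is installed):
gold_chunked = gold_selected.chunk({"alpha": 20})
result_lazy = gold_chunked.modelfit(
    coords="eV",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_ds = result_lazy.compute()  # fits are evaluated here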
For non-dask objects, you can achieve joblib-based parallelization:
- For non-dask Datasets, basic parallelization across multiple data variables can be achieved with the parallel argument to xarray.Dataset.modelfit.
- For parallelizing fits on a single DataArray along a dimension with many points, the xarray.DataArray.parallel_fit accessor can be used. This method is similar to xarray.DataArray.modelfit, but requires the name of the dimension to parallelize over instead of the coordinates to fit along. For example, to parallelize the fit in the previous example, you can use the following code:
gold_selected.parallel_fit(
    dim="alpha",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
Note
Note that the initial run will take a long time due to the overhead of creating parallel workers. Subsequent calls will run faster, since joblib’s default backend will try to reuse the workers.
The accessor has some intrinsic overhead due to post-processing. If you need the best performance, handle the parallelization yourself with joblib and lmfit.Model.fit.
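A rough sketch of such a manual approach, fitting each EDC of gold_selected in a separate joblib task (the helper function below is illustrative, not part of ERLabPy):
import joblib
import lmfit

model = FermiEdgeModel()

def fit_single_edc(edc) -> lmfit.model.ModelResult:
    # Guess starting parameters from the data, then fix the temperature
    pars = model.guess(edc.values, x=edc["eV"].values)
    pars["temp"].set(value=100.0, vary=False)
    return model.fit(edc.values, x=edc["eV"].values, params=pars)

results = joblib.Parallel(n_jobs=-1)(
    joblib.delayed(fit_single_edc)(gold_selected.isel(alpha=i))
    for i in range(gold_selected.sizes["alpha"])
)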
Saving and loading fits¶
Apart from the optional modelfit_results variable, the resulting dataset contains no Python objects, so it can be saved and loaded reliably as a netCDF file using xarray.Dataset.to_netcdf() and xarray.open_dataset(), respectively. Whether the lmfit.model.ModelResult objects are included in the output is controlled by the output_result argument.
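For example, a fit run with output_result=False can be written to disk and read back directly (the file name is illustrative):
result_ds = gold_selected.modelfit(
    coords="eV",
    model=FermiEdgeModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
    output_result=False,  # keep only plain arrays so the dataset is serializable
)
result_ds.to_netcdf("gold_edge_fit.nc")
loaded = xr.open_dataset("gold_edge_fit.nc")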
Warning
Saving full model results that include the model functions can be difficult. Instead of saving the fit results, it is recommended to save the code that can reproduce the fit. See the relevant lmfit documentation for more information.
Fermi edge fitting¶
Functions related to the Fermi edge are available in erlab.analysis.gold. To fit a polynomial to a Fermi edge, you can use erlab.analysis.gold.poly().
[32]:
import erlab.analysis as era
import erlab.plotting.erplot as eplt
gold = generate_gold_edge(temp=100, seed=1)
result = era.gold.poly(
gold,
angle_range=(-15, 15),
eV_range=(-0.2, 0.2),
temp=100.0,
vary_temp=False,
degree=2,
plot=True,
)
The resulting polynomial can be used to correct the Fermi edge with erlab.analysis.gold.correct_with_edge().
[33]:
era.correct_with_edge(gold, result).qplot(cmap="Greys")
eplt.fermiline()
[33]:
<matplotlib.lines.Line2D at 0x7f3d49fa2150>
Also check out the interactive Fermi edge fitting tool, erlab.interactive.goldtool().
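For example, it can be launched on the simulated gold data from above:
import erlab.interactive as eri
eri.goldtool(gold)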
Using iminuit¶
Note
This part requires iminuit.
iminuit is a powerful Python interface to the Minuit C++ library developed at CERN. To learn more, see the iminuit documentation.
ERLabPy provides a thin wrapper around iminuit.Minuit that allows you to use lmfit models with iminuit. The example below conducts the same fit as the previous one, but using iminuit.
[34]:
from erlab.analysis.fit.minuit import Minuit
from erlab.analysis.fit.models import MultiPeakModel
model = MultiPeakModel(npeaks=4, peak_shapes=["lorentzian"], fd=False, convolve=True)
m = Minuit.from_lmfit(
model,
mdc,
mdc.kx,
p0_center=-0.6,
p1_center=-0.45,
p2_center=0.45,
p3_center=0.6,
p0_width=0.02,
p1_width=0.02,
p2_width=0.02,
p3_width=0.02,
p0_height=1500,
p1_height=50,
p2_height=50,
p3_height=1500,
lin_bkg={"value": 0.0, "vary": False},
const_bkg=0.0,
resolution=0.03,
)
m.migrad()
m.minos()
m.hesse()
[34]:
Migrad | |
---|---|
FCN = 566.5 (χ²/ndof = 2.4) | Nfcn = 5120 |
EDM = 4.13e-06 (Goal: 0.0002) | |
Valid Minimum | Below EDM threshold (goal x 10) |
No parameters at limit | Below call limit |
Hesse ok | Covariance accurate |
Name | Value | Hesse Error | Minos Error- | Minos Error+ | Limit- | Limit+ | Fixed | |
---|---|---|---|---|---|---|---|---|
0 | p0_center | -596.505e-3 | 0.016e-3 | -0.016e-3 | 0.016e-3 | |||
1 | p0_width | 24.03e-3 | 0.11e-3 | -0.11e-3 | 0.11e-3 | 0 | ||
2 | p0_height | 1.161e3 | 0.004e3 | -0.004e3 | 0.004e3 | 0 | ||
3 | p1_center | -444.89e-3 | 0.30e-3 | -0.30e-3 | 0.30e-3 | |||
4 | p1_width | 18e-3 | 1e-3 | -1e-3 | 1e-3 | 0 | ||
5 | p1_height | 68.5 | 2.8 | -2.7 | 2.9 | 0 | ||
6 | p2_center | 443.53e-3 | 0.34e-3 | -0.34e-3 | 0.34e-3 | |||
7 | p2_width | 0.0237 | 0.0010 | -0.0010 | 0.0010 | 0 | ||
8 | p2_height | 57.1 | 1.8 | -1.7 | 1.8 | 0 | ||
9 | p3_center | 596.065e-3 | 0.016e-3 | -0.016e-3 | 0.017e-3 | |||
10 | p3_width | 24.44e-3 | 0.11e-3 | -0.11e-3 | 0.11e-3 | 0 | ||
11 | p3_height | 1.153e3 | 0.004e3 | -0.004e3 | 0.004e3 | 0 | ||
12 | lin_bkg | 0.0 | 0.1 | yes | ||||
13 | const_bkg | 0.27 | 0.09 | -0.09 | 0.09 | |||
14 | resolution | 26.34e-3 | 0.10e-3 | -0.10e-3 | 0.10e-3 | 0 |
The output also includes the Minos error intervals for each free parameter (all valid, with no parameters at their limits) and the full parameter covariance matrix.
You can also use the interactive fitting interface provided by iminuit.
Note
Requires ipywidgets to be installed.
If you are viewing this documentation online, changing the sliders won't change the plot. Run the code locally to try it out.
[35]:
m.interactive()
[35]:
Using ImageTool¶


Inspired by Image Tool for Igor Pro written by the Advanced Light Source at Lawrence Berkeley National Laboratory, ImageTool is a simple tool for interactively exploring images.
Features include:
Zooming and panning
Extremely fast and smooth data exploration
Real-time binning across multiple dimensions
Multiple cursors!
Easy and intuitive plot size adjustment with splitters
Advanced colormap control
ImageTool can be used to display image-like xarray.DataArray
s ranging from
2 to 4 dimensions. If a coordinate of the input data happens to be non-uniform, it will
automatically be converted to an index array so that the data can be displayed.
There are two main ways to invoke the ImageTool. First is to call the itool
convenience function, which will create a new
ImageTool instance and handle the event loop execution:
import erlab.interactive as eri
eri.itool(data)
Another way is to use the qshow
accessor:
data.qshow()
Tips¶
If you don’t know what a button does, many buttons have tooltips that will appear when you hover over them.
Right-clicking on each plot will bring up a context menu with various options. One useful option is Copy selection code, which copies code that can be quickly pasted into a Python script or Jupyter notebook to reproduce the sliced data. You can also save the data corresponding to each slice as an HDF5 file.
ImageTool is also very extensible. At our home lab, we use a modified version of ImageTool to plot data as it is being collected in real time!
Keyboard shortcuts¶
Hints for most keyboard shortcuts are displayed in the menu bar. Here, some shortcuts that are not found in the menu bar are listed. Mac users must replace Ctrl with ⌘ and Alt with ⌥.
Shortcut | Description |
---|---|
LMB Drag | Pan around |
RMB Drag | Zoom and scale |
Ctrl+LMB Drag | Move current cursor around |
Ctrl+Alt+LMB Drag | Move all cursors around |
Alt while dragging a cursor line | Make all cursor lines move together |
Further reading¶
API Reference¶
ERLabPy is organized into multiple subpackages and submodules classified by their functionality. The following table lists the subpackages and submodules of ERLabPy.
Subpackages¶
Subpackage | Description |
---|---|
erlab.analysis | Routines for analyzing ARPES data. |
erlab.io | Reading and writing data. |
erlab.plotting | Functions related to static plotting with matplotlib. |
erlab.interactive | Interactive tools and widgets based on Qt and pyqtgraph. |
erlab.accessors | xarray accessors. You will not need to import this module directly. |
Analysis (erlab.analysis)¶
Various functions for data analysis.
Modules¶
Utilities for curve fitting. |
|
Functions related to masking. |
|
Macros for correlation analysis. |
|
Fermi edge fitting. |
|
Various image processing functions including tools for visualizing dispersive features. |
|
Utilities for interpolation. |
|
Momentum conversion functions. |
|
Transformations. |
|
- erlab.analysis.correct_with_edge(darr, modelresult, shift_coords=True, plot=False, plot_kw=None, **shift_kwargs)[source]¶
Corrects the given data array darr with the given values or fit result.
- Parameters:
darr (DataArray) – The input data array to be corrected.
modelresult (ModelResult | ndarray[Any, dtype[floating]] | Callable) – The model result that contains the Fermi edge information. It can be an instance of lmfit.model.ModelResult, a numpy array containing the edge position at each angle, or a callable function that takes an array of angles and returns the corresponding energy value.
shift_coords (bool) – If True, the coordinates of the output data will be changed so that the output contains all the values of the original data. If False, the coordinates and shape of the original data will be retained, and only the data will be shifted. Defaults to True.
plot (bool) – Whether to plot the original and corrected data arrays. Defaults to False.
plot_kw (dict | None) – Additional keyword arguments for the plot. Defaults to None.
**shift_kwargs – Additional keyword arguments to erlab.analysis.utilities.shift.
- Returns:
corrected – The edge corrected data.
- Return type:
DataArray
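Example
A minimal illustrative call, reusing the gold data and the polynomial fit result from the user guide above:
>>> corrected = erlab.analysis.correct_with_edge(gold, result)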
- erlab.analysis.mask_with_hex_bz(kxymap, a=3.54, rotate=0.0, invert=False)[source]¶
Return map masked with a hexagonal BZ.
- erlab.analysis.polygon_mask(vertices, x, y, invert=False)[source]¶
Create a mask based on a polygon defined by its vertices.
- Parameters:
vertices (ndarray[Any, dtype[float64]]) – The vertices of the polygon. The shape should be (N, 2), where N is the number of vertices.
x (ndarray[Any, dtype[float64]]) – The x-coordinates of the grid points.
y (ndarray[Any, dtype[float64]]) – The y-coordinates of the grid points.
invert (bool) – If
True
, invert the mask (i.e., setTrue
where the polygon is outside andFalse
where it is inside). Default isFalse
.
- Returns:
mask – The mask array with shape
(len(x), len(y))
. The mask containsTrue
where the polygon is inside andFalse
where it is outside (or vice versa ifinvert
isTrue
).- Return type:
ndarray
Note
This function uses the
erlab.analysis.mask.polygon.bounded_side_bool
to determine whether a point is inside or outside the polygon.Example
>>> vertices = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.8], [0.8, 0.2]]) >>> x = np.linspace(0, 1, 5) >>> y = np.linspace(0, 1, 5) >>> polygon_mask(vertices, x, y) array([[False, False, False, False, False], [False, True, True, True, False], [False, True, True, True, False], [False, True, True, True, False], [False, False, False, False, False]])
- erlab.analysis.polygon_mask_points(vertices, x, y, invert=False)[source]¶
Compute a mask indicating whether points are inside or outside a polygon.
- Parameters:
vertices (ndarray[Any, dtype[float64]]) – The vertices of the polygon. The shape should be (N, 2), where N is the number of vertices.
x (ndarray[Any, dtype[float64]]) – The x-coordinates of the points.
y (ndarray[Any, dtype[float64]]) – The y-coordinates of the points.
invert (bool) – If
True
, invert the mask (i.e., setTrue
where the polygon is outside andFalse
where it is inside). Default isFalse
.
- Returns:
mask – A boolean array of shape
(len(x),)
indicating whether each point is inside or outside the polygon.- Return type:
ndarray
- Raises:
ValueError – If the lengths of x and y are not equal.
Notes
This function uses the
erlab.analysis.mask.polygon.bounded_side_bool
to determine whether a point is inside or outside the polygon.
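Example
For instance, checking two points against a square polygon (illustrative values):
>>> vertices = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.8], [0.8, 0.2]])
>>> x = np.array([0.5, 0.9])
>>> y = np.array([0.5, 0.9])
>>> polygon_mask_points(vertices, x, y)
array([ True, False])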
- erlab.analysis.shift(darr, shift, along, shift_coords=False, **shift_kwargs)[source]¶
Shifts the values of a DataArray along a single dimension.
The shift is applied using
scipy.ndimage.shift
with the specified keyword arguments. Linear interpolation is used by default.- Parameters:
darr (DataArray) – The array to shift.
shift (float | DataArray) – The amount of shift to be applied along the specified dimension. If
shift
is a DataArray, different shifts can be applied to different coordinates. The dimensions ofshift
must be a subset of the dimensions ofdarr
. For more information, see the note below. Ifshift
is afloat
, the same shift is applied to all values along dimensionalong
. This is equivalent to providing a 0-dimensional DataArray.along (str) – Name of the dimension along which the shift is applied.
shift_coords (bool) – If
True
, the coordinates of the output data will be changed so that the output contains all the values of the original data. IfFalse
, the coordinates and shape of the original data will be retained, and only the data will be shifted. Defaults toFalse
.**shift_kwargs – Additional keyword arguments passed onto
scipy.ndimage.shift
. Default values ofcval
andorder
are set tonp.nan
and1
respectively.
- Returns:
The shifted DataArray.
- Return type:
Note
All dimensions in
shift
must be a dimension indarr
.The
shift
array values are divided by the step size along thealong
dimension.NaN values in
shift
are treated as zero.
Example
>>> import xarray as xr >>> darr = xr.DataArray( ... np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]).astype(float), dims=["x", "y"] ... ) >>> shift_arr = xr.DataArray([1, 0, 2], dims=["x"]) >>> shifted = erlab.analysis.utilities.shift(darr, shift_arr, along="y") >>> print(shifted) <xarray.DataArray (x: 3, y: 3)> Size: 72B nan 1.0 2.0 4.0 5.0 6.0 nan nan 7.0 Dimensions without coordinates: x, y
- erlab.analysis.slice_along_path(darr, vertices, step_size=None, dim_name='path', interp_kwargs=None, **vertices_kwargs)[source]¶
Interpolate a DataArray along a path defined by a sequence of vertices.
- Parameters:
darr (DataArray) – The data array to interpolate.
vertices (Mapping[Hashable, Sequence]) – Dictionary specifying the vertices of the path along which to interpolate the DataArray. The keys of the dictionary should correspond to the dimensions of the DataArray along which to interpolate.
step_size (float | None) – The step size to use for the interpolation. This determines the number of points along the path at which the data array will be interpolated. If None, the step size is determined automatically as the smallest step size of the coordinates along the dimensions of the vertices if all coordinates are evenly spaced. If there exists a dimension where the coordinates are not evenly spaced,
step_size
must be specified.dim_name (str) – The name of the new dimension that corresponds to the distance along the interpolated path. Default is “path”.
interp_kwargs (dict | None) – Additional keyword arguments passed to
xarray.DataArray.interp
.**vertices_kwargs – The keyword arguments form of
vertices
. One ofvertices
orvertices_kwargs
must be provided.
- Returns:
interpolated – New dataarray on the new coordinate.
- Return type:
DataArray
Examples
>>> import numpy as np >>> import xarray as xr >>> from erlab.analysis.interpolate import slice_along_path >>> x = np.linspace(0, 10, 11) >>> y = np.linspace(0, 10, 11) >>> z = np.linspace(0, 10, 11) >>> data = np.random.rand(11, 11, 11) >>> darr = xr.DataArray(data, coords={"x": x, "y": y, "z": z}, dims=["x", "y", "z"]) >>> vertices = {"x": [0, 5, 10], "y": [0, 5, 10], "z": [0, 5, 10]} >>> interp = slice_along_path(darr, vertices)
See also
Data IO (erlab.io)¶
Read & write ARPES data.
This module provides functions that enable loading various files such as HDF5 files, Igor Pro files, and ARPES data from different beamlines and laboratories.
Modules¶
Data loading plugins. |
|
Base functionality for implementing data loaders. |
|
Generates simple simulated ARPES data for testing purposes. |
|
Data import for characterization experiments. |
For a single session, it is very common to use only one type of loader for a single
folder with all your data. Hence, the module provides a way to set a default loader for
a session. This is done using the set_loader()
function. The same can be done for
the data directory using the set_data_dir()
function.
For instructions on how to write a custom loader, see erlab.io.dataloader.
Examples
View all registered loaders:
>>> erlab.io.loaders
Load data by explicitly specifying the loader:
>>> dat = erlab.io.loaders["merlin"].load(...)
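Set a default loader and data directory for the session, then load a scan by number (the path and scan number are illustrative):
>>> erlab.io.set_loader("merlin")
>>> erlab.io.set_data_dir("/path/to/data")
>>> dat = erlab.io.load(1)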
- erlab.io.load(data_dir=None, **kwargs)[source]¶
Load ARPES data.
- Parameters:
identifier – Value that identifies a scan uniquely. If a string or path-like object is given, it is assumed to be the path to the data file. If an integer is given, it is assumed to be a number that specifies the scan number, and is used to automatically determine the path to the data file(s).
data_dir (str | os.PathLike | None) – Where to look for the data. If
None
, the default data directory will be used.single – For some setups, data for a single scan is saved over multiple files. This argument is only used for such setups. When
identifier
is resolved to a single file within a multiple file scan, the default behavior whensingle
isFalse
is to return a single concatenated array that contains data from all files in the same scan. Ifsingle
is set toTrue
, only the data from the file given is returned. This argument is ignored whenidentifier
is a number.**kwargs – Additional keyword arguments are passed to
identify
.
- Returns:
The loaded data.
- Return type:
xarray.DataArray
orxarray.Dataset
orlist
ofxarray.DataArray
- erlab.io.load_experiment(filename, folder=None, *, prefix=None, ignore=None, recursive=False, **kwargs)[source]¶
Load waves from an igor experiment (
pxp
) file.- Parameters:
folder (str | None) – Target folder within the experiment, given as a slash-separated string. If
None
, defaults to the root.prefix (str | None) – If given, only include waves with names that starts with the given string.
recursive (bool) – If
True
, includes waves in child directories.**kwargs – Extra arguments to
load_wave()
.
- Returns:
Dataset containing the waves.
- Return type:
- erlab.io.load_hdf5(filename, **kwargs)[source]¶
Load data from an HDF5 file saved with
save_as_hdf5
.This is a thin wrapper around
xarray.load_dataarray
andxarray.load_dataset
.- Parameters:
**kwargs – Extra arguments to
xarray.load_dataarray
orxarray.load_dataset
.
- Returns:
The loaded data.
- Return type:
- erlab.io.load_wave(wave, data_dir=None)[source]¶
Load a wave from Igor binary format.
- Parameters:
wave (dict | WaveRecord | str | PathLike) – The wave to load. It can be provided as a dictionary, an instance of
igor2.record.WaveRecord
, or a string representing the path to the wave file.data_dir (str | PathLike | None) – The directory where the wave file is located. This parameter is only used if
wave
is a string orPathLike
object. IfNone
,wave
must be a valid path.
- Returns:
The loaded wave.
- Return type:
- Raises:
ValueError – If the wave file cannot be found or loaded.
TypeError – If the wave argument is of an unsupported type.
- erlab.io.loader_context(data_dir=None)[source]¶
Context manager for the current data loader and data directory.
- Parameters:
loader (
str
, optional) – The name or alias of the loader to use in the context.data_dir (
str
oros.PathLike
, optional) – The data directory to use in the context.
Examples
Load data within a context manager:
>>> with erlab.io.loader_context("merlin"): ... dat_merlin = erlab.io.load(...)
Load data with different loaders and directories:
>>> erlab.io.set_loader("ssrl52", data_dir="/path/to/dir1") >>> dat_ssrl_1 = erlab.io.load(...) >>> with erlab.io.loader_context("merlin", data_dir="/path/to/dir2"): ... dat_merlin = erlab.io.load(...) >>> dat_ssrl_2 = erlab.io.load(...)
- erlab.io.open_hdf5(filename, **kwargs)[source]¶
Open data from an HDF5 file saved with
save_as_hdf5
.This is a thin wrapper around
xarray.open_dataarray
andxarray.open_dataset
.- Parameters:
**kwargs – Extra arguments to
xarray.open_dataarray
orxarray.open_dataset
.
- Returns:
The opened data.
- Return type:
- erlab.io.save_as_hdf5(data, filename, igor_compat=True, **kwargs)[source]¶
Save data in
HDF5
format.- Parameters:
data (DataArray | Dataset) –
xarray.DataArray
to save.igor_compat (bool) – (Experimental) Make the resulting file compatible with Igor’s
HDF5OpenFile
for DataArrays with up to 4 dimensions. A convenient Igor procedure is included in the repository. Default isTrue
.**kwargs – Extra arguments to
xarray.DataArray.to_netcdf
: refer to thexarray
documentation for a list of all possible arguments.
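Example
A minimal illustrative call (the file name is arbitrary):
>>> erlab.io.save_as_hdf5(darr, "data.h5")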
- erlab.io.save_as_netcdf(data, filename, **kwargs)[source]¶
Save data in
netCDF4
format.Discards invalid
netCDF4
attributes and produces a warning.- Parameters:
data (DataArray) –
xarray.DataArray
to save.**kwargs – Extra arguments to
xarray.DataArray.to_netcdf
: refer to thexarray
documentation for a list of all possible arguments.
- erlab.io.set_data_dir(data_dir)[source]¶
Set the default data directory for the data loader.
All subsequent calls to
load
will use thedata_dir
set here unless specified.Note
This will only affect
load
. If the loader’sload
method is called directly, it will not use the default data directory.
- erlab.io.set_loader(loader)[source]¶
Set the current data loader.
All subsequent calls to
load
will use the loader set here.- Parameters:
loader (str | LoaderBase | None) – The loader to set. It can be either a string representing the name or alias of the loader, or a valid loader class.
Example
>>> erlab.io.set_loader("merlin") >>> dat_merlin_1 = erlab.io.load(...) >>> dat_merlin_2 = erlab.io.load(...)
- erlab.io.summarize(usecache=True, *, cache=True, display=True, **kwargs)[source]¶
Summarize the data in the given directory.
Takes a path to a directory and summarizes the data in the directory to a table, much like a log file. This is useful for quickly inspecting the contents of a directory.
The dataframe is formatted using the style from
get_styler
and displayed in the IPython shell. Results are cached in a pickle file in the directory.- Parameters:
data_dir – Directory to summarize.
usecache (bool) – Whether to use the cached summary if available. If
False
, the summary will be regenerated. The cache will be updated ifcache
isTrue
.cache (bool) – Whether to cache the summary in a pickle file in the directory. If
False
, no cache will be created or updated. Note that existing cache files will not be deleted, and will be used ifusecache
isTrue
.display (bool) – Whether to display the formatted dataframe using the IPython shell. If
False
, the dataframe will be returned without formatting. IfTrue
but the IPython shell is not detected, the dataframe styler will be returned.**kwargs – Additional keyword arguments to be passed to
generate_summary
.
- Returns:
df – Summary of the data in the directory.
If
display
isFalse
, the summary DataFrame is returned.If
display
isTrue
and the IPython shell is detected, the summary will be displayed, andNone
will be returned.If
ipywidgets
is installed, an interactive widget will be returned instead ofNone
.
If
display
isTrue
but the IPython shell is not detected, the styler for the summary DataFrame will be returned.
- Return type:
- erlab.io.loaders¶
Global instance of
LoaderRegistry
.
Plotting (erlab.plotting)¶
Everything related to plotting.
Modules¶
Interactive (erlab.interactive)¶
Interactive plotting based on Qt and pyqtgraph.
Interactive tools¶
Interactive data browser. |
|
Functions for manipulating colors in Qt. |
|
Interactive momentum conversion tool. |
|
Interactive tool for visualizing dispersive data. |
|
Various helper functions and extensions to pyqtgraph. |
- erlab.interactive.goldtool(data, data_corr=None, *, data_name=None, **kwargs)[source]¶
Interactive gold edge fitting.
- Parameters:
data (DataArray) – The data to perform Fermi edge fitting on.
data_corr (DataArray | None) – The data to correct with the edge. Defaults to
data
.data_name (str | None) – Name of the data used in generating the code snipped copied to the clipboard. Overrides automatic detection.
**kwargs – Arguments passed onto
erlab.interactive.utilities.AnalysisWindow
.
- erlab.interactive.itool(data, link=False, link_colors=True, execute=None, **kwargs)[source]¶
Create and display an ImageTool window.
- Parameters:
data (Collection[xr.DataArray | npt.NDArray] | xr.DataArray | npt.NDArray | xr.Dataset) – Array-like object or a sequence of such objects with 2 to 4 dimensions. See notes.
link (bool) – Whether to enable linking between multiple ImageTool windows, by default
False
.link_colors (bool) – Whether to link the color maps between multiple linked ImageTool windows, by default
True
.execute (bool | None) – Whether to execute the Qt event loop and display the window, by default
None
. IfNone
, the execution is determined based on the current IPython shell.**kwargs – Additional keyword arguments to be passed onto the underlying slicer area. For a full list of supported arguments, see the
erlab.interactive.imagetool.core.ImageSlicerArea
documentation.
- Returns:
The created ImageTool window(s).
- Return type:
ImageTool
orlist
ofImageTool
Notes
If
data
is a sequence of valid data, multiple ImageTool windows will be created and displayed.If
data
is a Dataset, each DataArray in the Dataset will be displayed in a separate ImageTool window. Data variables with 2 to 4 dimensions are considered as valid. Other variables are ignored.If
link
is True, the ImageTool windows will be synchronized.
Examples
>>> itool(data, cmap="gray", gamma=0.5) >>> itool(data_list, link=True)
Accessors (erlab.accessors)¶
Some xarray accessors for convenient data analysis and visualization.
Modules¶
- class erlab.accessors.InteractiveDataArrayAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.qshow
accessor for interactive visualization.- __call__(*args, **kwargs)[source]¶
Visualize the data interactively.
Chooses the appropriate interactive visualization method based on the number of dimensions in the data.
- Parameters:
*args – Positional arguments passed onto the interactive visualization function.
**kwargs – Keyword arguments passed onto the interactive visualization function.
- hvplot(*args, **kwargs)[source]¶
hvplot
-based interactive visualization.- Parameters:
*args – Positional arguments passed onto
DataArray.hvplot()
.**kwargs – Keyword arguments passed onto
DataArray.hvplot()
.
- Raises:
ImportError – If
hvplot
is not installed.
- class erlab.accessors.InteractiveDatasetAccessor(xarray_obj)[source]¶
Bases:
ERLabDatasetAccessor
xarray.Dataset.qshow
accessor for interactive visualization.- __call__(*args, **kwargs)[source]¶
Visualize the data interactively.
Chooses the appropriate interactive visualization method based on the data variables.
- Parameters:
*args – Positional arguments passed onto the interactive visualization function.
**kwargs – Keyword arguments passed onto the interactive visualization function.
- hvplot(*args, **kwargs)[source]¶
hvplot
-based interactive visualization.- Parameters:
*args – Positional arguments passed onto
Dataset.hvplot()
.**kwargs – Keyword arguments passed onto
Dataset.hvplot()
.
- Raises:
ImportError – If
hvplot
is not installed.
- class erlab.accessors.ModelFitDataArrayAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.modelfit
accessor for fitting lmfit models.- __call__(*args, **kwargs)[source]¶
Curve fitting optimization for arbitrary models.
Wraps
lmfit.Model.fit
withxarray.apply_ufunc()
.- Parameters:
coords (
Hashable
,xarray.DataArray
, orSequence
ofHashable
orxarray.DataArray
) – Independent coordinate(s) over which to perform the curve fitting. Must share at least one dimension with the calling object. When fitting multi-dimensional functions, supplycoords
as a sequence in the same order as arguments infunc
. To fit along existing dimensions of the calling object,coords
can also be specified as a str or sequence of strs.model (
lmfit.Model
) – A model object to fit to the data. The model must be an instance oflmfit.Model
.reduce_dims (
str
,Iterable
ofHashable
orNone
, optional) – Additional dimension(s) over which to aggregate while fitting. For example, callingds.modelfit(coords='time', reduce_dims=['lat', 'lon'], ...)
will aggregate all lat and lon points and fit the specified function along the time dimension.skipna (
bool
, default:True
) – Whether to skip missing values when fitting. Default is True.params (
lmfit.Parameters
,dict-like
, orxarray.DataArray
, optional) – Optional input parameters to the fit. If almfit.Parameters
object, it will be used for all fits. If a dict-like object, it must look like the keyword arguments tolmfit.create_params
. Additionally, each value of the dictionary may also be a DataArray, which will be broadcasted appropriately. If a DataArray, each entry must be a dictionary-like object, almfit.Parameters
object, or a JSON string created withlmfit.Parameters.dumps
. If given a Dataset, the name of the data variables in the Dataset must match the name of the data variables in the calling object, and each data variable will be used as the parameters for the corresponding data variable. If none or only some parameters are passed, the rest will be assigned initial values and bounds withlmfit.Model.make_params
, or guessed withlmfit.Model.guess
ifguess
isTrue
.guess (
bool
, default:False
) – Whether to guess the initial parameters withlmfit.Model.guess
. For composite models, the parameters will be guessed for each component.errors (
{"raise", "ignore"}
, default:"raise"
) – If'raise'
, any errors from thelmfit.Model.fit
optimization will raise an exception. If'ignore'
, the return values for the coordinates where the fitting failed will be NaN.parallel (
bool
, optional) – Whether to parallelize the fits over the data variables. If not provided, parallelization is only applied for non-dask Datasets with more than 200 data variables.parallel_kw (
dict
, optional) – Additional keyword arguments to pass to the parallelization backendjoblib.Parallel
ifparallel
isTrue
.progress (
bool
, default:False
) – Whether to show a progress bar for fitting over data variables. Only useful if there are multiple data variables to fit.output_result (
bool
, default:True
) – Whether to include the fulllmfit.model.ModelResult
object in the output dataset. IfTrue
, the result will be stored in a variable namedmodelfit_results
.**kwargs (optional) – Additional keyword arguments to passed to
lmfit.Model.fit
.
- Returns:
curvefit_results – A single dataset which contains:
- modelfit_results
The full lmfit.model.ModelResult object from the fit. Only included if output_result is True.
- modelfit_coefficients
The coefficients of the best fit.
- modelfit_stderr
The standard errors of the coefficients.
- modelfit_covariance
The covariance matrix of the coefficients. Note that elements corresponding to non-varying parameters are set to NaN, and the actual size of the covariance matrix may be smaller than the stored array.
- modelfit_stats
Statistics from the fit. See lmfit.minimize.
- modelfit_data
Data used for the fit.
- modelfit_best_fit
The model evaluated at the best-fit parameters.
- Return type:
xarray.Dataset
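A minimal usage sketch (the data, coordinate names, and model below are illustrative and not part of the accessor API; importing erlab.accessors is assumed to register the accessor):
import lmfit.models
import numpy as np
import xarray as xr

import erlab.accessors  # noqa: F401  (assumed to register the .modelfit accessor)

# Illustrative data: a noisy Gaussian peak measured at several "alpha" values
eV = np.linspace(-1.0, 0.0, 200)
alpha = np.linspace(-5.0, 5.0, 5)
darr = xr.DataArray(
    np.exp(-((eV[None, :] + 0.5) ** 2) / 0.01) + 0.1 * np.random.rand(5, 200),
    coords={"alpha": alpha, "eV": eV},
    dims=("alpha", "eV"),
)

# One Gaussian fit per "alpha" value, fitting along the "eV" coordinate
result = darr.modelfit(coords="eV", model=lmfit.models.GaussianModel(), guess=True)
result["modelfit_coefficients"]  # best-fit parameters for each "alpha"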
- class erlab.accessors.ModelFitDatasetAccessor(xarray_obj)[source]¶
Bases:
ERLabDatasetAccessor
xarray.Dataset.modelfit accessor for fitting lmfit models.
- __call__(coords, model, reduce_dims=None, skipna=True, params=None, guess=False, errors='raise', parallel=None, parallel_kw=None, progress=False, output_result=True, **kwargs)[source]¶
Curve fitting optimization for arbitrary models.
Wraps lmfit.Model.fit with xarray.apply_ufunc().
- Parameters:
coords (Hashable, xarray.DataArray, or Sequence of Hashable or xarray.DataArray) – Independent coordinate(s) over which to perform the curve fitting. Must share at least one dimension with the calling object. When fitting multi-dimensional functions, supply coords as a sequence in the same order as arguments in func. To fit along existing dimensions of the calling object, coords can also be specified as a str or sequence of strs.
model (lmfit.Model) – A model object to fit to the data. The model must be an instance of lmfit.Model.
reduce_dims (str, Iterable of Hashable or None, optional) – Additional dimension(s) over which to aggregate while fitting. For example, calling ds.modelfit(coords='time', reduce_dims=['lat', 'lon'], ...) will aggregate all lat and lon points and fit the specified function along the time dimension.
skipna (bool, default: True) – Whether to skip missing values when fitting. Default is True.
params (lmfit.Parameters, dict-like, or xarray.DataArray, optional) – Optional input parameters to the fit. If an lmfit.Parameters object, it will be used for all fits. If a dict-like object, it must look like the keyword arguments to lmfit.create_params. Additionally, each value of the dictionary may also be a DataArray, which will be broadcasted appropriately. If a DataArray, each entry must be a dictionary-like object, an lmfit.Parameters object, or a JSON string created with lmfit.Parameters.dumps. If given a Dataset, the name of the data variables in the Dataset must match the name of the data variables in the calling object, and each data variable will be used as the parameters for the corresponding data variable. If none or only some parameters are passed, the rest will be assigned initial values and bounds with lmfit.Model.make_params, or guessed with lmfit.Model.guess if guess is True.
guess (bool, default: False) – Whether to guess the initial parameters with lmfit.Model.guess. For composite models, the parameters will be guessed for each component.
errors ({"raise", "ignore"}, default: "raise") – If 'raise', any errors from the lmfit.Model.fit optimization will raise an exception. If 'ignore', the return values for the coordinates where the fitting failed will be NaN.
parallel (bool, optional) – Whether to parallelize the fits over the data variables. If not provided, parallelization is only applied for non-dask Datasets with more than 200 data variables.
parallel_kw (dict, optional) – Additional keyword arguments to pass to the parallelization backend joblib.Parallel if parallel is True.
progress (bool, default: False) – Whether to show a progress bar for fitting over data variables. Only useful if there are multiple data variables to fit.
output_result (bool, default: True) – Whether to include the full lmfit.model.ModelResult object in the output dataset. If True, the result will be stored in a variable named [var]_modelfit_results.
**kwargs (optional) – Additional keyword arguments to be passed to lmfit.Model.fit.
- Returns:
curvefit_results – A single dataset which contains:
- [var]_modelfit_results
The full lmfit.model.ModelResult object from the fit. Only included if output_result is True.
- [var]_modelfit_coefficients
The coefficients of the best fit.
- [var]_modelfit_stderr
The standard errors of the coefficients.
- [var]_modelfit_covariance
The covariance matrix of the coefficients. Note that elements corresponding to non-varying parameters are set to NaN, and the actual size of the covariance matrix may be smaller than the stored array.
- [var]_modelfit_stats
Statistics from the fit. See lmfit.minimize.
- [var]_modelfit_data
Data used for the fit.
- [var]_modelfit_best_fit
The model evaluated at the best-fit parameters.
- Return type:
xarray.Dataset
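A brief sketch for the Dataset case, assuming ds is an xarray.Dataset whose data variables are spectra sharing an "eV" dimension (the names are illustrative):
import lmfit.models

result = ds.modelfit(
    coords="eV",
    model=lmfit.models.LorentzianModel(),
    guess=True,     # let lmfit.Model.guess pick initial parameters
    progress=True,  # show a progress bar over the data variables
)
# Output variables are prefixed per data variable, e.g. "spectrum_modelfit_coefficients"
# if the Dataset contains a variable named "spectrum".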
- class erlab.accessors.MomentumAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.kspace accessor for momentum conversion related utilities.
This class provides convenient access to various momentum-related properties of a data object. It allows getting and setting properties such as configuration, inner potential, work function, angle resolution, slit axis, momentum axes, angle parameters, and offsets.
- convert(bounds=None, resolution=None, *, silent=False, **coords)[source]¶
Convert to momentum space.
- Parameters:
bounds (dict[str, tuple[float, float]] | None) – A dictionary specifying the bounds for each coordinate axis. The keys are the names of the axes, and the values are tuples representing the lower and upper bounds of the axis. If not provided, the bounds will be estimated based on the data.
resolution (dict[str, float] | None) – A dictionary specifying the resolution for each momentum axis. The keys are the names of the axes, and the values are floats representing the desired resolution of the axis. If not provided, the resolution will be estimated based on the data. For in-plane momentum, the resolution is estimated from the angle resolution and kinetic energy. For out-of-plane momentum, two values are calculated. One is based on the number of photon energy points, and the other is estimated as the inverse of the photoelectron inelastic mean free path given by the universal curve. The resolution is estimated as the smaller of the two values.
silent (bool) – If True, suppresses printing. Default is False.
**coords – Array-like keyword arguments that specify the coordinate array for each momentum axis. If provided, the bounds and resolution will be ignored.
- Returns:
The converted data.
- Return type:
xarray.DataArray
Note
This method converts the data to a new coordinate system specified by the provided bounds and resolution. It uses interpolation to map the data from the original coordinate system to the new one.
The converted data is returned as a DataArray object with updated coordinates and dimensions.
Examples
Set parameters and convert with automatic bounds and resolution:
data.kspace.offsets = {"delta": 0.1, "xi": 0.0, "beta": 0.3}
data.kspace.work_function = 4.3
data.kspace.inner_potential = 12.0
converted_data = data.kspace.convert()
Convert with specified bounds and resolution:
bounds = {"kx": (0.0, 1.0), "ky": (-1.0, 1.0)}
resolution = {"kx": 0.01, "ky": 0.01}
converted_data = data.kspace.convert(bounds, resolution)
- convert_coords()[source]¶
Convert the coordinates to momentum space.
Assigns new exact momentum coordinates to the data. This is useful when you want to work with momentum coordinates but don’t want to interpolate the data.
- Returns:
The DataArray with transformed coordinates.
- Return type:
- estimate_bounds()[source]¶
Estimate the bounds of the data in momentum space.
- Returns:
bounds – A dictionary containing the estimated bounds for each parameter. The keys of the dictionary are ‘kx’, ‘ky’, and ‘kz’ (for \(hν\)-dependent data). The values are tuples representing the minimum and maximum values.
- Return type:
dict[str, tuple[float, float]]
- estimate_resolution(axis, lims=None, from_numpoints=False)[source]¶
Estimate the resolution for a given momentum axis.
- Parameters:
axis (Literal['kx', 'ky', 'kz']) – Axis to estimate the resolution for.
lims (tuple[float, float] | None) – The limits of the axis used when from_numpoints is True. If not provided, reasonable limits will be calculated by estimate_bounds(). Default is None.
from_numpoints (bool) – If True, estimate the resolution from the number of points in the relevant axis. If False, estimate the resolution based on the data. Default is False.
- Returns:
The estimated resolution.
- Return type:
float
- Raises:
ValueError – If no photon energy axis is found in the data for axis 'kz'.
- hv_to_kz(hv)[source]¶
Return \(k_z\) for a given photon energy.
Useful when creating overlays on \(hν\)-dependent data.
- Parameters:
hv (float) – Photon energy in eV.
Note
This will be inexact for hv-dependent cuts that do not pass through the BZ center, since the exact angle values, i.e., the exact momentum perpendicular to the slit, are lost during momentum conversion.
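A minimal usage sketch (the photon energy value is illustrative):
# kz corresponding to hv = 100 eV, e.g. for drawing an overlay on hv-dependent data
kz_100 = data.kspace.hv_to_kz(100.0)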
- property angle_params: dict[str, float]¶
Parameters passed to
erlab.analysis.kspace.get_kconv_func()
.
- property angle_resolution: float¶
Retrieve the angular resolution of the data in degrees.
Checks for the angle_resolution attribute of the data. If not found, a default value of 0.1° is silently assumed.
This property is used in best_kp_resolution upon estimating momentum step sizes through estimate_resolution.
- property best_kp_resolution: float¶
Estimate the minimum in-plane momentum resolution.
The resolution is estimated with the kinetic energy and angular resolution:
\[\Delta k_{\parallel} \sim \sqrt{2 m_e E_k/\hbar^2} \cos(\alpha) \Delta\alpha\]
- property best_kz_resolution: float¶
Estimate the minimum out-of-plane momentum resolution.
The resolution is estimated based on the mean free path [8] and the kinetic energy.
\[\Delta k_z \sim 1/\lambda\]
- property configuration: AxesConfiguration¶
Return the experimental configuration.
For a properly implemented data loader, the configuration attribute must be set on data import. See
erlab.analysis.kspace.AxesConfiguration
for details.
- property has_beta: bool¶
Check if the coordinate array for \(β\) has more than one element.
- Returns:
Returns True if the size of the coordinate array for \(β\) is greater than 1, False otherwise.
- Return type:
bool
Note
This may be
True
for data that are not maps. For instance, \(hν\)-dependent cuts with an in-plane momentum offset may have a \(hν\)-dependent \(β\) offset.
- property inner_potential: float¶
Inner potential of the sample in eV.
The inner potential is stored in the inner_potential attribute of the data. If the inner potential is not set, a warning is issued and a default value of 10.0 eV is assumed.
Note
This property provides a setter method that takes a float value and sets the data attribute accordingly.
Example
>>> data.kspace.inner_potential = 13.0
>>> data.kspace.inner_potential
13.0
- property momentum_axes: tuple[Literal['kx', 'ky', 'kz'], ...]¶
Return the momentum axes of the data after conversion.
- Returns:
For photon energy dependent scans, it returns the slit axis and 'kz'. For maps, it returns 'kx' and 'ky'. Otherwise, it returns only the slit axis.
- Return type:
tuple[Literal['kx', 'ky', 'kz'], ...]
- property offsets: OffsetView¶
Angle offsets used in momentum conversion.
- Returns:
A mapping between valid offset keys and their corresponding offsets.
- Return type:
Examples
View all offsets
>>> data.kspace.offsets
{'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
View single offset
>>> data.kspace.offsets["beta"]
0.0
Offsets to dictionary
>>> dict(data.kspace.offsets)
{'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
Set single offset
>>> data.kspace.offsets["beta"] = 3.0
>>> data.kspace.offsets
{'delta': 0.0, 'xi': 0.0, 'beta': 3.0}
Overwrite offsets with dictionary
>>> data.kspace.offsets = dict(delta=1.5, xi=2.7)
>>> data.kspace.offsets
{'delta': 1.5, 'xi': 2.7, 'beta': 0.0}
Update offsets
>>> data.kspace.offsets.update(beta=0.1, xi=0.0)
{'delta': 1.5, 'xi': 0.0, 'beta': 0.1}
Reset all offsets
>>> data.kspace.offsets.reset()
{'delta': 0.0, 'xi': 0.0, 'beta': 0.0}
- property other_axis: Literal['kx', 'ky']¶
Return the momentum axis perpendicular to the slit.
- Returns:
Returns 'ky' for type 1 configurations, 'kx' otherwise.
- Return type:
Literal['kx', 'ky']
- property slit_axis: Literal['kx', 'ky']¶
Return the momentum axis parallel to the slit.
- Returns:
Returns 'kx' for type 1 configurations, 'ky' otherwise.
- Return type:
Literal['kx', 'ky']
- property valid_offset_keys: tuple[str, ...]¶
Get valid offset angles based on the experimental configuration.
- Returns:
A tuple containing the valid offset keys. For configurations with a deflector, returns ("delta", "chi", "xi"). Otherwise, returns ("delta", "xi", "beta").
- Return type:
tuple[str, ...]
- property work_function: float¶
Work function of the sample in eV.
The work function is stored in the sample_workfunction attribute of the data. If not found, a warning is issued and a default value of 4.5 eV is assumed.
Note
This property provides a setter method that takes a float value and sets the data attribute accordingly.
Example
>>> data.kspace.work_function = 4.5
>>> data.kspace.work_function
4.5
- class erlab.accessors.OffsetView(xarray_obj)[source]¶
Bases:
object
A class representing an offset view for an xarray.DataArray.
This class provides a convenient way to access and manipulate angle offsets associated with the given data.
- Parameters:
xarray_obj (DataArray) – The
xarray.DataArray
for which the offset view is created.
- __len__() int: ¶
Returns the number of valid offset keys.
- __iter__() Iterator[str, float]: [source]¶
Returns an iterator over the valid offset keys and their corresponding values.
- class erlab.accessors.ParallelFitDataArrayAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.parallel_fit accessor for fitting lmfit models in parallel.
- __call__(dim, model, **kwargs)[source]¶
Fit the specified model to the data along the given dimension.
- Parameters:
dim (str) – The name of the dimension along which to fit the model.
model (lmfit.Model) – The model to fit.
**kwargs (dict) – Additional keyword arguments to be passed to xarray.Dataset.modelfit.
- Returns:
curvefit_results – The dataset containing the results of the fit. See xarray.DataArray.modelfit for details.
- Return type:
xarray.Dataset
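A minimal sketch, assuming data is a multidimensional DataArray and using a standard lmfit model (the dimension name is illustrative):
import lmfit.models

# Fit a Lorentzian model in parallel; see xarray.DataArray.modelfit for the output format
result = data.parallel_fit("alpha", lmfit.models.LorentzianModel())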
- class erlab.accessors.PlotAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.qplot accessor for plotting data.
- __call__(*args, **kwargs)[source]¶
Plot the data.
If a 2D data array is provided, it is plotted using plot_array. Otherwise, it is equivalent to calling xarray.DataArray.plot().
- Parameters:
*args – Positional arguments to be passed to the plotting function.
**kwargs – Keyword arguments to be passed to the plotting function.
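A short sketch (assumes matplotlib is available and data is a 2D DataArray; the keyword argument shown is simply forwarded to the underlying plotting function):
import matplotlib.pyplot as plt

data.qplot(cmap="viridis")  # 2D data is dispatched to plot_array
plt.show()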
- class erlab.accessors.SelectionAccessor(xarray_obj)[source]¶
Bases:
ERLabDataArrayAccessor
xarray.DataArray.qsel accessor for convenient selection and averaging.
- __call__(indexers=None, *, verbose=False, **indexers_kwargs)[source]¶
Select and average data along specified dimensions.
- Parameters:
indexers (Mapping[Hashable, float | slice] | None) –
Dictionary specifying the dimensions and their values or slices. Position along a dimension can be specified in three ways:
As a scalar value: alpha=-1.2
If no width is specified, the data is selected at the nearest value. It is equivalent to xarray.DataArray.sel with method='nearest'.
As a value and width: alpha=5, alpha_width=0.5
The data is averaged over a slice of width alpha_width, centered at alpha.
As a slice: alpha=slice(-10, 10)
The data is selected over the specified slice. No averaging is performed.
One of indexers or indexers_kwargs must be provided.
verbose (bool) – If True, print information about the selected data and the averaging process. Default is False.
**indexers_kwargs – The keyword arguments form of indexers. One of indexers or indexers_kwargs must be provided.
- Returns:
The selected and averaged data.
- Return type:
- Raises:
ValueError – If a specified dimension is not present in the data.
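A usage sketch mirroring the three selection modes above (the dimension names alpha and eV are illustrative):
# Nearest-value selection, no averaging
edc = data.qsel(alpha=-1.2)

# Average over a 0.05 eV wide window centered at eV = 0.0
fermi = data.qsel(eV=0.0, eV_width=0.05)

# Plain slice selection, no averaging
cut = data.qsel(alpha=slice(-10, 10))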
Submodules¶
Submodule | Description
---|---
erlab.lattice | Tools for working with real and reciprocal lattices.
erlab.constants | Physical constants and functions for unit conversion.
erlab.parallel | Helpers for parallel processing.
Lattices (erlab.lattice)¶
Tools related to the real and reciprocal lattice.
Functions
- Return the angle between two vectors.
- Construct lattice vectors from lattice parameters.
- Determine lattice parameters from lattice vectors.
- Construct the reciprocal lattice vectors from real lattice vectors.
- Construct the real lattice vectors from reciprocal lattice vectors.
- erlab.lattice.abc2avec(a, b, c, alpha, beta, gamma)[source]¶
Construct lattice vectors from lattice parameters.
- erlab.lattice.to_real(bvec)[source]¶
Construct the real lattice vectors from reciprocal lattice vectors.
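A small sketch for constructing lattice vectors from lattice parameters (a hexagonal cell is assumed, and the angles are assumed to be in degrees; check the convention against your own data):
import erlab.lattice

# a = b = 3.0 Å, c = 5.0 Å, α = β = 90°, γ = 120°
avec = erlab.lattice.abc2avec(3.0, 3.0, 5.0, 90.0, 90.0, 120.0)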
Constants (erlab.constants)¶
Physical constants and unit conversion.
Module Attributes
- Elementary charge \(e\) (C)
- Speed of light \(c\) (m/s)
- Electron mass \(m_e\) (kg) (±0.000 000 0028e-31)
- Electron rest energy \(m_e c^2\) (J) (±0.000 000 0025e-14)
- Electron rest energy \(m_e c^2\) (eV) (±0.000 15)
- Planck constant \(h\) (J·s)
- \(hc\) (J·m)
- Dirac constant \(\hbar\) (J·s)
- \(\hbar c\) (J·m)
- \(\hbar^2\) (J²·s²)
- Planck constant \(h\) (eV·s)
- \(hc\) (eV·m)
- Dirac constant \(\hbar\) (eV·s)
- \(\hbar^2\) (eV²·s²)
- \(hc\) (eV·nm), used in energy-wavelength conversion
- \(\frac{\sqrt{2 m_e}}{\hbar}\), used in momentum conversion
- \(\frac{\hbar^2}{2 m_e}\), used in momentum conversion
- Boltzmann constant \(k_B\) (J/K)
- Boltzmann constant \(k_B\) (eV/K)
Functions
- Convert between energy and wavelength.
- Convert Watts to photons per second.
- erlab.constants.mcsq: float = 8.1871057769e-14¶
Electron rest energy \(m_e c^2\) (J) (±0.000 000 0025e-14)
- erlab.constants.rel_eV_nm: float = 1239.8419843320028¶
\(hc\) (eV·nm), Used in energy-wavelength conversion
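For example, rel_eV_nm gives the photon wavelength from the photon energy via \(λ = hc/E\):
import erlab.constants

hv = 21.2  # He-I photon energy in eV
wavelength_nm = erlab.constants.rel_eV_nm / hv  # ≈ 58.5 nm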
Parallel processing (erlab.parallel)¶
Helper functions for parallel processing.
Functions
- Patches joblib to report into a tqdm progress bar.
- Context manager for interactive windows.
Contributing Guide¶
Note
Parts of this document are based on Contributing to pandas and Contributing to xarray.
We welcome your enthusiasm! All contributions, including bug reports, bug fixes, documentation improvements, enhancement suggestions, and other ideas are welcome.
If you have any questions, feel free to ask us! The recommended place to ask questions is GitHub Discussions.
Bug reports and enhancement requests¶
If you find a bug in the code or documentation, do not hesitate to submit a ticket to the Issue Tracker. You are also welcome to post feature requests or pull requests.
When reporting a bug, see this stackoverflow article for tips on writing a good bug report, and this article on minimal bug reports.
Creating a development environment¶
First, you will need to install git and conda (or mamba).
Installing git¶
Below are some quick instructions for installing git on various operating systems. For more detailed instructions, see the git installation guide.
macOS (Intel & ARM): get Xcode Command Line Tools by running in your terminal window:
xcode-select --install
Windows 10 1709 (build 16299) or later: run in command prompt or PowerShell:
winget install --id Git.Git -e --source winget
If you are new to contributing to projects through forking on GitHub, take a look at the GitHub documentation for contributing to projects. GitHub provides a quick tutorial using a test repository that may help you become more familiar with forking a repository, cloning a fork, creating a feature branch, pushing changes and making pull requests.
Below are some useful resources for learning more about forking and pull requests on GitHub:
Cloning the repository¶
Create an account on GitHub if you do not already have one.
You will need your own copy of erlabpy (aka fork) to work on the code. Go to the erlabpy repository and hit the
Fork button near the top of the page. This creates a copy of the code under your account on the GitHub server.
Clone your fork to your machine:
git clone https://github.com/your-user-name/erlabpy.git
cd erlabpy
git remote add upstream https://github.com/kmnhan/erlabpy.git
This creates the directory
erlabpy
and connects your repository to the upstream (main project) erlabpy repository.
Installing conda¶
Before starting any development, you’ll need to create an isolated environment under a package manager like conda. If you don’t have conda installed, install conda or install mamba.
Editable installation from source¶
An editable installation allows you to make changes to the code and see the changes reflected in the package without having to reinstall it. Before installing:
Make sure you have cloned the repository.
Make sure you have installed conda or mamba.
cd to the erlabpy source directory.
Create and activate a mamba (or conda) environment.
Note
Replace <envname> with the environment name you prefer.
Hint
If using conda, replace mamba with conda.
mamba env create -f environment.yml -n <envname>
mamba activate <envname>
Install the repository.
Note
The editable_mode=compat setting enables static analysis tools to work with the package. See this issue for more information.
pip install -e ".[dev]" --config-settings editable_mode=compat
Updating the editable installation¶
For minor updates with editable installs, it is sufficient to just update the main branch.
When there are changes to the dependencies, you should also update the environment:
Hint
If using conda, replace mamba with conda.
mamba env update -f environment.yml -n <envname>
In case of major changes, it is recommended to rebuild the package.
mamba activate <envname>
pip install -e . --force-reinstall --no-deps --config-settings editable_mode=compat
Development workflow¶
Before starting any development, make sure you have created a local development environment.
Update the main branch¶
Before starting a new set of changes, fetch all changes from upstream/main
, and
start a new feature branch from that. From time to time you should fetch the upstream
changes from GitHub:
git fetch upstream
git merge upstream/main
This will combine your commits with the latest erlabpy git main
. If this leads to
merge conflicts, you must resolve these before submitting your pull request. Remember to
follow the commit message guidelines. If you have uncommitted changes, you will need to
git stash
them prior to updating. This will effectively store your changes, which
can be reapplied after updating with git stash apply
.
Create a new feature branch¶
Create a branch to save your changes, even before you start making changes. You want
your main branch
to contain only production-ready code:
git checkout -b shiny-new-feature
This changes your working directory to the shiny-new-feature
branch. Keep any
changes in this branch specific to one bug or feature so it is clear what the branch
brings to erlabpy. You can have many “shiny-new-features” and switch in between them
using the git checkout
command.
Generally, you will want to keep your feature branches on your public GitHub fork of
erlabpy. To do this, you git push
this new branch up to your GitHub repo.
Generally (if you followed the instructions in these pages, and by default), git will
have a link to your fork of the GitHub repo, called origin
. You push up to your own
fork with:
git push origin shiny-new-feature
In git >= 1.7 you can ensure that the link is correctly set by using the
--set-upstream
option:
git push --set-upstream origin shiny-new-feature
From now on git will know that shiny-new-feature
is related to the
shiny-new-feature branch
in the GitHub repo.
The editing workflow¶
Make some changes. Make sure to follow the code standards and the documentation standards.
See which files have changed with git status. You'll see a listing like this one:
# On branch shiny-new-feature
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   README
Check what the actual changes are with
git diff
.Build the documentation for documentation changes. See the documentation section for more information.
Commit and push your changes¶
To commit all modified files into the local copy of your repo, do git commit -am 'A commit message'. Note that erlabpy uses python-semantic-release for versioning, so the commit message must follow the Conventional Commits standard. This will automatically determine the version number for the next release.
To push the changes up to your forked repo on GitHub, do a git push.
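For example, a commit message in the Conventional Commits format looks like git commit -m "fix(io): handle missing attributes" or git commit -m "feat: add a new plotting option" (the scopes and descriptions here are made up); by default, python-semantic-release uses the feat/fix type prefixes to decide whether the next release bumps the minor or patch version.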
Open a pull request¶
When you’re ready or need feedback on your code, open a Pull Request (PR) so that we can
give feedback and eventually include your suggested code into the main
branch. Pull
requests (PRs) on GitHub
are the mechanism for contributing to the code and documentation.
Enter a title for the set of changes with some explanation of what you’ve done. Mention anything you’d like particular attention for - such as a complicated change or some code you are not happy with. If you don’t think your request is ready to be merged, just say so in your pull request message and use the “Draft PR” feature of GitHub. This is a good way of getting some preliminary code review.
Code standards¶
Import sorting, formatting, and linting are enforced with Ruff.
If you wish to contribute, using pre-commit is recommended. This will ensure that your code is properly formatted before you commit it. A pre-commit configuration file for ruff is included in the repository.
When writing code that uses Qt, please adhere to the following rules:
Import all Qt bindings from qtpy, and only import the top level modules:
from qtpy import QtWidgets, QtCore, QtGui
Use fully qualified enum names from Qt6 instead of the short-form enums from Qt5, i.e., QtCore.Qt.CheckState.Checked instead of QtCore.Qt.Checked.
Use the signal and slot syntax from PySide6 (QtCore.Signal and QtCore.Slot instead of QtCore.pyqtSignal and QtCore.pyqtSlot).
When using Qt Designer, place .ui files in the same directory as the Python file that uses them. The files must be imported using the loadUiType function from qtpy.uic. For example:
import os

from qtpy import uic


class MyWidget(*uic.loadUiType(os.path.join(os.path.dirname(__file__), "mywidget.ui"))):
    def __init__(self):
        super().__init__()
        self.setupUi(self)
Please try to add type annotations to your code. This will help with code completion and static analysis.
We are in the process of adding type annotations to the codebase, and most of it should pass mypy except for the io and interactive modules.
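For instance, a small annotated helper might look like the following (an illustrative function, not part of the package):
import xarray as xr


def normalize(darr: xr.DataArray, dim: str = "eV") -> xr.DataArray:
    """Normalize a DataArray by its mean along ``dim``."""
    return darr / darr.mean(dim)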
Documentation¶
The documentation is written in reStructuredText, which is almost like writing in plain English, and built using Sphinx. The Sphinx Documentation has an excellent introduction to reST. Review the Sphinx docs to perform more complex changes to the documentation as well.
Some other important things to know about the docs:
The documentation consists of two parts: the docstrings in the code itself and the docs in this folder
erlabpy/docs/source/.
The docstrings are meant to provide a clear explanation of the usage of the individual functions, while the documentation in this folder consists of tutorial-like overviews per topic together with some other information.
The docstrings follow the NumPy Docstring Standard, which is used widely in the Scientific Python community. This standard specifies the format of the different sections of the docstring. Refer to the documentation for the NumPy docstring format and the Sphinx examples for a detailed explanation and examples, or look at some of the existing functions and follow a similar format.
The documentation is automatically updated by Read the Docs when a new commit is pushed to
main.
Type annotations that follow PEP 484 are recommended in the code and are automatically included in the documentation. Hence, you may omit the type information for well-annotated functions.
We aim to follow the recommendations from the Python documentation and the Sphinx reStructuredText documentation for section markup characters:
* with overline, for chapters
=, for headings
-, for sections
~, for subsections
**text**, for bold text
Building the documentation locally¶
Check whether all documentation dependencies are installed with
pip install -r docs/requirements.txt
or
mamba env update -f docs/environment.yml -n <envname>
then build the documentation by running:
cd docs/
make clean
make html
Then you can find the HTML output files in the folder erlabpy/docs/build/html/
.
To see what the documentation now looks like with your changes, you can view the HTML build locally by opening the files in your local browser. For example, if you normally use Google Chrome as your browser, you could enter:
google-chrome build/html/index.html
in the terminal, running from within the docs/
folder. You should now see a new tab
pop open in your local browser showing the documentation. The different pages of this
local build of the documentation are linked together, so you can browse the whole
documentation by following links the same way you would on the hosted website.
References¶
S. Hoyer and J. Hamman, Xarray: N-D labeled arrays and datasets in Python, J. Open Res. Softw. (2017).
C. Tusche, A. Krasyuk, and J. Kirschner, Spin resolved bandstructure imaging with a high resolution momentum microscope, Ultramicroscopy 159, 520–529 (2015).
Y. Ishida and S. Shin, Functions to map photoelectron distributions in a variety of setups in angle-resolved photoemission spectroscopy, Rev. Sci. Instrum. 89, 043903 (2018).
R. C. Dynes, V. Narayanamurti, and J. P. Garno, Direct measurement of quasiparticle-lifetime broadening in a strong-coupled superconductor, Phys. Rev. Lett. 41, 1509–1512 (1978).
S. Schirra, How reliable are practical point-in-polygon strategies?, in Algorithms - ESA 2008, 2008, edited by D. Halperin and K. Mehlhorn (Springer Berlin Heidelberg, Berlin, Heidelberg, 2008), p. 744–755, doi:10.1007/978-3-540-87744-8_62.
P. Zhang, P. Richard, T. Qian, Y.-M. Xu, X. Dai, and H. Ding, A precise method for visualizing dispersive features in image plots, Rev. Sci. Instrum. 82, 043712 (2011).
Y. He, Y. Wang, and Z.-X. Shen, Visualizing dispersive features in 2d image via minimum gradient method, Rev. Sci. Instrum. 88, 073903 (2017).
M. P. Seah and W. A. Dench, Quantitative electron spectroscopy of surfaces: a standard data base for electron inelastic mean free paths in solids, Surf. Interface Anal. 1, 2–11 (1979).