Data IO (erlab.io)¶
Read & write ARPES data.
This module provides functions that enable loading various files, such as HDF5 files, Igor Pro files, and ARPES data from different beamlines and laboratories.
Modules¶
- Data loading plugins.
- Base functionality for implementing data loaders.
- Generates simple simulated ARPES data for testing purposes.
- Data import for characterization experiments.
For a single session, it is very common to use only one type of loader for a single folder with all your data. Hence, the module provides a way to set a default loader for a session. This is done using the set_loader() function. The same can be done for the data directory using the set_data_dir() function.
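The session defaults behave like module-level state that load() falls back to when no loader or directory is given explicitly. A minimal stdlib-only sketch of that pattern (all names and the return value are illustrative, not erlab internals):

```python
# Session-wide defaults, mirroring the pattern behind
# set_loader() / set_data_dir(). Illustrative only.
_default_loader = None
_default_data_dir = None

def set_loader(name):
    global _default_loader
    _default_loader = name

def set_data_dir(path):
    global _default_data_dir
    _default_data_dir = path

def load(identifier, data_dir=None):
    # Fall back to the session defaults when not given explicitly.
    if _default_loader is None:
        raise RuntimeError("no loader set; call set_loader() first")
    directory = data_dir if data_dir is not None else _default_data_dir
    return f"{_default_loader}:{directory}/{identifier}"

set_loader("merlin")
set_data_dir("/data/session1")
print(load(42))  # merlin:/data/session1/42
```

An explicit `data_dir` argument always wins over the session default, which matches the "unless specified" behavior documented for set_data_dir() below.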
For instructions on how to write a custom loader, see erlab.io.dataloader.
Examples
View all registered loaders:
>>> erlab.io.loaders
Load data by explicitly specifying the loader:
>>> dat = erlab.io.loaders["merlin"].load(...)
- erlab.io.load(identifier, data_dir=None, **kwargs)[source]¶
Load ARPES data.
- Parameters:
  - identifier – Value that identifies a scan uniquely. If a string or path-like object is given, it is assumed to be the path to the data file. If an integer is given, it is assumed to be the scan number and is used to automatically determine the path to the data file(s).
  - data_dir (str | os.PathLike | None) – Where to look for the data. If None, the default data directory will be used.
  - single – For some setups, data for a single scan is saved over multiple files. This argument is only used for such setups. When identifier is resolved to a single file within a multiple-file scan, the default behavior when single is False is to return a single concatenated array that contains data from all files in the same scan. If single is set to True, only the data from the given file is returned. This argument is ignored when identifier is a number.
  - **kwargs – Additional keyword arguments are passed to identify.
- Returns:
  The loaded data.
- Return type:
  xarray.DataArray or xarray.Dataset or list of xarray.DataArray
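The `single` behavior can be summarized as a resolution step: when an identifier points at one file of a multi-file scan, the loader either gathers every file of that scan (default) or keeps only the named file. A stdlib-only sketch of that decision, using a hypothetical file-naming scheme (the real layout is setup-specific):

```python
import re

# Hypothetical layout: scan 3 is split across three files.
FILES = ["scan003_S001.h5", "scan003_S002.h5", "scan003_S003.h5", "scan004_S001.h5"]

def files_for_scan(num):
    # All files belonging to scan `num`, in order.
    return [f for f in FILES if re.match(rf"scan{num:03d}_S\d+\.h5$", f)]

def resolve(identifier, single=False):
    if isinstance(identifier, int):
        # An integer selects the whole scan; `single` is ignored.
        return files_for_scan(identifier)
    if single:
        # Keep just the named file.
        return [identifier]
    # Default: expand a single file back to the full multi-file scan.
    num = int(re.match(r"scan(\d+)_", identifier).group(1))
    return files_for_scan(num)

print(resolve(3))                               # all three scan-3 files
print(resolve("scan003_S002.h5"))               # still all three files
print(resolve("scan003_S002.h5", single=True))  # just the one file
```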
- erlab.io.load_experiment(filename, folder=None, *, prefix=None, ignore=None, recursive=False, **kwargs)[source]¶
  Load waves from an Igor experiment (pxp) file.
  - Parameters:
    - folder (str | None) – Target folder within the experiment, given as a slash-separated string. If None, defaults to the root.
    - prefix (str | None) – If given, only include waves with names that start with the given string.
    - recursive (bool) – If True, includes waves in child directories.
    - **kwargs – Extra arguments to load_wave().
- Returns:
  Dataset containing the waves.
- Return type:
  xarray.Dataset
- erlab.io.load_hdf5(filename, **kwargs)[source]¶
  Load data from an HDF5 file saved with save_as_hdf5.
  This is a thin wrapper around xarray.load_dataarray and xarray.load_dataset.
  - Parameters:
    - **kwargs – Extra arguments to xarray.load_dataarray or xarray.load_dataset.
  - Returns:
    The loaded data.
  - Return type:
    xarray.DataArray or xarray.Dataset
- erlab.io.load_wave(wave, data_dir=None)[source]¶
  Load a wave from Igor binary format.
  - Parameters:
    - wave (dict | WaveRecord | str | PathLike) – The wave to load. It can be provided as a dictionary, an instance of igor2.record.WaveRecord, or a string representing the path to the wave file.
    - data_dir (str | PathLike | None) – The directory where the wave file is located. This parameter is only used if wave is a string or PathLike object. If None, wave must be a valid path.
  - Returns:
    The loaded wave.
  - Return type:
    xarray.DataArray
  - Raises:
    - ValueError – If the wave file cannot be found or loaded.
    - TypeError – If the wave argument is of an unsupported type.
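Accepting a dict, a WaveRecord, or a path is a type-dispatch pattern; a minimal sketch with functools.singledispatch (the WaveRecord class and return strings here are stand-ins, not the real igor2 handling):

```python
import os
from functools import singledispatch

class WaveRecord:  # stand-in for igor2.record.WaveRecord
    def __init__(self, name):
        self.name = name

@singledispatch
def load_wave(wave, data_dir=None):
    # Fallback for unsupported types, as documented in Raises.
    raise TypeError(f"unsupported wave type: {type(wave).__name__}")

@load_wave.register
def _(wave: dict, data_dir=None):
    return f"from dict: {wave['name']}"

@load_wave.register
def _(wave: WaveRecord, data_dir=None):
    return f"from record: {wave.name}"

@load_wave.register
def _(wave: str, data_dir=None):
    # Only the path form consults data_dir; the real loader
    # would read and parse the file at this path.
    path = os.path.join(data_dir, wave) if data_dir else wave
    return f"from file: {path}"

print(load_wave({"name": "w0"}))
print(load_wave(WaveRecord("w1")))
print(load_wave("w2.ibw", data_dir="/data"))
```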
- erlab.io.loader_context(loader=None, data_dir=None)[source]¶
  Context manager for the current data loader and data directory.
  - Parameters:
    - loader (str, optional) – The name or alias of the loader to use in the context.
    - data_dir (str or os.PathLike, optional) – The data directory to use in the context.
  Examples
  Load data within a context manager:
  >>> with erlab.io.loader_context("merlin"):
  ...     dat_merlin = erlab.io.load(...)
  Load data with different loaders and directories:
  >>> erlab.io.set_loader("ssrl52", data_dir="/path/to/dir1")
  >>> dat_ssrl_1 = erlab.io.load(...)
  >>> with erlab.io.loader_context("merlin", data_dir="/path/to/dir2"):
  ...     dat_merlin = erlab.io.load(...)
  >>> dat_ssrl_2 = erlab.io.load(...)
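The mechanics of such a context manager follow a standard save/apply/restore pattern: stash the current defaults, install the temporary ones, and restore on exit even if an error occurs. A stdlib sketch (the `_current` dict is illustrative, not the actual erlab state):

```python
from contextlib import contextmanager

# Illustrative session state, as set by set_loader()/set_data_dir().
_current = {"loader": "ssrl52", "data_dir": "/path/to/dir1"}

@contextmanager
def loader_context(loader=None, data_dir=None):
    saved = dict(_current)
    try:
        if loader is not None:
            _current["loader"] = loader
        if data_dir is not None:
            _current["data_dir"] = data_dir
        yield _current
    finally:
        # Restore whatever was active before entering the context,
        # even when the body raises.
        _current.clear()
        _current.update(saved)

with loader_context("merlin", data_dir="/path/to/dir2"):
    inside = dict(_current)  # temporary loader and directory active
after = dict(_current)       # previous defaults restored
```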
- erlab.io.open_hdf5(filename, **kwargs)[source]¶
  Open data from an HDF5 file saved with save_as_hdf5.
  This is a thin wrapper around xarray.open_dataarray and xarray.open_dataset.
  - Parameters:
    - **kwargs – Extra arguments to xarray.open_dataarray or xarray.open_dataset.
  - Returns:
    The opened data.
  - Return type:
    xarray.DataArray or xarray.Dataset
- erlab.io.save_as_hdf5(data, filename, igor_compat=True, **kwargs)[source]¶
  Save data in HDF5 format.
  - Parameters:
    - data (DataArray | Dataset) – xarray.DataArray or xarray.Dataset to save.
    - igor_compat (bool) – (Experimental) Make the resulting file compatible with Igor’s HDF5OpenFile for DataArrays with up to 4 dimensions. A convenient Igor procedure is included in the repository. Default is True.
    - **kwargs – Extra arguments to xarray.DataArray.to_netcdf: refer to the xarray documentation for a list of all possible arguments.
- erlab.io.save_as_netcdf(data, filename, **kwargs)[source]¶
  Save data in netCDF4 format.
  Discards invalid netCDF4 attributes and produces a warning.
  - Parameters:
    - data (DataArray) – xarray.DataArray to save.
    - **kwargs – Extra arguments to xarray.DataArray.to_netcdf: refer to the xarray documentation for a list of all possible arguments.
- erlab.io.set_data_dir(data_dir)[source]¶
  Set the default data directory for the data loader.
  All subsequent calls to load will use the data_dir set here unless specified.
  Note
  This will only affect load. If the loader’s load method is called directly, it will not use the default data directory.
- erlab.io.set_loader(loader)[source]¶
  Set the current data loader.
  All subsequent calls to load will use the loader set here.
  - Parameters:
    - loader (str | LoaderBase | None) – The loader to set. It can be either a string representing the name or alias of the loader, or a valid loader class.
  Example
  >>> erlab.io.set_loader("merlin")
  >>> dat_merlin_1 = erlab.io.load(...)
  >>> dat_merlin_2 = erlab.io.load(...)
- erlab.io.summarize(data_dir, usecache=True, *, cache=True, display=True, **kwargs)[source]¶
  Summarize the data in the given directory.
  Takes a path to a directory and summarizes the data in the directory to a table, much like a log file. This is useful for quickly inspecting the contents of a directory.
  The dataframe is formatted using the style from get_styler and displayed in the IPython shell. Results are cached in a pickle file in the directory.
  - Parameters:
    - data_dir – Directory to summarize.
    - usecache (bool) – Whether to use the cached summary if available. If False, the summary will be regenerated. The cache will be updated if cache is True.
    - cache (bool) – Whether to cache the summary in a pickle file in the directory. If False, no cache will be created or updated. Note that existing cache files will not be deleted, and will be used if usecache is True.
    - display (bool) – Whether to display the formatted dataframe using the IPython shell. If False, the dataframe will be returned without formatting. If True but the IPython shell is not detected, the dataframe styler will be returned.
    - **kwargs – Additional keyword arguments to be passed to generate_summary.
  - Returns:
    df – Summary of the data in the directory.
    - If display is False, the summary DataFrame is returned.
    - If display is True and the IPython shell is detected, the summary will be displayed, and None will be returned. If ipywidgets is installed, an interactive widget will be returned instead of None.
    - If display is True but the IPython shell is not detected, the styler for the summary DataFrame will be returned.
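The interaction of usecache and cache reduces to a standard pickle-cache pattern: read the cache when allowed, otherwise regenerate, and write the cache when permitted. A stdlib sketch (the cache file name and the list-of-files "summary" are hypothetical; the real function builds a dataframe):

```python
import os
import pickle
import tempfile

def summarize(data_dir, usecache=True, cache=True):
    # Hypothetical cache file name inside the summarized directory.
    cache_path = os.path.join(data_dir, ".summary.pkl")
    if usecache and os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Regenerate the summary (a plain file listing stands in for the table).
    summary = sorted(f for f in os.listdir(data_dir) if not f.startswith("."))
    if cache:
        with open(cache_path, "wb") as f:
            pickle.dump(summary, f)
    return summary

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "scan001.h5"), "w").close()
    first = summarize(d)                  # regenerated, then cached
    second = summarize(d)                 # served from the cache
    fresh = summarize(d, usecache=False)  # forced regeneration; cache updated
```

Note how usecache=False still updates the cache because cache defaults to True, matching the documented behavior above.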
- Return type:
  pandas.DataFrame or pandas.io.formats.style.Styler or None
- erlab.io.loaders¶
  Global instance of LoaderRegistry.