erlab.utils.array

Utility functions for working with numpy and xarray.

Functions

`apply_dataarray_func`(data, func, **kwargs)	Apply a function to a DataArray, Dataset, or DataTree.
`broadcast_args`(func)	Decorate a function to broadcast all DataArray arguments.
`check_arg_2d_darr`(func)	Decorate a function to check if the first argument is a 2D DataArray.
`check_arg_has_no_nans`(func)	Decorate a function to check if the first argument has no NaNs.
`check_arg_uniform_dims`(func)	Decorate a function to check if all dims in the first argument are uniform.
`effective_decimals`(step_or_coord)	Calculate the effective number of decimal places for a given step size.
`ensure_same_coord_names`(data_list)	Ensure all data has the same set of coordinate names.
`is_dims_uniform`(darr[, dims])	Check if the given dimensions of a DataArray have uniform spacing.
`is_monotonic`(arr[, strict])	Check if an array is monotonic.
`is_uniform_spaced`(arr[, rtol, atol])	Check if the given array is uniformly spaced.
`minmax_darr`(darr, *[, skipna])	Return (min, max) for DataArrays, with efficient handling for dask.
`sort_coord_order`(darr[, keys, dims_first])	Sort the coordinates of a DataArray in the given order.
`to_native_endian`(arr)	Convert an array to native endianness.
`trim_na`(darr[, dims])	Drop all-NaN coordinates from the edges of a DataArray.
`uniform_dims`(darr, **kwargs)	Return the set of dimensions that are uniformly spaced.
`unique_decimals`(arr)	Compute digits needed to represent floating point values uniquely.

erlab.utils.array.apply_dataarray_func(data, func, **kwargs)

Apply a function to a DataArray, Dataset, or DataTree.

Parameters:

data (DataArray | Dataset | DataTree) – The input data.
func (Callable[..., DataArray]) – The function to apply to each DataArray. The first positional argument must be a DataArray, and it must return a DataArray.
**kwargs – Additional keyword arguments to pass to func.

Returns:

DataArray or Dataset or DataTree – The post-processed data with the same type as the input.

Return type:

DataArray | Dataset | DataTree

erlab.utils.array.broadcast_args(func)

Decorate a function to broadcast all DataArray arguments.

This decorator automatically broadcasts all DataArray args and kwargs to the same shape, and only passes pure NumPy arrays to the decorated function.

If the decorated function returns a NumPy array with the same shape as the broadcasted DataArray arguments, a new DataArray will be created with the same coordinates and dimensions as the broadcasted DataArray. In this case, the attributes will be taken from the first DataArray that appears in the arguments.

This is useful when working with functions that only accept pure NumPy arrays or that always return a NumPy array, such as numba jit-compiled functions.

Note

When used on numba functions in nopython mode, the decorated function can no longer be called from another function compiled in nopython mode.
The decorated function will not be able to accept DataArray arguments.
Dask-backed DataArray arguments are applied lazily so that the decorated function still receives NumPy arrays.

erlab.utils.array.check_arg_2d_darr(func)

Decorate a function to check if the first argument is a 2D DataArray.

erlab.utils.array.check_arg_has_no_nans(func)

Decorate a function to check if the first argument has no NaNs.

erlab.utils.array.check_arg_uniform_dims(func)

Decorate a function to check if all dims in the first argument are uniform.

erlab.utils.array.effective_decimals(step_or_coord)

Calculate the effective number of decimal places for a given step size.

This function determines the number of decimal places required to approximately represent a value in a linearly spaced array, given its step size. It assumes that rounding to a decimal place one order of magnitude smaller than the step size is a good approximation.

Parameters:

step_or_coord – The step size for which to calculate the effective number of decimal places, or a coordinate array, in which case the step size is taken as the difference between its first two elements.

Returns:

int – The effective number of decimal places, calculated as the order of magnitude of the step size plus one.

If the step size is zero, a default value of 3 is returned.

Return type:

int
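
The rule described above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the name effective_decimals_sketch is hypothetical, and clamping the result at zero for steps of 10 or more is an assumption made here for illustration.

```python
import math


def effective_decimals_sketch(step_or_coord):
    """Hypothetical stand-in illustrating the documented rule."""
    if isinstance(step_or_coord, (int, float)):
        step = float(step_or_coord)
    else:
        # Coordinate input: use the difference of the first two elements.
        step = float(step_or_coord[1] - step_or_coord[0])
    if step == 0:
        return 3  # documented default for a zero step
    # Order of magnitude of the step, e.g. 0.05 -> -2.
    magnitude = math.floor(math.log10(abs(step)))
    # Round one decimal place finer than the step.
    # Assumption: never return a negative number of decimals.
    return max(0, -magnitude + 1)
```

With this sketch, a step of 0.05 gives 3 decimal places and a coordinate list such as [0.0, 0.25, 0.5] gives 2.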

erlab.utils.array.ensure_same_coord_names(data_list)

Ensure all data has the same set of coordinate names.

This function modifies the provided list in place by adding missing coordinates with NaN values to each DataArray, Dataset, or DataTree in the list.

All inputs must be of the same type: either all DataArrays, all Datasets, or all DataTrees.

This function is used by the data loading utilities to ensure that all files in a single load operation have the same set of coordinate names. This is required because some endstations produce files with missing header entries, possibly due to a bug in the data acquisition software.
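
The in-place filling behaviour can be pictured with plain dictionaries standing in for the coordinates of each loaded file. This is only a sketch of the idea: the helper name fill_missing_names_sketch and the dictionary representation are assumptions, and the real function operates on xarray objects.

```python
import math


def fill_missing_names_sketch(records):
    """Add missing keys with NaN values to every dict, in place."""
    all_names = set()
    for record in records:
        all_names.update(record)
    for record in records:
        # Names present in some other record but missing from this one.
        for name in all_names.difference(record):
            record[name] = math.nan


# Two hypothetical file headers; the second lacks the "temp" entry.
headers = [{"hv": 21.2, "temp": 10.0}, {"hv": 21.2}]
fill_missing_names_sketch(headers)
```

After the call, both dictionaries have the keys "hv" and "temp", and the second one holds NaN for "temp".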

erlab.utils.array.is_dims_uniform(darr, dims=None, **kwargs)

Check if the given dimensions of a DataArray have uniform spacing.

Parameters:

darr (DataArray) – The DataArray to check.
dims (Iterable[Hashable] | None, default: None) – The dimensions to check. If None, all dimensions of the DataArray will be checked.
**kwargs – Additional keyword arguments to be passed to is_uniform_spaced.

Returns:

bool – True if all dimensions have uniform spacing, False otherwise.

Return type:

bool

erlab.utils.array.is_monotonic(arr, strict=False)

Check if an array is monotonic.

Parameters:

arr (array-like) – The input array.
strict (bool, optional) – If True, the array must be strictly monotonic, i.e., either strictly increasing or strictly decreasing. If False, the array can be non-decreasing or non-increasing.

Returns:

bool – True if the array is monotonic, False otherwise.

Return type:

bool
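
The difference between the strict and non-strict checks can be sketched in pure Python for a one-dimensional sequence. The function name is_monotonic_sketch is hypothetical and this is not the library's implementation.

```python
import operator


def is_monotonic_sketch(values, strict=False):
    """Hypothetical pure-Python monotonicity check for a 1D sequence."""
    values = list(values)
    pairs = list(zip(values, values[1:]))
    if strict:
        up, down = operator.lt, operator.gt
    else:
        up, down = operator.le, operator.ge
    # Monotonic means every neighbouring pair moves in the same direction.
    increasing = all(up(a, b) for a, b in pairs)
    decreasing = all(down(a, b) for a, b in pairs)
    return increasing or decreasing
```

Here [1, 2, 2, 3] passes the non-strict check but fails the strict one because of the repeated value, while [1, 3, 2] fails both.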

erlab.utils.array.is_uniform_spaced(arr, rtol=1.0e-5, atol=1.0e-8)

Check if the given array is uniformly spaced.

Constant arrays are also considered uniformly spaced.

Parameters:

arr (array-like) – The input array.
rtol (float, optional) – Relative tolerance passed to numpy.isclose().
atol (float, optional) – Absolute tolerance passed to numpy.isclose().

Returns:

bool – True if the array is uniformly spaced and one-dimensional, False otherwise.

Return type:

bool

Examples

>>> is_uniform_spaced([1, 2, 3, 4])
True
>>> is_uniform_spaced([1, 2, 3, 5])
False
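
To show how the tolerances enter, here is a pure-Python sketch that compares every successive difference against the first one using the same closeness rule as numpy.isclose(). The name is_uniform_spaced_sketch is hypothetical, and both the choice of the first difference as the reference and the treatment of inputs with fewer than two points are assumptions, not confirmed library behaviour.

```python
def is_uniform_spaced_sketch(values, rtol=1.0e-5, atol=1.0e-8):
    """Hypothetical pure-Python uniform-spacing check for a 1D sequence."""
    values = [float(v) for v in values]
    diffs = [b - a for a, b in zip(values, values[1:])]
    if not diffs:
        return True  # assumption: fewer than two points counts as uniform
    first = diffs[0]
    # numpy.isclose(a, b) tests |a - b| <= atol + rtol * |b|.
    return all(abs(d - first) <= atol + rtol * abs(first) for d in diffs)
```

With the default tolerances, floating-point noise such as [0.0, 0.1, 0.2, 0.30000000000000004] is still treated as uniform, and a constant sequence such as [2, 2, 2] is uniform because all differences are zero.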

erlab.utils.array.minmax_darr(darr, *, skipna=True)

Return (min, max) for DataArrays, with efficient handling for dask.

Parameters:

darr (DataArray) – The input DataArray.
skipna (bool, default: True) – Whether to skip NaN values.

Returns:

mn – The minimum value of the DataArray as a float.
mx – The maximum value of the DataArray as a float.

Return type:

tuple[float, float]

erlab.utils.array.sort_coord_order(darr, keys=None, *, dims_first=True)

Sort the coordinates of a DataArray in the given order.

By default, a DataArray lists its coordinates in the order they were given to the constructor. The order may become mixed up after performing operations; this function sorts them so that they are easier to read.

This has been raised as an issue in xarray, but it seems like it will not be implemented in the near future.

Parameters:

darr (DataArray) – The DataArray to sort.
keys (Iterable[Hashable] | None, default: None) – The order in which to sort the coordinates. If not provided, the coordinates will retain their original order. If keys is not provided and dims_first is False, this function will return the DataArray as is.
dims_first (bool, default: True) – If True, the dimensions will come first in the sorted DataArray. The order of the dimensions will not respect the order given in keys, but will be sorted in the order they appear in the DataArray. If False, everything will be sorted in the order given in keys.

Returns:

darr – The sorted DataArray.

Return type:

DataArray
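
The interaction between keys and dims_first can be sketched on plain lists of coordinate names. The helper sorted_coord_names_sketch is hypothetical and only models the ordering described above. Placing coordinates that are not named in keys at the end, in their original order, is an assumption.

```python
def sorted_coord_names_sketch(coord_names, dims, keys=None, dims_first=True):
    """Return coordinate names in the order described by the parameters."""
    ordered = []
    if dims_first:
        # Dimensions come first, in the order they appear in the array.
        ordered.extend(d for d in dims if d in coord_names)
    if keys is not None:
        # Then the requested order, skipping names that are already placed.
        ordered.extend(k for k in keys if k in coord_names and k not in ordered)
    # Assumption: anything left keeps its original relative order.
    ordered.extend(n for n in coord_names if n not in ordered)
    return ordered


names = ["temp", "eV", "hv", "alpha"]
dims = ["alpha", "eV"]
dims_then_keys = sorted_coord_names_sketch(names, dims, keys=["hv", "eV", "temp"])
keys_only = sorted_coord_names_sketch(
    names, dims, keys=["hv", "eV", "temp"], dims_first=False
)
```

In this sketch, dims_then_keys is ["alpha", "eV", "hv", "temp"] and keys_only is ["hv", "eV", "temp", "alpha"].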

erlab.utils.array.to_native_endian(arr)

Convert an array to native endianness.

Some Igor Pro files may contain data in big-endian format, which may be incompatible with numba functions. This function converts the array to native endianness.

Parameters:

arr (array-like) – The input array.

Returns:

array – The array in native endianness.

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

erlab.utils.array.trim_na(darr, dims=None)

Drop all-NaN coordinates from the edges of a DataArray.

Parameters:

darr (DataArray) – The DataArray to trim.
dims (Iterable[Hashable] | None, default: None) – The dimensions along which to trim. If not provided, the data will be trimmed along all dimensions.

Returns:

darr – The trimmed DataArray.

Return type:

DataArray
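
Trimming only affects the edges; NaN values in the interior are kept. A one-dimensional pure-Python sketch of that idea follows. The name trim_nan_edges_sketch is hypothetical, and the real function works on DataArray dimensions rather than on lists.

```python
import math


def trim_nan_edges_sketch(values):
    """Drop leading and trailing NaNs from a list, keeping interior NaNs."""
    start, stop = 0, len(values)
    while start < stop and math.isnan(values[start]):
        start += 1
    while stop > start and math.isnan(values[stop - 1]):
        stop -= 1
    return values[start:stop]


nan = math.nan
trimmed = trim_nan_edges_sketch([nan, 1.0, nan, 2.0, nan, nan])
```

Here trimmed has three elements: 1.0, an interior NaN, and 2.0. An input that contains only NaN values yields an empty list in this sketch.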

erlab.utils.array.uniform_dims(darr, **kwargs)

Return the set of dimensions that are uniformly spaced.

Parameters:

darr (DataArray) – The input xarray DataArray.
**kwargs – Additional keyword arguments passed to is_uniform_spaced.

Returns:

dims – A set of dimensions that are uniformly spaced.

Return type:

set[Hashable]

erlab.utils.array.unique_decimals(arr)

Compute digits needed to represent floating point values uniquely.

This function determines the minimum number of decimal places required to uniquely represent floating point values in the given array.

Parameters:

arr (ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]) – The input array.

Returns:

int – The maximum number of decimal places.

Return type:

int
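
One way to read the description is: for each value, find the fewest decimal places that still identify it uniquely, then return the largest such count over the array. The pure-Python sketch below follows that reading using the shortest round-tripping repr() of each float. The name unique_decimals_sketch is hypothetical, finite inputs are assumed, and the library's actual method may differ.

```python
from decimal import Decimal


def unique_decimals_sketch(values):
    """Largest number of decimals needed by any value (finite floats only)."""

    def decimals_for(value):
        # repr() gives the shortest string that round-trips to the same float.
        shortest = Decimal(repr(float(value))).normalize()
        # A negative exponent counts digits after the decimal point.
        return max(0, -shortest.as_tuple().exponent)

    return max((decimals_for(v) for v in values), default=0)
```

For example, [0.1, 0.25, 3.0] needs 2 decimal places in this sketch because of 0.25, while [1.0, 2.0] needs none.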