erlab.analysis.image¶

Various image processing functions including tools for visualizing dispersive features.

Some filter functions in scipy.ndimage and scipy.signal are extended to work with regularly spaced xarray DataArrays.

Notes

For many scipy-based filter functions, the default value of the mode argument is different from scipy.
Many functions in this module has conflicting names with the SciPy functions. It is good practice to avoid direct imports.

Functions

`boxcar_filter`(darr, size[, mode, cval])	Coordinate-aware boxcar filter.
`curvature`(darr[, a0, factor])	2D curvature method for detecting dispersive features.
`curvature1d`(darr, along[, a0])	1D curvature method for detecting dispersive features.
`diffn`(darr, coord[, order])	Calculate the nth derivative of a DataArray along a given dimension.
`gaussian_filter`(darr, sigma[, order, mode, ...])	Coordinate-aware wrapper around `scipy.ndimage.gaussian_filter`.
`gaussian_laplace`(darr, sigma[, mode, cval])	Coordinate-aware wrapper around `scipy.ndimage.gaussian_laplace`.
`gradient_magnitude`(arr, dx, dy[, mode, cval])	Calculate the gradient magnitude of an image.
`laplace`(darr[, mode, cval])	Coordinate-aware wrapper around `scipy.ndimage.laplace`.
`minimum_gradient`(darr[, mode, cval])	Minimum gradient method for detecting dispersive features in 2D data.
`ndsavgol`(arr, window_shape, polyorder[, ...])	Apply a Savitzky-Golay filter to an N-dimensional array.
`remove_stripe`(darr, deg[, full])	Remove angle-dependent stripe artifact from cuts and maps.
`scaled_laplace`(darr[, factor])	Calculate the Laplacian of a 2D DataArray with different scaling for each axis.

erlab.analysis.image.boxcar_filter(darr, size, mode='nearest', cval=0.0)[source]¶

Coordinate-aware boxcar filter.

Parameters:

darr (DataArray) – The input DataArray.
size (int or Sequence of ints or dict) – The size of the boxcar filter in pixels.
mode (str) – The boundary mode used for the filter. Defaults to ‘nearest’.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.

Returns:

boxcar_filter (xarray.DataArray) – The filtered array with the same shape as the input DataArray.

Return type:

DataArray

erlab.analysis.image.curvature(darr, a0=1.0, factor=1.0, **kwargs)[source]¶

2D curvature method for detecting dispersive features.

The curvature is calculated as defined by Zhang et al. [2011].

Parameters:

darr (DataArray) – The DataArray for which to calculate the curvature. The curvature is calculated along the first two dimensions of the DataArray.
a0 (float, default: 1.0) – The regularization constant. Reasonable values range from 0.001 to 10, but different values may be needed depending on the data. Default is 1.0.
factor (float, default: 1.0) – The factor by which to scale the x-axis curvature. Negative values will scale the y-axis curvature instead. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.

Returns:

curvature (xarray.DataArray) – The 2D curvature of the input DataArray. Has the same shape as input.

Raises:

ValueError – If the input DataArray is not 2D.

Return type:

DataArray

erlab.analysis.image.curvature1d(darr, along, a0=1.0, **kwargs)[source]¶

1D curvature method for detecting dispersive features.

The curvature is calculated as defined by Zhang et al. [2011].

Parameters:

darr (DataArray) – The DataArray for which to calculate the curvature.
along (Hashable) – The dimension along which to calculate the curvature.
a0 (float, default: 1.0) – The regularization constant. Reasonable values range from 0.001 to 10, but different values may be needed depending on the data. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.

Returns:

curvature (xarray.DataArray) – The 1D curvature of the input DataArray. Has the same shape as input.

Raises:

ValueError – If the input DataArray is not 2D.

Return type:

DataArray

erlab.analysis.image.diffn(darr, coord, order=1, **kwargs)[source]¶

Calculate the nth derivative of a DataArray along a given dimension.

Parameters:

darr (DataArray) – The input DataArray.
coord (Hashable) – The name of the coordinate along which to calculate the derivative.
order (int | Iterable[int], default: 1) – The order of the derivative. If given as a tuple, a tuple of derivatives for each order is returned. Default is 1.
**kwargs – Additional keyword arguments to findiff.Diff.

Returns:

DataArray or tuple of DataArray – The differentiated array or a tuple of differentiated arrays corresponding to the provided order.

Return type:

DataArray | tuple[DataArray, …]

erlab.analysis.image.gaussian_filter(darr, sigma, order=0, mode='nearest', cval=0.0, truncate=4.0, *, radius=None)[source]¶

Coordinate-aware wrapper around scipy.ndimage.gaussian_filter.

Parameters:

darr (DataArray) – The input DataArray.
sigma (float or Sequence of floats or dict) – The standard deviation(s) of the Gaussian filter in data dimensions. If a float, the same value is used for all dimensions, each scaled by the data step. If a dict, the value can be specified for each dimension using dimension names as keys. The filter is only applied to the dimensions specified in the dict. If a sequence, the values are used in the same order as the dimensions of the DataArray.
order (int or Sequence of ints or dict) – The order of the filter along each dimension. If an int, the same order is used for all dimensions. See Notes below for other options. Defaults to 0.
mode (str or Sequence of str or dict) – The boundary mode used for the filter. If a str, the same mode is used for all dimensions. See Notes below for other options. Defaults to ‘nearest’.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.
truncate (float, default: 4.0) – The truncation value used for the Gaussian filter. Defaults to 4.0.
radius (float or Sequence of floats or dict, optional) – The radius of the Gaussian filter in data units. See Notes below. If specified, the size of the kernel along each axis will be 2*radius + 1, and truncate is ignored.

Returns:

gaussian_filter (xarray.DataArray) – The filtered array with the same shape as the input DataArray.

Return type:

DataArray

Note

The sigma and radius values should be in data coordinates, not pixels.
The input array is assumed to be regularly spaced.
order, mode, and radius can be specified for each dimension using a dict or a sequence. If a dict, the value can be specified for each dimension using dimension names as keys. If a sequence and sigma is given as a dictionary, the order is assumed to be the same as the keys in sigma. If sigma is not a dictionary, the order is assumed to be the same as the dimensions of the DataArray.

See also

scipy.ndimage.gaussian_filter(): The underlying function used to apply the filter.

Example

>>> import numpy as np
>>> import xarray as xr
>>> import erlab.analysis as era
>>> darr = xr.DataArray(np.arange(50, step=2).reshape((5, 5)), dims=["x", "y"])
>>> darr
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 0,  2,  4,  6,  8],
    [10, 12, 14, 16, 18],
    [20, 22, 24, 26, 28],
    [30, 32, 34, 36, 38],
    [40, 42, 44, 46, 48]])
Dimensions without coordinates: x, y
>>> era.image.gaussian_filter(darr, sigma=dict(x=1.0, y=1.0))
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 3,  5,  7,  8, 10],
    [10, 12, 14, 15, 17],
    [20, 22, 24, 25, 27],
    [29, 31, 33, 34, 36],
    [36, 38, 40, 41, 43]])
Dimensions without coordinates: x, y

erlab.analysis.image.gaussian_laplace(darr, sigma, mode='nearest', cval=0.0, **kwargs)[source]¶

Coordinate-aware wrapper around scipy.ndimage.gaussian_laplace.

This function calculates the Laplacian of the given array using Gaussian second derivatives.

Parameters:

darr (DataArray) – The input DataArray.
sigma (float | Collection[float] | Mapping[Hashable, float]) – The standard deviation(s) of the Gaussian filter in data dimensions. If a float, the same value is used for all dimensions, each scaled by the data step. If a dict, the value can be specified for each dimension using dimension names as keys. If a sequence, the values are used in the same order as the dimensions of the DataArray.
mode (str | Sequence[str] | Mapping[Hashable, str], default: "nearest") – The mode parameter determines how the input array is extended beyond its boundaries. If a string, the same mode is used for all dimensions. If a sequence, the values should be the modes for each dimension in the same order as the dimensions in the DataArray. If a dictionary, the keys should be dimension names and the values should be the corresponding modes, and every dimension in the DataArray must be present. Default is “nearest”.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.
**kwargs – Additional keyword arguments to scipy.ndimage.gaussian_filter.

Returns:

gaussian_laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.

Return type:

DataArray

Note

sigma should be in data coordinates, not pixels.
The input array is assumed to be regularly spaced.

See also

scipy.ndimage.gaussian_laplace(): The underlying function used to apply the filter.

erlab.analysis.image.gradient_magnitude(arr, dx, dy, mode='nearest', cval=0.0)[source]¶

Calculate the gradient magnitude of an image.

The gradient magnitude is calculated as defined in Ref. [He et al., 2017], using given \(\Delta x\) and \(\Delta y\) values.

Parameters:

input – Input array.
dx (float) – Step size in the x-direction.
dy (float) – Step size in the y-direction.
mode (str, default: "nearest") – The mode parameter controls how the gradient is calculated at the boundaries. Default is ‘nearest’. See scipy.ndimage.generic_filter for more information.
cval (float, default: 0.0) – The value to use for points outside the boundaries when mode is ‘constant’. Default is 0.0. See scipy.ndimage.generic_filter for more information.

Returns:

gradient_magnitude (numpy.ndarray) – Gradient magnitude of the input array. Has the same shape as input.

Return type:

ndarray[tuple[Any, …], dtype[float64]]

Note

This function calculates the gradient magnitude of an image by applying a filter that uses the given dx and dy values. The filter is defined by a kernel function that computes the squared difference between each element of the input array and the central element, divided by the corresponding distance value. The gradient magnitude is then calculated as the square root of the sum of the squared differences.

erlab.analysis.image.laplace(darr, mode='nearest', cval=0.0)[source]¶

Coordinate-aware wrapper around scipy.ndimage.laplace.

This function calculates the Laplacian of the given array using approximate second derivatives.

Parameters:

darr – The input DataArray.
mode (str | Sequence[str] | dict[str, str], default: "nearest") – The mode parameter determines how the input array is extended beyond its boundaries. If a dictionary, the keys should be dimension names and the values should be the corresponding modes, and every dimension in the DataArray must be present. Otherwise, it retains the same behavior as in scipy.ndimage.laplace. Default is ‘nearest’.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.

Returns:

laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.

Return type:

DataArray

See also

scipy.ndimage.laplace(): The underlying function used to apply the filter.

erlab.analysis.image.minimum_gradient(darr, mode='nearest', cval=0.0)[source]¶

Minimum gradient method for detecting dispersive features in 2D data.

The minimum gradient is calculated by dividing the input DataArray by the gradient magnitude. See Ref. [He et al., 2017].

Parameters:

darr (DataArray) – The 2D DataArray for which to calculate the minimum gradient.
mode (str, default: 'nearest') – The mode parameter controls how the gradient is calculated at the boundaries. Default is ‘nearest’. See scipy.ndimage.generic_filter for more information.
cval (float, default: 0.0) – The value to use for points outside the boundaries when mode is ‘constant’. Default is 0.0. See scipy.ndimage.generic_filter for more information.

Returns:

minimum_gradient (xarray.DataArray) – The minimum gradient of the input DataArray. Has the same shape as input.

Raises:

ValueError – If the input DataArray is not 2D.

Return type:

DataArray

Note

Any zero gradient values are replaced with NaN.

erlab.analysis.image.ndsavgol(arr, window_shape, polyorder, deriv=0, delta=1.0, mode='mirror', cval=0.0, method='pinv')[source]¶

Apply a Savitzky-Golay filter to an N-dimensional array.

Unlike scipy.signal.savgol_filter which is limited to 1D arrays, this function calculates multi-dimensional Savitzky-Golay filters. There are some subtle differences in the implementation, so the results may not be identical. See Notes.

Parameters:

arr (array-like) – The input N-dimensional array to be filtered. The array will be cast to float64 before filtering.
window_shape (int or tuple of ints) – The shape of the window used for filtering. If an integer, the same size will be used across all axes.
polyorder (int) – The order of the polynomial used to fit the samples. polyorder must be less than the minimum of window_shape.
deriv (int or tuple of ints) – The order of the derivative to compute given as a single integer or a tuple of integers. If an integer, the derivative of that order is computed along all axes. If a tuple of integers, the derivative of each order is computed along the corresponding dimension. The default is 0, which means to filter the data without differentiating.
delta (float or tuple of floats) – The spacing of the samples to which the filter will be applied. If a float, the same value is used for all axes. If a tuple, the values are used in the same order as in deriv. The default is 1.0.
mode (Literal['mirror', 'constant', 'nearest', 'wrap'], default: "mirror") – Must be ‘mirror’, ‘constant’, ‘nearest’, or ‘wrap’. This determines the type of extension to use for the padded signal to which the filter is applied. When mode is ‘constant’, the padding value is given by cval.
cval (float) – Value to fill past the edges of the input if mode is ‘constant’. Default is 0.0.
method (Literal['pinv', 'lstsq'], default: "pinv") – Must be ‘pinv’ or ‘lstsq’. Determines the method used to calculate the filter coefficients. ‘pinv’ uses the pseudoinverse of the Vandermonde matrix, while ‘lstsq’ uses least squares for each window position. ‘lstsq’ is much slower but may be more numerically stable in some cases. The difference is more pronounced for higher dimensions, larger window size, and higher polynomial orders. The default is ‘pinv’.

Returns:

numpy.ndarray – The filtered array.

See also

scipy.signal.savgol_filter(): The 1D Savitzky-Golay filter function in SciPy.

Notes

For even window sizes, the results may differ slightly from scipy.signal.savgol_filter due to differences in the implementation.
This function is not suitable for cases where accumulated floating point errors are comparable to the filter coefficients, i.e., for high number of dimensions and large window sizes.
mode='interp' is not implemented as it is not clear how to handle the edge cases in higher dimensions.

Examples

>>> import numpy as np
>>> import erlab.analysis as era

>>> arr = np.array([1, 2, 3, 4, 5])
>>> era.image.ndsavgol(arr, (3,), polyorder=2)
array([1., 2., 3., 4., 5.])

>>> era.image.ndsavgol(arr, (3,), polyorder=2, deriv=1)
array([0., 1., 1., 1., 0.])

>>> arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> era.image.ndsavgol(arr, (3, 3), polyorder=2)
array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ],
       [3.5, 4. , 4.5]])

erlab.analysis.image.remove_stripe(darr, deg, full=False, **sel_kw)[source]¶

Remove angle-dependent stripe artifact from cuts and maps.

Energy-independent stripe artifacts may be introduced during the acquisition of ARPES data due to imperfect alignment of the slit or other experimental factors.

Assume an original intensity \(I_0(\alpha, \omega)\) that is corrupted by a energy-independent stripe pattern \(S(\alpha)\):

\[I(\alpha, \omega) = I_0(\alpha, \omega) \cdot S(\alpha).\]

If we assume that \(S(\alpha)\) to be a high-frequency noise, we may approximate \(I_0(\alpha, \omega)\) by smoothing \(I(\alpha, \omega)\) along \(\alpha\). We can then obtain an approximation of \(1/S(\alpha)\) by dividing the smoothed data with \(I(\alpha, \omega)\) and averaging the result over \(\omega\). Finally, we can remove the stripe pattern by multiplying \(I(\alpha, \omega)\) with the obtained \(1/S(\alpha)\).

Works best for data with a high signal-to-noise ratio and a high background level. Since the stripe is assumed to be energy-independent, the method is only suitable for data acquired with sweep mode.

This method may introduce artifacts that are not present in the original data, making it unsuitable for quantitative analysis. Use only for visualization purposes.

Parameters:

darr (DataArray) – The data containing the stripe artifact. Data must have the dimensions “alpha” and “eV”.
deg (int) – The degree of the polynomial fit. The degree should be high enough to capture all intrinsic features of the data, but low enough to avoid overfitting. A good starting value is around 20.
full (bool, default: False) – Flag determining whether to return the full stripe pattern. If True, \(1/S(\alpha)\) is also returned. Default is False.
**sel_kw – Keyword arguments to xarray.DataArray.sel(). Specify the range of angles and energies to use for the polynomial fit.

Returns:

corrected (xarray.DataArray) – The data with the stripe artifact removed.
stripe (xarray.DataArray) – The stripe pattern \(1/S(\alpha)\). Only returned if full is True.

Return type:

DataArray | tuple[DataArray, DataArray]

erlab.analysis.image.scaled_laplace(darr, factor=1.0, **kwargs)[source]¶

Calculate the Laplacian of a 2D DataArray with different scaling for each axis.

This function calculates the Laplacian of the given array using approximate second derivatives, taking the different scaling for each axis into account.

\[\Delta f \sim \frac{\partial^2 f}{\partial x^2} \left(\frac{\Delta x}{\Delta y}\right)^{\!2} + \frac{\partial^2 f}{\partial y^2}\]

See Ref. [Zhang et al., 2011] for more information.

Parameters:

darr – The 2D DataArray for which to calculate the scaled Laplacian.
factor (float, default: 1.0) – The factor by which to scale the x-axis derivative. Negative values will scale the y-axis derivative instead. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.

Returns:

scaled_laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.

Return type:

DataArray