erlab.analysis.image
Various image processing functions including tools for visualizing dispersive features.
Some filter functions in scipy.ndimage and scipy.signal are extended to work with
regularly spaced xarray DataArrays.
Notes
For many scipy-based filter functions, the default value of the mode argument differs from the SciPy default.
Many functions in this module have names that conflict with SciPy functions, so it is good practice to avoid importing them directly.
Functions
boxcar_filter | Coordinate-aware boxcar filter.
curvature | 2D curvature method for detecting dispersive features.
curvature1d | 1D curvature method for detecting dispersive features.
diffn | Calculate the nth derivative of a DataArray along a given dimension.
gaussian_filter | Coordinate-aware wrapper around scipy.ndimage.gaussian_filter.
gaussian_laplace | Coordinate-aware wrapper around scipy.ndimage.gaussian_laplace.
gradient_magnitude | Calculate the gradient magnitude of an image.
laplace | Coordinate-aware wrapper around scipy.ndimage.laplace.
minimum_gradient | Minimum gradient method for detecting dispersive features in 2D data.
ndsavgol | Apply a Savitzky-Golay filter to an N-dimensional array.
remove_stripe | Remove angle-dependent stripe artifact from cuts and maps.
scaled_laplace | Calculate the Laplacian of a 2D DataArray with different scaling for each axis.
- erlab.analysis.image.boxcar_filter(darr, size, mode='nearest', cval=0.0)
Coordinate-aware boxcar filter.
- Parameters:
- Returns:
boxcar_filter (xarray.DataArray) – The filtered array with the same shape as the input DataArray.
- Return type:
xarray.DataArray
- erlab.analysis.image.curvature(darr, a0=1.0, factor=1.0, **kwargs)
2D curvature method for detecting dispersive features.
The curvature is calculated as defined by Zhang et al. [2011].
- Parameters:
darr (DataArray) – The DataArray for which to calculate the curvature. The curvature is calculated along the first two dimensions of the DataArray.
a0 (float, default: 1.0) – The regularization constant. Reasonable values range from 0.001 to 10, but different values may be needed depending on the data. Default is 1.0.
factor (float, default: 1.0) – The factor by which to scale the x-axis curvature. Negative values will scale the y-axis curvature instead. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.
- Returns:
curvature (xarray.DataArray) – The 2D curvature of the input DataArray. Has the same shape as the input.
- Raises:
ValueError – If the input DataArray is not 2D.
- Return type:
xarray.DataArray
- erlab.analysis.image.curvature1d(darr, along, a0=1.0, **kwargs)
1D curvature method for detecting dispersive features.
The curvature is calculated as defined by Zhang et al. [2011].
- Parameters:
darr (DataArray) – The DataArray for which to calculate the curvature.
along (Hashable) – The dimension along which to calculate the curvature.
a0 (float, default: 1.0) – The regularization constant. Reasonable values range from 0.001 to 10, but different values may be needed depending on the data. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.
- Returns:
curvature (xarray.DataArray) – The 1D curvature of the input DataArray. Has the same shape as the input.
- Raises:
ValueError – If the input DataArray is not 2D.
- Return type:
xarray.DataArray
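As a rough illustration of the 1D curvature, the sketch below evaluates C(x) = f''(x) / (C0 + f'(x)^2)^(3/2), the 1D form given by Zhang et al., on regularly spaced NumPy samples. It assumes that the regularization is C0 = a0 * max|f'|^2 and uses np.gradient in place of findiff. Both choices, and the name curvature1d_sketch, are illustrative assumptions and not necessarily how the library computes the result.

```python
import numpy as np


def curvature1d_sketch(values, step, a0=1.0):
    """1D curvature of regularly spaced samples (illustrative only)."""
    values = np.asarray(values, dtype=float)
    # First and second derivatives from finite differences.
    d1 = np.gradient(values, step)
    d2 = np.gradient(d1, step)
    # Assumption: the regularization scales with the largest squared slope.
    c0 = a0 * np.max(np.abs(d1)) ** 2
    return d2 / (c0 + d1**2) ** 1.5


x = np.linspace(-5.0, 5.0, 201)
peak = np.exp(-(x**2) / 2.0)  # a single Gaussian peak centred at x = 0
curv = curvature1d_sketch(peak, step=x[1] - x[0], a0=1.0)
peak_index = int(np.argmin(curv))  # the most negative curvature marks the peak
```

Smaller a0 values sharpen the feature at the cost of amplifying noise, which is why the parameter usually needs tuning per dataset.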
- erlab.analysis.image.diffn(darr, coord, order=1, **kwargs)
Calculate the nth derivative of a DataArray along a given dimension.
- Parameters:
darr (DataArray) – The input DataArray.
coord (Hashable) – The name of the coordinate along which to calculate the derivative.
order (int | Iterable[int], default: 1) – The order of the derivative. If given as a tuple, a tuple of derivatives for each order is returned. Default is 1.
**kwargs – Additional keyword arguments to findiff.Diff.
- Returns:
DataArray or tuple of DataArray – The differentiated array or a tuple of differentiated arrays corresponding to the provided order.
- Return type:
DataArray or tuple of DataArray
- erlab.analysis.image.gaussian_filter(darr, sigma, order=0, mode='nearest', cval=0.0, truncate=4.0, *, radius=None)
Coordinate-aware wrapper around scipy.ndimage.gaussian_filter.
- Parameters:
darr (DataArray) – The input DataArray.
sigma (float or Sequence of floats or dict) – The standard deviation(s) of the Gaussian filter in data dimensions. If a float, the same value is used for all dimensions, each scaled by the data step. If a dict, the value can be specified for each dimension using dimension names as keys. The filter is only applied to the dimensions specified in the dict. If a sequence, the values are used in the same order as the dimensions of the DataArray.
order (int or Sequence of ints or dict) – The order of the filter along each dimension. If an int, the same order is used for all dimensions. See Notes below for other options. Defaults to 0.
mode (str or Sequence of str or dict) – The boundary mode used for the filter. If a str, the same mode is used for all dimensions. See Notes below for other options. Defaults to ‘nearest’.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.
truncate (float, default: 4.0) – The truncation value used for the Gaussian filter. Defaults to 4.0.
radius (float or Sequence of floats or dict, optional) – The radius of the Gaussian filter in data units. See Notes below. If specified, the size of the kernel along each axis will be 2*radius + 1, and truncate is ignored.
- Returns:
gaussian_filter (xarray.DataArray) – The filtered array with the same shape as the input DataArray.
- Return type:
xarray.DataArray
Note
The sigma and radius values should be in data coordinates, not pixels.
The input array is assumed to be regularly spaced.
order, mode, and radius can be specified for each dimension using a dict or a sequence. If a dict, the value can be specified for each dimension using dimension names as keys. If a sequence and sigma is given as a dictionary, the order is assumed to be the same as the keys in sigma. If sigma is not a dictionary, the order is assumed to be the same as the dimensions of the DataArray.
See also
scipy.ndimage.gaussian_filter() – The underlying function used to apply the filter.
Example
>>> import numpy as np
>>> import xarray as xr
>>> import erlab.analysis as era
>>> darr = xr.DataArray(np.arange(50, step=2).reshape((5, 5)), dims=["x", "y"])
>>> darr
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28],
       [30, 32, 34, 36, 38],
       [40, 42, 44, 46, 48]])
Dimensions without coordinates: x, y
>>> era.image.gaussian_filter(darr, sigma=dict(x=1.0, y=1.0))
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 3,  5,  7,  8, 10],
       [10, 12, 14, 15, 17],
       [20, 22, 24, 25, 27],
       [29, 31, 33, 34, 36],
       [36, 38, 40, 41, 43]])
Dimensions without coordinates: x, y
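To make the "data coordinates, not pixels" convention concrete, the following NumPy-only sketch converts a sigma given in data units to pixels by dividing by the coordinate step, then builds a truncated Gaussian kernel whose radius follows the same rounding rule that SciPy uses for truncate. The helper name gaussian_kernel_from_data_sigma is hypothetical, and the sketch only illustrates the unit conversion; it is not the wrapper's actual code.

```python
import numpy as np


def gaussian_kernel_from_data_sigma(sigma, step, truncate=4.0):
    """Build a normalized 1D Gaussian kernel from a width given in data units."""
    sigma_pix = sigma / abs(step)  # data units -> pixels
    radius = int(truncate * sigma_pix + 0.5)  # same rounding rule as SciPy
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_pix) ** 2)
    return sigma_pix, kernel / kernel.sum()


x = np.linspace(-1.0, 1.0, 201)  # regularly spaced coordinate, step 0.01
signal = np.zeros_like(x)
signal[100] = 1.0  # unit impulse at x = 0

sigma_pix, kernel = gaussian_kernel_from_data_sigma(0.05, x[1] - x[0])
radius = kernel.size // 2
# mode="nearest" repeats the edge values, which np.pad calls "edge".
padded = np.pad(signal, radius, mode="edge")
smoothed = np.convolve(padded, kernel, mode="valid")
```

Here a width of 0.05 in data units on a grid with step 0.01 corresponds to 5 pixels, so the kernel spans 2*20 + 1 = 41 samples.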
- erlab.analysis.image.gaussian_laplace(darr, sigma, mode='nearest', cval=0.0, **kwargs)
Coordinate-aware wrapper around scipy.ndimage.gaussian_laplace.
This function calculates the Laplacian of the given array using Gaussian second derivatives.
- Parameters:
darr (DataArray) – The input DataArray.
sigma (float | Collection[float] | Mapping[Hashable, float]) – The standard deviation(s) of the Gaussian filter in data dimensions. If a float, the same value is used for all dimensions, each scaled by the data step. If a dict, the value can be specified for each dimension using dimension names as keys. If a sequence, the values are used in the same order as the dimensions of the DataArray.
mode (str | Sequence[str] | Mapping[Hashable, str], default: "nearest") – The mode parameter determines how the input array is extended beyond its boundaries. If a string, the same mode is used for all dimensions. If a sequence, the values should be the modes for each dimension in the same order as the dimensions in the DataArray. If a dictionary, the keys should be dimension names and the values should be the corresponding modes, and every dimension in the DataArray must be present. Default is “nearest”.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.
**kwargs – Additional keyword arguments to scipy.ndimage.gaussian_filter.
- Returns:
gaussian_laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.
- Return type:
xarray.DataArray
Note
sigma should be in data coordinates, not pixels.
The input array is assumed to be regularly spaced.
See also
scipy.ndimage.gaussian_laplace() – The underlying function used to apply the filter.
- erlab.analysis.image.gradient_magnitude(arr, dx, dy, mode='nearest', cval=0.0)
Calculate the gradient magnitude of an image.
The gradient magnitude is calculated as defined in Ref. [He et al., 2017], using given \(\Delta x\) and \(\Delta y\) values.
- Parameters:
arr – Input array.
dx (float) – Step size in the x-direction.
dy (float) – Step size in the y-direction.
mode (str, default: "nearest") – The mode parameter controls how the gradient is calculated at the boundaries. Default is ‘nearest’. See scipy.ndimage.generic_filter for more information.
cval (float, default: 0.0) – The value to use for points outside the boundaries when mode is ‘constant’. Default is 0.0. See scipy.ndimage.generic_filter for more information.
- Returns:
gradient_magnitude (numpy.ndarray) – Gradient magnitude of the input array. Has the same shape as the input.
- Return type:
numpy.ndarray
Note
This function calculates the gradient magnitude of an image by applying a filter that uses the given dx and dy values. The filter is defined by a kernel function that computes the squared difference between each element of the input array and the central element, divided by the corresponding distance value. The gradient magnitude is then calculated as the square root of the sum of the squared differences.
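The kernel described in this note can be sketched in plain NumPy as follows. The sketch assumes that "divided by the corresponding distance" means each neighbour difference is divided by its distance before squaring, that the eight nearest neighbours are used, and that axis 0 is y and axis 1 is x. The function name gradient_magnitude_sketch is hypothetical, and the loop is written for readability, not speed.

```python
import numpy as np


def gradient_magnitude_sketch(arr, dx, dy):
    """Gradient magnitude from the 8 nearest neighbours (illustrative only)."""
    arr = np.asarray(arr, dtype=float)
    # Repeat edge values, mimicking mode="nearest".
    padded = np.pad(arr, 1, mode="edge")
    rows, cols = arr.shape
    total = np.zeros_like(arr)
    for di in (-1, 0, 1):  # offset along axis 0, assumed to be y
        for dj in (-1, 0, 1):  # offset along axis 1, assumed to be x
            if di == 0 and dj == 0:
                continue  # skip the central element itself
            distance = np.hypot(di * dy, dj * dx)
            neighbour = padded[1 + di : 1 + di + rows, 1 + dj : 1 + dj + cols]
            total += ((neighbour - arr) / distance) ** 2
    return np.sqrt(total)


yy, xx = np.mgrid[0:5, 0:6]
ramp = 3.0 * xx  # intensity increases by 3 per pixel along x
grad = gradient_magnitude_sketch(ramp, dx=1.0, dy=1.0)
```

For a linear ramp of slope a with unit steps, the two horizontal neighbours contribute 2a^2 and the four diagonal neighbours contribute another 2a^2, so interior values equal 2|a| (6 in this example).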
- erlab.analysis.image.laplace(darr, mode='nearest', cval=0.0)
Coordinate-aware wrapper around scipy.ndimage.laplace.
This function calculates the Laplacian of the given array using approximate second derivatives.
- Parameters:
darr – The input DataArray.
mode (str | Sequence[str] | dict[str, str], default: "nearest") – The mode parameter determines how the input array is extended beyond its boundaries. If a dictionary, the keys should be dimension names and the values should be the corresponding modes, and every dimension in the DataArray must be present. Otherwise, it retains the same behavior as in scipy.ndimage.laplace. Default is ‘nearest’.
cval (float, default: 0.0) – Value to fill past edges of input if mode is ‘constant’. Defaults to 0.0.
- Returns:
laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.
- Return type:
xarray.DataArray
See also
scipy.ndimage.laplace() – The underlying function used to apply the filter.
- erlab.analysis.image.minimum_gradient(darr, mode='nearest', cval=0.0)
Minimum gradient method for detecting dispersive features in 2D data.
The minimum gradient is calculated by dividing the input DataArray by the gradient magnitude. See Ref. [He et al., 2017].
- Parameters:
darr (DataArray) – The 2D DataArray for which to calculate the minimum gradient.
mode (str, default: 'nearest') – The mode parameter controls how the gradient is calculated at the boundaries. Default is ‘nearest’. See scipy.ndimage.generic_filter for more information.
cval (float, default: 0.0) – The value to use for points outside the boundaries when mode is ‘constant’. Default is 0.0. See scipy.ndimage.generic_filter for more information.
- Returns:
minimum_gradient (xarray.DataArray) – The minimum gradient of the input DataArray. Has the same shape as the input.
- Raises:
ValueError – If the input DataArray is not 2D.
- Return type:
xarray.DataArray
Note
Any zero gradient values are replaced with NaN.
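The division step and the NaN handling can be sketched as below. For brevity, the gradient magnitude here is a stand-in computed with np.gradient and np.hypot, not the neighbour-based kernel of He et al. that gradient_magnitude describes, and the name minimum_gradient_sketch is hypothetical.

```python
import numpy as np


def minimum_gradient_sketch(arr, dy=1.0, dx=1.0):
    """Divide an image by its gradient magnitude (illustrative only)."""
    arr = np.asarray(arr, dtype=float)
    # Stand-in gradient magnitude from central differences.
    grad_y, grad_x = np.gradient(arr, dy, dx)
    grad = np.hypot(grad_x, grad_y)
    # Replace zero gradients with NaN so the division never yields inf.
    grad[grad == 0] = np.nan
    return arr / grad


flat = np.ones((4, 4))  # zero gradient everywhere
ramp = np.tile(np.arange(1.0, 6.0), (4, 1))  # unit slope along axis 1
flat_result = minimum_gradient_sketch(flat)
ramp_result = minimum_gradient_sketch(ramp)
```

A completely flat image therefore yields NaN everywhere, while a unit-slope ramp is returned unchanged because its gradient magnitude is 1.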
- erlab.analysis.image.ndsavgol(arr, window_shape, polyorder, deriv=0, delta=1.0, mode='mirror', cval=0.0, method='pinv')
Apply a Savitzky-Golay filter to an N-dimensional array.
Unlike scipy.signal.savgol_filter, which is limited to 1D arrays, this function calculates multi-dimensional Savitzky-Golay filters. There are some subtle differences in the implementation, so the results may not be identical. See Notes.
- Parameters:
arr (array-like) – The input N-dimensional array to be filtered. The array will be cast to float64 before filtering.
window_shape (int or tuple of ints) – The shape of the window used for filtering. If an integer, the same size will be used across all axes.
polyorder (int) – The order of the polynomial used to fit the samples. polyorder must be less than the minimum of window_shape.
deriv (int or tuple of ints) – The order of the derivative to compute given as a single integer or a tuple of integers. If an integer, the derivative of that order is computed along all axes. If a tuple of integers, the derivative of each order is computed along the corresponding dimension. The default is 0, which means to filter the data without differentiating.
delta (float or tuple of floats) – The spacing of the samples to which the filter will be applied. If a float, the same value is used for all axes. If a tuple, the values are used in the same order as in deriv. The default is 1.0.
mode (Literal['mirror', 'constant', 'nearest', 'wrap'], default: "mirror") – Must be ‘mirror’, ‘constant’, ‘nearest’, or ‘wrap’. This determines the type of extension to use for the padded signal to which the filter is applied. When mode is ‘constant’, the padding value is given by cval.
cval (float) – Value to fill past the edges of the input if mode is ‘constant’. Default is 0.0.
method (Literal['pinv', 'lstsq'], default: "pinv") – Must be ‘pinv’ or ‘lstsq’. Determines the method used to calculate the filter coefficients. ‘pinv’ uses the pseudoinverse of the Vandermonde matrix, while ‘lstsq’ uses least squares for each window position. ‘lstsq’ is much slower but may be more numerically stable in some cases. The difference is more pronounced for higher dimensions, larger window size, and higher polynomial orders. The default is ‘pinv’.
- Returns:
numpy.ndarray – The filtered array.
See also
scipy.signal.savgol_filter() – The 1D Savitzky-Golay filter function in SciPy.
Notes
For even window sizes, the results may differ slightly from scipy.signal.savgol_filter due to differences in the implementation.
This function is not suitable for cases where accumulated floating point errors are comparable to the filter coefficients, i.e., for high number of dimensions and large window sizes.
mode='interp' is not implemented as it is not clear how to handle the edge cases in higher dimensions.
Examples
>>> import numpy as np
>>> import erlab.analysis as era
>>> arr = np.array([1, 2, 3, 4, 5])
>>> era.image.ndsavgol(arr, (3,), polyorder=2)
array([1., 2., 3., 4., 5.])
>>> era.image.ndsavgol(arr, (3,), polyorder=2, deriv=1)
array([0., 1., 1., 1., 0.])
>>> arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> era.image.ndsavgol(arr, (3, 3), polyorder=2)
array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ],
       [3.5, 4. , 4.5]])
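For intuition about the 'pinv' method, here is a minimal 1D sketch of how Savitzky-Golay weights follow from the pseudoinverse of a Vandermonde matrix. It covers only odd window sizes in one dimension; the helper name savgol_coeffs_sketch is hypothetical, and the N-dimensional function may organize the computation differently.

```python
import math

import numpy as np


def savgol_coeffs_sketch(window, polyorder, deriv=0, delta=1.0):
    """1D Savitzky-Golay weights from the pseudoinverse of a Vandermonde matrix."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    vander = np.vander(offsets, polyorder + 1, increasing=True)
    # Row k of the pseudoinverse maps samples to the k-th polynomial coefficient.
    pinv = np.linalg.pinv(vander)
    # The k-th derivative at the window centre is k! times that coefficient.
    return pinv[deriv] * math.factorial(deriv) / delta**deriv


smooth = savgol_coeffs_sketch(5, 2)
slope = savgol_coeffs_sketch(5, 2, deriv=1)

# Weights are applied as a correlation over each 5-sample window.
data = np.arange(10.0) ** 2
filtered = np.correlate(data, smooth, mode="valid")
```

With a 5-point window and a quadratic fit this reproduces the classic weights (-3, 12, 17, 12, -3)/35 for smoothing and (-2, -1, 0, 1, 2)/10 for the first derivative, and quadratic data passes through the smoothing filter unchanged.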
- erlab.analysis.image.remove_stripe(darr, deg, full=False, **sel_kw)
Remove angle-dependent stripe artifact from cuts and maps.
Energy-independent stripe artifacts may be introduced during the acquisition of ARPES data due to imperfect alignment of the slit or other experimental factors.
Assume an original intensity \(I_0(\alpha, \omega)\) that is corrupted by an energy-independent stripe pattern \(S(\alpha)\):
\[I(\alpha, \omega) = I_0(\alpha, \omega) \cdot S(\alpha).\]
If we assume \(S(\alpha)\) to be high-frequency noise, we may approximate \(I_0(\alpha, \omega)\) by smoothing \(I(\alpha, \omega)\) along \(\alpha\). We can then obtain an approximation of \(1/S(\alpha)\) by dividing the smoothed data by \(I(\alpha, \omega)\) and averaging the result over \(\omega\). Finally, we can remove the stripe pattern by multiplying \(I(\alpha, \omega)\) by the obtained \(1/S(\alpha)\).
Works best for data with a high signal-to-noise ratio and a high background level. Since the stripe is assumed to be energy-independent, the method is only suitable for data acquired with sweep mode.
This method may introduce artifacts that are not present in the original data, making it unsuitable for quantitative analysis. Use only for visualization purposes.
- Parameters:
darr (DataArray) – The data containing the stripe artifact. Data must have the dimensions “alpha” and “eV”.
deg (int) – The degree of the polynomial fit. The degree should be high enough to capture all intrinsic features of the data, but low enough to avoid overfitting. A good starting value is around 20.
full (bool, default: False) – Flag determining whether to return the full stripe pattern. If True, \(1/S(\alpha)\) is also returned. Default is False.
**sel_kw – Keyword arguments to xarray.DataArray.sel(). Specify the range of angles and energies to use for the polynomial fit.
- Returns:
corrected (xarray.DataArray) – The data with the stripe artifact removed.
stripe (xarray.DataArray) – The stripe pattern \(1/S(\alpha)\). Only returned if full is True.
- Return type:
xarray.DataArray or tuple of xarray.DataArray
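The procedure described above (smooth along alpha, divide, average over energy, multiply) can be sketched on plain NumPy arrays as follows. The sketch assumes the smoothing is a single polynomial fit per energy column and ignores the **sel_kw selection. The name remove_stripe_sketch, the synthetic data, and the low degree of 4 are illustrative choices, not the library's implementation.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def remove_stripe_sketch(intensity, alpha, deg):
    """Divide out an energy-independent stripe (illustrative only).

    ``intensity`` has shape (n_alpha, n_energy).
    """
    # Smooth along alpha with one polynomial fit per energy column.
    coeffs = P.polyfit(alpha, intensity, deg)
    smoothed = P.polyval(alpha, coeffs).T  # back to (n_alpha, n_energy)
    # smoothed / intensity approximates 1/S; average it over energy.
    inverse_stripe = np.mean(smoothed / intensity, axis=1)
    return intensity * inverse_stripe[:, np.newaxis], inverse_stripe


alpha = np.linspace(-1.0, 1.0, 101)
energy = np.linspace(0.0, 1.0, 20)
# Smooth synthetic intensity I_0 with a high background level.
clean = 10.0 + 2.0 * alpha[:, np.newaxis] ** 2 + energy[np.newaxis, :]
# High-frequency, energy-independent stripe S(alpha).
stripe = 1.0 + 0.05 * np.sin(60.0 * alpha)
measured = clean * stripe[:, np.newaxis]

corrected, inverse_stripe = remove_stripe_sketch(measured, alpha, deg=4)
error_before = np.mean(np.abs(measured - clean))
error_after = np.mean(np.abs(corrected - clean))
```

On this synthetic input the corrected data sits much closer to the clean intensity than the measured data does, although residual errors remain largest near the ends of the alpha range where the polynomial fit is least constrained.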
- erlab.analysis.image.scaled_laplace(darr, factor=1.0, **kwargs)
Calculate the Laplacian of a 2D DataArray with different scaling for each axis.
This function calculates the Laplacian of the given array using approximate second derivatives, taking the different scaling for each axis into account.
\[\Delta f \sim \frac{\partial^2 f}{\partial x^2} \left(\frac{\Delta x}{\Delta y}\right)^{\!2} + \frac{\partial^2 f}{\partial y^2}\]
See Ref. [Zhang et al., 2011] for more information.
- Parameters:
darr – The 2D DataArray for which to calculate the scaled Laplacian.
factor (float, default: 1.0) – The factor by which to scale the x-axis derivative. Negative values will scale the y-axis derivative instead. Default is 1.0.
**kwargs – Additional keyword arguments to findiff.Diff.
- Returns:
scaled_laplace (xarray.DataArray) – The filtered array with the same shape as the input DataArray.
- Return type:
xarray.DataArray
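The scaled Laplacian formula can be checked numerically with a short NumPy sketch. It assumes axis 0 is x and axis 1 is y, uses repeated np.gradient calls in place of findiff, and does not model the factor argument. The name scaled_laplace_sketch is hypothetical.

```python
import numpy as np


def scaled_laplace_sketch(values, dx, dy):
    """Evaluate d2f/dx2 * (dx/dy)**2 + d2f/dy2 on a regular grid.

    Axis 0 is assumed to be x and axis 1 to be y.
    """
    values = np.asarray(values, dtype=float)
    # Second derivatives from repeated finite differences along each axis.
    fx = np.gradient(values, dx, axis=0, edge_order=2)
    fxx = np.gradient(fx, dx, axis=0, edge_order=2)
    fy = np.gradient(values, dy, axis=1, edge_order=2)
    fyy = np.gradient(fy, dy, axis=1, edge_order=2)
    return fxx * (dx / dy) ** 2 + fyy


x = np.arange(0.0, 5.0, 0.5)  # step 0.5
y = np.arange(0.0, 8.0, 1.0)  # step 1.0
f = x[:, np.newaxis] ** 2 + y[np.newaxis, :] ** 2
result = scaled_laplace_sketch(f, dx=0.5, dy=1.0)
```

For f = x^2 + y^2 both second derivatives equal 2, so with a step ratio of 0.5 the scaled Laplacian is 2 * 0.25 + 2 = 2.5 everywhere, whereas the unscaled Laplacian would be 4.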