Selecting and indexing data
In most cases, the powerful data manipulation and indexing methods provided by xarray are sufficient. This page summarizes some frequently used xarray features, along with some utilities provided by this package. Refer to the xarray user guide for more information.
First, let us import some example data: a simple tight-binding simulation of graphene.
[1]:
import xarray as xr
xr.set_options(display_expand_data=False)
[1]:
<xarray.core.options.set_options at 0x7fed44167f90>
[2]:
from erlab.io.exampledata import generate_data
dat = generate_data(seed=1).T
[3]:
dat
[3]:
<xarray.DataArray (eV: 300, ky: 250, kx: 250)> Size: 150MB
0.5243 1.033 0.6037 1.048 0.4388 ... 0.0003526 5.536e-06 2.813e-07 6.99e-08
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
We can see that the generated data is a three-dimensional xarray.DataArray. Now, let's extract a cut along \(k_y = 0.3\).
[4]:
dat.sel(ky=0.3, method="nearest")
[4]:
<xarray.DataArray (eV: 300, kx: 250)> Size: 600kB
1.535 1.377 0.9181 0.4302 0.5897 ... 1.171e-06 8.757e-06 0.0002878 0.001415
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    ky       float64 8B 0.2967
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
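The method="nearest" argument matters here because 0.3 is not itself a grid point: the output above reports the selected coordinate as ky = 0.2967. The following is a minimal sketch of that behavior in plain xarray. It uses a small stand-in array named toy (an assumption made for illustration, not the example data) whose ky axis is built to resemble the one shown above.

```python
import numpy as np
import xarray as xr

# Stand-in 1D array; the ky grid is assumed to resemble the example data's axis
ky = np.linspace(-0.89, 0.89, 250)
toy = xr.DataArray(np.arange(250.0), coords={"ky": ky}, dims=("ky",))

# An exact label lookup fails when the value is not on the grid
try:
    toy.sel(ky=0.3)
    exact_lookup_failed = False
except KeyError:
    exact_lookup_failed = True

# method="nearest" snaps to the closest grid point instead
nearest = toy.sel(ky=0.3, method="nearest")
nearest_ky = float(nearest.ky)
```

With this grid, the exact lookup raises a KeyError, while the nearest-neighbor lookup returns the grid point closest to 0.3.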
How about the Fermi surface?
[5]:
dat.sel(eV=0.0, method="nearest")
[5]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.3501 0.1119 0.1255 0.1379 0.05128 ... 0.5261 0.2332 0.1398 0.1466 0.1662
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    eV       float64 8B -0.000301
It is often necessary to integrate over several adjacent slices. This can be done by selecting a range and averaging over it. The following code averages the intensity over a 50 meV window centered at \(E_F\).
[6]:
dat.sel(eV=slice(-0.025, 0.025)).mean("eV")
[6]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
However, writing this out every time is cumbersome, and the eV coordinate is lost in the result. ERLabPy provides a callable accessor, qsel, to streamline this process.
[7]:
dat.qsel(eV=0.0, eV_width=0.05)
[7]:
<xarray.DataArray (ky: 250, kx: 250)> Size: 500kB
0.2707 0.2155 0.2026 0.2084 0.1769 0.1773 ... 0.1942 0.2472 0.2516 0.2399 0.3594
Coordinates:
  * kx       (kx) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
  * ky       (ky) float64 2kB -0.89 -0.8829 -0.8757 ... 0.8757 0.8829 0.89
    eV       float64 8B 0.000602
Note that the averaged coordinate is automatically added to the data array. This is useful for plotting and further analysis.
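To make the bookkeeping concrete, here is a minimal sketch of a window average that keeps the averaged coordinate, written in plain xarray. The helper window_mean and the stand-in array toy are hypothetical names introduced for illustration. This is not how qsel is implemented, only one way to obtain a comparable result by hand.

```python
import numpy as np
import xarray as xr


def window_mean(darr, dim, center, width):
    """Average darr over [center - width/2, center + width/2] along dim.

    Hypothetical helper for illustration; not part of ERLabPy or xarray.
    """
    window = darr.sel({dim: slice(center - width / 2, center + width / 2)})
    averaged = window.mean(dim)
    # Re-attach the mean of the selected coordinate values as a scalar coordinate
    return averaged.assign_coords({dim: float(window[dim].mean())})


# Small stand-in array with an energy axis resembling the example data
eV = np.linspace(-0.45, 0.09, 300)
kx = np.linspace(-0.89, 0.89, 5)
rng = np.random.default_rng(0)
toy = xr.DataArray(rng.random((300, 5)), coords={"eV": eV, "kx": kx}, dims=("eV", "kx"))

result = window_mean(toy, "eV", center=0.0, width=0.05)
```

The result has the same values as toy.sel(eV=slice(-0.025, 0.025)).mean("eV"), but it also carries a scalar eV coordinate holding the mean energy of the window.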
If the width is not specified, qsel behaves like passing method="nearest" to sel. If a slice is given instead of a single value, no integration is performed. All of these methods can be combined:
[8]:
dat.qsel(kx=slice(-0.3, 0.3), ky=0.3, eV=0.0, eV_width=0.05)
[8]:
<xarray.DataArray (kx: 84)> Size: 672B
0.3407 0.3622 0.3589 0.3659 0.2786 0.3363 ... 0.3541 0.318 0.3214 0.305 0.2766
Coordinates:
  * kx       (kx) float64 672B -0.2967 -0.2895 -0.2824 ... 0.2824 0.2895 0.2967
    ky       float64 8B 0.2967
    eV       float64 8B 0.000602
Masking
In some cases, it is necessary to mask the data. Although basic masks are supported by xarray, ERLabPy provides a way to mask data with arbitrary polygons.
Work in Progress
This part of the user guide is still under construction. For now, see erlab.analysis.mask. For the full list of packages and modules provided by ERLabPy, see the API Reference.
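Until this section is complete, the general idea of a polygon mask can be sketched with plain xarray and matplotlib. The example below does not use erlab.analysis.mask and makes no claim about its API. It builds a boolean mask from matplotlib.path.Path.contains_points and applies it with DataArray.where. The array toy and the triangle vertices are placeholders chosen for illustration.

```python
import numpy as np
import xarray as xr
from matplotlib.path import Path

# Stand-in 2D array on a regular (ky, kx) grid
kx = np.linspace(-1, 1, 21)
ky = np.linspace(-1, 1, 21)
toy = xr.DataArray(np.ones((21, 21)), coords={"ky": ky, "kx": kx}, dims=("ky", "kx"))

# Polygon given as (kx, ky) vertices; the first vertex is repeated to close it
polygon = Path([(0.0, 0.0), (0.8, 0.0), (0.0, 0.8), (0.0, 0.0)])

# Test every grid point; meshgrid returns arrays of shape (len(ky), len(kx))
KX, KY = np.meshgrid(kx, ky)
inside = polygon.contains_points(np.column_stack([KX.ravel(), KY.ravel()]))
mask = xr.DataArray(inside.reshape(KX.shape), coords=toy.coords, dims=toy.dims)

# Keep values inside the polygon; everything else becomes NaN
masked = toy.where(mask)
```

Points that lie exactly on the polygon edge may fall on either side of the test, so the boundary should not be relied upon.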
Interpolation
In addition to the powerful interpolation methods provided by xarray, ERLabPy provides a convenient way to interpolate data along an arbitrary path.
Consider a Γ-M-K-Γ high symmetry path given as a list of kx and ky coordinates:
[9]:
import erlab.plotting.erplot as eplt
import matplotlib.pyplot as plt
import numpy as np
a = 6.97
kx = [0, 2 * np.pi / (a * np.sqrt(3)), 2 * np.pi / (a * np.sqrt(3)), 0]
ky = [0, 0, 2 * np.pi / (a * 3), 0]
dat.qsel(eV=-0.2).qplot(aspect="equal", cmap="Greys")
plt.plot(kx, ky, "o-")
[9]:
[<matplotlib.lines.Line2D at 0x7fecfdda9a50>]
To interpolate the data along this path with a step of 0.01 Å\(^{-1}\), we can use the following code:
[10]:
import erlab.analysis as era
dat_sliced = era.interpolate.slice_along_path(
    dat, vertices={"kx": kx, "ky": ky}, step_size=0.01
)
dat_sliced
[10]:
<xarray.DataArray (eV: 300, path: 140)> Size: 336kB
0.07295 0.1004 0.4831 0.6724 0.1885 ... 1.159e-13 1.01e-07 0.00131 0.138 0.1486
Coordinates:
  * eV       (eV) float64 2kB -0.45 -0.4482 -0.4464 ... 0.08639 0.08819 0.09
    kx       (path) float64 1kB 0.0 0.01021 0.02041 ... 0.01764 0.008821 0.0
    ky       (path) float64 1kB 0.0 0.0 0.0 0.0 ... 0.01528 0.01019 0.005093 0.0
  * path     (path) float64 1kB 0.0 0.01021 0.02041 ... 1.402 1.412 1.422
We can see that the data has been interpolated along the path. The new coordinate path contains the distance along the path, and kx and ky, which were dimensions before, are now coordinates expressed in terms of path.
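For intuition about where the sampled kx and ky values come from, the sketch below generates points along a piecewise-linear path with numpy alone. The function sample_path is a hypothetical helper written for illustration. It is not the implementation of slice_along_path, and the number of points it produces per segment is an assumption that need not match the output above.

```python
import numpy as np


def sample_path(kx, ky, step):
    """Sample a piecewise-linear path at spacing no larger than step.

    Hypothetical helper for illustration; returns the sampled points with
    shape (n, 2) and the cumulative distance along the path at each point.
    """
    vertices = np.column_stack([kx, ky]).astype(float)
    points = [vertices[:1]]
    for start, end in zip(vertices[:-1], vertices[1:]):
        length = np.linalg.norm(end - start)
        n_intervals = max(int(np.ceil(length / step)), 1)
        # Skip t = 0 so that shared vertices are not duplicated
        t = np.linspace(0, 1, n_intervals + 1)[1:, None]
        points.append(start + t * (end - start))
    points = np.concatenate(points)
    spacing = np.linalg.norm(np.diff(points, axis=0), axis=1)
    distance = np.concatenate(([0.0], np.cumsum(spacing)))
    return points, distance


a = 6.97
kx = [0, 2 * np.pi / (a * np.sqrt(3)), 2 * np.pi / (a * np.sqrt(3)), 0]
ky = [0, 0, 2 * np.pi / (a * 3), 0]
points, distance = sample_path(kx, ky, step=0.01)
```

Here distance plays the role of the path coordinate, and the two columns of points correspond to kx and ky. Interpolating the data at these points, one energy slice at a time, would then give a cut along the path.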
The distance along the path can be calculated as the sum of the distances between consecutive points in the path.
[11]:
dat_sliced.plot(cmap="Greys")
eplt.fermiline()
# Distance between each pair of consecutive points
distances = np.linalg.norm(np.diff(np.vstack([kx, ky]), axis=-1), axis=0)
seg_coords = np.concatenate(([0], np.cumsum(distances)))
plt.xticks(seg_coords, labels=["Γ", "M", "K", "Γ"])
plt.xlim(0, seg_coords[-1])
for seg in seg_coords[1:-1]:
    plt.axvline(seg, ls="--", c="k", lw=1)
You will learn more about plotting in the next section.