erlab.utils.formatting

Utilities related to representing data in a human-readable format.

Functions

`format_darr_html`(darr, *[, show_size, ...])	Make a simple HTML representation of a DataArray.
`format_darr_shape_html`(darr[, show_size])	Make a simple HTML representation of a DataArray's shape.
`format_html_accent`(text[, bold, em_space])	Return an HTML string with the text colored with the accent color.
`format_html_table`(rows[, header_cols, ...])	Create a simple HTML table from a dictionary.
`format_nbytes`(value[, fmt, sep])	Format the given number of bytes in a human-readable format.
`format_value`(val[, precision, use_unicode_minus])	Format the given value based on its type.

erlab.utils.formatting.format_darr_html(darr, *, show_size=True, show_summary=True, load_values=True, additional_info=None)

Make a simple HTML representation of a DataArray.

Parameters:

darr (DataArray) – The DataArray to represent.
show_size (bool, default: True) – Whether to include the size of the DataArray in the representation.
show_summary (bool, default: True) – Whether to include the DataArray name, dimensions, and optional size before the coordinates and attributes.
load_values (bool, default: True) – Whether coordinate values may be loaded while formatting coordinate summaries. Disable this for metadata-only previews of lazily loaded arrays.
additional_info (Iterable[str] | None, default: None) – Additional information to include in the representation. Each item in the list is added as a separate paragraph.

Returns:

str – The HTML representation of the DataArray.

Return type:

str

erlab.utils.formatting.format_darr_shape_html(darr, show_size=False)

Make a simple HTML representation of a DataArray’s shape.

Parameters:

darr (DataArray) – The DataArray to represent.
show_size (bool, default: False) – Whether to include the size of the DataArray in the representation.

Returns:

str – The HTML representation of the DataArray’s shape.

Return type:

str

erlab.utils.formatting.format_html_accent(text, bold=False, em_space=False)

Return an HTML string with the text colored with the accent color.

Parameters:

text (Hashable) – The text to apply the accent color to.
bold (bool, default: False) – Whether to make the text bold.
em_space (bool, default: False) – Whether to add an em space after the text.

Returns:

str – The text with the accent color applied in HTML.

Return type:

str

erlab.utils.formatting.format_html_table(rows, header_cols=0, header_rows=0, use_thead=True)

Create a simple HTML table from a dictionary.

erlab.utils.formatting.format_nbytes(value, fmt='%.1f', sep=' ')

Format the given number of bytes in a human-readable format.

Parameters:

value (float | str) – The number of bytes to format.
fmt (str, default: "%.1f") – The format string to use when formatting the number of bytes.
sep (str, default: " ") – The separator to use between the formatted number of bytes and the unit.
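
To illustrate how `fmt` and `sep` combine, here is a minimal standalone sketch. It is not erlab's implementation: the helper name `sketch_format_nbytes`, the 1024-based scaling, and the unit labels are all assumptions made for this example, and the library's actual units may differ.

```python
def sketch_format_nbytes(value, fmt="%.1f", sep=" "):
    """Hypothetical stand-in for illustration; not erlab's implementation."""
    # Assumption: binary (1024-based) scaling with IEC-style unit labels.
    units = ("B", "KiB", "MiB", "GiB", "TiB", "PiB")
    size = float(value)  # the documented signature accepts float or str
    index = 0
    while abs(size) >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    # fmt formats the number; sep is placed between the number and the unit
    return (fmt % size) + sep + units[index]


print(sketch_format_nbytes(1536))                           # 1.5 KiB
print(sketch_format_nbytes("1048576", fmt="%.0f", sep=""))  # 1MiB
```

Passing a different `fmt` changes only the numeric part, and an empty `sep` joins the number and unit directly.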

erlab.utils.formatting.format_value(val, precision=4, use_unicode_minus=False)

Format the given value based on its type.

This function converts values of various types to human-readable strings. It is used throughout the codebase to give values a consistent representation.

Parameters:

val (object) – The value to be formatted.
precision (int, default: 4) – The number of decimal places to use when formatting floating-point numbers. If the magnitude of the value is smaller than the precision, the number will be printed in scientific notation.
use_unicode_minus (bool, default: False) – Whether to replace the Unicode hyphen-minus sign “-” (U+002D) with the better-looking Unicode minus sign “−” (U+2212) in the formatted value.
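
The effect of `precision` and `use_unicode_minus` on a single float can be sketched without the library. This is an illustration under assumptions: the helper `sketch_format_float` is hypothetical, trailing zeros are assumed to be trimmed (which is consistent with the documented outputs `'3.1416'` and `'42'`), and the scientific-notation fallback is not modeled.

```python
def sketch_format_float(val, precision=4, use_unicode_minus=False):
    """Hypothetical helper for illustration; not erlab's implementation."""
    # Fixed-point with `precision` decimals, then trim trailing zeros (assumption)
    text = f"{val:.{precision}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    if use_unicode_minus:
        # Swap the hyphen-minus (U+002D) for the minus sign (U+2212)
        text = text.replace("-", "\u2212")
    return text


print(sketch_format_float(3.14159))                       # 3.1416
print(sketch_format_float(42.0))                          # 42
print(sketch_format_float(-1.5, use_unicode_minus=True))  # −1.5
```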

Returns:

str or object – The formatted value.

Return type:

str

Note

Supported types include numpy arrays, lists of strings, floating-point numbers, integers, and datetime objects. The rules for each type are as follows.

For numpy arrays:
- If the array has a size of 1, the value is recursively formatted using format_value(val.item()).
- If the array can be squeezed to a 1-dimensional array, the following rules apply:
  - If the array is evenly spaced, the start, end, step, and length values are formatted and returned as a string in the format “start→end (step, length)”.
  - If the array is monotonically increasing or decreasing but not evenly spaced, the start, end, and length values are formatted and returned as a string in the format “start→end (length)”.
  - If all elements are equal, the value is recursively formatted using format_value(val[0]).
  - If the array is not monotonic, the minimum and maximum values are formatted and returned as a string in the format “min~max”.
  - If the array has two elements, the two elements are formatted and returned.
- For arrays with more dimensions, the minimum and maximum values are formatted and returned as a string in the format “min~max”.

For lists:
- The list is grouped by consecutive equal elements, and each element is returned with its count as a string in the format “[element]×count”.

For floating-point numbers:
- If the number has an integer value, it is formatted as an integer using format_value(np.int64(val)).
- Otherwise, it is formatted as a floating-point number with the specified number of decimal places and returned as a string.

For integers:
- The integer is returned as a string.

For datetime objects:
- They are formatted as a string in the format “%Y-%m-%d %H:%M:%S”. This includes datetime.datetime, numpy.datetime64, and pandas.Timestamp objects.

For datetime.date objects:
- They are formatted as a string in the format “%Y-%m-%d”.

For other types:
- The value is returned as is.
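
The 1-dimensional array rules above amount to a small decision tree. The sketch below only names which rule would apply, using plain Python sequences instead of numpy arrays. `sketch_classify_1d` is a hypothetical helper, and the order of the checks, the tolerance used for "evenly spaced", and the non-strict reading of "monotonic" are assumptions. The two-element rule is not modeled.

```python
import math


def sketch_classify_1d(values, rel_tol=1e-9):
    """Hypothetical helper: name which 1-D rule would apply to `values`."""
    if len(values) < 2:
        raise ValueError("size-1 input is covered by a separate rule")
    # All elements equal: formatted as a single value
    if all(v == values[0] for v in values):
        return "constant"
    # Differences between neighbouring elements
    diffs = [b - a for a, b in zip(values, values[1:])]
    # Assumption: "evenly spaced" means equal differences within a tolerance
    if all(math.isclose(d, diffs[0], rel_tol=rel_tol) for d in diffs):
        return "evenly spaced"
    # Assumption: non-strict monotonicity (repeated neighbours allowed)
    if all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs):
        return "monotonic"
    return "non-monotonic"


for data in ([0.1, 0.15, 0.2], [1.0, 2.0, 2.1], [1.0, 2.1, 2.0], [5, 5, 5]):
    print(data, "->", sketch_classify_1d(data))
```

The tolerance matters because floating-point steps such as `0.15 - 0.1` and `0.2 - 0.15` are not bit-for-bit equal.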

Examples

>>> format_value(np.array([0.1, 0.15, 0.2]))
'0.1→0.2 (0.05, 3)'

>>> format_value(np.array([1.0, 2.0, 2.1]))
'1→2.1 (3)'

>>> format_value(np.array([1.0, 2.1, 2.0]))
'1~2.1 (3)'

>>> format_value([1, 1, 2, 2, 2, 3, 3, 3, 3])
'[1]×2, [2]×3, [3]×4'

>>> format_value(3.14159)
'3.1416'

>>> format_value(42.0)
'42'

>>> format_value(42)
'42'

>>> format_value(datetime.datetime(2024, 1, 1, 12, 0, 0, 0))
'2024-01-01 12:00:00'
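
The run-length grouping shown in the list example can be reproduced with `itertools.groupby` from the standard library. This is a standalone sketch, not the library's code: `sketch_group_runs` is a hypothetical name, and the output format is copied from the documented example.

```python
from itertools import groupby


def sketch_group_runs(items):
    """Hypothetical helper: summarize runs of consecutive equal elements."""
    parts = []
    for element, run in groupby(items):
        count = sum(1 for _ in run)  # length of this consecutive run
        parts.append(f"[{element}]×{count}")
    return ", ".join(parts)


print(sketch_group_runs([1, 1, 2, 2, 2, 3, 3, 3, 3]))  # [1]×2, [2]×3, [3]×4
print(sketch_group_runs(["a", "b", "a"]))              # [a]×1, [b]×1, [a]×1
```

Because only consecutive elements are grouped, a value that reappears later starts a new run instead of being merged with its earlier occurrence.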