Zarr

async_hdf5.zarr.HDF5Store

Bases: Store

Read-only Zarr v3 store backed by async-hdf5 with lazy chunk index resolution.

Dataset metadata (shape, dtype, codecs, etc.) is parsed eagerly at construction — this is fast because the data is already in async-hdf5's BlockCache from the superblock / group parse. Chunk indices (B-tree / FixedArray traversal) are parsed lazily on first chunk access per variable.
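The lazy, per-variable resolution described above can be pictured with a small sketch. This is not async-hdf5's implementation: `LazyChunkIndex`, `fake_parse`, and the index layout (chunk coordinates mapped to a byte offset and length) are hypothetical names chosen for illustration. It only shows the general pattern of parsing an index once, on first access, and reusing it afterwards.

```python
import asyncio


class LazyChunkIndex:
    """Hypothetical helper: parse each variable's chunk index once, on first access."""

    def __init__(self, parse_index):
        self._parse_index = parse_index  # async callable: variable name -> index
        self._indices = {}               # variable name -> parsed index
        self._locks = {}                 # variable name -> asyncio.Lock

    async def get(self, name):
        # Fast path: the index for this variable was already parsed.
        if name in self._indices:
            return self._indices[name]
        # One lock per variable, so concurrent first reads parse only once.
        lock = self._locks.setdefault(name, asyncio.Lock())
        async with lock:
            if name not in self._indices:
                self._indices[name] = await self._parse_index(name)
        return self._indices[name]


parse_calls = []


async def fake_parse(name):
    # Stand-in for a B-tree / FixedArray traversal (assumed shape of the result).
    parse_calls.append(name)
    await asyncio.sleep(0)
    return {(0, 0): (4096, 1024)}  # chunk coords -> (byte offset, length)


async def demo():
    index = LazyChunkIndex(fake_parse)
    before = list(parse_calls)  # nothing is parsed at construction
    # Two concurrent reads plus a later read trigger a single parse.
    await asyncio.gather(index.get("temperature"), index.get("temperature"))
    await index.get("temperature")
    return before, list(parse_calls)


before, after = asyncio.run(demo())
```

In this sketch, a variable that is never read never pays the cost of index traversal. That is the property the store's description relies on for fast opens.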

Parameters:

  • dataset_infos (dict[str, _DatasetInfo]) –

    Mapping from variable name to _DatasetInfo (pre-parsed metadata).

  • group_attrs (dict[str, Any]) –

    Root group attributes.

  • file_url (str) –

    Full URL of the HDF5 file (used for equality checks).

async_hdf5.zarr.open_hdf5 async

open_hdf5(
    *,
    path: str,
    store: ObjectStore | ObspecInput,
    group: str | None = None,
    drop_variables: Iterable[str] | None = None,
    block_size: int = 8 * 1024 * 1024,
    pre_warm_size: int | None = None,
) -> HDF5Store

Open an HDF5 file as an HDF5Store (read-only Zarr v3 store).

Chunk index parsing is deferred until data is actually accessed, making the initial open significantly faster for files with many variables.

The returned store can be passed directly to xarray:

store = await open_hdf5(path=key, store=s3_store)
ds = xr.open_dataset(store, engine="zarr", consolidated=False, zarr_format=3)

Parameters:

  • path (str) –

    Path within the store (e.g. the object key for S3).

  • store (ObjectStore | ObspecInput) –

    An async_hdf5 or obstore ObjectStore, or any obspec-compatible backend.

  • group (str | None, default: None ) –

    HDF5 group to open. If None, the root group is used.

  • drop_variables (Iterable[str] | None, default: None ) –

    Variable names to exclude.

  • block_size (int, default: 8 * 1024 * 1024 ) –

    BlockCache size in bytes (default 8 MiB).

  • pre_warm_size (int | None, default: None ) –

    Maximum bytes of cache blocks to pre-fetch in parallel during open (default None — disabled). When set, the file size is queried and up to pre_warm_size bytes of blocks are fetched in a single batched call. This trades bandwidth for latency: useful on very fast connections or for small files, but counterproductive when the file is large relative to the connection speed.
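To make the pre_warm_size budget concrete, the following sketch computes which block-aligned byte ranges would fit within it. It is an illustration under stated assumptions, not the library's code: `pre_warm_ranges` is a hypothetical helper, and it assumes blocks are taken from the start of the file and that only whole blocks within the budget are fetched.

```python
MIB = 1024 * 1024


def pre_warm_ranges(file_size, block_size=8 * MIB, pre_warm_size=None):
    """Hypothetical: (start, end) byte ranges of blocks that fit in the pre-warm budget."""
    if pre_warm_size is None:
        return []  # pre-warming disabled, matching the documented default
    blocks_in_file = -(-file_size // block_size)    # ceiling division
    blocks_in_budget = pre_warm_size // block_size  # whole blocks only (assumption)
    count = min(blocks_in_file, blocks_in_budget)
    # The last block is clipped to the end of the file.
    return [
        (i * block_size, min((i + 1) * block_size, file_size))
        for i in range(count)
    ]


# A 20 MiB file with a 16 MiB budget: the first two 8 MiB blocks.
large = pre_warm_ranges(20 * MIB, 8 * MIB, 16 * MIB)

# A 10 MiB file with a 64 MiB budget: the whole file, in two blocks.
small = pre_warm_ranges(10 * MIB, 8 * MIB, 64 * MIB)
```

Under these assumptions, a small file is covered entirely by one batched fetch. For a large file, only the leading blocks are fetched, and any budget spent on blocks that are never read is wasted bandwidth.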

Returns:

  • HDF5Store

    A read-only Zarr v3 store backed by async-hdf5. Variables are lazily loaded: chunk indices are only parsed when data is actually read.