Streaming NEXRAD Level 2 Chunks from S3

xradar can now ingest a list of NEXRAD Level 2 chunk byte objects directly, so you can stream real-time radar data from S3 without downloading full volume files first. This notebook demonstrates:

  1. Listing and downloading chunk files from the unidata-nexrad-level2-chunks bucket

  2. Opening a full volume assembled from all chunks

  3. Handling partial volumes with incomplete_sweep="drop" (default)

  4. Handling partial volumes with incomplete_sweep="pad"

  5. Early streaming with just a few chunks

import warnings

import cmweather  # noqa: F401 -- registers colormaps
import fsspec
import matplotlib.pyplot as plt
import numpy as np

import xradar as xd

Background: NEXRAD chunk files on S3

NOAA publishes NEXRAD Level 2 data to two public S3 buckets:

Bucket                        Content                Latency
----------------------------  ---------------------  ------------------
noaa-nexrad-level2            Complete volume files  Minutes after scan
unidata-nexrad-level2-chunks  Real-time chunk files  Seconds after scan

Each radar volume is split into many small chunk files that arrive as the radar scans. A volume directory typically contains:

  • One S (start) chunk that includes the volume header

  • Many I (intermediate) chunks with sweep data

  • One E (end) chunk marking the volume boundary

For example:

KABR/903/20250717-120038-001-S    (start)
KABR/903/20250717-120038-002-I    (intermediate)
KABR/903/20250717-120038-003-I
...
KABR/903/20250717-120038-NNN-E    (end, with NNN the final sequence number)

xradar accepts a list of chunk bytes (or file paths, or file-like objects) directly via open_nexradlevel2_datatree(). The chunks are concatenated internally, so you never need to assemble them manually.
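To illustrate the three input kinds mentioned above, the following sketch normalizes a mixed list of bytes, paths, and file-like objects into raw bytes and joins them in order. `as_chunk_bytes` and `join_chunks` are hypothetical helpers written for this example; xradar handles this itself and may do so differently.

```python
import io
import os
from pathlib import Path
from typing import BinaryIO, Union

ChunkInput = Union[bytes, str, os.PathLike, BinaryIO]


def as_chunk_bytes(chunk: ChunkInput) -> bytes:
    """Return the raw bytes of one chunk, whatever form it was supplied in."""
    if isinstance(chunk, (bytes, bytearray)):
        return bytes(chunk)
    if isinstance(chunk, (str, os.PathLike)):
        return Path(chunk).read_bytes()
    if hasattr(chunk, "read"):
        return chunk.read()
    raise TypeError(f"unsupported chunk input: {type(chunk).__name__}")


def join_chunks(chunks: list[ChunkInput]) -> bytes:
    """Concatenate chunks in the order given (S first, E last)."""
    return b"".join(as_chunk_bytes(c) for c in chunks)


# In-memory stand-ins for real chunk payloads.
print(join_chunks([b"AA", io.BytesIO(b"BB")]))  # b'AABB'
```

The sketch keeps the order it is given, so a sorted list of chunk paths, as used in this notebook, yields the chunks in scan order.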

Load chunk data

chunk_paths = []

try:
    fs = fsspec.filesystem("s3", anon=True)
    volumes = sorted(fs.ls("unidata-nexrad-level2-chunks/KABR/"))
    if volumes:
        chunk_paths = sorted(fs.ls(volumes[-1]))
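except Exception:
    # Any failure (no network, s3fs missing, bucket unreachable) leaves
    # chunk_paths empty, which triggers the fixture fallback below.
    pass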

if chunk_paths:
    print(f"Using live S3 chunks: {len(chunk_paths)} files")
    for p in chunk_paths[:3]:
        print(f"  {p.split('/')[-1]}")
    print(f"  ... {chunk_paths[-1].split('/')[-1]}")
else:
    print("S3 bucket empty or unreachable, using open-radar-data fixture")
Using live S3 chunks: 14 files
  20260421-121521-001-S
  20260421-121521-002-I
  20260421-121521-003-I
  ... 20260421-121521-014-E
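
The file names in the listing above appear to follow a `YYYYMMDD-HHMMSS-NNN-T` pattern, where `NNN` is a sequence number and `T` is the chunk type (S, I, or E). As a minimal sketch, assuming that pattern holds for every key, the hypothetical helpers below parse a name and check whether a listing runs from an S chunk to an E chunk without gaps. Neither helper is part of xradar or fsspec.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Assumed pattern, inferred from names such as "20260421-121521-001-S".
CHUNK_RE = re.compile(r"^(\d{8})-(\d{6})-(\d{3})-([SIE])$")


class ChunkName(NamedTuple):
    timestamp: datetime  # date and time embedded in the file name
    sequence: int        # 1-based position within the volume
    kind: str            # "S" (start), "I" (intermediate) or "E" (end)


def parse_chunk_name(path: str) -> ChunkName:
    """Parse the last path component of a chunk key (hypothetical helper)."""
    name = path.rsplit("/", 1)[-1]
    match = CHUNK_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized chunk name: {name!r}")
    date, time_of_day, seq, kind = match.groups()
    stamp = datetime.strptime(date + time_of_day, "%Y%m%d%H%M%S")
    return ChunkName(stamp, int(seq), kind)


def volume_is_closed(paths: list[str]) -> bool:
    """Return True if the chunks run S, I, ..., E with no missing sequence numbers."""
    chunks = sorted((parse_chunk_name(p) for p in paths), key=lambda c: c.sequence)
    if not chunks:
        return False
    kinds = [c.kind for c in chunks]
    sequences = [c.sequence for c in chunks]
    return (
        kinds[0] == "S"
        and kinds[-1] == "E"
        and sequences == list(range(1, len(chunks) + 1))
    )
```

With the `chunk_paths` list from the cell above, `volume_is_closed(chunk_paths)` would indicate whether the E chunk has arrived and no intermediate chunk is missing from the listing.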

Download / load chunk bytes

if chunk_paths:
    # cat_file returns the object's bytes without leaving a file handle open
    all_bytes = [fs.cat_file(p) for p in chunk_paths]
else:
    import tarfile
    from pathlib import Path

    from open_radar_data import DATASETS

    archive = DATASETS.fetch("nexrad_level2_chunks_KLOT.tar.gz")
    with tarfile.open(archive) as tar:
        tar.extractall("/tmp/nexrad_chunks", filter="data")
    chunk_files = sorted(Path("/tmp/nexrad_chunks/nexrad_chunks_KLOT").iterdir())
    all_bytes = [f.read_bytes() for f in chunk_files]

total_mb = sum(len(b) for b in all_bytes) / 1e6
print(f"Loaded {len(all_bytes)} chunks ({total_mb:.1f} MB total)")
Loaded 14 chunks (2.1 MB total)
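
Reading chunks one at a time is fine for a handful of files, but each S3 request adds latency. As a sketch, assuming the fetch function is safe to call from several threads, the chunks could be fetched concurrently with a thread pool; `Executor.map` returns results in input order, so the chunk sequence is preserved. `fetch_all` and its `fetch` argument are hypothetical names; with fsspec, `fetch` could be something like `fs.cat_file`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fetch_all(
    paths: Iterable[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 8,
) -> list[bytes]:
    """Fetch every path concurrently and return the payloads in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of `paths`, not completion order.
        return list(pool.map(fetch, paths))


# Stand-in for a real S3 read, so the example runs offline.
fake_store = {"001-S": b"start", "002-I": b"middle", "003-E": b"end"}
payloads = fetch_all(sorted(fake_store), fake_store.__getitem__)
print(payloads)  # [b'start', b'middle', b'end']
```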

Full volume from all chunks

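When all chunks (S through E) are available and every sweep in them is complete, passing the list to open_nexradlevel2_datatree produces the same result as opening a complete volume file. In the run shown here, one sweep of the live volume was incomplete, so the default drop mode excluded it and emitted the warning in the output below.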

dtree = xd.io.open_nexradlevel2_datatree(all_bytes)
display(dtree)
/tmp/ipykernel_3954/601822868.py:1: UserWarning: Dropped 1 incomplete sweep(s): [2]. Use incomplete_sweep='pad' to include them with NaN-filled rays.
  dtree = xd.io.open_nexradlevel2_datatree(all_bytes)
<xarray.DataTree>
Group: /
│   Dimensions:              ()
│   Coordinates:
│       latitude             float64 8B 45.46
│       longitude            float64 8B -98.41
│       altitude             int64 8B 421
│   Data variables:
│       volume_number        int64 8B 0
│       platform_type        <U5 20B 'fixed'
│       instrument_type      <U5 20B 'radar'
│       time_coverage_start  <U20 80B '2026-04-21T12:15:21Z'
│       time_coverage_end    <U20 80B '2026-04-21T12:16:58Z'
│   Attributes: (12/25)
│       Conventions:                  None
│       instrument_name:              KABR
│       version:                      None
│       title:                        None
│       institution:                  None
│       references:                   None
│       ...                           ...
│       avset_enabled:                True
│       ebc_enabled:                  True
│       super_res_status:             2
│       rda_build_number:             2400
│       operational_mode:             4
│       actual_elevation_cuts:        3
├── Group: /sweep_0
│       Dimensions:            (azimuth: 720, range: 1832)
│       Coordinates:
│         * azimuth            (azimuth) float64 6kB 0.2444 0.7526 1.258 ... 359.3 359.8
│           elevation          (azimuth) float64 6kB 0.4395 0.4395 ... 0.4395 0.4395
│           time               (azimuth) datetime64[ns] 6kB ...
│           range              (range) float32 7kB 2.125e+03 2.375e+03 ... 4.599e+05
│       Data variables:
│           DBZH               (azimuth, range) float64 11MB ...
│           ZDR                (azimuth, range) float64 11MB ...
│           PHIDP              (azimuth, range) float64 11MB ...
│           RHOHV              (azimuth, range) float64 11MB ...
│           CCORH              (azimuth, range) float64 11MB ...
│           sweep_mode         <U20 80B 'azimuth_surveillance'
│           sweep_number       int64 8B 0
│           prt_mode           <U7 28B 'not_set'
│           follow_mode        <U7 28B 'not_set'
│           sweep_fixed_angle  float64 8B 0.4834
│       Attributes:
│           waveform_type:          contiguous_surveillance
│           channel_config:         sz2_phase_coding
│           super_resolution:       11
│           sails_cut:              False
│           sails_sequence_number:  0
│           mrle_cut:               False
│           mrle_sequence_number:   0
│           mpda_cut:               False
│           base_tilt_cut:          False
└── Group: /sweep_1
        Dimensions:            (azimuth: 718, range: 1192)
        Coordinates:
          * azimuth            (azimuth) float64 6kB 1.2 1.695 2.211 ... 359.2 359.7
            elevation          (azimuth) float64 6kB 0.4395 0.4395 ... 0.4395 0.4395
            time               (azimuth) datetime64[ns] 6kB ...
            range              (range) float32 5kB 2.125e+03 2.375e+03 ... 2.999e+05
        Data variables:
            DBZH               (azimuth, range) float64 7MB ...
            VRADH              (azimuth, range) float64 7MB ...
            WRADH              (azimuth, range) float64 7MB ...
            sweep_mode         <U20 80B 'azimuth_surveillance'
            sweep_number       int64 8B 1
            prt_mode           <U7 28B 'not_set'
            follow_mode        <U7 28B 'not_set'
            sweep_fixed_angle  float64 8B 0.4834
        Attributes:
            waveform_type:          contiguous_doppler
            channel_config:         sz2_phase_coding
            super_resolution:       7
            sails_cut:              False
            sails_sequence_number:  0
            mrle_cut:               False
            mrle_sequence_number:   0
            mpda_cut:               False
            base_tilt_cut:          False
ds = xd.georeference.get_x_y_z(dtree["sweep_0"].to_dataset(inherit="all_coords"))

fig, ax = plt.subplots(figsize=(6, 5))
ds.DBZH.plot(x="x", y="y", cmap="ChaseSpectral", vmin=-10, vmax=70, ax=ax)
ax.set_title(f"Full volume - sweep_0 ({ds.sweep_fixed_angle.values:.1f} deg)")
ax.set_aspect("equal")
fig.tight_layout()
[Figure: DBZH for sweep_0 of the full volume]

Partial volume – drop mode (default)

When only some chunks have arrived, the last sweep is usually incomplete (fewer rays than a full 360-degree rotation). By default, incomplete_sweep="drop" excludes these partial sweeps and emits a warning.

This is the safest option for downstream processing that expects complete sweeps.
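
One way to picture what "incomplete" means is to look at the largest azimuth gap in a sweep. The sketch below is an illustration only: `largest_azimuth_gap` and `looks_complete` are hypothetical helpers, the `max_gap_deg` threshold is an arbitrary assumption, and xradar's own completeness test may work differently.

```python
def largest_azimuth_gap(azimuths_deg: list[float]) -> float:
    """Largest angular gap between neighbouring rays, including the wrap at 360."""
    if not azimuths_deg:
        return 360.0
    az = sorted(a % 360.0 for a in azimuths_deg)
    gaps = [b - a for a, b in zip(az, az[1:])]
    gaps.append(az[0] + 360.0 - az[-1])  # wrap-around gap
    return max(gaps)


def looks_complete(azimuths_deg: list[float], max_gap_deg: float = 2.0) -> bool:
    """Treat a sweep as complete if no gap exceeds the (assumed) threshold."""
    return largest_azimuth_gap(azimuths_deg) <= max_gap_deg


full = [0.25 + 0.5 * i for i in range(720)]  # rays all the way around
partial = full[:70]                           # only the first 35 degrees
print(looks_complete(full), looks_complete(partial))  # True False
```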

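# Take the first 15 chunks. The live volume in this run has only 14 chunks,
# so here the slice covers the whole volume.
partial_chunks = all_bytes[:15]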

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    dtree_drop = xd.io.open_nexradlevel2_datatree(
        partial_chunks, incomplete_sweep="drop"
    )

# Show warnings
for warning in w:
    print(f"WARNING: {warning.message}")

sweep_groups = list(dtree_drop.match("sweep_*").keys())
print(f"\nSweeps kept: {sweep_groups}")
WARNING: Dropped 1 incomplete sweep(s): [2]. Use incomplete_sweep='pad' to include them with NaN-filled rays.

Sweeps kept: ['sweep_0', 'sweep_1']
if len(sweep_groups) >= 2:
    fig, axes = plt.subplots(1, 2, figsize=(12, 5))
    for ax, grp in zip(axes, sweep_groups[:2]):
        ds = xd.georeference.get_x_y_z(dtree_drop[grp].to_dataset(inherit="all_coords"))
        ds.DBZH.plot(x="x", y="y", cmap="ChaseSpectral", vmin=-10, vmax=70, ax=ax)
        ax.set_title(f"{grp} ({ds.sweep_fixed_angle.values:.1f} deg)")
        ax.set_aspect("equal")
    fig.suptitle("Drop mode: only complete sweeps", y=1.02, fontsize=13)
    fig.tight_layout()
elif len(sweep_groups) == 1:
    fig, ax = plt.subplots(figsize=(6, 5))
    ds = xd.georeference.get_x_y_z(
        dtree_drop[sweep_groups[0]].to_dataset(inherit="all_coords")
    )
    ds.DBZH.plot(x="x", y="y", cmap="ChaseSpectral", vmin=-10, vmax=70, ax=ax)
    ax.set_title(f"{sweep_groups[0]} ({ds.sweep_fixed_angle.values:.1f} deg)")
    ax.set_aspect("equal")
    fig.suptitle("Drop mode: only complete sweeps", y=1.02, fontsize=13)
    fig.tight_layout()
else:
    print("No complete sweeps in 15 chunks (all dropped).")
[Figure: Drop mode, DBZH for the complete sweeps sweep_0 and sweep_1]

Partial volume – pad mode

With incomplete_sweep="pad", incomplete sweeps are kept and reindexed to a full azimuth grid. Missing rays are filled with NaN. The angular resolution (0.5 or 1.0 degree) is auto-detected from the data.

This is useful for visualization and monitoring where you want to see all available data as soon as it arrives.
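
Conceptually, padding amounts to placing each received ray on a regular azimuth grid and leaving the remaining slots empty. The sketch below shows that idea in plain Python under simplifying assumptions: the resolution is guessed from the median ray spacing, each ray is assigned to the grid bin containing its azimuth, and a whole ray is represented by a single value. The function names are hypothetical and this is not xradar's implementation.

```python
import math
from statistics import median


def guess_resolution(azimuths_deg: list[float]) -> float:
    """Pick 0.5 or 1.0 degrees, whichever is closer to the median ray spacing.

    Needs at least two rays to compute a spacing.
    """
    az = sorted(azimuths_deg)
    spacing = median(b - a for a, b in zip(az, az[1:]))
    return min((0.5, 1.0), key=lambda res: abs(res - spacing))


def pad_to_full_grid(azimuths_deg: list[float], values: list[float]) -> list[float]:
    """Reindex rays onto a full 360-degree grid, filling missing rays with NaN."""
    res = guess_resolution(azimuths_deg)
    n_bins = round(360.0 / res)
    padded = [math.nan] * n_bins
    for az, value in zip(azimuths_deg, values):
        padded[int((az % 360.0) // res) % n_bins] = value
    return padded


# 70 rays at 0.5-degree spacing: about 35 degrees of a sweep has arrived.
azimuths = [0.25 + 0.5 * i for i in range(70)]
padded = pad_to_full_grid(azimuths, [1.0] * len(azimuths))
nan_pct = 100 * sum(math.isnan(v) for v in padded) / len(padded)
print(len(padded), f"{nan_pct:.1f}% NaN")  # 720 90.3% NaN
```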

dtree_pad = xd.io.open_nexradlevel2_datatree(partial_chunks, incomplete_sweep="pad")

sweep_groups_pad = list(dtree_pad.match("sweep_*").keys())
print(f"Sweeps available (pad mode): {sweep_groups_pad}")

# Show NaN percentage in each sweep
for grp in sweep_groups_pad:
    ds = dtree_pad[grp].to_dataset()
    if "DBZH" in ds:
        nan_pct = np.isnan(ds.DBZH.values).mean() * 100
        print(f"  {grp}: azimuth size={ds.sizes['azimuth']}, DBZH NaN={nan_pct:.1f}%")
Sweeps available (pad mode): ['sweep_0', 'sweep_1', 'sweep_2']
  sweep_0: azimuth size=720, DBZH NaN=0.0%
  sweep_1: azimuth size=718, DBZH NaN=0.0%
  sweep_2: azimuth size=720, DBZH NaN=90.4%
n_sweeps = len(sweep_groups_pad)
fig, axes = plt.subplots(1, n_sweeps, figsize=(6 * n_sweeps, 5))
if n_sweeps == 1:
    axes = [axes]

for ax, grp in zip(axes, sweep_groups_pad):
    ds = xd.georeference.get_x_y_z(dtree_pad[grp].to_dataset(inherit="all_coords"))
    ds.DBZH.plot(x="x", y="y", cmap="ChaseSpectral", vmin=-10, vmax=70, ax=ax)
    ax.set_title(f"{grp} ({ds.sweep_fixed_angle.values:.1f} deg)")
    ax.set_aspect("equal")

fig.suptitle("Pad mode: incomplete sweeps filled with NaN", y=1.02, fontsize=13)
fig.tight_layout()
[Figure: Pad mode, DBZH for sweep_0, sweep_1, and sweep_2, with missing rays in sweep_2 shown as NaN]

Early streaming – few chunks

Even with only 5 chunks (before the first sweep completes), pad mode shows the partial data that has arrived. The NaN wedge makes it clear which azimuths are still missing.

early_chunks = all_bytes[:5]

dtree_early = xd.io.open_nexradlevel2_datatree(early_chunks, incomplete_sweep="pad")

sweep_groups_early = list(dtree_early.match("sweep_*").keys())
print(f"Sweeps from 5 chunks: {sweep_groups_early}")

if sweep_groups_early:
    ds = xd.georeference.get_x_y_z(
        dtree_early[sweep_groups_early[0]].to_dataset(inherit="all_coords")
    )

    nan_pct = np.isnan(ds.DBZH.values).mean() * 100
    print(f"DBZH NaN percentage: {nan_pct:.1f}%")

    fig, ax = plt.subplots(figsize=(6, 5))
    ds.DBZH.plot(x="x", y="y", cmap="ChaseSpectral", vmin=-10, vmax=70, ax=ax)
    ax.set_title(
        f"Early stream: {sweep_groups_early[0]} "
        f"({ds.sweep_fixed_angle.values:.1f} deg) -- {nan_pct:.0f}% NaN"
    )
    ax.set_aspect("equal")
    fig.tight_layout()
else:
    print("No sweeps found in 5 chunks.")
/home/docs/checkouts/readthedocs.org/user_builds/xradar/conda/stable/lib/python3.13/site-packages/xradar/util.py:466: UserWarning: Rays might miss on beginning and/or end of sweep. `a1gate` information is needed to fully recover. We'll assume sweep start at first valid ray.
  warnings.warn(
Sweeps from 5 chunks: ['sweep_0']
DBZH NaN percentage: 33.3%
[Figure: Early stream, DBZH for sweep_0 from 5 chunks, with a NaN wedge where rays are missing]
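
In a live setting, the same call can be repeated as new chunks land in the volume directory. The loop below is a sketch of that polling pattern, with the S3 listing and the radar reader replaced by injected callables so it runs offline. `stream_volume`, `list_keys`, and `on_update` are hypothetical names, and the assumption that the end chunk's name ends in `-E` is taken from the listing shown earlier.

```python
import time
from typing import Callable


def stream_volume(
    list_keys: Callable[[], list[str]],
    on_update: Callable[[list[str]], None],
    poll_seconds: float = 5.0,
    max_polls: int = 100,
) -> list[str]:
    """Poll a volume directory and call on_update whenever new chunks appear.

    Stops once a key ending in "-E" (assumed end-of-volume marker) is seen,
    or after max_polls listings.
    """
    seen: list[str] = []
    for _ in range(max_polls):
        keys = sorted(list_keys())
        if len(keys) > len(seen):
            seen = keys
            on_update(seen)
        if seen and seen[-1].endswith("-E"):
            break
        time.sleep(poll_seconds)
    return seen


# Offline stand-in: each call to the fake listing reveals more chunks.
snapshots = iter([["001-S"], ["001-S", "002-I"], ["001-S", "002-I", "003-E"]])
updates: list[int] = []
final = stream_volume(
    lambda: next(snapshots),
    lambda keys: updates.append(len(keys)),
    poll_seconds=0,
)
print(updates, final[-1])  # [1, 2, 3] 003-E
```

With real data, `list_keys` could wrap `fs.ls(...)` on the volume directory, and `on_update` could read the chunks and call `xd.io.open_nexradlevel2_datatree(..., incomplete_sweep="pad")`. fsspec caches directory listings, so a real loop would likely need to invalidate that cache between polls, for example with `fs.invalidate_cache()`.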

Summary

Scenario                   incomplete_sweep  Behavior
-------------------------  ----------------  ----------------------------------------------------
Full volume (all chunks)   "drop" or "pad"   All sweeps present, no difference
Partial volume             "drop" (default)  Incomplete sweeps excluded, warning emitted
Partial volume             "pad"             Incomplete sweeps kept, missing rays filled with NaN
Early stream (few chunks)  "pad"             Single partial sweep visible with NaN wedge

Note: Single-file, bytes, and file-like inputs continue to work exactly as before. The list input and incomplete_sweep parameter are additive features.