Appendix I: Dataset Conformance Testing (Informative)
This appendix documents systematic conformance testing performed against five public reference datasets. The results validated the completeness and expressiveness of the SpatialDDS 1.5 sensing, mapping, coordination, and spatial events profiles and directly informed several normative additions to this specification.
Motivation
Sensor-data specifications risk becoming disconnected from real-world workloads if they are designed in isolation. To guard against this, the SpatialDDS 1.5 profiles were validated against five complementary datasets that together exercise the full signal-to-semantics pipeline and multi-agent coordination:
| Dataset | Focus | Modalities Stressed |
|---|---|---|
| nuScenes (Motional / nuTonomy) | Perception → semantics | Camera (6×), lidar, radar detections (5×), 3D annotations, coordinate conventions |
| DeepSense 6G (ASU Wireless Intelligence Lab) | Signal → perception | Raw radar I/Q tensors, 360° cameras, lidar, IMU, GPS-RTK, mmWave beam vectors |
| S3E (Sun Yat-sen University / HKUST) | Multi-agent coordination | 3 UGVs × (lidar, stereo, IMU), UWB inter-robot ranging, RTK-GNSS, collaborative SLAM |
| ScanNet (TU Munich / Princeton) | Indoor scene understanding | RGB-D depth frames, 3D surface mesh, instance segmentation (NYU40), room-level zones, 20 scene types |
| LaMAR (CVG ETH Zurich / Microsoft) | AR localization with radio assistance | WiFi scans (wifi.txt), Bluetooth scans (bt.txt), RGB images, AR trajectories, visual localization metadata |
Each dataset was selected for the distinct pressure it puts on the profiles:
- nuScenes stresses sensor diversity, per-detection radar fields rarely found in other corpora (compensated velocity, dynamic property, RCS), and rich annotation metadata (visibility, attributes, evidence counts).
- DeepSense 6G stresses signal-level data (raw FMCW radar cubes, phased-array beam power vectors) and ISAC modalities absent from traditional perception datasets.
- S3E is the first collaborative SLAM dataset with UWB inter-robot ranging and exercises the multi-agent capabilities (map lifecycle, inter-map alignment, range-only constraints, and fleet discovery) that differentiate SpatialDDS from single-vehicle frameworks such as ROS 2.
- ScanNet is the definitive indoor RGB-D scene understanding benchmark, uniquely exercises depth sensing (DEPTH16) and the Spatial Events extension (room zones, object-in-room events, per-class occupancy counts), and validates the semantics profile's instance segmentation types against a rich 40-class indoor vocabulary.
- LaMAR provides paired visual and radio observations (WiFi/Bluetooth) for AR localization and directly tests whether radio fingerprints can be represented as first-class typed streams rather than ad hoc JSON metadata.
The goal was not to certify particular datasets but to answer two concrete questions: Can every field, enum, and convention in each dataset's schema be losslessly mapped to SpatialDDS 1.5 IDL without workarounds or out-of-band agreements? And for multi-agent scenarios: Can the full coordination lifecycle — from independent mapping through inter-map alignment — be expressed using the standard types?
Methodology
For each dataset, a conformance harness was constructed as a self-contained Python 3 script that:
- Mirrors the SpatialDDS 1.5 IDL as Python data structures (enum values, struct field lists, normative prose flags).
- Mirrors the dataset schema as synthetic data (sensor names, field lists, data shapes).
- Runs targeted checks, each producing a verdict:
| Verdict | Meaning |
|---|---|
| PASS | Dataset field maps losslessly to an existing SpatialDDS type or enum value. |
| GAP | A mapping exists conceptually but the required SpatialDDS type or field does not yet exist. |
| MISSING | No SpatialDDS construct exists for the dataset field; a new profile is needed. |
- Reports a per-modality scorecard.
Neither the nuScenes harness nor the DeepSense 6G harness requires network access, a DDS runtime, or a dataset download. Both operate as static schema-vs-schema dry runs, reproducible in any CI environment. The S3E (§I.3) and ScanNet (§I.4) conformance analyses were performed manually following the same check structure; scripted harnesses are planned for a future revision.
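The check-and-verdict structure described above can be sketched as follows. This is an illustrative stand-in, not the actual harness code: the enum contents are abbreviated and the check function name is hypothetical.

```python
from enum import Enum

class Verdict(Enum):
    PASS = "PASS"        # dataset field maps losslessly to an existing type
    GAP = "GAP"          # conceptual mapping exists, but the type/field does not
    MISSING = "MISSING"  # no SpatialDDS construct exists; a new profile is needed

# Mirror of a fragment of the SpatialDDS IDL as plain Python data
# (abbreviated here; the real RadDynProp has 7 values).
IDL_ENUMS = {
    "RadDynProp": {"UNKNOWN", "MOVING", "STATIONARY", "STOPPED"},
}

def check_enum_coverage(idl_enum: str, dataset_values: set) -> Verdict:
    """One targeted check: does the IDL enum cover every dataset value?"""
    idl_values = IDL_ENUMS.get(idl_enum)
    if idl_values is None:
        return Verdict.MISSING
    return Verdict.PASS if dataset_values <= idl_values else Verdict.GAP

# Per-check results feed the per-modality scorecard.
results = {"R-04": check_enum_coverage("RadDynProp", {"MOVING", "STOPPED"})}
```

The real harnesses apply the same pattern to struct field lists and normative prose flags, not only enums.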
I.1 nuScenes Conformance
Reference Dataset
nuScenes (Motional / nuTonomy) is a multimodal autonomous driving dataset containing:
| Dimension | Value |
|---|---|
| Scenes | 1,000 (20 s each) |
| Cameras | 6 surround-view (FRONT, FRONT_LEFT, FRONT_RIGHT, BACK, BACK_LEFT, BACK_RIGHT) |
| Lidar | 1 × 32-beam spinning (Velodyne HDL-32E), ~34 k points/scan |
| Radar | 5 × Continental ARS 408 (FRONT, FRONT_LEFT, FRONT_RIGHT, BACK_LEFT, BACK_RIGHT) |
| 3D annotations | 1.4 M oriented bounding boxes, 23 object classes |
| Annotation metadata | visibility tokens, attribute tokens, per-box lidar/radar point counts |
| Coordinate convention | Right-handed; quaternions in (w, x, y, z) order |
Checks Performed (27)
Radar — Detection Path (6 checks)
| ID | Check | Description |
|---|---|---|
| R-01 | Detection-centric profile | RadDetection struct exists with per-detection xyz, velocity, RCS, dyn_prop. |
| R-02 | Per-detection velocity | Cartesian velocity_xyz (preferred) + scalar v_r_mps (fallback), both with has_* guards. |
| R-03 | Ego-compensated velocity | velocity_comp_xyz field for ego-motion-compensated velocity. |
| R-04 | Dynamic property enum | RadDynProp covers all 7 nuScenes values (UNKNOWN through STOPPED). |
| R-05 | Per-detection RCS | rcs_dbm2 field in dBm² with has_rcs_dbm2 guard. |
| R-06 | Sensor type enum | RadSensorType differentiates SHORT_RANGE, LONG_RANGE, IMAGING_4D, etc. |
Vision (5 checks)
| ID | Check | Description |
|---|---|---|
| V-01 | RigRole coverage | RigRole enum includes FRONT, FRONT_LEFT, FRONT_RIGHT, BACK, BACK_LEFT, BACK_RIGHT. |
| V-02 | Pre-rectified images | Normative prose documents dist = NONE with model = PINHOLE semantics. |
| V-03 | Image dimensions | CamIntrinsics.width / height are REQUIRED; zero values are malformed. |
| V-04 | Keyframe flag | VisionFrame.is_key_frame boolean. |
| V-05 | Quaternion reorder | §2 table maps nuScenes (w,x,y,z) to SpatialDDS (x,y,z,w). |
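The V-05 conversion is a pure component reorder; a minimal sketch (function name illustrative):

```python
def wxyz_to_xyzw(q):
    """Convert a nuScenes (w, x, y, z) quaternion to SpatialDDS (x, y, z, w)."""
    w, x, y, z = q
    return (x, y, z, w)

# Identity rotation: nuScenes (1, 0, 0, 0) becomes SpatialDDS (0, 0, 0, 1).
identity = wxyz_to_xyzw((1.0, 0.0, 0.0, 0.0))
```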
Lidar (6 checks)
| ID | Check | Description |
|---|---|---|
| L-01 | BIN_INTERLEAVED encoding | CloudEncoding value for raw interleaved binary with normative record layout table. |
| L-02 | Per-point timestamps | PointLayout.XYZ_I_R_T and XYZ_I_R_T_N with normative prose for the t field. |
| L-03 | Metadata guards | LidarMeta uses has_range_limits, has_horiz_fov, has_vert_fov guards. |
| L-04 | Timestamp presence flag | LidarFrame.has_per_point_timestamps signals per-point timing in the blob. |
| L-05 | t_end computation | Normative guidance for computing t_end from t_start + 1/rate_hz or max(point.t). |
| L-06 | Ring field | PointLayout.XYZ_I_R carries ring as uint16. |
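The L-05 guidance admits two computations for t_end, depending on whether per-point timestamps are present in the blob. A sketch of the preference order (function name hypothetical; per-point times assumed to be offsets in seconds relative to t_start):

```python
def compute_t_end(t_start_s, rate_hz=None, per_point_t=None):
    """Compute scan end time per L-05.

    Prefer the maximum per-point time offset when the blob carries
    per-point timestamps (has_per_point_timestamps); otherwise fall
    back to t_start + 1/rate_hz.
    """
    if per_point_t:
        return t_start_s + max(per_point_t)
    if rate_hz:
        return t_start_s + 1.0 / rate_hz
    return t_start_s
```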
Semantics (5 checks)
| ID | Check | Description |
|---|---|---|
| S-01 | Size convention | Normative: size[0] = width (X), size[1] = height (Z), size[2] = depth (Y). nuScenes (w,l,h) -> (w,h,l) mapping documented. |
| S-02 | Attributes | Detection3D.attributes as sequence<MetaKV, 8> with has_attributes guard. |
| S-03 | Visibility | Detection3D.visibility float [0..1] with has_visibility guard. |
| S-04 | Evidence counts | num_lidar_pts + num_radar_pts with has_num_pts guard. |
| S-05 | Quaternion reorder | §2 table covers annotation quaternion conversion. |
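The S-01 axis convention swaps the last two nuScenes components. A minimal sketch of the documented (w, l, h) to (w, h, l) mapping (function name illustrative):

```python
def nuscenes_size_to_spatialdds(w, l, h):
    """Map a nuScenes box size (width, length, height) to the SpatialDDS
    convention: size[0] = width (X), size[1] = height (Z), size[2] = depth (Y)."""
    return [w, h, l]

# A typical sedan annotation: 1.8 m wide, 4.5 m long, 1.5 m tall.
size = nuscenes_size_to_spatialdds(1.8, 4.5, 1.5)
```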
Common / Core (5 checks)
| ID | Check | Description |
|---|---|---|
| C-01 | Quaternion table | §2 convention table covering GeoPose, ROS 2, nuScenes, Eigen, Unity, Unreal, OpenXR, glTF. |
| C-02 | FQN guidance | FrameRef { uuid, fqn } semantics documented; UUID is authoritative. |
| C-03 | Local-frame coverage | §3.3.4 covers local-only deployments. |
| C-04 | has_* pattern consistency | All new optional fields use the has_* guard pattern uniformly. |
| C-05 | Sequence bounds | Standard bounds table: SZ_SMALL (256), SZ_MEDIUM (2048), SZ_LARGE (8192), SZ_XL (32768). |
Results
All 27 nuScenes checks pass.
| Modality | Checks | Pass | Remaining Gaps |
|---|---|---|---|
| Radar (detections) | 6 | 6 | 0 |
| Vision | 5 | 5 | 0 |
| Lidar | 6 | 6 | 0 |
| Semantics | 5 | 5 | 0 |
| Common / Core | 5 | 5 | 0 |
| Total | 27 | 27 | 0 |
I.2 DeepSense 6G Conformance
Reference Dataset
DeepSense 6G (Arizona State University, Wireless Intelligence Lab) is a large-scale multi-modal sensing and communication dataset containing:
| Dimension | Value |
|---|---|
| Scenarios | 40+ across 12+ locations |
| Snapshots | 1.08 M+ synchronized samples |
| FMCW Radar | 76–81 GHz, 3 Tx × 4 Rx, complex I/Q tensor [4×256×128], 10 Hz |
| 3D Lidar | Ouster OS1-32, 32×1024, 120 m range, 865 nm, 10–20 Hz |
| Camera | ZED2 stereo (960×540) + Insta360 ONE X2 360° (5.7K) |
| GPS-RTK | 10 Hz, ≤1 cm accuracy (RTK fix), DOP + satellite metadata |
| IMU | 6-axis, 100 Hz |
| mmWave Comm | 60 GHz phased array, 64-beam codebook, 90° FoV, 10 Hz |
| Deployment types | V2I, V2V (4× arrays/vehicle), ISAC indoor, drone |
The dataset was chosen because it stresses signal-level data (raw FMCW radar cubes consumed directly by ML pipelines), 360° camera rigs, and ISAC modalities (beam power vectors, blockage state) absent from perception-focused datasets.
Checks Performed (44)
Radar — Tensor Path (8 checks)
| ID | Check | Description |
|---|---|---|
| DT-01 | Tensor meta struct | RadTensorMeta exists with axes, voxel_type, layout, physical_meaning. |
| DT-02 | Complex sample type | SampleType.CF32 covers complex I/Q data. |
| DT-03 | Channel axis | RadTensorLayout.CH_FAST_SLOW maps raw FMCW [Rx, samples, chirps]. |
| DT-04 | MIMO antenna config | num_tx, num_rx, num_virtual_channels with has_antenna_config guard. |
| DT-05 | Waveform params | bandwidth_hz, center_freq_hz, samples_per_chirp, chirps_per_frame with guard. |
| DT-06 | Frame blob transport | RadTensorFrame.hdr.blobs[] carries the raw cube; size computable from axes × sample size. |
| DT-07 | Sensor type | RadSensorType covers FMCW radar as MEDIUM_RANGE or IMAGING_4D. |
| DT-08 | StreamMeta extrinsics | T_bus_sensor (PoseSE3) + nominal_rate_hz for hand-eye calibration and 10 Hz cadence. |
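The DT-06 size computation is a simple product over the tensor axes times the sample size. A sketch, assuming complex I/Q samples are carried as CF32 (two float32 components, 8 bytes):

```python
SAMPLE_SIZE_BYTES = {"CF32": 8, "F32": 4, "U16": 2}  # CF32 = complex64

def expected_blob_bytes(axes, voxel_type):
    """Expected RadTensorFrame blob size: product of axes × sample size (DT-06)."""
    n = 1
    for extent in axes:
        n *= extent
    return n * SAMPLE_SIZE_BYTES[voxel_type]

# DeepSense FMCW cube [4 Rx × 256 samples × 128 chirps] of complex I/Q:
cube_bytes = expected_blob_bytes([4, 256, 128], "CF32")
```

A consumer can use this to validate that a received blob's length matches the advertised RadTensorMeta before deserializing.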
Vision (7 checks)
| ID | Check | Description |
|---|---|---|
| DV-01 | Standard camera | PixFormat.RGB8 + CamIntrinsics.width/height cover ZED2 at 960×540. |
| DV-02 | Camera extrinsics | VisionMeta.base → StreamMeta.T_bus_sensor for hand-eye calibration. |
| DV-03 | Camera model | CamModel.PINHOLE for ZED2 pre-rectified output. |
| DV-04 | Frame rate | StreamMeta.nominal_rate_hz = 10 (downsampled from 30 Hz). |
| DV-05 | 360° rig roles | RigRole.PANORAMIC and EQUIRECTANGULAR for Insta360 ONE X2 in V2V scenarios. |
| DV-06 | Keyframe flag | VisionFrame.is_key_frame boolean. |
| DV-07 | Compression codec | Codec enum covers JPEG/H264/H265/AV1. |
Lidar (7 checks)
| ID | Check | Description |
|---|---|---|
| DL-01 | Lidar type | LidarType.MULTI_BEAM_3D for Ouster OS1-32 (spinning, 32 rings). |
| DL-02 | Ring count + FOV | LidarMeta.n_rings, has_horiz_fov, has_vert_fov with guards. |
| DL-03 | Range limits | has_range_limits + max_range_m = 120 m. |
| DL-04 | Point layout | PointLayout.XYZ_I_R for x, y, z, intensity, ring. |
| DL-05 | Cloud encoding | CloudEncoding.BIN_INTERLEAVED for raw binary transport. |
| DL-06 | Sensor wavelength | LidarMeta.wavelength_nm with has_wavelength guard (865 nm). |
| DL-07 | Frame rate | StreamMeta.nominal_rate_hz covers 10–20 Hz. |
IMU (4 checks)
| ID | Check | Description |
|---|---|---|
| DI-01 | 6-axis sample | ImuSample with accel (Vec3, m/s²) + gyro (Vec3, rad/s). |
| DI-02 | Noise densities | ImuInfo.accel_noise_density + gyro_noise_density + random walk params. |
| DI-03 | Frame reference | ImuInfo.frame_ref for sensor-to-bus mounting. |
| DI-04 | Timestamp + sequence | ImuSample.stamp + .seq for 100 Hz temporal ordering. |
GPS (6 checks)
| ID | Check | Description |
|---|---|---|
| DG-01 | Position | GeoPose.lat_deg/lon_deg/alt_m for GPS-RTK coordinates. |
| DG-02 | Orientation | GeoPose.q (QuaternionXYZW) for heading-derived orientation. |
| DG-03 | Timestamp | GeoPose.stamp for 10 Hz GPS samples. |
| DG-04 | Covariance | GeoPose.cov for positional uncertainty (RTK ≤1 cm). |
| DG-05 | GNSS quality | NavSatStatus provides DOP, fix type, and satellite count with has_dop guard. |
| DG-06 | Speed over ground | NavSatStatus.speed_mps + course_deg with has_velocity guard. |
mmWave Beam (8 checks)
| ID | Check | Description |
|---|---|---|
| DB-01 | Beam power vector | RfBeamFrame.power (bounded sequence of float) carries the per-beam received power vector; defined by the provisional rf_beam profile (K-B1). |
| DB-02 | Codebook metadata | RfBeamMeta.n_beams (64), n_elements (16), center_freq_ghz (60.0), fov_az_deg (90), codebook_type. |
| DB-03 | Optimal beam index | RfBeamFrame.best_beam_idx (uint16) with has_best_beam guard. Ground-truth label: beam maximizing SNR. |
| DB-04 | Blockage status | RfBeamFrame.is_blocked (boolean) + blockage_confidence (float 0..1) with has_blockage_state guard. |
| DB-05 | Multi-array set | RfBeamArraySet.arrays (bounded sequence of RfBeamFrame) plus overall_best_array_idx + overall_best_beam_idx for cross-array best beam. Covers V2V 4-array rig. |
| DB-06 | Sparse sweep indices | RfBeamFrame.beam_indices maps power[i] to codebook position for PARTIAL/TRACKING sweeps. BeamSweepType enum: EXHAUSTIVE, HIERARCHICAL, TRACKING, PARTIAL. |
| DB-07 | Power unit convention | RfBeamMeta.power_unit (PowerUnit enum: DBM, LINEAR_MW, RSRP) declares units for RfBeamFrame.power. |
| DB-08 | Stream linkage | RfBeamFrame.stream_id matches RfBeamMeta.stream_id for meta/frame correlation. |
Note: All mmWave Beam checks validated against the provisional sensing.rf_beam profile (Appendix E). Types are subject to breaking changes.
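The DB-03/DB-06 semantics can be illustrated with a small sketch (function name hypothetical): for EXHAUSTIVE sweeps the position in the power vector is the codebook index, while for PARTIAL/TRACKING sweeps beam_indices maps each position back to the codebook.

```python
def best_beam(power, beam_indices=None):
    """Return the codebook index of the strongest beam (DB-03).

    power:        per-beam received power (units declared by RfBeamMeta.power_unit).
    beam_indices: for PARTIAL/TRACKING sweeps (DB-06), beam_indices[i] is the
                  codebook position of power[i]; None means an exhaustive sweep.
    """
    i = max(range(len(power)), key=lambda k: power[k])
    return beam_indices[i] if beam_indices is not None else i

# Exhaustive 3-beam sweep: beam 1 is strongest.
exhaustive = best_beam([-80.0, -60.0, -75.0])
# Partial sweep over codebook beams 10 and 42: beam 42 is strongest.
partial = best_beam([-80.0, -60.0], beam_indices=[10, 42])
```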
Semantics (4 checks)
| ID | Check | Description |
|---|---|---|
| DS-01 | 2D bounding boxes | Detection2D.bbox + class_id covers 8 DeepSense object classes. |
| DS-02 | Sequence index | FrameHeader.frame_seq for sample ordering. |
| DS-03 | Class ID | Detection2D.class_id (string) maps all DeepSense class labels. |
| DS-04 | Beam/blockage labels | RfBeamFrame.best_beam_idx and .is_blocked/.blockage_confidence carry ISAC-specific ground-truth labels. Covered by provisional rf_beam profile. |
Results
All 44 DeepSense 6G checks pass. GNSS diagnostics are covered by NavSatStatus, and mmWave Beam checks pass against the provisional rf_beam profile (Appendix E).
| Modality | Checks | Pass | Gap | Missing | Notes |
|---|---|---|---|---|---|
| Radar (tensor) | 8 | 8 | 0 | 0 | — |
| Vision | 7 | 7 | 0 | 0 | Includes 360° rig roles |
| Lidar | 7 | 7 | 0 | 0 | Includes sensor wavelength |
| IMU | 4 | 4 | 0 | 0 | — |
| GPS | 6 | 6 | 0 | 0 | NavSatStatus covers GNSS diagnostics |
| mmWave Beam | 8 | 8 | 0 | 0 | Provisional rf_beam profile (K-B1) |
| Semantics | 4 | 4 | 0 | 0 | Beam labels via rf_beam |
| Total | 44 | 44 | 0 | 0 | 100% coverage |
Deferred Items
DeepSense 6G conformance has no remaining schema gaps. Future ISAC extensions (e.g., CSI/CIR profiles) remain under discussion; see Appendix K for the maturity promotion criteria.
I.3 S3E Conformance (Multi-Robot Collaborative SLAM)
Reference Dataset
S3E (Sun Yat-sen University / HKUST) is a multi-robot multimodal dataset for collaborative SLAM containing:
| Dimension | Value |
|---|---|
| Robots | 3 UGVs (Alpha, Blob, Carol) operating simultaneously |
| LiDAR | 1 × 16-beam 3D scanner (Velodyne VLP-16) per robot, 10 Hz |
| Stereo cameras | 2 × high-resolution color cameras per robot |
| IMU | 9-axis, 100–200 Hz per robot |
| UWB | Inter-robot Ultra-Wideband ranging (pairwise distances at ~10 Hz) |
| GNSS | Dual-antenna RTK receiver per robot (ground truth) |
| Environments | 13 outdoor + 5 indoor sequences |
| Trajectory paradigms | 4 collaborative patterns (concentric circles, intersecting circles, intersection curve, rays) |
| Format | ROS 2 bag files; ground truth as TUM-format pose files |
The dataset was chosen because it is the first C-SLAM dataset to include UWB inter-robot ranging, exercises multi-agent map building with inter-robot loop closures, and represents a scenario class (heterogeneous multi-robot coordination) where SpatialDDS's Mapping extension, Discovery profile, and multi-source pose graph types provide capabilities absent from ROS 2's nav_msgs and sensor_msgs.
Checks Performed (38)
Per-Robot Sensing — LiDAR (5 checks)
| ID | Check | Description |
|---|---|---|
| SL-01 | LiDAR meta | LidarMeta with sensor_type, rate_hz, point_layout covers Velodyne VLP-16. |
| SL-02 | Point layout | PointLayout.XYZ_I_R_T carries x, y, z, intensity, ring, time — matches Velodyne binary format. |
| SL-03 | Per-robot topic isolation | Topic template spatialdds/<scene>/lidar/<sensor_id>/frame/v1 with per-robot sensor_id (e.g., alpha/vlp16). |
| SL-04 | CloudEncoding | BIN_INTERLEAVED covers raw binary point cloud blobs. |
| SL-05 | RigRole | RigRole.TOP covers single roof-mounted LiDAR. |
Per-Robot Sensing — Vision (4 checks)
| ID | Check | Description |
|---|---|---|
| SV-01 | Stereo pair | Two VisionFrame streams per robot with RigRole.LEFT / RigRole.RIGHT. |
| SV-02 | Camera intrinsics | CameraMeta with fx, fy, cx, cy, dist_model, dist_coeffs covers calibrated stereo cameras. |
| SV-03 | Per-robot namespacing | Topic spatialdds/<scene>/vision/<sensor_id>/frame/v1 isolates per-robot camera streams. |
| SV-04 | Timestamp sync | VisionFrame.stamp synchronized to common timebase via hardware PPS trigger. |
Per-Robot Sensing — IMU (3 checks)
| ID | Check | Description |
|---|---|---|
| SI-01 | 9-axis sample | ImuSample with accel (Vec3, m/s²) + gyro (Vec3, rad/s) covers 6-axis; MagSample covers magnetometer. |
| SI-02 | High-rate ordering | ImuSample.seq monotonic counter handles 100–200 Hz temporal ordering. |
| SI-03 | Extrinsic calibration | Sensor-to-body transform publishable as FrameTransform (LiDAR-IMU, camera-IMU extrinsics). |
Per-Robot Sensing — GNSS/RTK (3 checks)
| ID | Check | Description |
|---|---|---|
| SG-01 | RTK fix type | GnssFixType.RTK_FIXED covers dual-antenna RTK ground truth receiver. |
| SG-02 | GeoPose output | GeoPose with lat_deg, lon_deg, alt_m, quaternion covers RTK-derived global pose. |
| SG-03 | NavSatStatus | NavSatStatus with fix_type, num_satellites, hdop, vdop covers receiver diagnostics. |
Inter-Robot Ranging — UWB (4 checks)
| ID | Check | Description |
|---|---|---|
| SU-01 | Range edge type | mapping::EdgeType.RANGE explicitly models UWB range-only constraint (scalar distance, no orientation). |
| SU-02 | Range fields | mapping::Edge.range_m + range_std_m carry measured distance and uncertainty. |
| SU-03 | Cross-map provenance | has_from_map_id / has_to_map_id populated on RANGE edges because UWB connects nodes in different robots' maps. |
| SU-04 | Range-assisted alignment | AlignmentMethod.RANGE_COARSE covers initial inter-map alignment derived solely from UWB distances. |
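A UWB range constraint per SU-01 through SU-03 can be sketched as a Python stand-in for the relevant mapping::Edge fields (the dataclass below is illustrative, not the IDL; node and map identifiers are hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RangeEdge:
    """Sketch of a mapping::Edge carrying a UWB range-only constraint.

    A RANGE edge constrains only the scalar distance between two nodes;
    it carries no relative orientation. Cross-map provenance fields are
    populated when the linked nodes live in different robots' maps (SU-03).
    """
    from_node: str
    to_node: str
    range_m: float                      # measured inter-robot distance (SU-02)
    range_std_m: float                  # 1-sigma ranging uncertainty (SU-02)
    from_map_id: Optional[str] = None   # set when has_from_map_id is true
    to_map_id: Optional[str] = None     # set when has_to_map_id is true

# Alpha measures 12.8 m ± 0.15 m to a keyframe in another robot's map:
edge = RangeEdge("alpha/kf_0042", "blob/kf_0037", 12.8, 0.15,
                 from_map_id="alpha-map", to_map_id="blob-map")
```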
Core Pose Graph (5 checks)
| ID | Check | Description |
|---|---|---|
| SC-01 | Per-robot nodes | core::Node with map_id per robot (e.g., alpha-map, blob-map, carol-map), @key node_id unique per keyframe. |
| SC-02 | Odometry edges | core::Edge with type = ODOM connects sequential keyframes within each robot's map. |
| SC-03 | Intra-robot loop closures | core::Edge with type = LOOP for within-map loop closures (e.g., concentric circle paradigm). |
| SC-04 | Versioning | Node.seq monotonic per source; Node.graph_epoch increments after global re-optimization. |
| SC-05 | Multi-source coexistence | Three simultaneous source_id values on core::Node and core::Edge topics — one per robot. |
Mapping Extension — Multi-Agent (8 checks)
| ID | Check | Description |
|---|---|---|
| SM-01 | Map lifecycle | MapMeta per robot with state progressing: BUILDING → OPTIMIZING → STABLE. |
| SM-02 | Map kind | MapMeta.kind = POSE_GRAPH for each robot's SLAM output. |
| SM-03 | Inter-robot loop closures | mapping::Edge with type = INTER_MAP and has_from_map_id / has_to_map_id populated. |
| SM-04 | MapAlignment | MapAlignment with T_from_to expressing the inter-map transform after cross-robot alignment. |
| SM-05 | Alignment revision | MapAlignment.revision increments as more inter-robot edges accumulate and the alignment refines. |
| SM-06 | Evidence trail | MapAlignment.evidence_edge_ids[] references the specific cross-map edges supporting the alignment. |
| SM-07 | MapEvent notifications | MapEvent with MAP_ALIGNED event when two robots' maps are first linked. |
| SM-08 | Concurrent map builds | Three MapMeta samples simultaneously active (keyed by map_id), demonstrating multi-map lifecycle. |
Discovery & Coordination (3 checks)
| ID | Check | Description |
|---|---|---|
| SD-01 | Service announcement | Each robot publishes Announce with ServiceKind.SLAM and sensor capabilities in topics[]. |
| SD-02 | Spatial coverage | Announce.coverage (Aabb3 or geo-bounds) advertises each robot's operational area. |
| SD-03 | Multi-frame NodeGeo | After inter-map alignment, NodeGeo.poses[] carries a node's pose in multiple robots' map frames simultaneously (FramedPose array). |
Cross-cutting (3 checks)
| ID | Check | Description |
|---|---|---|
| SX-01 | Quaternion convention | §2 table covers ROS 2 (x,y,z,w) to SpatialDDS (x,y,z,w) identity mapping for S3E's ROS 2 bag source. |
| SX-02 | Coordinate frame convention | Right-handed; S3E uses right-hand rule per documentation. |
| SX-03 | Time synchronization | Hardware PPS-synchronized timestamps map directly to Time { sec, nanosec }. |
Results
All 38 S3E checks pass.
| Modality | Checks | Pass | Remaining Gaps |
|---|---|---|---|
| LiDAR | 5 | 5 | 0 |
| Vision | 4 | 4 | 0 |
| IMU | 3 | 3 | 0 |
| GNSS/RTK | 3 | 3 | 0 |
| UWB (inter-robot range) | 4 | 4 | 0 |
| Core Pose Graph | 5 | 5 | 0 |
| Mapping (multi-agent) | 8 | 8 | 0 |
| Discovery & Coordination | 3 | 3 | 0 |
| Cross-cutting | 3 | 3 | 0 |
| Total | 38 | 38 | 0 |
S3E Scenario Narrative (Informative)
The S3E "teaching building" outdoor sequence illustrates the full multi-agent lifecycle:
1. Bootstrap. Three robots (Alpha, Blob, Carol) power on and each publishes an Announce with ServiceKind.SLAM, its sensor capabilities, and an initial coverage bounding box. Each begins publishing core::Node and core::Edge (ODOM) on the pose graph topics with distinct source_id and map_id values.
2. Independent mapping. Each robot runs visual-inertial-lidar SLAM independently. MapMeta per robot shows state = BUILDING. Keyframes stream as core::Node; odometry constraints as core::Edge (ODOM); intra-robot loop closures as core::Edge (LOOP). ImuSample, VisionFrame, and LidarFrame are published on per-robot sensor topics.
3. UWB ranging begins. As robots come within UWB range (~50 m), pairwise distance measurements are published as mapping::Edge with type = RANGE, range_m carrying the measured distance, and has_from_map_id / has_to_map_id identifying which robots' maps the linked nodes belong to.
4. Inter-robot loop closure. When Alpha's and Blob's LiDAR scans overlap, a cross-robot loop closure is detected. This is published as mapping::Edge with type = INTER_MAP, match_score carrying the ICP fitness, and from_map_id = "alpha-map", to_map_id = "blob-map".
5. Map alignment. A MapAlignment is published linking Alpha's and Blob's maps, with method = LIDAR_ICP (or MULTI_METHOD if UWB ranges were fused), T_from_to carrying the inter-map transform, and evidence_edge_ids[] referencing the supporting cross-map edges. A MapEvent with MAP_ALIGNED notifies all subscribers.
6. Multi-frame localization. Once the alignment exists, a geo-referencing service can publish NodeGeo with poses[] containing FramedPose entries in both Alpha's and Blob's map frames simultaneously. Consumers (e.g., a planning service) can pick the frame they need.
7. Graph optimization. After sufficient inter-robot constraints accumulate, a global optimizer runs. All robots' MapMeta.state transitions to OPTIMIZING, then STABLE. graph_epoch increments on all nodes and edges, and MapAlignment.revision increments. Consumers watching graph_epoch know to re-fetch the entire graph.
This end-to-end scenario is precisely what ROS 2's nav_msgs and sensor_msgs cannot express: there is no ROS 2 standard for map lifecycle, inter-map alignment, range-only constraints, or multi-agent discovery with spatial coverage.
I.4 ScanNet Conformance (Indoor Scene Understanding)
Reference Dataset
ScanNet (TU Munich / Princeton) is an RGB-D video dataset of indoor scenes containing:
| Dimension | Value |
|---|---|
| Scenes | 1,513 (707 unique spaces, multiple rescans) |
| RGB-D sensor | Structure.io depth + iPad color camera |
| Depth format | 16-bit unsigned integer, millimeters, 640×480 @ 30 Hz |
| Color format | JPEG-compressed RGB, 1296×968 @ 30 Hz |
| Camera poses | Per-frame 4×4 camera-to-world extrinsics via BundleFusion |
| IMU | Embedded IMU data in .sens stream |
| Surface reconstruction | Dense triangle mesh (PLY) via BundleFusion |
| Semantic annotations | Instance-level labels (NYU40 label set, 40 classes) |
| Instance annotations | Per-vertex segment IDs + aggregated object instances |
| Scene types | 20 categories (bathroom, bedroom, kitchen, living room, office, etc.) |
| Axis alignment | Per-scene 4×4 gravity-alignment matrix |
| Coordinate convention | Right-handed; +Z up in aligned frame |
ScanNet was chosen because it is the definitive indoor RGB-D scene understanding benchmark, exercises depth sensing absent from all three prior conformance datasets, and provides room-level semantic structure that naturally maps to the Spatial Events extension — the only SpatialDDS extension not yet tested by conformance.
Checks Performed (35)
RGB-D Sensing — Color (4 checks)
| ID | Check | Description |
|---|---|---|
| NC-01 | Color meta | VisionMeta with pix = RGB8, codec = JPEG, CamIntrinsics (fx, fy, cx, cy at 1296×968). |
| NC-02 | Color frame | VisionFrame per RGB image with frame_seq, hdr.stamp, blob reference to JPEG payload. |
| NC-03 | Per-scene stream isolation | Topic spatialdds/<scene_id>/vision/<stream_id>/frame/v1 with unique stream_id per scan. |
| NC-04 | Rig linkage | VisionMeta.rig_id shared between color and depth streams for spatial association. |
RGB-D Sensing — Depth (5 checks)
| ID | Check | Description |
|---|---|---|
| ND-01 | Depth meta | VisionMeta with pix = DEPTH16, codec = NONE (raw 16-bit), CamIntrinsics for depth camera. |
| ND-02 | Depth pixel format | PixFormat.DEPTH16 explicitly identifies 16-bit millimeter depth. Requires SN-1. |
| ND-03 | Depth frame | VisionFrame per depth image with frame_seq matching co-located color frame. |
| ND-04 | Invalid depth convention | Zero-valued pixels denote no measurement, consistent with DEPTH16 normative note. |
| ND-05 | Depth unit | Default millimeter unit; no depth_unit attribute required for ScanNet's Structure.io sensor. |
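The ND-04/ND-05 decode rule is straightforward; a minimal sketch (function name illustrative), assuming the DEPTH16 default of uint16 millimeters with zero meaning "no measurement":

```python
def depth16_to_meters(raw_mm):
    """Decode DEPTH16 pixels: uint16 millimeters; zero denotes no measurement."""
    return [None if v == 0 else v / 1000.0 for v in raw_mm]

# A 4-pixel row: one hole (0) and three valid samples.
row = depth16_to_meters([0, 1500, 2750, 640])
```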
IMU (2 checks)
| ID | Check | Description |
|---|---|---|
| NI-01 | IMU sample | ImuSample with accel (Vec3, m/s²) + gyro (Vec3, rad/s) covers 6-axis IMU embedded in .sens stream. |
| NI-02 | Temporal ordering | ImuSample.seq provides monotonic ordering within the scan. |
Camera Pose & Frames (4 checks)
| ID | Check | Description |
|---|---|---|
| NP-01 | Per-frame pose | Camera-to-world 4×4 matrix maps to FrameHeader.sensor_pose (PoseSE3: translation + quaternion). |
| NP-02 | Axis-alignment transform | Per-scene gravity-alignment matrix published as FrameTransform from sensor frame to aligned frame. |
| NP-03 | Frame hierarchy | Aligned frame FQN follows §2.2 pattern: <scene_id>/aligned. |
| NP-04 | Quaternion convention | ScanNet uses 4×4 rotation matrices; decomposition to (x,y,z,w) quaternion per §2 convention table. |
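The NP-04 decomposition from a ScanNet 3×3 rotation matrix to an (x, y, z, w) quaternion can be sketched with the standard branch-on-largest-diagonal method (function name illustrative; this is one common formulation, not normative text):

```python
import math

def rotmat_to_xyzw(R):
    """Decompose a 3x3 rotation matrix into an (x, y, z, w) quaternion."""
    t = R[0][0] + R[1][1] + R[2][2]
    if t > 0.0:
        s = math.sqrt(t + 1.0) * 2.0
        return ((R[2][1] - R[1][2]) / s, (R[0][2] - R[2][0]) / s,
                (R[1][0] - R[0][1]) / s, 0.25 * s)
    if R[0][0] >= R[1][1] and R[0][0] >= R[2][2]:
        s = math.sqrt(1.0 + R[0][0] - R[1][1] - R[2][2]) * 2.0
        return (0.25 * s, (R[0][1] + R[1][0]) / s,
                (R[0][2] + R[2][0]) / s, (R[2][1] - R[1][2]) / s)
    if R[1][1] >= R[2][2]:
        s = math.sqrt(1.0 + R[1][1] - R[0][0] - R[2][2]) * 2.0
        return ((R[0][1] + R[1][0]) / s, 0.25 * s,
                (R[1][2] + R[2][1]) / s, (R[0][2] - R[2][0]) / s)
    s = math.sqrt(1.0 + R[2][2] - R[0][0] - R[1][1]) * 2.0
    return ((R[0][2] + R[2][0]) / s, (R[1][2] + R[2][1]) / s,
            0.25 * s, (R[1][0] - R[0][1]) / s)

identity_q = rotmat_to_xyzw([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
z90_q = rotmat_to_xyzw([[0, -1, 0], [1, 0, 0], [0, 0, 1]])  # 90° about +Z
```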
Mesh Reconstruction (4 checks)
| ID | Check | Description |
|---|---|---|
| NM-01 | Map kind | MapMeta with kind = MESH for BundleFusion surface reconstruction. |
| NM-02 | Map lifecycle | MapMeta.state = STABLE for completed reconstructions (offline dataset; no BUILDING phase observed). |
| NM-03 | Mesh payload | BlobRef referencing PLY mesh file. SpatialDDS carries mesh references, not inline mesh data. |
| NM-04 | Vertex count metadata | MapMeta.attributes carries vertex/face count as MetaKV for consumers to assess mesh complexity. |
3D Instance Segmentation — Semantics (6 checks)
| ID | Check | Description |
|---|---|---|
| NS-01 | 3D detection | Detection3D per annotated object instance, with class_id from NYU40 label set (e.g., "chair", "table", "door"). |
| NS-02 | Instance ID | Detection3D.det_id unique per object instance within a scene (maps from ScanNet's objectId). |
| NS-03 | Oriented bounding box | Detection3D.center + size + q cover ScanNet's axis-aligned bounding boxes (identity quaternion in aligned frame). |
| NS-04 | Track ID | Detection3D.track_id groups the same physical object across multiple rescans of the same space. |
| NS-05 | Visibility | Detection3D.visibility (0–1) maps from ScanNet annotation coverage ratio. |
| NS-06 | Class vocabulary | class_id as free-form string covers all 40 NYU40 categories without a closed enum — consistent with SpatialDDS's ontology-agnostic design. |
Spatial Events — Indoor Zones (6 checks)
| ID | Check | Description |
|---|---|---|
| NZ-01 | Room as zone | SpatialZone per ScanNet scene, with zone_id = scene ID, name = human-readable scene name. |
| NZ-02 | Zone kind | ZoneKind.MONITORING for general-purpose room observation (no access restriction implied). |
| NZ-03 | Zone bounds | SpatialZone.bounds (Aabb3) enclosing the room extent, derived from mesh bounding box in aligned frame. |
| NZ-04 | Scene type as attribute | ScanNet sceneType (bathroom, bedroom, kitchen, etc.) carried as MetaKV in SpatialZone.attributes with namespace = "scene_type", json = {"type": "kitchen"}. |
| NZ-05 | Class filter | SpatialZone.class_filter populated with object classes of interest (e.g., ["person", "chair", "table"]) for selective event triggering. |
| NZ-06 | Zone frame | SpatialZone.frame_ref references the gravity-aligned frame established by the axis-alignment transform (NP-02). |
Spatial Events — Object Events (4 checks)
| ID | Check | Description |
|---|---|---|
| NE-01 | Zone entry | SpatialEvent with event_type = ZONE_ENTRY when a Detection3D instance is first observed within a SpatialZone's bounds. |
| NE-02 | Trigger linkage | SpatialEvent.trigger_det_id references the triggering Detection3D.det_id; trigger_class_id carries the NYU40 label. |
| NE-03 | Zone state | ZoneState with zone_occupancy count reflecting the number of annotated object instances within the room. |
| NE-04 | Class counts | ZoneState.class_counts (sequence of MetaKV) carries per-class occupancy (e.g., {"count": 4} for class "chair"). |
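The NE-01/NE-03/NE-04 evaluation reduces to a point-in-Aabb3 test plus per-class counting. A minimal sketch, assuming detections are represented by their box centers in the zone frame (function name and dict shapes are illustrative, not the IDL):

```python
def zone_events(zone_bounds, class_filter, detections):
    """Evaluate ZONE_ENTRY events and a ZoneState snapshot for one zone.

    zone_bounds:  ((xmin, ymin, zmin), (xmax, ymax, zmax)) Aabb3 in the zone frame.
    class_filter: set of class_ids of interest, or None for all classes (NZ-05).
    detections:   iterable of (det_id, class_id, (cx, cy, cz)) box centers.
    """
    lo, hi = zone_bounds
    inside = [(det_id, cls) for det_id, cls, c in detections
              if all(lo[i] <= c[i] <= hi[i] for i in range(3))
              and (not class_filter or cls in class_filter)]
    events = [{"event_type": "ZONE_ENTRY",
               "trigger_det_id": det_id,
               "trigger_class_id": cls} for det_id, cls in inside]
    class_counts = {}
    for _, cls in inside:
        class_counts[cls] = class_counts.get(cls, 0) + 1
    state = {"zone_occupancy": len(inside), "class_counts": class_counts}
    return events, state

dets = [("d1", "chair", (1.0, 1.0, 0.5)),
        ("d2", "chair", (9.0, 9.0, 0.5)),   # outside the zone
        ("d3", "table", (2.0, 2.0, 0.4))]
events, state = zone_events(((0, 0, 0), (5, 5, 3)), None, dets)
```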
Results
All 35 ScanNet checks pass.
| Modality | Checks | Pass | Remaining Gaps |
|---|---|---|---|
| Color (RGB) | 4 | 4 | 0 |
| Depth (RGBD) | 5 | 5 | 0 |
| IMU | 2 | 2 | 0 |
| Camera Pose & Frames | 4 | 4 | 0 |
| Mesh Reconstruction | 4 | 4 | 0 |
| 3D Instance Segmentation | 6 | 6 | 0 |
| Spatial Events — Zones | 6 | 6 | 0 |
| Spatial Events — Object Events | 4 | 4 | 0 |
| Total | 35 | 35 | 0 |
ScanNet Scenario Narrative (Informative)
The ScanNet "apartment" scan sequence illustrates how SpatialDDS types map to a complete indoor scene understanding pipeline:
1. Scan ingestion. An operator walks through a kitchen with an iPad running the ScanNet capture app. Color frames are published as VisionFrame (pix = RGB8, codec = JPEG) and depth frames as VisionFrame (pix = DEPTH16, codec = NONE) on paired streams linked by rig_id. ImuSample streams concurrently from the embedded IMU.
2. Pose estimation. BundleFusion produces per-frame camera poses, published as FrameHeader.sensor_pose on each VisionFrame. The per-scene axis-alignment matrix is published as a FrameTransform from the sensor coordinate system to a gravity-aligned room frame.
3. Mesh reconstruction. The completed surface mesh is registered as MapMeta with kind = MESH and state = STABLE. The PLY file is referenced via BlobRef, and vertex/face counts are carried in MapMeta.attributes.
4. Zone definition. The kitchen is defined as a SpatialZone with kind = MONITORING, bounds enclosing the room extent, and attributes carrying scene_type = "kitchen". The frame_ref points to the gravity-aligned frame.
5. 3D instance detection. Crowdsourced annotations produce Detection3D instances for each labeled object: chairs with class_id = "chair", tables with class_id = "table", a refrigerator with class_id = "refrigerator". Each carries an oriented bounding box in the aligned frame.
6. Spatial events. A zone monitoring service evaluates which Detection3D instances fall within the kitchen SpatialZone's bounds and publishes a SpatialEvent (ZONE_ENTRY) for each. ZoneState is published periodically with zone_occupancy = 12 (total instances) and class_counts listing per-class breakdowns.
This pipeline exercises the Spatial Events extension end-to-end — from zone definition through detection to event generation — a capability path untested by nuScenes (no zones), DeepSense 6G (no zones), or S3E (no zones or semantics).
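The zone-monitoring step of the pipeline above can be sketched as a simple containment test that emits `SpatialEvent`-like records. This is a minimal illustration, assuming axis-aligned zone bounds and dict-mirrored structures; the real `SpatialZone.bounds` representation and event fields are defined by the normative IDL, and the field names here follow checks NE-01/NE-02.

```python
# Hypothetical zone-monitoring sketch: an axis-aligned containment test
# producing SpatialEvent-like dicts for detections inside a zone.

def zone_entry_events(zone, detections):
    (xmin, ymin, zmin), (xmax, ymax, zmax) = zone["bounds"]
    events = []
    for det in detections:
        x, y, z = det["center"]
        if xmin <= x <= xmax and ymin <= y <= ymax and zmin <= z <= zmax:
            events.append({
                "event_type": "ZONE_ENTRY",           # NE-01: first observation in zone
                "zone_id": zone["zone_id"],
                "trigger_det_id": det["det_id"],      # NE-02: linkage to Detection3D
                "trigger_class_id": det["class_id"],  # NYU40 label
            })
    return events

kitchen = {"zone_id": "kitchen", "bounds": ((0, 0, 0), (4, 3, 2.5))}
dets = [
    {"det_id": "d1", "class_id": "chair", "center": (1.0, 1.0, 0.4)},
    {"det_id": "d2", "class_id": "bed", "center": (9.0, 1.0, 0.4)},  # outside zone
]
events = zone_entry_events(kitchen, dets)
assert [e["trigger_det_id"] for e in events] == ["d1"]
```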
Deferred Items
- Per-vertex semantic labels. ScanNet provides per-vertex class labels on the reconstructed mesh. SpatialDDS has no per-vertex label type; the labeled mesh PLY is carried as a `BlobRef`. A future per-vertex or per-point semantic annotation type could make this data first-class.
- CAD model alignment. ScanNet aligns ShapeNet CAD models to detected objects. The ShapeNet model ID can be carried in `Detection3D.attributes` as a `MetaKV`, but there is no first-class CAD reference type.
- 2D projected labels. ScanNet provides per-frame 2D semantic/instance label images. These can be published as `VisionFrame` with a label-specific `stream_id` and `pix = RAW16` (16-bit label IDs), but a dedicated label pixel format is not defined.
I.5 LaMAR Conformance (AR Localization with Radio Fingerprints)
Reference Dataset
LaMAR (CVG ETH Zurich / Microsoft) is an AR localization benchmark containing aligned visual and radio observations:
| Dimension | Value |
|---|---|
| Captures | Indoor/outdoor smartphone + HoloLens trajectories |
| Visual data | RGB image streams for retrieval/localization |
| Radio data | WiFi scans (wifi.txt) and Bluetooth scans (bt.txt) |
| Targets | Visual localization recall under radio-assisted retrieval |
| Notable finding | WiFi/BT-assisted retrieval improves localization recall (+4.6% to +17.5%) |
LaMAR was selected to validate radio-assisted AR workflows and close the prior LM-1 gap where radio observations were carried only as ad hoc MetaKV JSON payloads.
Checks Performed (22)
Radio Profile Coverage (12 checks)
| ID | Check | Description |
|---|---|---|
| LM-01 | Typed per-scan container | RadioScan carries one scan event with sensor_id, radio_type, scan_seq, and stamp. |
| LM-02 | Typed per-observation container | RadioObservation carries one transmitter measurement (identifier, measurement_kind, value). |
| LM-03 | WiFi identifier format | BSSID maps to lowercase colon-separated identifier. |
| LM-04 | BLE identifier format | Beacon UUID/MAC maps to canonical identifier. |
| LM-05 | RSSI representation | RSSI maps to measurement_kind = RSSI, value in dBm. |
| LM-06 | WiFi frequency/channel | frequency_mhz, band, and channel map with has_* guards. |
| LM-07 | BLE major/minor | iBeacon major/minor maps with has_major_minor. |
| LM-08 | BLE Tx power | Advertised Tx power maps with has_tx_power. |
| LM-09 | Scan duration | Variable scan-window duration maps to scan_duration_s. |
| LM-10 | Aggregation window | ±window aggregation maps to aggregation_window_s. |
| LM-11 | Sensor metadata | RadioSensorMeta captures capability flags and adapter metadata. |
| LM-12 | Schema tag | schema_version set to spatial.sensing.radio/1.5. |
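The LM-01 through LM-06 mappings can be sketched as a conversion from a WiFi scan record into `RadioScan`/`RadioObservation` dicts, in the same dict-mirror style as the harnesses. The input tuple layout below is a simplified stand-in, not the actual wifi.txt column order, and the enum spellings are illustrative.

```python
# Illustrative mapping (simplified input layout, not the real wifi.txt
# columns): one scan event with per-transmitter observations.

def to_radio_scan(sensor_id, stamp_ns, scan_seq, records):
    observations = []
    for bssid, rssi_dbm, freq_mhz in records:
        observations.append({
            "identifier": bssid.lower(),   # LM-03: lowercase colon-separated BSSID
            "measurement_kind": "RSSI",    # LM-05: value in dBm
            "value": float(rssi_dbm),
            "has_frequency": True,         # LM-06: has_* guard on optional fields
            "frequency_mhz": freq_mhz,
        })
    return {
        "schema_version": "spatial.sensing.radio/1.5",  # LM-12
        "sensor_id": sensor_id,                         # LM-01
        "radio_type": "WIFI",
        "scan_seq": scan_seq,
        "stamp": stamp_ns,
        "observations": observations,                   # LM-02
    }

scan = to_radio_scan("hl_wifi0", 1_700_000_000_000, 7,
                     [("AA:BB:CC:DD:EE:FF", -61, 5180)])
assert scan["observations"][0]["identifier"] == "aa:bb:cc:dd:ee:ff"
```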
Discovery and QoS Integration (5 checks)
| ID | Check | Description |
|---|---|---|
| LD-01 | Registered type | Discovery type registry includes radio_scan. |
| LD-02 | QoS profile | RADIO_SCAN_RT available for radio scan topics. |
| LD-03 | Topic naming | Topic pattern spatialdds/<scene>/radio/<sensor_id>/scan/v1 is valid under §3.3.1. |
| LD-04 | Meta durability | RadioSensorMeta uses RELIABLE + TRANSIENT_LOCAL semantics. |
| LD-05 | Optional fields | Radio optional values consistently follow has_* guard pattern. |
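An LD-03-style topic-name check can be sketched with a regular expression over the documented pattern. The permitted segment character set (`[a-z0-9_-]`) is an assumption for illustration; §3.3.1 defines the normative topic grammar.

```python
import re

# Sketch of an LD-03 check: validate a radio scan topic against the pattern
# spatialdds/<scene>/radio/<sensor_id>/scan/v1. Segment charset is assumed.
RADIO_TOPIC = re.compile(r"^spatialdds/[a-z0-9_-]+/radio/[a-z0-9_-]+/scan/v1$")

def is_valid_radio_topic(topic):
    return bool(RADIO_TOPIC.match(topic))

assert is_valid_radio_topic("spatialdds/cab_hall/radio/hl_wifi0/scan/v1")
assert not is_valid_radio_topic("spatialdds/cab_hall/radio/hl_wifi0/scan")
```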
Interop and Privacy (5 checks)
| ID | Check | Description |
|---|---|---|
| LP-01 | Multi-technology support | A device can publish separate WiFi and BLE scan streams with shared timebase. |
| LP-02 | Fingerprint matching readiness | Canonical identifier formats support stable join keys across sessions. |
| LP-03 | Pose association | Optional sensor_pose + pose_frame_ref supports radio-visual alignment. |
| LP-04 | Privacy guidance | Identifier anonymization guidance documented for sensitive deployments. |
| LP-05 | No algorithm coupling | Profile transports observations only; no positioning algorithm mandated. |
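In the spirit of LP-04 (and preserving LP-02's stable join keys), identifier anonymization might be implemented as a keyed hash over the canonical lowercase identifier. This is a hypothetical helper, not mandated by the profile; the key handling and 16-hex-digit truncation are illustrative choices.

```python
import hashlib
import hmac

# Hypothetical LP-04 sketch: a keyed hash keeps identifiers stable within a
# deployment (usable as join keys per LP-02) without exposing raw BSSIDs.
def anonymize_identifier(identifier, deployment_key):
    digest = hmac.new(deployment_key, identifier.lower().encode(), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncation length is illustrative

a = anonymize_identifier("AA:BB:CC:DD:EE:FF", b"site-secret")
b = anonymize_identifier("aa:bb:cc:dd:ee:ff", b"site-secret")
assert a == b  # canonicalized: case-insensitive, stable across sessions
assert a != anonymize_identifier("aa:bb:cc:dd:ee:00", b"site-secret")
```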
Results
All 22 LaMAR checks pass.
| Modality | Checks | Pass | Gap | Missing | Notes |
|---|---|---|---|---|---|
| Radio profile | 12 | 12 | 0 | 0 | LM-1 closed via RadioScan/RadioSensorMeta |
| Discovery + QoS | 5 | 5 | 0 | 0 | radio_scan + RADIO_SCAN_RT integrated |
| Interop + privacy | 5 | 5 | 0 | 0 | Identifier and anonymization guidance documented |
| Total | 22 | 22 | 0 | 0 | 100% coverage |
Deferred Items
- CSI/CIR first-class payloads. `CSI_REF` currently points to external payloads. A future extension may define typed CSI/CIR transport.
- Multi-band coexistence metadata. Additional fields for scan policy and dwell-time scheduling may be needed for dense AP environments.
Reproducing the Tests
The nuScenes and DeepSense 6G conformance harnesses are self-contained Python 3 scripts with no external dependencies.
nuScenes harness (scripts/nuscenes_harness_v2.py):
python3 scripts/nuscenes_harness_v2.py
Mirrors the SpatialDDS 1.5 IDL structures as Python dictionaries and checks them against the nuScenes schema. Produces a plain-text report and a JSON results file.
DeepSense 6G harness (scripts/deepsense6g_harness_v3.py):
python3 scripts/deepsense6g_harness_v3.py
Validates 44 checks across 7 modalities (radar tensor, vision, lidar, IMU, GPS, mmWave beam, semantics). The mmWave beam checks validate against the provisional rf_beam profile (Appendix E). Produces a plain-text report and a JSON results file.
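The harness pattern described above (dict-mirrored IDL structures, named checks, plain-text/JSON output) can be sketched as follows. The function and check names here are illustrative, not the actual harness API.

```python
import json

# Minimal sketch of the harness pattern: each check is a named predicate
# over dict-mirrored structures; results are summarized as JSON.
def run_checks(checks):
    results = [{"id": cid, "pass": bool(fn())} for cid, fn in checks]
    report = {
        "total": len(results),
        "passed": sum(r["pass"] for r in results),
        "results": results,
    }
    return json.dumps(report, indent=2)

# Example: two hypothetical depth-frame checks against a mirrored sample.
sample = {"pix": "DEPTH16", "codec": "NONE"}
checks = [
    ("SN-depth-pix", lambda: sample["pix"] == "DEPTH16"),
    ("SN-depth-codec", lambda: sample["codec"] == "NONE"),
]
report = json.loads(run_checks(checks))
assert report["passed"] == report["total"] == 2
```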
S3E conformance: The 38 S3E checks documented in §I.3 were performed as a manual schema-vs-schema analysis. A scripted harness (scripts/s3e_harness_v1.py) following the same pattern as the nuScenes and DeepSense 6G scripts is planned for a future revision.
ScanNet conformance: The 35 ScanNet checks documented in §I.4 were performed as a manual schema-vs-schema analysis. A scripted harness (scripts/scannet_harness_v1.py) is planned for a future revision.
LaMAR conformance: The 22 LaMAR checks documented in §I.5 were performed as a manual schema-vs-schema analysis against the published wifi.txt and bt.txt field layouts and the radio-assisted retrieval workflow described by the benchmark. A scripted harness (scripts/lamar_harness_v1.py) is planned for a future revision.
No harness requires network access, a DDS runtime, or a dataset download. Implementers are encouraged to adapt the harnesses for additional reference datasets (e.g., Waymo Open, KITTI, Argoverse 2, RADIal, SubT-MRS) to validate coverage for sensor configurations or multi-agent scenarios not already covered.
Limitations
This testing validates schema expressiveness -- whether every dataset field has a lossless SpatialDDS mapping. It does not validate:
- Wire interoperability -- actual DDS serialization/deserialization round-trips.
- Performance -- throughput, latency, or memory footprint under real sensor loads.
- Semantic correctness -- whether a particular producer's mapping preserves the intended meaning of each field.
- Multi-dataset coverage -- datasets with different sensor configurations (e.g., solid-state lidar, event cameras, ultrasonic sensors) or deployment patterns (e.g., multi-floor hierarchical spaces, aerial-ground cooperation, dense pedestrian tracking) may surface additional gaps. S3E covers three-robot outdoor coordination; ScanNet covers single-room indoor scenes. Larger fleet sizes, degraded-communication environments, multi-floor buildings, and heterogeneous robot types (ground + aerial) remain untested.
These areas are appropriate targets for future conformance work.