Point-LIO native module + Virtual Livox #2486
Conversation
…lio_native flake/cmake consuming dimos-module-fastlio2 pointlio branch + Estimator/parameters sources)
…y path) The fast-lio input was pinned to file:///Users/jeffhykin/... which only exists on the Mac. Repoint to the dimensionalOS/dimos-module-fastlio2 pointlio branch on github so the flake builds on Linux. Same locked rev.
Rename the mirrored fastlio_blueprints.py to pointlio_blueprints.py and wire it to PointLio (was incorrectly using FastLio2). Adds mid360-pointlio and mid360-pointlio-voxels to the blueprint registry.
…race-fix flake relock Adds the offline replay/inspection tooling for the pointlio_native module: pcap_replay.hpp (streams a Livox pcap into the SDK callbacks), deterministic_clock + dual-thread replay options, the ruwik2_pt3 replay harness, the pcap_to_db tool (append pointlio_odometry into an existing memory db at ~30Hz, streaming), validated Point-LIO mid360.yaml, and the live smoke-test helper. flake.lock relocks onto the fastlio2 pointlio rev carrying the mtx_buffer race fix; main.cpp comment corrected to attribute the divergence fix to that lock (not the publish-rate gating).
Switches the flake's fast-lio input from path:/home/dimos/repos/dimos-module-fastlio2 to github:dimensionalOS/dimos-module-fastlio2/pointlio so the module builds on any machine without a local fastlio2 clone. Relocked onto rev 7e5d88f (the mtx_buffer race fix); verified pointlio_native builds from the github source.
…hful config OUTPUT-model replay of the Jun-7 hand-shake bag diverged to ~25km (vs hku-mars master's bounded ~5m). Two issues, both fixed here (paired with the dimos-module-fastlio2 pointlio-branch curvature/deque fixes): - main.cpp: heap-use-after-free thread race in bag-frame replay. The feeder thread drives run_main_iter via step() after each feed, but the main thread also drove run_main_iter, so both raced on the shared PCL measurement cloud (ASan: free in sync_packages, read-after-free in ImuProcess::Process). Force serial_replay in bag-frame mode so only the feeder thread touches the EKF. - CMakeLists.txt + flake.nix: the iVox map backend needs glog; add the find_package/link and the nix buildInput. Drop ikd_Tree.cpp from sources (iVox replaces ikd-Tree). - config/mid360.yaml (+ cpp/config duplicate): set satu_acc 5.5->3.0 (master value — accel >=3g saturated, residual zeroed, keeps velocity bounded) and add ivox_grid_resolution=2.0 / ivox_nearby_type=6 (NEARBY6) matching the master config that produced the 5.028m baseline. - replay_ruwik2_pt3.py: make MAX_WALL_SEC env-overridable. After the fix the OUTPUT model stays bounded (peak |pos| 3.898m) on the full bag; ASan reports zero memory errors.
…onsumes the iVox fix The iVox-port + divergence fixes live in dimos-module-fastlio2 @ pointlio commit 02d5066, which is committed locally but NOT pushed (per the task's no-push constraint). The flake's fast-lio input was pointing at github:dimensionalOS/dimos-module-fastlio2/pointlio, whose lock pinned the pre-fix rev 7e5d88f. Since CMakeLists.txt already drops ikd_Tree.cpp (iVox replaces ikd-Tree), building against that stale github source fails with KD_TREE undefined-reference linker errors. Repoint fast-lio at path:/home/dimos/repos/dimos-module-fastlio2 and bump the lock so `nix build .#pointlio_native` consumes the local 02d5066 source. Verified: build exits 0, no KD_TREE errors, 661176-byte binary. NOTE: this bakes a machine-specific absolute path into the lock. Once Repo A's pointlio branch is pushed, switch this input back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to 02d5066 for portability.
Now that dimos-module-fastlio2 pointlio (02d5066, the iVox port + divergence fix) is on origin, switch the fast-lio flake input from the local path: pin back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to rev 02d5066. This makes jeff/feat/pointlio_native self-contained and portable (no machine-specific path in the lock). Verified: nix build .#pointlio_native exits 0, no KD_TREE errors.
… add --force Extends pcap_to_db.py to populate both pointlio_odometry and pointlio_lidar streams (start-aligned onto the db's earliest ts) so pointlio can be compared against fastlio in one recording. --force overwrites existing pointlio streams. Untracks the machine-specific fastlio_test/setup_network symlinks (enhance overlay) and drops the empty tools/__init__.py.
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## main #2486 +/- ##
==========================================
- Coverage 70.09% 69.97% -0.13%
==========================================
Files 838 842 +4
Lines 74337 74469 +132
Branches 6667 6672 +5
==========================================
- Hits 52110 52109 -1
- Misses 20552 20658 +106
- Partials 1675 1702 +27
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 4 files with indirect coverage changes 🚀 New features to boost your workflow:
|
…o tools - Consolidate config/ to one default.yaml (the tuned Mid-360/iVox config); remove the unused upstream sensor presets (avia, horizon, marsim, ouster64, velodyne, mid360). Module now defaults to default.yaml. - Remove tools/replay_ruwik2_pt3.py (pcap_to_db.py covers offline replay) and tools/demo_live_test.py.
…egistry - --db is now optional: with no existing db, build one from scratch (defaults to <pcap>.db next to the pcap); with an existing db, append + time-align as before. - Rename the internal recorder Rec/RecConfig -> _Rec/_RecConfig so the blueprint-registry generator skips it (a tool helper isn't a public module); fixes test_all_blueprints_is_current. - Docstring: document from-scratch generation using the ruwik2_part3 LFS sample.
… rm cpp/README + config copies; concise comments
- config: consolidate to a single config/default.yaml; drop the ROS-only
lid_topic/imu_topic + odom frame-id/publish/pcd_save keys (parsed-but-unused
by the native binary), and note which params are Mid-360-specific.
- delete cpp/config/{mid360.json,mid360.yaml} (duplicate) and cpp/README.md.
- tighten verbose comments in main.cpp, pcap_replay.hpp, timing.hpp
(comments only; binary rebuilds clean).
… db name in docstring
120s clip (elapsed 55-175s) of the ruwik velocity-spike recording — the data
that diverges through FAST-LIO at the 0.5m pre-KF voxel and is bounded under
Point-LIO. Use via get_data('ruwik2_part3'); pcap_to_db --pcap <it> builds the
db from scratch.
Greptile SummaryThis PR adds two new modules:
Confidence Score: 5/5Safe to merge; all previously identified blocking issues have been addressed in this revision, and only two minor logic nits remain in the Rust virtual sensor binary. Every issue flagged in prior review rounds — velocity forwarding, cp.y coordinate assignment, TF frame IDs, ts falsy-zero fallbacks, shutdown ordering, cleanup scope, and DB ownership — is correctly resolved in the current diff. The two remaining findings (a double-sleep when the handshake times out with a non-zero delay, and unicast vs. broadcast in the discovery ACK) are both in the Rust virtual-sensor replay path, affect only non-default configurations or unusual network topologies, and do not corrupt data or block the primary use case. dimos/hardware/sensors/lidar/livox/virtual_mid360/src/main.rs — the two minor fixes in spawn_stream and spawn_discovery. Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant User
participant Outer as pcap_to_db outer
participant Inner as inner Python coord
participant PLIO as PointLio binary
participant VM as VirtualMid360 binary
participant DB as SQLite DB
User->>Outer: python -m pcap_to_db --pcap file.pcap
Outer->>Outer: setup netns/alias iface via sudo
Outer->>Inner: spawn in drv netns
Inner->>PLIO: spawn via NativeModule
PLIO->>PLIO: LivoxLidarSdkStart, wait for sensor
Outer->>Outer: sleep warmup_sec
Outer->>VM: spawn, write JSON config to stdin
VM->>VM: listen on discovery port 56000 and control port 56100
PLIO->>VM: discovery broadcast
VM->>PLIO: discovery ACK cmd 0x0000
PLIO->>VM: work-mode cmd 0x0100
VM->>PLIO: work-mode ACK arm stream
VM->>PLIO: replay point and IMU UDP at configured rate
PLIO->>PLIO: IESKF SLAM feed_lidar feed_imu process
PLIO->>Inner: LCM odometry and point cloud
Inner->>DB: append ts-aligned rows
Inner->>Inner: poll stagnant and startup timeout
Inner->>Outer: exit when pcap drained or timeout
Outer->>Outer: teardown netns alias restore db ownership
Outer->>User: print stats and exit
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant User
participant Outer as pcap_to_db outer
participant Inner as inner Python coord
participant PLIO as PointLio binary
participant VM as VirtualMid360 binary
participant DB as SQLite DB
User->>Outer: python -m pcap_to_db --pcap file.pcap
Outer->>Outer: setup netns/alias iface via sudo
Outer->>Inner: spawn in drv netns
Inner->>PLIO: spawn via NativeModule
PLIO->>PLIO: LivoxLidarSdkStart, wait for sensor
Outer->>Outer: sleep warmup_sec
Outer->>VM: spawn, write JSON config to stdin
VM->>VM: listen on discovery port 56000 and control port 56100
PLIO->>VM: discovery broadcast
VM->>PLIO: discovery ACK cmd 0x0000
PLIO->>VM: work-mode cmd 0x0100
VM->>PLIO: work-mode ACK arm stream
VM->>PLIO: replay point and IMU UDP at configured rate
PLIO->>PLIO: IESKF SLAM feed_lidar feed_imu process
PLIO->>Inner: LCM odometry and point cloud
Inner->>DB: append ts-aligned rows
Inner->>Inner: poll stagnant and startup timeout
Inner->>Outer: exit when pcap drained or timeout
Outer->>Outer: teardown netns alias restore db ownership
Outer->>User: print stats and exit
Reviews (30): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..." | Re-trigger Greptile |
…-KF downsample; C++ style - Rename output-cloud filter config sor_mean_k/sor_stddev -> outlier_neighbor_count/ outlier_std_threshold; add raw_cloud bool that bypasses voxel+outlier entirely; voxel_size<=0 now skips downsampling. - Document the pre-KF (input) downsampling in default.yaml as distinct + how to disable (point_filter_num=1, space_down_sample=false). - main.cpp: unwrap the multi-line fn signatures; brace every if/for body.
…ers); filter_size_surf 0.5->0.2 - main.cpp: fastlio_debug->pointlio_debug, include fast_lio.hpp/fast_lio_debug.hpp -> pointlio.hpp/pointlio_debug.hpp; flake bumped to dimos-module-fastlio2@286c530 (the matching upstream rename). - filter_size_surf 0.5->0.2 (denser pre-KF scan). Verified bounded over 3 replays of ruwik2_part3: max|pos| 9.59 / 9.54 / 9.32 m (was ~15m at 0.5).
The earlier printf-unwrap pass collapsed lines that ended in a trailing comment, eating the next statement into the comment: cp.y (both CARTESIAN paths), cp.offset_time (HIGH path), and the g_child_frame_id/g_frequency globals were all silently dropped. Split them back onto their own lines so they execute again.
…null-deref race) LivoxLidarSdkUninit() now runs before g_point_lio/g_lcm are nulled, so the SDK callback threads (on_imu_data/on_point_cloud) are stopped + joined first and can't race the assignment into a null dereference.
Re-tarred data/mid360_shake_stairs/ (the Mid-360 shake-on-stairs clip) to data/.lfs/mid360_shake_stairs.tar.gz (uploaded to LFS), removed the old ruwik2_part3 tarball, and updated the get_data() reference in pcap_to_db.
Remove cloud_filter.hpp (PCL voxel-grid + statistical-outlier-removal) and its config (raw_cloud/voxel_size/outlier_*); publish_lidar now emits get_body_cloud as-is. The pre-KF downsample (Point-LIO's own space_down_sample/filter_size_surf in default.yaml) is untouched.
| lidar_ip: str | ||
| snaplen: int = 2048 |
There was a problem hiding this comment.
lidar_ip required but env-var fallback missing
Mid360PcapRecorderConfig.lidar_ip is a bare str with no default, so Pydantic requires it at construction time. The PR description shows it commented out (# lidar_ip="192.168.1.155") and the setup section advertises export DIMOS_MID360_LIDAR_IP, but unlike VirtualMid360Config (which reads DIMOS_MID360_LIDAR_IP via a Field(default_factory=...)) the recorder config has no env-var hook. Any user who follows the documented example without explicitly passing lidar_ip will get a ValidationError: Field required before the module even starts.
| Err(msg) => { | ||
| // Exit non-zero so the coordinator surfaces the fix command. | ||
| tracing::error!("{msg}"); | ||
| eprintln!("\n[virtual_mid360] {msg}\n"); |
There was a problem hiding this comment.
Why eprintln! instead of just using tracing?
Adds virtual mid360, and pointlio as alternative to fastlio
Usage / Testing
Pcap to DB
Record a Mid-360 pcap
Replay a pcap to a module