feat: in-house robot asset manager#2505
Conversation
Codecov Report❌ Patch coverage is @@ Coverage Diff @@
## main #2505 +/- ##
==========================================
+ Coverage 70.09% 70.12% +0.02%
==========================================
Files 838 846 +8
Lines 74337 74797 +460
Branches 6667 6742 +75
==========================================
+ Hits 52109 52449 +340
- Misses 20553 20643 +90
- Partials 1675 1705 +30
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 7 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Greptile SummaryThis PR introduces a Git-backed robot asset manager that clones upstream robot description repositories into
Confidence Score: 4/5Safe to merge for its primary use case; one gap in the cache-update guard can silently discard local commits made to cached checkouts. The core asset management pipeline is well-structured and the lazy path adapters, xacro rendering, and Drake cleanup layers are clean. The one concrete defect is in dimos/robot/assets/git_cache.py — the Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Catalog as catalog/ufactory.py
participant RAP as RobotAssetPath (lazy)
participant RAM as RobotAssetManager
participant GAC as GitAssetCache
participant FS as ~/.cache/dimos/robot_assets
Catalog->>RAP: RobotAssetPath("xarm7", "urdf") -- no I/O
Note over RAP: Stores model+role, manager ref<br/>resolved_cache = None
Catalog->>RAM: robot_asset_package_paths("xarm7") -- no I/O
RAM-->>Catalog: "{xarm_description: RobotAssetPackagePath(...)}"
Note over Catalog: First use of model_path or package_paths triggers resolution
RAP->>RAM: resolve_artifact("xarm7", "urdf")
RAM->>GAC: resolve(repo_url, "humble")
GAC->>FS: FileLock acquire
alt Cache miss
GAC->>FS: "clone xarm_ros2 to sources/{hash}/xarm_ros2/"
else Cache hit, clean
GAC->>FS: fetch + checkout -B humble origin/humble
else Cache hit, dirty
GAC-->>RAM: warn, return cached path
end
GAC-->>RAM: GitAssetCheckout(path)
RAM-->>RAP: checkout/xarm_description/urdf/xarm_device.urdf.xacro
RAP->>RAP: cache resolved Path
RAM->>FS: "render_urdf to derived/rendered_urdfs/{hash}/*.urdf"
RAM->>FS: "Drake cleanup to derived/drake_urdfs/{hash}/*.urdf"
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant Catalog as catalog/ufactory.py
participant RAP as RobotAssetPath (lazy)
participant RAM as RobotAssetManager
participant GAC as GitAssetCache
participant FS as ~/.cache/dimos/robot_assets
Catalog->>RAP: RobotAssetPath("xarm7", "urdf") -- no I/O
Note over RAP: Stores model+role, manager ref<br/>resolved_cache = None
Catalog->>RAM: robot_asset_package_paths("xarm7") -- no I/O
RAM-->>Catalog: "{xarm_description: RobotAssetPackagePath(...)}"
Note over Catalog: First use of model_path or package_paths triggers resolution
RAP->>RAM: resolve_artifact("xarm7", "urdf")
RAM->>GAC: resolve(repo_url, "humble")
GAC->>FS: FileLock acquire
alt Cache miss
GAC->>FS: "clone xarm_ros2 to sources/{hash}/xarm_ros2/"
else Cache hit, clean
GAC->>FS: fetch + checkout -B humble origin/humble
else Cache hit, dirty
GAC-->>RAM: warn, return cached path
end
GAC-->>RAM: GitAssetCheckout(path)
RAM-->>RAP: checkout/xarm_description/urdf/xarm_device.urdf.xacro
RAP->>RAP: cache resolved Path
RAM->>FS: "render_urdf to derived/rendered_urdfs/{hash}/*.urdf"
RAM->>FS: "Drake cleanup to derived/drake_urdfs/{hash}/*.urdf"
Reviews (2): Last reviewed commit: "spec: remove" | Re-trigger Greptile |
| def _generate_render_key( | ||
| urdf_path: Path, | ||
| package_paths: Mapping[str, Path], | ||
| xacro_args: Mapping[str, str], | ||
| package_uri_mode: PackageUriMode, | ||
| ) -> str: | ||
| processing_version = "urdf-render-v1" | ||
| mtime = urdf_path.stat().st_mtime if urdf_path.exists() else 0 | ||
| key_data = repr( | ||
| ( | ||
| processing_version, | ||
| str(urdf_path), | ||
| mtime, | ||
| sorted((name, str(path)) for name, path in package_paths.items()), | ||
| sorted(xacro_args.items()), | ||
| package_uri_mode, | ||
| ) | ||
| ) | ||
| return hashlib.sha256(key_data.encode()).hexdigest()[:16] |
There was a problem hiding this comment.
Stale render cache when only included xacro files change
The cache key is built from the top-level xacro file's mtime only. When _refresh_cached runs git reset --hard origin/{ref}, git only updates the mtime of changed files. If an upstream commit modifies an included xacro (e.g., a joint-limit macro or mesh include) without touching the entry-point file, the cache key stays identical and the old rendered URDF is silently reused — potentially loading the robot with outdated joint parameters or mesh paths. To fix this robustly the key would need to fold in the mtime (or content hash) of each file reachable from xacro_path, or at least the mtimes of every file under the package-root directories supplied via package_paths.
| cache_path = _RENDERED_URDF_CACHE_ROOT / cache_key / source_path.stem | ||
| cache_path.mkdir(parents=True, exist_ok=True) | ||
| rendered_urdf = cache_path / f"{source_path.stem}.urdf" |
There was a problem hiding this comment.
Double
.urdf extension for .urdf.xacro sources
Path("xarm_device.urdf.xacro").stem is "xarm_device.urdf", so the rendered file ends up as xarm_device.urdf.urdf inside the cache. This also propagates to the Drake-specific cache in mesh_utils.py (rendered_urdf.stem = "xarm_device.urdf", final file = "xarm_device.urdf.urdf"). The file is still read correctly, but the double extension makes cache directories harder to inspect and could confuse tools that infer format from the extension. Consider stripping all non-final suffixes when building the stem, e.g. source_path.stem.removesuffix(".urdf").
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
| @staticmethod | ||
| def _is_dirty(repo: Repo) -> bool: | ||
| return repo.is_dirty(untracked_files=True) | ||
|
|
||
| @staticmethod | ||
| def _checkout_ref(repo: Repo, ref: str) -> None: | ||
| """Check out a branch, tag, or commit and update clean worktrees safely.""" | ||
| remote_ref = f"origin/{ref}" | ||
| remote_refs = {str(r) for r in repo.remotes.origin.refs} | ||
|
|
||
| if remote_ref in remote_refs: | ||
| repo.git.checkout("-B", ref, remote_ref) | ||
| repo.git.reset("--hard", remote_ref) |
There was a problem hiding this comment.
_is_dirty misses local commits, causing silent data loss on update
_is_dirty returns True only when the working tree has uncommitted modifications or untracked files. It does not detect local commits that have been made on the branch but not yet pushed. As a result, when a cached checkout has clean working-tree state but one or more local commits on the tracked branch, the guard passes and _checkout_ref immediately runs git checkout -B ref origin/ref + git reset --hard origin/ref. These two commands forcibly move the local branch pointer back to origin/ref, making those local commits unreachable without going through the reflog — and also removing any files introduced by those commits from the working tree.
This contradicts the class docstring's policy: "if the cached repository has local changes, warn and skip updates." Local commits are local changes in the meaningful sense. A developer who tests a model tweak by committing it inside ~/.cache/dimos/robot_assets/ would lose that work silently on the next DimOS run.
To also protect local commits, _is_dirty should additionally check whether the local branch is ahead of its remote tracking branch.
Problem
DimOS currently keeps several manipulation robot model bundles in Git LFS. That copies upstream robot description assets into this repo, makes updates manual, and makes CI brittle on runners that do not have those LFS objects available.
robot_descriptions.pysolves a related problem, but it is not a good fit for DimOS as the main asset layer:Because of those limits, this PR adds a small in-house Robot Asset Manager instead of replacing LFS with
robot_descriptions.py.Solution
Add a Git-backed robot asset layer for manipulation robot descriptions:
GitAssetCachethat clones and updates upstream source repos under~/.cache/dimos/robot_assets;RobotAssetPath/RobotAssetPackagePathobjects so catalog imports do not trigger network or Git work;The asset code is kept self-contained under
dimos/robot/assetsso it can be extracted into a standalone package later.How to Test
Manual check for a human reviewer:
Open the Drake Meshcat URL printed in the logs. The xArm7 model should load and visualize correctly. The first run may clone
xarm_ros2into~/.cache/dimos/robot_assets; derived URDFs should be written under~/.cache/dimos/robot_assets/derived.For daemon mode:
Contributor License Agreement