
Add K8s eval support, Kaniko build pipeline, and OpenAI config #1

Open
qywu wants to merge 1 commit into main from k8s-eval-support

@qywu commented May 29, 2026

Summary

  • K8s mode for hugging_face_task: set ENVIRONMENT_IMAGE to skip docker-compose and run a fresh Pod+Service per eval run; uses ClusterIP directly (no cluster DNS required from dev pods)
  • build_and_push.sh: Kaniko-based image build using the shared WekaFS PVC as build context — no Docker daemon needed
  • environment/Dockerfile.kaniko: Kaniko-compatible Dockerfile replacing BuildKit --mount=type=secret with ARG/ENV GITHUB_TOKEN, plus a sed patch that redirects the private mercor-mcp-shared git dependency to a local stub
  • mcp_servers/packages/mercor-mcp-shared/: Stub package that provides mcp_schema (GeminiBaseModel, flatten_schema, etc.) without requiring access to the private Mercor-Intelligence GitHub org
  • MCP config: Added PYTHONPATH pointing to the stub for servers that depend on mercor-mcp-shared; removed code_execution_server (sandbox .so compile issue in Kaniko build)
  • k8s_environment.yaml: Reference K8s manifests for the environment pod/service
  • litellm bump: 1.83.10 → 1.86.2 in both agents/ and grading/
  • Model config: switched to OpenAI for both agent and LLM judge
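
As a rough illustration of the per-run Pod+Service idea in the first bullet, the sketch below builds the two manifests as plain dicts and derives the environment URL from the Service's assigned ClusterIP instead of its DNS name. The resource name prefix, the label scheme, the port 8080, and the function names are all assumptions made for this example; the actual values live in main.py and k8s_environment.yaml.

```python
from typing import Any

# Hypothetical port for illustration; the PR description does not state one.
ENV_PORT = 8080


def build_manifests(
    run_id: str, image: str, port: int = ENV_PORT
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (pod, service) manifests for a single eval run."""
    name = f"archipelago-env-{run_id}"  # hypothetical naming scheme
    labels = {"app": name}
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "environment",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ],
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": labels,  # must match the Pod labels above
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return pod, service


def env_url_from_service(created_service: dict[str, Any]) -> str:
    """Build the URL from the ClusterIP the API server assigned to the Service."""
    spec = created_service["spec"]
    return f"http://{spec['clusterIP']}:{spec['ports'][0]['port']}"
```

The manifests would be submitted with whatever client the script already uses; reading `spec.clusterIP` back from the created Service is what avoids the dependency on cluster DNS.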

Test plan

  • Build image: export ENVIRONMENT_IMAGE=<registry>/archipelago-environment:latest && ./build_and_push.sh
  • Run eval: cd examples/hugging_face_task && ./run.sh (or uv run python main.py with ENVIRONMENT_IMAGE set)
  • Verify MCP servers start (calendar, chat, sheets, filesystem, mail, pdf, slides, docs)
  • Verify agent completes the default Investment Banking task and grading runs
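
The run and verify steps above could be checked with a small helper along these lines. It is only a sketch: `select_mode` and `missing_servers` are hypothetical names, and treating an empty ENVIRONMENT_IMAGE as "unset" is an assumption about how main.py behaves.

```python
import os
from collections.abc import Iterable, Mapping

# Server names taken from the test plan above.
EXPECTED_SERVERS = (
    "calendar", "chat", "sheets", "filesystem", "mail", "pdf", "slides", "docs",
)


def select_mode(environ: Mapping[str, str] = os.environ) -> str:
    """Return "k8s" when ENVIRONMENT_IMAGE is set and non-empty, else "docker-compose"."""
    return "k8s" if environ.get("ENVIRONMENT_IMAGE", "").strip() else "docker-compose"


def missing_servers(reported: Iterable[str]) -> list[str]:
    """List expected MCP servers that are absent from the reported names."""
    seen = set(reported)
    return [name for name in EXPECTED_SERVERS if name not in seen]
```

An empty result from `missing_servers` would correspond to the "Verify MCP servers start" item passing.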

Key changes:
- examples/hugging_face_task/main.py: K8s mode via ENVIRONMENT_IMAGE env var,
  per-eval Pod+Service lifecycle, ClusterIP-based ENV_URL (no cluster DNS needed),
  ENV=local override for subprocess settings validation
- build_and_push.sh: Kaniko-based image build using shared PVC context,
  Dockerfile.kaniko generation that replaces BuildKit secrets with ARG/ENV
- environment/Dockerfile.kaniko: Kaniko-compatible Dockerfile with
  ARG/ENV GITHUB_TOKEN and sed patch to redirect mercor-mcp-shared git dep
  to local stub package
- mcp_servers/packages/mercor-mcp-shared/: Stub package providing mcp_schema
  (GeminiBaseModel, flatten_schema, etc.) without requiring private GitHub access
- examples/hugging_face_task/k8s_environment.yaml: Reference K8s manifests for the environment Pod and Service
- examples/hugging_face_task/mcp_config_all_oss_servers.json: Added PYTHONPATH
  override for servers depending on mercor-mcp-shared stub
- agents/pyproject.toml, grading/pyproject.toml: Bump litellm to 1.86.2
- orchestrator_config.json, grading_settings.json: Switch to OpenAI models
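
For the PYTHONPATH override on the MCP config, a minimal sketch of the transformation might look like the following. It assumes the config keeps its servers under a top-level "mcpServers" key with an optional "env" mapping per server; that layout, the function name, and the example stub path are assumptions, not details confirmed by this PR.

```python
import copy
import os
from typing import Any


def add_stub_pythonpath(
    config: dict[str, Any], servers: list[str], stub_path: str
) -> dict[str, Any]:
    """Return a copy of config with stub_path prepended to PYTHONPATH for each named server."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for name in servers:
        env = updated["mcpServers"][name].setdefault("env", {})
        existing = env.get("PYTHONPATH")
        # Prepend so the stub is found before any other location already listed.
        env["PYTHONPATH"] = (
            stub_path if not existing else os.pathsep.join([stub_path, existing])
        )
    return updated
```

Only servers that import from mercor-mcp-shared need to be listed; the others keep their environment unchanged.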