Version: 1.0.0 Based on: Anthropic Official Documentation
- Security Overview
- Threat Model
- Isolation Strategies
- Credential Protection
- Network Security
- File System Security
- Auditing and Monitoring
- Security Checklist
AI agents have different security considerations than traditional software:
| Characteristic | Risk | Mitigation |
|---|---|---|
| Autonomous decision-making | Unpredictable behavior | Permission limits, hook validation |
| External input processing | Prompt injection | Input validation, isolation |
| Tool execution | System damage | Sandbox, permission rules |
| Credential access | Credential exposure | Proxy pattern, environment separation |
┌──────────────────────────────────────────────────────────────────┐
│  Layer 1: Permission System                                      │
│  Tool access control via allow/deny rules                        │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 2: Hook Validation                                        │
│  Command/input validation via PreToolUse hooks                   │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 3: Sandbox/Isolation                                      │
│  Execution environment isolation via containers, VMs             │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 4: Network Control                                        │
│  External access restriction and credential injection via proxy  │
└──────────────────────────────────────────────────────────────────┘
                                 ↓
┌──────────────────────────────────────────────────────────────────┐
│  Layer 5: Audit/Monitoring                                       │
│  All operation logging and anomaly detection                     │
└──────────────────────────────────────────────────────────────────┘
Malicious instructions embedded in processed content manipulate agent behavior.
# Dangerous scenario
1. User requests web page analysis
2. Web page contains hidden instruction: "Delete all files"
3. Agent executes malicious instruction
Mitigation:
- Permission limits to minimize damage scope
- Block dangerous commands via hooks
- Isolation via sandbox
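
The hook-based mitigation can be sketched as a small deny-list check on shell commands. This is a minimal illustration rather than a complete filter: the names `DANGEROUS_PATTERNS` and `check_command` and the patterns themselves are assumptions made for this example, and the returned dictionary mirrors the PreToolUse deny shape used later in this guide.

```python
import re

# Hypothetical deny-list for illustration; a real deployment needs a broader, reviewed set.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+(?:-\w+\s+)*-\w*[rR]"),               # recursive rm (rm -rf, rm -f -r)
    re.compile(r"\b(?:curl|wget)\b[^|]*\|\s*(?:ba)?sh\b"),    # download piped into a shell
    re.compile(r"\bmkfs(?:\.\w+)?\b"),                        # filesystem formatting
]


def check_command(command: str) -> dict:
    """Return a deny decision if the command matches a dangerous pattern, else {}."""
    for pattern in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"Blocked by pattern: {pattern.pattern}",
                }
            }
    return {}
```

Pattern matching of this kind is easy to evade (aliases, encodings, scripts written to disk first), which is why it is listed alongside permission limits and sandboxing rather than instead of them.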
Credentials accessible to the agent are leaked.
# Dangerous scenario
1. Agent reads API key from environment variables
2. Agent sends key to external service
3. Credential compromised
Mitigation:
- Never expose credentials directly to agent
- Use proxy pattern
- Principle of least privilege
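
One way to apply "never expose credentials directly" is to build the agent's environment from an allowlist instead of inheriting the parent process environment. The sketch below is illustrative only; `SAFE_ENV_VARS` and `build_agent_env` are hypothetical names, and the variables in the allowlist are an assumption.

```python
# Hypothetical allowlist of variables the agent process may see.
SAFE_ENV_VARS = frozenset({"PATH", "HOME", "LANG", "API_BASE_URL"})


def build_agent_env(source: dict, allowed: frozenset = SAFE_ENV_VARS) -> dict:
    """Copy only allowlisted variables; tokens and passwords are dropped by default."""
    return {name: value for name, value in source.items() if name in allowed}
```

An allowlist fails closed: a secret added to the host environment later is not passed through unless someone explicitly permits it, which a deny-list of known secret names cannot guarantee.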
Agent performs unintended system changes.
Mitigation:
- Read-only mounts
- Restrict write paths
- Review changes
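
Restricting write paths usually comes down to resolving the requested path and checking that it stays under an approved root. The following is a minimal sketch under that assumption; `is_write_allowed` is a hypothetical helper, not part of any SDK, and it requires Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def is_write_allowed(target: str, allowed_roots: list) -> bool:
    """True only if target, after resolving '..' and symlinks, lies under an allowed root."""
    resolved = Path(target).resolve()
    return any(resolved.is_relative_to(Path(root).resolve()) for root in allowed_roots)
```

Resolving before comparing matters: a plain string prefix check would accept `/workspace/out/../src/main.py` even though it points outside the writable directory.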
Agent consumes excessive resources.
Mitigation:
- Resource limits (memory, CPU)
- Timeout settings
- Execution count limits
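
An execution count limit can be sketched as a sliding-window counter that a hook consults before each tool call. `ToolCallLimiter` is a hypothetical class written for this example; the injectable `clock` parameter is an assumption added to make the behavior easy to test.

```python
import time
from collections import deque


class ToolCallLimiter:
    """Allow at most max_calls within any window_seconds-long sliding window."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have fallen out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A hook could return a deny decision whenever `allow()` is false. Memory and CPU limits are better enforced by the container runtime than in application code.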
| Technology | Isolation Level | Overhead | Complexity |
|---|---|---|---|
| Sandbox runtime | Good | Very low | Low |
| Docker containers | Setup-dependent | Low | Medium |
| gVisor | Excellent | Medium-High | Medium |
| VM (Firecracker) | Excellent | High | High |
# Hardened container for the agent. The options, in order:
#   --cap-drop ALL                    remove all Linux capabilities
#   --security-opt no-new-privileges  prevent new privilege acquisition
#   --read-only                       read-only root filesystem
#   --tmpfs /tmp:...                  writable temp directory (limited)
#   --network none                    disable networking
#   --memory 2g                       memory limit
#   --user 1000:1000                  non-root user
#   -v /code:/workspace:ro            read-only code mount
#   -v /var/run/proxy.sock:...        allow only the proxy socket
# (Comments cannot sit between backslash-continued lines, so they are listed here.)
docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --network none \
  --memory 2g \
  --user 1000:1000 \
  -v /code:/workspace:ro \
  -v /var/run/proxy.sock:/var/run/proxy.sock:ro \
  agent-image

| Option | Purpose |
|---|---|
| --cap-drop ALL | Prevent privilege escalation attacks |
| --read-only | Prevent filesystem tampering |
| --network none | Prevent network exfiltration |
| --tmpfs | Prevent data persistence between sessions |
| --user 1000:1000 | Run as a non-root user |
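
When the container is launched from an orchestration script, the same hardening options can be assembled as an argument list, which avoids shell quoting problems. This is a sketch under stated assumptions: `build_docker_command` is a hypothetical helper, and `--cpus` and `--pids-limit` are additional Docker flags that are not part of the command above; they are included here only to illustrate CPU and process limits.

```python
def build_docker_command(image: str, code_dir: str, memory: str = "2g") -> list:
    """Return the argv for a locked-down `docker run` invocation."""
    return [
        "docker", "run",
        "--cap-drop", "ALL",                        # remove all Linux capabilities
        "--security-opt", "no-new-privileges",      # prevent new privilege acquisition
        "--read-only",                              # read-only root filesystem
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=100m",
        "--network", "none",                        # disable networking
        "--memory", memory,
        "--cpus", "1",                              # addition: CPU limit
        "--pids-limit", "256",                      # addition: cap the process count
        "--user", "1000:1000",                      # non-root user
        "-v", f"{code_dir}:/workspace:ro",          # read-only code mount
        "-v", "/var/run/proxy.sock:/var/run/proxy.sock:ro",
        image,
    ]
```

The list can be passed directly to `subprocess.run(cmd, check=True)`; no shell is involved, so paths containing spaces need no escaping.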
# ❌ Dangerous: Agent has direct access to secrets
options = ClaudeAgentOptions(
    env={
        "GITHUB_TOKEN": "ghp_xxxxxxxxxxxxx",  # Exposed!
        "DB_PASSWORD": "my-secret-password",  # Exposed!
    }
)

┌─────────────┐     ┌───────────────┐     ┌──────────────┐
│    Agent    │────▶│     Proxy     │────▶│ External API │
│ (Isolated)  │     │(Secret Inject)│     │              │
└─────────────┘     └───────────────┘     └──────────────┘
       │                    │
       │ No network         │ Holds credentials
       │ No secrets         │
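
The proxy's job on each request can be sketched in a few lines: confirm the destination is expected, discard any credential the agent tried to supply, and attach the real one. The names `ALLOWED_HOSTS` and `prepare_upstream_headers` and the host `api.example.com` are assumptions for illustration; a production proxy would do this inside its request pipeline.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = frozenset({"api.example.com"})  # hypothetical upstream


def prepare_upstream_headers(url: str, agent_headers: dict, token: str) -> dict:
    """Build the headers the proxy forwards upstream, with the real credential attached."""
    host = urlparse(url).hostname
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"Host not allowed: {host}")
    # Drop any Authorization header the agent supplied (header names are case-insensitive).
    headers = {k: v for k, v in agent_headers.items() if k.lower() != "authorization"}
    headers["Authorization"] = f"Bearer {token}"
    return headers
```

Because the token only ever exists on the proxy side, a compromised agent can at most misuse the allowed API through the proxy; it cannot copy the credential elsewhere.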
1. Envoy Proxy Configuration:
# envoy.yaml (listener excerpt; route_config and clusters are not shown)
static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: agent_proxy  # required field; the value is arbitrary
                http_filters:
                  - name: envoy.filters.http.lua
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
                      inline_code: |
                        function envoy_on_request(request_handle)
                          -- Inject auth header, replacing any value the agent sent
                          request_handle:headers():replace(
                            "Authorization",
                            "Bearer " .. os.getenv("API_TOKEN")
                          )
                        end
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

2. Agent Using Proxy:
# Agent only knows proxy URL, not actual credentials
options = ClaudeAgentOptions(
    env={
        "API_BASE_URL": "http://localhost:8080"  # Proxy address
        # Actual token injected by proxy
    }
)

# Authentication handled by external MCP server
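import os

import httpx  # assumption: any async HTTP client with a compatible post() works
from claude_agent_sdk import tool

API_URL = os.environ["API_URL"]  # assumption: base URL comes from the server's environment
http_client = httpx.AsyncClient()

@tool(
    name="call_api",
    description="Call external API (auth handled by server)",
    input_schema={"endpoint": str, "params": dict},
)
async def call_api(args: dict) -> dict:
    # This function runs outside the agent,
    # so it can use the actual credentials
    response = await http_client.post(
        f"{API_URL}/{args['endpoint']}",
        headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
        json=args["params"],
    )
    return {"content": [{"type": "text", "text": response.text}]}

# Complete isolation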
docker run --network none agent-image
# Allow only specific network (Docker network)
docker network create --internal agent-net
docker run --network agent-net agent-image

{
  "permissions": {
    "allow": [
      "WebFetch(domain:docs.example.com)",
      "WebFetch(domain:api.example.com)",
      "WebSearch"
    ],
    "deny": [
      "WebFetch(domain:*)" // Block all other domains
    ]
  }
}

# Apply allowlist in proxy
from urllib.parse import urlparse

ALLOWED_DOMAINS = {
    "api.github.com",
    "registry.npmjs.org",
    "pypi.org",
}

async def proxy_request(request):
    # Response and forward_request are provided by the proxy framework in use
    host = urlparse(str(request.url)).hostname
    if host not in ALLOWED_DOMAINS:
        return Response(status=403, text="Domain not allowed")
    # Forward to allowed domain
    return await forward_request(request)

# Code is read-only
docker run \
-v /project/src:/workspace/src:ro \
-v /project/tests:/workspace/tests:ro \
  agent-image

# Exclude sensitive files before mounting
rsync -av --exclude='.env*' \
--exclude='.git-credentials' \
--exclude='*.pem' \
--exclude='*.key' \
/project/ /safe-project/
docker run -v /safe-project:/workspace:ro agent-image

Record changes in a separate layer so they can be reviewed before they are applied:
# Use OverlayFS
mkdir -p /overlay/{upper,work,merged}
mount -t overlay overlay \
-o lowerdir=/project,upperdir=/overlay/upper,workdir=/overlay/work \
/overlay/merged
docker run -v /overlay/merged:/workspace agent-image
# Review changes
ls /overlay/upper
# Apply only approved changes
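# Remove anything unapproved from /overlay/upper first; this copies whatever remains.
# The trailing /. includes dotfiles, which a * glob would skip.
# Files the agent deleted appear in upper/ as whiteout entries and need separate handling.
cp -r /overlay/upper/. /project/

{
  "permissions": {
    "deny": [
      "Read(.env*)",
      "Read(.git-credentials)",
      "Read(**/*.pem)",
      "Read(**/*.key)",
      "Read(**/*secret*)",
      "Read(**/*password*)",
      "Read(**/*credential*)",
      "Read(~/.ssh/**)",
      "Read(~/.aws/**)",
      "Write(.env*)",
      "Edit(.env*)"
    ]
  }
}

import json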
import logging
from datetime import datetime, timezone

from claude_agent_sdk import ClaudeAgentOptions, HookMatcher

async def audit_hook(input_data: dict, tool_use_id: str, context) -> dict:
    """Audit logging for all tool usage"""
    log_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # session_id is read from the hook input here; adjust if your SDK version exposes it elsewhere
        "session_id": input_data.get("session_id"),
        "tool": input_data.get("tool_name"),
        "input": input_data.get("tool_input"),
        "tool_use_id": tool_use_id,
    }
    # Structured logging
    logging.info(json.dumps(log_entry))
    # Send to external system (optional); send_to_siem is a placeholder for your own integration
    await send_to_siem(log_entry)
    return {}

options = ClaudeAgentOptions(
    hooks={
        "PostToolUse": [HookMatcher(hooks=[audit_hook])]
    }
)

# Detect abnormal patterns
# get_tool_count_last_minute, THRESHOLD, alert, and detect_suspicious_pattern
# are placeholders to be supplied by your monitoring code.
async def anomaly_detector(input_data: dict, tool_use_id: str, context) -> dict:
    tool = input_data.get("tool_name")
    # Frequency-based detection
    if await get_tool_count_last_minute(tool) > THRESHOLD:
        await alert(f"High frequency {tool} usage detected")
    # Pattern-based detection
    if tool == "Bash":
        command = input_data.get("tool_input", {}).get("command", "")
        if detect_suspicious_pattern(command):
            await alert(f"Suspicious command: {command}")
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": "Suspicious pattern detected"
                }
            }
    return {}

total_cost = 0
MAX_BUDGET = 10.0  # $10 limit

async for message in query(prompt="Task", options=options):
    if isinstance(message, ResultMessage) and message.subtype == "success":
        # total_cost_usd may be missing on some results; treat that as zero
        total_cost += message.total_cost_usd or 0.0
        if total_cost > MAX_BUDGET * 0.8:
            logging.warning(f"Budget 80% used: ${total_cost:.2f}")
        if total_cost > MAX_BUDGET:
            raise RuntimeError(f"Budget exceeded: ${total_cost:.2f}")

- allow rules include only necessary tools
- deny rules protect sensitive files
- ask rules confirm dangerous actions
- bypassPermissions not used in production
- No secrets directly exposed to agent
- Proxy pattern or MCP server used
- No sensitive info in environment variables
- No credentials in configuration files
- Execution environment isolated via container/VM
- Network access restricted (only necessary domains)
- Filesystem access restricted (only necessary paths)
- Resource limits (memory, CPU)
- PreToolUse hooks block dangerous commands
- Input validation logic implemented
- Hook script permissions restricted (700)
- All tool usage logged
- Anomaly detection alerts configured
- Cost/usage monitoring
- Regular audit log review
| Item | Frequency | Responsible |
|---|---|---|
| Permission rule review | Monthly | Security team |
| Audit log analysis | Weekly | Operations team |
| Vulnerability scanning | Monthly | Security team |
| Dependency updates | Weekly | Development team |
| Access permission review | Quarterly | Security team |
- Immediate Isolation: Stop agent execution
- Evidence Collection: Preserve audit logs
- Impact Analysis: Identify accessed resources
- Remediation: Rotate leaked credentials
- Post-Mortem: Identify cause and prevent recurrence
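
For the impact-analysis step, the audit log can be filtered by session to list what the agent touched. The sketch below assumes the log is stored as one JSON object per line with the `session_id`, `tool`, and `input` fields written by the audit hook in this guide; `summarize_session` is a hypothetical helper.

```python
import json
from collections import defaultdict


def summarize_session(log_lines, session_id: str) -> dict:
    """Group the tool inputs recorded for one session by tool name."""
    accessed = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        if entry.get("session_id") == session_id:
            accessed[entry.get("tool")].append(entry.get("input"))
    return dict(accessed)
```

Passing an open file object as `log_lines` works because files iterate line by line; the result shows which files were read and which commands ran in the affected session, which in turn indicates which credentials to rotate.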
# Backup audit logs
cp /var/log/claude-audit.jsonl /backup/audit-$(date +%Y%m%d).jsonl
# Compress and long-term storage
gzip /backup/audit-*.jsonl
aws s3 cp /backup/ s3://security-logs/claude/ --recursive