
MCP error handling

The complete error response structure

// Every MCP tool failure must return this structure
interface MCPErrorResponse {
  isError: true;
  errorCategory: 'transient' | 'validation' | 'business' | 'permission';
  isRetryable: boolean;
  message: string;        // Specific human-readable description
  retryAfterMs?: number;  // Optional: hint for retry backoff
}
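For tools written in Python, the same shape could be sketched as a `TypedDict` with a small constructor. The helper name `make_error` and the rule that only `transient` errors are retryable (taken from the category reference in this document) are illustrative assumptions, not part of any MCP SDK.

```python
from typing import Literal, Optional, TypedDict

ErrorCategory = Literal["transient", "validation", "business", "permission"]


class _RequiredErrorFields(TypedDict):
    isError: Literal[True]
    errorCategory: ErrorCategory
    isRetryable: bool
    message: str


class MCPErrorResponse(_RequiredErrorFields, total=False):
    retryAfterMs: int  # optional backoff hint in milliseconds


def make_error(
    category: ErrorCategory,
    message: str,
    retry_after_ms: Optional[int] = None,
) -> MCPErrorResponse:
    """Build an error payload (hypothetical helper, not an SDK function)."""
    response: MCPErrorResponse = {
        "isError": True,
        "errorCategory": category,
        # Assumption: only transient failures are worth retrying.
        "isRetryable": category == "transient",
        "message": message,
    }
    if retry_after_ms is not None:
        response["retryAfterMs"] = retry_after_ms
    return response
```

Building every failure through one constructor keeps `errorCategory` and `isRetryable` from drifting out of sync across tools.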

Error category reference

| Category | Meaning | isRetryable | Agent action |
| --- | --- | --- | --- |
| transient | Temporary failure: DB timeout, service down | true | Retry with exponential backoff |
| validation | Bad input: wrong format, missing required field | false | Fix input before retrying |
| business | Policy violation: limit exceeded, not allowed | false | Escalate or communicate to user |
| permission | Access denied: no auth, wrong scope | false | Rotate credentials or escalate |
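The "retry with exponential backoff" action for transient errors could be computed roughly as follows. The function name, the 1-second base, and the 30-second cap are assumptions chosen for illustration; the only field taken from the error structure is `retryAfterMs`.

```python
from typing import Optional


def backoff_delay_ms(
    attempt: int,
    base_ms: int = 1000,
    cap_ms: int = 30_000,
    retry_after_ms: Optional[int] = None,
) -> int:
    """Delay before retry number `attempt` (0-based), in milliseconds."""
    # Double the delay on each attempt, but never exceed the cap.
    delay = min(cap_ms, base_ms * 2 ** attempt)
    # If the tool supplied a hint, never retry sooner than it asked.
    if retry_after_ms is not None:
        delay = max(delay, retry_after_ms)
    return delay
```

With the defaults, attempts 0 through 3 wait 1000, 2000, 4000, and 8000 ms. A production agent would probably add random jitter so that many clients do not retry in lockstep.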

The critical empty-result distinction

// ✅ Successful query, no records found
// isError: false. This is NOT an error.
// The agent should accept this and continue
{
  "isError": false,
  "content": [],
  "message": "No orders found for customer CUS-48291 in the last 90 days"
}

// ❌ Tool execution failed
// isError: true. The agent must handle the failure.
{
  "isError": true,
  "errorCategory": "transient",
  "isRetryable": true,
  "message": "Database connection timeout after 30 seconds"
}

Why this matters: An agent that treats empty results as errors will keep retrying a query that has already succeeded. An agent that treats errors as empty results will conclude that no data exists when the lookup never completed, and will silently miss data.
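One way an agent might keep the two cases apart is to check `isError` before it ever looks at `content`. The function name and the three labels below are hypothetical.

```python
def classify_result(result: dict) -> str:
    """Return 'error', 'empty', or 'data' for a tool result."""
    if result.get("isError"):
        return "error"  # the tool did not run to completion
    if not result.get("content"):
        return "empty"  # the tool ran; there is simply nothing to return
    return "data"
```

Only the `"error"` label should ever reach retry or escalation logic; `"empty"` is a valid answer the agent can report to the user.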

Implementing the error structure

# MCP tool implementation
# Assumes `db`, `DatabaseTimeout`, and `PermissionDenied` come from the application's data layer.
def get_customer(customer_id: str) -> dict:
    try:
        customer = db.query("SELECT * FROM customers WHERE id = ?", customer_id)

        if customer is None:
            # Not an error: valid query, no matching record
            return {
                "isError": False,
                "content": [],
                "message": f"No customer found with ID {customer_id}"
            }

        return {"isError": False, "content": [customer]}

    except DatabaseTimeout:
        return {
            "isError": True,
            "errorCategory": "transient",
            "isRetryable": True,
            "message": "Database connection timeout. Retry is appropriate.",
            "retryAfterMs": 2000
        }

    except PermissionDenied:
        return {
            "isError": True,
            "errorCategory": "permission",
            "isRetryable": False,
            "message": "Agent does not have permission to access customer records"
        }
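When many tools share the same failure modes, the try/except pattern above could be factored into a decorator. This is a sketch under stated assumptions: the decorator name is hypothetical, and Python's built-in `TimeoutError`, `PermissionError`, and `ValueError` stand in for whatever exception types the real data layer raises.

```python
import functools
from typing import Callable

# Assumption: built-in exceptions stand in for the data layer's own types.
EXCEPTION_CATEGORIES = [
    (TimeoutError, "transient"),
    (PermissionError, "permission"),
    (ValueError, "validation"),
]


def with_structured_errors(tool: Callable[..., dict]) -> Callable[..., dict]:
    """Convert known exceptions raised by `tool` into structured error results."""

    @functools.wraps(tool)
    def wrapper(*args, **kwargs) -> dict:
        try:
            return tool(*args, **kwargs)
        except Exception as exc:
            for exc_type, category in EXCEPTION_CATEGORIES:
                if isinstance(exc, exc_type):
                    return {
                        "isError": True,
                        "errorCategory": category,
                        "isRetryable": category == "transient",
                        "message": str(exc) or exc_type.__name__,
                    }
            raise  # unknown failures stay loud rather than being miscategorized

    return wrapper
```

Re-raising unrecognized exceptions is a deliberate choice here: guessing a category for an unknown failure could tell the agent to retry something that will never succeed.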

Agent-side error handling

MAX_RETRIES = 3  # assumed retry budget; tune per tool

# process_content, schedule_retry, escalate_to_human, alert_operations,
# log_failure, and propagate_to_coordinator are application-defined hooks.
def handle_tool_result(tool_name: str, result: dict, retry_count: int = 0) -> None:
    if not result.get("isError"):
        # Success: content may be empty (that's fine)
        process_content(result.get("content", []))
        return

    category = result.get("errorCategory")
    if result.get("isRetryable") and retry_count < MAX_RETRIES:
        schedule_retry(tool_name, result.get("retryAfterMs", 1000))
    elif category == "business":
        escalate_to_human(reason=result["message"])
    elif category == "permission":
        alert_operations(f"Permission error on {tool_name}: {result['message']}")
    else:
        # Validation errors and exhausted retries end up here
        log_failure(tool_name, result)
        propagate_to_coordinator(result)
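The handler above leaves the retry mechanics to `schedule_retry`. One possible shape for a synchronous retry loop is sketched below; `call_with_retries`, its parameters, and the injectable `sleep` are assumptions made so the example is self-contained and testable.

```python
import time
from typing import Callable


def call_with_retries(
    tool: Callable[[], dict],
    max_retries: int = 3,
    base_delay_ms: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `tool` until it succeeds, fails non-retryably, or retries run out."""
    attempt = 0
    while True:
        result = tool()
        if not result.get("isError"):
            return result  # success, possibly with empty content
        if not result.get("isRetryable") or attempt >= max_retries:
            return result  # caller escalates, alerts, or propagates
        # Honor the tool's hint, but never wait less than the backoff step.
        delay_ms = max(result.get("retryAfterMs", 0), base_delay_ms * 2 ** attempt)
        sleep(delay_ms / 1000)
        attempt += 1
```

The loop returns the last error unchanged when retries are exhausted, so the caller can still branch on `errorCategory` exactly as `handle_tool_result` does.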
