claude-central-gateway/docs/api-reference.md
tiennm99 170cdb1324 docs: Add comprehensive documentation suite
- Project overview, system architecture, code standards
- API reference with 15+ examples
- Quick start guide with troubleshooting
- Updated README with feature highlights and compatibility matrix
2026-04-05 11:47:18 +07:00


API Reference

Overview

Claude Central Gateway implements the Anthropic Messages API, making it a drop-in replacement for the official Anthropic API. All endpoints and request/response formats match the Anthropic API specification.

Endpoints

POST /v1/messages

Create a message and get a response from the model.

Authentication

All requests to /v1/messages require authentication via the x-api-key header:

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{...}'

Alternatively, use the Authorization: Bearer header:

curl -X POST https://gateway.example.com/v1/messages \
  -H "Authorization: Bearer my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{...}'
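
In client code, either header form can be wrapped in a small helper. A minimal TypeScript sketch (authHeaders is an illustrative name, not part of the gateway):

```typescript
// Build request headers for the gateway. Both header forms shown above
// are accepted; `authHeaders` is a hypothetical helper, not gateway code.
function authHeaders(
  token: string,
  scheme: "x-api-key" | "bearer" = "x-api-key",
): Record<string, string> {
  const auth =
    scheme === "bearer"
      ? { Authorization: `Bearer ${token}` }
      : { "x-api-key": token };
  return { "Content-Type": "application/json", ...auth };
}
```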

Request Body

{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hello, how are you?"
        }
      ]
    }
  ],
  "max_tokens": 1024,
  "stream": false,
  "temperature": 0.7,
  "top_p": 1.0,
  "stop_sequences": null,
  "system": "You are a helpful assistant.",
  "tools": null,
  "tool_choice": null
}

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier (e.g., claude-sonnet-4-20250514). The gateway maps it to an OpenAI model via the MODEL_MAP env var. |
| messages | array | Yes | Array of message objects with the conversation history. |
| max_tokens | integer | Yes | Maximum tokens to generate (1-4096 typical). |
| stream | boolean | No | If true, stream the response as Server-Sent Events. Default: false. |
| temperature | number | No | Sampling temperature (0.0-1.0). Higher = more random. Default: 1.0. |
| top_p | number | No | Nucleus sampling parameter (0.0-1.0). Default: 1.0. |
| stop_sequences | array | No | Array of strings; generation stops when any is encountered. Max 5 sequences. |
| system | string or array | No | System prompt. String or array of text blocks. |
| tools | array | No | Array of tool definitions the model can call. |
| tool_choice | object | No | Constraints on which tool to use. |

Message Object
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What is 2 + 2?"
    },
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/jpeg",
        "data": "base64-encoded-image-data"
      }
    },
    {
      "type": "tool_result",
      "tool_use_id": "tool_call_123",
      "content": "Result from tool execution",
      "is_error": false
    }
  ]
}
Message Content Types

text

{
  "type": "text",
  "text": "String content"
}

image (user messages only)

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "base64-encoded-image"
  }
}

Or from URL:

{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/image.jpg"
  }
}

tool_use (assistant messages only, in responses)

{
  "type": "tool_use",
  "id": "call_123",
  "name": "search",
  "input": {
    "query": "capital of France"
  }
}

tool_result (user messages only, after tool_use)

{
  "type": "tool_result",
  "tool_use_id": "call_123",
  "content": "The capital of France is Paris.",
  "is_error": false
}
Tool Definition
{
  "name": "search",
  "description": "Search the web for information",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query"
      }
    },
    "required": ["query"]
  }
}
Tool Choice

Control which tool the model uses.

Auto (default):

{
  "type": "auto"
}

Model must use a tool:

{
  "type": "any"
}

Model cannot use tools:

{
  "type": "none"
}

Model must use specific tool:

{
  "type": "tool",
  "name": "search"
}

Response (Non-Streaming)

{
  "id": "msg_1234567890abcdef",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "2 + 2 = 4"
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 5
  }
}

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| id | string | Unique message identifier. |
| type | string | Always "message". |
| role | string | Always "assistant". |
| content | array | Array of content blocks (text or tool_use). |
| model | string | Model identifier that processed the request. |
| stop_reason | string | Reason generation stopped (see Stop Reasons). |
| usage | object | Token usage: input_tokens, output_tokens. |

Response (Streaming)

Responses are streamed as Server-Sent Events when stream: true:

event: message_start
data: {"type":"message_start","message":{"id":"msg_...","type":"message","role":"assistant","content":[],"model":"claude-sonnet-4-20250514","stop_reason":null,"usage":{"input_tokens":0,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" How"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" are"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":5}}

event: message_stop
data: {"type":"message_stop"}
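
A client consuming this stream splits frames on blank lines and JSON-decodes each data: payload. A minimal TypeScript sketch, assuming the event shapes shown above (parseSSE and collectText are illustrative names, not gateway or SDK APIs):

```typescript
// Minimal parser for the SSE frames above: each frame is an `event:` line
// plus a `data:` line with a JSON payload, separated by blank lines.
interface SSEEvent {
  event: string;
  data: Record<string, unknown>;
}

function parseSSE(buffer: string): SSEEvent[] {
  const events: SSEEvent[] = [];
  for (const frame of buffer.split("\n\n")) {
    const eventLine = frame.match(/^event: (.+)$/m);
    const dataLine = frame.match(/^data: (.+)$/m);
    if (eventLine && dataLine) {
      events.push({ event: eventLine[1], data: JSON.parse(dataLine[1]) });
    }
  }
  return events;
}

// Accumulate text_delta fragments into the final message text.
function collectText(events: SSEEvent[]): string {
  return events
    .filter((e) => e.event === "content_block_delta")
    .map((e) => (e.data.delta as { text?: string }).text ?? "")
    .join("");
}
```

A production client would buffer partial frames across network chunks; this sketch assumes the whole stream is in memory.
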
Stream Event Types

message_start: First event; contains the message envelope.

content_block_start: A new content block begins (text or tool_use).

  • index: Position in the content array.
  • content_block: Block metadata.

content_block_delta: Incremental update to the current block.

  • Text blocks: delta.type: "text_delta", delta.text: string
  • Tool blocks: delta.type: "input_json_delta", delta.partial_json: string

content_block_stop: The current block is complete.

message_delta: Final message metadata.

  • delta.stop_reason: Reason generation stopped.
  • usage.output_tokens: Total output tokens.

message_stop: The stream has ended.

Stop Reasons

| Stop Reason | Meaning |
|---|---|
| end_turn | Model completed generation naturally. |
| max_tokens | Hit the max_tokens limit. |
| stop_sequence | Generation hit a user-specified stop sequence. |
| tool_use | Model selected a tool to call. |

Error Responses

401 Unauthorized (invalid token)

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Unauthorized"
  }
}

400 Bad Request (malformed request)

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Bad Request"
  }
}

500 Internal Server Error (server misconfiguration or API error)

{
  "type": "error",
  "error": {
    "type": "api_error",
    "message": "Internal server error"
  }
}
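
Since all three error shapes share the {"type": "error"} envelope, a client can branch on error.type. A hedged sketch of a retry predicate (only server-side api_error responses are worth retrying; auth and validation errors fail the same way every time):

```typescript
// Shape of the documented error envelope.
interface GatewayError {
  type: "error";
  error: { type: string; message: string };
}

// Decide whether a failed request is worth retrying. Illustrative
// client-side logic, not part of the gateway.
function isRetryable(status: number, body: GatewayError): boolean {
  return status >= 500 && body.error.type === "api_error";
}
```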

Health Check Endpoint

GET /

Returns gateway status (no authentication required).

curl https://gateway.example.com/

Response:

{
  "status": "ok",
  "name": "Claude Central Gateway"
}

Configuration

Gateway behavior is controlled via environment variables:

| Variable | Required | Description | Example |
|---|---|---|---|
| GATEWAY_TOKEN | Yes | Shared token clients use to authenticate with the gateway. | sk-gatewaytoken123... |
| OPENAI_API_KEY | Yes | API key the gateway uses to authenticate with the upstream OpenAI API. | sk-proj-... |
| MODEL_MAP | No | Comma-separated model name mappings. | claude-sonnet-4-20250514:gpt-4o,claude-opus:gpt-4-turbo |
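
The MODEL_MAP value can be parsed into a lookup table. This sketch only illustrates the documented comma-separated name:name format; the gateway's internal parsing may differ:

```typescript
// Parse a MODEL_MAP value like "claude-sonnet-4-20250514:gpt-4o,claude-opus:gpt-4-turbo"
// into a Claude-name -> OpenAI-name map. Illustrative, not gateway code.
function parseModelMap(raw: string): Map<string, string> {
  const map = new Map<string, string>();
  for (const pair of raw.split(",")) {
    const idx = pair.indexOf(":");
    if (idx > 0) {
      map.set(pair.slice(0, idx).trim(), pair.slice(idx + 1).trim());
    }
  }
  return map;
}
```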

Usage Examples

Simple Text Request

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Say hello!"}
    ]
  }'

Streaming Response

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Count to 5"}
    ]
  }' \
  -N

Tool Use Workflow

Request with tools:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "tools": [
    {
      "name": "search",
      "description": "Search the web",
      "input_schema": {
        "type": "object",
        "properties": {
          "query": {"type": "string"}
        },
        "required": ["query"]
      }
    }
  ],
  "messages": [
    {"role": "user", "content": "What is the capital of France?"}
  ]
}

Response with tool_use:

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "call_123",
      "name": "search",
      "input": {"query": "capital of France"}
    }
  ],
  "stop_reason": "tool_use",
  "usage": {"input_tokens": 50, "output_tokens": 25}
}

Follow-up request with tool result:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "id": "call_123",
          "name": "search",
          "input": {"query": "capital of France"}
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "call_123",
          "content": "Paris is the capital of France"
        }
      ]
    }
  ]
}

Final response:

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Paris is the capital of France."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 100, "output_tokens": 15}
}
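
The round-trip above generalizes into a loop: call the model, and while stop_reason is tool_use, execute each requested tool and feed back tool_result blocks. A sketch with the model call and tool executor injected so the control flow is visible without network code (runToolLoop, callModel, and runTool are illustrative names, not gateway APIs):

```typescript
interface ContentBlock {
  type: string;
  [key: string]: unknown;
}
interface Message {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}
interface ModelResponse {
  content: ContentBlock[];
  stop_reason: string;
}

// Drive the tool-use round trip until the model stops asking for tools.
async function runToolLoop(
  messages: Message[],
  callModel: (messages: Message[]) => Promise<ModelResponse>,
  runTool: (name: string, input: unknown) => Promise<string>,
): Promise<ModelResponse> {
  for (;;) {
    const response = await callModel(messages);
    if (response.stop_reason !== "tool_use") return response;
    // Echo the assistant's tool_use turn, then answer each call with a
    // tool_result block keyed by tool_use_id.
    messages.push({ role: "assistant", content: response.content });
    const results: ContentBlock[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        results.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: await runTool(block.name as string, block.input),
        });
      }
    }
    messages.push({ role: "user", content: results });
  }
}
```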

Image Request

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
          }
        },
        {
          "type": "text",
          "text": "Describe this image"
        }
      ]
    }
  ]
}
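
The data field holds raw base64 with no data: URL prefix. A small helper for building the block from bytes (imageBlock is an illustrative name, not a gateway or SDK API):

```typescript
// Build a base64 image content block, as in the request above.
function imageBlock(
  data: Uint8Array,
  mediaType: "image/jpeg" | "image/png" | "image/gif" | "image/webp",
) {
  return {
    type: "image" as const,
    source: {
      type: "base64" as const,
      media_type: mediaType,
      data: Buffer.from(data).toString("base64"),
    },
  };
}
```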

Using the Anthropic SDK

Set environment variables:

export ANTHROPIC_BASE_URL=https://gateway.example.com
export ANTHROPIC_AUTH_TOKEN=my-secret-token

Then use normally:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: process.env.ANTHROPIC_BASE_URL,
  apiKey: process.env.ANTHROPIC_AUTH_TOKEN,
});

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 256,
  messages: [
    { role: "user", content: "Say hello!" }
  ],
});

console.log(message.content[0].text);

Limitations & Compatibility

Fully Supported

  • Text messages
  • Image content (base64 and URLs)
  • Tool definitions and tool use/tool result round-trips
  • System messages (string or array)
  • Streaming responses with proper SSE format
  • Stop sequences
  • Temperature, top_p, max_tokens
  • Usage token counts

Unsupported (Filtered Out)

  • Thinking blocks (Claude 3.7+)
  • Cache control directives
  • Multi-modal tool inputs (tools receive text input only)
  • Vision-specific model parameters

Behavioral Differences from Anthropic API

  • Single shared token (no per-user auth)
  • No rate limiting (implement on your end if needed)
  • No request logging/audit trail
  • Error messages may differ (OpenAI error format converted)
  • Latency slightly higher due to proxying

Rate Limiting Notes

The gateway itself imposes no rate limits. Effective limits come from:

  1. OpenAI API quota: rate and usage limits for your API tier
  2. Network throughput: Hono/platform request limits
  3. Token costs: usage is billed per OpenAI pricing

Recommendations:

  • Implement client-side rate limiting
  • Monitor token usage via usage field in responses
  • Set low max_tokens limits if cost is a concern
  • Use smaller models in MODEL_MAP for cost reduction
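
The first recommendation can be as simple as a token bucket in front of the client. A sketch with an injected clock for determinism (illustrative only, not part of the gateway):

```typescript
// Client-side token bucket: `capacity` burst size, `refillPerSec` sustained
// request rate. The clock is injected so tests can control time.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    private now: () => number = () => Date.now() / 1000,
  ) {
    this.tokens = capacity;
    this.last = this.now();
  }

  // Returns true if a request may be sent right now.
  tryAcquire(): boolean {
    const t = this.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (t - this.last) * this.refillPerSec,
    );
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Callers check tryAcquire() before each request and back off when it returns false.
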