claude-central-gateway/docs/api-reference.md
tiennm99 170cdb1324 docs: Add comprehensive documentation suite
- Project overview, system architecture, code standards
- API reference with 15+ examples
- Quick start guide with troubleshooting
- Updated README with feature highlights and compatibility matrix
2026-04-05 11:47:18 +07:00


API Reference

Overview

Claude Central Gateway implements the Anthropic Messages API, making it a drop-in replacement for the official Anthropic API. All endpoints and request/response formats match the Anthropic API specification.

Endpoints

POST /v1/messages

Create a message and get a response from the model.

Authentication

All requests to /v1/messages require authentication via the x-api-key header:

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{...}'

Alternatively, use the Authorization: Bearer header:

curl -X POST https://gateway.example.com/v1/messages \
  -H "Authorization: Bearer my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{...}'
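
In client code, either header form can be wrapped in a small helper. A minimal TypeScript sketch (authHeaders is an illustrative name, not part of the gateway):

```typescript
// Build request headers for the gateway. Both header forms shown above
// are accepted; `authHeaders` is a hypothetical helper, not gateway code.
function authHeaders(
  token: string,
  scheme: "x-api-key" | "bearer" = "x-api-key",
): Record<string, string> {
  const auth =
    scheme === "bearer"
      ? { Authorization: `Bearer ${token}` }
      : { "x-api-key": token };
  return { "Content-Type": "application/json", ...auth };
}
```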

Request Body

{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hello, how are you?"
        }
      ]
    }
  ],
  "max_tokens": 1024,
  "stream": false,
  "temperature": 0.7,
  "top_p": 1.0,
  "stop_sequences": null,
  "system": "You are a helpful assistant.",
  "tools": null,
  "tool_choice": null
}

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier (e.g., claude-sonnet-4-20250514). The gateway maps it to an OpenAI model via the MODEL_MAP env var. |
| messages | array | Yes | Array of message objects with the conversation history. |
| max_tokens | integer | Yes | Maximum tokens to generate (1-4096 typical). |
| stream | boolean | No | If true, stream the response as Server-Sent Events. Default: false. |
| temperature | number | No | Sampling temperature (0.0-1.0). Higher = more random. Default: 1.0. |
| top_p | number | No | Nucleus sampling parameter (0.0-1.0). Default: 1.0. |
| stop_sequences | array | No | Array of strings; generation stops when any is encountered. Max 5 sequences. |
| system | string or array | No | System prompt. String or array of text blocks. |
| tools | array | No | Array of tool definitions the model can call. |
| tool_choice | object | No | Constraints on which tool to use. |

Message Object
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What is 2 + 2?"
    },
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/jpeg",
        "data": "base64-encoded-image-data"
      }
    },
    {
      "type": "tool_result",
      "tool_use_id": "tool_call_123",
      "content": "Result from tool execution",
      "is_error": false
    }
  ]
}
Message Content Types

text

{
  "type": "text",
  "text": "String content"
}

image (user messages only)

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "base64-encoded-image"
  }
}

Or from URL:

{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/image.jpg"
  }
}

tool_use (assistant messages only, in responses)

{
  "type": "tool_use",
  "id": "call_123",
  "name": "search",
  "input": {
    "query": "capital of France"
  }
}

tool_result (user messages only, after tool_use)

{
  "type": "tool_result",
  "tool_use_id": "call_123",
  "content": "The capital of France is Paris.",
  "is_error": false
}
Tool Definition
{
  "name": "search",
  "description": "Search the web for information",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query"
      }
    },
    "required": ["query"]
  }
}
Tool Choice

Control which tool the model uses.

Auto (default):

{
  "type": "auto"
}

Model must use a tool:

{
  "type": "any"
}

Model cannot use tools:

{
  "type": "none"
}

Model must use specific tool:

{
  "type": "tool",
  "name": "search"
}

Response (Non-Streaming)

{
  "id": "msg_1234567890abcdef",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "2 + 2 = 4"
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 5
  }
}

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| id | string | Unique message identifier. |
| type | string | Always "message". |
| role | string | Always "assistant". |
| content | array | Array of content blocks (text or tool_use). |
| model | string | Model identifier that processed the request. |
| stop_reason | string | Reason generation stopped (see Stop Reasons). |
| usage | object | Token usage: input_tokens, output_tokens. |

Response (Streaming)

Responses are streamed as Server-Sent Events when stream: true:

event: message_start
data: {"type":"message_start","message":{"id":"msg_...","type":"message","role":"assistant","content":[],"model":"claude-sonnet-4-20250514","stop_reason":null,"usage":{"input_tokens":0,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" How"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" are"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":5}}

event: message_stop
data: {"type":"message_stop"}
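
A client consuming this stream splits frames on blank lines and JSON-decodes each data: payload. A minimal TypeScript sketch, assuming the event shapes shown above (parseSSE and collectText are illustrative names, not gateway or SDK APIs):

```typescript
// Minimal parser for the SSE frames above: each frame is an `event:` line
// plus a `data:` line with a JSON payload, separated by blank lines.
interface SSEEvent {
  event: string;
  data: Record<string, unknown>;
}

function parseSSE(buffer: string): SSEEvent[] {
  const events: SSEEvent[] = [];
  for (const frame of buffer.split("\n\n")) {
    const eventLine = frame.match(/^event: (.+)$/m);
    const dataLine = frame.match(/^data: (.+)$/m);
    if (eventLine && dataLine) {
      events.push({ event: eventLine[1], data: JSON.parse(dataLine[1]) });
    }
  }
  return events;
}

// Accumulate text_delta fragments into the final message text.
function collectText(events: SSEEvent[]): string {
  return events
    .filter((e) => e.event === "content_block_delta")
    .map((e) => (e.data.delta as { text?: string }).text ?? "")
    .join("");
}
```

A production client would buffer partial frames across network chunks; this sketch assumes the whole stream is in memory.
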
Stream Event Types

message_start: First event; contains the message envelope.

content_block_start: A new content block begins (text or tool_use).

  • index: Position in the content array.
  • content_block: Block metadata.

content_block_delta: Incremental update to the current block.

  • Text blocks: delta.type: "text_delta", delta.text: string
  • Tool blocks: delta.type: "input_json_delta", delta.partial_json: string

content_block_stop: The current block is complete.

message_delta: Final message metadata.

  • delta.stop_reason: Reason generation stopped.
  • usage.output_tokens: Total output tokens.

message_stop: The stream has ended.

Stop Reasons

| Stop Reason | Meaning |
|---|---|
| end_turn | Model completed generation naturally. |
| max_tokens | Hit the max_tokens limit. |
| stop_sequence | Generation hit a user-specified stop sequence. |
| tool_use | Model selected a tool to call. |

Error Responses

401 Unauthorized (invalid token)

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Unauthorized"
  }
}

400 Bad Request (malformed request)

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Bad Request"
  }
}

500 Internal Server Error (server misconfiguration or API error)

{
  "type": "error",
  "error": {
    "type": "api_error",
    "message": "Internal server error"
  }
}
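
Since all three error shapes share the {"type": "error"} envelope, a client can branch on error.type. A hedged sketch of a retry predicate (only server-side api_error responses are worth retrying; auth and validation errors fail the same way every time):

```typescript
// Shape of the documented error envelope.
interface GatewayError {
  type: "error";
  error: { type: string; message: string };
}

// Decide whether a failed request is worth retrying. Illustrative
// client-side logic, not part of the gateway.
function isRetryable(status: number, body: GatewayError): boolean {
  return status >= 500 && body.error.type === "api_error";
}
```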

Health Check Endpoint

GET /

Returns gateway status (no authentication required).

curl https://gateway.example.com/

Response:

{
  "status": "ok",
  "name": "Claude Central Gateway"
}

Configuration

Gateway behavior is controlled via environment variables:

| Variable | Required | Description | Example |
|---|---|---|---|
| GATEWAY_TOKEN | Yes | Shared token clients use to authenticate with the gateway. | sk-gatewaytoken123... |
| OPENAI_API_KEY | Yes | API key the gateway uses to authenticate with the upstream OpenAI API. | sk-proj-... |
| MODEL_MAP | No | Comma-separated model name mappings. | claude-sonnet-4-20250514:gpt-4o,claude-opus:gpt-4-turbo |
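
The MODEL_MAP value can be parsed into a lookup table. This sketch only illustrates the documented comma-separated name:name format; the gateway's internal parsing may differ:

```typescript
// Parse a MODEL_MAP value like "claude-sonnet-4-20250514:gpt-4o,claude-opus:gpt-4-turbo"
// into a Claude-name -> OpenAI-name map. Illustrative, not gateway code.
function parseModelMap(raw: string): Map<string, string> {
  const map = new Map<string, string>();
  for (const pair of raw.split(",")) {
    const idx = pair.indexOf(":");
    if (idx > 0) {
      map.set(pair.slice(0, idx).trim(), pair.slice(idx + 1).trim());
    }
  }
  return map;
}
```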

Usage Examples

Simple Text Request

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Say hello!"}
    ]
  }'

Streaming Response

curl -X POST https://gateway.example.com/v1/messages \
  -H "x-api-key: my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Count to 5"}
    ]
  }' \
  -N

Tool Use Workflow

Request with tools:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "tools": [
    {
      "name": "search",
      "description": "Search the web",
      "input_schema": {
        "type": "object",
        "properties": {
          "query": {"type": "string"}
        },
        "required": ["query"]
      }
    }
  ],
  "messages": [
    {"role": "user", "content": "What is the capital of France?"}
  ]
}

Response with tool_use:

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "call_123",
      "name": "search",
      "input": {"query": "capital of France"}
    }
  ],
  "stop_reason": "tool_use",
  "usage": {"input_tokens": 50, "output_tokens": 25}
}

Follow-up request with tool result:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "id": "call_123",
          "name": "search",
          "input": {"query": "capital of France"}
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "call_123",
          "content": "Paris is the capital of France"
        }
      ]
    }
  ]
}

Final response:

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Paris is the capital of France."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 100, "output_tokens": 15}
}
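
The round-trip above generalizes into a loop: call the model, and while stop_reason is tool_use, execute each requested tool and feed back tool_result blocks. A sketch with the model call and tool executor injected so the control flow is visible without network code (runToolLoop, callModel, and runTool are illustrative names, not gateway APIs):

```typescript
interface ContentBlock {
  type: string;
  [key: string]: unknown;
}
interface Message {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}
interface ModelResponse {
  content: ContentBlock[];
  stop_reason: string;
}

// Drive the tool-use round trip until the model stops asking for tools.
async function runToolLoop(
  messages: Message[],
  callModel: (messages: Message[]) => Promise<ModelResponse>,
  runTool: (name: string, input: unknown) => Promise<string>,
): Promise<ModelResponse> {
  for (;;) {
    const response = await callModel(messages);
    if (response.stop_reason !== "tool_use") return response;
    // Echo the assistant's tool_use turn, then answer each call with a
    // tool_result block keyed by tool_use_id.
    messages.push({ role: "assistant", content: response.content });
    const results: ContentBlock[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        results.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: await runTool(block.name as string, block.input),
        });
      }
    }
    messages.push({ role: "user", content: results });
  }
}
```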

Image Request

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 256,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
          }
        },
        {
          "type": "text",
          "text": "Describe this image"
        }
      ]
    }
  ]
}
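
The data field holds raw base64 with no data: URL prefix. A small helper for building the block from bytes (imageBlock is an illustrative name, not a gateway or SDK API):

```typescript
// Build a base64 image content block, as in the request above.
function imageBlock(
  data: Uint8Array,
  mediaType: "image/jpeg" | "image/png" | "image/gif" | "image/webp",
) {
  return {
    type: "image" as const,
    source: {
      type: "base64" as const,
      media_type: mediaType,
      data: Buffer.from(data).toString("base64"),
    },
  };
}
```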

Using the Anthropic SDK

Set environment variables:

export ANTHROPIC_BASE_URL=https://gateway.example.com
export ANTHROPIC_AUTH_TOKEN=my-secret-token

Then use normally:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: process.env.ANTHROPIC_BASE_URL,
  apiKey: process.env.ANTHROPIC_AUTH_TOKEN,
});

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 256,
  messages: [
    { role: "user", content: "Say hello!" }
  ],
});

console.log(message.content[0].text);

Limitations & Compatibility

Fully Supported

  • Text messages
  • Image content (base64 and URLs)
  • Tool definitions and tool use/tool result round-trips
  • System messages (string or array)
  • Streaming responses with proper SSE format
  • Stop sequences
  • Temperature, top_p, max_tokens
  • Usage token counts

Unsupported (Filtered Out)

  • Thinking blocks (Claude 3.7+)
  • Cache control directives
  • Multi-modal tool inputs (tools receive text input only)
  • Vision-specific model parameters

Behavioral Differences from Anthropic API

  • Single shared token (no per-user auth)
  • No rate limiting (implement on your end if needed)
  • No request logging/audit trail
  • Error messages may differ (OpenAI error format converted)
  • Latency slightly higher due to proxying

Rate Limiting Notes

The gateway itself imposes no rate limits. Effective limits come from:

  1. OpenAI API quota: rate and usage limits for your API tier
  2. Network throughput: Hono/platform request limits
  3. Token costs: usage is billed per OpenAI pricing

Recommendations:

  • Implement client-side rate limiting
  • Monitor token usage via usage field in responses
  • Set low max_tokens limits if cost is a concern
  • Use smaller models in MODEL_MAP for cost reduction
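
The first recommendation can be as simple as a token bucket in front of the client. A sketch with an injected clock for determinism (illustrative only, not part of the gateway):

```typescript
// Client-side token bucket: `capacity` burst size, `refillPerSec` sustained
// request rate. The clock is injected so tests can control time.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    private now: () => number = () => Date.now() / 1000,
  ) {
    this.tokens = capacity;
    this.last = this.now();
  }

  // Returns true if a request may be sent right now.
  tryAcquire(): boolean {
    const t = this.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (t - this.last) * this.refillPerSec,
    );
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Callers check tryAcquire() before each request and back off when it returns false.
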