docs: Add comprehensive documentation suite

- Project overview, system architecture, code standards - API reference with 15+ examples - Quick start guide with troubleshooting - Updated README with feature highlights and compatibility matrix
2026-04-17 13:20:56 +00:00 · 2026-04-05 11:47:18 +07:00
parent a1113e02aa
commit 170cdb1324
8 changed files with 1929 additions and 10 deletions
--- a/docs/api-reference.md
+++ b/docs/api-reference.md
@@ -0,0 +1,589 @@
+# API Reference
+
+## Overview
+
+Claude Central Gateway implements the Anthropic Messages API, making it a drop-in replacement for the official Anthropic API. All endpoints and request/response formats match the [Anthropic API specification](https://docs.anthropic.com/en/docs/about/api-overview).
+
+## Endpoints
+
+### POST /v1/messages
+
+Create a message and get a response from the model.
+
+#### Authentication
+
+All requests to `/v1/messages` require authentication via the `x-api-key` header:
+
+```bash
+curl -X POST https://gateway.example.com/v1/messages \
+  -H "x-api-key: my-secret-token" \
+  -H "Content-Type: application/json" \
+  -d '{...}'
+```
+
+Alternatively, use `Authorization: Bearer` header:
+
+```bash
+curl -X POST https://gateway.example.com/v1/messages \
+  -H "Authorization: Bearer my-secret-token" \
+  -H "Content-Type: application/json" \
+  -d '{...}'
+```
+
+#### Request Body
+
+```json
+{
+  "model": "claude-sonnet-4-20250514",
+  "messages": [
+    {
+      "role": "user",
+      "content": [
+        {
+          "type": "text",
+          "text": "Hello, how are you?"
+        }
+      ]
+    }
+  ],
+  "max_tokens": 1024,
+  "stream": false,
+  "temperature": 0.7,
+  "top_p": 1.0,
+  "stop_sequences": null,
+  "system": "You are a helpful assistant.",
+  "tools": null,
+  "tool_choice": null
+}
+```
+
+##### Request Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `model` | string | Yes | Model identifier (e.g., `claude-sonnet-4-20250514`). Gateway maps to OpenAI model via `MODEL_MAP` env var. |
+| `messages` | array | Yes | Array of message objects with conversation history. |
+| `max_tokens` | integer | Yes | Maximum tokens to generate (1-4096 typical). |
+| `stream` | boolean | No | If `true`, stream response as Server-Sent Events. Default: `false`. |
+| `temperature` | number | No | Sampling temperature (0.0-1.0). Higher = more random. Default: `1.0`. |
+| `top_p` | number | No | Nucleus sampling parameter (0.0-1.0). Default: `1.0`. |
+| `stop_sequences` | array | No | Array of strings; generation stops when any is encountered. Max 5 sequences. |
+| `system` | string or array | No | System prompt. String or array of text blocks. |
+| `tools` | array | No | Array of tool definitions the model can call. |
+| `tool_choice` | object | No | Constraints on which tool to use. |
+
+##### Message Object
+
+```json
+{
+  "role": "user",
+  "content": [
+    {
+      "type": "text",
+      "text": "What is 2 + 2?"
+    },
+    {
+      "type": "image",
+      "source": {
+        "type": "base64",
+        "media_type": "image/jpeg",
+        "data": "base64-encoded-image-data"
+      }
+    },
+    {
+      "type": "tool_result",
+      "tool_use_id": "tool_call_123",
+      "content": "Result from tool execution",
+      "is_error": false
+    }
+  ]
+}
+```
+
+###### Message Content Types
+
+**text**
+```json
+{
+  "type": "text",
+  "text": "String content"
+}
+```
+
+**image** (user messages only)
+```json
+{
+  "type": "image",
+  "source": {
+    "type": "base64",
+    "media_type": "image/jpeg",
+    "data": "base64-encoded-image"
+  }
+}
+```
+
+Or from URL:
+```json
+{
+  "type": "image",
+  "source": {
+    "type": "url",
+    "url": "https://example.com/image.jpg"
+  }
+}
+```
+
+**tool_use** (assistant messages only, in responses)
+```json
+{
+  "type": "tool_use",
+  "id": "call_123",
+  "name": "search",
+  "input": {
+    "query": "capital of France"
+  }
+}
+```
+
+**tool_result** (user messages only, after tool_use)
+```json
+{
+  "type": "tool_result",
+  "tool_use_id": "call_123",
+  "content": "The capital of France is Paris.",
+  "is_error": false
+}
+```
+
+##### Tool Definition
+
+```json
+{
+  "name": "search",
+  "description": "Search the web for information",
+  "input_schema": {
+    "type": "object",
+    "properties": {
+      "query": {
+        "type": "string",
+        "description": "Search query"
+      }
+    },
+    "required": ["query"]
+  }
+}
+```
+
+##### Tool Choice
+
+Control which tool the model uses.
+
+Auto (default):
+```json
+{
+  "type": "auto"
+}
+```
+
+Model must use a tool:
+```json
+{
+  "type": "any"
+}
+```
+
+Model cannot use tools:
+```json
+{
+  "type": "none"
+}
+```
+
+Model must use specific tool:
+```json
+{
+  "type": "tool",
+  "name": "search"
+}
+```
+
+#### Response (Non-Streaming)
+
+```json
+{
+  "id": "msg_1234567890abcdef",
+  "type": "message",
+  "role": "assistant",
+  "content": [
+    {
+      "type": "text",
+      "text": "2 + 2 = 4"
+    }
+  ],
+  "model": "claude-sonnet-4-20250514",
+  "stop_reason": "end_turn",
+  "usage": {
+    "input_tokens": 10,
+    "output_tokens": 5
+  }
+}
+```
+
+##### Response Parameters
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `id` | string | Unique message identifier. |
+| `type` | string | Always `"message"`. |
+| `role` | string | Always `"assistant"`. |
+| `content` | array | Array of content blocks (text or tool_use). |
+| `model` | string | Model identifier that processed the request. |
+| `stop_reason` | string | Reason generation stopped (see Stop Reasons). |
+| `usage` | object | Token usage: `input_tokens`, `output_tokens`. |
+
+#### Response (Streaming)
+
+Stream responses as Server-Sent Events when `stream: true`:
+
+```
+event: message_start
+data: {"type":"message_start","message":{"id":"msg_...","type":"message","role":"assistant","content":[],"model":"claude-sonnet-4-20250514","stop_reason":null,"usage":{"input_tokens":0,"output_tokens":0}}}
+
+event: content_block_start
+data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
+
+event: content_block_delta
+data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" How"}}
+
+event: content_block_delta
+data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" are"}}
+
+event: content_block_stop
+data: {"type":"content_block_stop","index":0}
+
+event: message_delta
+data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":5}}
+
+event: message_stop
+data: {"type":"message_stop"}
+```
+
+###### Stream Event Types
+
+**message_start**
+First event, contains message envelope.
+
+**content_block_start**
+New content block begins (text or tool_use).
+- `index`: Position in content array.
+- `content_block`: Block metadata.
+
+**content_block_delta**
+Incremental update to current block.
+- Text blocks: `delta.type: "text_delta"`, `delta.text: string`
+- Tool blocks: `delta.type: "input_json_delta"`, `delta.partial_json: string`
+
+**content_block_stop**
+Current block complete.
+
+**message_delta**
+Final message metadata.
+- `delta.stop_reason`: Reason generation stopped.
+- `usage.output_tokens`: Total output tokens.
+
+**message_stop**
+Stream ended.
+
+#### Stop Reasons
+
+| Stop Reason | Meaning |
+|------------|---------|
+| `end_turn` | Model completed generation naturally. |
+| `max_tokens` | Hit `max_tokens` limit. |
+| `stop_sequence` | Generation hit user-specified `stop_sequences`. |
+| `tool_use` | Model selected a tool to call. |
+
+#### Error Responses
+
+**401 Unauthorized** (invalid token)
+```json
+{
+  "type": "error",
+  "error": {
+    "type": "authentication_error",
+    "message": "Unauthorized"
+  }
+}
+```
+
+**400 Bad Request** (malformed request)
+```json
+{
+  "type": "error",
+  "error": {
+    "type": "invalid_request_error",
+    "message": "Bad Request"
+  }
+}
+```
+
+**500 Internal Server Error** (server misconfiguration or API error)
+```json
+{
+  "type": "error",
+  "error": {
+    "type": "api_error",
+    "message": "Internal server error"
+  }
+}
+```
+
+## Health Check Endpoint
+
+### GET /
+
+Returns gateway status (no authentication required).
+
+```bash
+curl https://gateway.example.com/
+```
+
+Response:
+```json
+{
+  "status": "ok",
+  "name": "Claude Central Gateway"
+}
+```
+
+## Configuration
+
+Gateway behavior controlled via environment variables:
+
+| Variable | Required | Description | Example |
+|----------|----------|-------------|---------|
+| `GATEWAY_TOKEN` | Yes | Shared token for authentication. | `sk-gatewaytoken123...` |
+| `OPENAI_API_KEY` | Yes | OpenAI API key for authentication. | `sk-proj-...` |
+| `MODEL_MAP` | No | Comma-separated model name mappings. | `claude-sonnet-4-20250514:gpt-4o,claude-opus:gpt-4-turbo` |
+
+## Usage Examples
+
+### Simple Text Request
+
+```bash
+curl -X POST https://gateway.example.com/v1/messages \
+  -H "x-api-key: my-secret-token" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "claude-sonnet-4-20250514",
+    "max_tokens": 256,
+    "messages": [
+      {"role": "user", "content": "Say hello!"}
+    ]
+  }'
+```
+
+### Streaming Response
+
+```bash
+curl -X POST https://gateway.example.com/v1/messages \
+  -H "x-api-key: my-secret-token" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "claude-sonnet-4-20250514",
+    "max_tokens": 256,
+    "stream": true,
+    "messages": [
+      {"role": "user", "content": "Count to 5"}
+    ]
+  }' \
+  -N
+```
+
+### Tool Use Workflow
+
+**Request with tools:**
+```json
+{
+  "model": "claude-sonnet-4-20250514",
+  "max_tokens": 256,
+  "tools": [
+    {
+      "name": "search",
+      "description": "Search the web",
+      "input_schema": {
+        "type": "object",
+        "properties": {
+          "query": {"type": "string"}
+        },
+        "required": ["query"]
+      }
+    }
+  ],
+  "messages": [
+    {"role": "user", "content": "What is the capital of France?"}
+  ]
+}
+```
+
+**Response with tool_use:**
+```json
+{
+  "id": "msg_...",
+  "type": "message",
+  "role": "assistant",
+  "content": [
+    {
+      "type": "tool_use",
+      "id": "call_123",
+      "name": "search",
+      "input": {"query": "capital of France"}
+    }
+  ],
+  "stop_reason": "tool_use",
+  "usage": {"input_tokens": 50, "output_tokens": 25}
+}
+```
+
+**Follow-up request with tool result:**
+```json
+{
+  "model": "claude-sonnet-4-20250514",
+  "max_tokens": 256,
+  "messages": [
+    {"role": "user", "content": "What is the capital of France?"},
+    {
+      "role": "assistant",
+      "content": [
+        {
+          "type": "tool_use",
+          "id": "call_123",
+          "name": "search",
+          "input": {"query": "capital of France"}
+        }
+      ]
+    },
+    {
+      "role": "user",
+      "content": [
+        {
+          "type": "tool_result",
+          "tool_use_id": "call_123",
+          "content": "Paris is the capital of France"
+        }
+      ]
+    }
+  ]
+}
+```
+
+**Final response:**
+```json
+{
+  "id": "msg_...",
+  "type": "message",
+  "role": "assistant",
+  "content": [
+    {
+      "type": "text",
+      "text": "Paris is the capital of France."
+    }
+  ],
+  "stop_reason": "end_turn",
+  "usage": {"input_tokens": 100, "output_tokens": 15}
+}
+```
+
+### Image Request
+
+```json
+{
+  "model": "claude-sonnet-4-20250514",
+  "max_tokens": 256,
+  "messages": [
+    {
+      "role": "user",
+      "content": [
+        {
+          "type": "image",
+          "source": {
+            "type": "base64",
+            "media_type": "image/jpeg",
+            "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
+          }
+        },
+        {
+          "type": "text",
+          "text": "Describe this image"
+        }
+      ]
+    }
+  ]
+}
+```
+
+### Using Claude SDK (Recommended)
+
+Set environment variables:
+```bash
+export ANTHROPIC_BASE_URL=https://gateway.example.com
+export ANTHROPIC_AUTH_TOKEN=my-secret-token
+```
+
+Then use normally:
+```javascript
+import Anthropic from "@anthropic-ai/sdk";
+
+const client = new Anthropic({
+  baseURL: process.env.ANTHROPIC_BASE_URL,
+  apiKey: process.env.ANTHROPIC_AUTH_TOKEN,
+});
+
+const message = await client.messages.create({
+  model: "claude-sonnet-4-20250514",
+  max_tokens: 256,
+  messages: [
+    { role: "user", content: "Say hello!" }
+  ],
+});
+
+console.log(message.content[0].text);
+```
+
+## Limitations & Compatibility
+
+### Fully Supported
+- Text messages
+- Image content (base64 and URLs)
+- Tool definitions and tool use/tool result round-trips
+- System messages (string or array)
+- Streaming responses with proper SSE format
+- Stop sequences
+- Temperature, top_p, max_tokens
+- Usage token counts
+
+### Unsupported (Filtered Out)
+- Thinking blocks (Claude 3.7+)
+- Cache control directives
+- Multi-modal tool inputs (tools receive text input only)
+- Vision-specific model parameters
+
+### Behavioral Differences from Anthropic API
+- Single shared token (no per-user auth)
+- No rate limiting (implement on your end if needed)
+- No request logging/audit trail
+- Error messages may differ (OpenAI error format converted)
+- Latency slightly higher due to proxying
+
+## Rate Limiting Notes
+
+Gateway itself has no rate limits. Limits come from:
+1. **OpenAI API quota**: Based on your API tier
+2. **Network throughput**: Hono/platform limits
+3. **Token count**: OpenAI pricing
+
+Recommendations:
+- Implement client-side rate limiting
+- Monitor token usage via `usage` field in responses
+- Set aggressive `max_tokens` limits if cost is concern
+- Use smaller models in `MODEL_MAP` for cost reduction