
OpenAI-Compatible API

Flash API provides a fully OpenAI-compatible API that supports all GPT-series models as well as other models that follow the OpenAI format.

Base URL

https://ai.flashapi.top/v1

Authentication

Add your API key to the request header:

http
Authorization: Bearer YOUR_API_KEY
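Using only the Python standard library, attaching the key looks like the sketch below (the urllib-based client and the model name are illustrative choices, not a required SDK):

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # replace with your real key

# Request body for a non-streaming call (assumed model name).
body = json.dumps({
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}).encode("utf-8")

# Build the request with the Authorization header attached.
req = urllib.request.Request(
    "https://ai.flashapi.top/v1/chat/completions",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```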

Chat Completions

Overview

The core endpoint for text conversations, supporting both single-turn and multi-turn dialogue.

Endpoint

POST /v1/chat/completions

Important notes

Streaming requirement for GPT models

  • All GPT-series models must use streaming output, which also makes them well suited for direct use in Codex
  • Requests must set "stream": true
  • Non-streaming requests will return an error

Other models (Claude, Gemini, etc.) support both streaming and non-streaming modes.

Request examples

GPT models (streaming required)

bash
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful AI assistant"
      },
      {
        "role": "user",
        "content": "Write a quicksort in Python"
      }
    ],
    "stream": true,
    "temperature": 0.7,
    "max_tokens": 2000
  }'

Other models (streaming optional)

bash
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": false,
    "temperature": 0.7
  }'

Request parameters

| Parameter      | Type         | Required                | Description                                      |
|----------------|--------------|-------------------------|--------------------------------------------------|
| model          | string       | Yes                     | Model name, e.g. gpt-5.2                         |
| messages       | array        | Yes                     | List of conversation messages                    |
| stream         | boolean      | Required for GPT models | Must be true for GPT models; optional for others |
| temperature    | number       | No                      | Sampling temperature, 0-2, default 1             |
| max_tokens     | integer      | No                      | Maximum number of tokens to generate             |
| top_p          | number       | No                      | Nucleus sampling parameter, 0-1                  |
| stop           | string/array | No                      | Stop sequences                                   |
| stream_options | object       | No                      | Streaming output options                         |

Messages format

json
{
  "messages": [
    {
      "role": "system",
      "content": "System prompt"
    },
    {
      "role": "user",
      "content": "User message"
    },
    {
      "role": "assistant",
      "content": "Assistant reply"
    }
  ]
}

Roles

  • system - system prompt that defines the AI's behavior
  • user - user message
  • assistant - the AI's reply (used for multi-turn conversations)
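To run a multi-turn conversation, append each assistant reply to the messages list before sending the next request. A minimal sketch (the reply text is a placeholder, not real API output):

```python
# Conversation history grows turn by turn; resend the full list each request.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant"},
    {"role": "user", "content": "What is quicksort?"},
]

# After each response, store the assistant's reply (placeholder text here)...
assistant_reply = "Quicksort is a divide-and-conquer sorting algorithm."
messages.append({"role": "assistant", "content": assistant_reply})

# ...then append the next user turn and send `messages` again.
messages.append({"role": "user", "content": "Show an implementation in Python."})
```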

Response format

json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-5.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is the AI's reply"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 56,
    "completion_tokens": 31,
    "total_tokens": 87
  }
}

Streaming output

GPT models must use streaming output; it is optional for other models.

GPT model streaming request

bash
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

Streaming response format (SSE)

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-5.2","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-5.2","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-5.2","choices":[{"index":0,"delta":{"content":" there"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-5.2","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":9,"completion_tokens":2,"total_tokens":11}}

data: [DONE]
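Without an SDK, each SSE event carries a JSON chunk after the `data: ` prefix, and the stream ends with a literal `[DONE]`. A minimal sketch of accumulating the streamed text (the helper name is illustrative):

```python
import json

def parse_sse_content(sse_text: str) -> str:
    """Accumulate delta content from an SSE chat-completion stream."""
    pieces = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces)
```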

Non-GPT models (streaming optional)

bash
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'

Supported models

GPT series (streaming required)

| Model            | Description        | Context | Streaming |
|------------------|--------------------|---------|-----------|
| gpt-5.2          | Latest GPT-5 model | 200K    | Required  |
| gpt-5.2-codex    | Code-optimized     | 200K    | Required  |
| gpt-5.1          | Stable release     | 128K    | Required  |
| gpt-5-codex-mini | Lightweight        | 128K    | Required  |

Claude series (OpenAI format, streaming optional)

| Model             | Description         | Context | Streaming |
|-------------------|---------------------|---------|-----------|
| claude-opus-4-6   | Most capable Claude | 200K    | Optional  |
| claude-sonnet-4-6 | Balanced            | 200K    | Optional  |
| claude-haiku-4-5  | Fast                | 200K    | Optional  |

Gemini series (OpenAI format, streaming optional)

| Model            | Description     | Context | Streaming |
|------------------|-----------------|---------|-----------|
| gemini-2.5-pro   | Gemini flagship | 1M      | Optional  |
| gemini-2.0-flash | Fast            | 1M      | Optional  |

Code examples

Python (GPT model, streaming)

python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai.flashapi.top/v1"
)

# GPT models must use streaming output
stream = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "system", "content": "You are a programming assistant"},
        {"role": "user", "content": "Write a Python function that computes the Fibonacci sequence"}
    ],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    # The final usage chunk may carry an empty choices list, so guard first
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")

    # Usage statistics arrive in the final chunk
    if hasattr(chunk, 'usage') and chunk.usage:
        print(f"\n\nUsed {chunk.usage.total_tokens} tokens")

Python (non-GPT models, streaming optional)

python
# Non-streaming request (non-GPT models only)
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "user", "content": "Write a quicksort algorithm"}
    ],
    stream=False
)

print(response.choices[0].message.content)

Node.js (GPT model, streaming)

javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://ai.flashapi.top/v1'
});

async function main() {
  // GPT models must use streaming output
  const stream = await client.chat.completions.create({
    model: 'gpt-5.2',
    messages: [
      { role: 'system', content: 'You are a programming assistant' },
      { role: 'user', content: 'Write a quicksort algorithm' }
    ],
    stream: true,
    stream_options: { include_usage: true }
  });

  for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) {
      process.stdout.write(chunk.choices[0].delta.content);
    }

    // Usage statistics arrive in the final chunk
    if (chunk.usage) {
      console.log(`\n\nUsed ${chunk.usage.total_tokens} tokens`);
    }
  }
}

main();

Node.js (non-GPT models)

javascript
// Non-GPT models may use non-streaming requests
async function callClaude() {
  const completion = await client.chat.completions.create({
    model: 'claude-sonnet-4-6',
    messages: [
      { role: 'user', content: 'Write a quicksort algorithm' }
    ],
    ],
    stream: false
  });

  console.log(completion.choices[0].message.content);
}

cURL (GPT models)

bash
# GPT models must use streaming
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

cURL (non-GPT models)

bash
# Non-GPT models may use non-streaming
curl https://ai.flashapi.top/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'

Error handling

Error response format

json
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

Common error codes

| Status | Error type          | Description         |
|--------|---------------------|---------------------|
| 401    | invalid_api_key     | Invalid API key     |
| 429    | rate_limit_exceeded | Rate limit exceeded |
| 500    | server_error        | Server error        |
| 503    | service_unavailable | Service unavailable |

Error handling example

python
from openai import OpenAI, APIError

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai.flashapi.top/v1"
)

try:
    # GPT models must use streaming
    stream = client.chat.completions.create(
        model="gpt-5.2",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True
    )

    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

except APIError as e:
    print(f"API error: {e.message}")
    print(f"Error type: {e.type}")
    # status_code is only present on APIStatusError subclasses
    print(f"Status code: {getattr(e, 'status_code', 'N/A')}")

Best practices

1. Set a reasonable timeout

python
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai.flashapi.top/v1",
    timeout=30.0  # 30-second timeout
)

2. Implement a retry mechanism

python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10)
)
def call_api():
    return client.chat.completions.create(...)

3. Use streaming output (required for GPT models)

GPT models must use streaming output; it is optional for other models:

python
# GPT models (streaming required)
stream = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Write an article"}],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='')

# Non-GPT models (streaming optional)
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Write an article"}],
    stream=False  # May be set to False
)
print(response.choices[0].message.content)

4. Control token usage

python
# This example is non-streaming, so it uses a non-GPT model
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[...],
    max_tokens=500,  # Limit output length
    temperature=0.3  # Reduce randomness
)

# Inspect token usage
print(f"Used {response.usage.total_tokens} tokens")

Rate limits

Flash API enforces the following rate limits:

| Limit type          | Value |
|---------------------|-------|
| Requests per minute | 60    |
| Requests per hour   | 3600  |
| Concurrent requests | 10    |

Exceeding a limit returns a 429 error; implementing exponential-backoff retries is recommended.
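The recommended backoff can be sketched without a retry library (the attempt count, delays, and stand-in exception type are illustrative):

```python
import random
import time

def with_backoff(call, max_attempts=3, base_delay=1.0):
    """Retry `call` with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # stand-in for the SDK's 429 rate-limit exception
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...) plus jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

Wrap the actual completion call, e.g. `with_backoff(lambda: client.chat.completions.create(...))`.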
