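Use streaming output to display a Claude agent's replies in real time, which reduces the perceived wait.
What is streaming output
Streaming output delivers Claude's reply in small pieces as it is generated, instead of waiting until the whole reply is complete. The effect is similar to the incremental display in ChatGPT.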
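To make the difference concrete, here is a minimal sketch that simulates incremental delivery with a plain Python generator. No API call is made. `fake_stream` and `consume` are hypothetical names used only for illustration, and the chunk size is arbitrary.

```python
from typing import Callable, Iterable, Iterator


def fake_stream(reply: str, chunk_size: int = 4) -> Iterator[str]:
    """Hypothetical stand-in for a model stream: yields the reply in small pieces."""
    for start in range(0, len(reply), chunk_size):
        yield reply[start:start + chunk_size]


def consume(chunks: Iterable[str], on_chunk: Callable[[str], None]) -> str:
    """Hand each chunk to on_chunk as it arrives and return the assembled reply."""
    parts = []
    for chunk in chunks:
        on_chunk(chunk)  # e.g. display it immediately
        parts.append(chunk)
    return "".join(parts)


if __name__ == "__main__":
    full = consume(
        fake_stream("Streaming shows text early."),
        lambda c: print(c, end="", flush=True),
    )
    print()
    print(len(full), "characters received")
```

The consumer can act on the first chunk long before the last one exists, which is the property that makes a streamed reply feel faster even when total generation time is unchanged.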
Python streaming
from anthropic import Anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    system="You are a programming assistant",
    messages=[{"role": "user", "content": "Write a quicksort"}],
) as stream:
    # Print each text chunk as soon as it arrives
    for text in stream.text_stream:
        print(text, end="", flush=True)
    print()

    # Get the complete message and token usage
    final_message = stream.get_final_message()
    print(f"Input tokens: {final_message.usage.input_tokens}")
    print(f"Output tokens: {final_message.usage.output_tokens}")
Node.js streaming
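import Anthropic from "@anthropic-ai/sdk";

// Reads the API key from the ANTHROPIC_API_KEY environment variable
const anthropic = new Anthropic();

const stream = await anthropic.messages.stream({
  model: "claude-sonnet-4-20250514",
  max_tokens: 2048,
  system: "You are a programming assistant",
  messages: [{ role: "user", content: "Write a quicksort" }]
});

// Write only text deltas; other delta types carry no text field
for await (const chunk of stream) {
  if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
    process.stdout.write(chunk.delta.text);
  }
}

// Get the complete message and token usage
const finalMessage = await stream.finalMessage();
console.log("\nInput tokens:", finalMessage.usage.input_tokens);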
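The loop above keeps only `content_block_delta` events and writes their text. As a rough Python sketch of the same filtering idea, the function below walks a sequence of event objects and yields text from text deltas only. The events are hand-built `SimpleNamespace` stand-ins, not real SDK objects, and `iter_text` is a hypothetical helper name. The field names mirror the Node.js example (`type`, `delta.text`), and the `delta.type` check assumes text deltas are tagged `text_delta`.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(events: Iterable) -> Iterator[str]:
    """Yield text from content_block_delta events, skipping everything else."""
    for event in events:
        if event.type != "content_block_delta":
            continue
        delta = event.delta
        # Assumption: text deltas are tagged "text_delta"; other deltas have no text
        if getattr(delta, "type", None) == "text_delta":
            yield delta.text


# Hand-built stand-in events, for illustration only
sample_events = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="def ")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="input_json_delta", partial_json="{")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="quicksort")),
    SimpleNamespace(type="message_stop"),
]

if __name__ == "__main__":
    print("".join(iter_text(sample_events)))
```

Filtering on the delta type as well as the event type avoids writing `undefined` or raising an attribute error when a non-text delta arrives.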
Using it in a web application
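import json

from anthropic import Anthropic
from flask import Flask, Response, request

app = Flask(__name__)
client = Anthropic()

@app.route("/chat", methods=["POST"])
def chat_stream():
    data = request.json

    def generate():
        with client.messages.stream(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=[{"role": "user", "content": data["prompt"]}],
        ) as stream:
            for text in stream.text_stream:
                # JSON-encode each chunk so newlines in the text cannot break SSE framing
                yield f"data: {json.dumps(text)}\n\n"
        # Sentinel that tells the client the stream has ended
        yield "data: [DONE]\n\n"

    return Response(generate(), mimetype="text/event-stream")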
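On the receiving side, a client has to split the response into events and stop at the `[DONE]` sentinel. The following is a minimal sketch of that parsing step, working on an iterable of already-decoded text lines rather than on a real HTTP connection. `iter_sse_data` and its `decode` and `done_marker` parameters are hypothetical names, and the function handles only single-line `data:` fields. If the server JSON-encodes each chunk, `json.loads` can be passed as `decode`.

```python
import json
from typing import Callable, Iterable, Iterator


def iter_sse_data(
    lines: Iterable[str],
    decode: Callable[[str], str] = lambda payload: payload,
    done_marker: str = "[DONE]",
) -> Iterator[str]:
    """Yield the payload of each `data:` line until the done marker appears."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separator lines and other fields are ignored
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # SSE strips one leading space after the colon
        if payload == done_marker:
            return
        yield decode(payload)


if __name__ == "__main__":
    raw = ['data: "def qs(a):\\n"', "", 'data: "    return a"', "", "data: [DONE]", ""]
    print("".join(iter_sse_data(raw, decode=json.loads)))
```

With the `requests` library, such lines could plausibly come from `response.iter_lines(decode_unicode=True)` on a POST made with `stream=True`; that wiring is left out here so the sketch runs without a server.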