## Versions affected
Reproduced on `openai@6.34.0` (current latest). The same code path exists at least back to `openai@6.3.0`, so all v4+ releases appear to be affected (`fetchWithTimeout` in `src/client.ts`).
## Summary
The `timeout` option advertised by `new OpenAI({ timeout })` and by per-request `{ timeout }` options only covers the response-header arrival (TTFB) phase. Once headers are received, the internal timer is cleared in a `finally` block, so the subsequent body read (`await response.json()`, etc.) has no timer and can hang indefinitely.
This has caused a real 37-minute hang in a production aiAct workflow for us: the upstream model returned 200 OK quickly, then stalled while writing the body. The SDK's 10-minute default timeout never fired, the promise never rejected, retries never kicked in, and the caller had no signal of failure.
## Root cause
`src/client.ts` → `fetchWithTimeout` (abbreviated from `node_modules/openai/client.mjs` at 6.34.0):
```js
async fetchWithTimeout(url, init, ms, controller) {
  const { signal, method, ...options } = init || {};
  const abort = this._makeAbort(controller);
  if (signal) signal.addEventListener('abort', abort, { once: true });
  const timeout = setTimeout(abort, ms);
  // ...
  try {
    return await this.fetch.call(undefined, url, fetchOptions); // resolves at headers
  } finally {
    clearTimeout(timeout); // ← cleared the instant headers arrive
  }
}
```
After this returns, the caller does `await response.json()` in `internal/parse.mjs`. That body read is unguarded.
## Minimal reproducer
```js
// repro.mjs — run with `node repro.mjs` after `npm install openai@6.34.0`
import { createServer } from 'node:http';
import { once } from 'node:events';
import OpenAI from 'openai';

const TIMEOUT_MS = 2000;
const HANG_DETECTED_MS = 6000;

const server = createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.flushHeaders();
  // intentionally never res.end() — simulates a server stalled mid-body
});
server.listen(0, '127.0.0.1');
await once(server, 'listening');
const { port } = server.address();

const client = new OpenAI({
  apiKey: 'sk-test',
  baseURL: `http://127.0.0.1:${port}/v1`,
  timeout: TIMEOUT_MS,
  maxRetries: 0,
});

const start = Date.now();
let outcome = 'pending';
const request = client.chat.completions
  .create({ model: 'gpt-whatever', messages: [{ role: 'user', content: 'hi' }] })
  .then(() => (outcome = 'resolved'), (err) => (outcome = `rejected: ${err?.message}`));

const winner = await Promise.race([
  request,
  new Promise((r) => setTimeout(() => r('watchdog-fired'), HANG_DETECTED_MS)),
]);
console.log({
  openaiVersion: '6.34.0', // pinned by the install command above
  configuredTimeoutMs: TIMEOUT_MS,
  watchdogMs: HANG_DETECTED_MS,
  elapsedMs: Date.now() - start,
  winner,
  outcome,
});
if (winner === 'watchdog-fired') {
  console.error(`BUG: SDK timeout=${TIMEOUT_MS}ms did not fire; still hanging after ${HANG_DETECTED_MS}ms`);
}
server.close();
process.exit(winner === 'watchdog-fired' ? 1 : 0);
```
Expected: rejects at ~2000 ms with a timeout / abort error.
Actual:
```
{
  "openaiVersion": "6.34.0",
  "configuredTimeoutMs": 2000,
  "watchdogMs": 6000,
  "elapsedMs": 6000,
  "winner": "watchdog-fired",
  "outcome": "pending"
}
BUG: SDK timeout=2000ms did not fire; still hanging after 6000ms
```
The request is still pending 3× past the configured timeout.
## Why this matters
Many providers are OpenAI-compatible and sometimes send headers immediately but stream the body slowly (reasoning-heavy models, proxy gateways, rate-limited paths). Any such stall currently translates into an unbounded hang in the SDK — not a timeout error. Downstream retry logic can't recover because no error is ever observed.
## Suggested fix direction
Keep the timer armed across the body-read phase. One approach: instead of `clearTimeout(timeout)` in `finally`, clear it only when
- the response has no body (no `response.body`) — clear immediately; or
- the response body is fully drained / cancelled / errored — clear in the body stream's `done` / `cancel` / `error` handler.
A minimal implementation wraps `response.body` in a `ReadableStream` whose pipe-through clears the timeout on `pull` completion or error. We're happy to send a PR once a direction is acknowledged (generated-code considerations noted per CONTRIBUTING.md).
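One possible shape of that wrapper, as a sketch rather than the SDK's actual code (`fetchWithBodyTimeout` is an illustrative name; Node 18+ globals `fetch`, `Response`, and `ReadableStream` are assumed):

```js
// Keep one timer armed from request start until the body settles, instead
// of disarming it the moment fetch() resolves at headers.
async function fetchWithBodyTimeout(url, init, ms, controller) {
  const timer = setTimeout(() => controller.abort(), ms);
  let response;
  try {
    response = await fetch(url, { ...init, signal: controller.signal });
  } catch (err) {
    clearTimeout(timer); // failed before headers: disarm and rethrow
    throw err;
  }
  if (!response.body) {
    clearTimeout(timer); // nothing left to guard
    return response;
  }
  const reader = response.body.getReader();
  const guarded = new ReadableStream({
    async pull(ctrl) {
      try {
        const { done, value } = await reader.read();
        if (done) {
          clearTimeout(timer); // body fully drained: disarm
          ctrl.close();
        } else {
          ctrl.enqueue(value);
        }
      } catch (err) {
        clearTimeout(timer); // body read errored (incl. abort): disarm
        ctrl.error(err);
      }
    },
    cancel(reason) {
      clearTimeout(timer); // consumer gave up on the body: disarm
      return reader.cancel(reason);
    },
  });
  return new Response(guarded, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}
```

One design caveat: a single timer spanning headers plus the whole body would also abort legitimately long-lived streaming (SSE) responses, so a production fix would likely reset the timer per chunk (an idle timeout) rather than enforce one total deadline.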
## Client-side mitigation we applied
For context, downstream (Midscene.js) we now inject our own `AbortSignal` with a hard timeout (default 180 s) and pass it via the SDK's `signal` option on every `completions.create`, because the SDK forwards that signal into the underlying fetch body stream. This works end-to-end but should not be the normal expectation for users of the SDK. Reference: web-infra-dev/midscene#2350