## Versions affected
Reproduced on `openai@6.34.0` (current latest). The same code path exists at least back to `openai@6.3.0`, so all v4+ releases appear to be affected (`fetchWithTimeout` in `src/client.ts`).
## Summary
The `timeout` option advertised by `new OpenAI({ timeout })` and by per-request `{ timeout }` options only covers the response-header arrival (TTFB) phase. Once headers are received, the internal timer is cleared in a `finally` block, so the subsequent body read (`await response.json()`, etc.) has no timer and can hang indefinitely.
This has caused a real 37-minute hang in a production aiAct workflow for us: the upstream model returned 200 OK quickly, then stalled while writing the body. The SDK's 10-minute default timeout never fired, the promise never rejected, retries never kicked in, and the caller had no signal of failure.
## Root cause
`src/client.ts` → `fetchWithTimeout` (abbreviated from `node_modules/openai/client.mjs` at 6.34.0):
```js
async fetchWithTimeout(url, init, ms, controller) {
  const { signal, method, ...options } = init || {};
  const abort = this._makeAbort(controller);
  if (signal) signal.addEventListener('abort', abort, { once: true });
  const timeout = setTimeout(abort, ms);
  // ...
  try {
    return await this.fetch.call(undefined, url, fetchOptions); // resolves at headers
  } finally {
    clearTimeout(timeout); // ← cleared the instant headers arrive
  }
}
```
After this returns, the caller does `await response.json()` in `internal/parse.mjs`. That body read is unguarded.
## Minimal reproducer
```js
// repro.mjs — run with `node repro.mjs` after `npm install openai@6.34.0`
import { createServer } from 'node:http';
import { once } from 'node:events';
import OpenAI from 'openai';

const TIMEOUT_MS = 2000;
const HANG_DETECTED_MS = 6000;

const server = createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.flushHeaders();
  // intentionally never res.end() — simulates a server stalled mid-body
});
server.listen(0, '127.0.0.1');
await once(server, 'listening');
const { port } = server.address();

const client = new OpenAI({
  apiKey: 'sk-test',
  baseURL: `http://127.0.0.1:${port}/v1`,
  timeout: TIMEOUT_MS,
  maxRetries: 0,
});

const start = Date.now();
let outcome = 'pending';
const request = client.chat.completions
  .create({ model: 'gpt-whatever', messages: [{ role: 'user', content: 'hi' }] })
  .then(() => (outcome = 'resolved'), (err) => (outcome = `rejected: ${err?.message}`));

const winner = await Promise.race([
  request,
  new Promise((r) => setTimeout(() => r('watchdog-fired'), HANG_DETECTED_MS)),
]);
console.log({
  openaiVersion: '6.34.0', // pinned by the install command above
  configuredTimeoutMs: TIMEOUT_MS,
  watchdogMs: HANG_DETECTED_MS,
  elapsedMs: Date.now() - start,
  winner,
  outcome,
});
if (winner === 'watchdog-fired') {
  console.error(`BUG: SDK timeout=${TIMEOUT_MS}ms did not fire; still hanging after ${HANG_DETECTED_MS}ms`);
}
server.close();
process.exit(winner === 'watchdog-fired' ? 1 : 0);
```
Expected: rejects at ~2000 ms with a timeout / abort error.
Actual:
```
{
  "openaiVersion": "6.34.0",
  "configuredTimeoutMs": 2000,
  "watchdogMs": 6000,
  "elapsedMs": 6000,
  "winner": "watchdog-fired",
  "outcome": "pending"
}
BUG: SDK timeout=2000ms did not fire; still hanging after 6000ms
```
The request is still pending 3× past the configured timeout.
## Why this matters
Many providers are OpenAI-compatible and sometimes send headers immediately but stream the body slowly (reasoning-heavy models, proxy gateways, rate-limited paths). Any such stall currently translates into an unbounded hang in the SDK — not a timeout error. Downstream retry logic can't recover because no error is ever observed.
## Suggested fix direction
Keep the timer armed across the body-read phase. One approach: instead of `clearTimeout(timeout)` in `finally`, clear it only when
- the response has no body (no `response.body`) — clear immediately; or
- the response body is fully drained / cancelled / errored — clear in the body stream's `done` / `cancel` / `error` handler.
A minimal implementation wraps `response.body` in a `ReadableStream` whose pipe-through clears the timeout on `pull` completion or error. We're happy to send a PR once a direction is acknowledged (generated-code considerations noted per CONTRIBUTING.md).
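One possible shape of that wrapper, as a sketch rather than the SDK's actual code (`fetchWithBodyTimeout` is an illustrative name; Node 18+ globals `fetch`, `Response`, and `ReadableStream` are assumed):

```js
// Keep one timer armed from request start until the body settles, instead
// of disarming it the moment fetch() resolves at headers.
async function fetchWithBodyTimeout(url, init, ms, controller) {
  const timer = setTimeout(() => controller.abort(), ms);
  let response;
  try {
    response = await fetch(url, { ...init, signal: controller.signal });
  } catch (err) {
    clearTimeout(timer); // failed before headers: disarm and rethrow
    throw err;
  }
  if (!response.body) {
    clearTimeout(timer); // nothing left to guard
    return response;
  }
  const reader = response.body.getReader();
  const guarded = new ReadableStream({
    async pull(ctrl) {
      try {
        const { done, value } = await reader.read();
        if (done) {
          clearTimeout(timer); // body fully drained: disarm
          ctrl.close();
        } else {
          ctrl.enqueue(value);
        }
      } catch (err) {
        clearTimeout(timer); // body read errored (incl. abort): disarm
        ctrl.error(err);
      }
    },
    cancel(reason) {
      clearTimeout(timer); // consumer gave up on the body: disarm
      return reader.cancel(reason);
    },
  });
  return new Response(guarded, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}
```

One design caveat: a single timer spanning headers plus the whole body would also abort legitimately long-lived streaming (SSE) responses, so a production fix would likely reset the timer per chunk (an idle timeout) rather than enforce one total deadline.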
## Client-side mitigation we applied
For context, downstream (Midscene.js) we now inject our own `AbortSignal` with a hard timeout (default 180 s) and pass it via the SDK's `signal` option on every `completions.create`, because the SDK forwards that signal into the underlying fetch body stream. This works end-to-end but should not be the normal expectation for users of the SDK. Reference: web-infra-dev/midscene#2350