AWS Lambda as a stateless trigger for durable workflows

Lambda receives the request, returns 202, and fires a durable Resonate workflow that can run for hours — past Lambda's timeout ceiling.


A Lambda function behind API Gateway acts as a stateless trigger. POST /process-document fires a durable Resonate workflow and returns 202 immediately. The workflow itself — download, OCR, LLM analysis, storage, notification — runs on a Resonate worker that doesn't share Lambda's 15-minute ceiling.

SDK versions

TypeScript: @resonatehq/sdk v0.10.1 (current). Python: resonate-sdk v0.6.x against the legacy Resonate Server. Rust example repo is forthcoming.

The problem#

Lambda's 15-minute execution ceiling is fine for short tasks but fatal for anything LLM-heavy, OCR-heavy, or human-in-the-loop. Workflows that run for an hour, a day, or a week need state that survives function timeouts — and the conventional escape (Step Functions, EventBridge, DynamoDB-backed state machines) is a lot of glue per workflow.

Restate's answer is to make Lambda the executor — your Lambda function is the workflow handler, with Restate Cloud calling back into it for each step. That requires a CDK stack, IAM roles, a Restate Cloud environment, service registration, and the Restate Lambda adapter.

Resonate's solution#

Lambda becomes a thin trigger. The handler does one thing: call resonate.run(...) to start a durable workflow, then return 202. The workflow executes on a Resonate worker (long-running Node process) — it can take hours, await human approval, suspend on ctx.sleep for days. Status polling is a separate GET /status/:jobId Lambda that asks Resonate whether the promise has resolved.

No CDK. No service registration. The Lambda calls Resonate the same way it would call any other HTTP API.

Code walkthrough#

Two pieces inside the Lambda handler — start the workflow, poll its status — and the workflow itself, which lives in a separate file because it doesn't run in Lambda.

The Lambda handler (the stateless trigger)#

src/handler.ts
import { Resonate } from "@resonatehq/sdk";
import type { APIGatewayEvent } from "aws-lambda";
import { processDocument, type DocumentJob } from "./workflow.js";

// Module-level: created once per cold start, reused across warm invocations.
const resonate = new Resonate({ url: process.env["RESONATE_URL"] });
resonate.register("processDocument", processDocument);

async function handleProcessDocument(event: APIGatewayEvent) {
  const job: DocumentJob = JSON.parse(event.body ?? "{}");

  // Fire-and-forget: the workflow runs on a Resonate worker, not in Lambda.
  // We don't await the returned promise, so Lambda responds without waiting
  // for the workflow to finish.
  resonate.run(`doc/${job.jobId}`, processDocument, job).catch(console.error);

  return {
    statusCode: 202,
    body: JSON.stringify({ status: "accepted", jobId: job.jobId }),
  };
}

async function handleStatus(event: APIGatewayEvent) {
  const jobId = event.pathParameters?.["jobId"];
  if (!jobId) {
    return { statusCode: 400, body: JSON.stringify({ error: "missing jobId" }) };
  }
  const handle = await resonate.get(`doc/${jobId}`);
  if (!(await handle.done())) {
    return { statusCode: 200, body: JSON.stringify({ status: "processing", jobId }) };
  }
  return { statusCode: 200, body: JSON.stringify({ status: "done", result: await handle.result() }) };
}

Same jobId deduplicates retries — API Gateway can re-fire the request without spawning duplicate work.
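The dedup comes from the durable promise id, not from anything Lambda does. A toy in-memory model (not the Resonate SDK — the real dedup happens server-side in the promise store) shows the shape:

```typescript
// Toy model of promise-id dedup: the first call for an id creates the work;
// later calls with the same id get the same in-flight promise back.
const inflight = new Map<string, Promise<string>>();

function runOnce(id: string, work: () => Promise<string>): Promise<string> {
  const existing = inflight.get(id);
  if (existing) return existing; // duplicate trigger: no new work spawned
  const p = work();
  inflight.set(id, p);
  return p;
}

// A re-fired request with the same jobId reuses the first execution.
let executions = 0;
const work = async () => { executions++; return "done"; };
const a = runOnce("doc/demo-1", work);
const b = runOnce("doc/demo-1", work);
// a === b, and executions stays at 1.
```

In the real system the map is the Resonate server's promise store, so dedup holds across Lambda cold starts and concurrent invocations, not just within one process.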

The workflow (runs on the worker, not Lambda)#

src/workflow.ts
import type { Context } from "@resonatehq/sdk";

export interface DocumentJob {
  jobId: string;
  documentUrl: string;
  requesterId: string;
  type: string;
}

// The step functions (downloadDocument, extractText, analyzeDocument,
// storeResults, notifyRequester) are defined further down in this file.
export function* processDocument(ctx: Context, job: DocumentJob) {
  // Each ctx.run is a checkpoint. Crash mid-step, resume from the next.
  const pageCount = yield* ctx.run(downloadDocument, job);
  const text = yield* ctx.run(extractText, job, pageCount);
  const { summary, data } = yield* ctx.run(analyzeDocument, job, text);
  const storedAt = yield* ctx.run(storeResults, job, summary, data);
  const notifiedAt = yield* ctx.run(notifyRequester, job, storedAt);

  return { jobId: job.jobId, summary, extractedData: data, storedAt, notifiedAt };
}

The five steps are plain functions invoked through ctx.run. Each result is checkpointed in a durable promise, so a worker crash mid-LLM-call resumes from the next step rather than re-running the whole pipeline.
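A step is just a function of its inputs. A hypothetical extractText — the body here is illustrative, not from the example repo, and depending on SDK version the function may also receive a Context as its first argument — might look like:

```typescript
interface DocumentJob {
  jobId: string;
  documentUrl: string;
  requesterId: string;
  type: string;
}

// Hypothetical step: in the real pipeline this would call an OCR service.
// Because it runs under ctx.run, its return value is checkpointed in a
// durable promise and never recomputed after a crash.
async function extractText(job: DocumentJob, pageCount: number): Promise<string> {
  const pages: string[] = [];
  for (let i = 1; i <= pageCount; i++) {
    pages.push(`[${job.type} ${job.jobId} page ${i}]`); // stand-in for OCR output
  }
  return pages.join("\n");
}
```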

Run it locally#

Each repo simulates API Gateway + Lambda + the worker on your machine.

code
git clone https://github.com/resonatehq-examples/example-aws-lambda-ts
cd example-aws-lambda-ts
npm install
Terminal 1 — Resonate Server
brew install resonatehq/tap/resonate
resonate dev
Terminal 2 — worker
npm run worker
Terminal 3 — local Lambda + API Gateway simulator
npm run local

Trigger a job:

code
curl -X POST http://localhost:3000/process-document \
  -H "Content-Type: application/json" \
  -d '{"jobId": "demo-1", "documentUrl": "https://example.com/doc.pdf", "requesterId": "alice", "type": "invoice"}'

Poll for the result:

code
curl http://localhost:3000/status/demo-1

Watch the worker terminal — the five steps log as they checkpoint.
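A client can poll the status endpoint until the job resolves. A minimal sketch — the URL and response shape match the handler above, but the backoff schedule is an arbitrary choice, not part of the example repo:

```typescript
// Delay schedule for polling: 1s, 2s, 4s, ... capped at 30s.
function backoffMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30_000);
}

async function pollUntilDone(baseUrl: string, jobId: string): Promise<unknown> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(`${baseUrl}/status/${jobId}`);
    const body = (await res.json()) as { status: string; result?: unknown };
    if (body.status === "done") return body.result; // promise resolved
    await new Promise((r) => setTimeout(r, backoffMs(attempt)));
  }
}

// pollUntilDone("http://localhost:3000", "demo-1").then(console.log);
```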

Try the recovery story#

Start the workflow, then kill the worker mid-pipeline (e.g. during the LLM analysis step). Restart the worker. Resonate replays only the steps that hadn't checkpointed; everything before the kill is reused from durable promises. The GET /status/:jobId endpoint never reports a failure — it just shows processing longer.
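The replay behavior can be modeled in a few lines: checkpoints live in a map keyed by step name, and a rerun consults the map before executing anything. This is a toy in-memory stand-in for Resonate's durable promises, not the SDK:

```typescript
type Checkpoints = Map<string, unknown>;

// Run a step only if it has no checkpoint yet; otherwise reuse the stored result.
async function step<T>(cp: Checkpoints, name: string, fn: () => Promise<T>): Promise<T> {
  if (cp.has(name)) return cp.get(name) as T; // replay: reuse, don't recompute
  const result = await fn();
  cp.set(name, result); // checkpoint before moving on
  return result;
}

// Simulate a pipeline whose first run dies after the "extract" step.
async function pipeline(cp: Checkpoints, crashAfterExtract: boolean) {
  await step(cp, "download", async () => 3);
  await step(cp, "extract", async () => "text");
  if (crashAfterExtract) throw new Error("worker killed");
  return step(cp, "analyze", async () => "summary");
}

const cp: Checkpoints = new Map();
await pipeline(cp, true).catch(() => {});   // first run dies mid-pipeline
const out = await pipeline(cp, false);      // restart: download/extract come from checkpoints
```

The second run walks the same steps in order, but the first two resolve from the map instead of re-executing — the same reason the real worker skips straight to the step that hadn't checkpointed.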

Related examples#

  • Cloud Run workers — same suspend-and-resume pattern on GCP, with the workflow itself running on the serverless platform.
  • Async HTTP API endpoints — the same fire-and-forget shape on a non-serverless gateway.