Load balancing across worker instances

Run multiple workers in a group; Resonate dispatches work to whichever is available and recovers it when one dies.

Run several workers in the same group. Calls that set target: "poll://any@<group>" are dispatched to whichever worker in that group claims them first. When a worker dies mid-execution, Resonate reassigns the work to a survivor, with no service registry, no leader election, and no glue code.

SDK versions

TypeScript: @resonatehq/sdk v0.10.2 (current). Python: resonate-sdk v0.7.0 (current). Rust: 0.4.0, in active development. Go: pre-release — no semver tag yet, so the example pins a specific commit (see Go SDK).

example-load-balancing-ts (TypeScript)

Worker group with random-cost compute jobs dispatched via async RPC.

example-load-balancing-py (Python)

Worker group with random-cost compute jobs dispatched via async RPC.

example-load-balancing-rs (Rust)

Worker group with random-cost compute jobs dispatched via async RPC + spawn().

example-load-balancing-go (Go)

Worker group with simulated compute jobs dispatched via async RPC to a poll://any target.

The problem

A single worker eventually runs out of capacity, and a single worker is also a single point of failure. The textbook fix is to run several, but that opens its own can of worms: which worker has spare capacity, how does the caller find one, what happens when the chosen worker dies mid-job, and who takes the work over?

Most teams end up bolting service discovery, load balancing, and recovery onto application code in three different places, each with its own bugs.

Resonate's solution

Resonate ships service discovery, load balancing, and crash recovery behind one primitive: the target schema. Workers in the same group long-poll the server; the caller dispatches with target: "poll://any@<group>" and the server hands the work to whichever worker is ready. If that worker dies before completing, the workflow's durable promise stays open and another worker in the group picks it up.
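The claim-first behaviour described above can be illustrated with a toy model. This is a minimal sketch, not Resonate's implementation: an in-process asyncio.Queue stands in for the server and the group, and the names worker, dispatch_demo, and the task tuples are hypothetical. It only shows why idle workers pulling from one shared source spread the load without a separate load balancer.

```python
import asyncio
from collections import Counter


async def worker(name: str, queue: asyncio.Queue, handled: Counter) -> None:
    """Hypothetical worker: waits on the shared queue and claims the next task."""
    while True:
        task_id, cost = await queue.get()  # stands in for a long-poll to the server
        await asyncio.sleep(cost)          # simulated compute
        handled[name] += 1
        queue.task_done()


async def dispatch_demo(n_workers: int = 3, n_tasks: int = 6) -> Counter:
    queue: asyncio.Queue = asyncio.Queue()  # stands in for one worker group
    handled: Counter = Counter()
    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, handled))
        for i in range(n_workers)
    ]
    # The caller only names the group (the queue), never a specific worker.
    for t in range(n_tasks):
        queue.put_nowait((f"task-{t}", 0.01))
    await queue.join()                      # wait until every task is done
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return handled


handled = asyncio.run(dispatch_demo())
print(dict(handled))  # per-worker counts; the exact split depends on scheduling
```

Because each worker only asks for more work when it is free, a slow task on one worker does not hold up tasks that the others can claim.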

Code walkthrough

Two pieces: a worker that registers a durable function and joins a group, and a client that dispatches work to that group.

The worker group

Each worker process is identical except for the group it joins. Run as many as you want — they share work automatically.

worker.ts·typescript

import { Resonate } from "@resonatehq/sdk";
import type { Context } from "@resonatehq/sdk";

const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "workers",
});

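// Async so the function only completes after the simulated work finishes;
// a bare setTimeout would let it return immediately.
async function computeSomething(ctx: Context, args: { id: string; computeCost: number }) {
  console.log(`${args.id} starting computation`);
  await new Promise((resolve) => setTimeout(resolve, args.computeCost * 1000));
  console.log(`${args.id} computed something that cost ${args.computeCost} seconds`);
}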

resonate.register("computeSomething", computeSomething);
console.log("worker is running...");

worker.py·python

from __future__ import annotations

import asyncio
import os
from datetime import timedelta
from typing import TYPE_CHECKING

from resonate.resonate import Resonate

if TYPE_CHECKING:
    from resonate.context import Context

url = os.environ.get("RESONATE_URL", "http://localhost:8001")
resonate = Resonate(url=url, group="worker-group")


async def compute_something(ctx: Context, id: str, compute_cost: int) -> None:
    print(f"starting computation {id}")
    await ctx.sleep(timedelta(seconds=compute_cost))
    print(f"computed something that cost {compute_cost} seconds")


resonate.register(compute_something)


async def main() -> None:
    print("worker running...")
    # Keep the worker alive to receive invocations.
    await asyncio.Event().wait()


if __name__ == "__main__":
    asyncio.run(main())

src/bin/worker.rs·rust

use resonate::prelude::*;
use std::time::Duration;

#[resonate::function]
async fn compute_something(ctx: &Context, id: String, compute_cost: u64) -> Result<()> {
    println!("{id} starting computation");
    ctx.sleep(Duration::from_secs(compute_cost)).await?;
    println!("{id} computed something that cost {compute_cost} seconds");
    Ok(())
}

#[tokio::main]
async fn main() {
    let resonate = Resonate::new(ResonateConfig {
        url: Some("http://localhost:8001".into()),
        group: Some("workers".into()),
        ..Default::default()
    });
    resonate.register(compute_something).unwrap();
    println!("worker is running...");
    tokio::signal::ctrl_c().await.unwrap();
}

main.go·go

// WorkArgs carries the task identifier sent from client to worker.
type WorkArgs struct {
	TaskName string `json:"taskName"`
}

// computeSomething simulates a unit of work. It records which worker
// handled the task so the client can print the distribution.
func computeSomething(workerID string) func(_ *resonate.Context, args WorkArgs) (string, error) {
	return func(_ *resonate.Context, args WorkArgs) (string, error) {
		// Simulate a small amount of work.
		time.Sleep(50 * time.Millisecond)
		result := fmt.Sprintf("worker-%s handled %s", workerID, args.TaskName)
		fmt.Printf("[worker-%s] handling %s → done\n", workerID, args.TaskName)
		return result, nil
	}
}

Each worker instance is created with httpnet.NewHTTP(url, httpnet.HTTPOptions{Group: ...}) and registers computeSomething under that shared group. The example spins several up inside one process — in production each would be its own process:

main.go·go

		r, err := resonate.New(resonate.Config{
			Network: httpnet.NewHTTP(*url, httpnet.HTTPOptions{
				PID:   pid,
				Group: *group,
			}),
		})
		if err != nil {
			log.Fatalf("resonate.New (worker-%s): %v", workerID, err)
		}
		workerInstances[i] = r

		// Register the compute function with a closure that captures the worker ID.
		if _, err := resonate.Register(r, "computeSomething", computeSomething(workerID)); err != nil {
			log.Fatalf("Register (worker-%s): %v", workerID, err)
		}

Dispatching to the group

The caller picks a target with the poll://any@<group> schema. any means "whichever worker in the group claims it first."
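To make the shape of the target string concrete, here is a small sketch that builds and splits a poll://any@<group> address with the Python standard library. The helper names build_target and parse_target, and the dictionary keys "transport" and "mode", are hypothetical labels chosen for illustration; they are not part of any Resonate SDK.

```python
from urllib.parse import urlsplit


def build_target(group: str) -> str:
    """Build the anycast target string for a worker group."""
    return f"poll://any@{group}"


def parse_target(target: str) -> dict:
    """Split a target of the form poll://<mode>@<group> into its parts."""
    parts = urlsplit(target)
    # netloc is "<mode>@<group>"; partition keeps the group's original casing.
    mode, sep, group = parts.netloc.partition("@")
    if parts.scheme != "poll" or not sep or not mode or not group:
        raise ValueError(f"unexpected target: {target!r}")
    return {"transport": parts.scheme, "mode": mode, "group": group}


print(parse_target(build_target("workers")))
# {'transport': 'poll', 'mode': 'any', 'group': 'workers'}
```

Building the string in one helper keeps the group name in a single place, so the worker's group and the caller's target cannot drift apart.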

client.ts·typescript

import { Resonate } from "@resonatehq/sdk";
import { v4 as uuid } from "uuid";

const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "client",
});

const id = uuid();
const computeCost = Math.floor(Math.random() * 10) + 1;
await resonate.beginRpc(
  id,
  "computeSomething",
  { id, computeCost },
  resonate.options({ target: "poll://any@workers" }),
);
await resonate.stop();

invoke.py·python

from __future__ import annotations

import asyncio
import os
from random import randint
from uuid import uuid4

from resonate.resonate import Resonate


async def main() -> None:
    url = os.environ.get("RESONATE_URL", "http://localhost:8001")
    resonate = Resonate(url=url, group="invoke-group")
    try:
        promise_id = str(uuid4())
        compute_cost = randint(1, 10)
        handle = resonate.options(target="poll://any@worker-group").rpc(
            promise_id, "compute_something", promise_id, compute_cost,
        )
        await handle.result()
    finally:
        await resonate.stop()


if __name__ == "__main__":
    asyncio.run(main())

src/bin/client.rs·rust

use rand::Rng;
use resonate::prelude::*;
use uuid::Uuid;

#[tokio::main]
async fn main() {
    let resonate = Resonate::new(ResonateConfig {
        url: Some("http://localhost:8001".into()),
        ..Default::default()
    });
    let id = Uuid::new_v4().to_string();
    let cost: u64 = rand::thread_rng().gen_range(1..=10);

    let _: () = resonate
        .rpc(&id, "compute_something", (id.clone(), cost))
        .target("poll://any@workers")
        .spawn()
        .await
        .unwrap();
    resonate.stop().await;
}

A separate client instance dispatches each task with client.RPC and a poll://any@<group> target — the server hands each task to whichever worker in the group claims it first. Both blocks below are excerpts from main(), where *url, *group, and ctx come from the parsed flags and request context:

main.go·go

	clientPID := fmt.Sprintf("client-%d", time.Now().UnixNano())
	client, err := resonate.New(resonate.Config{
		Network: httpnet.NewHTTP(*url, httpnet.HTTPOptions{
			PID:   clientPID,
			Group: "client",
		}),
	})
	if err != nil {
		log.Fatalf("resonate.New (client): %v", err)
	}
	defer func() { _ = client.Stop() }()

	// Build the anycast target address for the worker group.
	target := fmt.Sprintf("poll://any@%s", *group)

main.go·go

		id := fmt.Sprintf("%s-%s", runID, taskName)
		h, err := client.RPC(ctx, id, "computeSomething", WorkArgs{TaskName: taskName},
			resonate.RPCOptions{Target: target},
		)

Run it locally

Start the server, run several workers, then dispatch repeatedly from the client. The steps are given once per SDK, in this order: TypeScript, Python, Rust, Go.

shell

git clone https://github.com/resonatehq-examples/example-load-balancing-ts
cd example-load-balancing-ts
npm install

Terminal 1: Resonate Server·shell

brew install resonatehq/tap/resonate
resonate dev

Terminals 2–4: three workers·shell

npx tsx worker.ts

Terminal 5: dispatch in a loop·shell

for i in 1 2 3 4 5 6; do npx tsx client.ts; done

Watch the work spread across the three worker terminals. Now kill one of them mid-execution — Resonate reassigns its in-flight workflow to a survivor.

shell

git clone https://github.com/resonatehq-examples/example-load-balancing-py
cd example-load-balancing-py
uv sync

Terminal 1: Resonate Server·shell

brew install resonatehq/tap/resonate
resonate dev

Terminals 2–4: three workers·shell

uv run python worker.py

Terminal 5: dispatch in a loop·shell

for i in 1 2 3 4 5 6; do uv run python invoke.py; done

Watch the work spread across the three worker terminals. Kill one of them mid-execution — Resonate reassigns its in-flight workflow to a survivor.

shell

git clone https://github.com/resonatehq-examples/example-load-balancing-rs
cd example-load-balancing-rs
cargo build

Terminal 1: Resonate Server·shell

brew install resonatehq/tap/resonate
resonate dev

Terminals 2–4: three workers·shell

cargo run --bin worker

Terminal 5: dispatch in a loop·shell

for i in 1 2 3 4 5 6; do cargo run --bin client; done

Unlike the other SDKs, the Go example runs the workers and the client in a single binary — there's no separate terminal per worker. -workers controls how many worker instances it spins up.

shell

git clone https://github.com/resonatehq-examples/example-load-balancing-go
cd example-load-balancing-go
go mod download

Terminal 1: Resonate Server·shell

brew install resonatehq/tap/resonate
resonate dev

Terminal 2: workers + client (one process)·shell

go run . -url=http://localhost:8001 -workers=3 -tasks=6

The single process starts three worker instances and a client that dispatches six tasks across the group; the [worker-N] log lines show the work spreading. Increase -workers and -tasks to watch the distribution change. Because the workers share this one process, the kill-a-single-worker recovery story below is the one shown by the separate-process TypeScript, Python, and Rust examples.
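To read the distribution off the output rather than eyeballing it, the [worker-N] lines can be tallied. The sketch below assumes the log format printed by computeSomething in the Go example ("[worker-<id>] handling <task> → done"); the sample lines, worker IDs, and task names are invented for illustration and are not real output.

```python
import re
from collections import Counter

# Matches lines such as "[worker-1] handling task-3 → done".
LINE = re.compile(r"^\[worker-([^\]]+)\] handling (\S+)")


def tally(lines) -> Counter:
    """Count how many tasks each worker ID reported handling."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, not captured from a real run.
sample = [
    "[worker-0] handling task-1 → done",
    "[worker-1] handling task-2 → done",
    "[worker-0] handling task-3 → done",
    "some unrelated log line",
    "[worker-2] handling task-4 → done",
]

counts = tally(sample)
print(dict(counts))  # {'0': 2, '1': 1, '2': 1}
```

The same function accepts any iterable of lines, so it could be pointed at captured output from a run with different -workers and -tasks values.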

Try the recovery story

Start three workers and dispatch enough jobs to keep all of them busy. Kill the worker holding a long-running job. The Resonate Server detects the loss, reassigns the workflow's durable promise to one of the survivors, and the work continues. The client never sees an error — it just gets a slightly delayed result.

The TypeScript, Python, and Rust examples run each worker as its own process, so you can Ctrl-C a single one to watch this happen. The Go example runs its workers inside one process, so it demonstrates the distribution but not single-worker recovery; the recursive factorial example shows the Go crash-recovery story with a separate worker process.
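The reassignment step can be pictured with a toy lease model. The text above does not say how the server detects a lost worker, so the lease-with-heartbeat mechanism here is an assumption, and ToyServer, Task, and every method name are hypothetical. Time is passed in explicitly so the sketch runs deterministically.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Task:
    id: str
    owner: Optional[str] = None
    lease_expires: float = 0.0
    done: bool = False


class ToyServer:
    """Hypothetical in-memory stand-in for a server that leases tasks to workers."""

    def __init__(self, lease_seconds: float = 5.0) -> None:
        self.lease_seconds = lease_seconds
        self.tasks: Dict[str, Task] = {}

    def submit(self, task_id: str) -> None:
        self.tasks[task_id] = Task(task_id)

    def claim(self, worker: str, now: float) -> Optional[Task]:
        """Hand out the first open task that is unowned or whose lease lapsed."""
        for task in self.tasks.values():
            if not task.done and (task.owner is None or task.lease_expires <= now):
                task.owner = worker
                task.lease_expires = now + self.lease_seconds
                return task
        return None

    def heartbeat(self, worker: str, task_id: str, now: float) -> None:
        """Extend the lease, but only for the current owner."""
        task = self.tasks[task_id]
        if task.owner == worker:
            task.lease_expires = now + self.lease_seconds

    def complete(self, worker: str, task_id: str) -> bool:
        """Mark the task done; reject workers that no longer own it."""
        task = self.tasks[task_id]
        if task.owner != worker or task.done:
            return False
        task.done = True
        return True


server = ToyServer(lease_seconds=5.0)
server.submit("job-1")
server.claim("worker-a", now=0.0)                # worker-a takes the job
server.heartbeat("worker-a", "job-1", now=3.0)   # lease now runs until 8.0
# worker-a is killed here and stops sending heartbeats.
too_early = server.claim("worker-b", now=4.0)    # lease still valid, nothing to claim
reassigned = server.claim("worker-b", now=9.0)   # lease lapsed, worker-b takes over
finished = server.complete("worker-b", "job-1")
```

The task stays open the whole time, which is why the caller in this model sees only a delay rather than an error.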

Related

Human-in-the-loop: same group dispatch, with a workflow that suspends on a durable promise.
Async HTTP API endpoints: long-running HTTP work without holding the connection.
