Load balancing across worker instances
Run multiple workers in a group; Resonate dispatches work to whichever is available and recovers it when one dies.
Run several workers in the same group. Calls made with target: "poll://any@<group>" are dispatched to whichever worker claims them first. When a worker dies mid-execution, Resonate reassigns the work to a survivor: no service registry, no leader election, no glue.
TypeScript: @resonatehq/sdk v0.10.2 (current). Python: resonate-sdk v0.6.x against the legacy Resonate Server. Rust: 0.4.0, in active development. Go: pre-release — no semver tag yet, so the example pins a specific commit (see Go SDK).
- TypeScript: Worker group with random-cost compute jobs dispatched via async RPC.
- Python: Worker group with random-cost compute jobs dispatched via async RPC.
- Rust: Worker group with random-cost compute jobs dispatched via async RPC + spawn().
- Go: Worker group with simulated compute jobs dispatched via async RPC to a poll://any target.
The problem
A single worker eventually runs out of capacity, and a single worker is also a single point of failure. The textbook fix is to run several, but that raises its own questions: which worker has spare capacity, how does the caller find one, what happens when the chosen worker dies mid-job, and who takes the work over?
Most teams end up bolting service discovery, load balancing, and recovery onto application code in three different places, each with its own bugs.
Resonate's solution
Resonate ships service discovery, load balancing, and crash recovery behind one primitive: the target schema. Workers in the same group long-poll the server; the caller dispatches with target: "poll://any@<group>" and the server hands the work to whichever worker is ready. If that worker dies before completing, the workflow's durable promise stays open and another worker in the group picks it up.
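The dispatch rule described above, where work goes to whichever worker is ready, can be modeled in a few lines. The following is a minimal sketch, not Resonate's implementation: it assumes all jobs are queued at time zero, uses a logical clock instead of real long-polling, and the names dispatch, w1, w2, and w3 are hypothetical.

```python
import heapq


def dispatch(task_costs, worker_ids):
    """Assign each task to whichever worker becomes free first.

    Returns a list of (task_index, worker_id, start_time, finish_time).
    """
    # Min-heap of (time at which the worker is next free, worker id).
    free_at = [(0, worker) for worker in worker_ids]
    heapq.heapify(free_at)

    schedule = []
    for index, cost in enumerate(task_costs):
        # The earliest-free worker "claims" the next queued task.
        start, worker = heapq.heappop(free_at)
        finish = start + cost
        schedule.append((index, worker, start, finish))
        heapq.heappush(free_at, (finish, worker))
    return schedule


if __name__ == "__main__":
    costs = [5, 1, 1, 1, 4, 2]  # seconds of simulated work per job
    for index, worker, start, finish in dispatch(costs, ["w1", "w2", "w3"]):
        print(f"task {index} (cost {costs[index]}s) -> {worker}, {start}s to {finish}s")
```

In this model the worker stuck on the five-second job receives no further work until it finishes, while the other two absorb the short jobs. That is the behavior a pull-based group gives without any explicit load-balancing logic in the caller.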
Code walkthrough
Two pieces: a worker that registers a durable function and joins a group, and a client that dispatches work to that group.
The worker group
Each worker process is identical except for the group it joins. Run as many as you want — they share work automatically.
TypeScript:
import { Resonate } from "@resonatehq/sdk";
import type { Context } from "@resonatehq/sdk";

const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "workers",
});

// Simulates a job that occupies this worker for computeCost seconds.
async function computeSomething(ctx: Context, args: { id: string; computeCost: number }) {
  console.log(`${args.id} starting computation`);
  // Await the delay so the function stays in flight until the work is done.
  await new Promise<void>((resolve) => setTimeout(resolve, args.computeCost * 1000));
  console.log(`${args.id} computed something that cost ${args.computeCost} seconds`);
}

resonate.register("computeSomething", computeSomething);
console.log("worker is running...");

Python:
from resonate import Resonate
from threading import Event
import time

resonate = Resonate.remote(group="worker-group")

@resonate.register
def compute_something(_, id, compute_cost):
    print(f"starting computation {id}")
    time.sleep(compute_cost)
    print(f"computed something that cost {compute_cost} seconds")

resonate.start()
print("worker running...")
Event().wait()

Rust:
use resonate::prelude::*;
use std::time::Duration;

#[resonate::function]
async fn compute_something(ctx: &Context, id: String, compute_cost: u64) -> Result<()> {
    println!("{id} starting computation");
    ctx.sleep(Duration::from_secs(compute_cost)).await?;
    println!("{id} computed something that cost {compute_cost} seconds");
    Ok(())
}

#[tokio::main]
async fn main() {
    let resonate = Resonate::new(ResonateConfig {
        url: Some("http://localhost:8001".into()),
        group: Some("workers".into()),
        ..Default::default()
    });
    resonate.register(compute_something).unwrap();
    println!("worker is running...");
    tokio::signal::ctrl_c().await.unwrap();
}

Go:
// WorkArgs carries the task identifier sent from client to worker.
type WorkArgs struct {
	TaskName string `json:"taskName"`
}

// computeSomething simulates a unit of work. It records which worker
// handled the task so the client can print the distribution.
func computeSomething(workerID string) func(_ *resonate.Context, args WorkArgs) (string, error) {
	return func(_ *resonate.Context, args WorkArgs) (string, error) {
		// Simulate a small amount of work.
		time.Sleep(50 * time.Millisecond)
		result := fmt.Sprintf("worker-%s handled %s", workerID, args.TaskName)
		fmt.Printf("[worker-%s] handling %s → done\n", workerID, args.TaskName)
		return result, nil
	}
}

Each worker instance is created with httpnet.NewHTTP(url, httpnet.HTTPOptions{Group: ...}) and registers computeSomething under that shared group. The example spins several up inside one process; in production each would be its own process:
r, err := resonate.New(resonate.Config{
	Network: httpnet.NewHTTP(*url, httpnet.HTTPOptions{
		PID:   pid,
		Group: *group,
	}),
})
if err != nil {
	log.Fatalf("resonate.New (worker-%s): %v", workerID, err)
}
workerInstances[i] = r

// Register the compute function with a closure that captures the worker ID.
if _, err := resonate.Register(r, "computeSomething", computeSomething(workerID)); err != nil {
	log.Fatalf("Register (worker-%s): %v", workerID, err)
}

Dispatching to the group
The caller picks a target with the poll://any@<group> schema. any means "whichever worker in the group claims it first."
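To make the shape of the target string concrete, here is a small sketch that builds and splits it. It relies only on the form shown in this guide, poll://any@<group>. The part names scheme and mode, and the helpers parse_target and any_target, are assumptions made for illustration and are not part of any Resonate SDK.

```python
from typing import NamedTuple


class Target(NamedTuple):
    scheme: str  # transport part before "://", "poll" in this guide
    mode: str    # routing part before "@", "any" in this guide
    group: str   # worker group name after "@"


def parse_target(target: str) -> Target:
    """Split a string of the form <scheme>://<mode>@<group> into its parts."""
    scheme, sep, rest = target.partition("://")
    if not sep or not scheme:
        raise ValueError(f"missing scheme in target: {target!r}")
    mode, sep, group = rest.partition("@")
    if not sep or not mode or not group:
        raise ValueError(f"expected <mode>@<group> in target: {target!r}")
    return Target(scheme, mode, group)


def any_target(group: str) -> str:
    """Build the 'any worker in this group' target used throughout this guide."""
    return f"poll://any@{group}"


if __name__ == "__main__":
    print(parse_target(any_target("workers")))
```

Building the string through one helper keeps the group name in a single place, so the worker's group setting and the caller's target are less likely to drift apart.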
TypeScript:
import { Resonate } from "@resonatehq/sdk";
import { v4 as uuid } from "uuid";

const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "client",
});

const id = uuid();
const computeCost = Math.floor(Math.random() * 10) + 1;

await resonate.beginRpc(
  id,
  "computeSomething",
  { id, computeCost },
  resonate.options({ target: "poll://any@workers" }),
);

await resonate.stop();

Python:
from resonate import Resonate
from uuid import uuid4
from random import randint

resonate = Resonate.remote(group="invoke-group")

promise_id = str(uuid4())
compute_cost = randint(1, 10)

_ = resonate.options(target="poll://any@worker-group").begin_rpc(
    promise_id, "compute_something", promise_id, compute_cost,
)

Rust:
use rand::Rng;
use resonate::prelude::*;
use uuid::Uuid;

#[tokio::main]
async fn main() {
    let resonate = Resonate::new(ResonateConfig {
        url: Some("http://localhost:8001".into()),
        ..Default::default()
    });

    let id = Uuid::new_v4().to_string();
    let cost: u64 = rand::thread_rng().gen_range(1..=10);

    let _: () = resonate
        .rpc(&id, "compute_something", (id.clone(), cost))
        .target("poll://any@workers")
        .spawn()
        .await
        .unwrap();

    resonate.stop().await;
}

Go:
A separate client instance dispatches each task with client.RPC and a poll://any@<group> target; the server hands each task to whichever worker in the group claims it first. Both blocks below are excerpts from main(), where *url, *group, and ctx come from the parsed flags and request context:
clientPID := fmt.Sprintf("client-%d", time.Now().UnixNano())
client, err := resonate.New(resonate.Config{
	Network: httpnet.NewHTTP(*url, httpnet.HTTPOptions{
		PID:   clientPID,
		Group: "client",
	}),
})
if err != nil {
	log.Fatalf("resonate.New (client): %v", err)
}
defer func() { _ = client.Stop() }()

// Build the anycast target address for the worker group.
target := fmt.Sprintf("poll://any@%s", *group)

id := fmt.Sprintf("%s-%s", runID, taskName)
h, err := client.RPC(ctx, id, "computeSomething", WorkArgs{TaskName: taskName},
	resonate.RPCOptions{Target: target},
)

Run it locally
Start the server, run several workers, then dispatch repeatedly from the client.
TypeScript:
git clone https://github.com/resonatehq-examples/example-load-balancing-ts
cd example-load-balancing-ts
npm install

# Start the server.
brew install resonatehq/tap/resonate
resonate dev

# Start a worker (repeat in three terminals).
npx tsx worker.ts

# Dispatch six jobs from the client.
for i in 1 2 3 4 5 6; do npx tsx client.ts; done

Watch the work spread across the three worker terminals. Now kill one of them mid-execution; Resonate reassigns its in-flight workflow to a survivor.
Python:
git clone https://github.com/resonatehq-examples/example-load-balancing-py
cd example-load-balancing-py
uv sync

# Start the server.
brew install resonatehq/tap/resonate
resonate serve

# Start a worker (repeat in three terminals).
uv run python worker.py

# Dispatch six jobs from the client.
for i in 1 2 3 4 5 6; do uv run python invoke.py; done

Watch the work spread across the three worker terminals. Kill one of them mid-execution; Resonate reassigns its in-flight workflow to a survivor.
Rust:
git clone https://github.com/resonatehq-examples/example-load-balancing-rs
cd example-load-balancing-rs
cargo build

# Start the server.
brew install resonatehq/tap/resonate
resonate dev

# Start a worker.
cargo run --bin worker

# Dispatch six jobs from the client.
for i in 1 2 3 4 5 6; do cargo run --bin client; done

Go:
Unlike the other SDKs, the Go example runs the workers and the client in a single binary, so there is no separate terminal per worker. -workers controls how many worker instances it spins up.
git clone https://github.com/resonatehq-examples/example-load-balancing-go
cd example-load-balancing-go
go mod download

# Start the server.
brew install resonatehq/tap/resonate
resonate dev

# Run the workers and the client in one process.
go run . -url=http://localhost:8001 -workers=3 -tasks=6

The single process starts three worker instances and a client that dispatches six tasks across the group; the [worker-N] log lines show the work spreading. Increase -workers and -tasks to watch the distribution change. Because the workers share this one process, the kill-a-single-worker recovery story below is the one shown by the separate-process TypeScript, Python, and Rust examples.
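With larger -workers and -tasks values, counting the [worker-N] lines by eye gets tedious. The sketch below tallies them per worker. It assumes the line format produced by the Printf call in the Go worker shown earlier ("[worker-<id>] handling <task> → done"); the sample lines are made up for illustration and are not real output from the example.

```python
import re
from collections import Counter

# Matches the prefix written by the Go worker's Printf call.
LINE = re.compile(r"^\[worker-(?P<worker>[^\]]+)\] handling (?P<task>\S+)")


def tally(lines):
    """Count handled tasks per worker ID, ignoring lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[match.group("worker")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; real worker IDs and task names may differ.
    sample = [
        "[worker-1] handling task-1 → done",
        "[worker-2] handling task-2 → done",
        "[worker-1] handling task-3 → done",
        "[worker-3] handling task-4 → done",
        "some unrelated log line",
    ]
    for worker, count in sorted(tally(sample).items()):
        print(f"worker-{worker}: {count} task(s)")
```

Because tally accepts any iterable of lines, it can also be given sys.stdin so that the example's output can be piped into it.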
Try the recovery story
Start three workers and dispatch enough jobs to keep all of them busy. Kill the worker holding a long-running job. The Resonate Server detects the loss, reassigns the workflow's durable promise to one of the survivors, and the work continues. The client never sees an error — it just gets a slightly delayed result.
The TypeScript, Python, and Rust examples run each worker as its own process, so you can Ctrl-C a single one to watch this happen. The Go example runs its workers inside one process, so it demonstrates the distribution but not single-worker recovery. The recursive factorial example shows the Go crash-recovery story with a separate worker process.
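The recovery story can be reasoned about with a toy model. This guide does not say how the server detects a lost worker, so the sketch below assumes one common approach, a lease that expires when the owner stops renewing it, on a logical clock. The Server class and every name in it are hypothetical and do not reflect Resonate's API or internals.

```python
class Server:
    """Toy in-memory task store with lease-based claims (an assumption)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.tasks = {}  # task id -> {"owner", "lease_expires", "result"}

    def submit(self, task_id):
        self.tasks[task_id] = {"owner": None, "lease_expires": 0, "result": None}

    def claim(self, worker, now):
        """Hand the worker the first unfinished task that is unowned or expired."""
        for task_id, task in self.tasks.items():
            if task["result"] is not None:
                continue
            if task["owner"] is None or task["lease_expires"] <= now:
                task["owner"] = worker
                task["lease_expires"] = now + self.lease_seconds
                return task_id
        return None

    def heartbeat(self, worker, task_id, now):
        """Extend the lease, but only for the current owner."""
        task = self.tasks[task_id]
        if task["owner"] == worker:
            task["lease_expires"] = now + self.lease_seconds

    def complete(self, worker, task_id, result):
        """Record a result only from the current owner of an open task."""
        task = self.tasks[task_id]
        if task["owner"] == worker and task["result"] is None:
            task["result"] = result
            return True
        return False


if __name__ == "__main__":
    server = Server(lease_seconds=5)
    server.submit("job-1")

    print(server.claim("worker-a", now=0))   # worker-a takes job-1, then dies
    print(server.claim("worker-b", now=3))   # None: the lease is still live
    print(server.claim("worker-b", now=6))   # job-1: the lease expired at 5
    print(server.complete("worker-a", "job-1", "stale"))  # False: no longer owner
    print(server.complete("worker-b", "job-1", "done"))   # True
```

The task record stays open the whole time, which mirrors the durable promise in the description above: the caller waiting on job-1 sees only a later completion, not an error.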
Related
- Human-in-the-loop — same group dispatch, with a workflow that suspends on a durable promise.
- Async HTTP API endpoints — long-running HTTP work without holding the connection.