Logging
Both the Resonate Server and Resonate SDK emit structured logs that can help you observe and diagnose the behavior of your application.
Server logs
Resonate emits structured logs through Go’s slog package.
When the Server starts, it installs a text handler that writes key/value records to standard output at the operator-selected minimum log level.
Configuring the log level
Available log levels: debug, info, warn, and error.
You can set the log level via the resonate.yml configuration file.
The default is info if the field is omitted.
logLevel: debug
You can also set the log level via the --log-level CLI flag when starting the server:
resonate serve --log-level debug
The CLI flag takes precedence over the configuration file.
Resonate validates the value before applying it; an unrecognized value produces a "failed to parse log level" error and prevents startup.
Log levels and common messages
Debug – detailed flow diagnostics
Enabled with logLevel: debug.
Useful for tracing queue activity or individual requests.
- API queues: api:sqe:enqueue, api:sqe:dequeue, api:cqe:enqueue, and api:cqe:dequeue show requests moving through the API buffers with request IDs and payload metadata.
- AIO queues: aio:sqe:enqueue, aio:cqe:enqueue, and aio:cqe:dequeue mirror queue flow inside asynchronous subsystems.
- Scheduler lifecycle: scheduler:add and scheduler:rmv mark coroutine scheduling, including coroutine name and generated ID.
- Per-request traces: the HTTP middleware logs the method, URL, and status; the gRPC interceptor logs the method name and any returned error.
Info – lifecycle and service announcements
Emitted at the default info level; setting the level to warn or error suppresses these messages.
- Startup of subsystems: starting http server, starting grpc server, and starting poll server announce listener addresses when the respective subsystems come online.
- Metrics endpoint: starting metrics server indicates the Prometheus exporter is listening.
- Controlled shutdowns: shutdown signal received, shutting down records the signal value before the server begins graceful cleanup.
Warn – recoverable or throttling conditions
Warnings surface when Resonate recovers automatically but an operator may want to intervene.
- Capacity limits: scheduler queue full fires when coroutine capacity is exhausted, signaling the need to increase buffers or investigate load spikes.
- Metrics shutdown issues: error stopping metrics server appears if the Prometheus endpoint does not close cleanly during shutdown.
- Task processing quirks: Warnings such as failed to parse task, failed to parse promise, error decoding task, or failed to enqueue task highlight malformed data or transient delivery issues detected by background workers.
- Router misses: failed to match promise is logged when no routing rule claims a new promise; Resonate continues by creating the promise without a task.
Error – actionable failures
Error logs identify conditions that usually require operator action.
- Critical startup failures: failed to start api, failed to start aio, or control loop failed abort the serve command and indicate fatal initialization issues.
- Shutdown triggers: api error received, shutting down or aio error received, shutting down explain why an emergency shutdown began.
- Data-layer problems: Errors such as failed to read promise propagate exceptions returned by the storage layer, including the triggering command for root-cause analysis.
Log output format
Logs are written to stdout in key-value format:
time=2026-02-04T10:30:00.123Z level=INFO msg="starting http server" addr=":8001"
time=2026-02-04T10:30:01.456Z level=INFO msg="starting grpc server" addr=":50051"
time=2026-02-04T10:30:01.789Z level=INFO msg="starting metrics server" addr=":9090"
Fields:
- time - ISO 8601 timestamp with millisecond precision
- level - Log level (DEBUG, INFO, WARN, ERROR)
- msg - Human-readable message
- Additional fields - Context-specific data (addr, id, method, status, etc.)
SDK logs
The Resonate SDKs also emit logs for observing application behavior.
TypeScript SDK
The TypeScript SDK uses console logging by default:
import { Resonate } from "@resonatehq/sdk";
const resonate = Resonate.remote({
  url: "http://localhost:8001",
  logLevel: "debug", // debug | info | warn | error
});
What gets logged:
- Function execution (start, completion, errors)
- Context operations (ctx.run(), ctx.sleep(), etc.)
- RPC calls to workers
- Promise resolution attempts
- Retry attempts and failures
Python SDK
The Python SDK uses Python's standard logging module:
import logging
from resonate import Resonate
# Configure Python logging
logging.basicConfig(level=logging.INFO)
resonate = Resonate.remote(
    url="http://localhost:8001",
    log_level="info",  # debug | info | warn | error
)
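Because the SDK logs through the standard logging module, you can shape its output with an ordinary handler and formatter. A minimal sketch that writes a key=value style loosely modeled on the server's output (the format string is an illustrative choice, not something the SDK requires):
import logging

# Send log records to stdout in a key=value style similar to the server's.
logging.basicConfig(
    level=logging.INFO,
    format='time=%(asctime)s level=%(levelname)s logger=%(name)s msg="%(message)s"',
    datefmt="%Y-%m-%dT%H:%M:%S%z",
)

# If the SDK's loggers are namespaced under "resonate" (an assumption; check your
# SDK version), you can raise only that namespace to debug while keeping the rest at info:
# logging.getLogger("resonate").setLevel(logging.DEBUG)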
Production logging patterns
Log aggregation
In production, collect logs from all servers and workers into a centralized system:
Common patterns:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Datadog logs
- CloudWatch Logs (AWS)
- Google Cloud Logging
- Azure Monitor Logs
- Grafana Loki (lightweight alternative)
Docker / Kubernetes
Docker Compose:
services:
  resonate-server:
    image: resonatehq/resonate:latest
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
Kubernetes:
Kubernetes captures container stdout automatically. Forward those logs to your aggregation system, for example with Datadog's log collection annotations:
apiVersion: v1
kind: Pod
metadata:
  name: resonate-server
  annotations:
    # Datadog log collection (the annotation key must name the container)
    ad.datadoghq.com/server.logs: '[{"source":"resonate","service":"resonate-server"}]'
spec:
  containers:
    - name: server
      image: resonatehq/resonate:latest
Or use Fluent Bit / Fluentd as a DaemonSet to forward logs to your aggregation system.
Structured logging for analysis
Parse structured logs into fields for querying:
Logstash filter example:
filter {
  grok {
    match => { "message" => "time=%{TIMESTAMP_ISO8601:timestamp} level=%{WORD:level} msg=\"%{DATA:message}\"" }
  }
}
Query examples (CloudWatch Insights):
# Find all errors
fields @timestamp, level, msg
| filter level = "ERROR"
| sort @timestamp desc
# Count warnings by type
fields msg
| filter level = "WARN"
| stats count() by msg
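For quick ad-hoc analysis without a log pipeline, the same key=value records can be split into fields with a short script. A minimal sketch in Python that reads server log lines from stdin (field names follow the format shown above):
import shlex
import sys

def parse_line(line: str) -> dict:
    """Split a key=value log line into a dict; shlex keeps quoted values intact."""
    fields = {}
    for token in shlex.split(line):
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    return fields

# Example: print the timestamp and message of every ERROR record on stdin.
for raw in sys.stdin:
    record = parse_line(raw.strip())
    if record.get("level") == "ERROR":
        print(record.get("time"), record.get("msg"))
Pipe in a saved log file (or the output of resonate serve) to try it.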
What to log and monitor
Critical events
Always monitor these log messages:
Server startup failures:
level=ERROR msg="failed to start api"
level=ERROR msg="failed to start aio"
Action: Check configuration and database connectivity.
Worker heartbeat failures:
level=WARN msg="worker heartbeat timeout" workerId="abc123"
Action: Investigate worker health. Normal during rolling deployments, concerning if sustained.
Database errors:
level=ERROR msg="failed to read promise"
level=ERROR msg="failed to write promise"
Action: Check database health and connection pool.
Capacity issues:
level=WARN msg="scheduler queue full"
Action: Increase server resources or optimize workload.
Normal operational events
These logs indicate healthy operation:
level=INFO msg="starting http server"
level=INFO msg="starting grpc server"
level=INFO msg="shutdown signal received, shutting down"
Log retention and storage
Development
- Retention: 1-7 days
- Level: debug or info
- Storage: Local files or stdout
Staging
- Retention: 7-30 days
- Level: info
- Storage: Centralized log aggregation
Production
- Retention: 30-90 days (or per compliance requirements)
- Level: info (use debug temporarily for troubleshooting)
- Storage: Centralized log aggregation with archival to object storage (S3, GCS)
Performance considerations
Log volume
Debug logging produces significant volume. In production:
- Use info by default
- Enable debug temporarily when troubleshooting
- Monitor log storage costs
Estimate: Debug logging can produce 10-100x more log data than info level.
Log sampling
For very high-throughput systems, consider sampling:
# Hypothetical config (not currently supported)
logSampling:
  enabled: true
  rate: 0.1  # Log 10% of requests at debug level
Alternative: Use tracing (see Tracing) for detailed execution visibility without overwhelming logs.
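Server-side sampling is not available today (the config above is hypothetical), but if SDK debug output is the main source of volume, you can approximate sampling on the Python side with a standard logging filter. A sketch, assuming you attach it to whichever handler receives SDK records:
import logging
import random

class DebugSampler(logging.Filter):
    """Pass every record at INFO and above, but only a fraction of DEBUG records."""

    def __init__(self, rate: float = 0.1):
        super().__init__()
        self.rate = rate

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno > logging.DEBUG:
            return True
        return random.random() < self.rate

handler = logging.StreamHandler()
handler.addFilter(DebugSampler(rate=0.1))  # keep roughly 10% of debug records
logging.basicConfig(level=logging.DEBUG, handlers=[handler])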
Correlating logs across components
Use request IDs to trace requests across server and workers:
Server logs:
level=INFO msg="api:sqe:enqueue" requestId="req-abc123" method="POST" path="/promises"
SDK logs:
level=INFO msg="promise created" requestId="req-abc123" promiseId="order.123"
Search logs by requestId to see the full request lifecycle.
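If you are working from raw files rather than an aggregation system, a short script can stitch the lifecycle together. A sketch that collects every line mentioning a request ID across several log files and prints them in timestamp order (the file paths and request ID are placeholders, and the sort assumes every line starts with the time= field):
from pathlib import Path

REQUEST_ID = 'requestId="req-abc123"'  # placeholder request ID
LOG_FILES = ["server.log", "worker-1.log", "worker-2.log"]  # placeholder paths

matches = []
for path in LOG_FILES:
    for line in Path(path).read_text().splitlines():
        if REQUEST_ID in line:
            matches.append(line)

# Lines begin with time=<ISO 8601 timestamp>, so a plain sort is chronological.
for line in sorted(matches):
    print(line)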
Common debugging scenarios
Task not being processed
Look for:
- Worker registration: starting poll server (server) + connection logs (worker)
- Task creation: api:sqe:enqueue with promise/task IDs
- Task routing: Check for failed to match promise warnings
- Worker heartbeat: Look for heartbeat timeout warnings
Promise stuck pending
Look for:
- Promise creation: api:sqe:enqueue with promiseId
- Task assignment: Check if task was created and routed
- Worker processing: Worker should log function execution start
- Completion: Look for promise resolution logs
Slow performance
Look for:
- scheduler queue full - Capacity exhausted
- Database query latency - Check database logs
- High request volume - Count api:sqe:enqueue per second (a counting sketch follows this list)
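To put a number on request volume, you can bucket api:sqe:enqueue records by second from a saved debug-level log. A rough sketch (the file name is a placeholder):
from collections import Counter

per_second = Counter()
with open("resonate-debug.log") as f:  # placeholder file name
    for line in f:
        if "api:sqe:enqueue" in line and line.startswith("time="):
            # time=2026-02-04T10:30:00.123Z ... -> bucket on the second (first 19 chars)
            timestamp = line.split()[0].removeprefix("time=")
            per_second[timestamp[:19]] += 1

for second, count in sorted(per_second.items()):
    print(second, count)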
Best practices
- Start with info level - Debug is too verbose for production
- Use structured logging - Parse key-value pairs for analysis
- Aggregate centrally - Don't rely on local log files
- Set up alerts - Monitor critical error patterns
- Retain logs adequately - Balance cost vs troubleshooting needs
- Correlate with metrics - Cross-reference logs with metrics for complete picture
- Test log queries - Ensure you can find what you need during incidents
Summary
For development:
- Use debug or info level
- Logs to stdout are fine
- Focus on understanding normal behavior
For production:
- Use info level (enable debug only when troubleshooting)
- Centralize logs (ELK, Datadog, CloudWatch, etc.)
- Alert on critical errors (startup failures, database errors)
- Retain logs 30-90 days minimum
- Correlate logs with metrics and traces
Logs are your debugging lifeline. Set them up properly from day one.