Observe your deployment
Within the context of Resonate, there are several ways to observe the behavior of your application and diagnose issues.
Server logs
Resonate emits structured logs through Go’s slog package.
When the Server starts, it installs a text handler that writes key/value records to standard output at the operator-selected minimum log level.
Configuring the log level
Log levels: debug, info, warn, or error.
You can set the log level via the resonate.yml configuration file. The default is info if the field is omitted.
logLevel: debug
You can also set the log level via the --log-level CLI flag when starting the server:
resonate serve --log-level debug
The CLI flag takes precedence over the configuration file.
Resonate validates the value before applying it; an unrecognized value fails startup with the error "failed to parse log level".
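For instance (an illustrative invocation; exact error formatting may vary by version):

resonate serve --log-level verbose
# startup aborts with: failed to parse log level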
Log levels and common messages
Debug – detailed flow diagnostics
Enabled with logLevel: debug.
Useful for tracing queue activity or individual requests.
- API queues: api:sqe:enqueue, api:sqe:dequeue, api:cqe:enqueue, and api:cqe:dequeue show requests moving through the API buffers with request IDs and payload metadata.
- AIO queues: aio:sqe:enqueue, aio:cqe:enqueue, and aio:cqe:dequeue mirror queue flow inside asynchronous subsystems.
- Scheduler lifecycle: scheduler:add and scheduler:rmv mark coroutine scheduling, including coroutine name and generated ID.
- Per-request traces: HTTP middleware logs http method, URL, and status; the gRPC interceptor logs grpc method names and returned errors.
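To give a sense of shape, a debug record from slog's text handler looks roughly like the line below; the timestamp and the id attribute are illustrative assumptions, only the msg value comes from the list above.

time=2024-01-01T12:00:00.000Z level=DEBUG msg=api:sqe:enqueue id=req-42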
Info – lifecycle and service announcements
Emitted under the default info level; setting the level to warn or error suppresses these messages.
- Startup of subsystems: starting http server, starting grpc server, and starting poll server announce listener addresses when the respective subsystems come online.
- Metrics endpoint: starting metrics server indicates the Prometheus exporter is listening.
- Controlled shutdowns: shutdown signal received, shutting down records the signal value before the server begins graceful cleanup.
Warn – recoverable or throttling conditions
Warnings surface when Resonate recovers automatically but an operator may want to intervene.
- Capacity limits: scheduler queue full fires when coroutine capacity is exhausted, signaling the need to increase buffers or investigate load spikes.
- Metrics shutdown issues: error stopping metrics server appears if the Prometheus endpoint does not close cleanly during shutdown.
- Task processing quirks: Warnings such as failed to parse task, failed to parse promise, error decoding task, or failed to enqueue task highlight malformed data or transient delivery issues detected by background workers.
- Router misses: failed to match promise is logged when no routing rule claims a new promise; Resonate continues by creating the promise without a task.
Error – actionable failures
Error logs identify conditions that usually require operator action.
- Critical startup failures: failed to start api, failed to start aio, or control loop failed abort the serve command and indicate fatal initialization issues.
- Shutdown triggers: api error received, shutting down or aio error received, shutting down explain why an emergency shutdown began.
- Data-layer problems: Errors such as failed to read promise propagate exceptions returned by the storage layer, including the triggering command for root-cause analysis.
Metrics
The Resonate Server exposes a Prometheus-compatible metrics endpoint at :9090/metrics.
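To inspect the endpoint directly, you can scrape it with curl (assuming the Server is running locally with the default metrics port):

curl http://localhost:9090/metrics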
The following metrics are available:
# HELP aio_connection number of aio subsystem connections
# TYPE aio_connection gauge
aio_connection{type="sender:poll"} 0
# HELP aio_in_flight_submissions number of in flight aio submissions
# TYPE aio_in_flight_submissions gauge
aio_in_flight_submissions{type="store"} 0
# HELP aio_total_submissions total number of aio submissions
# TYPE aio_total_submissions counter
aio_total_submissions{status="success",type="store"} 0
# HELP aio_worker_count number of aio subsystem workers
# TYPE aio_worker_count gauge
aio_worker_count{type="router"} 0
aio_worker_count{type="sender"} 0
aio_worker_count{type="sender:http"} 0
aio_worker_count{type="sender:poll"} 0
aio_worker_count{type="store:sqlite"} 0
# HELP aio_worker_in_flight_submissions number of in flight aio submissions
# TYPE aio_worker_in_flight_submissions gauge
aio_worker_in_flight_submissions{type="router",worker="0"} 0
aio_worker_in_flight_submissions{type="sender",worker="0"} 0
aio_worker_in_flight_submissions{type="sender:http",worker="0"} 0
aio_worker_in_flight_submissions{type="sender:poll",worker="0"} 0
aio_worker_in_flight_submissions{type="store:sqlite",worker="0"} 0
# HELP coroutines_in_flight number of in flight coroutines
# TYPE coroutines_in_flight gauge
coroutines_in_flight{type="EnqueueTasks"} 0
coroutines_in_flight{type="SchedulePromises"} 0
coroutines_in_flight{type="TimeoutLocks"} 0
coroutines_in_flight{type="TimeoutPromises"} 0
coroutines_in_flight{type="TimeoutTasks"} 0
# ... For all coroutine types
# HELP coroutines_total total number of coroutines
# TYPE coroutines_total counter
coroutines_total{type="EnqueueTasks"} 0
coroutines_total{type="SchedulePromises"} 0
coroutines_total{type="TimeoutLocks"} 0
coroutines_total{type="TimeoutPromises"} 0
coroutines_total{type="TimeoutTasks"} 0
# ... For all coroutine types
The aio prefix refers to operations that leave the Server, such as requests to the store and tasks sent to nodes. Coroutines are the units of business logic within the Server.
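Once Prometheus is scraping these metrics (see the next section), a few PromQL queries built from the names above make a reasonable starting point; treat them as sketches rather than prescribed dashboards.

# Per-second rate of coroutine executions over the last 5 minutes, by type.
rate(coroutines_total[5m])

# AIO submissions currently in flight, summed across subsystem types.
sum(aio_in_flight_submissions)

# Success ratio of store submissions over the last 5 minutes.
sum(rate(aio_total_submissions{status="success",type="store"}[5m]))
  / sum(rate(aio_total_submissions{type="store"}[5m]))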
Using Prometheus and Grafana
Grafana works by pulling data from a timeseries database; in our case that database is Prometheus. Prometheus, in turn, works by scraping the Resonate Server every X seconds, where X is the scrape_interval in its config.
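In short, the data flow looks like this:

Grafana --(PromQL queries)--> Prometheus --(scrapes /metrics on each scrape_interval)--> Resonate Server (:9090)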
Download Prometheus and Grafana from their websites and follow the instructions to install each, whether via Docker or a direct binary download:
- Grafana: https://grafana.com/grafana/download
- Prometheus: https://prometheus.io/download/
Run Grafana.
./grafana server
Define a prometheus.yml config file so Prometheus knows how to connect to the Resonate Server.
# prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# here it's the Resonate Server.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "Resonate"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
        # Labels here are attached as `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "resonate"
The Resonate-specific part is the scrape_configs section.
Run Prometheus with the config file you just created, specifying port 9091 in the --web.listen-address flag to avoid a port conflict with the Resonate Server. (You can customize the port on which Resonate exposes metrics via the Resonate Server config.)
./prometheus --config.file=prometheus.yml --web.listen-address=:9091
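Once Prometheus is up, its built-in targets page (http://localhost:9091/targets with the flags above) should list the Resonate job as UP, confirming that scraping works.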
In the Grafana UI, add Prometheus as a data source.
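If you prefer configuration files over the UI, Grafana can also provision the data source from YAML. A minimal sketch, assuming Prometheus listens on 9091 as above; the file goes in Grafana's provisioning/datasources directory:

# datasources.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9091
    isDefault: true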