Skip to content

Observability

These are copy-pasteable patterns for adding observability to your workflows. They use the hooks and plugin system described in the Hooks & Plugins guide — no additional packages required.

@rytejs/otel

For production OpenTelemetry instrumentation, use the official plugin:

bash
pnpm add @rytejs/otel

One line to instrument a router with tracing and metrics:

ts
const router = new WorkflowRouter(taskWorkflow);
router.use(createOtelPlugin());

This automatically:

  • Creates a span per dispatch (ryte.dispatch.{commandType})
  • Records transitions and events as span events
  • Sets error status with category, code, and source attributes
  • Emits three metrics: ryte.dispatch.count, ryte.dispatch.duration, ryte.transition.count

By default it uses the global OpenTelemetry API. To use a specific tracer or meter:

ts
declare const trace: { getTracer(name: string): unknown };
declare const metrics: { getMeter(name: string): unknown };

const customRouter = new WorkflowRouter(taskWorkflow);
customRouter.use(
	createOtelPlugin({
		// biome-ignore lint/suspicious/noExplicitAny: external OTel Tracer type
		tracer: trace.getTracer("my-service") as any,
		// biome-ignore lint/suspicious/noExplicitAny: external OTel Meter type
		meter: metrics.getMeter("my-service") as any,
	}),
);

The patterns below are still useful if you want custom observability without the @rytejs/otel dependency.

Executor Tracing

createOtelExecutorMiddleware traces executor operations with span attributes for workflow ID and command type:

ts
const store = memoryStore();
const executor = new WorkflowExecutor(taskRouter, store);
executor.use(createOtelExecutorMiddleware());

// Traces executor operations:
// - ryte.execute.{commandType} spans for execute()
// - Attributes: ryte.workflow.id, ryte.command.type

Import from @rytejs/otel (the root export, no subpath).

Full Stack Tracing

Combine the router plugin with the executor plugin for end-to-end traces. The executor span wraps the router dispatch span:

ts
// Router-level: dispatch spans, transition events, metrics
const tracedRouter = new WorkflowRouter(taskWorkflow);
tracedRouter.use(createOtelPlugin());

// Executor-level: operation spans wrapping the router dispatch
const tracedStore = memoryStore();
const tracedExecutor = new WorkflowExecutor(tracedRouter, tracedStore);
tracedExecutor.use(createOtelExecutorMiddleware());

// End-to-end: executor span → router dispatch span → handler

Structured Logging

Captures command type, final state, success/failure, and duration on every dispatch.

ts
const startTimeKey = createKey<number>("startTime");

// biome-ignore lint/complexity/noBannedTypes: {} means "no deps", matching the router default
const loggingPlugin = definePlugin<TaskConfig, {}>((router) => {
	router.on("pipeline:start", ({ set }) => {
		set(startTimeKey, Date.now());
	});
	router.on("pipeline:end", ({ get, command, workflow }, result) => {
		const duration = Date.now() - get(startTimeKey);
		console.log(
			JSON.stringify({
				command: command.type,
				state: workflow.state,
				ok: result.ok,
				duration,
			}),
		);
	});
});

Because pipeline:end is guaranteed to fire whenever pipeline:start fires, the duration is always recorded — even when the handler throws an unexpected error.

OpenTelemetry Tracing

Creates a span per dispatch, sets its status based on the result, and ends it after the dispatch completes.

ts
// biome-ignore lint/suspicious/noExplicitAny: external OTel span type
const spanKey = createKey<any>("span");

// biome-ignore lint/complexity/noBannedTypes: {} means "no deps", matching the router default
const otelPlugin = definePlugin<TaskConfig, {}>((router) => {
	router.on("pipeline:start", ({ command, set }) => {
		const span = tracer.startSpan(`ryte.dispatch.${command.type}`);
		set(spanKey, span);
	});
	router.on("pipeline:end", ({ get }, result) => {
		const span = get(spanKey);
		span.setStatus({ code: result.ok ? SpanStatusCode.OK : SpanStatusCode.ERROR });
		span.end();
	});
});

Replace the commented import with your actual OpenTelemetry tracer setup. The span name includes the command type, making traces easy to filter by command.

Audit Trail

Records every state transition and every domain error to an external audit log.

ts
// biome-ignore lint/complexity/noBannedTypes: {} means "no deps", matching the router default
const auditPlugin = definePlugin<TaskConfig, {}>((router) => {
	router.on("transition", (from, to, workflow) => {
		auditLog.record({
			workflowId: workflow.id,
			from,
			to,
			timestamp: new Date(),
		});
	});
	router.on("error", (error, { workflow, command }) => {
		auditLog.record({
			workflowId: workflow.id,
			error: error.category,
			command: command.type,
			timestamp: new Date(),
		});
	});
});

The error hook fires for domain and validation errors — the same errors returned in the result rather than thrown. Unexpected errors (handler throws a non-domain, non-validation error) are not captured here; use pipeline:end with result.ok === false and result.error.category === "unexpected" if you need those. Unexpected errors are captured as { category: "unexpected", error, message } in the result, and pipeline:end always fires.

Metrics

Increments counters for every dispatch and every state transition, tagged with relevant dimensions.

ts
// biome-ignore lint/complexity/noBannedTypes: {} means "no deps", matching the router default
const metricsPlugin = definePlugin<TaskConfig, {}>((router) => {
	router.on("pipeline:end", ({ command, workflow }, result) => {
		metrics.increment("ryte.dispatch.total", {
			command: command.type,
			state: workflow.state,
			ok: String(result.ok),
		});
	});
	router.on("transition", (from, to) => {
		metrics.increment("ryte.transition.total", { from, to });
	});
});

Replace metrics.increment with whatever client your metrics backend provides (StatsD, Prometheus, Datadog, etc.). The tags give you per-command and per-state breakdown out of the box.

Registering Plugins

All four plugins above are registered the same way:

ts
const router = new WorkflowRouter(taskWorkflow);

router.use(loggingPlugin);
router.use(otelPlugin);
router.use(auditPlugin);
router.use(metricsPlugin);

See Hooks & Plugins for the full hook event reference and error isolation behaviour.