Telemetry
A TelemetrySpec defines what metrics, logs, and traces to collect for observability. ContractSpec automatically instruments your application based on these specs, ensuring you have the visibility you need to monitor, debug, and optimize your system.
Why telemetry matters
You can't fix what you can't see. Telemetry provides visibility into how your application is performing, where errors are occurring, and how users are interacting with your system. Without proper instrumentation, you're flying blind in production.
ContractSpec takes a spec-first approach to telemetry: you declare what you want to observe, and runtime adapters instrument operations automatically. This ensures consistent, comprehensive coverage without manual effort.
Three pillars of observability
Metrics
Numerical measurements collected over time. Examples: request count, error rate, latency percentiles, active users, queue depth. Metrics are cheap to collect and store, making them ideal for high-level monitoring and alerting.
Logs
Timestamped text records of events. Examples: "User 123 logged in", "Payment failed for order 456", "Database connection pool exhausted". Logs provide detailed context for debugging specific issues.
Traces
Records of requests as they flow through your system. A trace shows the complete path of a request—which services it touched, how long each step took, and where errors occurred. Traces are essential for debugging distributed systems.
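To make this concrete, a trace can be pictured as a tree of timed spans. Here is a minimal sketch; the `Span` shape and the `slowestPath` helper are illustrative, not ContractSpec's actual trace model:

```typescript
// Illustrative span shape: each span records one step of a request.
interface Span {
  name: string;       // operation name, e.g. 'createOrder'
  startMs: number;    // start time (ms since request start)
  durationMs: number; // how long the step took
  children: Span[];   // steps this one triggered
  error?: string;     // set when the step failed
}

// Walk the tree, always descending into the slowest child,
// to find the critical path of a request.
function slowestPath(span: Span): string[] {
  if (span.children.length === 0) return [span.name];
  const slowest = span.children.reduce((a, b) =>
    a.durationMs >= b.durationMs ? a : b
  );
  return [span.name, ...slowestPath(slowest)];
}

const trace: Span = {
  name: 'createOrder',
  startMs: 0,
  durationMs: 120,
  children: [
    { name: 'validateCart', startMs: 1, durationMs: 10, children: [] },
    { name: 'processPayment', startMs: 12, durationMs: 95, children: [] },
  ],
};

console.log(slowestPath(trace)); // ['createOrder', 'processPayment']
```

This is exactly the question traces answer in production: of all the steps a request touched, which ones dominated its latency.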
Example TelemetrySpec
Here's how telemetry is configured in TypeScript:
```typescript
import { defineTelemetry } from '@lssm/lib.contracts';

export const OrderProcessingTelemetry = defineTelemetry({
  meta: {
    name: 'order.processing.observability',
    version: 1,
  },
  metrics: [
    {
      name: 'orders_created_total',
      type: 'counter',
      description: 'Total number of orders created',
      labels: ['status', 'payment_method'],
    },
    {
      name: 'order_processing_duration_seconds',
      type: 'histogram',
      description: 'Time to process an order',
      buckets: [0.1, 0.5, 1.0, 2.0, 5.0, 10.0],
      labels: ['status'],
    },
  ],
  traces: [
    {
      operation: 'createOrder',
      sampleRate: 1.0, // Trace 100% of requests
      includeInputs: true,
      includeOutputs: true,
      redactFields: ['creditCard', 'ssn'],
    },
    {
      operation: 'processPayment',
      sampleRate: 0.1, // Trace 10% of requests
      includeInputs: false, // Don't log payment details
    },
  ],
  alerts: [
    {
      name: 'high-error-rate',
      condition: 'error_rate > 0.05',
      duration: '5m',
      severity: 'critical',
      notify: ['pagerduty', 'slack'],
    },
  ],
});
```

Automatic instrumentation
ContractSpec automatically instruments:
- All operations – Request count, latency, error rate per Command/Query
- All workflows – Step execution time, retry counts, compensation events
- All data views – Query execution time, result set size
- All policy decisions – Decision time, permit/deny ratio
- System resources – CPU, memory, disk, network usage
You don't need to add instrumentation code manually—the runtime handles it based on your specs.
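As an illustration of what the runtime does on your behalf, here is a rough sketch of how an adapter might wrap an operation handler to record request count, latency, and errors. The `instrument` wrapper and in-memory metric store below are hypothetical stand-ins, not ContractSpec internals:

```typescript
type Handler<I, O> = (input: I) => Promise<O>;

// Hypothetical in-memory metric store standing in for a real backend.
const metrics = { count: 0, errors: 0, totalMs: 0 };

// Wrap a handler so every call is counted and timed, and failures
// are tallied, without touching the handler's own logic.
function instrument<I, O>(name: string, handler: Handler<I, O>): Handler<I, O> {
  return async (input: I) => {
    const start = Date.now();
    metrics.count += 1;
    try {
      return await handler(input);
    } catch (err) {
      metrics.errors += 1;
      throw err; // re-throw: instrumentation must not swallow errors
    } finally {
      metrics.totalMs += Date.now() - start;
    }
  };
}

// Usage: the wrapped handler behaves identically but is now measured.
const createOrder = instrument('createOrder', async (items: string[]) => {
  if (items.length === 0) throw new Error('empty cart');
  return { orderId: 'ord_1', items };
});
```

Because the wrapper is derived from the spec rather than written by hand, every operation gets the same coverage with the same label conventions.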
Integration with observability platforms
ContractSpec supports multiple observability backends:
- Prometheus – For metrics collection and alerting
- Grafana – For dashboards and visualization
- Jaeger / Tempo – For distributed tracing
- Loki – For log aggregation
- Datadog – All-in-one observability platform
- New Relic – Application performance monitoring
- Honeycomb – Observability for complex systems
You can configure multiple backends and send telemetry to all of them simultaneously.
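Conceptually, multi-backend export is fan-out: the runtime hands each telemetry event to every configured exporter, and one failing backend must not block the others. A minimal sketch, where the `Exporter` interface and the toy backends are illustrative:

```typescript
interface MetricEvent {
  name: string;
  value: number;
  labels?: Record<string, string>;
}

interface Exporter {
  export(event: MetricEvent): void;
}

class FanOutExporter implements Exporter {
  constructor(private backends: Exporter[]) {}

  export(event: MetricEvent): void {
    for (const backend of this.backends) {
      // Isolate failures: a broken backend should not lose telemetry
      // for the healthy ones.
      try {
        backend.export(event);
      } catch {
        /* log and continue */
      }
    }
  }
}

// Two toy backends that just record what they receive.
const received: string[] = [];
const prometheusLike: Exporter = { export: (e) => received.push(`prom:${e.name}`) };
const datadogLike: Exporter = { export: (e) => received.push(`dd:${e.name}`) };

const exporter = new FanOutExporter([prometheusLike, datadogLike]);
exporter.export({ name: 'orders_created_total', value: 1 });
// received: ['prom:orders_created_total', 'dd:orders_created_total']
```

A real adapter would batch and serialize events per backend protocol, but the fan-out shape is the same.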
Sampling and performance
Collecting telemetry has a cost—CPU, memory, network bandwidth, and storage. ContractSpec provides several mechanisms to control overhead:
- Sampling – Trace only a percentage of requests (e.g., 10%)
- Adaptive sampling – Automatically reduce sampling rate under high load
- Tail-based sampling – Keep traces for failed requests, sample successful ones
- Field redaction – Remove sensitive data from traces and logs
- Aggregation – Pre-aggregate metrics before sending to reduce network traffic
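These strategies compose. Tail-based sampling, for example, defers the keep/drop decision until a trace completes, so errors and slow requests are always retained while ordinary traffic is sampled down. A sketch of that decision, with illustrative thresholds:

```typescript
interface CompletedTrace {
  durationMs: number;
  hasError: boolean;
}

// Keep all failed or slow traces; sample the rest at `sampleRate`.
// `rand` is injectable so the decision can be tested deterministically.
function keepTrace(
  trace: CompletedTrace,
  slowThresholdMs: number,
  sampleRate: number,
  rand: () => number = Math.random
): boolean {
  if (trace.hasError) return true;                      // always keep failures
  if (trace.durationMs >= slowThresholdMs) return true; // always keep slow requests
  return rand() < sampleRate;                           // sample the healthy majority
}

keepTrace({ durationMs: 40, hasError: true }, 1000, 0.1);            // kept: error
keepTrace({ durationMs: 2500, hasError: false }, 1000, 0.1);         // kept: slow
keepTrace({ durationMs: 40, hasError: false }, 1000, 0.1, () => 0.9); // dropped
```

The trade-off is buffering: the collector must hold spans in memory until the trace finishes before it can decide.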
Best practices
- Start with high-level metrics (request rate, error rate, latency) and add more detailed instrumentation as needed.
- Use structured logging—log events with structured fields, not free-form text.
- Set up alerts for critical metrics so you're notified when things go wrong.
- Use traces to debug complex issues—they show the complete picture of a request.
- Redact sensitive data from logs and traces to comply with privacy regulations.
- Review dashboards regularly to understand normal behavior—this makes anomalies easier to spot.
- Use sampling to control costs, but always trace errors and slow requests.
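To illustrate the structured-logging practice above: instead of interpolating values into a message string, emit an event name plus typed fields, so logs can be filtered and aggregated by field. The `logEvent` helper here is a sketch, not a ContractSpec API:

```typescript
// Unstructured: hard to query.
//   log(`Payment failed for order ${orderId}: ${reason}`);

// Structured: every field is independently queryable.
function logEvent(event: string, fields: Record<string, unknown>): string {
  return JSON.stringify({ event, ts: new Date().toISOString(), ...fields });
}

const line = logEvent('payment.failed', {
  orderId: 'ord_456',
  reason: 'card_declined',
  amountCents: 4999,
});
// e.g. {"event":"payment.failed","ts":"...","orderId":"ord_456",...}
```

With this shape, a query like "all `payment.failed` events where `reason` is `card_declined`" is a field filter rather than a fragile text search.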