Telemetry
A TelemetrySpec defines what metrics, logs, and traces to collect for observability. ContractSpec automatically instruments your application based on these specs, ensuring you have the visibility you need to monitor, debug, and optimize your system.
Why telemetry matters
You can't fix what you can't see. Telemetry provides visibility into how your application is performing, where errors are occurring, and how users are interacting with your system. Without proper instrumentation, you're flying blind in production.
ContractSpec takes a spec-first approach to telemetry: you declare what you want to observe, and runtime adapters instrument operations automatically. This ensures consistent, comprehensive coverage without manual effort.
Three pillars of observability
Metrics
Numerical measurements collected over time. Examples: request count, error rate, latency percentiles, active users, queue depth. Metrics are cheap to collect and store, making them ideal for high-level monitoring and alerting.
Logs
Timestamped text records of events. Examples: "User 123 logged in", "Payment failed for order 456", "Database connection pool exhausted". Logs provide detailed context for debugging specific issues.
Traces
Records of requests as they flow through your system. A trace shows the complete path of a request—which services it touched, how long each step took, and where errors occurred. Traces are essential for debugging distributed systems.
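To make this concrete, a trace can be pictured as a tree of timed spans. Here is a minimal sketch; the `Span` shape and the `slowestPath` helper are illustrative, not ContractSpec's actual trace model:

```typescript
// Illustrative span shape: each span records one step of a request.
interface Span {
  name: string;       // operation name, e.g. 'createOrder'
  startMs: number;    // start time (ms since request start)
  durationMs: number; // how long the step took
  children: Span[];   // steps this one triggered
  error?: string;     // set when the step failed
}

// Walk the tree, always descending into the slowest child,
// to find the critical path of a request.
function slowestPath(span: Span): string[] {
  if (span.children.length === 0) return [span.name];
  const slowest = span.children.reduce((a, b) =>
    a.durationMs >= b.durationMs ? a : b
  );
  return [span.name, ...slowestPath(slowest)];
}

const trace: Span = {
  name: 'createOrder',
  startMs: 0,
  durationMs: 120,
  children: [
    { name: 'validateCart', startMs: 1, durationMs: 10, children: [] },
    { name: 'processPayment', startMs: 12, durationMs: 95, children: [] },
  ],
};

console.log(slowestPath(trace)); // ['createOrder', 'processPayment']
```

This is exactly the question traces answer in production: of all the steps a request touched, which ones dominated its latency.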
Example TelemetrySpec
Here's how telemetry is configured in TypeScript:
```typescript
import { defineTelemetry } from '@lssm/lib.contracts';

export const OrderProcessingTelemetry = defineTelemetry({
  meta: {
    name: 'order.processing.observability',
    version: 1,
  },
  metrics: [
    {
      name: 'orders_created_total',
      type: 'counter',
      description: 'Total number of orders created',
      labels: ['status', 'payment_method'],
    },
    {
      name: 'order_processing_duration_seconds',
      type: 'histogram',
      description: 'Time to process an order',
      buckets: [0.1, 0.5, 1.0, 2.0, 5.0, 10.0],
      labels: ['status'],
    },
  ],
  traces: [
    {
      operation: 'createOrder',
      sampleRate: 1.0, // Trace 100% of requests
      includeInputs: true,
      includeOutputs: true,
      redactFields: ['creditCard', 'ssn'],
    },
    {
      operation: 'processPayment',
      sampleRate: 0.1, // Trace 10% of requests
      includeInputs: false, // Don't log payment details
    },
  ],
  alerts: [
    {
      name: 'high-error-rate',
      condition: 'error_rate > 0.05',
      duration: '5m',
      severity: 'critical',
      notify: ['pagerduty', 'slack'],
    },
  ],
});
```

Automatic instrumentation
ContractSpec automatically instruments:
- All operations – Request count, latency, error rate per Command/Query
- All workflows – Step execution time, retry counts, compensation events
- All data views – Query execution time, result set size
- All policy decisions – Decision time, permit/deny ratio
- System resources – CPU, memory, disk, network usage
You don't need to add instrumentation code manually—the runtime handles it based on your specs.
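As an illustration of what the runtime does on your behalf, here is a rough sketch of how an adapter might wrap an operation handler to record request count, latency, and errors. The `instrument` wrapper and in-memory metric store below are hypothetical stand-ins, not ContractSpec internals:

```typescript
type Handler<I, O> = (input: I) => Promise<O>;

// Hypothetical in-memory metric store standing in for a real backend.
const metrics = { count: 0, errors: 0, totalMs: 0 };

// Wrap a handler so every call is counted and timed, and failures
// are tallied, without touching the handler's own logic.
function instrument<I, O>(name: string, handler: Handler<I, O>): Handler<I, O> {
  return async (input: I) => {
    const start = Date.now();
    metrics.count += 1;
    try {
      return await handler(input);
    } catch (err) {
      metrics.errors += 1;
      throw err; // re-throw: instrumentation must not swallow errors
    } finally {
      metrics.totalMs += Date.now() - start;
    }
  };
}

// Usage: the wrapped handler behaves identically but is now measured.
const createOrder = instrument('createOrder', async (items: string[]) => {
  if (items.length === 0) throw new Error('empty cart');
  return { orderId: 'ord_1', items };
});
```

Because the wrapper is derived from the spec rather than written by hand, every operation gets the same coverage with the same label conventions.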
Integration with observability platforms
ContractSpec supports multiple observability backends:
- Prometheus – For metrics collection and alerting
- Grafana – For dashboards and visualization
- Jaeger / Tempo – For distributed tracing
- Loki – For log aggregation
- Datadog – All-in-one observability platform
- New Relic – Application performance monitoring
- Honeycomb – Observability for complex systems
You can configure multiple backends and send telemetry to all of them simultaneously.
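Conceptually, multi-backend export is fan-out: the runtime hands each telemetry event to every configured exporter, and one failing backend must not block the others. A minimal sketch, where the `Exporter` interface and the toy backends are illustrative:

```typescript
interface MetricEvent {
  name: string;
  value: number;
  labels?: Record<string, string>;
}

interface Exporter {
  export(event: MetricEvent): void;
}

class FanOutExporter implements Exporter {
  constructor(private backends: Exporter[]) {}

  export(event: MetricEvent): void {
    for (const backend of this.backends) {
      // Isolate failures: a broken backend should not lose telemetry
      // for the healthy ones.
      try {
        backend.export(event);
      } catch {
        /* log and continue */
      }
    }
  }
}

// Two toy backends that just record what they receive.
const received: string[] = [];
const prometheusLike: Exporter = { export: (e) => received.push(`prom:${e.name}`) };
const datadogLike: Exporter = { export: (e) => received.push(`dd:${e.name}`) };

const exporter = new FanOutExporter([prometheusLike, datadogLike]);
exporter.export({ name: 'orders_created_total', value: 1 });
// received: ['prom:orders_created_total', 'dd:orders_created_total']
```

A real adapter would batch and serialize events per backend protocol, but the fan-out shape is the same.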
Sampling and performance
Collecting telemetry has a cost—CPU, memory, network bandwidth, and storage. ContractSpec provides several mechanisms to control overhead:
- Sampling – Trace only a percentage of requests (e.g., 10%)
- Adaptive sampling – Automatically reduce sampling rate under high load
- Tail-based sampling – Keep traces for failed requests, sample successful ones
- Field redaction – Remove sensitive data from traces and logs
- Aggregation – Pre-aggregate metrics before sending to reduce network traffic
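These strategies compose. Tail-based sampling, for example, defers the keep/drop decision until a trace completes, so errors and slow requests are always retained while ordinary traffic is sampled down. A sketch of that decision, with illustrative thresholds:

```typescript
interface CompletedTrace {
  durationMs: number;
  hasError: boolean;
}

// Keep all failed or slow traces; sample the rest at `sampleRate`.
// `rand` is injectable so the decision can be tested deterministically.
function keepTrace(
  trace: CompletedTrace,
  slowThresholdMs: number,
  sampleRate: number,
  rand: () => number = Math.random
): boolean {
  if (trace.hasError) return true;                      // always keep failures
  if (trace.durationMs >= slowThresholdMs) return true; // always keep slow requests
  return rand() < sampleRate;                           // sample the healthy majority
}

keepTrace({ durationMs: 40, hasError: true }, 1000, 0.1);            // kept: error
keepTrace({ durationMs: 2500, hasError: false }, 1000, 0.1);         // kept: slow
keepTrace({ durationMs: 40, hasError: false }, 1000, 0.1, () => 0.9); // dropped
```

The trade-off is buffering: the collector must hold spans in memory until the trace finishes before it can decide.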
Best practices
- Start with high-level metrics (request rate, error rate, latency) and add more detailed instrumentation as needed.
- Use structured logging—log events with structured fields, not free-form text.
- Set up alerts for critical metrics so you're notified when things go wrong.
- Use traces to debug complex issues—they show the complete picture of a request.
- Redact sensitive data from logs and traces to comply with privacy regulations.
- Review dashboards regularly to understand normal behavior—this makes anomalies easier to spot.
- Use sampling to control costs, but always trace errors and slow requests.
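To illustrate the structured-logging practice above: instead of interpolating values into a message string, emit an event name plus typed fields, so logs can be filtered and aggregated by field. The `logEvent` helper here is a sketch, not a ContractSpec API:

```typescript
// Unstructured: hard to query.
//   log(`Payment failed for order ${orderId}: ${reason}`);

// Structured: every field is independently queryable.
function logEvent(event: string, fields: Record<string, unknown>): string {
  return JSON.stringify({ event, ts: new Date().toISOString(), ...fields });
}

const line = logEvent('payment.failed', {
  orderId: 'ord_456',
  reason: 'card_declined',
  amountCents: 4999,
});
// e.g. {"event":"payment.failed","ts":"...","orderId":"ord_456",...}
```

With this shape, a query like "all `payment.failed` events where `reason` is `card_declined`" is a field filter rather than a fragile text search.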