Knowledge Binding

Knowledge binding connects your app's workflows and agents to structured knowledge spaces. This enables semantic search, RAG (Retrieval-Augmented Generation), and context-aware decision-making.

How it works

Knowledge binding follows a three-layer model:

KnowledgeSpaceSpec (global): defines a logical knowledge domain
KnowledgeSourceConfig (per-tenant): the tenant's data sources that feed each space
AppKnowledgeBinding (per-app): maps spaces to the workflows and agents allowed to use them
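
To make the relationship between the three layers concrete, here is a minimal TypeScript sketch. The field names are inferred from the examples in this guide, and the `resolveSources` helper is hypothetical; the real type definitions may differ.

```typescript
// Hypothetical shapes inferred from the examples in this guide.
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";

// Layer 1 (global): a logical knowledge domain.
interface KnowledgeSpaceSpec {
  spaceId: string;
  category: KnowledgeCategory;
  purpose: string;
}

// Layer 2 (per-tenant): a data source feeding one space.
interface KnowledgeSourceConfig {
  id: string;
  tenantId: string;
  spaceId: string;
  kind: string; // e.g. "notion", "gmail", "url"
  location: string;
}

// Layer 3 (per-app): which workflows and agents may read a space.
interface AppKnowledgeBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
  sources: string[]; // ids of KnowledgeSourceConfig entries
}

// Resolve the tenant's source configs that a binding actually refers to.
function resolveSources(
  binding: AppKnowledgeBinding,
  sources: KnowledgeSourceConfig[],
  tenantId: string,
): KnowledgeSourceConfig[] {
  return sources.filter(
    (s) =>
      s.tenantId === tenantId &&
      s.spaceId === binding.spaceId &&
      binding.sources.includes(s.id),
  );
}
```

The binding only holds source ids, so the per-tenant layer can change a source's location or sync policy without touching the app-level binding.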

Example: Support agent with RAG

Let's build a support agent that uses canonical product documentation and operational support history.

Step 1: Blueprint declares knowledge needs

// AppBlueprintSpec
{
  id: "support-app",
  version: "1.0.0",
  knowledgeSpaces: [
    {
      spaceId: "product-canon",
      category: "canonical",
      required: true,
      purpose: "Official product documentation and specs"
    },
    {
      spaceId: "support-history",
      category: "operational",
      required: true,
      purpose: "Past support tickets and resolutions"
    },
    {
      spaceId: "external-docs",
      category: "external",
      required: false,
      purpose: "Third-party integration documentation"
    }
  ]
}

Step 2: Tenant configures sources

// KnowledgeSourceConfig (per-tenant)
[
  {
    id: "src_notion_product_docs",
    tenantId: "acme-corp",
    spaceId: "product-canon",
    kind: "notion",
    location: "https://notion.so/acme/product-docs",
    syncPolicy: { interval: "1h" },
    lastSyncedAt: "2025-01-15T10:00:00Z"
  },
  {
    id: "src_gmail_support_threads",
    tenantId: "acme-corp",
    spaceId: "support-history",
    kind: "gmail",
    location: "support@acme.com",
    syncPolicy: { webhook: true },
    lastSyncedAt: "2025-01-15T10:30:00Z"
  },
  {
    id: "src_stripe_docs",
    tenantId: "acme-corp",
    spaceId: "external-docs",
    kind: "url",
    location: "https://stripe.com/docs",
    syncPolicy: { interval: "24h" },
    lastSyncedAt: "2025-01-15T00:00:00Z"
  }
]
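
Once sources are configured, it can be useful to check that every space the blueprint marks `required: true` is fed by at least one source. The sketch below is one way to do that; `findUnsourcedRequiredSpaces` and its input types are hypothetical helpers, not part of a documented API.

```typescript
// Hypothetical minimal shapes: only the fields this check needs.
interface SpaceRequirement {
  spaceId: string;
  required: boolean;
}

interface SourceRef {
  id: string;
  spaceId: string;
}

// Return the ids of required spaces that no configured source feeds.
function findUnsourcedRequiredSpaces(
  requirements: SpaceRequirement[],
  sources: SourceRef[],
): string[] {
  const sourcedSpaces = new Set(sources.map((s) => s.spaceId));
  return requirements
    .filter((r) => r.required && !sourcedSpaces.has(r.spaceId))
    .map((r) => r.spaceId);
}

// Example: the support-app blueprint with the Gmail source left out.
const missing = findUnsourcedRequiredSpaces(
  [
    { spaceId: "product-canon", required: true },
    { spaceId: "support-history", required: true },
    { spaceId: "external-docs", required: false },
  ],
  [
    { id: "src_notion_product_docs", spaceId: "product-canon" },
    { id: "src_stripe_docs", spaceId: "external-docs" },
  ],
);
// missing contains only "support-history"
```

Optional spaces such as `external-docs` are deliberately ignored, so a tenant can skip them without failing the check.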

Step 3: TenantAppConfig binds spaces

// TenantAppConfig
{
  tenantId: "acme-corp",
  blueprintId: "support-app",
  blueprintVersion: "1.0.0",
  knowledgeBindings: [
    {
      spaceId: "product-canon",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "generate-docs"],
        agentIds: ["support-agent", "docs-agent"]
      },
      allowedCategories: ["canonical"],
      sources: ["src_notion_product_docs"]
    },
    {
      spaceId: "support-history",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "escalate-ticket"],
        agentIds: ["support-agent"]
      },
      allowedCategories: ["operational"],
      sources: ["src_gmail_support_threads"]
    },
    {
      spaceId: "external-docs",
      enabled: true,
      allowedConsumers: {
        agentIds: ["support-agent"]
      },
      allowedCategories: ["external"],
      sources: ["src_stripe_docs"]
    }
  ]
}
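
Each binding refers to sources by id, so a typo or a source attached to the wrong space is easy to miss. The following sketch cross-checks bindings against the tenant's source list. `validateBindingSources` is a hypothetical helper written for this guide, and the message format is an assumption.

```typescript
// Hypothetical minimal shapes: only the fields this check needs.
interface SourceEntry {
  id: string;
  spaceId: string;
}

interface BindingEntry {
  spaceId: string;
  sources: string[];
}

// Report source ids that do not exist or that feed a different space.
function validateBindingSources(
  bindings: BindingEntry[],
  sources: SourceEntry[],
): string[] {
  const byId = new Map<string, SourceEntry>();
  for (const source of sources) byId.set(source.id, source);

  const problems: string[] = [];
  for (const binding of bindings) {
    for (const sourceId of binding.sources) {
      const source = byId.get(sourceId);
      if (!source) {
        problems.push(`${binding.spaceId}: unknown source "${sourceId}"`);
      } else if (source.spaceId !== binding.spaceId) {
        problems.push(
          `${binding.spaceId}: source "${sourceId}" feeds space "${source.spaceId}"`,
        );
      }
    }
  }
  return problems;
}
```

An empty result means every binding points at existing sources for its own space.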

Step 4: Workflow uses knowledge

# WorkflowSpec
workflowId: answer-question
version: 1.0.0

steps:
  - id: generate-embedding
    capability: openai-embeddings
    inputs:
      text: ${input.question}
  
  - id: search-canonical
    capability: vector.search
    inputs:
      collection: "product-canon"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 5
  
  - id: search-support-history
    capability: vector.search
    inputs:
      collection: "support-history"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 3
  
  - id: generate-answer
    capability: openai-chat
    inputs:
      messages:
        - role: "system"
          content: |
            You are a support agent. Answer based on:
            1. Canonical docs (authoritative)
            2. Support history (helpful context)
            Only use external docs for integration questions.
        - role: "user"
          content: |
            Question: ${input.question}
            
            Canonical docs:
            ${steps.search-canonical.output.results}
            
            Support history:
            ${steps.search-support-history.output.results}
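
The `vector.search` steps rank indexed content by similarity to the question embedding. How the capability does this is not specified here; the sketch below shows one common approach, a brute-force cosine-similarity search over an in-memory list. `IndexedChunk`, `cosineSimilarity`, and `searchCollection` are illustrative names, and a production setup would more likely delegate to a vector database.

```typescript
// Hypothetical in-memory representation of an indexed document chunk.
interface IndexedChunk {
  id: string;
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `limit` chunks most similar to the query vector, best first.
function searchCollection(
  chunks: IndexedChunk[],
  query: number[],
  limit: number,
): IndexedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((scored) => scored.chunk);
}
```

The `limit` parameter plays the same role as `limit: 5` and `limit: 3` in the workflow: it caps how much retrieved text ends up in the prompt.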

Category-based access control

Different knowledge categories have different trust levels and access patterns:

Category	Trust Level	Use Cases	Policy Impact
canonical	Highest	Product specs, schemas, official policies	Can drive policy decisions
operational	High	Support tickets, sales docs, internal runbooks	Can inform decisions
external	Medium	Third-party docs, regulations, PSP guides	Reference only, not authoritative
ephemeral	Low	Agent scratchpads, session context, drafts	Never used for decisions
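
The table can be encoded as a lookup so that calling code does not hard-code category names. This is a sketch only: the `PolicyImpact` labels, the numeric trust ranks, and the helper functions are assumptions made for illustration.

```typescript
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";
type PolicyImpact =
  | "drives-decisions"
  | "informs-decisions"
  | "reference-only"
  | "never-used";

// One entry per table row (labels are illustrative).
const POLICY_IMPACT: Record<KnowledgeCategory, PolicyImpact> = {
  canonical: "drives-decisions",
  operational: "informs-decisions",
  external: "reference-only",
  ephemeral: "never-used",
};

// Higher number means more trusted, following the table order.
const TRUST_RANK: Record<KnowledgeCategory, number> = {
  canonical: 3,
  operational: 2,
  external: 1,
  ephemeral: 0,
};

// Only canonical knowledge may drive a policy decision.
function canDrivePolicy(category: KnowledgeCategory): boolean {
  return POLICY_IMPACT[category] === "drives-decisions";
}

// Order retrieved items so that more trusted categories come first.
function sortByTrust<T extends { category: KnowledgeCategory }>(items: T[]): T[] {
  return [...items].sort(
    (a, b) => TRUST_RANK[b.category] - TRUST_RANK[a.category],
  );
}
```

Sorting by trust before building a prompt keeps authoritative material ahead of supporting context.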

Multi-space workflows

Workflows can query multiple knowledge spaces and combine results:

knowledgeBindings: [
  {
    spaceId: "product-canon",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical"],
    sources: ["src_database_schema", "src_product_catalog"]
  },
  {
    spaceId: "pricing-rules",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical", "operational"],
    sources: ["src_pricing_database", "src_discount_policies"]
  },
  {
    spaceId: "customer-history",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["quote-creation"]
    },
    allowedCategories: ["operational"],
    sources: ["src_crm_data", "src_past_invoices"]
  }
]
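
To see which spaces a given workflow ends up with under these bindings, a small lookup helps. The sketch below assumes the binding shape shown above; `spacesForWorkflow` is a hypothetical helper.

```typescript
// Hypothetical minimal binding shape: only the fields this lookup needs.
interface WorkflowBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}

// List the enabled spaces that explicitly allow the given workflow.
function spacesForWorkflow(
  bindings: WorkflowBinding[],
  workflowId: string,
): string[] {
  return bindings
    .filter(
      (b) =>
        b.enabled &&
        (b.allowedConsumers.workflowIds ?? []).includes(workflowId),
    )
    .map((b) => b.spaceId);
}

const billingBindings: WorkflowBinding[] = [
  {
    spaceId: "product-canon",
    enabled: true,
    allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] },
  },
  {
    spaceId: "pricing-rules",
    enabled: true,
    allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] },
  },
  {
    spaceId: "customer-history",
    enabled: true,
    allowedConsumers: { workflowIds: ["quote-creation"] },
  },
];

const invoiceSpaces = spacesForWorkflow(billingBindings, "invoice-generation");
// invoiceSpaces: product-canon, pricing-rules
const quoteSpaces = spacesForWorkflow(billingBindings, "quote-creation");
// quoteSpaces: product-canon, pricing-rules, customer-history
```

With the bindings above, only `quote-creation` can read `customer-history`; `invoice-generation` is limited to the product and pricing spaces.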

Security & validation

Knowledge sources are validated before sync: credentials and permissions are checked.
The policy decision point (PDP) enforces which workflows and agents can access which spaces.
All knowledge queries are audited, including search terms and results.
Canonical knowledge is immutable once indexed; changes require a re-sync.
Ephemeral knowledge is purged automatically according to retention policies.
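
The access and audit rules above can be pictured as a deny-by-default check wrapped around every query. This is a simplified sketch, not the actual PDP: `isAllowed`, `auditedQuery`, the `AuditEntry` shape, and the injected `search` callback are all hypothetical.

```typescript
type Consumer = { kind: "workflow" | "agent"; id: string };

interface SpaceBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}

interface AuditEntry {
  spaceId: string;
  consumer: Consumer;
  query: string;
  allowed: boolean;
  resultCount: number;
}

// Deny unless an enabled binding explicitly lists the consumer.
function isAllowed(
  bindings: SpaceBinding[],
  spaceId: string,
  consumer: Consumer,
): boolean {
  const binding = bindings.find((b) => b.spaceId === spaceId);
  if (!binding || !binding.enabled) return false;
  const ids =
    consumer.kind === "workflow"
      ? binding.allowedConsumers.workflowIds
      : binding.allowedConsumers.agentIds;
  return (ids ?? []).includes(consumer.id);
}

// Run a search only when allowed, and record every attempt in the audit log.
function auditedQuery(
  bindings: SpaceBinding[],
  auditLog: AuditEntry[],
  spaceId: string,
  consumer: Consumer,
  query: string,
  search: (query: string) => string[],
): string[] {
  const allowed = isAllowed(bindings, spaceId, consumer);
  const results = allowed ? search(query) : [];
  auditLog.push({ spaceId, consumer, query, allowed, resultCount: results.length });
  return results;
}
```

Denied attempts are logged as well, which makes it possible to spot consumers that repeatedly request spaces they are not bound to.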

Best practices

Use canonical spaces for policy-critical decisions and operational spaces for suggestions.
Never allow workflows to write to canonical spaces; keep access read-only.
Set up monitoring for sync failures and stale knowledge sources.
Document the purpose and trust level of each knowledge space.
Test knowledge queries in a sandbox before promoting them to production.
Use explicit allowedConsumers lists and avoid wildcard access.
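
Monitoring for stale sources can start with a simple comparison of `lastSyncedAt` against the sync interval. The sketch below assumes interval strings of the form used in the Step 2 example ("1h", "24h"). The tolerance factor, the supported units, and the decision to skip webhook-driven sources are assumptions, not documented behavior.

```typescript
interface SyncedSource {
  id: string;
  syncPolicy: { interval?: string; webhook?: boolean };
  lastSyncedAt: string; // ISO 8601 timestamp
}

const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Parse strings such as "1h" or "24h" into milliseconds.
function parseIntervalMs(interval: string): number {
  const match = /^(\d+)([smhd])$/.exec(interval);
  if (!match) throw new Error(`Unsupported interval: ${interval}`);
  const unitMs = UNIT_MS[match[2] ?? ""];
  if (unitMs === undefined) throw new Error(`Unsupported interval: ${interval}`);
  return Number(match[1]) * unitMs;
}

// A source is stale when it has missed more than `tolerance` sync intervals.
// Sources without an interval (for example webhook-driven ones) are skipped.
function findStaleSources(
  sources: SyncedSource[],
  now: Date,
  tolerance = 2,
): string[] {
  return sources
    .filter((source) => {
      const interval = source.syncPolicy.interval;
      if (!interval) return false;
      const elapsedMs = now.getTime() - Date.parse(source.lastSyncedAt);
      return elapsedMs > parseIntervalMs(interval) * tolerance;
    })
    .map((source) => source.id);
}
```

A tolerance above 1 avoids flagging a source that is merely a few minutes late; the right value depends on how quickly stale knowledge becomes a problem for the workflows that consume it.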
