Knowledge Binding

Knowledge binding connects your app's workflows and agents to structured knowledge spaces. This enables semantic search, RAG (Retrieval-Augmented Generation), and context-aware decision-making.

How it works

Knowledge binding follows a three-layer model:

  1. KnowledgeSpaceSpec (global) - Defines a logical knowledge domain
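  2. KnowledgeSourceConfig (per-tenant) - Registers the tenant's data sources that feed each space
  3. AppKnowledgeBinding (per-app) - Maps spaces to the workflows and agents allowed to query them (the knowledgeBindings entries in TenantAppConfig)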
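
The three layers can be pictured as TypeScript types. This is only a sketch inferred from the examples on this page, not a published schema: the field optionality, the KnowledgeCategory union, and the sourcesForSpace helper are all assumptions made for illustration.

```typescript
// Sketch only: shapes inferred from the examples on this page, not an official schema.
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";

// Layer 1 (global): a logical knowledge domain.
interface KnowledgeSpaceSpec {
  spaceId: string;
  category: KnowledgeCategory;
  purpose: string;
}

// Layer 2 (per-tenant): a data source that feeds one space.
interface KnowledgeSourceConfig {
  id: string;
  tenantId: string;
  spaceId: string;
  kind: string; // e.g. "notion", "gmail", "url"
  location: string;
  syncPolicy: { interval?: string; webhook?: boolean };
  lastSyncedAt?: string; // ISO 8601 timestamp
}

// Layer 3 (per-app): which workflows and agents may use a space.
interface AppKnowledgeBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
  allowedCategories: KnowledgeCategory[];
  sources: string[]; // ids of KnowledgeSourceConfig entries
}

// Hypothetical helper: the layers are linked by spaceId, and a binding
// additionally names the source ids it accepts.
function sourcesForSpace(
  space: KnowledgeSpaceSpec,
  binding: AppKnowledgeBinding,
  sources: KnowledgeSourceConfig[],
): KnowledgeSourceConfig[] {
  if (binding.spaceId !== space.spaceId) return [];
  return sources.filter(
    (s) => s.spaceId === space.spaceId && binding.sources.includes(s.id),
  );
}
```

The helper returns an empty list when the binding belongs to a different space, which makes a mismatched spaceId visible early.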

Example: Support agent with RAG

Let's build a support agent that uses canonical product documentation and operational support history.

Step 1: Blueprint declares knowledge needs

// AppBlueprintSpec
{
  id: "support-app",
  version: "1.0.0",
  knowledgeSpaces: [
    {
      spaceId: "product-canon",
      category: "canonical",
      required: true,
      purpose: "Official product documentation and specs"
    },
    {
      spaceId: "support-history",
      category: "operational",
      required: true,
      purpose: "Past support tickets and resolutions"
    },
    {
      spaceId: "external-docs",
      category: "external",
      required: false,
      purpose: "Third-party integration documentation"
    }
  ]
}

Step 2: Tenant configures sources

// KnowledgeSourceConfig (per-tenant)
[
  {
    id: "src_notion_product_docs",
    tenantId: "acme-corp",
    spaceId: "product-canon",
    kind: "notion",
    location: "https://notion.so/acme/product-docs",
    syncPolicy: { interval: "1h" },
    lastSyncedAt: "2025-01-15T10:00:00Z"
  },
  {
    id: "src_gmail_support_threads",
    tenantId: "acme-corp",
    spaceId: "support-history",
    kind: "gmail",
    location: "support@acme.com",
    syncPolicy: { webhook: true },
    lastSyncedAt: "2025-01-15T10:30:00Z"
  },
  {
    id: "src_stripe_docs",
    tenantId: "acme-corp",
    spaceId: "external-docs",
    kind: "url",
    location: "https://stripe.com/docs",
    syncPolicy: { interval: "24h" },
    lastSyncedAt: "2025-01-15T00:00:00Z"
  }
]
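
The syncPolicy and lastSyncedAt fields are enough to detect a source that has fallen behind. The sketch below assumes that intervals are written as a number followed by m, h, or d (only "1h" and "24h" appear on this page), and that a source is stale once two full intervals have passed without a sync. The isStale helper and its graceFactor parameter are hypothetical.

```typescript
// Sketch only: assumes interval strings such as "30m", "1h", or "24h".
interface SourceSyncInfo {
  id: string;
  syncPolicy: { interval?: string; webhook?: boolean };
  lastSyncedAt: string; // ISO 8601 timestamp
}

const UNIT_MS: Record<string, number> = {
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Convert "1h" to 3600000. Returns null for formats this sketch does not understand.
function intervalToMs(interval: string): number | null {
  const match = /^(\d+)([mhd])$/.exec(interval);
  if (!match) return null;
  const unitMs = UNIT_MS[match[2] ?? ""];
  if (unitMs === undefined) return null;
  return Number(match[1]) * unitMs;
}

// A source counts as stale when more than graceFactor intervals have passed
// since its last sync. Webhook-driven sources have no interval, so this
// check never flags them.
function isStale(source: SourceSyncInfo, now: Date, graceFactor = 2): boolean {
  const interval = source.syncPolicy.interval;
  if (!interval) return false;
  const intervalMs = intervalToMs(interval);
  if (intervalMs === null) return false;
  const ageMs = now.getTime() - new Date(source.lastSyncedAt).getTime();
  return ageMs > intervalMs * graceFactor;
}
```

Webhook sources would need a different signal, such as time since the last received event, which this page does not describe.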

Step 3: TenantAppConfig binds spaces

// TenantAppConfig
{
  tenantId: "acme-corp",
  blueprintId: "support-app",
  blueprintVersion: "1.0.0",
  knowledgeBindings: [
    {
      spaceId: "product-canon",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "generate-docs"],
        agentIds: ["support-agent", "docs-agent"]
      },
      allowedCategories: ["canonical"],
      sources: ["src_notion_product_docs"]
    },
    {
      spaceId: "support-history",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "escalate-ticket"],
        agentIds: ["support-agent"]
      },
      allowedCategories: ["operational"],
      sources: ["src_gmail_support_threads"]
    },
    {
      spaceId: "external-docs",
      enabled: true,
      allowedConsumers: {
        agentIds: ["support-agent"]
      },
      allowedCategories: ["external"],
      sources: ["src_stripe_docs"]
    }
  ]
}
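
Because the blueprint marks some spaces as required, it can be useful to check a tenant configuration against it before deployment. The following is a hypothetical validation, not a documented platform check. It treats a required space as satisfied only when it has an enabled binding with at least one source.

```typescript
// Sketch only: a hypothetical pre-deployment check.
interface SpaceRequirement {
  spaceId: string;
  required: boolean;
}

interface SpaceBinding {
  spaceId: string;
  enabled: boolean;
  sources: string[];
}

// Returns the ids of required spaces that lack an enabled binding
// with at least one source.
function missingRequiredSpaces(
  spaces: SpaceRequirement[],
  bindings: SpaceBinding[],
): string[] {
  const usable = new Set(
    bindings
      .filter((b) => b.enabled && b.sources.length > 0)
      .map((b) => b.spaceId),
  );
  return spaces
    .filter((s) => s.required && !usable.has(s.spaceId))
    .map((s) => s.spaceId);
}
```

For the support-app example above the result would be an empty list, since both required spaces are bound and external-docs is optional.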

Step 4: Workflow uses knowledge

# WorkflowSpec
workflowId: answer-question
version: 1.0.0

steps:
  - id: generate-embedding
    capability: openai-embeddings
    inputs:
      text: ${input.question}
  
  - id: search-canonical
    capability: vector.search
    inputs:
      collection: "product-canon"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 5
  
  - id: search-support-history
    capability: vector.search
    inputs:
      collection: "support-history"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 3
  
  - id: generate-answer
    capability: openai-chat
    inputs:
      messages:
        - role: "system"
          content: |
            You are a support agent. Answer based on:
            1. Canonical docs (authoritative)
            2. Support history (helpful context)
            If the two conflict, prefer the canonical docs.
        - role: "user"
          content: |
            Question: ${input.question}
            
            Canonical docs:
            ${steps.search-canonical.output.results}
            
            Support history:
            ${steps.search-support-history.output.results}
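
The data flow of this workflow can also be read as ordinary code. The sketch below is not how the workflow engine runs the steps. The Capabilities interface and its embed, search, and chat methods are hypothetical stand-ins for the openai-embeddings, vector.search, and openai-chat capabilities, and the result shape (a text and a score per hit) is an assumption.

```typescript
// Sketch only: hypothetical stand-ins for the capabilities named in the workflow.
interface SearchHit {
  text: string;
  score: number;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface Capabilities {
  embed(text: string): Promise<number[]>;
  search(collection: string, vector: number[], limit: number): Promise<SearchHit[]>;
  chat(messages: ChatMessage[]): Promise<string>;
}

function formatHits(hits: SearchHit[]): string {
  return hits.map((h) => `- ${h.text}`).join("\n");
}

// Mirrors the user message in the generate-answer step.
function buildUserMessage(
  question: string,
  canonical: SearchHit[],
  history: SearchHit[],
): string {
  return [
    `Question: ${question}`,
    "",
    "Canonical docs:",
    formatHits(canonical),
    "",
    "Support history:",
    formatHits(history),
  ].join("\n");
}

async function answerQuestion(question: string, caps: Capabilities): Promise<string> {
  // One embedding is reused for both searches.
  const vector = await caps.embed(question);
  // This sketch runs the searches in parallel; the workflow spec does not
  // say how the engine schedules independent steps.
  const [canonical, history] = await Promise.all([
    caps.search("product-canon", vector, 5),
    caps.search("support-history", vector, 3),
  ]);
  return caps.chat([
    {
      role: "system",
      content:
        "You are a support agent. Treat canonical docs as authoritative and support history as helpful context.",
    },
    { role: "user", content: buildUserMessage(question, canonical, history) },
  ]);
}
```

Keeping the two result sets under separate headings lets the model tell authoritative material from supporting context.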

Category-based access control

Different knowledge categories have different trust levels and access patterns:

Category     Trust Level  Use Cases                                       Policy Impact
canonical    Highest      Product specs, schemas, official policies       Can drive policy decisions
operational  High         Support tickets, sales docs, internal runbooks  Can inform decisions
external     Medium       Third-party docs, regulations, PSP guides       Reference only, not authoritative
ephemeral    Low          Agent scratchpads, session context, drafts      Never used for decisions
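
One way to apply these trust levels in code is to filter and order retrieved content before it reaches a decision step. In the sketch below, the numeric ranks, the PolicyImpact labels, and the decisionEvidence helper are assumptions chosen to mirror the four categories. The platform may represent them differently.

```typescript
// Sketch only: numeric ranks and labels are illustrative assumptions.
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";
type PolicyImpact = "drive" | "inform" | "reference" | "none";

const TRUST_RANK: Record<KnowledgeCategory, number> = {
  canonical: 3,
  operational: 2,
  external: 1,
  ephemeral: 0,
};

const POLICY_IMPACT: Record<KnowledgeCategory, PolicyImpact> = {
  canonical: "drive",
  operational: "inform",
  external: "reference",
  ephemeral: "none",
};

interface RetrievedChunk {
  text: string;
  category: KnowledgeCategory;
}

// Keep only chunks that may drive or inform a decision, most trusted first.
// filter() returns a new array, so the caller's input is not reordered.
function decisionEvidence(chunks: RetrievedChunk[]): RetrievedChunk[] {
  return chunks
    .filter((c) => {
      const impact = POLICY_IMPACT[c.category];
      return impact === "drive" || impact === "inform";
    })
    .sort((a, b) => TRUST_RANK[b.category] - TRUST_RANK[a.category]);
}
```

External and ephemeral chunks are dropped here because the categories above describe them as reference-only or never used for decisions. They can still be shown to a user as background.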

Multi-space workflows

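Workflows can query multiple knowledge spaces and combine results. In the bindings below, quote-creation can read all three spaces, while invoice-generation is limited to product-canon and pricing-rules: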

knowledgeBindings: [
  {
    spaceId: "product-canon",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical"],
    sources: ["src_database_schema", "src_product_catalog"]
  },
  {
    spaceId: "pricing-rules",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical", "operational"],
    sources: ["src_pricing_database", "src_discount_policies"]
  },
  {
    spaceId: "customer-history",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["quote-creation"]
    },
    allowedCategories: ["operational"],
    sources: ["src_crm_data", "src_past_invoices"]
  }
]
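
To see which spaces a given workflow ends up with, the bindings can be inverted. This is a minimal sketch. The spacesForWorkflow helper is hypothetical and reads only the fields shown in the bindings above.

```typescript
// Sketch only: a hypothetical helper over the binding fields shown above.
interface WorkflowBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}

// Lists the enabled spaces whose allowedConsumers name this workflow.
function spacesForWorkflow(bindings: WorkflowBinding[], workflowId: string): string[] {
  return bindings
    .filter(
      (b) => b.enabled && (b.allowedConsumers.workflowIds ?? []).includes(workflowId),
    )
    .map((b) => b.spaceId);
}

const exampleBindings: WorkflowBinding[] = [
  {
    spaceId: "product-canon",
    enabled: true,
    allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] },
  },
  {
    spaceId: "pricing-rules",
    enabled: true,
    allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] },
  },
  {
    spaceId: "customer-history",
    enabled: true,
    allowedConsumers: { workflowIds: ["quote-creation"] },
  },
];

// quote-creation: product-canon, pricing-rules, customer-history
// invoice-generation: product-canon, pricing-rules
const quoteSpaces = spacesForWorkflow(exampleBindings, "quote-creation");
const invoiceSpaces = spacesForWorkflow(exampleBindings, "invoice-generation");
```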

Security & validation

  • Knowledge sources are validated before sync: credentials and permissions are checked first
  • The policy decision point (PDP) enforces which workflows and agents can access which spaces
  • All knowledge queries are audited, including search terms and results
  • Canonical knowledge is immutable once indexed; changes require a re-sync
  • Ephemeral knowledge is purged automatically according to retention policies
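
The access and audit rules above can be illustrated with a deny-by-default check wrapped around a query. This is a sketch of the idea rather than the platform's PDP. The Consumer and AuditEntry shapes, the isAllowed and auditedQuery helpers, and the synchronous runSearch callback are all hypothetical.

```typescript
// Sketch only: illustrates deny-by-default access plus audit logging.
interface AccessBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}

interface Consumer {
  kind: "workflow" | "agent";
  id: string;
}

interface AuditEntry {
  at: string; // ISO 8601 timestamp
  consumer: Consumer;
  spaceId: string;
  query: string;
  allowed: boolean;
  resultCount: number;
}

// Deny unless an enabled binding for the space names this consumer.
function isAllowed(bindings: AccessBinding[], spaceId: string, consumer: Consumer): boolean {
  const binding = bindings.find((b) => b.spaceId === spaceId);
  if (!binding || !binding.enabled) return false;
  const ids =
    consumer.kind === "workflow"
      ? binding.allowedConsumers.workflowIds
      : binding.allowedConsumers.agentIds;
  return (ids ?? []).includes(consumer.id);
}

// Runs the search only when allowed, and records every attempt.
function auditedQuery(
  bindings: AccessBinding[],
  spaceId: string,
  consumer: Consumer,
  query: string,
  runSearch: (spaceId: string, query: string) => string[],
  log: AuditEntry[],
): string[] {
  const allowed = isAllowed(bindings, spaceId, consumer);
  const results = allowed ? runSearch(spaceId, query) : [];
  log.push({
    at: new Date().toISOString(),
    consumer,
    spaceId,
    query,
    allowed,
    resultCount: results.length,
  });
  return results;
}
```

Denied attempts are logged as well, so the audit trail shows who tried to reach a space they were not bound to.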

Best practices

  • Use canonical spaces for policy-critical decisions and operational spaces for suggestions
  • Never allow workflows to write to canonical spaces; keep access read-only
  • Set up monitoring for sync failures and stale knowledge sources
  • Document the purpose and trust level of each knowledge space
  • Test knowledge queries in sandbox before promoting to production
  • List allowedConsumers explicitly and avoid wildcard access
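
The last practice can be enforced with a small lint over the bindings. This page does not say how a wildcard would be written, so the sketch below assumes a "*" character anywhere in a consumer id. That syntax and the bindingsWithWildcards helper are hypothetical.

```typescript
// Sketch only: assumes wildcards are written with "*", which this page does not confirm.
interface LintedBinding {
  spaceId: string;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}

// Returns the spaceIds of bindings whose consumer lists contain a wildcard.
function bindingsWithWildcards(bindings: LintedBinding[]): string[] {
  return bindings
    .filter((b) => {
      const ids = [
        ...(b.allowedConsumers.workflowIds ?? []),
        ...(b.allowedConsumers.agentIds ?? []),
      ];
      return ids.some((id) => id.includes("*"));
    })
    .map((b) => b.spaceId);
}
```

Running a check like this in CI would surface overly broad bindings before they reach production.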