Design: Trellis Patterns

Prerequisites

None.

Context

Trellis is a distributed system for aggregating, processing, and distributing organizational data. Services communicate exclusively over NATS.

This document establishes the top-level cross-cutting system patterns for:

  • service boundaries
  • communication patterns
  • platform boundaries
  • the relationship between subsystem-specific pattern docs

Detailed coding, storage, type-system, observability, frontend, and capability guidance is split into companion documents.

Architecture

Service Categories

Category        Purpose                                 Examples
Infrastructure  Platform capabilities for all services  Auth, Jobs
Ingest          Pull external data, emit domain events  Zendesk, FoodLogiQ
Repository      Persist and query domain data           Graph, Search
Processing      Transform, enrich, derive knowledge     Classification
Egress          Push data to external systems           Laserfiche

Categories describe primary responsibility. Any service may still subscribe to events for cache invalidation or local state.

Platform Boundary

Trellis platform code and cloud/domain code are intentionally separate.

Rules:

  • the Trellis platform repo owns protocol/runtime libraries, the trellis runtime service, jobs, Trellis-owned contracts, and contract tooling
  • cloud repos own domain services, domain contracts, apps, and domain models unless a model is required by a Trellis-owned contract or shared Trellis runtime library
  • @qlever-llc/trellis is a runtime library, not a central registry for every service API
  • service APIs are defined with the service that owns them and consumed through contract packages

Communication Patterns

Events

Events announce state changes. Publishers fire and forget.

Subject naming:

events.v1.<Domain>.<...tokens>

Examples:

events.v1.Partner.Changed.<origin>.<id>
events.v1.Identity.Changed.<origin>.<id>
events.v1.Document.Uploaded.<contentType>.<partnerId>
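
The naming scheme above can be sketched as a small subject builder plus a matcher for NATS-style wildcards (`*` matches exactly one token, `>` matches one or more trailing tokens). The helper names `eventSubject` and `matchesSubject`, and the sample origin and id values, are illustrative assumptions and not part of any Trellis library.

```typescript
// Hypothetical helpers for illustration; not part of @qlever-llc/trellis.

/** Build an events.v1 subject from a domain and its ordered tokens. */
function eventSubject(domain: string, ...tokens: string[]): string {
  for (const t of [domain, ...tokens]) {
    // Tokens are dot-separated, so reject empty tokens, whitespace, dots, and wildcards.
    if (t.length === 0 || /[\s.*>]/.test(t)) {
      throw new Error(`invalid subject token: "${t}"`);
    }
  }
  return ["events", "v1", domain, ...tokens].join(".");
}

/** Match a subject against a pattern: "*" is one token, ">" is the non-empty remainder. */
function matchesSubject(pattern: string, subject: string): boolean {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === ">") return i === p.length - 1 && s.length > i;
    if (i >= s.length) return false;
    if (p[i] !== "*" && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

// Sample values ("zendesk", "p-123") are made up for the example.
const subject = eventSubject("Partner", "Changed", "zendesk", "p-123");
const allZendeskPartners = matchesSubject("events.v1.Partner.Changed.zendesk.*", subject);
const allPartnerEvents = matchesSubject("events.v1.Partner.>", subject);
```

Because `<origin>` precedes `<id>`, a consumer can subscribe to one origin with a single trailing wildcard, which is the kind of selective subscription that justifies adding a token.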

Rules:

  • add subject tokens only when consumers need selective subscription and the cardinality is bounded and stable
  • token order matters; put the most-filtered tokens first
  • event handlers must be idempotent because delivery is at-least-once
  • direct event publish is the default; use a prepared event and service-owned outbox only when event publication must be coupled to service-local durable state
  • outbox dispatch MAY use a process-local wakeup helper to reduce latency, but the wakeup MUST happen after the outbox write commits; enqueueing a row inside a transaction must not directly publish work that can later roll back
  • consumers should use an inbox only for handlers that are not naturally idempotent
  • SQL services own migrations and transactions for local state, outbox rows, and inbox rows; NATS KV inbox/outbox helpers provide durable dedupe/queue storage but are not transactional with unrelated database side effects
  • process-local outbox wakeups are latency optimizations only; durable retry and recovery still depend on persisted outbox state and an explicit dispatch or recovery scan
  • a prepared event transport record MUST NOT duplicate the publisher’s contract id or contract digest
  • event subscribe permissions are event-type gates, not per-entity ACLs; do not encode user-owned object lists into Trellis runtime permissions
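
The outbox rules above can be illustrated with an in-memory sketch. It assumes a hypothetical `OutboxStore` whose `transaction` only makes staged rows visible on commit, and a hypothetical `OutboxDispatcher` that stands in for a NATS publisher; a real service would use its own SQL transactions and client. The point shown is ordering: the wakeup runs after the commit, so a rolled-back write never publishes.

```typescript
// In-memory stand-ins (assumptions); not the Trellis outbox helpers.
type OutboxRow = { id: number; subject: string; payload: string; sent: boolean };
type Enqueue = (subject: string, payload: string) => void;

class OutboxStore {
  rows: OutboxRow[] = [];
  nextId = 1;

  /** Run `work` atomically: staged rows become visible only if it does not throw. */
  transaction(work: (enqueue: Enqueue) => void): void {
    const staged: OutboxRow[] = [];
    work((subject, payload) => {
      staged.push({ id: 0, subject, payload, sent: false });
    });
    // Commit point: reached only when `work` returned without throwing.
    for (const row of staged) this.rows.push({ ...row, id: this.nextId++ });
  }

  pending(): OutboxRow[] {
    return this.rows.filter((r) => !r.sent);
  }
}

class OutboxDispatcher {
  published: string[] = [];
  store: OutboxStore;

  constructor(store: OutboxStore) {
    this.store = store;
  }

  /** Publish every unsent row. Callable from a wakeup or from a recovery scan. */
  dispatch(): void {
    for (const row of this.store.pending()) {
      this.published.push(row.subject); // stand-in for a NATS publish
      row.sent = true;
    }
  }
}

const store = new OutboxStore();
const dispatcher = new OutboxDispatcher(store);

function changePartner(id: string, fail: boolean): void {
  store.transaction((enqueue) => {
    enqueue(`events.v1.Partner.Changed.zendesk.${id}`, JSON.stringify({ id }));
    if (fail) throw new Error("local write failed"); // discards the staged row
  });
  // Process-local wakeup: runs only after the transaction has committed.
  dispatcher.dispatch();
}

changePartner("p-1", false);
try {
  changePartner("p-2", true);
} catch {
  // The failed transaction left no outbox row and triggered no wakeup.
}
```

If the process crashed between commit and wakeup, the row would remain pending, which is why a separate recovery scan calling `dispatch()` is still needed.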

Feeds

Feeds expose caller-visible live views that are authorized by the owning service. They are request/reply streams: a caller requests a feed with typed input, and the service emits typed frames to the caller’s reply inbox.

Subject naming:

feeds.v1.<Domain>.<LiveView>

Examples:

feeds.v1.Device.Events
feeds.v1.Audit.Feed
feeds.v1.Inspection.Updates

Rules:

  • use feeds when normal apps need reactive UI updates filtered by application authorization, such as devices visible to the logged-in user
  • feed subjects are request subjects, not raw event subjects
  • the service owns fine-grained authorization against the authenticated caller, feed input, and every emitted frame
  • normal apps that use feeds do not receive raw events.v1.* subscribe permissions for the backing domain events
  • feeds are live streams, not durable operations; use operations when the caller needs resumable workflow state or terminal completion
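
As a minimal sketch of the per-frame authorization rule, the generator below filters backing domain events into feed frames. The types, the `AccessPolicy` callback, and the handler name `deviceEventsFeed` are assumptions made for the example; the actual feed API and reply-inbox transport are not shown.

```typescript
// Hypothetical shapes for a feeds.v1.Device.Events handler.
type DeviceEvent = { deviceId: string; kind: string };
type FeedInput = { kinds?: string[] };
type FeedFrame = { deviceId: string; kind: string };
// Service-owned authorization check, evaluated inside the service.
type AccessPolicy = (userId: string, deviceId: string) => boolean;

/** Emit only frames the authenticated caller may see and has asked for. */
function* deviceEventsFeed(
  userId: string,
  input: FeedInput,
  canSee: AccessPolicy,
  events: Iterable<DeviceEvent>,
): Generator<FeedFrame> {
  for (const ev of events) {
    if (!canSee(userId, ev.deviceId)) continue; // authorization on every frame
    if (input.kinds && !input.kinds.includes(ev.kind)) continue; // typed feed input
    yield { deviceId: ev.deviceId, kind: ev.kind };
  }
}

// Made-up policy and events for the example.
const canSee: AccessPolicy = (userId, deviceId) => userId === "alice" && deviceId !== "d-2";
const backingEvents: DeviceEvent[] = [
  { deviceId: "d-1", kind: "online" },
  { deviceId: "d-2", kind: "online" },
  { deviceId: "d-1", kind: "offline" },
];
const frames = Array.from(deviceEventsFeed("alice", { kinds: ["online"] }, canSee, backingEvents));
```

The caller never subscribes to the backing events; it only sees the frames the service chooses to emit.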

RPCs

RPCs query data or perform bounded synchronous operations. Caller-visible long-running workflows use operations.

Subject naming is domain-based rather than service-based:

rpc.v1.User.Find
rpc.v1.Partner.List
rpc.v1.Documents.Search

Rules:

  • callers use method names, not raw subjects, in normal code
  • the API schema maps methods to transport subjects
  • implementation details may change without changing caller-visible method names
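
One way to picture the method-to-subject indirection is a lookup table consulted by a thin client. The `apiSchema` shape, `Transport` type, and `makeClient` function below are hypothetical; the real contract format and client API are defined in the companion documents.

```typescript
// Hypothetical API schema shape: method name -> transport subject.
const apiSchema = {
  "User.Find": { subject: "rpc.v1.User.Find" },
  "Partner.List": { subject: "rpc.v1.Partner.List" },
  "Documents.Search": { subject: "rpc.v1.Documents.Search" },
} as const;

type Method = keyof typeof apiSchema;
// Stand-in for a NATS request; returns the reply payload.
type Transport = (subject: string, payload: unknown) => Promise<unknown>;

function makeClient(transport: Transport) {
  return {
    /** Callers pass a method name; the schema supplies the subject. */
    call(method: Method, payload: unknown): Promise<unknown> {
      return transport(apiSchema[method].subject, payload);
    },
  };
}

// Example transport that records the subjects it was asked to use.
const requestedSubjects: string[] = [];
const client = makeClient(async (subject, payload) => {
  requestedSubjects.push(subject);
  return payload;
});
```

With this shape, a subject could be renamed in the schema without touching any call site that uses `client.call("User.Find", ...)`.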

Operations

Operations are caller-visible asynchronous workflows with durable state, explicit progress, and watchable completion.

Subject naming:

operations.v1.<Domain>.<...tokens>

Rules:

  • use operations when the caller must observe progress or wait across reconnects
  • use RPCs for bounded synchronous work and jobs for service-private execution machinery
  • operation control and watch semantics are defined in ../operations/trellis-operations.md
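
The choice between RPCs, operations, and jobs described above can be written as a small decision function. The `WorkProfile` fields and the `choosePattern` name are assumptions introduced for this sketch, not a Trellis API.

```typescript
type Pattern = "rpc" | "operation" | "job";

// Hypothetical description of a unit of work.
interface WorkProfile {
  callerVisible: boolean; // does a caller observe this work at all?
  boundedSynchronous: boolean; // does it finish within one request/reply?
  needsProgressOrReconnect: boolean; // must the caller watch progress or wait across reconnects?
}

function choosePattern(w: WorkProfile): Pattern {
  // Service-private execution machinery is modeled as jobs.
  if (!w.callerVisible) return "job";
  // Caller-visible work that is long-running or must survive reconnects is an operation.
  if (w.needsProgressOrReconnect || !w.boundedSynchronous) return "operation";
  // Everything else is bounded synchronous work.
  return "rpc";
}

const lookup = choosePattern({ callerVisible: true, boundedSynchronous: true, needsProgressOrReconnect: false });
const bulkImport = choosePattern({ callerVisible: true, boundedSynchronous: false, needsProgressOrReconnect: true });
const reindex = choosePattern({ callerVisible: false, boundedSynchronous: false, needsProgressOrReconnect: false });
```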

Runtime subject spaces

Some subsystem-owned subject spaces are not events.v1.*, feeds.v1.*, rpc.v1.*, or operations.v1.*. They exist for infrastructure coordination, stream projections, and service-private transport protocols.

Rules:

  • public and cross-service boundaries must be modeled as contract-owned RPCs, operations, events, feeds, jobs, state, or resources rather than caller-authored raw subject declarations
  • Trellis-owned runtime protocols may still use raw subjects behind a contract-owned public API; file transfer chunk subjects are an example of this pattern
  • subsystem docs should define the semantics and naming rules for any runtime subject space they introduce
  • examples include jobs work subjects and other platform-owned control surfaces described in companion docs

Companion Documents

This document defines the high-level system style. Detailed companion docs are split by concern: