Architecture

A small, focused set of components that keeps Bosca simple, reliable, and scalable.

Bosca uses a small, thoughtfully designed set of components. Fewer moving parts mean fewer surprises, easier operations, and predictable scaling.

You can start simple: many core functions run in a single server. As your needs grow, you can split responsibilities and scale components independently.

At a glance, this approach helps you:

  • Keep operations straightforward for small teams
  • Add capabilities gradually with modular growth
  • Balance performance, cost, and reliability

Component Organization

Bosca's components are grouped into functional areas so that each one has a clear role and can be deployed or scaled separately:

  • Object Storage
  • Structured Storage
  • Search
  • Caching
  • Workflows
  • AI/ML
  • Analytics
  • General Operations

The sections below describe each component, the functional areas it belongs to, and how it works with the others.

Ingress

Component Type: General Operations

Bosca is agnostic about the ingress method you choose for deployment, but we recommend nginx as a starting point.

  • Kubernetes deployments: We use nginx ingress because we have seen it run well at scale.
  • Docker Compose deployments: All services are routed through nginx, which handles SSL termination and load balancing.

Analytics

Component Type: Analytics, AI/ML, Workflows

The Analytics Collector is a Ktor service that captures first-party events and persists them for downstream analysis. It is optional, but recommended when you want full control over data retention and personalization workflows.

Owning your event data lets you validate results, build more advanced capabilities on top of it, and keep a safety net if new privacy laws unexpectedly change how you can use third-party systems through platforms like the App Store or Play Store.

We still recommend running third-party analytics alongside Bosca for validation and redundancy.

The collector writes data through an Iceberg catalog and object storage configuration, which makes it suitable for batch processing and long-term retention.

If you don't want to use Bosca's analytics system, you can bypass this component.

Bosca Server

Component Type: General Operations

The Bosca Server serves as the backbone of the Bosca platform, offering GraphQL interfaces to manage and interact with your content. It handles critical functions, including workflow state transitions, authentication, permissions, profiles, collections, metadata, supplementary content, documents, guides, AI agents, scripting, backup & restore, configuration, and more.

The server also includes an optional MCP server that allows AI clients like Claude to discover and query your GraphQL API.

Other Servers

  • Analytics Collector: event ingestion and storage (:backend:servers:analytics-collector)

Job Runners

Component Type: General Operations, Workflows

Bosca job runners process background work such as indexing, transition validation, and content processing. Runners are part of the same server binary and can be enabled or isolated by configuration, allowing you to separate API traffic from background processing when needed.

Job runners support distributed locking for clustered environments, preventing duplicate execution of the same job across multiple instances.
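
As a rough illustration of the idea, the sketch below shows how a lock with a time-to-live can keep two runners from executing the same job. It is not Bosca's implementation: `InMemoryLockStore`, `run_job_once`, the `job:` key prefix, and the 30-second TTL are all hypothetical, and the in-memory dictionary stands in for a shared store.

```python
import time
from typing import Callable, Optional


class InMemoryLockStore:
    """Hypothetical stand-in for a shared lock store used by several runners."""

    def __init__(self) -> None:
        # key -> (owner, expiry timestamp)
        self._locks: dict[str, tuple[str, float]] = {}

    def acquire(self, key: str, owner: str, ttl_seconds: float,
                now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        holder = self._locks.get(key)
        # Refuse if another owner holds a lock that has not expired yet.
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        self._locks[key] = (owner, now + ttl_seconds)
        return True

    def release(self, key: str, owner: str) -> None:
        holder = self._locks.get(key)
        # Only the current owner may release the lock.
        if holder is not None and holder[0] == owner:
            del self._locks[key]


def run_job_once(store: InMemoryLockStore, job_id: str, runner_id: str,
                 work: Callable[[], None]) -> bool:
    """Run `work` only if this runner wins the lock; return whether it ran."""
    key = f"job:{job_id}"  # hypothetical key naming
    if not store.acquire(key, runner_id, ttl_seconds=30.0):
        return False
    try:
        work()
        return True
    finally:
        store.release(key, runner_id)
```

The TTL matters because a runner can crash while holding a lock; once the lock expires, another instance can pick the job up.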

PostgreSQL

Component Type: General Operations, Structured Storage

Bosca uses two PostgreSQL instances:

  • Primary database (default port 5433 in development): Stores operational data including content, profiles, security, workflows, and configuration.
  • Analytics warehouse (default port 5434 in development): A separate database for analytics data, used by the Iceberg catalog and Trino for batch queries.
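
To make the split concrete, here is a minimal sketch of how a development setup might address the two instances. Only the ports come from the list above; the `localhost` host, the database names, the user name, and the `PostgresTarget` and `target_for` helpers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostgresTarget:
    """Hypothetical description of one PostgreSQL instance."""
    role: str
    host: str
    port: int
    database: str

    def url(self, user: str) -> str:
        # Standard PostgreSQL connection URI without a password.
        return f"postgresql://{user}@{self.host}:{self.port}/{self.database}"


# Ports are the development defaults; host and database names are assumed.
PRIMARY = PostgresTarget("primary", "localhost", 5433, "bosca")
ANALYTICS = PostgresTarget("analytics", "localhost", 5434, "analytics")


def target_for(workload: str) -> PostgresTarget:
    """Send analytics workloads to the warehouse, everything else to primary."""
    return ANALYTICS if workload == "analytics" else PRIMARY
```

For example, `target_for("analytics").url("bosca")` yields a URI on port 5434, while any operational workload resolves to port 5433.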

Most major cloud providers offer managed PostgreSQL services, which keep the overhead of backups and scaling (for example, through read replicas) low. PostgreSQL-compatible databases such as CockroachDB and YugabyteDB allow for other scaling approaches. We typically use CloudNativePG to manage our PostgreSQL deployments.

Trino

Component Type: Analytics, AI/ML

Trino is a distributed SQL query engine used for analytics workloads. It connects to the analytics warehouse and S3 object storage, enabling SQL queries over Iceberg tables for reporting and data exploration.

AI agents can use Trino for natural language data queries, allowing non-technical users to explore analytics through chat interfaces.

Meilisearch

Component Type: General Operations, Search

Meilisearch is our preferred search index. It is written in Rust, has a modest memory footprint, and is very fast. It also has many advanced features. The Meilisearch team has made some functionality trade-offs to achieve these capabilities, and we have found them acceptable in most cases. With its vector store, features like semantic search are easy to integrate and manage.

Search indexing supports JSONata transformations for customizable field mapping, giving you fine-grained control over what gets indexed and how.

While Meilisearch doesn't have native clustering, there are easy ways to achieve eventually consistent read replicas via Bosca Workflows. Combined with Kubernetes load balancing, this is a practical way to scale search efficiently.

Redis or NATS

Component Type: General Operations, Caching, Messaging

Bosca supports either Redis or NATS (with JetStream) as the backend for caching, pub/sub, job queues, and distributed locking. Choose whichever fits your infrastructure best; both are fully supported and interchangeable for these roles.

Most cloud providers offer managed Redis services. NATS is lightweight and well-suited for event-driven architectures. Either option works for small or large-scale deployments.
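
One common way to keep two backends interchangeable is to have calling code depend on a small interface rather than on a specific client. The sketch below illustrates that pattern only; `CoordinationBackend`, `InMemoryBackend`, and `process_next` are hypothetical names, the in-memory class stands in for a Redis- or NATS-backed implementation, and Bosca's actual abstraction may look quite different.

```python
from collections import deque
from typing import Optional, Protocol


class CoordinationBackend(Protocol):
    """Hypothetical interface covering two of the roles named above."""

    def cache_set(self, key: str, value: str) -> None: ...
    def cache_get(self, key: str) -> Optional[str]: ...
    def enqueue(self, queue: str, payload: str) -> None: ...
    def dequeue(self, queue: str) -> Optional[str]: ...


class InMemoryBackend:
    """Stand-in for an implementation backed by Redis or NATS."""

    def __init__(self) -> None:
        self._cache: dict[str, str] = {}
        self._queues: dict[str, deque[str]] = {}

    def cache_set(self, key: str, value: str) -> None:
        self._cache[key] = value

    def cache_get(self, key: str) -> Optional[str]:
        return self._cache.get(key)

    def enqueue(self, queue: str, payload: str) -> None:
        self._queues.setdefault(queue, deque()).append(payload)

    def dequeue(self, queue: str) -> Optional[str]:
        pending = self._queues.get(queue)
        return pending.popleft() if pending else None


def process_next(backend: CoordinationBackend, queue: str) -> Optional[str]:
    """Take one job and cache its result, touching only the interface."""
    payload = backend.dequeue(queue)
    if payload is None:
        return None
    result = payload.upper()  # placeholder for real work
    backend.cache_set(f"result:{payload}", result)
    return result
```

Because `process_next` only sees the interface, swapping the backend would be a configuration decision rather than a code change.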

Object Storage (S3 or Cloud Storage)

Component Type: General Operations, Object Storage

Bosca uses S3-compatible object storage for assets and analytics data. It also supports Cloud Storage and standard file systems. In development, an S3 proxy provides local S3-compatible storage.

Text Extractor

Component Type: General Operations, Content Processing

The text extractor is a standalone service used for extracting text from uploaded documents. It runs as a separate container in local development.

Image Processor

Component Type: General Operations

Publishing images often requires creating multiple size and format variants. The image processor handles tasks such as resizing, format conversion, optimization, and more. Default size variants include thumbnail (480x270), small (960x540), medium (1440x810), and large (1920x1080).
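
The default variants above all share a 16:9 bounding box. As a small worked example, the sketch below computes the output dimensions for each variant, assuming the image is scaled to fit inside the box, keeps its aspect ratio, and is never upscaled. Those three rules are assumptions for illustration, as are the `fit_within` and `variant_sizes` helpers; the image processor's actual behavior (for example, cropping) may differ.

```python
# Default size variants as (max width, max height), taken from the text above.
VARIANTS = {
    "thumbnail": (480, 270),
    "small": (960, 540),
    "medium": (1440, 810),
    "large": (1920, 1080),
}


def fit_within(width: int, height: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Scale (width, height) to fit the box, preserving aspect ratio."""
    # Capping the scale at 1.0 means small images are left at their own size.
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def variant_sizes(width: int, height: int) -> dict[str, tuple[int, int]]:
    """Compute the output size of every default variant for one source image."""
    return {name: fit_within(width, height, *box) for name, box in VARIANTS.items()}
```

Under these assumptions a 3840x2160 source maps exactly onto each variant, while a square 1000x1000 source is limited by the box height and would produce a 270x270 thumbnail.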

OpenTelemetry

Component Type: General Operations, Telemetry

Bosca includes OpenTelemetry instrumentation and can export traces to any OpenTelemetry-compatible backend.