Architecture
Bosca is built using a small, carefully designed set of components. By minimizing the number of components, we reduce complexity and mitigate potential issues that often arise in larger systems. To streamline operations and enhance reliability, many of Bosca's core functions are consolidated into a single server binary, though modularity and scalability are pragmatically considered where necessary.
Component Organization
Bosca's components are grouped into key functional areas to maintain clarity and ensure effective modularization:
- Object Storage
- Structured Storage
- Search
- Caching
- Workflows
- AI/ML
- Analytics
- General Operations
These functional areas allow us to design and organize the system in a way that is both efficient and scalable. In the following sections, we will explore these components in greater depth and explain how they work together.
Ingress
{id="ingress-component-type"} Component Type: General Operations, See More
Bosca is agnostic about the ingress method you choose for deployment. But, we do recommend nginx as a starting point.
- Kubernetes Deployment: We leverage nginx ingress because we have experienced it running at scale and find it suitable.
- Docker Compose Deployments: In this setup, all services are routed through nginx, enabling it to handle SSL termination and load balancing effectively.
- Serverless Deployments: If you're deploying Bosca in a serverless environment, you can typically ignore the ingress component because the service provider will manage this for you.
Analytics
{id="bosca-analytics-component-type"}
Component Type: Analytics, AI/ML, Workflows
Language: Rust, See More
The Bosca Analytics Server is a very low overhead way in which to capture first party analytics (meaning, you are collecting the data and not someone like Google). You might be wondering why not just use Google Analytics? Well, you can and probably should. This is an optional component. However, it's one that we recommend. The primary reason is as it matures, its ability to invoke workflows within the Bosca system will provide you with very interesting and unique ways to offer personalized experiences to your users through training ML models and other triggers.
We've also found that at scale, it's extremely important to be able to verify third party analytics systems... because sometimes they have issues that can leave you in the dark. And, for clarity, this statement isn't to say those systems are bad. We actually recommend you use both a third party data collection system and a first party data collection system.
This control allows for validation, advanced system capabilities, and a safety net in case additional privacy laws cause unexpected changes in how you leverage third party systems through systems like the App Store or Play Store.
The current status of this component is immature, but there are lots of plans for it, with some of them coming in the next month or so. Some of the first use cases will be around sending transational emails when a user hasn't interacted with an element for a period of time.
{id="bosca-analytics-licensing"}
Licensing: Apache 2.0, See More
If you don't want to use Bosca's analytics system, you can bypass this component.
Bosca Server
{id="bosca-server-component-type"}
Component Type: General Operations
Language: Rust
The Bosca Server serves as the backbone of the Bosca platform, offering GraphQL interfaces to manage and interact with your content. It handles critical functions, including workflow, scheduling, and state, authentication, permissions, profiles, collections, metadata, supplementary content, documents, guides, and more.
{id="bosca-server-licensing"} Licensing: Apache 2.0, See More
Bosca Workflow Server (Optional)
{id="bosca-workflow-server-component-type"} Component Type: Workflows
This is the same as Bosca Server but is dedicated to the workflow runners. This approach allows segmenting workflow operations to be internally facing infrastructure, such that it doesn't affect user facing infrastructure.
Bosca Workflow Runners (Optional)
{id="bosca-workflow-component-type"}
Component Type: General Operations, AI/ML
Language: Kotlin, C#, Python, Rust (or whatever language you want), See More
Bosca Runners are standalone programs that actively monitor the Workflow Queue within Bosca Server to process various activities. These activities can range from transitioning metadata or collections between states (e.g., draft to published) to more complex workflows like transcribing videos, extracting highlights using LLMs, and creating content snippets.
Workflow runners are language-agnostic, allowing you to build them in any preferred language. They operate as external processes, communicating with Bosca Server via GraphQL to receive activity jobs and report statuses.
NOTE: We're thinking through ways to allow for the runners to run in a serverless environment. But, this isn't currently a priority.
{id="bosca-workflow-licensing"}
Licensing: TBD, though what will most likely happen is there will be core runners (we will be releasing them
soon) that are licensed under Apache 2.0, and then there will be enterprise workflows that are available, possibly
under a Functional Source License or some other
enterprise license.
*If you don't want to use Bosca's workflow system, you can bypass this component.
PostgreSQL
{id="postgresql-component-type"} Component Type: General Operations, Structured Storage, See More
We leverage PostgreSQL for many aspects of the Bosca System. Ranging from structured storage, to its JSONB storage. Most major cloud providers provide managed PostgreSQL services. Allowing for low overhead backups and scaling (through things like read-replicas). There are also several PostgreSQL compliant databases that allow for other scaling approaches like CockroachDB and YugabyteDB. There are even serverless PostgreSQL offerings from companies like Neon.
{id="postgresql-licensing"} Licensing: PostgreSQL License, See More
Meilisearch
{id="meilisearch-component-type"} Component Type: General Operations, Search, See More
Meilisearch is our preferred search index. Thanks to its foundations in Rust, it has a very reasonable memory footprint and is very fast. It also has many advanced features. While there are certain trade-offs in functionality that they have chosen to make to achieve some of the capabilities they have, we have found them to be acceptable in most cases. With their vector store, things like semantic search are extremely easy to integrate and manage.
While they don't have native clustering, there easy ways to achieve eventually consistent read replicas through Bosca Workflows. That combined with kubernetes load balancing, we think this is a great way to efficiently scale search.
{id="meilisearch-licensing"} Licensing: MIT, See More
Redis (Dragonfly)
{id="dragonfly-component-type"} Component Type: General Operations, Caching, See More
We use a Redis compliant server (though you can choose any Redis compliant server), called Dragonfly. While it is licensed under the BSL, presently, we find their additional use grant to be acceptable. If anything changes, we can easily switch this out.
We chose Dragonfly because it has a Kubernetes operator that makes it extremely easy to spin up a primary and failover node. In addition, its performance is on par with our expectations thanks to it being multithreaded.
We're primarily using this component for Caching, Workflow Job Queues, and PubSub. Most cloud providers provide managed Redis services that can satisfy all the use cases. And, optionally, if it became a priority we could support MemCache for caching instead (but, this isn't currently a priority). Our current caching mechanics are very immature, this is a high priority for us, and will likely be improved in the coming weeks.
{id="dragonfly-licensing"} Licensing: BSL, See More
Object Storage (S3 or Cloud Storage)
{id="object-storage-component-type"} Component Type: General Operations, Object Storage
...
Qdrant
{id="qdrant-component-type"} Component Type: AI/ML, Search, See More
Qdrant is a robust vector storage and search system that we utilize to enable several key functionalities within the Bosca platform. It plays a pivotal role in delivering:
- Basic Recommendations: Enhancing user experiences through personalized content suggestions.
- Context During RAG Operations: Providing contextual relevance to improve results.
While Qdrant offers excellent vector search capabilities, it is optional. Meilisearch can act as an alternative for certain vector storage tasks. However, Qdrant stands out when vector search is critical for your application, reducing the load on other systems while ensuring high-performance operations.
{id="qdrant-licensing"} Licensing: Apache 2.0, See More
Image Processor
{id="image-processor-component-type"}
Component Type: General Operations
{id="image-processor-language"}
Language: TypeScript
Publishing images often requires creating multiple size and format variants. The image processor handles tasks such as resizing, format conversion, optimization, and more. While primarily designed to work within workflows for pre-caching image variants, it can also be safely exposed as a public-facing service.
However, image manipulation can be resource-intensive. If you choose to utilize this component in a public-facing capacity, ensure adequate memory, compute resources, and proper edge caching to optimize performance.
{id="image-processor-license"} Licensing: Apache 2.0, See More
Open Telemetry
While immature in Bosca, we are levering OTEL as a telemetry system. As such, we can forward events to many of the most popular telemetry services.