Recommendations

Deliver personalized content recommendations powered by analytics, audience segments, and machine learning.

Bosca includes a recommendation engine that surfaces the right content for each user. It combines analytics-driven strategies, semantic similarity, audience segmentation, and machine learning to generate personalized suggestions for both metadata items and collections.

What you get

  • Multiple strategy types — trending, segment-based, content similarity, curated lists, collaborative filtering, and ML-powered predictions
  • Placement system — define named display slots in your UI and control which strategies feed each one
  • Segment targeting — strategies can target specific audience segments or apply to all users
  • Scheduled evaluation — strategies re-evaluate automatically on cron schedules to keep recommendations fresh
  • Dismissals — users can dismiss content they don't want to see, and undo dismissals later
  • Diversity controls — category caps and freshness boosting prevent filter bubbles
  • TensorFlow Recommenders — train two-tower retrieval models on your interaction data and serve predictions via TF Serving
  • Works for anonymous users — trending and curated strategies serve content without requiring a profile

Strategy Types

Recommendations are generated by strategies, each using a different approach to match content with users.

| Type | How it works | Data source |
| --- | --- | --- |
| TRENDING | Ranks content by recent interaction velocity across all users | Trino analytics queries over Iceberg events |
| SEGMENT_BASED | Surfaces content popular among users in the same audience segments | Trino queries filtered by segment membership |
| CONTENT_BASED | Finds content similar to what a user has interacted with, using categories, labels, and vector embeddings | Meilisearch hybrid search with semantic similarity |
| CURATED | Admin-managed content lists for editorial picks or featured content | Manual selection, no analytics query needed |
| COLLABORATIVE | Recommends content that similar users engaged with but the target user hasn't seen | Trino co-occurrence query with IDF weighting (seeded by default installer) |
| ML_MODEL | Uses a trained TensorFlow Recommenders model to predict personalized content rankings | TF Serving REST API backed by a TFRS two-tower model |
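
The COLLABORATIVE strategy is described as a co-occurrence query with IDF weighting. The Trino query itself is not shown here, so the following is only a minimal in-memory sketch of one common reading of that idea: users who share items with the target user contribute their unseen items, and shared items that almost everyone has touched count for less. The function name and the data layout are hypothetical.

```python
import math
from collections import defaultdict

def collaborative_scores(interactions, target_user):
    """Score unseen items for target_user.

    interactions: dict mapping user id -> set of item ids (assumed layout).
    Returns (item, score) pairs sorted by descending score.
    """
    n_users = len(interactions)

    # Count how many users interacted with each item.
    item_user_counts = defaultdict(int)
    for items in interactions.values():
        for item in items:
            item_user_counts[item] += 1

    # IDF: items seen by few users carry more signal than ubiquitous ones.
    idf = {item: math.log(n_users / count)
           for item, count in item_user_counts.items()}

    seen = interactions.get(target_user, set())
    scores = defaultdict(float)
    for user, items in interactions.items():
        if user == target_user:
            continue
        shared = seen & items
        if not shared:
            continue
        # User-to-user weight: sum of IDF over the co-occurring items.
        weight = sum(idf[item] for item in shared)
        # Only items the target user has not seen are candidates.
        for item in items - seen:
            scores[item] += weight

    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

example = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"b", "d"},
    "u4": {"a"},
}
ranked = collaborative_scores(example, "u1")
```

In this example "d" ranks above "c" because it arrives through the rarer shared item "b" rather than the widely seen item "a".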

Strategy Lifecycle

| Status | Description |
| --- | --- |
| DRAFT | Being configured, not yet generating recommendations |
| ACTIVE | Live and producing recommendations on its evaluation schedule |
| PAUSED | Temporarily suspended |
| ARCHIVED | Retired and no longer evaluated |

Placements

A placement is a named location in your application where recommendations are displayed — for example, home_feed, article_sidebar, or post_read_next. Each placement links to one or more strategies in priority order and defines a maximum number of items.

When a client requests recommendations for a placement, the system:

  1. Resolves the strategies linked to that placement
  2. Filters to strategies applicable to the requesting user's segments
  3. Queries pre-computed recommendations from each strategy
  4. Runs the results through the assembler pipeline
  5. Returns a ranked list
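
The request flow above can be sketched in a few lines of Python. This is an illustration rather than Bosca's implementation: the `Strategy` and `Placement` classes, the `precomputed` lookup, and the rule that an empty segment set means "applies to all users" are assumptions made for the example, and the assembler is reduced to a plain sort by score.

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    id: str
    # Assumption: an empty set means the strategy applies to every user.
    segment_ids: frozenset = frozenset()

@dataclass
class Placement:
    slug: str
    max_items: int
    strategy_ids: list  # assumed to be stored in priority order

def rank_by_score(candidates):
    """Stand-in for the assembler pipeline: sort by descending score."""
    return sorted(candidates, key=lambda c: (-c[1], c[0]))

def recommend_for_placement(placement, strategies, user_segments,
                            precomputed, assemble=rank_by_score):
    # 1. Resolve the strategies linked to the placement.
    linked = [strategies[sid] for sid in placement.strategy_ids
              if sid in strategies]
    # 2. Keep strategies that target one of the user's segments (or everyone).
    applicable = [s for s in linked
                  if not s.segment_ids or s.segment_ids & user_segments]
    # 3. Collect pre-computed (content_id, score) pairs from each strategy.
    candidates = []
    for strategy in applicable:
        candidates.extend(precomputed.get(strategy.id, []))
    # 4 and 5. Assemble, then return at most max_items ranked entries.
    return assemble(candidates)[:placement.max_items]

strategies = {"s1": Strategy("s1"), "s2": Strategy("s2", frozenset({"vip"}))}
placement = Placement("home_feed", 2, ["s1", "s2", "missing"])
precomputed = {"s1": [("a", 0.5), ("b", 0.4)], "s2": [("c", 0.9)]}
vip_feed = recommend_for_placement(placement, strategies, {"vip"}, precomputed)
anon_feed = recommend_for_placement(placement, strategies, set(), precomputed)
```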

Recommendation Assembly

Raw recommendations from multiple strategies pass through an assembly pipeline before being served:

  1. Dismissed filtering — content the user has dismissed is removed
  2. Deduplication — when multiple strategies recommend the same content, only the highest-scoring entry is kept
  3. Freshness boost — scores are adjusted by a time-decay factor so recently generated recommendations surface above stale ones
  4. Category diversity cap — no single category dominates the result set (configurable, default 3 items per category)
  5. Final ranking — sorted by adjusted score, top N returned
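
A minimal sketch of these five steps follows. The `Rec` record, the exponential half-life form of the freshness decay, and the parameter names are assumptions for illustration; the source only states that a time-decay factor is applied and that the category cap defaults to 3. The cap and the final ranking are applied in one pass over the score-sorted list.

```python
from dataclasses import dataclass

@dataclass
class Rec:
    content_id: str
    category: str
    score: float
    age_hours: float  # hours since this recommendation was generated

def assemble(recs, dismissed, top_n, per_category_cap=3, half_life_hours=24.0):
    # 1. Dismissed filtering.
    kept = [r for r in recs if r.content_id not in dismissed]

    # 2. Deduplication: keep the highest-scoring entry per content id.
    best = {}
    for r in kept:
        current = best.get(r.content_id)
        if current is None or r.score > current.score:
            best[r.content_id] = r

    # 3. Freshness boost: assumed exponential decay with a half-life.
    adjusted = [(r.score * 0.5 ** (r.age_hours / half_life_hours), r)
                for r in best.values()]

    # 4 and 5. Walk the ranked list, skipping categories that hit the cap.
    adjusted.sort(key=lambda pair: (-pair[0], pair[1].content_id))
    per_category = {}
    result = []
    for score, r in adjusted:
        if per_category.get(r.category, 0) >= per_category_cap:
            continue
        per_category[r.category] = per_category.get(r.category, 0) + 1
        result.append((r.content_id, score))
        if len(result) == top_n:
            break
    return result

recs = [
    Rec("a", "news", 0.9, 0.0),
    Rec("a", "news", 0.5, 0.0),    # duplicate of "a" with a lower score
    Rec("b", "news", 0.8, 24.0),   # one half-life old, decays to 0.4
    Rec("c", "sports", 0.7, 0.0),
    Rec("d", "news", 0.95, 0.0),   # dismissed below
    Rec("e", "news", 0.6, 0.0),
]
feed = assemble(recs, dismissed={"d"}, top_n=5, per_category_cap=2)
```

With a cap of 2, the stale "b" entry is dropped because two fresher news items already outrank it.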

How It Stays Fresh

Recommendations update automatically through scheduled evaluation:

  • Each strategy can have an evaluation schedule (cron expression) that triggers periodic re-evaluation
  • Trending strategies typically refresh hourly, segment-based strategies every 6 hours, and ML model strategies daily
  • The scheduler runs the strategy's analytics query against the latest Iceberg event data and upserts fresh scores
  • Expired recommendations are cleaned up automatically by a periodic job
  • Content-based similarity is always real-time — it queries Meilisearch on demand, so new content is discoverable immediately after indexing
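
To make the cadences concrete, the sketch below pairs each one with a five-field cron expression and expands it into the times of day it would fire. Only "0 * * * *" appears in this document; the other two expressions are standard cron equivalents chosen for illustration. The tiny parser is hypothetical: it handles only `*`, `*/n`, and single integers in the minute and hour fields, and ignores the day and month fields.

```python
# Assumed example schedules; only the hourly one is taken from this document.
SCHEDULES = {
    "TRENDING": "0 * * * *",         # every hour, on the hour
    "SEGMENT_BASED": "0 */6 * * *",  # every 6 hours
    "ML_MODEL": "0 0 * * *",         # once a day at midnight
}

def expand_field(field, lo, hi):
    """Expand one cron field: '*', '*/n', or a single integer."""
    if field == "*":
        return list(range(lo, hi + 1))
    if field.startswith("*/"):
        return list(range(lo, hi + 1, int(field[2:])))
    return [int(field)]

def daily_run_times(expr):
    """Return (hour, minute) pairs at which the expression fires each day."""
    minute, hour, *_ = expr.split()
    return [(h, m)
            for h in expand_field(hour, 0, 23)
            for m in expand_field(minute, 0, 59)]

runs_per_day = {name: len(daily_run_times(expr))
                for name, expr in SCHEDULES.items()}
```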

Machine Learning with TensorFlow Recommenders

For platforms with sufficient interaction data, Bosca supports ML-powered recommendations using TensorFlow Recommenders (TFRS).

Architecture

recommendation-trainer (Python)
  1. Connect to Trino
  2. Load interaction history
  3. Train two-tower model
  4. Upload model to Bosca storage

recommendation-model-loader (sidecar)
  • Polls Bosca for new models
  • Downloads to TF Serving

tf-serving (Google)
  • Loads trained SavedModel
  • Serves predictions via REST
  • Hot-reloads new model versions

bosca-server (Kotlin)
  • ML_MODEL strategy evaluation
  • Calls TF Serving
  • Maps predictions to recommendations
Two-Tower Model

The TFRS model uses a two-tower architecture:

  • User tower — maps user IDs to embedding vectors based on interaction patterns
  • Content tower — maps content IDs plus features (content type, language, categories) to embedding vectors
  • Scoring — the dot product of user and content embeddings predicts relevance

The model is trained on implicit feedback from Bosca's analytics events (impressions, interactions, completions) loaded directly from Trino.
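
The scoring step can be illustrated without TensorFlow. In the sketch below the embedding vectors are made-up numbers standing in for the outputs of the two towers, and retrieval is an exact brute-force scan over all content, not an approximate index. The function names are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def top_k(user_embedding, content_embeddings, k):
    """Rank content by dot-product relevance and return the best k.

    content_embeddings: dict mapping content id -> embedding vector.
    """
    scored = [(content_id, dot(user_embedding, vec))
              for content_id, vec in content_embeddings.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]

# Illustrative 2-dimensional embeddings (real towers emit larger vectors).
user = [1.0, 0.0]
content = {
    "a": [0.9, 0.1],
    "b": [0.1, 0.9],
    "c": [0.5, 0.5],
}
best_two = top_k(user, content, k=2)
```

Content whose embedding points in the same direction as the user's embedding scores highest, so "a" and "c" are returned ahead of "b".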

Training Pipeline

Training is triggered by Bosca's TrainModelJob, which calls the trainer service's HTTP endpoint. The pipeline then runs as follows:

  1. The trainer queries Trino for interaction data and content features
  2. It trains the two-tower retrieval model
  3. It builds a ScaNN index for fast approximate nearest neighbor search
  4. It exports the SavedModel and uploads it to Bosca's content storage
  5. The model loader sidecar downloads the new version to TF Serving, which detects it and hot-reloads it

Model artifacts are stored as Bosca metadata content — versioned, access-controlled, and backed up with everything else.

Cold Start

Cold start is covered by strategies that need little or no interaction history:

  • New users get trending and curated content, plus segment-based recommendations if they belong to any segments
  • New content is immediately discoverable via Meilisearch similarity (embeddings are generated at index time) and appears in trending feeds once it accumulates interactions
  • ML models use content features (type, language, categories) alongside IDs, so items with zero interaction history still receive predictions based on their metadata
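
The new-user case can be read as a simple eligibility rule per strategy type. The sketch below encodes that reading. The function and its parameters are hypothetical, and the assumption that CONTENT_BASED, COLLABORATIVE, and ML_MODEL only apply once a user has some interaction history is inferred from the list above, not stated outright.

```python
def applicable_strategy_types(has_profile, segment_ids, interaction_count):
    """Return the strategy types that can serve this user (assumed rules)."""
    # Trending and curated content need no profile at all.
    types = ["TRENDING", "CURATED"]
    # Segment-based recommendations need segment membership only.
    if has_profile and segment_ids:
        types.append("SEGMENT_BASED")
    # Assumption: the remaining types need at least one recorded interaction.
    if has_profile and interaction_count > 0:
        types += ["CONTENT_BASED", "COLLABORATIVE", "ML_MODEL"]
    return types

anonymous = applicable_strategy_types(False, set(), 0)
new_member = applicable_strategy_types(True, {"newsletter"}, 0)
returning = applicable_strategy_types(True, set(), 12)
```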

User Engagement Tracking

User interactions with recommended content (views, clicks, completions) are tracked through Bosca's existing analytics event pipeline — not a separate tracking system. When a client displays a recommendation, it should include attribution context in the analytics event's extras field so that strategy effectiveness can be measured:

{
  "type": "Interaction",
  "element": {
    "type": "recommendation",
    "content": [{"id": "content-uuid", "type": "metadata"}],
    "extras": {
      "recommendation_strategy": "trending-24h",
      "recommendation_position": 3
    }
  }
}

This data feeds back into strategy evaluation on the next cycle, creating a continuous improvement loop.
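
On the client side, the attribution payload shown above could be built with a small helper like the one below. The helper name and its parameters are hypothetical; the field names mirror the JSON example, and how the event is submitted to the analytics pipeline is out of scope here.

```python
import json

def recommendation_interaction_event(content_id, strategy, position,
                                     content_type="metadata"):
    """Build an Interaction event carrying recommendation attribution."""
    return {
        "type": "Interaction",
        "element": {
            "type": "recommendation",
            "content": [{"id": content_id, "type": content_type}],
            "extras": {
                # Which strategy produced the item and where it was shown.
                "recommendation_strategy": strategy,
                "recommendation_position": position,
            },
        },
    }

event = recommendation_interaction_event("content-uuid", "trending-24h", 3)
payload = json.dumps(event)
```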

Dismissals are the one exception — they are stored as a persistent user preference (not an analytics event) because they need to be:

  • Queried at serving time with low latency
  • Revocable (users can undo a dismissal)
  • Filtered in real-time, not on a batch schedule
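
These three requirements suggest a keyed lookup instead of an event log. The class below is a minimal in-memory sketch of that shape, assuming dismissals are keyed by profile and metadata id as in the GraphQL mutations; the class and method names are hypothetical and a real store would be persistent.

```python
class DismissalStore:
    """In-memory stand-in for a persistent per-profile dismissal store."""

    def __init__(self):
        self._dismissed = {}  # profile id -> set of dismissed metadata ids

    def dismiss(self, profile_id, metadata_id):
        self._dismissed.setdefault(profile_id, set()).add(metadata_id)

    def undismiss(self, profile_id, metadata_id):
        # Revocable: removing an id that was never dismissed is a no-op.
        self._dismissed.get(profile_id, set()).discard(metadata_id)

    def filter(self, profile_id, metadata_ids):
        # Set membership keeps the serving-time check cheap per item.
        hidden = self._dismissed.get(profile_id, set())
        return [m for m in metadata_ids if m not in hidden]

store = DismissalStore()
store.dismiss("p1", "m2")
after_dismiss = store.filter("p1", ["m1", "m2", "m3"])
store.undismiss("p1", "m2")
after_undo = store.filter("p1", ["m1", "m2", "m3"])
```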

GraphQL API

Querying Recommendations

# Personalized feed for a profile
query {
  recommendation {
    profile(profileId: "...", offset: 0, limit: 10) {
      metadata { id name contentType }
      collection { id name }
      score
      reason
      strategy { name type }
    }
  }
}

# Recommendations for a specific UI placement
query {
  recommendation {
    placement(profileId: "...", placementSlug: "home_feed", limit: 5) {
      metadata { id name }
      score
    }
  }
}

# Similar content (real-time vector search)
query {
  recommendation {
    similar(metadataId: "...", limit: 5) {
      metadata { id name }
      score
    }
  }
}

# Trending (works without authentication)
query {
  recommendation {
    trending(offset: 0, limit: 10) {
      metadata { id name }
      score
    }
  }
}
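
From application code, these queries are ordinary GraphQL-over-HTTP POST requests. The sketch below builds the placement query with Python's standard library. The endpoint URL and the bearer-token header are assumptions, not documented Bosca details, and the arguments are inlined as in the examples above so that no schema variable types have to be guessed.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8000/graphql"  # hypothetical endpoint

def placement_query(profile_id, placement_slug, limit):
    """Build the placement query text with safely quoted string arguments."""
    return (
        "query { recommendation { placement("
        f"profileId: {json.dumps(profile_id)}, "
        f"placementSlug: {json.dumps(placement_slug)}, "
        f"limit: {int(limit)}) "
        "{ metadata { id name } score } } }"
    )

def build_request(url, query, token=None):
    """Wrap a query in a JSON POST request; the auth scheme is an assumption."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

def fetch(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_request(GRAPHQL_URL, placement_query("profile-id", "home_feed", 5))
```

Calling `fetch(request)` against a running server would return the decoded response body; the request is only constructed here so the example has no network dependency.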

Managing Dismissals

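mutation {
  recommendation {
    # Hide an item from this profile's recommendations
    dismiss(profileId: "...", metadataId: "...")
    # Undo an earlier dismissal (shown together here; normally sent as its own request)
    undismiss(profileId: "...", metadataId: "...")
  }
}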

Administration

Strategies and placements are managed through admin-only fields nested under recommendation:

# Create a trending strategy with hourly refresh
mutation {
  recommendation {
    strategies {
      add(
        strategy: {
          name: "Trending Content"
          type: TRENDING
          status: ACTIVE
          analyticsQueryId: "..."
          evaluationSchedule: "0 * * * *"
          priority: 5
          maxRecommendations: 20
        }
        segmentIds: []
      ) { id name evaluationSchedule }
    }
  }
}

# Create a placement that blends multiple strategies
mutation {
  recommendation {
    placements {
      add(
        placement: {
          name: "Home Feed"
          slug: "home_feed"
          maxItems: 10
        }
        strategyIds: ["strategy-1", "strategy-2"]
      ) { id slug }
    }
  }
}

# Browse strategies and placements
query {
  recommendation {
    strategies {
      all(offset: 0, limit: 10) { id name type status }
    }
    placements {
      all { id name slug maxItems }
    }
  }
}

Default Setup

On first startup, Bosca's package installer seeds:

  • Trending Content analytics query (interaction velocity over 7 days with recency weighting)
  • Popular Content analytics query (interaction counts over 14 days)
  • Collaborative Filtering analytics query (co-occurrence with IDF weighting over 30 days)
  • A Trending strategy with hourly evaluation
  • A Popular strategy with 6-hour evaluation
  • A Collaborative Filtering strategy with daily evaluation
  • A home_feed placement linking all strategies

These defaults provide working recommendations out of the box as soon as analytics data starts flowing.

For developers

Related modules:

  • Core interfaces: backend/framework/core-recommendations
  • Implementation: backend/framework/recommendations
  • ML pipeline: ml/recommendation-trainer
  • GraphQL schema: backend/framework/recommendations/src/main/resources/graphql/recommendations.graphqls
