The Experiments section lets you run controlled tests to optimize feed layout, ranking, ad frequency, and other parameters.

How experiments work

  1. Define variants: Different configurations to test
  2. Assign users: Deterministic assignment based on a hash of the user ID (see the sketch after this list)
  3. Collect metrics: Track engagement per variant
  4. Analyze results: Statistical significance testing
  5. Promote winner: Apply winning config to all users
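
For illustration, here is a minimal sketch of how deterministic assignment can work: hash the user ID together with the experiment ID, map the hash to a bucket from 0-99, and pick the variant whose cumulative traffic allocation covers that bucket. The FNV-1a hash and all names below are illustrative assumptions, not the platform's actual implementation.

// Minimal sketch of deterministic variant assignment (illustrative only).
interface Variant {
  name: string;
  trafficPercent: number; // allocations should sum to 100
}

// 32-bit FNV-1a hash over the string's code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

function assignVariant(userId: string, experimentId: string, variants: Variant[]): string {
  // The same user + experiment always hashes to the same bucket, so assignment is stable across sessions.
  const bucket = fnv1a(`${experimentId}:${userId}`) % 100;
  let cumulative = 0;
  for (const variant of variants) {
    cumulative += variant.trafficPercent;
    if (bucket < cumulative) return variant.name;
  }
  return variants[0].name; // fallback if allocations don't cover 100%
}

// Example: a 50/50 split that stays stable for the same user.
assignVariant("user-123", "recency-vs-engagement", [
  { name: "control", trafficPercent: 50 },
  { name: "more-recency", trafficPercent: 50 },
]);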

Creating an experiment

Step 1: Basic info

Go to Experiments → Create New:
  1. Name: Descriptive name (e.g., “Recency vs Engagement Weight Test”)
  2. Description: Hypothesis and goals
  3. Type: What you’re testing

Experiment types

| Type | What You Can Test |
| --- | --- |
| Ranking | Signal weights, decay rates, thresholds |
| Layout | Feed orientation, height, adjacent feeds |
| Controls | Enabled controls, auto-hide settings |
| Theme | Colors, fonts, control styling |
| Ad frequency | Ad placement intervals, modes |
| Custom | Any SDK config parameter |

Step 2: Define variants

Add 2-4 variants:
Variant A (Control): Current weights
{
  "weights": {
    "recency": 0.25,
    "engagement": 0.25,
    "geoRelevance": 0.15,
    "topicAffinity": 0.20,
    "completionRate": 0.15
  }
}
Variant B (Test): More recency
{
  "weights": {
    "recency": 0.40,
    "engagement": 0.15,
    "geoRelevance": 0.15,
    "topicAffinity": 0.15,
    "completionRate": 0.15
  }
}
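
Before launching, it's worth checking that each variant's signal weights sum to 1.0, as they do in the examples above; otherwise one variant's ranking behaves differently for reasons unrelated to the hypothesis. A minimal sketch (the helper below is illustrative, not part of the platform):

// Illustrative check: signal weights are expected to sum to 1.0.
function validateWeights(weights: Record<string, number>, tolerance = 1e-6): void {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  if (Math.abs(total - 1.0) > tolerance) {
    throw new Error(`Signal weights sum to ${total.toFixed(2)}, expected 1.0`);
  }
}

// Variant B from above: 0.40 + 0.15 + 0.15 + 0.15 + 0.15 = 1.00
validateWeights({ recency: 0.40, engagement: 0.15, geoRelevance: 0.15, topicAffinity: 0.15, completionRate: 0.15 });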

Step 3: Traffic allocation

Set percentage of users for each variant:
  • Equal split: Automatically divide traffic
  • Custom split: Set specific percentages (must sum to 100%)
For high-traffic apps, start with a small test population (e.g., 10%) before scaling to full traffic.
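
As a small illustration of these rules (the helpers are hypothetical, not dashboard APIs), an equal split simply divides 100% across the variants, while a custom split should be validated to sum to exactly 100%:

// Hypothetical helpers mirroring the dashboard's allocation rules.
function equalSplit(variantCount: number): number[] {
  const base = Math.floor(100 / variantCount);
  const split = Array(variantCount).fill(base);
  split[0] += 100 - base * variantCount; // give any rounding remainder to the first variant
  return split;
}

function assertValidSplit(percentages: number[]): void {
  const total = percentages.reduce((sum, p) => sum + p, 0);
  if (total !== 100) throw new Error(`Traffic split sums to ${total}%, expected 100%`);
}

equalSplit(3);              // [34, 33, 33]
assertValidSplit([10, 90]); // e.g., a 10% test population against a 90% control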

Step 4: Success metrics

Select primary and secondary metrics.
Primary metrics (choose 1-2):
  • Watch time per session
  • Session duration
  • Videos watched per session
  • Completion rate
  • Return rate (next day)
Secondary metrics (optional):
  • Share rate
  • Ad impressions
  • Ad revenue
  • Rebuffer rate
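
To make the metric definitions concrete, here is a rough sketch of how two of the primary metrics could be computed from raw playback events. The event shape is an assumption for illustration; the platform tracks these per variant automatically.

// Hypothetical event shape, for illustration only.
interface PlaybackEvent {
  sessionId: string;
  watchTimeSeconds: number;
  completed: boolean;
}

// Watch time per session: total watch time divided by the number of distinct sessions.
function watchTimePerSession(events: PlaybackEvent[]): number {
  const sessions = new Set(events.map((e) => e.sessionId)).size;
  const totalWatchTime = events.reduce((sum, e) => sum + e.watchTimeSeconds, 0);
  return sessions === 0 ? 0 : totalWatchTime / sessions;
}

// Completion rate: fraction of videos watched to the end.
function completionRate(events: PlaybackEvent[]): number {
  return events.length === 0 ? 0 : events.filter((e) => e.completed).length / events.length;
}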

Step 5: Launch

Review and launch:
  1. Preview how variants will look
  2. Set experiment duration (recommended: 2+ weeks)
  3. Click Launch

Managing experiments

Experiment states

| State | Description |
| --- | --- |
| Draft | Not yet launched, can edit |
| Running | Actively collecting data |
| Paused | Temporarily stopped |
| Completed | Reached end date or stopped |
| Archived | Historical record |

Monitoring

While running, view:
  • Users per variant
  • Real-time metrics
  • Statistical significance progress

Pausing

Pause an experiment to:
  • Investigate unexpected results
  • Fix a bug in one variant
  • Temporarily stop traffic split
Paused experiments can be resumed.

Analyzing results

Results dashboard

View at Experiments → [Experiment] → Results:
| Metric | Control | Test | Lift | Confidence |
| --- | --- | --- | --- | --- |
| Watch time | 45.2s | 52.1s | +15.3% | 94% |
| Completion rate | 42% | 48% | +14.3% | 97% |
| Session length | 4.2 min | 4.8 min | +14.3% | 91% |

Statistical significance

Results show:
  • Lift: Percentage change vs. control
  • Confidence: Probability that the difference is real (not random)
  • Status: Significant (>95%), trending, or inconclusive
Wait for at least 95% confidence before making decisions. Early results often change as more data is collected.
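
To make lift and confidence concrete, here is a minimal sketch of one common approach, a two-proportion z-test, applied to a rate metric such as completion rate. It illustrates the statistics involved and is not necessarily the exact method the dashboard uses.

// Illustrative two-proportion z-test for a rate metric (e.g., completion rate).
function twoProportionZTest(
  successesA: number, totalA: number, // control: completions, plays
  successesB: number, totalB: number  // test: completions, plays
): { lift: number; confidence: number } {
  const pA = successesA / totalA;
  const pB = successesB / totalB;
  const pPooled = (successesA + successesB) / (totalA + totalB);
  const standardError = Math.sqrt(pPooled * (1 - pPooled) * (1 / totalA + 1 / totalB));
  const z = (pB - pA) / standardError;
  const pValue = 2 * (1 - normalCdf(Math.abs(z))); // two-sided
  return {
    lift: (pB - pA) / pA,   // e.g., 0.143 => +14.3% vs. control
    confidence: 1 - pValue, // e.g., 0.97 => 97%
  };
}

// Abramowitz-Stegun approximation of the standard normal CDF (valid for x >= 0).
function normalCdf(x: number): number {
  const t = 1 / (1 + 0.2316419 * x);
  const density = Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI);
  const poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
  return 1 - density * poly;
}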

Segmented analysis

Break down results by:
  • Platform (iOS/Android/Web)
  • Region
  • User tenure (new vs. returning)
  • Device type

Promoting a winner

When you have significant results:
  1. Go to experiment results
  2. Click Promote Variant on the winning variant
  3. Choose:
    • Apply to 100%: Immediately apply to all users
    • Gradual rollout: Slowly increase from test percentage
The winning configuration becomes the new default.
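
Conceptually, promotion means the winning variant's configuration overrides the previous default, either immediately or by ramping its traffic allocation toward 100%. A rough sketch under those assumptions (all names are illustrative):

// Illustrative only: the variant's config overrides the current default SDK config.
interface FeedConfig { [key: string]: unknown; }

function promoteImmediately(currentDefault: FeedConfig, winner: FeedConfig): FeedConfig {
  return { ...currentDefault, ...winner }; // apply to 100% of users
}

// Gradual rollout: ramp from the test percentage toward 100% in equal steps.
function rolloutSchedule(startPercent: number, steps: number): number[] {
  const increment = (100 - startPercent) / steps;
  return Array.from({ length: steps }, (_, i) => Math.round(startPercent + increment * (i + 1)));
}

rolloutSchedule(10, 3); // [40, 70, 100]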

Best practices

Experiment design

  • Test one variable at a time: Changing multiple variables makes it hard to attribute results. If you're testing both ranking weights and ad frequency, run separate experiments.
  • Run long enough: Short experiments may show false positives. Run for at least 2 weeks to capture different usage patterns (weekdays vs. weekends, etc.).
  • Account for traffic volume: More users means faster statistically significant results. For smaller apps, plan for a longer experiment duration.
  • Document your hypothesis: Write down what you expect to happen and why. This helps interpret results and plan follow-up tests.

Common experiments

| Goal | What to Test |
| --- | --- |
| Increase watch time | Ranking weights, content diversity |
| Improve retention | Session start experience, personalization depth |
| Boost completion | Content length filters, quality thresholds |
| Increase ad revenue | Ad frequency, placement timing |
| Reduce rebuffering | ABR settings, prefetch depth |

Experiment history

View all past experiments at Experiments → Completed:
  • Search by name, type, or date
  • Filter by outcome (positive lift, no change, negative)
  • Export results for reporting

Next steps