Rally Scoring System
The Rally scoring system uses AI-powered evaluation to ensure fair and transparent reward distribution based on content quality and engagement.
TLDR
- Quality Gates: Submissions must pass 4 AI-evaluated gates (alignment, accuracy, compliance, originality)
- Dual Scoring: Fixed quality component + dynamic engagement metrics determine your Campaign Points
- Metric Weights: Each of the 11 metrics has a weight (0–1); 0 disables it, higher means more impact
- Fair Distribution: Projects choose reward distribution curves (Balanced, Default, or Extreme)
- Refresh Engagement: Update engagement metrics in later periods for additional Campaign Points
- Multi-LLM Consensus: All evaluations use multiple AI models for fairness
Overview
The scoring system evaluates tweets across multiple dimensions, combining intrinsic quality metrics with real-time engagement data to calculate fair rewards. Campaign Points determine a content creator’s share of the rewards pot — higher points mean higher rewards. During the Alpha phase, these rewards are distributed as Rally Points.
Categories of Evaluation
At a glance, Rally evaluates content across 3 categories (11 total metrics):
- Gates (4, scored 0–2 each): Content Alignment, Information Accuracy, Campaign Compliance, Originality & Authenticity. Submissions must pass all gates (score > 0) to be eligible for rewards.
- Quality Metrics (2, scored 0–5): Engagement Potential, Technical Quality. These capture intrinsic quality beyond the gates.
- Engagement Metrics (5, dynamic): Retweets, Likes, Replies, Quality of Replies (AI-evaluated), Followers of Repliers. These update over time based on real audience interaction.
Reward Distribution Curves
Projects select how sharply rewards concentrate among top-performing creators:
- Balanced: ~25% of rewards to the top 10%
- Default: ~90% of rewards to the top 10%
- Extreme: ~99% of rewards to the top 10%
These curves are implemented in the scoring formula via alpha (α). Higher α concentrates rewards more among top performers; lower α spreads them more evenly. See the math section for the exact mapping.
Weights (0–1 per metric)
Each of the 11 metrics can be assigned a weight from 0 to 1:
- 1.0: Maximum importance
- 0.0: Metric is completely turned off (does not affect scoring)
- Values in between scale the metric’s relative impact
Weights are visible to campaign managers in the setup wizard and to content creators in the campaign briefing, so priorities are transparent. Projects can tailor these per campaign. For example:
- To emphasize quality over engagement, lower the weights for engagement metrics (RT, LK, RP, QR, FR) and raise the weights for EP and TQ.
- If uniqueness is not a priority, set the Originality & Authenticity gate weight to 0 to turn it off. Turning a gate off means it will not disqualify submissions and will not influence the gate multiplier.
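For illustration, a quality-focused configuration like the one described above might look like the following sketch. The metric keys and values are hypothetical; the actual format is defined by the campaign setup wizard.

```python
# Hypothetical per-campaign weight configuration (keys and values are
# illustrative only, not the wizard's actual schema).
weights = {
    # Gates (setting a gate's weight to 0 turns it off entirely)
    "content_alignment": 1.0,
    "information_accuracy": 1.0,
    "campaign_compliance": 1.0,
    "originality_authenticity": 0.0,  # uniqueness not a priority here
    # Quality metrics (emphasized)
    "engagement_potential": 1.0,      # EP
    "technical_quality": 1.0,         # TQ
    # Engagement metrics (de-emphasized)
    "retweets": 0.3,                  # RT
    "likes": 0.3,                     # LK
    "replies": 0.3,                   # RP
    "quality_of_replies": 0.5,        # QR
    "followers_of_repliers": 0.2,     # FR
}
```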
Refresh Engagement (Overview)
Refresh Engagement allows content creators to update engagement metrics for prior submissions in later periods to capture ongoing performance.
- After a tweet's initial submission in a period, a refresh transaction can be submitted in subsequent periods to update that submission's engagement metrics
- The quality component remains fixed from the first submission; only engagement metrics are refreshed
- The refresh credits only the positive difference versus the previous baseline, rewarding genuine growth in engagement
Detailed Metrics
Gates (pass/fail with quality)
Submissions must pass four quality gates. Each gate is scored from 0–2:
- 0 = Fail (disqualifies the submission)
- 1–2 = Pass, with the value reflecting quality
1. Content Alignment (0–2)
How well the content aligns with the campaign’s message and values:
- Message accuracy
- Correct terminology usage
- Brand consistency
- Target audience fit
2. Information Accuracy (0–2)
Factual correctness of the content:
- Technical accuracy
- Consistency with official materials
- Accurate data and statistics
- Proper context
3. Campaign Compliance (0–2)
Adherence to campaign rules:
- Required hashtags and mentions
- Format requirements
- Style guidelines
- Necessary disclosures
4. Originality & Authenticity (0–2)
Uniqueness and authentic voice:
- Fresh perspective
- Personal insights
- Natural language
- Creative expression
Quality Metrics
Beyond the gates, two additional quality metrics are evaluated:
Engagement Potential (0–5)
- Hook effectiveness
- Call-to-action quality
- Content structure
- Conversation potential
Technical Quality (0–5)
- Grammar and spelling
- Formatting and structure
- Platform optimization
- Media integration
Engagement Metrics
These metrics update over time based on content performance:
Direct Metrics (dynamic)
- Retweets (RT) - Amplification of the message (log-scaled)
- Likes (LK) - Audience appreciation (log-scaled)
- Replies (RP) - Conversation generation (log-scaled)
Advanced Metrics (dynamic)
- Quality of Replies (QR) - AI analysis of reply quality (0–1)
- Followers of Repliers (FR) - Reach of engaged audience (log-scaled)
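As a rough illustration of the log scaling mentioned above (the exact normalization used in production is not spelled out here), raw counts are dampened so that very large numbers do not dominate:

```python
import math

# Sketch: log scaling dampens large raw counts, so 10,000 likes is not
# worth 100x more than 100 likes.
def log_scale(count: int) -> float:
    return math.log(count + 1)

print(log_scale(10))      # ~2.40
print(log_scale(100))     # ~4.62
print(log_scale(10_000))  # ~9.21
```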
Thread Scoring
For multi-tweet threads, we use the peak performance of any tweet in the thread for each metric. Ranges and scaling:
- RT, LK, RP: counted via log(RT+1), log(LK+1), log(RP+1)
- QR: 0–1 AI score reflecting relevance, civility, informativeness
- FR: log(Followers+1) for accounts that replied
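A minimal sketch of the peak-per-metric rule for threads (field names are hypothetical):

```python
# Sketch: for a multi-tweet thread, take the best value of each engagement
# metric across all tweets in the thread (field names are illustrative).
def thread_peaks(tweets: list[dict]) -> dict:
    metrics = ["retweets", "likes", "replies", "quality_of_replies", "follower_reach"]
    return {m: max(t.get(m, 0) for t in tweets) for m in metrics}

thread = [
    {"retweets": 12, "likes": 80, "replies": 5},
    {"retweets": 3, "likes": 200, "replies": 1},
]
print(thread_peaks(thread))
# {'retweets': 12, 'likes': 200, 'replies': 5, 'quality_of_replies': 0, 'follower_reach': 0}
```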
Score Calculation (Full Math)
The Complete Formula
The scoring system uses a multi-step calculation:
Step 1: Gate Pass & Multiplier
gate_pass = min(G₁, G₂, G₃, G₄) > 0
g_star = avg(G₁, G₂, G₃, G₄)
M_gate = 1 + β × (g_star - 1)
Where G₁–G₄ are the four gate scores (0–2) and β = 0.5.
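A sketch of Step 1 in code (function and variable names are illustrative):

```python
# Sketch of Step 1: a submission passes only if every gate scores above 0;
# the multiplier rewards average gate scores above the baseline of 1.
BETA = 0.5

def gate_multiplier(gates: list[float]) -> float | None:
    if min(gates) <= 0:               # any failed gate disqualifies the submission
        return None
    g_star = sum(gates) / len(gates)
    return 1 + BETA * (g_star - 1)

print(gate_multiplier([2, 2, 1, 2]))  # 1.375
print(gate_multiplier([2, 0, 2, 2]))  # None (failed gate)
```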
Step 2: Campaign Points (Q Score)
Campaign_Points = M_gate × Σ(W[i] × normalized_metrics[i])
Where:
- W is the vector of metric weights in [0,1] for the 11 metrics (W[i]=0 turns a metric off; higher values increase impact)
- Normalized metrics include: EP, TQ, log(RT+1), log(LK+1), log(RP+1), QR, log(FR+1)
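A sketch of Step 2, assuming the weighted sum runs over the seven quality and engagement metrics listed above, with engagement counts log-scaled (the exact normalization of EP, TQ, and QR is an assumption here):

```python
import math

# Sketch of Step 2: gate multiplier times the weighted sum of normalized metrics.
# Weight keys and the treatment of EP/TQ/QR are illustrative assumptions.
def campaign_points(m_gate: float, weights: dict[str, float], metrics: dict[str, float]) -> float:
    normalized = {
        "EP": metrics["EP"],                 # 0-5 quality score
        "TQ": metrics["TQ"],                 # 0-5 quality score
        "RT": math.log(metrics["RT"] + 1),   # log-scaled retweets
        "LK": math.log(metrics["LK"] + 1),   # log-scaled likes
        "RP": math.log(metrics["RP"] + 1),   # log-scaled replies
        "QR": metrics["QR"],                 # 0-1 AI reply-quality score
        "FR": math.log(metrics["FR"] + 1),   # log-scaled follower reach of repliers
    }
    return m_gate * sum(weights[k] * value for k, value in normalized.items())
```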
Step 3: Period Accumulation (Refresh Engagement)
- New submissions: user_Q[period] += Campaign_Points
- Refresh Engagement: user_Q[period] += max(0, Q_current - Q_baseline)
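Step 3 as a sketch (names are illustrative):

```python
# Sketch of Step 3: new submissions add their full Campaign Points; a refresh
# adds only the positive delta versus the baseline stored at first submission.
def accumulate(user_q: float, q_current: float, q_baseline: float | None = None) -> float:
    if q_baseline is None:                    # first submission in the campaign
        return user_q + q_current
    return user_q + max(0.0, q_current - q_baseline)
```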
Step 4: Final Distribution (Distribution Curves)
S_user = max(user_Q[period], 0)^α
share_user = S_user / Σ(S_all_users)
rewards_user = share_user × total_rewards
Where α corresponds to the selected curve:
- Balanced → α = 1.0
- Default → α = 3.0
- Extreme → α = 8.0
Higher α values increase concentration (more to the very top performers), while lower α values distribute rewards more broadly.
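A sketch of Step 4 under the α values listed above (names are illustrative):

```python
# Sketch of Step 4: raise each user's accumulated points to the power alpha,
# then split the reward pot proportionally to the resulting shares.
ALPHA = {"balanced": 1.0, "default": 3.0, "extreme": 8.0}

def distribute(points: dict[str, float], total_rewards: float, curve: str) -> dict[str, float]:
    alpha = ALPHA[curve]
    scores = {user: max(q, 0.0) ** alpha for user, q in points.items()}
    total = sum(scores.values())
    if total == 0:
        return {user: 0.0 for user in points}
    return {user: total_rewards * s / total for user, s in scores.items()}

print(distribute({"alice": 10, "bob": 5}, 1000, "default"))
# {'alice': 888.88..., 'bob': 111.11...}  -- a steeper curve concentrates rewards
```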
The Formula (Simplified)
- Gate Multiplier: Exceptional gate performance (scores >1) provides a bonus
- Campaign Points: Weighted combination of all metrics
- Period Accumulation: Scores accumulate throughout the campaign period
- Final Distribution: At period end, rewards are distributed based on relative performance
Refresh Engagement System
Content creators can refresh engagement in later periods to capture additional performance:
- Quality component remains fixed from first submission
- Only engagement metrics are updated
- You earn only the positive difference between the new and previous Campaign Points
- Prevents gaming through repeated refreshes
Tips for High Scores
Quality Tips
- Research the campaign thoroughly before tweeting
- Be authentic - use a distinctive voice and perspective
- Follow all requirements exactly
- Add value - don’t just repeat talking points
Engagement Tips
- Post at optimal times for the intended audience
- Engage with replies to boost conversation
- Create compelling hooks to grab attention
- Use threads strategically - the best-performing tweet counts for each metric
Distribution Curves (at a glance)
Projects choose the model that best fits their goals. At a high level:
- Balanced: ~25% of rewards to the top 10% of content creators
- Default: ~90% of rewards to the top 10% of content creators
- Extreme: ~99% of rewards to the top 10% of content creators
Important Notes
- Images are not evaluated - Focus on text content
- Quality component is permanent - The first submission is the baseline
- Engagement updates - Check back to resubmit high-performing tweets
- Period-based distribution - Rewards distributed at campaign period end
Technical Details
For developers and projects wanting deeper understanding:
- Scores stored as “atto” values (×10^18) for precision
- Vector similarity used for originality comparison
- Non-deterministic LLM calls use consensus validation
- Maximum 2-point variance allowed between validators
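As a rough sketch of two of these details (interpreting the 2-point limit as the max–min spread between validator scores; this is not the actual implementation):

```python
# Sketch: store scores as integer "atto" values (x 10^18) to avoid floating-point
# precision loss, and reject validator sets whose scores spread by more than 2 points.
ATTO = 10**18

def to_atto(score: float) -> int:
    return int(round(score * ATTO))

def within_consensus(validator_scores: list[float], max_spread: float = 2.0) -> bool:
    return max(validator_scores) - min(validator_scores) <= max_spread

print(to_atto(1.375))                      # 1375000000000000000
print(within_consensus([4.0, 5.0, 3.5]))   # True (spread = 1.5)
```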
This scoring system ensures fair rewards for quality content while preventing spam and low-effort submissions. The goal is to support campaigns and audiences with valuable, engaging content.