Comparing Rate Limiting Strategies

Rate limiting protects the LudoKing API infrastructure from traffic spikes, abusive clients, and denial-of-service conditions while ensuring fair resource allocation across all consumers. Choosing the right algorithm depends on your traffic patterns, consistency requirements, and infrastructure constraints. The three dominant algorithms — fixed window, sliding window, and token bucket — each make distinct trade-offs between implementation complexity, memory efficiency, and fairness.

Fixed window counting divides time into discrete windows (e.g., 60-second intervals) and counts requests within each window. It is the simplest algorithm to implement but suffers from the boundary burst problem: a client can send twice its limit in the final second of one window and the first second of the next, effectively doubling throughput at window boundaries. For a Ludo multiplayer API where game-match endpoints receive burst traffic at match start and end, this creates dangerous traffic spikes.
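A minimal in-memory sketch (illustrative only, not LudoKing code) makes the boundary burst concrete:

TypeScript — fixed window counter and the boundary burst

```typescript
// Minimal in-memory fixed window counter. The class name and 60s default
// window are illustrative assumptions, not part of the LudoKing API.
class FixedWindowCounter {
  private counts = new Map<string, number>();

  constructor(private limit: number, private windowMs = 60_000) {}

  allow(clientId: string, now = Date.now()): boolean {
    // All requests in the same window share one counter key.
    const windowKey = `${clientId}:${Math.floor(now / this.windowMs)}`;
    const count = (this.counts.get(windowKey) ?? 0) + 1;
    this.counts.set(windowKey, count);
    return count <= this.limit;
  }
}

// Boundary burst: 100 requests at t=59.9s and 100 more at t=60.1s all pass,
// because they land in different windows: 200 requests within 0.2 seconds.
const fw = new FixedWindowCounter(100);
let passed = 0;
for (let i = 0; i < 100; i++) if (fw.allow('c1', 59_900)) passed++;
for (let i = 0; i < 100; i++) if (fw.allow('c1', 60_100)) passed++;
// passed === 200
```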

Sliding window logging tracks every request timestamp in a sorted data structure (typically a Redis sorted set) and counts only requests within the current time range. This eliminates boundary bursts completely and provides a smooth, accurate rate limit. The cost is higher memory usage — each request needs a timestamp entry — and slower lookups for high-throughput endpoints. However, with Redis sorted sets and pipelined Lua scripts, sliding window logging handles thousands of requests per second efficiently.

Token bucket models each client as a bucket that fills with tokens at a constant rate. Each request consumes one token; if the bucket is empty, the request is rejected. The bucket has a maximum capacity, allowing clients to accumulate tokens during idle periods and consume them in bursts. Token bucket is ideal for Ludo game APIs because match-making creates natural burst patterns: a client may send 10 rapid requests when joining a lobby, then go quiet during gameplay. Token bucket accommodates this pattern without penalizing the client, whereas fixed or sliding window would throttle the burst unfairly.

Redis Token Bucket Implementation

The token bucket algorithm requires storing two pieces of state per client: the current token count and the last refill timestamp. Redis is the canonical backing store because its atomic Lua scripting capability ensures that read-modify-write operations on the bucket happen atomically, eliminating race conditions under concurrent load. The Lua script executes the entire token consumption logic in a single Redis operation, guaranteeing that two simultaneous requests from the same client cannot both consume the same token.

Lua — Redis token bucket via Lua script
-- KEYS[1] = bucket key (e.g., "ratelimit:player:abc123:games")
-- ARGV[1] = capacity (max tokens)
-- ARGV[2] = refill rate (tokens per second)
-- ARGV[3] = current timestamp in milliseconds
-- ARGV[4] = tokens to consume (default 1)

local key        = KEYS[1]
local capacity   = tonumber(ARGV[1])
local refillRate = tonumber(ARGV[2])
local now       = tonumber(ARGV[3])
local toConsume = tonumber(ARGV[4]) or 1

-- Fetch current bucket state
local bucket = redis.call('HMGET', key, 'tokens', 'lastRefill')
local tokens     = tonumber(bucket[1]) or capacity
local lastRefill = tonumber(bucket[2]) or now

-- Calculate tokens to add based on elapsed time
local elapsed      = (now - lastRefill) / 1000.0
local tokensToAdd = elapsed * refillRate
tokens = math.min(capacity, tokens + tokensToAdd)

local allowed = 0
local newTokens = tokens

if tokens >= toConsume then
  allowed   = 1
  newTokens = tokens - toConsume
end

-- Persist updated bucket state with TTL (2x window for cleanup).
-- lastRefill always advances to now: the token count above already includes
-- the elapsed refill, so keeping the old timestamp would double-count the
-- refill on the next call (over-refilling denied clients).
redis.call('HSET', key, 'tokens', newTokens, 'lastRefill', now)
redis.call('EXPIRE', key, 120)

-- Return [allowed (0|1), remaining tokens, retry-after ms]
local retryAfter = 0
if allowed == 0 then
  retryAfter = math.ceil((toConsume - newTokens) / refillRate * 1000)
end
return { allowed, math.floor(newTokens), retryAfter }

The Lua script executes atomically within Redis's single-threaded event loop, making it safe for concurrent access without locks. The capacity parameter defines the maximum burst size — for the LudoKing API's free tier game endpoints (100 requests/minute), a capacity of 100 with a refill rate of 1.67 tokens/second allows a client to accumulate tokens during idle periods and then burst up to 100 requests instantly. The TTL of 120 seconds ensures stale bucket keys are automatically evicted, preventing unbounded Redis memory growth.
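As a quick sanity check on those numbers: a per-minute quota converts to bucket parameters by dividing by 60. The helper name below is illustrative:

TypeScript — deriving bucket parameters from a per-minute quota

```typescript
// Hedged sketch: derive token bucket parameters from a per-minute quota.
// bucketParamsFromMinuteLimit is an illustrative name, not part of the API.
function bucketParamsFromMinuteLimit(requestsPerMinute: number) {
  return {
    capacity: requestsPerMinute,        // full burst up to the quota
    refillRate: requestsPerMinute / 60, // tokens per second
  };
}

const freeTier = bucketParamsFromMinuteLimit(100);
// freeTier.capacity === 100, freeTier.refillRate ≈ 1.67 tokens/s
```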

TypeScript — TokenBucketRateLimiter class
import Redis from 'ioredis';

interface RateLimitConfig {
  capacity: number;    // Max tokens in bucket
  refillRate: number;  // Tokens added per second
  windowSecs: number;  // Reporting window (e.g. 60 for per-minute)
}

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
  limit: number;
  resetAt: number;      // Unix timestamp when bucket refills fully
}

const LUA_TOKEN_BUCKET = `
local key        = KEYS[1]
local capacity   = tonumber(ARGV[1])
local refillRate = tonumber(ARGV[2])
local now       = tonumber(ARGV[3])
local toConsume = tonumber(ARGV[4]) or 1
local bucket = redis.call('HMGET', key, 'tokens', 'lastRefill')
local tokens     = tonumber(bucket[1]) or capacity
local lastRefill = tonumber(bucket[2]) or now
local elapsed      = (now - lastRefill) / 1000.0
local tokensToAdd = elapsed * refillRate
tokens = math.min(capacity, tokens + tokensToAdd)
local allowed = 0
local newTokens = tokens
if tokens >= toConsume then
  allowed   = 1
  newTokens = tokens - toConsume
end
-- lastRefill always advances: the token count above already includes the elapsed refill
redis.call('HSET', key, 'tokens', newTokens, 'lastRefill', now)
redis.call('EXPIRE', key, 120)
local retryAfter = 0
if allowed == 0 then
  retryAfter = math.ceil((toConsume - newTokens) / refillRate * 1000)
end
return { allowed, math.floor(newTokens), retryAfter }
`;

export class TokenBucketRateLimiter {
  private redis: Redis;
  private configs: Map<string, RateLimitConfig>;

  constructor(redisUrl: string) {
    this.redis   = new Redis(redisUrl);
    this.configs = new Map();
    this.redis.defineCommand('tokenBucket', {
      numberOfKeys: 1,
      lua: LUA_TOKEN_BUCKET,
    });
  }

  registerEndpoint(key: string, config: RateLimitConfig) {
    this.configs.set(key, config);
  }

  async check(
    identifier: string,       // e.g., apiKey, playerId, or IP
    endpointKey: string,
    tokensToConsume = 1
  ): Promise<RateLimitResult> {
    const config = this.configs.get(endpointKey);
    if (!config) throw new Error(`Unknown endpoint: ${endpointKey}`);

    const redisKey = `ratelimit:${endpointKey}:${identifier}`;
    const now = Date.now();

    // defineCommand registers tokenBucket at runtime; the cast tells TypeScript about it
    const result = await (this.redis as Redis & {
      tokenBucket(key: string, ...args: (string | number)[]): Promise<[number, number, number]>;
    }).tokenBucket(
      redisKey,
      config.capacity,
      config.refillRate,
      now,
      tokensToConsume
    );

    const [allowedNum, remaining, retryAfterMs] = result;
    const resetAt = now + Math.ceil((config.capacity - remaining) / config.refillRate * 1000);

    return {
      allowed:     allowedNum === 1,
      remaining,
      retryAfterMs,
      limit:       config.capacity,
      resetAt:     Math.floor(resetAt / 1000),
    };
  }

  async close() { await this.redis.quit(); }
}

Sliding Window Log Alternative

The sliding window log records every request individually: each timestamp goes into a Redis sorted set, expired entries are trimmed on every check, and the set's cardinality is the exact request count for the rolling window. This enforces the limit precisely and eliminates the boundary burst problem entirely. The cost is one sorted-set entry per request rather than two numbers per bucket, so memory scales with traffic instead of with client count.

Lua — Sliding window log
-- Sliding window log using Redis sorted sets

-- KEYS[1] = window key, ARGV[1] = max requests, ARGV[2] = window size (ms)
-- ARGV[3] = current timestamp (ms), ARGV[4] = request ID

local key     = KEYS[1]
local limit  = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now    = tonumber(ARGV[3])
local reqId  = ARGV[4]

-- Remove expired entries outside the window
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)

-- Count requests in current window
local count = redis.call('ZCARD', key)

if count < limit then
  -- Add this request with timestamp as score
  redis.call('ZADD', key, now, reqId)
  redis.call('EXPIRE', key, math.ceil(window / 1000) + 1)
  return { 1, limit - count - 1, 0 } -- allowed, remaining, retryAfter (seconds)
else
  -- Get oldest entry to calculate retry-after
  local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
  local retryAfter = 0
  if oldest[2] then
    retryAfter = math.ceil((tonumber(oldest[2]) + window - now) / 1000)
  end
  return { 0, 0, retryAfter } -- denied, remaining, retryAfter (seconds)
end

The sliding window log trades memory (one sorted-set entry per request rather than two numbers per bucket) for absolute accuracy. For the LudoKing API's leaderboard and analytics endpoints that need precise per-second fairness, the sliding window log is the better choice. For game-critical endpoints like /games/:id/move where token bucket's burst accommodation matters more, the token bucket remains preferable. Many production systems use both algorithms simultaneously: token bucket for endpoints that benefit from bursting and sliding window for endpoints requiring strict fairness.
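A cheaper variant, the sliding window counter approximation, keeps one counter per fixed window and weights the previous window's count by its remaining overlap with the rolling window; it needs only two numbers per client, at the cost of an approximate count. A minimal in-memory sketch (illustrative, not LudoKing code):

TypeScript — sliding window counter approximation (illustrative sketch)

```typescript
// Sliding window counter approximation: two counters per client, with the
// previous window weighted by how much of it still overlaps the rolling window.
class SlidingWindowCounter {
  private windows = new Map<string, { windowStart: number; prev: number; curr: number }>();

  constructor(private limit: number, private windowMs = 60_000) {}

  allow(clientId: string, now = Date.now()): boolean {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    let s = this.windows.get(clientId);
    if (!s || s.windowStart !== windowStart) {
      // Roll over: the old current count becomes the previous-window count
      const prev = s && s.windowStart === windowStart - this.windowMs ? s.curr : 0;
      s = { windowStart, prev, curr: 0 };
      this.windows.set(clientId, s);
    }
    // Fraction of the previous window still inside the rolling window
    const prevWeight = 1 - (now - windowStart) / this.windowMs;
    const estimated = s.prev * prevWeight + s.curr;
    if (estimated >= this.limit) return false;
    s.curr++;
    return true;
  }
}
```

Halfway through a window, half of the previous window's count still counts against the client, which is what suppresses the boundary burst.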

Express Rate Limit Middleware

The Express middleware integrates the rate limiter into the request lifecycle. It runs before route handlers, extracts the client identifier (API key, player ID, or IP address), checks the rate limit via Redis, and either allows the request to proceed or rejects it with a 429 response and appropriate headers. The middleware is designed as a factory function that accepts per-endpoint configurations, making it reusable across the entire LudoKing API surface.

TypeScript — Express rate limit middleware
import { Request, Response, NextFunction, RequestHandler } from 'express';
import { TokenBucketRateLimiter } from './TokenBucketRateLimiter';

interface MiddlewareConfig {
  limiter: TokenBucketRateLimiter;
  endpointKey: string;
  keyGenerator: (req: Request) => string; // Extract client identifier
  tokensToConsume?: number;
}

export function createRateLimitMiddleware(config: MiddlewareConfig): RequestHandler {
  const {
    limiter,
    endpointKey,
    keyGenerator,
    tokensToConsume = 1,
  } = config;

  return async (req: Request, res: Response, next: NextFunction) => {
    const identifier = keyGenerator(req);

    try {
      const result = await limiter.check(identifier, endpointKey, tokensToConsume);

      // Attach rate limit info to response headers
      res.setHeader('X-RateLimit-Limit',     result.limit);
      res.setHeader('X-RateLimit-Remaining', result.remaining);
      res.setHeader('X-RateLimit-Reset',     result.resetAt);

      if (!result.allowed) {
        res.setHeader('Retry-After', Math.ceil(result.retryAfterMs / 1000));
        res.status(429).json({
          error: 'Too Many Requests',
          code: 'RATE_LIMIT_EXCEEDED',
          message: `Rate limit exceeded for this endpoint. Retry after ${Math.ceil(result.retryAfterMs / 1000)} seconds.`,
          retryAfter: Math.ceil(result.retryAfterMs / 1000),
          limit:     result.limit,
          remaining: 0,
        });
        return;
      }

      next();
    } catch (err) {
      // On Redis failure, fail open (allow request) to avoid blocking all traffic
      console.error('Rate limiter error, failing open:', err);
      next();
    }
  };
}

// Key generators for different identifier strategies
export function byApiKey(req: Request): string {
  const apiKey = req.get('X-API-Key') || req.get('Authorization')?.replace('Bearer ', '');
  if (!apiKey) return req.ip || 'anonymous';
  return hashIdentifier(apiKey); // Hash to avoid leaking key material in Redis keys
}

export function byPlayerId(req: Request): string {
  return req.params.playerId || req.body?.playerId || byApiKey(req);
}

import { createHash } from 'node:crypto'; // module-scope import; hoist to the top of the file

function hashIdentifier(id: string): string {
  // SHA-256 truncated to 16 hex chars — enough uniqueness, no key leakage
  return createHash('sha256').update(id).digest('hex').slice(0, 16);
}

// --- Usage in Express app ---
const limiter = new TokenBucketRateLimiter(process.env.REDIS_URL!);
limiter.registerEndpoint('games-read',  { capacity: 100, refillRate: 1.67, windowSecs: 60 }); // 100/min
limiter.registerEndpoint('games-write', { capacity: 20,  refillRate: 0.33, windowSecs: 60 }); // 20/min
limiter.registerEndpoint('move',        { capacity: 30,  refillRate: 0.5,  windowSecs: 60 }); // 30/min

app.get('/games/:id',
  createRateLimitMiddleware({ limiter, endpointKey: 'games-read',  keyGenerator: byApiKey }),
  getGame
);
app.post('/games',
  createRateLimitMiddleware({ limiter, endpointKey: 'games-write', keyGenerator: byApiKey }),
  createGame
);
app.post('/games/:id/move',
  createRateLimitMiddleware({ limiter, endpointKey: 'move',        keyGenerator: byPlayerId }),
  processMove
);

The middleware fails open on Redis errors — a deliberate design choice. If Redis becomes unreachable, blocking all API traffic would be far more damaging than briefly allowing unlimited requests. The trade-off is acceptable for most Ludo game APIs where the worst-case scenario of a brief unthrottled period is manageable, while a Redis outage causing a complete API outage is not. For high-security deployments, configure the middleware to fail closed instead.
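The fail-closed option can be made a per-route choice. A minimal sketch of a configurable failure mode (the checkWithFailMode name and FailMode type are illustrative, not part of the middleware above):

TypeScript — configurable fail-open / fail-closed behavior (sketch)

```typescript
type FailMode = 'open' | 'closed';

// Sketch: wrap a rate limit check so Redis errors are handled per route.
// `check` stands in for a call like limiter.check(); failMode defaults to open.
async function checkWithFailMode(
  check: () => Promise<{ allowed: boolean }>,
  failMode: FailMode = 'open'
): Promise<{ allowed: boolean; degraded: boolean }> {
  try {
    return { ...(await check()), degraded: false };
  } catch {
    // Redis unreachable: open = let traffic through, closed = reject everything
    return { allowed: failMode === 'open', degraded: true };
  }
}
```

Sensitive routes (payments, account creation) would pass 'closed'; everything else keeps the fail-open default.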

Per-Endpoint Rate Limit Tiers

Not all LudoKing API endpoints carry the same operational cost or fairness sensitivity. A POST /games/:id/move call triggers database writes, broadcasts to connected WebSocket clients, and potentially invalidates leaderboard caches — making it far more expensive than a GET /games/:id that serves a cached response. Configuring tiered limits ensures that computationally expensive endpoints cannot be used to amplify traffic and that shared resources are distributed according to endpoint value.

TypeScript — Per-endpoint rate limit configuration
import { RequestHandler } from 'express';
import { TokenBucketRateLimiter } from '../lib/rateLimiter';

// Define all endpoint limits: [capacity, refillRate/sec]
const ENDPOINT_LIMITS = {
  // Game state endpoints (read-heavy, cached)
  'GET:/games':              [120, 2.0],   // 120/min
  'GET:/games/:id':          [60,  1.0],   // 60/min
  'GET:/leaderboard':        [30,  0.5],   // 30/min

  // Game write endpoints (expensive, fairness-critical)
  'POST:/games':             [10,  0.17],  // 10/min
  'POST:/games/:id/move':    [30,  0.5],   // 30/min (game-critical)
  'POST:/games/:id/join':    [10,  0.17],  // 10/min
  'POST:/games/:id/leave':   [10,  0.17],  // 10/min

  // WebSocket handshake (connection setup only)
  'WS:/connect':             [5,   0.08],  // 5/min

  // Analytics and bulk endpoints (lower priority)
  'GET:/analytics':          [20,  0.33],  // 20/min
  'GET:/player/:id/history': [60,  1.0],   // 60/min
} as const;

export function buildLimiter(redisUrl: string): TokenBucketRateLimiter {
  const limiter = new TokenBucketRateLimiter(redisUrl);

  for (const [endpoint, [capacity, refillRate]] of Object.entries(ENDPOINT_LIMITS)) {
    limiter.registerEndpoint(endpoint, {
      capacity,
      refillRate,
      windowSecs: 60,
    });
  }

  return limiter;
}

// Global rate limit for unknown endpoints
const GLOBAL_LIMIT = { capacity: 200, refillRate: 3.33, windowSecs: 60 };

Retry Wrapper with Exponential Backoff

When a rate-limited request returns 429, the client must retry intelligently. The naive approach — immediate retry or fixed-interval retry — creates synchronized request waves (the thundering herd problem) that overwhelm the server at exactly the moment it is already under load. Exponential backoff with jitter solves this by doubling the wait time after each failure and adding randomness to spread retries across a time window rather than a fixed point.

TypeScript — Retry wrapper with exponential backoff and jitter
interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  backoffFactor: number;
  jitter: boolean;
  retryableStatuses: number[];
  onRetry?: (attempt: number, error: Error, delayMs: number) => void;
}

const DEFAULT_OPTIONS: RetryOptions = {
  maxRetries:        5,
  baseDelayMs:       1000,
  maxDelayMs:        30000,
  backoffFactor:     2,
  jitter:            true,
  retryableStatuses: [429, 500, 502, 503, 504],
};

async function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

/**
 * Full jitter algorithm — optimal for distributed retries.
 * Delay = random(0, min(maxDelay, baseDelay * 2^attempt))
 * Spreads retries uniformly across the full backoff window,
 * minimizing collision probability in concurrent client scenarios.
 */
function fullJitterDelay(attempt: number, baseDelay: number, maxDelay: number, factor: number): number {
  const exponentialDelay = Math.min(maxDelay, baseDelay * Math.pow(factor, attempt));
  return Math.floor(Math.random() * exponentialDelay);
}

/**
 * Decorrelated jitter — better for high-latency backends.
 * Delay = random(baseDelay, previousDelay * 3)
 * Provides stronger decorrelation between retry attempts.
 */
function decorrelatedJitterDelay(_attempt: number, baseDelay: number, prevDelay: number): number {
  // Uniform in [baseDelay, prevDelay * 3), per the formula above
  return Math.floor(baseDelay + Math.random() * Math.max(0, prevDelay * 3 - baseDelay));
}

export async function fetchWithRetry(
  url: string,
  options: RequestInit = {},
  userOptions: Partial<RetryOptions> = {}
): Promise<Response> {
  const opts = { ...DEFAULT_OPTIONS, ...userOptions };
  let lastError: Error;
  let prevDelay = opts.baseDelayMs;

  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    try {
      const response = await fetch(url, options);

      // Handle rate limit specifically
      if (response.status === 429) {
        // Honor the server's Retry-After header if present
        const retryAfterHeader = response.headers.get('Retry-After');
        let delay: number;

        if (retryAfterHeader) {
          // Server tells us exactly when to retry
          delay = parseInt(retryAfterHeader, 10) * 1000;
        } else {
          // Calculate exponential backoff
          delay = opts.jitter
            ? decorrelatedJitterDelay(attempt, opts.baseDelayMs, prevDelay)
            : Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(opts.backoffFactor, attempt));
        }

        prevDelay = delay;

        if (attempt === opts.maxRetries) {
          throw new RateLimitExceededError(
            `Rate limit exceeded after ${opts.maxRetries} retries. Total wait: ${delay}ms`
          );
        }

        const resetHeader = response.headers.get('X-RateLimit-Reset');
        const waitUntil   = resetHeader
          ? Math.max(delay, parseInt(resetHeader, 10) * 1000 - Date.now())
          : delay;

        opts.onRetry?.(attempt, new Error('429 Rate Limit'), waitUntil);
        console.warn(`[RateLimit] Attempt ${attempt + 1}/${opts.maxRetries} — waiting ${Math.round(waitUntil)}ms`);
        await sleep(waitUntil);
        continue;
      }

      // Handle other retryable errors
      if (!opts.retryableStatuses.includes(response.status)) {
        return response; // Non-retryable — return immediately
      }

      const delay = opts.jitter
        ? fullJitterDelay(attempt, opts.baseDelayMs, opts.maxDelayMs, opts.backoffFactor)
        : Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(opts.backoffFactor, attempt));

      if (attempt === opts.maxRetries) {
        throw new HTTPError(`HTTP ${response.status} after ${opts.maxRetries} retries`, response.status);
      }

      opts.onRetry?.(attempt, new Error(`HTTP ${response.status}`), delay);
      await sleep(delay);

    } catch (err) {
      lastError = err as Error;
      if (attempt === opts.maxRetries) break;
      const delay = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(opts.backoffFactor, attempt));
      await sleep(delay);
    }
  }

  throw lastError!;
}

class RateLimitExceededError extends Error {
  constructor(msg: string) { super(msg); this.name = 'RateLimitExceededError'; }
}
class HTTPError extends Error {
  constructor(msg: string, public status: number) { super(msg); this.name = 'HTTPError'; }
}

// --- Usage examples ---

// Simple game move with automatic retry
async function submitMove(gameId: string, move: MovePayload) {
  const response = await fetchWithRetry(
    `${API_BASE}/games/${gameId}/move`,
    {
      method: 'POST',
      headers: { 'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json' },
      body: JSON.stringify(move),
    },
    {
      maxRetries: 4,
      baseDelayMs: 500,
      onRetry: (attempt, err, delay) => {
        console.log(`Retrying move submission — attempt ${attempt + 1}, waiting ${delay}ms: ${err.message}`);
      },
    }
  );
  return response.json();
}

Rate Limit Response Headers

Standardized rate limit response headers allow clients to implement proactive throttling — adjusting their request rate before hitting the limit rather than reacting to 429 responses. The LudoKing API uses the widely adopted X-RateLimit-* convention (the IETF RateLimit header fields draft standardizes similar semantics under unprefixed names), with four primary headers: X-RateLimit-Limit (total budget), X-RateLimit-Remaining (requests left in the current window), X-RateLimit-Reset (Unix timestamp of window reset), and Retry-After (present only on 429 responses).

TypeScript — Rate limit header parser
interface RateLimitHeaders {
  limit: number;
  remaining: number;
  resetAt: number;
  retryAfter: number | null;
}

function parseRateLimitHeaders(response: Response): RateLimitHeaders {
  const toInt = (v: string | null) => (v ? parseInt(v, 10) : 0);
  const retryAfter = response.headers.get('Retry-After');
  return {
    limit:      toInt(response.headers.get('X-RateLimit-Limit')),
    remaining:  toInt(response.headers.get('X-RateLimit-Remaining')),
    resetAt:    toInt(response.headers.get('X-RateLimit-Reset')),
    retryAfter: retryAfter ? parseInt(retryAfter, 10) : null, // null when the header is absent
  };
}

// Adaptive client that tracks quota and paces requests
class AdaptiveRateLimitClient {
  private remaining = Infinity; // unknown until the first response syncs state
  private resetAt = 0;
  private limit = 0;

  async request<T>(url: string, init?: RequestInit): Promise<T> {
    await this.waitForQuota();

    const response = await fetchWithRetry(url, init);
    const headers = parseRateLimitHeaders(response);

    // Sync local quota state with the server's view
    this.remaining = headers.remaining;
    this.resetAt   = headers.resetAt;
    this.limit     = headers.limit;

    return response.json() as Promise<T>;
  }

  private async waitForQuota(): Promise<void> {
    // Wait until the reset window has passed if the quota is fully depleted
    if (this.remaining <= 0 && this.resetAt > Date.now() / 1000) {
      const waitMs = this.resetAt * 1000 - Date.now();
      console.log(`Quota depleted. Waiting ${Math.ceil(waitMs / 1000)}s for reset...`);
      await sleep(waitMs);
    }
    // Optimistically decrement; the next response resyncs the true count
    this.remaining = Math.max(0, this.remaining - 1);
  }
}

Fair Use Policies and Tiered Access

Rate limits alone do not guarantee fair resource distribution — sophisticated clients can create many API key accounts to multiply their effective quota (the multi-tenant abuse problem). Fair use policies layer on top of rate limits to enforce per-organization and per-IP constraints that prevent quota multiplication. The LudoKing API implements a three-dimensional fair use model: per API key (token bucket), per IP address (sliding window counter), and per organization (aggregate limit across all org keys).

Tiered access also enables business model differentiation. Free-tier clients share a pooled global limit on expensive operations like game state writes, while Pro-tier clients receive dedicated quota allocations. Tournament and enterprise tiers get negotiated limits with SLA-backed availability. This tiering is implemented by mapping each API key to a rate limit tier in Redis and selecting the appropriate limit configuration at request time.
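The tier-to-config mapping can be sketched as a small lookup. The TIER_CONFIGS table and configForTier helper below are illustrative names, with per-minute numbers taken from the tier descriptions in this guide; in the full flow the tier string would be read from Redis (e.g. an HGET on a key-hash-to-tier mapping):

TypeScript — tier selection at request time (sketch)

```typescript
interface TierConfig { capacity: number; refillRate: number }

// Assumed tier table; per-minute numbers follow the tier descriptions above.
const TIER_CONFIGS: Record<string, TierConfig> = {
  free:       { capacity: 100,  refillRate: 100 / 60 },
  pro:        { capacity: 500,  refillRate: 500 / 60 },
  tournament: { capacity: 2000, refillRate: 2000 / 60 },
};

// Pure tier selection: unknown or missing tiers fall back to free.
// In production the tier string would come from Redis, e.g.
//   const tier = await redis.hget('ratelimit:tiers', apiKeyHash);
function configForTier(tier: string | null): TierConfig {
  return TIER_CONFIGS[tier ?? 'free'] ?? TIER_CONFIGS.free;
}
```

Falling back to the free tier on an unknown value means a misconfigured key degrades gracefully instead of getting unlimited quota.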

Beyond technical limits, fair use policies define what constitutes acceptable API usage: no scraping of leaderboard data, no automated player accounts that simulate human gameplay without disclosure, no redistribution of cached game state to third parties, and no use of the API to power competing Ludo platforms. Violations result in progressive remediation — warning, temporary suspension, then permanent revocation. The API Documentation has the complete terms of service and fair use guidelines.

Frequently Asked Questions

What is the difference between token bucket and sliding window rate limiting?

Token bucket allows burst traffic up to the bucket capacity, refilling tokens at a steady rate. A client with 100 tokens can send 100 requests instantly, then wait for tokens to replenish. Sliding window counts all requests within a rolling time window (e.g., the last 60 seconds) and enforces an exact limit — no bursting. Token bucket is ideal for APIs where clients have natural idle periods (like Ludo gameplay), while sliding window is better for endpoints requiring strict fairness guarantees. The LudoKing API uses token bucket for game-critical endpoints like /move and sliding window for leaderboard and analytics endpoints.

Why do Redis rate limiters use Lua scripts?

Redis executes Lua scripts atomically — no other command can run between statements in the script. For rate limiting, this atomicity is critical: without it, two concurrent requests from the same client could both read the token count, both see tokens available, and both decrement — allowing one extra request past the limit. A Lua script bundles the read, compute, and write operations into a single atomic step, guaranteeing correctness under any concurrent load. Without Lua scripting, you would need distributed locks (Redis Redlock or similar), which add latency and complexity. For more on Redis patterns, see the Ludo Game Database Schema guide.

Should a rate limiter fail open or fail closed when Redis is down?

For most Ludo game APIs, failing open (allowing requests when Redis is unreachable) is the correct choice because the cost of temporary unthrottled traffic is lower than the cost of a complete API outage. A 30-second window of unlimited requests during a Redis failover is recoverable; blocking all 10,000 concurrent players because your rate limiter depends on Redis is catastrophic. However, for high-security or billing-critical endpoints (like payment or account creation), failing closed is preferable. Implement this as a configurable option in your middleware — the default should be fail open, with explicit fail-closed configuration for sensitive routes.

Why should clients honor the Retry-After header instead of calculating their own backoff?

The Retry-After header communicates the exact moment the server will accept requests again, eliminating the need for the client to guess via exponential backoff. When present, the client should wait exactly the specified duration rather than calculating its own delay. The server's reset timestamp accounts for all concurrent clients hitting the limit simultaneously, whereas a client's independently calculated backoff may overshoot or undershoot the actual refill time. Always prioritize Retry-After over calculated backoff when it is present in a 429 response. The REST API Reference documents which endpoints return Retry-After headers.

What are the rate limits for each LudoKing API tier?

Free tier: 100 REST requests/minute, 10 WebSocket messages/second. Pro tier: 500 REST requests/minute, 30 WebSocket messages/second. Tournament tier: negotiated limits (typically 2,000-10,000 requests/minute) scoped to event duration. Enterprise tier: custom SLA-backed limits with dedicated Redis shards. All tiers share the same per-endpoint ratios — Pro does not get higher limits on individual expensive endpoints like /move; it gets more requests on the aggregate budget. See the Node.js SDK Guide for client-side token bucket implementation tuned to each tier's limits.

How should tournament bots handle rate limits?

Tournament bots should implement a request queue with client-side rate limiting that stays within the token bucket budget proactively — never waiting for 429 responses if avoidable. Maintain a token balance counter that decrements on each request and refills at the configured rate, pausing new requests when the balance hits zero. This approach eliminates retry overhead entirely under normal load. For tournament-scale bots managing multiple game rooms simultaneously, distribute requests across multiple API keys (one per managed room) to multiply available quota within fair use limits. Contact the LudoKing API team for a tournament-tier API key that includes higher aggregate limits for automated tournament operations.
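The proactive pacing described above amounts to a local token bucket that sleeps instead of erroring. A hedged sketch (ClientPacer is an illustrative name, not part of the SDK):

TypeScript — client-side pacing token bucket (sketch)

```typescript
// Client-side token bucket: callers await acquire() before each request,
// so the client stays within its budget and rarely sees a 429.
class ClientPacer {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(private capacity: number, private refillRate: number) {
    this.tokens = capacity; // start full: allows an initial burst
  }

  private refill(now: number) {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }

  /** Resolves when a token is available, sleeping if the bucket is empty. */
  async acquire(): Promise<void> {
    this.refill(Date.now());
    if (this.tokens < 1) {
      const waitMs = ((1 - this.tokens) / this.refillRate) * 1000;
      await new Promise(resolve => setTimeout(resolve, waitMs));
      this.refill(Date.now());
    }
    this.tokens = Math.max(0, this.tokens - 1);
  }
}
```

A bot would call `await pacer.acquire()` before every API request, with capacity and refillRate matched to its tier.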

Need Higher Rate Limits for Your Ludo Platform?

Contact us to discuss Pro or Enterprise tier rate limits, tournament quotas, and fair use policies for your Ludo API integration.