raikusy.dev
10 Stars
/**
 * The Millisecond Check: How Big Tech Verifies Your Username in a Flash
 * @author raikusy
 * @date September 28, 2025
 * @tag redis
 * @tag typescript
 * @tag software-engineering
 * @tag system-design
 */
 

Have you ever experienced that split-second frustration when your carefully crafted username is rejected with “already taken”? Behind this seemingly simple check lies one of the most elegant engineering challenges in modern computing. At the scale of companies like Google, Meta, and Amazon—with billions of users and countless username checks per second—a naive database lookup would bring their systems to a crawl. Instead, they’ve architected sophisticated, multi-layered systems that combine probabilistic data structures, intelligent caching, and distributed databases to deliver answers in milliseconds.

The Performance Challenge at Scale

When dealing with billions of usernames, a simple SELECT ... WHERE username = ? query becomes a critical bottleneck. Each lookup in an unsorted database requires scanning through potentially millions of records, resulting in O(n) time complexity. Even with optimized indexing, the sheer volume of concurrent requests creates hot spots, tail latency, and wasted I/O. Modern systems solve this by creating a hierarchical defense system where each layer is optimized for speed and efficiency: fast in-memory filters, cache lookups, and then a source-of-truth database with strong indexes and partitioning, all behind global routing.

Layer 1: Probabilistic Filtering with Bloom Filters

The Mathematics Behind the Magic

Bloom filters serve as the first line of defense—a probabilistic data structure that can definitively say “no” but only probabilistically say “maybe”. This asymmetric property makes them perfect for username checking: if a Bloom filter says a username doesn’t exist, you can immediately return “available” without any further checks. They achieve this using a compact bitset and k hash functions, avoiding many database hits at very low memory cost with no false negatives.

Production-Ready Implementation

export class BloomFilter {
  private bitArray: Uint8Array;
  private size: number;
  private hashFunctions: number;

  constructor(expectedElements: number, falsePositiveRate: number = 0.01) {
    // Calculate optimal size: m = -(n * ln(p)) / (ln(2)^2)
    this.size = Math.ceil(-(expectedElements * Math.log(falsePositiveRate)) / (Math.log(2) ** 2));

    // Calculate optimal hash functions: k = (m/n) * ln(2)
    this.hashFunctions = Math.ceil((this.size / expectedElements) * Math.log(2));

    // Initialize bit array
    this.bitArray = new Uint8Array(Math.ceil(this.size / 8));
  }

  private murmurHash3(key: string, seed: number): number {
    let hash = seed;
    for (let i = 0; i < key.length; i++) {
      let char = key.charCodeAt(i);
      hash = Math.imul(hash ^ char, 0xcc9e2d51);
      hash = (hash << 15) | (hash >>> 17);
      hash = Math.imul(hash, 0x1b873593);
    }
    hash = hash ^ hash >>> 16;
    hash = Math.imul(hash, 0x85ebca6b);
    hash = hash ^ hash >>> 13;
    hash = Math.imul(hash, 0xc2b2ae35);
    hash = hash ^ hash >>> 16;
    return hash >>> 0;
  }

  private setBit(index: number): void {
    const byteIndex = Math.floor(index / 8);
    const bitIndex = index % 8;
    this.bitArray[byteIndex] |= (1 << bitIndex);
  }

  private getBit(index: number): boolean {
    const byteIndex = Math.floor(index / 8);
    const bitIndex = index % 8;
    return (this.bitArray[byteIndex] & (1 << bitIndex)) !== 0;
  }

  add(username: string): void {
    const normalized = username.toLowerCase();
    for (let i = 0; i < this.hashFunctions; i++) {
      const hash = this.murmurHash3(normalized, i);
      const index = hash % this.size;
      this.setBit(index);
    }
  }

  mightContain(username: string): boolean {
    const normalized = username.toLowerCase();
    for (let i = 0; i < this.hashFunctions; i++) {
      const hash = this.murmurHash3(normalized, i);
      const index = hash % this.size;
      if (!this.getBit(index)) {
        return false; // Definitely not present
      }
    }
    return true; // Might be present
  }
}

Layer 2: High-Speed Caching with Redis

When the Bloom filter indicates a username might exist, the system turns to Redis for high-speed cached lookups. Redis hashes store many username→userId mappings in RAM with constant-time access, acting as a hot cache layer before any database call.

import Redis from "ioredis";

export class UsernameCache {
  private redis: Redis;
  private readonly CACHE_PREFIX = "username:";
  private readonly HASH_KEY = "usernames";
  private readonly CACHE_TTL = 3600; // 1 hour

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl);
  }

  async getUserId(username: string): Promise<string | null> {
    const normalized = username.toLowerCase();
    const value = await this.redis.hget(this.HASH_KEY, normalized);
    return value ?? null;
  }

  async setUserId(username: string, userId: string): Promise<void> {
    const normalized = username.toLowerCase();
    await this.redis.hset(this.HASH_KEY, normalized, userId);
    // Optional: Set TTL for the entire hash if needed
    await this.redis.expire(this.HASH_KEY, this.CACHE_TTL);
  }

  async exists(username: string): Promise<boolean> {
    const normalized = username.toLowerCase();
    const result = await this.redis.hexists(this.HASH_KEY, normalized);
    return result === 1;
  }

  async batchCacheUsernames(usernames: Map<string, string>): Promise<void> {
    const pipeline = this.redis.pipeline();

    for (const [username, userId] of usernames.entries()) {
      pipeline.hset(this.HASH_KEY, username.toLowerCase(), userId);
    }

    await pipeline.exec();
  }
}

Layer 3: Intelligent Search with Tries

Tries (prefix trees) excel at not just username verification but also powering suggestion systems. They provide O(L) lookup time where L is the length of the username, regardless of the total number of stored usernames.

interface TrieNode {
  children: Map<string, TrieNode>;
  isEndOfWord: boolean;
  userId?: string;
  popularity?: number;
}

export class UsernameTrie {
  private root: TrieNode;

  constructor() {
    this.root = {
      children: new Map(),
      isEndOfWord: false
    };
  }

  insert(username: string, userId: string, popularity: number = 0): void {
    const normalized = username.toLowerCase();
    let current = this.root;

    for (const char of normalized) {
      if (!current.children.has(char)) {
        current.children.set(char, {
          children: new Map(),
          isEndOfWord: false
        });
      }
      current = current.children.get(char)!;
    }

    current.isEndOfWord = true;
    current.userId = userId;
    current.popularity = popularity;
  }

  search(username: string): { exists: boolean; userId?: string } {
    const normalized = username.toLowerCase();
    let current = this.root;

    for (const char of normalized) {
      if (!current.children.has(char)) {
        return { exists: false };
      }
      current = current.children.get(char)!;
    }

    return {
      exists: current.isEndOfWord,
      userId: current.userId
    };
  }

  getSuggestions(prefix: string, limit: number = 5): string[] {
    const normalized = prefix.toLowerCase();
    let current = this.root;

    // Navigate to prefix node
    for (const char of normalized) {
      if (!current.children.has(char)) {
        return [];
      }
      current = current.children.get(char)!;
    }

    // Collect all words with this prefix
    const suggestions: Array<{ word: string; popularity: number }> = [];
    this.collectWords(current, normalized, suggestions);

    // Sort by popularity and return top suggestions
    return suggestions
      .sort((a, b) => (b.popularity || 0) - (a.popularity || 0))
      .slice(0, limit)
      .map(item => item.word);
  }

  private collectWords(
    node: TrieNode,
    currentWord: string,
    results: Array<{ word: string; popularity: number }>
  ): void {
    if (node.isEndOfWord) {
      results.push({
        word: currentWord,
        popularity: node.popularity || 0
      });
    }

    for (const [char, childNode] of node.children) {
      this.collectWords(childNode, currentWord + char, results);
    }
  }
}

Layer 4: Distributed Database Architecture

When all previous layers require database verification, the system relies on distributed databases optimized for massive scale. These systems use B+ tree indexes for relational stores or partitioned NoSQL systems like DynamoDB for horizontal scalability.

B+ Tree Database Indexing

-- Optimized database schema for username lookups
CREATE TABLE users (
    user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- B+ tree index for lightning-fast username lookups
CREATE UNIQUE INDEX CONCURRENTLY idx_users_username_btree
ON users USING btree (LOWER(username));

-- Partial index for active users only (space optimization)
CREATE INDEX CONCURRENTLY idx_active_users_username
ON users USING btree (LOWER(username))
WHERE deleted_at IS NULL;

DynamoDB Implementation

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { GetCommand, PutCommand, DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";

export class UsernameTable {
  private doc: DynamoDBDocumentClient;
  private tableName: string;

  constructor(region: string, tableName: string) {
    this.doc = DynamoDBDocumentClient.from(new DynamoDBClient({ region }));
    this.tableName = tableName;
  }

  async get(username: string): Promise<any | null> {
    const response = await this.doc.send(new GetCommand({
      TableName: this.tableName,
      Key: { username: username.toLowerCase() }
    }));
    return response.Item ?? null;
  }

  // Reserve username atomically with conditional write
  async reserve(username: string, userId: string): Promise<void> {
    await this.doc.send(new PutCommand({
      TableName: this.tableName,
      Item: {
        username: username.toLowerCase(),
        userId,
        createdAt: Date.now()
      },
      ConditionExpression: "attribute_not_exists(username)"
    }));
  }
}

The Complete System: Orchestrating All Layers

Here’s how all layers work together in a production-ready username verification system:

type CheckResult =
  | { status: "available" }
  | { status: "taken"; suggestions?: string[] };

export class UsernameVerificationSystem {
  constructor(
    private bloomFilter: BloomFilter,
    private cache: UsernameCache,
    private trie: UsernameTrie,
    private database: UsernameTable
  ) {}

  async checkUsername(username: string): Promise<CheckResult & { responseTime: number }> {
    const startTime = performance.now();

    try {
      // Layer 1: Bloom filter (fastest negative check)
      if (!this.bloomFilter.mightContain(username)) {
        return {
          status: "available",
          responseTime: performance.now() - startTime
        };
      }

      // Layer 2: Redis cache lookup
      if (await this.cache.exists(username)) {
        return {
          status: "taken",
          suggestions: this.generateSuggestions(username),
          responseTime: performance.now() - startTime
        };
      }

      // Layer 3: Trie search with suggestions
      const trieResult = this.trie.search(username);
      if (trieResult.exists) {
        // Update cache for future lookups
        await this.cache.setUserId(username, trieResult.userId!);

        return {
          status: "taken",
          suggestions: this.generateSuggestions(username),
          responseTime: performance.now() - startTime
        };
      }

      // Layer 4: Database verification (slowest, most authoritative)
      const dbResult = await this.database.get(username);

      if (dbResult) {
        // Update all layers with found username
        this.bloomFilter.add(username);
        this.trie.insert(username, dbResult.userId, dbResult.popularity || 0);
        await this.cache.setUserId(username, dbResult.userId);

        return {
          status: "taken",
          suggestions: this.generateSuggestions(username),
          responseTime: performance.now() - startTime
        };
      }

      return {
        status: "available",
        responseTime: performance.now() - startTime
      };

    } catch (error) {
      console.error('Username verification error:', error);
      throw new Error('Username verification failed');
    }
  }

  private generateSuggestions(username: string): string[] {
    // Try prefix-based suggestions from trie
    const base = username.replace(/\d+$/, "");
    const trieSuggestions = this.trie.getSuggestions(base, 3);

    // Add numeric suffixes as fallback
    const numericSuggestions: string[] = [];
    for (let i = 1; numericSuggestions.length + trieSuggestions.length < 5 && i <= 100; i++) {
      numericSuggestions.push(`${base}${i}`);
    }

    return [...trieSuggestions, ...numericSuggestions].slice(0, 5);
  }

  // Called when a username is officially created
  async onCreate(username: string, userId: string): Promise<void> {
    await this.database.reserve(username, userId);
    await this.cache.setUserId(username, userId);
    this.bloomFilter.add(username);
    this.trie.insert(username, userId);
  }
}

Performance Comparison

TechniquePurposeSpeedMemory ProfileTrade-off
Bloom FilterFast negative check< 1ms, bit operationsVery small bitsetFalse positives possible, never false negatives
TriePrefix search & suggestionsO(L) by key lengthHigher due to node structureExcellent for autocomplete and ordered results
Redis HashIn-RAM cache1-5ms, constant-timeRAM-backed, watch capacityTTL and cache coherence needed
B+ Tree IndexOrdered DB lookup10-50ms, logarithmicDisk + buffer cacheIndex maintenance on writes
DynamoDBDistributed source of truthSingle-digit msManaged autoscalingDesign keys for access patterns

Real-World Performance Metrics

Based on architectural patterns used by major tech companies:

  • Bloom Filter Response: < 1 millisecond (99.9% of negative results)
  • Redis Cache Hit: 1-5 milliseconds
  • Trie Lookup: 2-10 milliseconds
  • Database Query: 10-50 milliseconds (with proper indexing)
  • Overall 95th Percentile: < 15 milliseconds

The layered approach ensures that 90%+ of requests never reach the database, dramatically reducing load and improving response times.

Deployment and Global Distribution

Infrastructure Strategy

  • DNS Latency-Based Routing: Routes requests to the nearest healthy region
  • Edge Caching: Bloom filters and Redis instances deployed close to users
  • Regional Databases: Multi-region replication with eventual consistency for reads
  • Conditional Writes: Guarantee uniqueness during username registration

Optimization Tips

  1. Size Bloom filters based on expected elements and desired false positive rate
  2. Rebuild periodically from source of truth to maintain accuracy
  3. Use Redis hashes for compact storage with TTL-based eviction
  4. Implement rate limiting to protect against burst traffic
  5. Monitor cache hit rates and adjust TTLs accordingly

Conclusion

The next time you see “username already taken,” remember the sophisticated engineering marvel working behind the scenes. From probabilistic Bloom filters that eliminate millions of unnecessary checks to distributed databases that can query billions of records in milliseconds, username verification represents one of the most elegant solutions to the challenges of internet-scale computing.

This multi-layered architecture demonstrates how modern systems achieve both speed and reliability by combining different data structures and technologies, each optimized for its specific role in the verification pipeline. The result is a seamless user experience that masks the incredible complexity required to operate at the scale of billions of users, making that frustrating “username taken” message appear in just a few milliseconds.