Luca Brunner

Sketched a CRDT-based collab editor and explained exactly why Yjs beats OT for scaling to a million sockets

Generates a complete system architecture blueprint including component diagrams, data flow, integration points, and technology recommendations tailored to your requirements.

Design System Architecture from Scratch

Act as a Principal Software Architect with 20 years of experience designing large-scale distributed systems. I need you to design a complete system architecture for the following:

**System Type**: {{system_type}}
**Scale Requirements**: {{scale_requirements}} (e.g., 10M daily active users, 100K RPM)
**Tech Constraints**: {{tech_constraints}} (e.g., must use AWS, compliance requirements, existing tech stack)
**Budget**: {{budget}} (monthly infrastructure budget)

Provide the following in your response:

1. **High-Level Architecture Diagram** - Describe the overall system layout using ASCII/text art or Mermaid diagram syntax
2. **Component Breakdown** - List each service/component with its responsibility and chosen technology
3. **Data Flow** - Trace the journey of a typical request from client to database and back
4. **Technology Justification** - Explain why each technology was chosen, with 2-3 pros and cons and the alternatives considered
5. **Scalability Strategy** - How each component scales independently under load
6. **Integration Points** - External APIs, third-party services, and internal service communication patterns
7. **Failure Scenarios** - Top 5 failure modes and how the architecture handles them
8. **MVP vs Final Architecture** - What changes between initial launch and full scale
9. **Cost Breakdown** - Estimated monthly infrastructure cost per component at target scale
10. **Risk Assessment** - Technical risks and recommended mitigations

Be specific with version numbers, instance types, and concrete configurations. Do not use vague recommendations.

Results

## Architecture: Real-time collaborative document editor (target: 1M concurrent users)

**Constraints assumed:** AWS, ~10k RPM steady writes, sub-100ms edit propagation, ~$8k/mo infra budget.

```
                   ┌──────────┐
Browsers ──WSS──▶  │   ALB    │
                   └────┬─────┘
            ┌───────────┴───────────┐
      ┌─────▼─────┐           ┌─────▼─────┐
      │  Edge WS  │  ......   │  Edge WS  │  (Node + y-websocket, autoscaled)
      │  gateway  │           │  gateway  │
      └─────┬─────┘           └─────┬─────┘
            └──── Redis Pub/Sub ────┘  (fan-out across gateways)
                        │
                 ┌──────▼───────┐
                 │ CRDT service │ ──▶ Postgres (doc snapshots)
                 └──────────────┘ ──▶ S3 (version history)
```

**Component breakdown:**

| Component | Tech | Responsibility |
|-----------|------|----------------|
| Edge WS gateway | Node 20 + `y-websocket` | holds socket connections, relays CRDT updates |
| Sync fabric | Redis 7 Pub/Sub | broadcasts edits between gateways |
| CRDT service | Rust + `yrs` (Yjs port) | merges concurrent edits conflict-free |
| Snapshot store | Postgres 16 | latest doc state + metadata |

**Why CRDT over OT:** Yjs/`yrs` gives conflict-free merges without a central sequencer, so any gateway can accept a write; that's what makes horizontal scaling to 1M sockets feasible. OT would need a single ordering authority per doc (a bottleneck). Trade-off: CRDTs carry more metadata, mitigated by periodic snapshot compaction to S3.

**Scaling:** gateways are stateless (sockets only) and scale on connection count; Redis Pub/Sub fans out edits; the CRDT service shards by `docId`.

**Top failure mode:** a Redis partition splits gateways. Each gateway handles this by persisting to Postgres independently and reconciling via CRDT merge on reconnect, so no edits are lost.

**MVP cut:** single gateway + Postgres `LISTEN/NOTIFY` instead of Redis until ~10k concurrent, then introduce the Pub/Sub fabric.
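
The "no central sequencer" argument and the partition-recovery claim both rest on one property: merging replicas gives the same result regardless of order or duplication. Below is a minimal sketch of that property in Python, using a toy grow-only set of uniquely identified edits. It is not Yjs's or `yrs`'s actual algorithm or API; the names `Edit`, `Replica`, `gw-a`, and `gw-b` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edit:
    replica: str  # hypothetical id of the gateway that accepted the edit
    counter: int  # per-replica counter, makes (replica, counter) unique
    text: str     # payload of the edit


@dataclass
class Replica:
    """Toy state-based CRDT: a grow-only set of uniquely identified edits."""
    name: str
    edits: set = field(default_factory=set)
    counter: int = 0

    def write(self, text: str) -> None:
        # Accepted locally; no central authority assigns an order.
        self.counter += 1
        self.edits.add(Edit(self.name, self.counter, text))

    def merge(self, other: "Replica") -> None:
        # Set union is commutative, associative, and idempotent, so merges
        # may arrive in any order and any number of times.
        self.edits |= other.edits

    def render(self) -> list[str]:
        # Deterministic order derived from the edit ids alone.
        ordered = sorted(self.edits, key=lambda e: (e.counter, e.replica))
        return [e.text for e in ordered]


# Two gateways accept writes while partitioned from each other.
a, b = Replica("gw-a"), Replica("gw-b")
a.write("hello")
b.write("world")
b.write("!")

# On reconnect they exchange state; a duplicate delivery is harmless.
a.merge(b)
b.merge(a)
b.merge(a)

assert a.render() == b.render()
```

A real text CRDT also has to order concurrent inserts at the same position and handle deletions, which is where the extra metadata mentioned in the trade-off comes from.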

Model: Claude Opus 4

21 Likes · 7 Saves · Score: 18

1 Comment

Ryan Mitchell

I'd ship this as-is. Maybe rename one var, that's it.