Tobias Keller
A publish-ready RFC for moving sessions to Redis with an alternatives table and rollback plan
Generates a complete, publication-ready Request for Comments (RFC) document following industry standards with all required sections for architecture review.
RFC Document Generator
You are a Principal Engineer at a top tech company who has written 100+ RFCs that were adopted across engineering organizations. Write a complete RFC document.
**RFC Title**: {{rfc_title}}
**Problem Statement**: {{problem_statement}} (what problem are we solving and why now?)
**Proposed Solution Overview**: {{proposed_solution_overview}} (high-level description of the proposed approach)
**Stakeholders**: {{stakeholders}} (teams affected, decision makers, reviewers)
Generate a complete RFC with the following structure:
1. **Metadata Header** - RFC Number, Status (Draft/Proposed/Accepted/Deprecated), Authors, Date, Reviewers
2. **Summary** - One-paragraph executive summary that any engineer can read in 30 seconds
3. **Motivation** - Detailed problem statement with concrete examples, impact quantification (dollars/hours/incidents), why status quo is insufficient
4. **Goals** - 3-5 measurable success criteria (e.g., 'reduce P99 latency by 40%', 'eliminate 2 on-call pages per week')
5. **Non-Goals** - Explicitly out of scope items to prevent scope creep (at least 3 non-goals)
6. **Proposed Design** - Detailed technical design with diagrams (ASCII/Mermaid), component interactions, data flows, interface definitions
7. **Alternatives Considered** - At least 2-3 alternatives with structured comparison table (effort, impact, risk, maintainability) and why rejected
8. **Impact Analysis** - On: system performance, cost, operational burden, developer experience, security posture, compliance
9. **Migration Plan** - Step-by-step rollout strategy with rollback plan, feature flags, canary deployment, monitoring during migration
10. **Risks & Mitigations** - Risk register with: risk description, likelihood (1-5), impact (1-5), risk score, mitigation strategy
11. **Testing Strategy** - Unit tests, integration tests, load tests, chaos tests required to validate this change
12. **Open Questions** - Explicit questions that need answers before implementation begins, with assigned owners
13. **Timeline** - Phase breakdown with estimated effort (person-weeks) and milestones
14. **Appendix** - References, prior art, related RFCs, glossary of terms
Write in professional technical English. Use precise, unambiguous language. Include Mermaid diagrams where appropriate. The RFC should be ready to publish to the engineering organization without editing.
Results
# RFC-074: Move session storage from in-memory to Redis
| | |
|---|---|
| **Status** | Proposed |
| **Author** | Platform team |
| **Reviewers** | SRE, Security |
## Summary
Sessions currently live in each Node process's memory, which blocks horizontal scaling and drops every user's session on deploy. This RFC moves session state to Redis so any instance can serve any request.
## Motivation
We can run only one API instance today: a second instance would split sessions across processes and log users out at random. Each deploy invalidates ~12k active sessions and causes a measurable spike in support tickets. This blocks the cluster-mode work that capacity planning depends on.
## Goals
- Sessions survive deploys and instance restarts.
- Run N stateless API instances behind a plain L4 load balancer.
- Session reads add less than 5 ms of latency at p95.
## Non-Goals
- Changing the auth/login flow or token format.
- Migrating away from cookie-based sessions.
- Building multi-region session replication (separate RFC).
## Proposed design
```
client ─▶ LB (no sticky) ─▶ any API instance ─▶ Redis (session store)
```
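The `express-session` middleware swaps its default in-memory store for `connect-redis`. Each session key carries a 24h TTL that is refreshed on activity. Redis runs as a managed instance with `maxmemory-policy noeviction`, so sessions are never silently evicted; if memory fills up, Redis rejects new writes with an error instead of dropping existing keys.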
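The sliding 24h TTL can be illustrated with a minimal sketch. It uses an in-memory `Map` as a stand-in for Redis so it runs on its own; in the real service `connect-redis` would handle expiry. `SlidingTtlStore` and its method names are hypothetical, not part of any library.

```typescript
// Illustrative only: models "24h TTL refreshed on activity" with an in-memory
// Map standing in for Redis. All names here are hypothetical.
const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // 24h, per the design above

interface SessionRecord {
  data: Record<string, unknown>;
  expiresAt: number; // epoch milliseconds
}

class SlidingTtlStore {
  private readonly records = new Map<string, SessionRecord>();
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(sid: string, data: Record<string, unknown>): void {
    this.records.set(sid, { data, expiresAt: this.now() + SESSION_TTL_MS });
  }

  // Returns the session and pushes its expiry out by another 24h,
  // or returns undefined if the session is missing or already expired.
  get(sid: string): Record<string, unknown> | undefined {
    const record = this.records.get(sid);
    if (record === undefined) return undefined;
    if (record.expiresAt <= this.now()) {
      this.records.delete(sid);
      return undefined;
    }
    record.expiresAt = this.now() + SESSION_TTL_MS;
    return record.data;
  }
}
```

A session that is read at least once every 24 hours stays alive indefinitely; one that is idle for a full 24 hours disappears on its next lookup.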
## Alternatives considered
| Option | Effort | Risk | Verdict |
|--------|--------|------|---------|
| Sticky sessions at LB | Low | Uneven load, still lost on deploy | Rejected |
| JWT-only (stateless) | High | Can't revoke server-side | Rejected |
| **Redis store** | Low | New dependency | **Chosen** |
## Risks & mitigations
- **Redis outage logs everyone out** (likelihood 2, impact 4, risk score 8) → managed HA + graceful read-through fallback.
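The RFC does not spell out what the read-through fallback looks like. One possible reading, sketched under that assumption: each instance keeps the last copy of every session it read from Redis and serves that copy when Redis is unreachable. `SessionReader` and `FallbackSessionReader` are hypothetical names.

```typescript
// Illustrative only: one possible interpretation of "graceful read-through
// fallback". The interface and class names are hypothetical.
type Session = Record<string, unknown>;

interface SessionReader {
  get(sid: string): Promise<Session | undefined>;
}

class FallbackSessionReader implements SessionReader {
  private readonly primary: SessionReader;
  private readonly localCache = new Map<string, Session>();

  constructor(primary: SessionReader) {
    this.primary = primary;
  }

  async get(sid: string): Promise<Session | undefined> {
    try {
      const session = await this.primary.get(sid);
      // Keep the local copy in sync with what the primary returned.
      if (session === undefined) this.localCache.delete(sid);
      else this.localCache.set(sid, session);
      return session;
    } catch {
      // Primary unreachable: serve the last copy this instance saw, if any.
      return this.localCache.get(sid);
    }
  }
}
```

The cache in this sketch is unbounded and never expires, so a production version would need a size cap and a short TTL. Note also that a session revoked during an outage could still be served from the local copy.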
## Rollout
Dual-write to memory + Redis behind a flag, verify parity, then flip reads to Redis and remove the memory path. Rollback is a flag toggle.
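The rollout sequence can be sketched as a store wrapper with two flags. This is a minimal illustration, not the planned implementation: `SessionBackend`, `RolloutFlags`, and `DualWriteStore` are hypothetical names, and the parity check is a crude JSON comparison.

```typescript
// Illustrative only: dual-write with a flag-controlled read path.
// All type and class names here are hypothetical.
type SessionData = Record<string, unknown>;

interface SessionBackend {
  get(sid: string): Promise<SessionData | undefined>;
  set(sid: string, data: SessionData): Promise<void>;
}

interface RolloutFlags {
  dualWrite: boolean; // write to Redis as well as memory
  readFromRedis: boolean; // serve reads from Redis; toggling back is the rollback
}

class DualWriteStore implements SessionBackend {
  private readonly memory: SessionBackend;
  private readonly redis: SessionBackend;
  private readonly flags: RolloutFlags;
  mismatches = 0; // parity counter to watch before flipping reads

  constructor(memory: SessionBackend, redis: SessionBackend, flags: RolloutFlags) {
    this.memory = memory;
    this.redis = redis;
    this.flags = flags;
  }

  async set(sid: string, data: SessionData): Promise<void> {
    // Memory keeps receiving writes so that rollback never loses sessions.
    await this.memory.set(sid, data);
    if (this.flags.dualWrite || this.flags.readFromRedis) {
      await this.redis.set(sid, data);
    }
  }

  async get(sid: string): Promise<SessionData | undefined> {
    if (this.flags.readFromRedis) return this.redis.get(sid);
    const fromMemory = await this.memory.get(sid);
    if (this.flags.dualWrite) {
      // Shadow read: compare both stores and count disagreements.
      const fromRedis = await this.redis.get(sid);
      if (JSON.stringify(fromRedis) !== JSON.stringify(fromMemory)) {
        this.mismatches += 1;
      }
    }
    return fromMemory;
  }
}
```

In this sketch, `mismatches` staying at zero under production traffic is the parity signal for setting `readFromRedis`. Once reads have run on Redis without incident, the memory backend and the wrapper itself can be deleted.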
Model: Claude Opus 4