Maya Patel·
Graded the evidence on a 4-day week honestly and landed me at 65% confidence, with a controlled pilot instead of a company-wide switch.
Map the evidence landscape for any claim or decision, rating strength, identifying biases, and calibrating confidence levels rigorously.
Evidence Strength Mapper & Confidence Calibrator
You are a scientific evidence evaluator trained in evidence-based medicine and intelligence analysis. Help me rigorously map and rate the evidence for an important claim or decision.

CLAIM/DECISION TO EVALUATE: {{claim_or_question}}
DOMAIN: {{domain - e.g., 'Business strategy', 'Medical treatment', 'Public policy', 'Technology adoption', 'Financial investment'}}
STAKES: {{what_rides_on_this_decision}}

EVIDENCE YOU HAVE (paste all evidence: studies, articles, expert opinions, data, anecdotes, your own observations):
{{evidence}}

OUTPUT: Evidence Map & Confidence Calibration

## 1. EVIDENCE INVENTORY
Catalog every piece of evidence:

| # | Evidence | Source Type | Date | Source Quality | Relevance | Access to Full Text? |
|---|----------|------------|------|---------------|-----------|---------------------|

Source types: [Peer-reviewed study / Industry report / Expert opinion / Anecdotal / Internal data / News article / Government data / Meta-analysis / Systematic review]
Source quality: [A: Gold standard / B: Credible / C: Moderate / D: Weak / F: Unreliable]

## 2. EVIDENCE HIERARCHY (for {{domain}})
Apply the appropriate evidence pyramid:

For {{domain}}, rank evidence types from strongest to weakest:
1. [Strongest type]: Why strongest in this domain
2. [Next]
3. ...

Map each piece of evidence to its position on this hierarchy.

## 3. INDIVIDUAL EVIDENCE APPRAISAL
For EACH significant piece of evidence:

**Evidence [N]: [Short description]**
- **HIERARCHY LEVEL**: [Position]
- **STRENGTHS**: What makes this evidence credible? (methodology, sample size, source independence, replication)
- **WEAKNESSES**: Limitations, biases, gaps (funding bias, small sample, correlational not causal, outdated)
- **RELEVANCE**: How directly does this address {{claim_or_question}}? [Direct / Indirect / Tangential]
- **DIRECTION**: Does this support, contradict, or complicate the claim? [Support / Contradict / Mixed / Unclear]
- **MAGNITUDE OF EFFECT**: Large / Moderate / Small / Negligible (if quantifiable, include numbers)
- **OVERALL GRADE**: [A: Strong / B: Moderate / C: Weak / D: Very weak]

## 4. EVIDENCE SYNTHESIS MATRIX

| Evidence Type | Count | Avg Quality | Direction Summary | Weighted Conclusion |
|--------------|-------|-------------|-------------------|--------------------|

Across ALL evidence:
- What patterns emerge?
- Where do sources agree? Where do they conflict?
- What are the most common methodological weaknesses?
- Is there publication/reporting bias? (Are negative results underrepresented?)

## 5. CONFIDENCE CALIBRATION
Use a structured approach to rate confidence in the claim:

| Factor | Assessment | Confidence Impact |
|--------|-----------|-------------------|
| Evidence quantity | Sufficient / Insufficient | +/− |
| Evidence quality | High / Mixed / Low | +/− |
| Evidence consistency | Consistent / Mixed / Contradictory | +/− |
| Evidence directness | Direct / Indirect | +/− |
| Potential for bias | Low / Moderate / High | +/− |
| Generalizability to our context | High / Moderate / Low | +/− |
| Expert consensus | Strong / Mixed / None | +/− |
| Mechanistic plausibility | High / Moderate / Low / Unknown | +/− |

**OVERALL CONFIDENCE LEVEL**:
- **Very High**: Multiple high-quality sources, consistent direction, direct relevance, low bias risk
- **High**: Several quality sources, mostly consistent, reasonably direct
- **Moderate**: Mixed quality or quantity, some inconsistencies, indirect evidence
- **Low**: Limited evidence, significant weaknesses, important unknowns
- **Very Low**: Minimal or very weak evidence, high uncertainty

Calibrated confidence statement: 'I am [X]% confident that [specific claim] because [justification with evidence reference]. The main uncertainties are [Y and Z].'

## 6. KNOWNS & UNKNOWNS
- **Known Unknowns**: What don't we know that we wish we knew? (Answerable with more research)
- **Unknown Unknowns**: What might we be completely missing? (Blind spots, paradigm limitations)
- **Information still needed**: Specific data/evidence that would change our confidence level

## 7. DECISION IMPLICATIONS
Given this evidence map:
- **If the evidence strongly supports the claim**: Recommended action + monitoring plan
- **If mixed/uncertain**: Hedge strategies, small experiments, delaying tactics
- **If evidence contradicts**: Warning signals, alternative approaches to consider
- **Confidence-appropriate action**: What level of commitment is warranted by this evidence quality?
Results
# Evidence Map — "Does a 4-Day Work Week Maintain Productivity?"
**Decision:** whether to pilot a 4-day week. **Domain:** business/org strategy. **Stakes:** retention + output across 200 staff.
## 1. Evidence Inventory (extract)
| # | Evidence | Type | Quality | Direction |
|---|----------|------|---------|-----------|
| 1 | 4 Day Week Global trials (61 UK firms) | Field study | B+ | Support |
| 2 | Iceland public-sector trial | Govt data | B | Support |
| 3 | A competitor's anecdotal blog post | Anecdotal | D | Support |
| 4 | Meta-analysis on hours vs. output | Systematic review | A | Mixed |
## 2. Hierarchy (for org strategy)
Strongest → systematic reviews/meta-analyses > multi-firm RCT-like trials > single-org case studies > expert opinion > anecdote.
## 3. Appraisal — Evidence #1
- **Strengths:** large multi-firm sample, pre/post metrics, independent researchers.
- **Weaknesses:** self-selected firms (volunteer bias), no true control, 6-month window.
- **Relevance:** Direct. **Magnitude:** revenue roughly flat, wellbeing up.
## 4. Synthesis
Sources mostly point the same way (output holds, wellbeing improves), but the best evidence (the meta-analysis) cautions that effects are context-dependent and the trials lack controls.
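The synthesis above can be made a little more explicit with a quality-weighted tally of direction. The sketch below is only illustrative: the numeric weights per letter grade and the signed scores per direction are assumptions chosen for this example, not part of the prompt or of any standard grading scheme, and the result is a rough summary of direction, not a probability.

```python
from dataclasses import dataclass

# Hypothetical numeric weights for the letter grades used in the inventory.
GRADE_WEIGHT = {"A": 4.0, "B+": 3.3, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
# Hypothetical signed scores for each direction label.
DIRECTION_SCORE = {"Support": 1.0, "Mixed": 0.0, "Contradict": -1.0}


@dataclass
class Evidence:
    label: str
    grade: str
    direction: str


def weighted_direction(items):
    """Return the quality-weighted mean direction, in the range [-1, 1]."""
    total_weight = sum(GRADE_WEIGHT[e.grade] for e in items)
    if total_weight == 0:
        return 0.0
    signed = sum(GRADE_WEIGHT[e.grade] * DIRECTION_SCORE[e.direction] for e in items)
    return signed / total_weight


# The four rows from the inventory extract.
inventory = [
    Evidence("4 Day Week Global trials", "B+", "Support"),
    Evidence("Iceland public-sector trial", "B", "Support"),
    Evidence("Competitor blog post", "D", "Support"),
    Evidence("Meta-analysis on hours vs. output", "A", "Mixed"),
]

score = weighted_direction(inventory)
unweighted = sum(DIRECTION_SCORE[e.direction] for e in inventory) / len(inventory)
print(f"weighted: {score:.2f}, unweighted: {unweighted:.2f}")
```

Under these assumed weights, the A-graded but mixed meta-analysis pulls the weighted score below the simple count of supporting sources, which mirrors the caution in the synthesis.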
## 5. Confidence Calibration
| Factor | Assessment | Impact |
|--------|-----------|--------|
| Quantity | Sufficient | + |
| Quality | Mixed (no controls) | − |
| Consistency | Consistent | + |
| Bias risk | Moderate (self-selection) | − |
**Calibrated statement:** "I'm ~65% confident a 4-day week would hold our productivity: the trial evidence consistently shows output holding, but the volunteer-firm trials lack controls, which caps my confidence. Key unknown: whether our client-facing teams can compress without coverage gaps."
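One way to check later whether a stated confidence like this was well calibrated is to score it against what actually happened. The sketch below uses the Brier score (mean squared error between stated probability and a 0/1 outcome); applying it here is a suggestion added for illustration, not something the prompt asks for, and the two outcomes shown are hypothetical.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and 0/1 outcomes.

    `forecasts` is a list of (probability, outcome) pairs, where outcome
    is 1 if the claim turned out true and 0 otherwise. Lower is better.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


# Hypothetical outcomes for the single 65% forecast above.
if_productivity_held = brier_score([(0.65, 1)])
if_productivity_dropped = brier_score([(0.65, 0)])
print(f"{if_productivity_held:.4f} {if_productivity_dropped:.4f}")
```

A single forecast says little on its own; the score becomes informative once several calibrated statements have been logged and resolved.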
## 6. Decision Implication
Evidence is suggestive, not conclusive → run a **3-month controlled pilot** in two teams with a matched comparison, rather than a company-wide switch.
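For the matched comparison, one common way to read the pilot results is a difference-in-differences estimate: the change in the pilot team minus the change in the comparison team over the same period. The sketch below assumes a single numeric weekly output metric per team; the function name and all numbers are made up for illustration and are not results from any trial.

```python
from statistics import mean


def diff_in_diff(pilot_before, pilot_after, control_before, control_after):
    """Change in the pilot group minus change in the comparison group."""
    pilot_change = mean(pilot_after) - mean(pilot_before)
    control_change = mean(control_after) - mean(control_before)
    return pilot_change - control_change


# Made-up weekly output figures (arbitrary units) for illustration only.
effect = diff_in_diff(
    pilot_before=[100, 102, 98],
    pilot_after=[101, 99, 100],
    control_before=[100, 101, 99],
    control_after=[103, 102, 101],
)
print(effect)
```

An estimate near zero would suggest output held relative to the comparison team; with only two teams, it should be treated as a rough signal, not a statistically robust result.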
Model: Claude Sonnet 4
81 Likes · 36 Saves · Score: 50
6 Comments
Ethan Reed·
I'd add a weekly check-in, but otherwise I'd run this as-is.
Felix Bauer·
The 'never miss twice' rule quietly changed my month.
Grace Williams·
The SOP it generated is cleaner than the one I wrote by hand.
Sofia Almeida·
Put this on a sticky note above my desk. That good.
Chloe Adams·
Best research and analysis template I've found on here.
Anna Hofmann·
Adopted the two-minute capture habit from this and never looked back.