
Risk vs Issue: The RAID Log That Saves Projects

How to track risks before they happen and issues after, with the RAID log format and weekly cadence that keep the log honest.

Phùng Anh Tú · June 3, 2026

Table of contents

When does a risk log actually start paying back?
What is the cost of not having a risk log?
What does the minimal RAID log look like?
What does the RAID log look like at multi-team scale?
What is the weekly cadence that keeps the log honest?
What failure modes does the RAID log itself introduce?
When is a RAID log overkill?
Where should you go from here?

The single artifact that distinguishes a project that survives from one that does not is the risk log - because the risk that gets surfaced gets mitigated, and the risk that stays in someone's head gets ignored. This chapter shows the RAID log format, the weekly cadence that keeps it honest, and the template you can drop into your repo today.

When does a risk log actually start paying back?

Three signs.

The project is more than 4 weeks long. Below that, you can hold the risks in your head. Above it, memory degrades and risks silently fall off.

The project depends on outside parties. Vendors, other teams, legal review, security review, ops handover. Each is a source of risks you do not control. The log is the only way to see them all at once.

The project has a hard external deadline. A fixed launch date means risks need contingency plans, not best-effort mitigation. The log forces the contingency conversation.

If none of these apply (a 1-week internal tool you control end to end), a quick mental tally is enough. Most non-trivial projects hit at least two of the three.

What is the cost of not having a risk log?

Three failure modes.

Surprised by foreseeable problems. Every postmortem on a failed project includes "we should have seen this coming". The risk log is exactly the artifact that makes you see it coming and write it down where someone can act.

Mitigation done too late. Even when risks are spotted, without a log they get tracked informally and mitigations slip. By the time someone says "we really should fix that vendor problem", the vendor's SDK has already shipped broken.

No paper trail at retrospective. When the project misses its date, leadership asks "what could we have done differently?". With a risk log, you have evidence. Without one, every postmortem is a shouting match about who should have known.

What does the minimal RAID log look like?

For a 5-person team, a single markdown file per project, with one table per RAID category:

# Project RAID Log — {{ Project Name }}

Last updated: 2026-06-03 by {{ PM name }}

## Risks (might happen)

| ID | Description | Prob | Impact | Score | Owner | Mitigation | Status |
|----|-------------|------|--------|-------|-------|------------|--------|
| R1 | Vendor SDK v3 may slip from June to August | likely | major | 12 | TL | Build adapter on top of v2; revisit June 15 | Open |
| R2 | Postgres upgrade window conflicts with launch | possible | severe | 12 | EM | Negotiate window with infra; deadline June 10 | Open |
| R3 | Designer on parental leave week of launch | certain | minor | 5  | PM | Pre-record handover; designer #2 covers | Mitigated |

## Assumptions (taken as true)

| ID | Description | Validated | Owner |
|----|-------------|-----------|-------|
| A1 | Stripe webhook retries are idempotent on our side | not yet | TL |
| A2 | Customer success team handles tier-1 support post-launch | confirmed | PM |

## Issues (happening now)

| ID | Description | Severity | Owner | Next action | Opened |
|----|-------------|----------|-------|-------------|--------|
| I1 | Staging env down since Tuesday; blocking QA | high | EM | Infra ticket #4421 | 2026-06-01 |

## Dependencies (out of our control)

| ID | Description | Due from | Due date | Status |
|----|-------------|----------|----------|--------|
| D1 | Auth service v4 API spec | Platform team | 2026-06-08 | On track |
| D2 | Brand assets for launch page | Marketing | 2026-06-12 | At risk |

The template is intentionally short. The discipline is writing something down, not perfecting the format. Ten minutes once a week beats a polished log no one updates.
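Because the log is plain markdown, a small script can read it back, for example to list dependencies that are at risk before the weekly review. The following is a minimal sketch, not part of the template: `parse_markdown_table` is a hypothetical helper, and it assumes simple pipe-delimited rows with no `|` character inside a cell.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Turn one pipe-delimited markdown table into a list of row dicts."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip headings and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the |----|----| separator row under the header.
        if all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, r)) for r in body]


# The Dependencies table from the template above.
DEPENDENCIES = """
| ID | Description | Due from | Due date | Status |
|----|-------------|----------|----------|--------|
| D1 | Auth service v4 API spec | Platform team | 2026-06-08 | On track |
| D2 | Brand assets for launch page | Marketing | 2026-06-12 | At risk |
"""

rows = parse_markdown_table(DEPENDENCIES)
at_risk = [r["ID"] for r in rows if r["Status"] == "At risk"]
print(at_risk)  # ['D2']
```

Keeping the parser this small is deliberate: if the log needs more than a few lines of code to read, the format has probably grown past "minimal".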

What does the RAID log look like at multi-team scale?

Three changes when the project crosses team boundaries:

flowchart TB
    Team1[Team A RAID] --> Combined[Project-level RAID<br/>top 5 risks rolled up]
    Team2[Team B RAID] --> Combined
    Team3[Team C RAID] --> Combined
    Combined --> Exec[Executive status<br/>top 3 only]
    Combined --> Review[Weekly project review]

Each team owns its own RAID log; the project lead rolls up the top 5 risks across teams into a project-level log. Only the top 3 project-level risks (score >= 15) reach the executive status report - leadership cannot act on 30 small risks. The status report chapter covers the rollup format.
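The rollup can be sketched in a few lines of Python. This is an illustration under assumptions, not a prescribed tool: the `Risk` class, the function names, and the sample data are hypothetical, and "top 5" is read here as the five highest-scoring risks across all teams combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    team: str
    risk_id: str
    score: int  # 1-25, probability position times impact position


def roll_up(team_logs: dict[str, list[Risk]], top_n: int = 5) -> list[Risk]:
    """Project-level log: the top_n highest-scoring risks across all teams."""
    all_risks = [r for risks in team_logs.values() for r in risks]
    return sorted(all_risks, key=lambda r: r.score, reverse=True)[:top_n]


def executive_view(project_risks: list[Risk], limit: int = 3,
                   threshold: int = 15) -> list[Risk]:
    """Executive status: at most `limit` risks, each scoring >= threshold."""
    eligible = [r for r in project_risks if r.score >= threshold]
    return sorted(eligible, key=lambda r: r.score, reverse=True)[:limit]


# Hypothetical sample data for three teams.
team_logs = {
    "A": [Risk("A", "A-R1", 20), Risk("A", "A-R2", 9)],
    "B": [Risk("B", "B-R1", 16), Risk("B", "B-R2", 12), Risk("B", "B-R3", 6)],
    "C": [Risk("C", "C-R1", 15), Risk("C", "C-R2", 4)],
}

project_level = roll_up(team_logs)
print([r.risk_id for r in project_level])
# ['A-R1', 'B-R1', 'C-R1', 'B-R2', 'A-R2']
print([r.risk_id for r in executive_view(project_level)])
# ['A-R1', 'B-R1', 'C-R1']
```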

What is the weekly cadence that keeps the log honest?

Four checkpoints, on the same calendar slot every week:

Owner update: each risk owner spends 30 seconds on status change, mitigation progress, or "still open, no change".
New risks: anyone can raise a new risk in 60 seconds. The PM scores it on the spot.
Promotions: risks that became issues this week move to the Issues table; mitigations that succeeded get marked closed.
Top 5 review: the highest-scoring 5 risks get one minute each — is the mitigation working, do we need to escalate.

Total time: 15 minutes. If the meeting runs longer, the log is out of date or the project has too many risks - both are signals.
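The 15-minute budget is easy to sanity-check with arithmetic. The sketch below uses the per-item times from the four checkpoints; the time for the promotions step is not stated above, so `promotion_seconds` is an assumed parameter, and the function name is hypothetical.

```python
OWNER_UPDATE_SECONDS = 30   # per open risk
NEW_RISK_SECONDS = 60       # per newly raised risk
TOP_REVIEW_SECONDS = 60     # per risk in the top-5 review
TOP_REVIEW_COUNT = 5
BUDGET_SECONDS = 15 * 60


def meeting_seconds(open_risks: int, new_risks: int,
                    promotion_seconds: int = 120) -> int:
    """Estimate the weekly review length in seconds.

    promotion_seconds is an assumption; the cadence gives no figure for it.
    """
    reviewed = min(open_risks, TOP_REVIEW_COUNT)
    return (open_risks * OWNER_UPDATE_SECONDS
            + new_risks * NEW_RISK_SECONDS
            + promotion_seconds
            + reviewed * TOP_REVIEW_SECONDS)


# 12 open risks, 2 new: 360 + 120 + 120 + 300 = 900 s, exactly on budget.
print(meeting_seconds(12, 2) <= BUDGET_SECONDS)  # True
# 20 open risks, 3 new: 600 + 180 + 120 + 300 = 1200 s, five minutes over.
print(meeting_seconds(20, 3) <= BUDGET_SECONDS)  # False
```

Under these assumptions the budget holds up to roughly a dozen open risks, which gives a concrete meaning to "the project has too many risks".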

What failure modes does the RAID log itself introduce?

Log becomes shelfware. Created at kickoff, never updated. Mitigation: keep the weekly cadence and make the PM accountable for staleness; give each risk a "last updated" date and flag entries older than 14 days.
Risks confused with worries. "I'm worried about scaling" is not a risk; "Postgres write throughput will exceed 5K/s by Q3 based on growth" is. Use the estimation techniques to give risks numbers.
Mitigation listed but not assigned. "We should fix the flaky tests" with no owner = nothing happens. Every risk has exactly one owner.
Score inflation. Every risk becomes "high probability, high impact" because owners want attention. Mitigation: cap the share of high scores - across all open risks, only about 20% should score 15 or higher.
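Two of these checks are mechanical enough to automate: flagging stale entries and spotting score inflation. A minimal sketch, assuming each risk carries a "last updated" date and a 1-25 score; the function names and sample values are hypothetical.

```python
from datetime import date

STALE_AFTER_DAYS = 14   # entries older than this get flagged
HIGH_SCORE = 15
MAX_HIGH_SHARE = 0.20   # roughly 20% of open risks may score 15+


def stale_entries(last_updated: dict[str, date], today: date) -> list[str]:
    """IDs of risks whose last update is more than 14 days old."""
    return [rid for rid, updated in last_updated.items()
            if (today - updated).days > STALE_AFTER_DAYS]


def high_score_share(scores: list[int]) -> float:
    """Fraction of open risks scoring 15 or higher."""
    if not scores:
        return 0.0
    return sum(s >= HIGH_SCORE for s in scores) / len(scores)


today = date(2026, 6, 3)
last_updated = {
    "R1": date(2026, 6, 1),   # 2 days old
    "R2": date(2026, 5, 15),  # 19 days old, stale
    "R3": date(2026, 5, 20),  # exactly 14 days old, not yet stale
}
print(stale_entries(last_updated, today))  # ['R2']

scores = [20, 16, 15, 12, 9, 6, 5, 4, 3, 2]
share = high_score_share(scores)
print(share, share > MAX_HIGH_SHARE)  # 0.3 True
```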

When is a RAID log overkill?

Two cases.

Two-week internal task. A bug fix sprint or a minor refactor does not need a formal log. The senior engineer mentions risks in the daily standup and that is enough.

Truly stable project. Long-running maintenance work where the same handful of risks rotate (cert expiry, dependency updates, flaky test cleanup). Capture them in the team's runbook, not a project RAID log.

The RAID log earns its complexity at the project level — bounded scope, hard deadline, multiple unknowns. Below that, lighter artifacts work.

Where should you go from here?

Next chapter: stakeholders and comms - how to communicate the risks you tracked here to people who do not attend your weekly review. After that, scope and tradeoffs shows how to respond when a risk forces a scope change.

Frequently asked questions

How is a risk different from an issue?

Tense. A risk is in the future ('vendor X may not deliver SDK on time'); an issue is in the present ('vendor X did not deliver SDK; we are blocked'). The same problem moves from risk to issue once the event has actually occurred; a risk rated "certain" is still a risk until then. Track both because they need different actions: risks get mitigation plans, issues get incident response.
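The move from risk to issue can be modeled as a small state change. The sketch below is illustrative only: the field names loosely follow the Risks and Issues tables, while the "Became issue" status, the `from_risk` back-reference, and the sample next action are assumptions added for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Risk:
    id: str
    description: str
    owner: str
    status: str = "Open"


@dataclass
class Issue:
    id: str
    description: str
    severity: str
    owner: str
    next_action: str
    opened: date
    from_risk: Optional[str] = None  # hypothetical link back to the risk


def promote_to_issue(risk: Risk, issue_id: str, description: str,
                     severity: str, next_action: str, opened: date) -> Issue:
    """Close out the risk and open an issue with the same owner."""
    risk.status = "Became issue"  # assumed status label
    return Issue(issue_id, description, severity, risk.owner,
                 next_action, opened, from_risk=risk.id)


risk = Risk("R1", "Vendor X may not deliver SDK on time", owner="TL")
issue = promote_to_issue(
    risk, "I2", "Vendor X did not deliver SDK; we are blocked",
    severity="high", next_action="Escalate to vendor contact",
    opened=date(2026, 6, 15),
)
print(risk.status, issue.owner, issue.from_risk)  # Became issue TL R1
```

Carrying the owner and the risk ID across keeps the paper trail intact for the retrospective.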

What probability and impact scale should I use?

Five levels for each, kept simple. Probability: rare / unlikely / possible / likely / certain. Impact: minor / moderate / major / severe / catastrophic. Multiply the ordinal positions (1 to 5) for a 1-25 score. Risks scoring 15+ are eligible for the weekly executive status report; below 15 stay in the team-level RAID log.
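As a worked example of the scoring rule, here is a minimal Python sketch. The scale labels and the threshold of 15 come from the answer above; the function names are hypothetical.

```python
PROBABILITY = ["rare", "unlikely", "possible", "likely", "certain"]
IMPACT = ["minor", "moderate", "major", "severe", "catastrophic"]
EXEC_THRESHOLD = 15


def risk_score(probability: str, impact: str) -> int:
    """Multiply the 1-based positions of the two labels on their scales."""
    p = PROBABILITY.index(probability.lower()) + 1
    i = IMPACT.index(impact.lower()) + 1
    return p * i


def meets_exec_threshold(probability: str, impact: str) -> bool:
    """True when the score is 15 or higher."""
    return risk_score(probability, impact) >= EXEC_THRESHOLD


print(risk_score("likely", "major"))             # 4 * 3 = 12
print(risk_score("certain", "minor"))            # 5 * 1 = 5
print(meets_exec_threshold("likely", "severe"))  # 4 * 4 = 16 -> True
```

An unknown label raises `ValueError` from `list.index`, which is a reasonable default here: a typo in the log should fail loudly rather than score silently.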

Who owns the RAID log?

The PM (or the senior engineer playing PM, per [chapter 1](/tech-pm/pm-vs-em-vs-tpm)). Each entry has a separate owner who is accountable for mitigation, but the log itself is one person's responsibility. If two people own the log, no one updates it. Review the log in the weekly project meeting; archive entries that have been closed for 30 days.

Should risks be in Jira/Linear or a separate doc?

Separate. Issue trackers are for tasks; risks are for what might happen. Mixing them dilutes both. A markdown file in the project repo works for small teams; a Confluence page or Notion table for larger teams. The key requirement is that it shows in the weekly status review - if it lives somewhere no one opens, it does not exist.