Test management

Defect triage that doesn't drown the team

Severity × frequency × impact, with explicit non-fix criteria and SLAs per severity tier. The process that prevents the backlog from growing to 400+ untriaged items.

May 23, 2026 · 10 min read

A team that doesn't actively triage defects ends up drowning in them. Every found bug gets filed, every automated test failure produces a defect, every customer report becomes an item. After six months, the defect backlog is 400 items deep and growing; no one knows which ones are actually being fixed; the QA lead spends meetings reading lists instead of making decisions.

The fix is a triage discipline with explicit categories, explicit owners, and explicit non-fix criteria. Defects that aren't going to be fixed should be closed as won't-fix with a rationale, not left in the backlog forever to inflate the count.

The triage framework: severity × frequency × impact

Every defect gets scored on three dimensions:

Severity (1-4)

1 (Critical): customer-impacting, no workaround. Crashes, data loss, security breach, payment failures. Fixes get a hotfix release.
2 (High): customer-impacting with workaround, or major functionality broken. Fixes get into the next planned release.
3 (Medium): noticeable issue, functional workaround exists. Fixes scheduled based on priority + capacity.
4 (Low): minor cosmetic, edge case, or polish issue. Fixes scheduled when convenient or won't-fixed.

Frequency (1-4)

1 (Always): every customer hits it.
2 (Often): affects a significant share (say 20%+).
3 (Occasionally): specific conditions; affects a few percent.
4 (Rare): very specific edge case.

Impact (1-4)

1 (Blocking): customer cannot complete their primary task.
2 (Major): customer can complete the task with workaround or extra steps.
3 (Minor): degrades experience but doesn't block.
4 (Cosmetic): doesn't affect functionality.

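The product of severity × frequency × impact (or, more usefully, the sum weighted by team policy) becomes the triage score. Because 1 is the worst rating on every scale, invert each rating first (5 minus the rating) so that a higher score means a more urgent defect. Defects above a threshold get scheduled; defects below get won't-fixed.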
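As a rough illustration of the scoring, here is a minimal Python sketch. The names (`DefectRating`, `urgency`, `WEIGHTS`, `SCHEDULE_THRESHOLD`) and the specific weights and threshold are assumptions made up for the example, not values from any standard. Because 1 is the worst rating on each scale, the sketch inverts each rating so that a higher score means a more urgent defect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefectRating:
    severity: int   # 1 (critical) .. 4 (low)
    frequency: int  # 1 (always) .. 4 (rare)
    impact: int     # 1 (blocking) .. 4 (cosmetic)

    def __post_init__(self):
        for name in ("severity", "frequency", "impact"):
            value = getattr(self, name)
            if value not in (1, 2, 3, 4):
                raise ValueError(f"{name} must be 1-4, got {value}")


def urgency(rating: int) -> int:
    """Invert a 1-4 rating so that 4 is the most urgent and 1 the least."""
    return 5 - rating


def product_score(r: DefectRating) -> int:
    """Product form of the triage score; ranges from 1 to 64."""
    return urgency(r.severity) * urgency(r.frequency) * urgency(r.impact)


# Hypothetical team policy: severity counts most, frequency least.
WEIGHTS = {"severity": 3, "frequency": 1, "impact": 2}

# Hypothetical cut-off on the weighted score (range 6 to 24 with these weights).
SCHEDULE_THRESHOLD = 12


def weighted_score(r: DefectRating, weights=WEIGHTS) -> int:
    """Weighted-sum form of the triage score."""
    return sum(w * urgency(getattr(r, name)) for name, w in weights.items())


def triage_outcome(r: DefectRating, threshold: int = SCHEDULE_THRESHOLD) -> str:
    """Schedule defects at or above the threshold; won't-fix the rest."""
    return "schedule" if weighted_score(r) >= threshold else "wont-fix"
```

The weighted sum lets a team say that severity matters more than frequency, which the plain product cannot express. Whatever weights and threshold a team picks, they belong in the written rubric so that two triagers score the same defect the same way.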

Triage cadence

The right cadence depends on defect volume:

Under 10 new defects per week: triage in the existing standup/iteration planning cycle.
10-50 new defects per week: weekly 30-minute triage meeting with PM + tech lead + QA lead.
>50 new defects per week: daily 15-minute triage stand-up with rotating attendees.

The cadence matters because untriaged defects rot. After 30 days, the original reporter has forgotten the context; the customer's environment has shifted; the relevant code has been refactored. Triage promptness is the difference between "we know what's broken" and "we have a list of historical complaints".
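The volume bands and the 30-day rot window can be written down as two small functions. This is a sketch only: the function names are hypothetical, and the boundary handling (exactly 10 and exactly 50 both fall in the weekly band) is one reading of the ranges above.

```python
from datetime import date


def triage_cadence(new_defects_per_week: int) -> str:
    """Map weekly defect volume to a triage cadence."""
    if new_defects_per_week < 0:
        raise ValueError("volume cannot be negative")
    if new_defects_per_week < 10:
        return "existing standup / iteration planning"
    if new_defects_per_week <= 50:
        return "weekly 30-minute triage meeting"
    return "daily 15-minute triage stand-up"


def is_rotting(filed_on: date, today: date, max_age_days: int = 30) -> bool:
    """True when an untriaged defect is older than the rot window."""
    return (today - filed_on).days > max_age_days
```

A tracker query built on something like `is_rotting` gives the triage owner a short list of items that need a decision before their context is gone.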

What "won't fix" means and when to use it

Won't-fix is a legitimate triage outcome and the most-avoided one. Teams that never close defects as won't-fix accumulate hundreds of "we'll get to it eventually" items, none of which they will get to.

Healthy won't-fix categories:

Behaves as designed: the defect describes intended behaviour. Close with a note explaining the design rationale.
Cannot reproduce: tried, couldn't make it happen; no clear repro from the reporter. Close with a request to reopen if reproducible.
Edge case below the value threshold: affects 0.01% of users in a non-blocking way; fix cost exceeds value. Close with the scoring that justified the decision.
Superseded by other work: a planned refactor or feature will inherently fix this. Close with a link to the superseding work item.
External dependency: defect is in a third-party system. Close with a link to their issue tracker.
Stale: opened more than N months ago, no activity since. Close with a "stale closure" tag.

The cardinal rule: every won't-fix has a one-sentence rationale. The category alone isn't enough; the why matters for future reference.
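One way to make the rationale rule hard to skip is to enforce it where closures are recorded. The sketch below assumes a hypothetical `WontFixClosure` record; it is not tied to any particular tracker, and the category values are just slugs for the six categories listed above.

```python
from dataclasses import dataclass
from enum import Enum


class WontFixCategory(Enum):
    BEHAVES_AS_DESIGNED = "behaves-as-designed"
    CANNOT_REPRODUCE = "cannot-reproduce"
    BELOW_VALUE_THRESHOLD = "edge-case-below-value-threshold"
    SUPERSEDED = "superseded-by-other-work"
    EXTERNAL_DEPENDENCY = "external-dependency"
    STALE = "stale"


@dataclass(frozen=True)
class WontFixClosure:
    defect_id: str
    category: WontFixCategory
    rationale: str  # the one-sentence "why"

    def __post_init__(self):
        # The category alone is not enough: reject empty or blank rationales.
        if not self.rationale.strip():
            raise ValueError("a won't-fix closure needs a rationale")
```

Most trackers can approximate this with a required field on the won't-fix transition; the code only shows the intent.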

The non-fix policy

A useful triage outcome that gets used too rarely: the explicit non-fix policy. Some categories of defect are deliberately not fixed:

UI text changes that aren't user-facing errors. Typos in admin panels nobody reads. The change cost (review, deploy, regression) exceeds the value.
Edge cases in deprecated features. The team has announced the feature is going away in 6 months; defects in it get won't-fixed unless customer-impact-critical.
Performance regressions below the SLO threshold. A response time that went from 80ms to 110ms is technically a regression but well under the 200ms SLO. Won't-fixed with rationale.
Defects in beta or alpha features. By definition, beta users opted into instability. Defects get logged but get won't-fixed unless the user impact justifies hotfix attention.

The team's non-fix policy should be explicit in the triage rubric. New triagers shouldn't have to guess what gets won't-fixed.
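An explicit policy can be as simple as an ordered list of rules. The sketch below encodes the four categories above; the `Defect` fields, the `non_fix_reason` function, and the single `customer_critical` override applied to every rule are simplifying assumptions for illustration, and the 200 ms default mirrors the SLO used in the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Defect:
    kind: str                 # e.g. "ui-text", "performance", "functional"
    feature_status: str       # "ga", "beta", "alpha", or "deprecated"
    customer_critical: bool = False
    user_facing: bool = True
    latency_ms: Optional[float] = None  # current latency, for performance defects


def non_fix_reason(d: Defect, slo_ms: float = 200.0) -> Optional[str]:
    """Return the policy reason for not fixing, or None if normal triage applies."""
    if d.customer_critical:
        return None  # critical customer impact overrides the non-fix policy
    if d.kind == "ui-text" and not d.user_facing:
        return "non-user-facing UI text: change cost exceeds value"
    if d.feature_status == "deprecated":
        return "edge case in a deprecated feature"
    if d.kind == "performance" and d.latency_ms is not None and d.latency_ms < slo_ms:
        return "performance regression still within the SLO"
    if d.feature_status in ("alpha", "beta"):
        return "defect in a pre-release feature"
    return None
```

The returned string doubles as a starting point for the closure rationale, so a new triager sees both the decision and the reason.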

Defect lifecycle states

A useful state machine for defects:

New: just filed, not yet triaged.
Triaged: scored, then either scheduled or won't-fixed.
In progress: an engineer has picked it up.
In review: fix has been opened as a PR.
Awaiting verification: PR merged, deployed, awaiting QA confirmation.
Verified: QA has confirmed the fix.
Closed: ticket complete.
Won't fix: closed with non-fix rationale.
Cannot reproduce: closed without fix; reopening is welcome if a repro turns up.
Duplicate: closed as duplicate of another ticket; link added.

The full lifecycle matters because "Closed" without "Verified" is a common failure mode: the engineer marked it fixed, the fix didn't actually fix it, the customer hit the same issue 3 weeks later. Explicit verification by someone other than the fixer catches this.
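The states above can be enforced with a small transition table. The states come from the list; the allowed transitions and the `transition` helper are assumptions for this sketch (for example, that failed verification sends a defect back to "In progress"), so adapt them to the team's tracker.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    NEW = "New"
    TRIAGED = "Triaged"
    IN_PROGRESS = "In progress"
    IN_REVIEW = "In review"
    AWAITING_VERIFICATION = "Awaiting verification"
    VERIFIED = "Verified"
    CLOSED = "Closed"
    WONT_FIX = "Won't fix"
    CANNOT_REPRODUCE = "Cannot reproduce"
    DUPLICATE = "Duplicate"


# Assumed transitions. "Closed" is reachable only from "Verified".
ALLOWED = {
    State.NEW: {State.TRIAGED},
    State.TRIAGED: {State.IN_PROGRESS, State.WONT_FIX,
                    State.CANNOT_REPRODUCE, State.DUPLICATE},
    State.IN_PROGRESS: {State.IN_REVIEW},
    State.IN_REVIEW: {State.IN_PROGRESS, State.AWAITING_VERIFICATION},
    State.AWAITING_VERIFICATION: {State.VERIFIED, State.IN_PROGRESS},
    State.VERIFIED: {State.CLOSED},
    State.CLOSED: {State.NEW},            # reopened
    State.WONT_FIX: {State.NEW},          # reopened after re-review
    State.CANNOT_REPRODUCE: {State.NEW},  # reopened with a repro
    State.DUPLICATE: set(),
}


def transition(current: State, target: State,
               fixer: Optional[str] = None,
               verifier: Optional[str] = None) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    if target is State.VERIFIED and (verifier is None or verifier == fixer):
        raise ValueError("verification must come from someone other than the fixer")
    return target
```

The key property is that no path reaches "Closed" without passing through "Verified", and that "Verified" requires a second person.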

SLAs that prevent drowning

Healthy teams have explicit SLAs per severity:

Severity 1: triage within 1 hour; fix within 24 hours; verify within 48 hours.
Severity 2: triage within 4 hours; fix within 1 week; verify within 2 weeks.
Severity 3: triage within 1 day; fix within 2 weeks; verify within 1 month.
Severity 4: triage within 1 week; fix when capacity allows or won't-fix.

The SLAs are commitments, not aspirations. A team consistently missing them needs to either add capacity, raise the won't-fix bar (more defects close as won't-fix), or accept that severity definitions need to shift (fewer defects classified as critical).
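The SLA targets translate directly into a lookup table and a breach check. In this sketch the `SLA` table, `sla_deadline`, and `is_breached` are hypothetical names, and two simplifications are assumed: every clock starts when the defect is filed, and "1 month" is treated as 30 days.

```python
from datetime import datetime, timedelta
from typing import Optional

# Targets per severity, measured from filing time (an assumption of this sketch).
SLA = {
    1: {"triage": timedelta(hours=1), "fix": timedelta(hours=24),
        "verify": timedelta(hours=48)},
    2: {"triage": timedelta(hours=4), "fix": timedelta(weeks=1),
        "verify": timedelta(weeks=2)},
    3: {"triage": timedelta(days=1), "fix": timedelta(weeks=2),
        "verify": timedelta(days=30)},
    4: {"triage": timedelta(weeks=1)},  # fix when capacity allows, or won't-fix
}


def sla_deadline(severity: int, stage: str, filed_at: datetime) -> Optional[datetime]:
    """Deadline for a stage, or None when the severity has no target for it."""
    target = SLA[severity].get(stage)
    return None if target is None else filed_at + target


def is_breached(severity: int, stage: str,
                filed_at: datetime, now: datetime) -> bool:
    """True when the stage has a deadline and the deadline has passed."""
    deadline = sla_deadline(severity, stage, filed_at)
    return deadline is not None and now > deadline
```

Counting breaches per severity per week is a quick way to tell whether the team is missing SLAs consistently or only occasionally.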

Common pitfalls

Severity inflation: everything becomes a Severity 1. The team learns that Severity 1 isn't actually critical and stops responding. Re-anchor severity definitions quarterly.
No accountability: defects sit in "New" forever because no one owns triage. Assign a rotating triage role per week.
Reopening without escalation: a customer-reported repeat of a previously-closed defect should escalate the new instance's severity. Otherwise, "we fixed it" becomes meaningless.
Triaging in isolation: triage works best with PM + tech lead + QA together. PM-only triage misses technical context; tech-lead-only triage misses customer impact.
Letting won't-fix become a graveyard: re-review the won't-fix backlog periodically (quarterly works) to confirm each rationale still holds. Sometimes reopening a defect is warranted because the surrounding context has changed.

Metrics worth tracking

The dashboard a healthy triage practice produces:

Open defect count by severity, trend over time. Trending up = team is falling behind; trending down = catching up.
Median time-to-triage, by severity. Should be at or below SLA.
Median time-to-fix, by severity. Should be at or below SLA.
Won't-fix rate, by category. Healthy teams won't-fix 15-30% of incoming defects. Lower suggests insufficient triage discipline; higher suggests overuse.
Reopened defects per month. Trending up = quality issues in the fix process (often related to insufficient test coverage on fixes).
Customer-found vs internally-found ratio. Customer-found should be a small minority for healthy teams. Trending toward more customer-found = test/QA gaps.
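Two of these metrics, median time-to-triage and won't-fix rate, can be computed from a plain export of defect records. The record fields (`severity`, `hours_to_triage`, `outcome`) and the function names are assumptions for this sketch; a real tracker export will use different names.

```python
from statistics import median
from typing import Optional


def median_hours(records, severity: int, field: str) -> Optional[float]:
    """Median of a duration field for one severity; None if there is no data."""
    values = [r[field] for r in records
              if r["severity"] == severity and r.get(field) is not None]
    return median(values) if values else None


def wont_fix_rate(records) -> float:
    """Share of triaged defects that were closed as won't-fix."""
    if not records:
        return 0.0
    closed = sum(1 for r in records if r["outcome"] == "wont-fix")
    return closed / len(records)


def wont_fix_health(rate: float, low: float = 0.15, high: float = 0.30) -> str:
    """Compare the rate with the 15-30% band."""
    if rate < low:
        return "low: possible lack of triage discipline"
    if rate > high:
        return "high: possible overuse"
    return "within the 15-30% band"
```

Comparing `median_hours(records, 1, "hours_to_triage")` with the Severity 1 triage target shows at a glance whether the team is at or below SLA.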

A worked example

Team A receives 30 defects per week. Their weekly triage meeting (45 minutes) processes all 30. The typical outcome distribution:

2-3 won't-fix (behaves-as-designed, edge cases below threshold)
5-7 closed as duplicate or cannot-reproduce
15-20 scheduled by severity (most into the next 1-2 sprints)
2-3 immediate-fix (hotfix or top-of-sprint)

After 6 months, Team A's defect backlog is stable at 60-80 open items. The QA lead can describe the top 10 from memory. New engineers can browse the backlog and understand the state.

Team B receives the same volume but skips triage discipline. After 6 months, their backlog is 400+ items. No one can describe what's in it; the team avoids the defect tracker because it's overwhelming; customer-reported issues sit alongside ancient un-triaged items. The QA lead spends meetings managing perception ("we're working on it") rather than making decisions.

The difference is process, not headcount.

For the test-case structure that prevents defects from being filed for the wrong reasons, see Test-case design. For deciding which tests run when (and therefore which defects get found when), see Regression strategy. For the exploratory practice that surfaces the defects automation misses, see Exploratory testing.

Frequently asked questions

How should we prioritise defects?

Score each defect on three dimensions: severity (1-4, critical to low), frequency (1-4, always to rare), and impact (1-4, blocking to cosmetic). The product or weighted sum becomes the triage score; because 1 is the worst rating on each scale, invert the ratings first so that a higher score means a more urgent defect. Defects above a threshold get scheduled; defects below get won't-fixed with explicit rationale.

When should we close a defect as won't-fix?

Six legitimate categories: behaves-as-designed (the defect describes intended behaviour), cannot-reproduce, edge-case-below-value-threshold, superseded-by-other-work, external-dependency, stale (no activity for months). Every won't-fix gets a one-sentence rationale. Healthy teams won't-fix 15-30% of incoming defects.

How often should we triage?

Scale cadence with volume: under 10 new defects per week → triage in existing standup/iteration planning; 10-50/wk → weekly 30-min triage meeting (PM + tech lead + QA lead); over 50/wk → daily 15-min triage stand-up. Untriaged defects rot: after 30 days the context is lost.

What SLAs should we set per severity?

Common targets: Severity 1 (critical): triage 1hr, fix 24hr, verify 48hr; Severity 2 (high): triage 4hr, fix 1wk, verify 2wk; Severity 3 (medium): triage 1d, fix 2wk, verify 1mo; Severity 4 (low): triage 1wk, fix when-capacity-allows or won't-fix. A team consistently missing these needs to add capacity, raise the won't-fix bar, or re-anchor severity definitions.