Engineering Burnout & Process Debt 2026 — Methodology

600-respondent engineer + manager survey using the licensed Maslach Burnout Inventory short form (MBI-HSS) + a new 8-item process-debt instrument validated in pilot. Includes 100 manager-IC pairs from the same teams. Pre-registered hypotheses, Wilson 95% CIs, Benjamini–Hochberg FDR correction.

Research questions

The Stride 2026 engineering-burnout study addresses one primary question and four secondary questions.

The primary question is whether process debt is a stronger predictor of engineering burnout than the variables existing surveys typically measure (workload-hours, technical debt, single-item stress questions). The clinical literature points strongly at autonomy-related variables; the engineering-org survey literature largely measures workload. The Stride 2026 study tests directly whether process debt — accumulated friction from team processes that no longer fit the work — outperforms workload + technical debt as a burnout predictor.

The secondary questions test four hypotheses developed from the literature: (1) whether teams using validated practices (planning-poker + retrospectives-with-action-tracking) show lower burnout independent of AI; (2) whether AI tool adoption moves the burnout needle in either direction; (3) whether manager-reported team health diverges from IC-reported team health on the same MBI items; (4) whether autonomy moderates the process-debt → burnout relationship.

Hypotheses (pre-registered)

H1 — Process debt > technical debt as burnout predictor

Operationalisation. Composite MBI score is the unit-weighted sum of standardised Maslach Burnout Inventory short-form (HSS) scores across the three dimensions (emotional exhaustion, depersonalization, reduced personal accomplishment). Process debt is the unit-weighted sum across the 8-item process-debt instrument (validated in pilot, Cronbach's α ≥0.7 required). Technical debt is a 4-item self-rated instrument adapted from existing engineering-org surveys. Workload is reported weekly hours.

Prediction. In a hierarchical regression with composite MBI as the dependent variable, process debt's standardised coefficient (β_PD) will be larger in absolute magnitude than both the technical-debt coefficient (β_TD) and the workload-hours coefficient (β_WH). Effect-size comparison: |β_PD| ≥ 1.5 × max(|β_TD|, |β_WH|).

Falsification. Either technical debt or workload-hours predicts MBI as strongly as or more strongly than process debt.
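The H1 decision rule can be sketched in a few lines of Python. This is a minimal illustration rather than the study's analysis code: the function names `h1_supported` and `h1_falsified` and the example coefficients are hypothetical, and the sketch assumes the three standardised coefficients have already been estimated from the hierarchical regression.

```python
def h1_supported(beta_pd: float, beta_td: float, beta_wh: float,
                 ratio: float = 1.5) -> bool:
    """True when |beta_PD| >= ratio * max(|beta_TD|, |beta_WH|)."""
    strongest_rival = max(abs(beta_td), abs(beta_wh))
    return abs(beta_pd) >= ratio * strongest_rival


def h1_falsified(beta_pd: float, beta_td: float, beta_wh: float) -> bool:
    """True when a rival predictor is at least as strong as process debt."""
    return max(abs(beta_td), abs(beta_wh)) >= abs(beta_pd)


# Hypothetical coefficients, for illustration only.
print(h1_supported(0.45, 0.20, 0.25))   # 0.45 >= 1.5 * 0.25
print(h1_falsified(0.20, 0.20, 0.25))   # workload-hours is the stronger predictor
```

Read this way, a process-debt coefficient between 1× and 1.5× the strongest rival neither meets the prediction nor triggers the falsification condition, so both checks are worth reporting.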

H2 — Validated practices reduce burnout independent of AI

Operationalisation. "Validated practices" is the conjunction of (a) planning-poker estimation (self-reported, §3.2) and (b) retrospectives-with-action-tracking (self-reported, §3.4 — retrospectives held + at least one action item from the prior retro completed each sprint). AI-adoption stratum is reported in §3.5 on a 5-point scale.

Prediction. Teams using both validated practices will show lower composite MBI scores than teams using neither, after controlling for AI tool adoption. ANOVA: practices × AI; main effect for practices significant at p < 0.05 after BH-correction; AI × practices interaction not significant.

Falsification. Practices effect washes out after AI control, or AI × practices interaction emerges significant (suggesting practices effect is AI-dependent).

H3 (null) — AI doesn't move the burnout needle

Operationalisation. AI-adoption stratum (none / exploratory / selective / regular / pervasive) regressed against composite MBI score, with company size and tenure as covariates.

Prediction. The AI-stratum coefficient will be statistically indistinguishable from zero after BH-correction. The null is the finding — AI tool adoption is independent of burnout once company size and tenure are controlled. Consistent with State-of-AI Volume 0's pre-registered H4.

Falsification. Significant non-zero coefficient on AI in either direction.

H4 — Manager-IC perception divergence

Operationalisation. The 100 manager-IC pairs each complete the MBI-HSS short form (manager reports about their team's health; IC reports about their own health). Per-item difference scores are computed.

Prediction. On at least 3 of the 5 MBI-HSS short-form items, the mean manager-IC difference will be ≥15 percentage points (manager-reported team health higher than IC-reported team health). Test: paired t-test per item, BH-corrected.

Falsification. Mean difference is below 10 percentage points on all items, OR manager-IC reports differ in the opposite direction (manager-reported worse than IC-reported).
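A rough sketch of the per-item paired comparison follows. It assumes item scores have already been rescaled to 0–100 so that differences read as percentage points (the rescaling itself is an assumption, not something stated above), and the function names are hypothetical. Only the mean difference and the paired t statistic are computed; the p-value (t distribution, n − 1 degrees of freedom) and the BH correction would come from a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_difference(manager_scores, ic_scores):
    """Mean manager-minus-IC difference and paired t statistic for one item."""
    diffs = [m - i for m, i in zip(manager_scores, ic_scores)]
    d_bar = mean(diffs)
    standard_error = stdev(diffs) / sqrt(len(diffs))  # sample SD of the differences
    return d_bar, d_bar / standard_error


def h4_prediction_met(mean_diffs, threshold=15.0, min_items=3):
    """True when at least `min_items` items show a gap of `threshold` points or more."""
    return sum(d >= threshold for d in mean_diffs) >= min_items


def h4_falsified(mean_diffs, floor=10.0):
    """True when every item's gap is below `floor`.

    A gap that is reversed on every item is negative, so it also satisfies this check.
    """
    return all(d < floor for d in mean_diffs)


# Toy data for four pairs on a single item (illustrative values only).
d_bar, t_stat = paired_difference([80, 70, 90, 60], [60, 55, 70, 50])
print(round(d_bar, 2), round(t_stat, 2))
```

How a mixed pattern (some items reversed, others not) should be classified is left open here, as the falsification wording does not settle it.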

H5 (exploratory) — Autonomy moderates process-debt → burnout

Reported with explicit exploratory framing. Autonomy is measured via a 4-item scale adapted from the Karasek Job Content Questionnaire decision-latitude subscale. The interaction effect of autonomy × process-debt on MBI is reported; if significant, the effect of process debt is stronger in low-autonomy environments.

Multiple-comparison correction

The four planned hypotheses (H1–H4) form one family. BH FDR is controlled at q = 0.05 across the family. H5 is explicitly outside the planned family.
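For readers unfamiliar with the procedure, the Benjamini–Hochberg step-up rule for a family of four p-values can be sketched as below. The function name and the example p-values are hypothetical; the sketch only shows how the q = 0.05 threshold is applied to ranked p-values.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max (the step-up property).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Illustrative p-values for H1..H4 (not study results).
print(benjamini_hochberg([0.010, 0.040, 0.030, 0.200]))
```

Because the rule is step-up, a p-value that misses its own rank threshold is still rejected if a larger p-value further down the ranking meets its threshold.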

Survey design

The instrument is ~56 substantive items + 4 screening + 3 attention checks + 8 firmographics. Median completion target 14 minutes.

Screening

S1–S4: role (IC engineer / EM / staff+ / director / VP — students screened out); tenure ≥3 years engineering work; current employment; English proficiency.

Section 1 — Team + context (8 items)

Team size, company size, industry, region, role, tenure on team, remote/hybrid/in-office, regulatory burden.

Section 2 — Sprint + process (8 items)

Sprint length, planning frequency, planning method (planning-poker / t-shirt / three-point / none), retrospective frequency, retrospective action-tracking (do retros produce action items that get tracked + completed?), on-call rotation, on-call fairness, post-incident review cadence.

Section 3 — AI adoption (4 items)

AI tool adoption stratum (none/exploratory/selective/regular/pervasive). Time-since-AI-introduction. Mandatory-vs-voluntary adoption. Perceived effect of AI on team workload (positive / no change / negative).

Section 4 — Workload + technical debt (6 items)

Weekly working hours (self-reported). Workload sustainability (5-point). Technical-debt self-rating (4 items): perceived velocity-impact of accumulated technical debt; perceived ratio of bug-fixing to feature work; perceived test coverage; perceived legacy-code burden.

Section 5 — Process debt (8 items, new instrument, validated in pilot)

Items operationalise process-debt across four dimensions: (i) sprint-structure fit, (ii) retrospective effectiveness, (iii) estimation-process fit, (iv) on-call-rotation fit. Each item: 5-point Likert. Composite score is unit-weighted sum of standardised items.

Sample items: "Our sprint length feels matched to the work we do." "Our retrospectives surface issues that actually get addressed." "Our estimation process produces commitments we can defensibly make." "Our on-call rotation is fair and sustainable."
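The unit-weighted composite can be illustrated with a short sketch. Assumptions are flagged here and in the comments: the function names are hypothetical, items are standardised with the sample standard deviation, and responses are used as given. The sample items are worded so that agreement indicates good fit; whether they are reverse-scored so that a higher composite means more process debt is not specified above, so the sketch does not reverse them.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardise one item's responses (sample SD; an assumption)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def composite_scores(item_columns):
    """Unit-weighted sum of standardised items.

    item_columns: one list per item, each holding one response per respondent.
    Returns one composite score per respondent.
    """
    standardised = [zscores(column) for column in item_columns]
    return [sum(per_respondent) for per_respondent in zip(*standardised)]


# Two toy items answered by three respondents (illustrative only).
print(composite_scores([[1, 2, 3], [2, 4, 6]]))
```

Standardising before summing keeps an item with a wider response spread from dominating the composite, which is the usual reason for preferring this over a raw sum.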

Section 6 — Maslach Burnout Inventory short form (MBI-HSS, ~13 items)

Mind Garden licensed MBI-HSS short form. Three dimensions: emotional exhaustion (5 items), depersonalization (4 items), reduced personal accomplishment (4 items). 7-point frequency scale ("never" to "every day").

Section 7 — Autonomy (Karasek decision-latitude, 4 items)

Adapted from the Karasek Job Content Questionnaire decision-latitude subscale. Sample item: "I have a lot of say about what happens on my job." 5-point Likert.

Section 8 — Outcomes (4 items)

NPS-style satisfaction with current job. Tenure intent (likelihood of staying ≥12 months). Sleep quality (single-item proxy). Work-life balance (single-item proxy).

Geographic region, gender (optional), opt-in for follow-up, dataset-release consent.

Attention checks

Three checks: an explicit-instruction check, a list-question with one obviously off-topic option, a paragraph-comprehension check. Respondents failing ≥2 of 3 are screened out and replaced.

Recruitment

Prolific Academic panel arm

n = 500 completes. Effective CPI ~$3.50 USD (longer survey + MBI licensing fee per response).

Manager-IC paired arm (organic recruitment)

n = 100 pairs (200 respondents) recruited via:

  • Industry partner referrals
  • Engineering-leader communities (Rands Slack, Lead Dev)
  • Stride newsletter targeted ask

Pairs complete a tied recruitment flow: manager invites IC; both complete the survey within a 14-day window. Tied identifiers are retained pseudonymously for the paired analysis only; the pseudonyms are not in the released dataset.

Organic top-up arm

n = 100 single completes via newsletter / LinkedIn / community channels.

Pilot wave

N = 60 panel respondents + 8 think-aloud sessions ($75 honorarium, 30 minutes each). Process-debt-instrument factor structure validated in pilot (Cronbach's α threshold). Any item with α below 0.65 is reworded; any sub-scale with α below 0.7 is reworked.
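As a reference point for the thresholds above, Cronbach's α for a scale or sub-scale can be computed as sketched below. The function name and the toy responses are hypothetical; the formula is the standard α = k/(k − 1) × (1 − Σ item variances / variance of the summed score), using sample variances.

```python
from statistics import variance


def cronbach_alpha(item_columns):
    """Cronbach's alpha for a set of items.

    item_columns: one list per item, each holding one response per respondent.
    """
    k = len(item_columns)
    item_variance_sum = sum(variance(column) for column in item_columns)
    totals = [sum(per_respondent) for per_respondent in zip(*item_columns)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Three toy items answered by four respondents (illustrative only).
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]])
print(round(alpha, 3), alpha >= 0.7)
```

An item-level screen is commonly run by recomputing α with each item left out in turn; whether the pilot uses that approach is not stated here.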

Statistical methods

Effect sizes, not just p-values

  • Cohen's h for proportion comparisons.
  • Cohen's d + Hedges' g for continuous magnitudes.
  • Cohen's f² for hierarchical regression.
  • Wilson 95% CIs on every quoted percentage.
  • Bootstrap 95% CIs (10,000 iterations) on continuous magnitudes.
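Three of the quantities listed above are simple enough to sketch directly. The function names are hypothetical, and the bootstrap uses the percentile method, which is an assumption (the variant is not specified above); the Wilson interval and Cohen's h follow their standard formulas.

```python
import random
from math import asin, sqrt
from statistics import mean


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a proportion (z = 1.959964 gives 95%)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def bootstrap_ci(values, stat=mean, iterations=10_000, seed=0):
    """Percentile bootstrap 95% CI for `stat` (percentile method is an assumption)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    estimates = sorted(stat(rng.choices(values, k=n)) for _ in range(iterations))
    lower = estimates[int(0.025 * iterations)]
    upper = estimates[int(0.975 * iterations) - 1]
    return lower, upper


low, high = wilson_interval(50, 100)
print(round(low, 4), round(high, 4))
print(round(cohens_h(0.60, 0.45), 3))
```

Unlike the normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly for small cells, which matters for the smaller strata in this design.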

Multiple-comparison correction

The planned hypothesis family (H1–H4) is corrected with BH FDR at q = 0.05.

Sensitivity analysis

Before publishing Volume 1, we run three sensitivity analyses:

  1. With and without the manager-IC pair sub-sample (the pairs likely bias toward healthier teams).
  2. With and without the organic top-up stratum.
  3. Alternative process-debt scoring (factor-loading-weighted vs unit-weighted).

Any headline number that flips direction under any sensitivity condition is flagged inline.
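The flagging rule can be expressed as a small check. This is a sketch under stated assumptions: the function name, the condition labels, and the example estimates are hypothetical, and an estimate of exactly zero is treated as not having flipped.

```python
def flipped_conditions(headline, sensitivity_estimates):
    """Names of sensitivity conditions whose estimate has the opposite sign to the headline.

    sensitivity_estimates: mapping of condition name -> re-estimated effect.
    """
    return [name for name, estimate in sensitivity_estimates.items()
            if estimate * headline < 0]


# Illustrative values only, not study results.
conditions = {
    "without_manager_ic_pairs": 0.25,
    "without_organic_stratum": -0.05,
    "loading_weighted_scoring": 0.28,
}
print(flipped_conditions(0.30, conditions))
```

Any condition returned by the check would trigger the inline flag on the corresponding headline number.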

Reproducibility

The Volume 0 landscape figures are reproducible today. A Jupyter notebook at apps/platform/research/2026/reproducibility/burnout/landscape-charts.ipynb loads the four published-data CSVs and regenerates every Volume 0 figure. Each CSV row carries a citation_url so a peer-reviewer can trace every plotted point.

Dataset publication (Volume 1)

When Volume 1 lands, the survey dataset publishes under CC-BY-4.0 with the MBI-HSS item-level responses redacted per Mind Garden's licensing requirements. Released:

  • responses.csv — one row per respondent. MBI dimension composite scores (not item-level). Process-debt item-level responses (own instrument). All demographics + practice covariates. Stratum flag + post-stratification weight.
  • process-debt-validation.csv — pilot factor-loading analysis output for the 8-item process-debt instrument.
  • manager-ic-pairs.csv — per-pair per-item difference scores (item-level for the paired analysis; permitted by Mind Garden's research licensing for within-sample differences).
  • cross-tabs/ — pre-computed cross-tabs for the planned hypothesis family.

Vendor-neutrality posture

Stride is a software-delivery platform. We publish this study with the same disciplines as the rest of the 2026 series:

  • Maslach Burnout Inventory items are used verbatim under Mind Garden's research licensing. No deviation.
  • Stride is not named in survey items.
  • The process-debt instrument is published in full as part of the Volume 1 release; any subsequent researcher can re-use it under CC-BY-4.0.
  • The editorial owner has final cut, not the go-to-market (GTM) team.
  • The dataset publishes whether the findings flatter Stride or not. A finding that Stride's own customers show no lower burnout than non-Stride teams ships. A null on H3 ships.