---
slug: tree-sheriff
type: pattern
summary: "The rotating on-call role that keeps the Chromium build tree green: a Tree Sheriff reverts test-breaking changes without the author's permission and opens or closes the tree to gate further commits."
created: 2026-05-12
updated: 2026-06-07
last_link_verified: 2026-06-07
related:
  owners-file-governance:
    relation: bypasses
    note: "A Tree Sheriff reverts a landed change without obtaining its directory OWNERS' LGTM; the bypass of the directory-scoped authority regime is structural to the role, not an exception the sheriff is granted case by case."
  perf-sheriff:
    relation: complements
    note: "Both are rotating on-call roles with the authority to act on a contributor's change without prior negotiation; the Tree Sheriff guards correctness on the build tree, the Perf Sheriff guards the performance regression dashboard, and the two rotations partition the project's continuous-integration health between them."
  cross-timezone-review:
    relation: complements
    note: "Both are coordination conventions that keep work moving when the responsible party is unavailable; the Tree Sheriff reverts a breaking change without waiting for its author, and cross-timezone etiquette routes review without waiting for a synchronous reply."
  conways-law:
    relation: produced-by
    note: "Tree Sheriff rotations are staffed primarily by Google contributors because Google contributes the most code, so the rotation's coverage tracks Mountain View business hours rather than the project's full contributor map; the staffing distribution is a Conway's-Law artifact."
---

# Tree Sheriff

> **Pattern**
>
> A named solution to a recurring problem.

*A rotating on-call role with the authority to keep the Chromium continuous-integration tree green: the Tree Sheriff reverts test-breaking changes without the author's permission and opens or closes the tree to gate further commits.*

> **📝 Where the name comes from**
>
> The "sheriff" metaphor is older than Chromium.
> Mozilla used "sheriff" for the volunteer who watched the Tinderbox build dashboard and backed out the commit that turned it red, and the term carried the frontier-justice connotation deliberately: the sheriff keeps order on the shared tree, and the authority to act precedes a hearing. Chromium inherited the role and the name. The connotation is the load-bearing part: the sheriff reverts first and the author argues afterward.

A contributor at a downstream vendor lands a change on Friday afternoon, watches the commit queue accept it, and logs off for the weekend. Two hours later a test on the Mac-ASan bot starts failing, the build console turns red, and every subsequent change is now blocked behind a tree that no one can land on.

The contributor is asleep. The fix isn't obvious. And the person who reverts the change, restoring the tree to green within fifteen minutes, has never reviewed a line of that contributor's code, isn't listed in any `OWNERS` file the change touched, and didn't ask permission. That person is the Tree Sheriff, and the authority to revert without the author in the loop is the entire point of the role.

## Context

This pattern sits at the operational layer of Chromium's coordination machinery, alongside the on-call role that [Perf Sheriff](perf-sheriff.md) names and one level below the authority regime that [OWNERS File Governance](owners-file-governance.md) establishes. The OWNERS file decides who may approve a change *before* it lands; the Tree Sheriff is the authority that acts *after* a change has landed and broken something. The two regimes meet at a deliberate seam: the sheriff's revert authority cuts across the directory-scoped LGTM authority that gated the change in the first place.

The reader who needs this pattern most is a contributor from a downstream organization (Microsoft Edge, Igalia, Intel, Samsung, an enterprise browser vendor, an Electron application author) whose first encounter with the role is having their own change reverted by an account they don't recognize, with a terse revert message and no prior conversation. The pattern names the role so that encounter is interpretable rather than alarming. It also speaks to the CIO and the Head of Engineering budgeting the coordination cost of an upstream contribution: a change that lands on a shared tree carries an obligation to keep that tree green, and the structural consequence of failing the obligation is a revert the contributing organization doesn't control.

## Problem

Chromium runs a shared continuous-integration tree that several hundred contributors a day land changes onto. The tree's value depends on its being green: when the build and the test suite pass, a contributor can branch from tip-of-tree with confidence, bisect a regression against a known-good baseline, and trust that a new failure is their own. The moment the tree goes red, that confidence collapses. A red tree masks subsequent breakages, makes bisection unreliable, and blocks the commit queue for everyone, so a single broken change imposes a cost on the whole project that grows by the minute until the tree is green again.

The recurring difficulty is that the person who broke the tree is frequently unavailable, and waiting for them is the expensive option. The author may be asleep eight time zones away, may be in a meeting, may not yet know their change is the cause, or may disagree that it is. Every minute the project spends locating the author, explaining the failure, and waiting for them to choose a fix is a minute the whole contributor base is blocked. The project needs someone empowered to restore the tree to green *now*, on incomplete information, without the author's consent.
And it needs that authority to be legitimate rather than a land-grab, so the reverted contributor accepts the revert instead of re-landing over it.

## Forces

- **Speed beats correctness of attribution.** Restoring the tree to green fast matters more than reverting the exactly-right change. A revert that turns out to be the wrong culprit is cheap to undo; a red tree that sits for an hour while the project debates the cause is expensive for everyone.
- **The authority must precede the author's consent.** If a revert required the original author's LGTM, the mechanism would stall exactly when the author is unavailable, which is the common case. The authority to revert without permission is what makes the role useful.
- **The authority must still be legitimate.** A revert is a public act against another contributor's work. Without an explicit charter, sheriffed reverts would invite re-landing wars. The rotation, the documented charter, and the norm that a reverted author doesn't re-land without addressing the failure are what convert raw revert power into accepted authority.
- **The load must rotate.** Sheriffing is interrupt-driven, attention-heavy, and incompatible with sustained feature work. No contributor can do it indefinitely. The role has to rotate at a cadence that spreads the burden without fragmenting the context each shift accumulates.
- **Flaky tests blur the signal.** Not every red bot is a real regression; a flaky test fails intermittently for reasons unrelated to any change. The sheriff has to distinguish a genuine breakage that warrants a revert from flake that warrants a disable-and-file, and getting that judgment wrong in either direction is costly.

## Solution

Charter a rotating on-call role, the Tree Sheriff, and grant it three standing authorities over the build tree.

**Revert without the author's permission.** When a change turns the tree red, the sheriff reverts it immediately, without waiting for the author and without an OWNERS LGTM on the revert. The revert message names the failing bot and links the failure so the author can see, on returning, exactly why their change was backed out. The norm that completes the authority is on the author's side: a reverted contributor doesn't re-land the change without addressing the failure that caused the revert. The sheriff reverts first; the conversation happens after the tree is green.

**Open and close the tree.** The sheriff maintains a tree status (open, closed, or throttled) that gates whether the commit queue accepts new changes. When the tree is broken in a way that a revert can't immediately fix, or when a cascade of failures makes it unsafe to land anything, the sheriff closes the tree, which stops new commits from compounding the problem. Reopening the tree is the signal that landing is safe again. The status is a shared, project-wide control surface, not a per-change decision.

**Garden the flaky tests.** A test that fails intermittently without any real regression behind it is noise that erodes the tree's signal. The sheriff disables or marks such tests as known-flaky and files a bug against the owning team, trading a temporary loss of coverage for a tree whose red state once again means something. This is the maintenance half of the role: not every red bot triggers a revert, and telling the two cases apart is the judgment the rotation exists to supply.

The rotation runs on a fixed cadence, typically one week per assignment, staffed from a roster, so that the authority is always present, always attributable to a named on-call contributor, and never resident in one person long enough to burn them out.
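
The tree-status gate and the revert message described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Chromium's tooling: `TreeStatus`, `queue_accepts`, `revert_message`, and the message layout are hypothetical names. The text doesn't define what "throttled" means, so the sketch assumes a throttled tree admits a limited number of changes, tracked by a hypothetical `slots_left` counter. The gate models ordinary contributor changes only.

```python
from enum import Enum


class TreeStatus(Enum):
    """The three tree states named in the text."""
    OPEN = "open"
    CLOSED = "closed"
    THROTTLED = "throttled"


def queue_accepts(status: TreeStatus, slots_left: int = 0) -> bool:
    """Return True if the commit queue should accept an ordinary change.

    Assumption: a throttled tree admits changes only while a limited
    budget (`slots_left`, hypothetical) remains. The source names the
    state but does not define it.
    """
    if status is TreeStatus.OPEN:
        return True
    if status is TreeStatus.THROTTLED:
        return slots_left > 0
    return False  # CLOSED: nothing ordinary lands


def revert_message(change_id: str, failing_bot: str, failure_url: str) -> str:
    """Build a revert message that names the failing bot and links the failure.

    The layout is illustrative; only the two required facts come from the text.
    """
    return (
        f"Revert {change_id}\n\n"
        f"Reason: turned {failing_bot} red.\n"
        f"Failure: {failure_url}\n"
    )
```

Keeping the status as one project-wide value, rather than a per-change flag, mirrors the point that the tree status is a shared control surface.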
The escalation path is defined in advance: a failure the sheriff can't resolve within the shift routes to a named secondary or to the owning team's on-call, so the tree is never left red because the sheriff was stuck.

## How It Plays Out

A contributor at Igalia in A Coruña lands a rendering change that passes the commit queue's pre-submit checks but breaks a post-submit test that only runs on the full Mac bot. The contributor has logged off for the day. The Tree Sheriff on rotation in Mountain View sees the build console turn red, reads the failure, identifies the Igalia change as the most likely cause from the blame range, and reverts it: fifteen minutes from red to green. The revert message links the failing bot and the test log. The Igalia contributor reads it the next morning, reproduces the failure locally, fixes the test interaction, and re-lands the corrected change. No conversation was needed before the revert. The durable revert message carried everything the author needed to act, and the tree never sat red across the timezone gap that [Cross-Timezone Review Etiquette](cross-timezone-review.md) describes.

A sheriff watching the console sees three unrelated bots go red within ten minutes, each on a different recent change, with a fourth failure that looks like infrastructure rather than any commit. Rather than revert four changes and risk reverting the wrong ones, the sheriff closes the tree, stopping new commits from compounding the cascade. They triage: one failure is a genuine regression they revert, one is a known-flaky test they disable and file a bug against, and the infrastructure failure they escalate to the build team's on-call. With the cascade contained and the real regression reverted, the sheriff reopens the tree. The whole project was blocked for twenty minutes rather than chasing a moving target for two hours.

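The triage in the cascade scenario above can be sketched as a small decision function. This is a sketch under stated assumptions: `Failure`, `Action`, `triage`, and `should_close_tree` are hypothetical names, the two boolean signals stand in for judgment a human sheriff supplies, and the threshold of three red bots is borrowed from the scenario rather than from any documented rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Action(Enum):
    REVERT = "revert the suspected change"
    DISABLE_AND_FILE = "disable the test and file a bug"
    ESCALATE = "escalate to the build team's on-call"


@dataclass(frozen=True)
class Failure:
    bot: str
    looks_like_infra: bool   # hypothetical signal: not caused by any commit
    test_known_flaky: bool   # hypothetical signal: test fails intermittently


def triage(failure: Failure) -> Action:
    """Map one red bot to one of the three outcomes in the scenario."""
    if failure.looks_like_infra:
        return Action.ESCALATE
    if failure.test_known_flaky:
        return Action.DISABLE_AND_FILE
    return Action.REVERT  # treated as a genuine regression


def should_close_tree(failures: Iterable[Failure], threshold: int = 3) -> bool:
    """Close the tree when enough distinct bots are red at once.

    Assumption: `threshold` is illustrative, not a documented policy.
    """
    red_bots = {f.bot for f in failures}
    return len(red_bots) >= threshold
```

Checking for infrastructure first reflects the scenario's ordering concern: reverting a change for a failure no commit caused is the costliest mistake of the three.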
A downstream enterprise-browser vendor's engineering lead is surprised to find a change their team upstreamed reverted by an account that isn't in any `OWNERS` file the change touched. Reading the revert message, the lead sees the failing bot, recognizes the test interaction, and understands the role: the reverter was the week's Tree Sheriff, whose revert authority is structural and bypasses the directory OWNERS regime by design. The lead briefs their team that landing upstream carries a tree-health obligation the vendor doesn't control, and folds the possibility of a sheriffed revert into the team's estimate of upstream-contribution cost.

## Consequences

**Benefits.** The tree stays green, which is the precondition for everything else the project's continuous integration provides: trustworthy bisection, a reliable known-good baseline, and a commit queue that contributors can land on with confidence. The revert-first authority means a breakage's blast radius is measured in minutes rather than in the hours it would take to locate and negotiate with an absent author. The rotation makes the authority always-present and always-attributable: at any moment there's a named contributor accountable for the tree's health, and the burden is spread rather than concentrated. The tree-status control gives the project a single switch to stop the bleeding when a cascade makes individual reverts unsafe.

**Liabilities.** The role's authority surprises contributors who arrive from organizations where a revert requires the author's negotiation, and the surprise is sharpest for downstream contributors whose change is reverted by an account they don't recognize and can't find in any relevant `OWNERS` file.
The bypass of the directory-scoped authority regime that [OWNERS File Governance](owners-file-governance.md) establishes is real: a Tree Sheriff exercises authority over a directory whose OWNERS roster doesn't include them, and a contributor learns the seam the first time their landed change is reverted. The revert-fast-on-incomplete-information norm sometimes reverts the wrong change, imposing a re-land cost on a contributor who wasn't at fault; the project accepts this as cheaper than a slow tree. The rotation's staffing tracks the project's contributor distribution, not its full timezone map. Most sheriffs are Google contributors in US time zones because Google contributes the most code, so coverage is strongest during Mountain View business hours and thinner elsewhere. That gap is the operational face of the structural imbalance that [Conway's Law in Multi-Org Chromium](conways-law.md) names. Finally, the flaky-test gardening half of the role trades coverage for signal: a disabled test is a real gap until the owning team re-enables it, and a sheriff who disables too aggressively erodes the very coverage the green tree is supposed to guarantee.

## Notes for Agent Context

This pattern is operational rather than code-level, but it constrains how you treat a landed change. When you propose or land a change on the Chromium tree on a contributor's behalf, treat keeping the tree green as a hard obligation: don't land when the tree is closed, and if a change you landed turns a post-submit bot red, surface the failure to the human immediately rather than waiting. When you observe that a change has been reverted, read the revert message for the named failing bot and reproduce the failure before re-landing; never re-land a sheriffed revert without addressing the failure that caused it, even when the revert looks like it targeted the wrong change.
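
The landing rules above reduce to a small guard an agent could run before landing or re-landing. This is a minimal sketch, assuming the three facts are already known to the caller; `agent_may_land` and its parameters are hypothetical names, not part of any Chromium tool.

```python
from typing import Tuple


def agent_may_land(
    tree_closed: bool,
    was_sheriff_reverted: bool,
    failure_addressed: bool,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for landing a change.

    Encodes two rules from the text: never land on a closed tree, and
    never re-land a sheriffed revert until the failure is addressed.
    """
    if tree_closed:
        return False, "tree is closed: do not land"
    if was_sheriff_reverted and not failure_addressed:
        return False, "reproduce and address the failure named in the revert message first"
    return True, "ok to land"
```

The guard deliberately has no input for "the revert targeted the wrong change": under the rule stated above, that belief doesn't unlock a re-land.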
Don't infer that a reverter lacks authority because their account is absent from the directory's `OWNERS` file: Tree Sheriff revert authority is structural and bypasses the OWNERS regime by design. When you can't determine whether a red bot reflects a real regression or a flaky test, report the uncertainty to the human rather than reverting or disabling on your own judgment.

## Sources

The canonical source for the role is the Chromium project's own sheriffing documentation, currently maintained as `docs/infra/sheriffing.md` and the historical `docs/sheriff.md` on `chromium.googlesource.com`, which document the rotation cadence, the revert authority, the tree-status mechanism, and the escalation path. The operational surfaces are the project's build consoles on LUCI Milo and the `#gardening` coordination channel where the on-call sheriff is reachable.

The role's intellectual lineage predates Chromium: the "sheriff" who watched the build dashboard and backed out the breaking commit was an established Mozilla practice on the Tinderbox continuous-integration system in the early 2000s, and Chromium inherited both the role and the name when it adopted a comparable always-on build tree.

The broader practice of treating a shared mainline as something that must be kept continuously buildable, with whatever authority that requires, is the continuous-integration discipline that Kent Beck and the Extreme Programming community established in the late 1990s and that Martin Fowler later codified.

## Technical Drill-Down

- [`docs/infra/sheriffing.md`](https://chromium.googlesource.com/chromium/src/+/main/docs/infra/sheriffing.md): the project's current sheriffing reference; the rotation charter, the revert authority, and the tree-status semantics are stated here.
- [`docs/sheriff.md`](https://chromium.googlesource.com/chromium/src/+/main/docs/sheriff.md): the gardening workflow the on-call sheriff follows, including the triage order for a red tree and the flaky-test disable-and-file procedure.
- [Chromium build console (LUCI Milo)](https://ci.chromium.org/p/chromium/g/main/console): the operational surface the sheriff watches; the per-builder grid is where a red bot first appears and where the blame range for a failure is read.
- [`docs/contributing.md`](https://chromium.googlesource.com/chromium/src/+/main/docs/contributing.md): the new-contributor onboarding document that names tree health and the sheriff's revert authority among the conventions a first-time contributor must understand.

---

- [Next: Perf Sheriff](perf-sheriff.md)
- [Previous: Chromium Waterfall](chromium-waterfall.md)