Calibration · Job Architecture Toolkit

Calibration builds a shared standard

Calibration is the practice of arguing toward consistency. It exists so the standard lives in the room, not in any one person's head.

The best sessions don't end with a tally. They end with a stronger shared mental model. The point of comparing roles is to make sure that next quarter's leveling questions get the same answer that this quarter's did.

Bring two roles

A pair of reference roles is the smallest useful unit for calibration. The comparison is what makes the level visible.

The single most useful habit I've watched work: every manager who brings a role brings two more roles for context. One clearly below the proposed level, one clearly above. The conversation stops being about defending an answer and starts being about placing the role on a real ladder.

Two roles side by side

HRBP supporting a 90-person unit

P4 Advanced

High confidence.

HRBP supporting a 350-person unit, cross-region

P6 Principal

High confidence, boundary P5/P6.

Dimension-by-dimension comparison of two roles
Dimension	Role A	Role B	Delta
Scope	4	5	Role B higher (+1)
Complexity	4	5	Role B higher (+1)
Autonomy	4	5	Role B higher (+1)
Influence	4	5	Role B higher (+1)
Knowledge	4	5	Role B higher (+1)
Impact	3	5	Role B higher (+2)
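
The Delta column is simple arithmetic, and a minimal sketch makes it easy to reproduce for any pair of roles. The scores below are copied from the table above; the function names and the "widest gap" helper are illustrative assumptions, not part of the toolkit.

```python
# Dimension scores copied from the comparison table: (Role A, Role B).
SCORES = {
    "Scope": (4, 5),
    "Complexity": (4, 5),
    "Autonomy": (4, 5),
    "Influence": (4, 5),
    "Knowledge": (4, 5),
    "Impact": (3, 5),
}


def dimension_deltas(scores):
    """Return {dimension: Role B score minus Role A score}."""
    return {name: b - a for name, (a, b) in scores.items()}


def describe(delta):
    """Render a delta in the same wording as the table's Delta column."""
    if delta == 0:
        return "Equal"
    higher = "Role B" if delta > 0 else "Role A"
    return f"{higher} higher (+{abs(delta)})"


deltas = dimension_deltas(SCORES)
for name, delta in deltas.items():
    print(f"{name}\t{describe(delta)}")

# Hypothetical helper: the dimension with the largest gap is one candidate
# for where to focus the discussion.
widest = max(deltas, key=lambda name: abs(deltas[name]))
```

Here `widest` comes out as Impact, the only dimension where the two roles differ by two points.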

Boundary cases reveal philosophy

Boundary cases show what your company actually values. The argument you record becomes the precedent for the next one.

Borderline roles are the ones where your company's actual values show up. A role that's defensibly P4 or P5 forces a real conversation about how much weight you give influence versus complexity. The decision matters less than the rationale you record.

I've kept write-ups from old projects that record how managers reasoned about boundary cases. Those notes are more useful than the levels themselves, because they tell future-me how this company tends to think.

Which of these is calibration evidence (versus pay pressure or a retention concern)?

"She just turned down a competing offer at a higher level."

"He's been at the level for three years and the team expects he'll get promoted."

"Her role now influences engineering decisions in two adjacent product areas, and she sets the standards for our internal tooling."

"Their pay sits in the bottom third of the range."

Govern exceptions

Exceptions are fine. Undocumented exceptions become precedent and erode the standard.

Exceptions happen. Someone gets hired into a role at a level the architecture doesn't quite support, because the market is hot, or the team needs the hire fast, or there's a strategic moment. Hidden exceptions become precedent. Named exceptions stay manageable.

Exception hygiene
Field	What to capture
Role	Function, family, current level, intended level
Reason	Hot market / strategic hire / interim / individual stretch
Boundary expected	Date or condition by which the gap should resolve
Reviewer	Who owns the next checkpoint
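
If exceptions are tracked in a script or spreadsheet export rather than a document, the four fields above map onto a small record. The sketch below is one possible shape, not a prescribed format: the class name, attribute names, and checks are assumptions, and the example values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Reason categories taken from the table above.
ALLOWED_REASONS = {"hot market", "strategic hire", "interim", "individual stretch"}


@dataclass
class LevelException:
    # Role
    function: str
    family: str
    current_level: str
    intended_level: str
    # Reason
    reason: str
    # Boundary expected: a date, a condition, or both
    resolve_by: Optional[date]
    resolve_condition: Optional[str]
    # Reviewer
    reviewer: str

    def problems(self) -> List[str]:
        """List what keeps this from being a fully named exception."""
        issues = []
        if self.reason not in ALLOWED_REASONS:
            issues.append(f"unknown reason: {self.reason!r}")
        if self.resolve_by is None and not self.resolve_condition:
            issues.append("no boundary: add a date or a condition")
        if not self.reviewer.strip():
            issues.append("no reviewer named")
        return issues

    def is_due(self, today: date) -> bool:
        """True once the boundary date, if any, has been reached."""
        return self.resolve_by is not None and today >= self.resolve_by


# Invented example values, for illustration only.
example = LevelException(
    function="HR", family="HRBP",
    current_level="P4", intended_level="P5",
    reason="hot market",
    resolve_by=date(2025, 6, 30), resolve_condition=None,
    reviewer="HRBP lead",
)
print(example.problems())  # an empty list means every field is captured
```

A record with an empty `problems()` list is a named exception in the sense used here; anything else is on its way to becoming a hidden one.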

Worked walk-through

Walk a real boundary case end to end. Predict, compare, and document the call so the next one has precedent.

Calibration of an HRBP role

The case: an HRBP supporting a 200-person business unit, advising the leadership team, owning talent reviews and engagement.
Manager's ask: P5.
Reference roles brought to the room: a P4 HRBP supporting a 90-person unit, a P5 HRBP supporting a 350-person unit with cross-region scope.
The conversation: scope and influence sit between the two references. Knowledge and complexity look closer to P4.
Decision: P4 with a P4/P5 boundary noted. Re-review in six months if the unit grows past 300.
Documented rationale, not just the level.

This is also one of the calibration scenarios in the practice lab. See §10.