More Interviewers Doesn't Mean Better Decisions. It Usually Means Worse.

Past four well-structured interviews, every extra interviewer adds barely anything to your accuracy and a lot to your noise. Google's own data found four interviews were enough to predict hire quality with 86% confidence, and in Laszlo Bock's account of that analysis in his book Work Rules!, a fifth interviewer added only about one more percentage point. More rounds don't make you more likely to hire the right person. They make you more likely to eliminate them, because committees drift toward "no." A long process isn't rigor. It's slowness wearing rigor's clothes.

That's the uncomfortable part. The seventh interview doesn't catch a flaw the first four missed. It manufactures a reason to doubt a candidate everyone already liked. Companies don't add rounds because the research tells them to. They add rounds because adding rounds feels safe, and because when a hire fails, nobody wants to be the one name attached to the decision.

How many interviews does it actually take to predict a good hire?

Four. Google ran the numbers on this, hard. At one point its process took somewhere between six and nine months to get someone hired, with candidates cycling through round after round. Then Google pulled its internal data and found that four interviews were sufficient to predict with 86% confidence whether it should hire someone. In Bock's telling, the fifth interviewer added only about one more percentage point of accuracy. The sixth, less than that.

So Google capped it at four. Time-to-hire dropped from roughly six months to 45 days, with no evidence the hires got worse. Sit with that. A company with effectively unlimited budget, every reason to be cautious, and access to its own outcome data chose to shrink its process. The constraint was the upgrade.

This is the cleanest real-world test you'll find. Not a lab study, not a consultant's pitch. A firm that could afford 20 rounds looked at what 20 rounds bought and stopped at four.

Doesn't aggregating more opinions cancel out individual bias?

Only if the opinions are independent. They almost never are.

The whole statistical case for many raters rests on one condition: each person forms their view without contamination from the others. Break that condition and you don't average out bias. You amplify the bias everyone shares. Most real processes break it constantly through shared candidate briefs, sequential scheduling where interviewer four hears interviewer one's take in the hallway, and an open debrief room where the loudest voice goes first.

There's a deeper problem with how much a single interview impression is even worth. Daniel Kahneman, Olivier Sibony and Cass Sunstein's reading of the research on interview noise is blunt: a typical unstructured chat interview is right about later job performance only about 50% of the time, barely better than a coin flip, while structured, committee-based interviews that fight bias cascades climb to around 65%. Stacking eight weak signals that all heard each other's results doesn't get you to truth. It gets you to confident consensus, which is a different and more dangerous thing. This is the same reason a structured rubric beats an interviewer's gut: the gut is exactly the channel the contamination travels through.
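A minimal sketch of why independence matters, using the textbook formula for the variance of an average of equally noisy, equally correlated ratings. The function name, the noise level of 1.0, and the shared correlation of 0.5 are illustrative assumptions, not figures from any study cited here.

```python
import math


def panel_noise_sd(n_raters: int, rater_sd: float, shared_corr: float) -> float:
    """Standard deviation of the panel's average rating error.

    Assumes every rater has the same error spread (rater_sd) and every
    pair of raters shares the same error correlation (shared_corr).
    Var(mean) = sd^2 * (corr + (1 - corr) / n)
    """
    variance = rater_sd ** 2 * (shared_corr + (1 - shared_corr) / n_raters)
    return math.sqrt(variance)


if __name__ == "__main__":
    for n in (1, 4, 8):
        independent = panel_noise_sd(n, rater_sd=1.0, shared_corr=0.0)
        contaminated = panel_noise_sd(n, rater_sd=1.0, shared_corr=0.5)  # assumed value
        print(f"{n} raters: independent={independent:.3f}, contaminated={contaminated:.3f}")
```

Under these assumptions, eight fully independent raters cut the noise to about 0.35 of a single rater's. Eight raters who share half their error only get to 0.75, and no number of additional raters gets below roughly 0.71. The shared part never averages out.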

What does adding interviewers actually do to the outcome?

It tilts you toward rejection. Not randomly. Systematically.

This is the part most hiring teams never model. The extra noise from more interviewers doesn't split evenly between false positives and false negatives. Three forces all push the same direction, toward "no":

  • Conformity. In Solomon Asch's classic experiments, about 75% of subjects at least once gave an answer they could plainly see was wrong rather than break with the group. A debrief that opens with "so, what did everyone think?" is structurally identical to Asch's setup: state your view after you've heard the room.
  • Anchoring. Decisions form fast and stick. In a large field study by Frieder, Van Iddekinge and Raymark, roughly 60% of interviewers had made up their mind within the first 15 minutes, less than halfway through the scheduled time. The first strong opinion in the room becomes the anchor everyone else argues against or rounds toward.
  • Veto by skeptic. Under any unanimity norm, one unconvinced interviewer outweighs the seven who were sold. The more chairs you add, the higher the odds that one of them, on one bad-fit conversation, sinks a qualified person.

Add it up and your eight-round loop isn't more accurate. It's more likely to drop the right candidate while everyone walks out feeling thorough.
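The veto arithmetic can be sketched in a few lines. This is a simplification: it assumes each interviewer has the same chance of saying "no" to a qualified candidate and that those chances are independent. The 10% per-interviewer figure is a made-up illustration, not a measured rate.

```python
def veto_probability(n_interviewers: int, p_no: float) -> float:
    """Chance that at least one interviewer says "no" under a unanimity rule.

    Assumes each interviewer independently rejects a qualified candidate
    with probability p_no (an illustrative assumption).
    """
    return 1 - (1 - p_no) ** n_interviewers


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(f"{n} interviewers: {veto_probability(n, p_no=0.10):.1%} chance of a veto")
```

With a 10% chance of a bad-fit conversation per interviewer, a four-person loop rejects a qualified candidate about 34% of the time. An eight-person loop does it about 57% of the time.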

What does a weak process look like next to a strong one?

The structure matters more than the headcount. Here's the contrast.

Weak process: Six sequential interviews. Everyone reads the same candidate brief beforehand, so they walk in primed the same way. Afterward, a "calibration meeting" where the hiring manager, the most senior person in the room, speaks first. Everyone else calibrates to that. The decision gets made by whoever sounds most certain.

Strong process: Four interviewers, each assigned a distinct signal: one digs into the craft, one into collaboration, one into a real work sample, one into judgment under ambiguity. Each scores against a rubric written before anyone met the candidate. Written assessments go in before the debrief room opens. The decision compares scores. Nobody's vibe gets to override the rubric because they said it louder. Interviews keep rewarding confidence over competence until you force the scores onto paper first.

                  | Weak loop                | Strong loop
Interviewers      | 6-8                      | 3-4, signal-diverse
Scoring criteria  | Formed during/after      | Pre-committed rubric
Independence      | Shared brief, sequential | Blind scores before debrief
Debrief order     | Senior voice first       | Scores submitted first
Decision rule     | Consensus / vibe         | Compare scores
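If your team tracks scores in a tool or a script, the "blind scores before debrief" rule can be enforced mechanically. The sketch below is one hypothetical way to do it: the class name, the signal labels, and the 1-4 score scale are assumptions for illustration, not a reference to any real hiring product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BlindDebrief:
    """Holds scores privately until every assigned signal has been scored."""

    signals: tuple[str, ...]
    _scores: dict[str, int] = field(default_factory=dict)

    def submit(self, signal: str, score: int) -> None:
        if signal not in self.signals:
            raise ValueError(f"unknown signal: {signal}")
        if signal in self._scores:
            raise ValueError(f"score already submitted for: {signal}")
        if not 1 <= score <= 4:  # hypothetical 1-4 rubric scale
            raise ValueError("score must be between 1 and 4")
        self._scores[signal] = score

    def reveal(self) -> dict[str, int]:
        """Return all scores, but only once nobody can still be influenced."""
        missing = [s for s in self.signals if s not in self._scores]
        if missing:
            raise RuntimeError(f"debrief locked, waiting on: {missing}")
        return dict(self._scores)


if __name__ == "__main__":
    debrief = BlindDebrief(signals=("craft", "collaboration", "work_sample", "judgment"))
    debrief.submit("craft", 3)
    debrief.submit("collaboration", 4)
    debrief.submit("work_sample", 3)
    debrief.submit("judgment", 2)
    print(debrief.reveal())
```

The design choice that matters is that `reveal` fails until every interviewer has committed a score, so nobody can round toward the first opinion they hear.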

The point of the data on interview structure is exactly this. Structured interviews carry a validity coefficient of 0.42 versus 0.19 for unstructured in a recent meta-analysis, and the older Schmidt and Hunter 1998 landmark put it at 0.51 versus 0.38. On the recent numbers, structure roughly doubles your signal. Headcount does almost nothing. Teams spend their energy on the lever that doesn't move.

What does a bloated process cost you in candidates?

Your best people leave first. That's the cost nobody puts on the scorecard.

Candidates with options, the ones you most want, have the lowest tolerance for theater. 36% of candidates drop out of processes because they feel they're being asked to jump through hoops, and 47% walk over poor communication during one. And the whole industry has been drifting the wrong way: in the US, the average interview process grew from 12.6 days in 2010 to 22.9 days by 2014, nearly doubling in four years.

The "team match limbo" some large tech employers run is what this looks like up close, and it's worth walking through as an illustrative case. An engineer clears the full loop, gets the hire signal, then spends months waiting to be matched to a team, sometimes meeting more managers along the way. Effectively a second hidden loop. Competing offers expire one by one. The final offer lands with no leverage left to push back. A strong signal got converted into lost negotiating power. The process didn't vet a better hire. It bled out a good one.

Slowness selects against the candidates who can afford to walk. You end up over-indexed on the people who waited because they had nowhere else to go.

Is this true for every role?

No, and pretending it is would be its own kind of dishonesty. Name the trade-off plainly.

The case for compression is strongest for individual-contributor and early-career roles, where four well-built signals genuinely cover the ground. It's weaker for C-suite and senior leadership hires, where judgment, leadership, and fit are legitimately harder to read in a single pass and the downside of a wrong call is asymmetric. A few domains (surgery, aviation, nuclear safety) have real reasons for multi-evaluator redundancy, though even there the fix is structured checklists, not more unstructured conversations.

And rigor itself isn't the enemy. Glassdoor's research found that a more difficult interview process correlated with slightly higher later employee satisfaction, with a 10% harder process linked to about 2.6% higher satisfaction. The argument isn't against rigor. It's against using round count as a stand-in for rigor. They're not the same thing, and most teams have quietly swapped one for the other.

There's also a fair version of the multi-rater case: genuinely independent panel scoring, where assessors can't see each other's scores before the debrief, can beat a single interviewer. True. The problem is that almost no real-world panel runs that way. Shared briefs, sequential scheduling, and open rooms violate the independence the math depends on.

What to do now

If you're designing the process: cap it at three or four interviewers. Assign each a distinct signal so you're not buying the same opinion four times. Write the scorecard before anyone meets the candidate. Make people submit written assessments before the debrief opens, and decide by comparing scores, not by going around the room. If you must run a panel, kill the "what did everyone think" opener. It's an Asch experiment with a job on the line.

If you're the candidate stuck inside a seven-round gauntlet you didn't design: you can't shrink it, but you can manage it. Ask early and directly how many rounds there are and the timeline. A team that can't answer is telling you something. Keep your other processes warm rather than pausing them out of politeness, because a slow process is not a committed one until there's an offer in writing, and you hold far more leverage negotiating from a job you already have than from a single drawn-out loop you're waiting on. And don't read extra rounds as your own failure. The length is usually about the company's anxiety, not your candidacy. That's agency: you hold the circumstance you didn't choose, and you still make the calls that are yours to make.

Staring down a five-round loop and not sure how to read it or hold your leverage? Talk it through with Praxy on WhatsApp. I'll help you figure out what each round actually means and how to keep your footing without burning the bridge.