Interviewing: The Signal to Noise Ratio

Modelling a recruiting process via signals

The Task

The task is to find candidates for a job opening. There is a deadline, and a set of requirements the candidate must meet. Some of these requirements are mandatory, while others are ideal qualities (skills that could be taught, or that are useful but non-essential to the job).

The simplest way to do this would be to accept applications, read through them, and then determine who is the best fit. Maybe add some interviews.

Of course, this isn't as easy as it sounds. There may be a huge number of applications (10,000+ or even 100,000+), some of which don't even relate to the opening. Some candidates may have messed up their resume formatting but are skilled, while others might have resumes that make them sound perfect but in reality lack the skills needed.

Crucially, there is the problem of determining whether candidates actually have the desired skillset. Staffing interviews means pulling professionals away from their jobs to interview candidates and discuss their performance, not to mention the question of whether a particular interview is even a good indicator of on-the-job performance.

The Market for Lemons

The following is my own framework for categorizing this landscape.

Assume all actors are rational: everyone takes the actions that maximize the probability of achieving their goal.

Each candidate has some underlying competency: how well they can perform tasks at a given job. In the real world it is impossible to assign a single value that accounts for all scenarios, but for this purpose, say that this value is $s$.

A given candidate may be good or bad, but every candidate will state that their value is as high as possible. As a result, one cannot trust candidates when they self-report their competency.

This is a market for lemons due to the information asymmetry: since a candidate's true $s$ cannot be verified, every claim gets discounted toward the average, which only strengthens the incentive to inflate. If half of all candidates have $s = 10$ and half have $s = 2$, and everyone claims 10, the best an employer can do is value each claim at the mean, 6.
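A minimal sketch of that collapse, assuming a hypothetical skill distribution (all numbers here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool: true skill varies, but every rational
# candidate reports the maximum possible value.
true_skill = rng.normal(loc=5.0, scale=2.0, size=10_000)
reported = np.full_like(true_skill, true_skill.max())

# The report carries zero information, so the employer's best
# estimate for every candidate collapses to the prior mean.
best_estimate = true_skill.mean()

print(f"every claim discounted to: {best_estimate:.2f}")
print(f"true spread being ignored: {true_skill.std():.2f}")
```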

To solve this issue, look at resumes and conduct interviews.

Verifying Skills

Now, factor in a candidate's resume and how well they do in interviews.

Each resume and performance in an interview is a signal of how competent a candidate is.

Both the resume screen and a given interview performance are indications of the underlying $s$ for a candidate, but neither is a precise measurement.

Instead, each review can be thought of as a function.

The resume function $R(c)$ takes in a given resume and outputs a prediction for the $s$ associated with that resume.

The interview function is a bit more complicated. A resume is a fixed document, so it is possible (though unlikely) that the same resume reviewed by the same person receives the same score every time. In an interview, however, even with the same interviewer, candidate, and interview question, the performance can vary. This can be modelled as $I(c)$, which outputs a prediction $\hat{s}$ for the $s$ associated with that candidate.
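As a sketch, both functions can be treated as the true $s$ plus noise; the noise levels here are assumptions, and the interview draws fresh noise on every call:

```python
import numpy as np

rng = np.random.default_rng(0)

def R(s: float, sigma: float = 2.0) -> float:
    """Resume screen: one noisy read of the candidate's true skill."""
    return s + rng.normal(0.0, sigma)

def I(s: float, sigma: float = 1.5) -> float:
    """Interview: also noisy, and varies on every run even for
    the same candidate and the same question."""
    return s + rng.normal(0.0, sigma)

s_true = 7.0  # hidden from the employer
print(R(s_true))             # one resume signal
print(I(s_true), I(s_true))  # two runs of the "same" interview differ
```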

It would be more accurate to represent this prediction as a confidence interval at a fairly high level (such as 95%). For simplicity, I'll use the middle of the confidence interval (but keep in mind that as the number of signals acquired increases, the confidence interval shrinks, resulting in a more "accurate" prediction).
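For $n$ independent signals with per-signal noise $\sigma$, the 95% interval around the averaged estimate shrinks like $1/\sqrt{n}$; a quick sketch, with an assumed $\sigma$:

```python
import numpy as np

sigma = 1.5  # assumed per-signal noise
for n in [1, 2, 4, 9]:
    half_width = 1.96 * sigma / np.sqrt(n)  # 95% CI half-width on the mean
    print(f"{n} signal(s) -> s_hat ± {half_width:.2f}")
```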

The outputs are all signals to consider. The issue now becomes the sheer number of signals and the cost of evaluating them.

Signals

Candidates go through a pipeline before receiving an offer. This often starts with the resume review, followed by a series of interviews (often segmented into rounds of increasing length), and ends with an offer.

Add a new function, deliberation $D(p, c)$, which takes in a previous estimate $p$ and a new measurement $c$ and combines the two. This represents the debrief process after an interview as an update, similar to Bayesian inference: computing a new belief given a new measurement.

For a pipeline consisting of a resume review and 2 interviews ($A$ for the first round, $B$ for the second), this yields:

$$\hat{s} = D(D(R(c), A(c_A)), B(c_B))$$
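One common way to make $D$ concrete is precision-weighted Gaussian fusion, which is exactly the Bayesian update when the prior and the measurement are both normal. This is a sketch, not the only choice, and the variances below are assumptions:

```python
def D(prior, measurement):
    """Combine two (mean, variance) estimates by precision weighting."""
    (m1, v1), (m2, v2) = prior, measurement
    v = 1.0 / (1.0 / v1 + 1.0 / v2)   # variance always shrinks
    m = v * (m1 / v1 + m2 / v2)       # weighted toward the tighter estimate
    return (m, v)

# Hypothetical pipeline for one candidate c.
R_c = (6.0, 4.0)   # R(c): resume screen, noisiest
A_c = (7.5, 2.0)   # A(c_A): first-round interview
B_c = (7.0, 1.0)   # B(c_B): second-round interview, longest and tightest

s_hat = D(D(R_c, A_c), B_c)
print(f"s_hat = {s_hat[0]:.2f}, variance = {s_hat[1]:.2f}")
```

The variance after each $D$ is strictly smaller than either input's, which is the "more signals shrink the interval" claim in function form.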

At each point, more signals are acquired about a given candidate. Even though any one particular signal may be inaccurate, acquiring more signals lowers the variance of the combined estimate, indicating a higher degree of confidence in the predicted $\hat{s}$ for a candidate.

There is a tradeoff between the confidence in a given $\hat{s}$ and the cost it took to gain that confidence. Each interview costs money; interviewers are being pulled away from their regular work.

If the precise costs are known, then a linear program can be used to solve for the ideal thresholds candidates must clear to progress into the next round.
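As a toy instance of this idea (the costs, pass rates, and hiring quota below are all made up, and real threshold-setting has more moving parts), a linear program can choose how many candidates to advance into each round at minimum cost:

```python
from scipy.optimize import linprog

N = 1000                   # applicants past the resume screen (assumed)
H = 5                      # hires needed
c_A, c_B = 100.0, 300.0    # assumed cost per round-A / round-B interview
p_A, p_B = 0.30, 0.50      # assumed pass rate for A / offer-accept rate after B

# Variables x = [n_A, n_B]: candidates interviewed in rounds A and B.
# Minimize total interviewing cost.
cost = [c_A, c_B]
# Constraints, written as A_ub @ x <= b_ub:
#   n_B <= p_A * n_A    (can only advance those who cleared round A)
#   p_B * n_B >= H      (meet the hiring quota)
A_ub = [[-p_A, 1.0],
        [0.0, -p_B]]
b_ub = [0.0, -H]

res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, N), (0, N)], method="highs")
n_A, n_B = res.x
print(f"interview {n_A:.0f} in round A, {n_B:.0f} in round B; "
      f"cost ${res.fun:,.0f}")
```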

Utility

I use this framework to help make sense of the process. For example:

  • Candidates with stronger signals are the ones that progress forward, with the goal of acquiring more signals about them.
  • Referrals fit into this framework. A referral is a signal with a higher degree of being right, since it comes from an internal employee vouching for the candidate. That confidence is worth money (interviews cost money, and so does hiring an underperforming employee), hence the referral bonus as an incentive; see the sketch after this list.
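Under the same precision-weighted update as the pipeline sketch above, a referral is simply a low-variance signal; the numbers here are again assumptions:

```python
def D(prior, measurement):
    """Precision-weighted fusion of two (mean, variance) estimates."""
    (m1, v1), (m2, v2) = prior, measurement
    v = 1.0 / (1.0 / v1 + 1.0 / v2)
    return (v * (m1 / v1 + m2 / v2), v)

prior = (5.0, 4.0)        # belief after the resume screen
cold_claim = (8.0, 4.0)   # an unverified signal: just as noisy as the prior
referral = (8.0, 1.0)     # same claim, vouched for by an internal employee

print(D(prior, cold_claim))  # (6.5, 2.0): moves halfway, still uncertain
print(D(prior, referral))    # (7.4, 0.8): moves further and tightens more
```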
