Why we don’t score, rank, or predict candidates.

It would be technically easy to add a score to every Flipbase video moment. We have deliberately not built it. Here is the reasoning.

Flipbase team · 6 February 2026

Every conversation with a new customer eventually touches the same question. Could we add a score to the video moments? An AI rating, a quick rank, a flag for the recruiter that says this one is worth watching first.

The answer is no. The reason is not technical, because the technical part is straightforward. The reason is product, and the rest of this is the long version of why.

What a score actually does.

When you put a score next to a piece of evidence, the score becomes the evidence. The recruiter who is meant to watch the video moment glances at the 87 percent next to it and forms an opinion before they press play. The opinion is calibrated to the score, not to what they see in the recording.

This effect, often called automation bias or anchoring, is well documented in domains where humans review evidence with a score attached: doctors with AI-assisted radiology, judges with risk scores, recruiters with AI shortlisting. The score pulls the human's attention toward agreement, even when the human would have disagreed on the underlying evidence in isolation.

If we put a score on a Flipbase video moment, the moment stops being the thing the recruiter watches. It becomes the thing the recruiter checks against a score. The product changes shape, from a context tool to a decision-support tool, and decision-support tools carry different ethics and a different regulatory posture, neither of which we want to take on.

What a rank does.

A ranked list is a stronger version of the same problem. The recruiter who opens a list of ten candidates ranked by an AI-generated score reads them in the order the algorithm provided. The first candidate gets the most attention, the last candidate gets the least. The recruiter's evaluation of any given candidate is partially shaped by where the algorithm placed them in the list, because attention is a finite resource and order matters.

The honest version of this is that ranking is a recommendation in disguise. The system says: I think you should look at this one first. The recruiter agrees most of the time, because following the recommendation is faster than questioning it. The cumulative effect is that the algorithm shapes who gets hired, and the recruiter signs off.

We are not in the business of shaping who gets hired. We are in the business of giving the recruiter one more piece of context so they can shape it themselves.

What a prediction does.

Predictions are the same problem with a different shape. A prediction tells the recruiter the system's best guess at the candidate's outcome. Will they pass the screening call? Will they accept the offer? Will they be a strong performer in the role?

Predictions are technically a forecast and emotionally a verdict. The recruiter who sees a prediction of low likelihood to pass the next stage will think twice about advancing the candidate, even when their own read of the video says otherwise. The model becomes the gatekeeper, the recruiter becomes the executor.

This is the part of the AI-in-hiring conversation that regulators are paying the most attention to, and rightly so. The EU AI Act classifies AI systems used to filter applications or evaluate candidates as high-risk, which brings documentation and human-oversight obligations. The logic is the same as ours: predictions about candidate outcomes change the shape of the decision, even when they are framed as advisory.

Why this is hard to resist.

The reason every vendor in the space eventually adds a score or a rank is that the buyer asks for it. The first procurement meeting after a video screening tool is rolled out usually contains some version of: this is nice, but can the system help us figure out which ones to watch first?

The answer that pleases the buyer is yes, here is a score. The answer that serves the buyer's underlying intent (faster, better decisions) is no, watch them all, they are 60 seconds each. The second answer is harder to sell in the moment and produces better outcomes over time. We have chosen the harder sell.

What we build instead.

Every product decision in Flipbase is in service of the recruiter still being the one who decides. The 60-second cap exists so the videos can actually be watched at scale, so the recruiter is not asking us to filter for them. The structured question exists so candidates' answers are comparable, so the recruiter has a fair basis for comparison. The native ATS embed exists so the watching happens where the rest of the decision already happens.

None of those decisions reduces the recruiter's work as much as a score would. All of them increase the quality of the decision the recruiter makes. The trade-off is deliberate: we give up some time saved in exchange for a better decision.

What this means for our customers.

If a customer wants AI scoring on candidates, we tell them, openly, that we are not the right product. We say it on the website, we say it in demos, we say it in procurement discussions. We do not try to sell them on the idea that not scoring is actually a feature. Either they want a tool that scores, in which case there are other vendors who do that better and more honestly than we would, or they want a tool that gives recruiters context, in which case we are the right fit.

The vendors that try to do both end up doing both poorly. The vendors that pick a side end up doing one well.

Flipbase is the side that does not score. That is a constraint we choose, not a missing feature. If your team needs the other shape, we will say so plainly.