There’s a moment I’ve seen repeat itself quite often when working with organizations that are trying to get serious about artificial intelligence. The pilot went well. The results are there. The team is convinced. And yet the project doesn’t scale. It enters a gray zone made up of meetings, reassessments, new committees. Months later, it’s still there.
They call it pilot purgatory. And the most common explanation you hear is that what’s missing is an operating model, governance, a broader strategy. All true. But there’s something this explanation doesn’t fully capture.
The real bottleneck, in most of the cases I’ve observed, is quieter. It has to do with who, in the end, actually signs off on the project.
Scaling an AI system into production means that someone has to take responsibility for that system functioning properly. That data is used appropriately. That the decisions it produces are defensible—to the board, to regulators, to end users. This is not an abstract responsibility. It’s a signature, in the most concrete sense of the term.
And that signature, in most organizations, doesn’t know where to go.
Not because of a lack of willingness, but because the conditions for exercising it knowingly are missing. It’s not clear what needs to be true for a system to be considered ready. It’s not defined which risks are acceptable and which are not. In many companies, there still isn’t a shared language between those who build systems, those who use them, and those who are accountable for their consequences.
Meanwhile, the regulatory context is moving. The EU AI Act entered into force in August 2024, with its obligations phasing in over the following years. GDPR applies to AI systems as well, whenever they process personal data. Regulators’ expectations change fast enough to make decisions taken just a few months earlier partially obsolete. In this scenario, the caution that keeps pilots from moving forward is not irrational. It’s a reasonable response to real uncertainty.
What’s missing is not another framework. It’s the ability to turn that uncertainty into something manageable: identifying the specific risks of that system, in that context, for that organization. Translating the regulatory landscape into concrete decisions. Defining where system autonomy is acceptable and where human oversight is always required. Building a record that demonstrates, if needed, that due diligence has been exercised.
That record has a name, even if it sometimes feels like too big a word for business contexts: ethics. Not in the sense of a principle stated in a values document, but in the practical sense of the term. Ethics as the ability to clearly explain why a system behaves in a certain way, where its limits are, who decided those limits were acceptable, and why. The ability to explain a system is not a communication exercise; it is the most honest test of whether the system holds up. Defining all this means drawing something concrete: a line, a threshold. If that threshold doesn’t hold, it’s a signal that something needs to be stopped or investigated further, not hidden.
This is work that requires skills that rarely live in the same place. Legal expertise, to navigate a regulatory framework that is still being defined. And research expertise, in the most precise sense of the term: because many of the relevant questions—what is truly risky and what only appears so, what seems harmless but isn’t, how to identify and measure those boundaries—still don’t have established answers. They are open questions that require method as much as experience. The value lies not only in knowing where the limits are, but in knowing how to look for them, define them, and translate them into something useful for those who have to decide.
Seen from this angle, pilot purgatory is less a symptom of organizational immaturity and more a signal that innovation has outpaced the ability to govern it. It’s not a fault. It’s a condition that can be addressed – provided it is recognized for what it is.