TL;DR

AI isn’t the problem – everything around AI is. At the Levi9 webinar held on April 2nd, Principal Architect Dario Djurica and Lead Software Engineer Nemanja Pavlovic broke down why so many AI projects don’t survive the journey from POC to production – and what it actually takes to make AI work in the real world. The key takeaway: the models are good enough. The ecosystem around them isn’t.

Why the Demo Works and Production Doesn't

There’s a massive gap between what AI can do and what’s actually running in enterprise systems today. Anthropic measured it: the curve of AI capability and the curve of real-world adoption barely intersect. Companies are investing – $11 billion last year, nearly $40 billion this year – but results lag behind: 40% of all agentic AI projects get cancelled, and 80% of practitioners say security is one of the biggest problems they face when introducing AI into their organizations.

The paradox runs deeper: 82% of companies have some form of AI in use, but only 30% actually govern it. The rest simply exists – somewhere between sales and marketing, unmonitored, untraceable, uncontrolled.

A Story from the Field: From 50 Documents to 50,000

One of the projects Dario and Nemanja ran started as a classic POC – an internal knowledge base agent, 50 neatly prepared documents, a Q&A system that worked flawlessly. The client was thrilled. Green light for production.

Then they got access to the real SharePoint. Set up in 2011. No folders, no structure, no system. Fifty thousand documents – PDFs, Word files, emails, plain text files, multiple versions of the same document scattered across departments, and a mountain of data someone had typed in and forgotten. Initial estimate: six months just to sort out the data layer. They eventually cut that down to about a month and a half – but only by leaning on the Microsoft AIQ platform, which absorbed much of the work around security controls and multi-format document parsing.

The lesson: every POC lives on clean data. Production lives on messy data.

Six Pillars of Production Readiness

Through work across multiple projects, a framework emerged that Levi9 uses to assess how ready an AI solution actually is for production. This isn’t about organizational challenges – that’s a separate conversation – but about the technical side.

Data Foundation

Data is the agent’s brain. Everything else falls apart without four things: clear structure, access control (who sees what), freshness mechanisms (which version of a document is current), and quality gates that filter what the agent actually serves to the user.
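To make those three filters concrete, here is a minimal sketch in Python. The `Doc` fields, the `quality` score, and the 0.6 threshold are assumptions made up for illustration, not something shown in the webinar or taken from a specific platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str                # logical document, shared by all of its versions
    version: int
    allowed_groups: frozenset  # groups that may read this document
    quality: float             # 0.0 to 1.0, hypothetical score from an ingestion check
    text: str


def select_servable(docs, user_groups, min_quality=0.6):
    """Return the documents an agent may serve to this user."""
    latest = {}
    for d in docs:
        # Freshness: keep only the highest version of each logical document.
        if d.doc_id not in latest or d.version > latest[d.doc_id].version:
            latest[d.doc_id] = d
    return [
        d for d in latest.values()
        if d.allowed_groups & set(user_groups)  # access control
        and d.quality >= min_quality            # quality gate
    ]
```

Freshness is applied before access control on purpose in this sketch: if a user cannot see the current version, they get nothing rather than a stale copy. Whether that is the right trade-off depends on the use case.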

Agent Design and Guardrails

AI shouldn’t be bolted onto an existing business process as a patch. The entire process needs to be redesigned with AI as a first-class citizen, not an add-on. That design must include output validation, tool call verification, clearly defined scope boundaries, and – critically – human-in-the-loop checkpoints wherever the decision warrants it.
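As a rough illustration of what tool call verification, scope boundaries, and a human-in-the-loop checkpoint can look like in code, consider the sketch below. The tool names, the approval policy, and the validation rules are hypothetical; a real system would load them from configuration and apply far richer checks.

```python
from dataclasses import dataclass

# Hypothetical policy: which tools this agent may call, and which need sign-off.
ALLOWED_TOOLS = {"search_kb", "summarize", "send_email"}
NEEDS_HUMAN_APPROVAL = {"send_email"}


@dataclass
class ToolCall:
    name: str
    args: dict


def review_tool_call(call: ToolCall) -> str:
    """Return 'reject', 'escalate', or 'allow' for a proposed tool call."""
    if call.name not in ALLOWED_TOOLS:
        return "reject"      # outside the agent's scope boundary
    if call.name in NEEDS_HUMAN_APPROVAL:
        return "escalate"    # human-in-the-loop checkpoint
    return "allow"


def validate_answer(answer: str, source_ids: list, max_chars: int = 2000) -> bool:
    """Output validation: non-empty, bounded, and backed by at least one source."""
    return bool(answer.strip()) and len(answer) <= max_chars and len(source_ids) > 0
```

The point of the design is that these checks run in ordinary code between the model and the outside world, so they hold regardless of what the model decides to output.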

Observability and Operations

Logs that say “200 OK” are not enough in the world of agentic systems. You need to trace the full reasoning chain – which prompt, which model, which reasoning level. In one project, a model ran perfectly in production for days, then dramatically slowed down because the same vendor released a new version and the entire internet rushed to test it. Without a real-time dashboard, you wouldn’t know until the client calls.
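A minimal sketch of that kind of tracing, assuming a hand-rolled `Tracer` rather than any particular observability product: each step records which prompt version, model, and reasoning level were used and how long it took, and a simple median comparison flags the kind of slowdown described above. All names and the 2x threshold are illustrative assumptions.

```python
import time
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Span:
    trace_id: str
    step: str
    model: str
    reasoning_level: str
    prompt_version: str
    latency_s: float


@dataclass
class Tracer:
    spans: list = field(default_factory=list)

    def record(self, trace_id, step, model, reasoning_level, prompt_version, fn):
        """Run fn(), time it, and store one span describing the step."""
        start = time.perf_counter()
        result = fn()
        elapsed = time.perf_counter() - start
        self.spans.append(
            Span(trace_id, step, model, reasoning_level, prompt_version, elapsed)
        )
        return result


def latency_regressed(baseline, recent, factor=2.0):
    """True if the recent median latency exceeds factor times the baseline median."""
    return median(recent) > factor * median(baseline)
```

In practice the spans would be shipped to a tracing backend and the regression check would feed an alert, so the team hears about the slowdown before the client does.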

Security

Security is an afterthought on too many projects. A classic service account for an agent isn’t good enough, because you can’t trace what the agent is actually doing across systems. Microsoft and AWS have introduced dedicated agent identities – the same principle as user identities, with full action traceability. Relying solely on the system prompt (“don’t return this document”) works on a demo dataset. It doesn’t hold when users start probing boundaries.
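The difference between asking the model not to return a document and making it impossible can be sketched as follows. `AgentIdentity`, the scope names, and `DocumentStore` are hypothetical stand-ins for whatever identity and storage services a platform provides; this is not the Microsoft or AWS API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    scopes: frozenset   # permissions granted to this specific agent


@dataclass
class DocumentStore:
    docs: dict                                     # doc_id -> (required_scope, text)
    access_log: list = field(default_factory=list)

    def read(self, agent: AgentIdentity, doc_id: str) -> str:
        required_scope, text = self.docs[doc_id]
        allowed = required_scope in agent.scopes
        # Every attempt is attributed to the agent, whether it succeeds or not.
        self.access_log.append((agent.agent_id, doc_id, allowed))
        if not allowed:
            raise PermissionError(f"{agent.agent_id} lacks scope {required_scope!r}")
        return text
```

Because the check lives in the data layer, a cleverly worded prompt cannot talk its way past it, and the access log shows exactly which agent tried to read what.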

Cost Management

Token attribution per agent, per team, per task type – all of it needs to be tracked. An agent can get stuck in an infinite loop and burn resources for hours without anyone noticing. Budget alerts, smart routing (complex tasks go to more capable models, simple ones to lighter ones), and loop detection aren’t nice-to-haves. They’re infrastructure.
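Here is a small sketch of those four mechanisms side by side. The model names, the complexity threshold, and the repeat limit are placeholder assumptions; real routing and loop detection would be tuned per workload.

```python
from collections import Counter, defaultdict


class CostTracker:
    def __init__(self, budget_tokens):
        self.budget_tokens = budget_tokens
        self.usage = defaultdict(int)   # (agent, team, task_type) -> tokens

    def add(self, agent, team, task_type, tokens):
        """Attribute token usage to an agent, a team, and a task type."""
        self.usage[(agent, team, task_type)] += tokens

    def tokens_for_team(self, team):
        return sum(t for (_, tm, _), t in self.usage.items() if tm == team)

    def over_budget(self):
        """Budget alert condition: total usage exceeds the configured budget."""
        return sum(self.usage.values()) > self.budget_tokens


def route_model(task_complexity, threshold=0.7):
    """Smart routing: complex tasks go to the larger (hypothetical) model."""
    return "large-model" if task_complexity >= threshold else "small-model"


def looks_like_loop(tool_calls, max_repeats=3):
    """Flag a run in which the same (tool, args) call repeats too often."""
    counts = Counter(tool_calls)
    return any(n > max_repeats for n in counts.values())
```

For example, `looks_like_loop([("search", "q=a")] * 4)` returns `True`, which is the signal to stop the run instead of letting it burn tokens for hours.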

Governance and Compliance

The EU AI Act is here. Companies waiting to implement it “when they have to” are already behind. Audit trails are mandatory: every prompt, every reasoning pattern, every attempt to access a document must be stored and accessible. Regulated industries – finance, healthcare – already know this. Everyone else will learn.
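One way to keep such a trail trustworthy is to make it append-only and tamper-evident. The sketch below chains each entry to the previous one with a SHA-256 hash. Hash chaining is an assumption chosen for illustration, not something the EU AI Act prescribes, and the event fields are made up.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


class AuditTrail:
    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> None:
        """Store an event together with a hash that covers the previous entry."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append(
            {"event": event, "prev": prev_hash, "hash": self._digest(prev_hash, event)}
        )

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["event"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

An auditor can then call `verify()` to confirm that the stored prompts and document access attempts have not been altered after the fact.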

Build, Buy, or Platform?

There are three approaches companies choose between.

Build everything from scratch? Developers love it, but it doesn’t scale. Every team builds something different, tools don’t talk to each other, and technical debt compounds fast.

Buy individual tools? This can work for specific problems – NVIDIA NeMo Guardrails for protection, specialized validation tooling – but integrating them cleanly in an enterprise context rarely goes smoothly.

Platform-first? That’s what Levi9 recommends. Platforms from Microsoft, AWS, and Nvidia cover all six pillars described above – not always perfectly, since the platforms themselves are still maturing, but as a comprehensive starting point they’re far ahead of the alternatives. Forrester backs this up: three out of four AI projects built entirely from scratch, without a platform foundation, fail.

What's Next

The next Levi9 webinar goes deep into those platforms: what works, what doesn’t, and how teams already using them are navigating real-world projects. Because finding your way through the Wild West of AI tooling still requires someone who’s already crossed that terrain.

Written by:

Nemanja Pavlovic,
Lead Software Engineer
Levi9

Dario Djurica,
Principal Architect
Levi9

Published:

15 April 2026


Governing AI in production: AI Agents that survive beyond POC and live in the real world


