Agentic AI

In discussions of the issues of contemporary generative AI, a new "buzzword" is being touted as a panacea: Agentic AI, supposedly because it is not generative. That assumption is as wrong as many others about AI. While Agentic AI does help, and is even a breakthrough, it is important to understand what it is in order to avoid mistakes born of misperception.

If you haven't read my article about the Reality of Artificial Intelligence, please do so, as I expect you to know what AI is today.

Agentic AI: The Hedging of Generative AI

Agentic AI is a smart attempt to overcome problems of generative AI such as hallucination or drift. But Agentic AI is no AI; it is smart IA (Intelligent Algorithms) used to "hedge", to control, the generative AI. While I don't know all flavors, I have been introduced to two examples that experts I trust tell me are good representatives of the leading "Agentic AI" flavors.

So what it is, in fact, is the IA I mentioned in the original whitepaper: the orchestration layer. Nothing more, nothing less. Agentic AI is mainly a toolbox that standardizes access to the AI. But the AI underneath is still a generative AI, and that will remain true until a new generation of LLMs is developed that are not smart chatbots trying to calculate what you want to hear.

Example 1: Pydantic AI

Pydantic itself is a set of Python tools, Python being a scripting language (simplified) similar to PHP, the stuff that powers webpages.

Pydantic is the most widely used data validation and settings management library for Python, leveraging modern Python type hints. It ensures data structures (from APIs, databases, or user input) are accurate and correctly typed, automatically converting data to expected types and raising errors if validation fails. [Source]

Pydantic AI is a Python framework designed to build production-grade, type-safe AI agent applications, developed by the creators of Pydantic. It leverages Pydantic for robust data validation, offering seamless integration with LLMs to create structured, reliable, and observable AI systems. [Source]

So Pydantic AI is a Python framework specialized in hedging LLMs with Python scripts: a corset that makes sure the generative AI sticks to parameters keeping it within a defined framework. That framework still has to be defined by humans, specifying what they expect from the "real AI", the LLM-based generative AI running in the background. The LLM remains a "probability machine", guided by algorithms and a token-based memory to create good answers.
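
To illustrate that corset: below is a minimal sketch, assuming the pydantic-ai Agent API as documented at the time of writing (the output parameter has been renamed between versions). The FlightQuote model and the prompt are purely hypothetical illustrations:

```python
# Minimal sketch of the "corset": the LLM's answer must fit this model.
# Assumptions: the documented pydantic-ai Agent API (the result/output
# parameter has been renamed between versions); FlightQuote and the
# prompt are hypothetical illustrations, not a real booking flow.
from pydantic import BaseModel, Field
from pydantic_ai import Agent

class FlightQuote(BaseModel):
    """The structure every answer MUST conform to."""
    carrier: str = Field(description="IATA airline code, e.g. 'LH'")
    price_eur: float = Field(ge=0, description="Total price in EUR")
    refundable: bool

# The agent wraps the generative AI; Pydantic validates every answer
# and raises (or retries) if the output does not fit the corset.
agent = Agent("openai:gpt-4o", output_type=FlightQuote)

result = agent.run_sync("Quote the cheapest refundable fare FRA-JFK.")
print(result.output)  # a validated FlightQuote instance
```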

Example 2: Crew AI

An interesting, different approach is Crew AI. Unlike Pydantic AI, it is a tool (still algorithms) that creates "teams" of AIs and gives them tasks. To work properly, it needs good, up-to-date RAG knowledge. Scientists can thus create a virtual "team" with a special focus on their tasks: one agent knows a programming language, another the rulesets, yet another the research on the topic at hand. Like a team of humans, they work together in their specialties. That naturally (if programmed correctly) reduces the chance of hallucinations, e.g. by using Pydantic to cross-reference that the results fit the underlying data.
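
As a minimal sketch of such a virtual team, assuming the commonly documented CrewAI building blocks (Agent, Task, Crew); the roles, goals, and task texts are hypothetical:

```python
# Minimal sketch of a virtual "team", assuming the commonly documented
# CrewAI building blocks; roles, goals, and task texts are hypothetical.
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Collect the relevant facts on the topic, citing sources",
    backstory="Knows the research literature on the topic at hand.",
)
reviewer = Agent(
    role="Reviewer",
    goal="Cross-check every claim against the underlying source data",
    backstory="Knows the rulesets and challenges unsupported claims.",
)

research = Task(
    description="Summarize the current state of the topic.",
    expected_output="A sourced summary of at most one page.",
    agent=researcher,
)
review = Task(
    description="Verify each claim in the summary against its sources.",
    expected_output="The summary with unverifiable claims flagged.",
    agent=reviewer,
)

# The crew runs the tasks in sequence; the agents hedge each other.
crew = Crew(agents=[researcher, reviewer], tasks=[research, review])
print(crew.kickoff())
```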

While that again reduces hallucinations, it cannot eliminate them either. Even though a low temperature (less creativity in the response) is usually recommended, the fact remains that it is generative AI that creates the answers. But instead of responding to a user directly, an agent can communicate with other "team members", thus "developing" more complex and knowledgeable replies than a single LLM-based AI could. And what makes this approach important is that, like in a human team, the AIs can communicate, discuss, improve, and hedge each other.
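
For reference, the temperature is just a parameter of the model call. A minimal sketch using the standard OpenAI Python client; the model name and prompt are placeholders:

```python
# Low temperature = less sampling "creativity". It reduces randomness,
# but does NOT prevent hallucination. Standard OpenAI Python client;
# model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.0,
    messages=[{"role": "user", "content": "List Berlin's airport IATA codes."}],
)
print(response.choices[0].message.content)
```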

So What Is Agentic AI?

"If you digitalize a shitty process, you have a shitty digital process" [Thorsten Dirks], a.k.a. garbage in, garbage out.

Agentic AI is thus nothing but Intelligent Algorithms. But yes, they can and do improve AI; they in turn depend on the quality of how they are prepared. By themselves, Agentic AIs are "toolboxes". What you do with them defines the outcome and can improve the results, but keep in mind that the core is and remains the LLM, enhanced by tokens, RAG, LoRA, and the rules that you set.

But in the end, Artificial Intelligence still amplifies. If the input is good, the output can be better. If the input is garbage, the output is artificially intelligent garbage.

Outlook: A New AI Generation

What I find particularly interesting is not a new "Scientific AI" focused on "facts" instead of LLMs; probabilities are not bad by design. Developments like the ones above improve AI's usefulness. Processes that give the AI the ability to rethink, to reconsider, are an interesting step; so is the ability to decide "not to answer". They can be used to make the AI question itself and reconsider. You may already see such developments in the online AI tools (ChatGPT, Gemini, Claude, etc.): they provide sources when they research the web. They cannot really do that from what is in their LLM knowledge base.
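
Below is a minimal sketch of such a "rethink before answering" loop; the ask_llm function is a hypothetical stand-in for whichever model call you use:

```python
# Sketch of a "rethink before answering" loop. `ask_llm` is a
# hypothetical stand-in: wire it up to the model call of your choice.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a real model call")

def answer_with_reconsideration(question: str, rounds: int = 2) -> str:
    answer = ask_llm(question)
    for _ in range(rounds):
        critique = ask_llm(
            f"Question: {question}\nDraft answer: {answer}\n"
            "List factual weaknesses, or reply exactly OK if there are none."
        )
        if critique.strip().upper() == "OK":
            return answer
        # Give the model the chance to rethink, or to abstain.
        answer = ask_llm(
            f"Question: {question}\nDraft: {answer}\nCritique: {critique}\n"
            "Revise the answer. If the facts are insufficient, reply "
            "exactly: I cannot answer this reliably."
        )
    return answer
```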

But it is simply important to understand what AI is not; perception matters. AI cannot be perfect in an imperfect world with incomplete, conflicting, and time-varying information. The right target is not perfection. Even our best scientists make errors: Einstein did the groundwork on black holes but did not himself believe they exist.

The Forced Response Error

The other issue is that today we expect AI to come up with an answer in seconds. For a chatbot, that makes sense; in business, it forces simplification and wrong answers.

Traditional AI systems fail because they are siloed and rigid. Basic LLM systems fail because they are fluent and eager. Both fail, but in different ways. One says: "cannot proceed." The other says: "here is a confident narrative," because we force it to. Most deployed AI is still put into the wrong role: instant answers on insufficient information.

I keep comparing this to the "robotics" we did in the 90s: taking a booking made by an experienced travel agent under stress, running checks, and coming up with a better result, though sometimes ignorant of hidden constraints like customer preferences (such as "I don't fly via XYZ"). It depends on the information. And then the same in Cytric™: a flight request to be answered in seconds, filtering by default, not always the best result. Or the filtering in search engines, which created a legion of SEO experts (search engine optimizers) pushing "their" result onto the first result screen on Google.

Lack of Experience

And mostly, we walk away with "throw-away experience". AI is "reborn" every time; it is mostly stateless. It cannot develop its own "memories" or learn from what it did wrong.

Agentic AI gives us the tools to change that. In my humble opinion, persistence is key to AI improvement. Allow it to really learn, so it is not a librarian with the world's largest library but no experience. Give it routines to question itself, to learn, to develop. And then treat AI not as a panacea but like any other colleague. "Greenhorns", as Daniel Stecher called them: they come in fresh from university, but they need to learn what happens when the rules break, when "best practice" is not the universal truth taught at university but something that must take the specific environment into account.
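
A minimal sketch of what such persistence could look like: a plain "lessons learned" file carried over between sessions and injected into the prompt. The file name and the flat JSON list are hypothetical simplifications:

```python
# Sketch: persist "lessons learned" across otherwise stateless sessions.
# The file name and the flat JSON list are hypothetical simplifications.
import json
from pathlib import Path

MEMORY = Path("lessons_learned.json")

def load_lessons() -> list[str]:
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else []

def remember(lesson: str) -> None:
    lessons = load_lessons()
    lessons.append(lesson)
    MEMORY.write_text(json.dumps(lessons, indent=2))

def build_prompt(task: str) -> str:
    # Inject past experience, so the "greenhorn" does not start from zero.
    experience = "\n".join(f"- {item}" for item in load_lessons())
    return f"Lessons from earlier sessions:\n{experience}\n\nTask: {task}"
```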

Experience vs. Learning Best Practice

When I develop, I never cater for "failsafes" and "error logs". Code has a task; at the end, the outcome is verified to have the proper format, else it's a bug. Running autonomously, it does not help that the code logs an error, or crashes the system. The goal is 100% reliable code. What would we do if a GDS system failed? Or Google? Just because someone developed a "failsafe" that managed the problem instead of solving it? But that is what they still teach at universities as "best practice": your system delivers raw data? Check that it's correct. No faith in your own system?
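
A minimal sketch of that philosophy as I read it: one postcondition check on the outcome, where a violation is a bug to fix, not an error to log and "manage". The booking record layout is a hypothetical illustration:

```python
# Sketch: one postcondition check at the end. A violation is a bug in
# this code, not an "error" to log and manage. The record layout is a
# hypothetical illustration, not a real GDS message format.
def process_booking(raw: str) -> dict:
    pnr, carrier, flight_no = raw.split("|")
    result = {"pnr": pnr, "carrier": carrier, "flight_no": flight_no}

    # Postcondition: the outcome must have the proper format.
    assert len(result["pnr"]) == 6, f"bug: malformed PNR {result['pnr']!r}"
    assert len(result["carrier"]) == 2, "bug: carrier must be a 2-letter code"
    return result

print(process_booking("ABC123|LH|400"))
```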

So AI fails not on wrong knowledge, but on a missing definition of "best practice". And on missing "experience".

The Aviation Dilemma

But yes, then we have those airlines that modify "Standard System Interface Messages", creating business for tools like tNexus by DTP. Or that develop "open standards" (yes, opentravel.org) catering for all kinds of exceptions, rendering them useless in practical terms.

Then, as Daniel Stecher likes to say: we have 25 different systems (information silos), complemented by Excel sheets. Worsened by adding remote systems. Forcing our experts into cognitive overload. And then we expect AI to solve that mess?

I compare this to A-CDM: developed in Zurich, focused on collaboration, failing as industry players demand information delivery but put their own information behind a paywall. Yes, I am talking about you, DFS; you "naturally" jump to mind. We are still building information silos. And we are proud of it…?

Summary Assessment

Agentic AI is a good and promising way to improve generative AI output. Especially breaking out of "chat mode" (why is it called ChatGPT?), giving the AI time to enable a real thinking process, and using AI in teams (CrewAI) are really interesting. But again, as KieuAnh Billiot emphasized recently: AI amplifies. Good or bad. If you have good data, good information, and a good setup, it amplifies that. If you feed it bad data, or try to abuse it to compensate for the neglect of your IT infrastructure, you just amplify that: garbage in, garbage out.