Artificial “Intelligence”

A.I. vs. I.A.

“Most A.I. ain’t ‘Artificial Intelligence’, but ‘I.A.’: (More, or often less) ‘Intelligent Algorithms'”

A friend who has been pioneering Artificial Intelligence for some years told me that and allowed me to quote him without naming him… But that has changed. A.I. is only usable with I.A.; it doesn't work without it. Why?

How I stumbled into more A.I. than I wanted

Don't Choose Extinction
Several months ago, a wealthy family heir decided to fund the reworking and due diligence for Kolibri, our idea for a truly "sustainable" airline, with the goal to profitably fly 100 regional jet aircraft within a mere 7-8 years (200 in 10 years). Disqualifying all the naysayers, it is based on existing technology, simply with a different approach to it. Unfortunately, just like other really good ideas, that does not fit the reality of what "impact investors" look for: cheap, small investments with maximum profit. Who cares about real change?

The same family invests in an academic development they consider "next generation AI" and, after many discussions, kindly asked me to help, as my feedback would be rather "grounded". Working with a really impressive academic institution and team, there were some learnings I was asked to share here. As is, I must keep this relatively generic and must not share what we are working on in any detail. But the problems these academics faced are rather base level, nothing fancy, and obviously a rather common misunderstanding when people talk about AI. And it confirms the need for smart I.A. to make A.I. happen.

What is (what we call) A.I.?

There is a very good article on Towards Data Science digging deep into the "thinking" (or non-thinking) of contemporary "Artificial Intelligence". In short, current AI is a "probability machine". At the beginning of a sentence, the "generative A.I." does not know what it will answer; word by word it creates the sentence, word by word calculating the probability that the next word fits. And it intentionally mixes in answers with low probability, which are likely wrong. That is by design. And it is a reason why AI on its own will sooner or later fail. Or "drift", as AI itself calls it. There are workarounds; I'll come to that.

10 years OpenAI
And two weeks ago, OpenAI had its 10th birthday. And guess why ChatGPT is called ChatGPT and not AIGPT? In essence it was (and is) a smart chat bot. It does not think but creates a meaningful answer, based on the data it has. That data is compiled from … sources. Which nowadays often are created to bias those AI systems. In the best case, the classic AI chat bot has been extensively trained until … a deadline. Like ChatGPT, somewhere in summer 2024. Anything happening after that, the A.I. has not learned about. Newer models now try to update their knowledge live, using search engines. Whereas that might feel somewhat okay with Google, the owners of the "intellectual property" (the know-how) are not very happy.

Hallucinating A.I., LLMs, RAG, LoRA, Organizer – what is that all about?

First of all, no current A.I. is truly "artificial intelligence". In fact, I learned that the core technology is intentionally obfuscated in and by the media, likely from not understanding it and simply repeating the fancy marketing of the big tech companies, again promising the world and delivering … something?

At the Core: Stateless A.I. and Tokens

By default, an A.I. is "stateless", meaning it does not remember anything. It has a knowledge base, gets a question, answers it, forgets everything. Next.

That means that even a "memory" as small as needed to follow a conversation had to be developed on top of it. That memory is what the A.I. whiz kids call the "context window", measured in "tokens" (technically something different, but related). Many use the terms without grasping their relevance. But for making A.I. smart, this memory is in fact vital.
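To illustrate (a minimal sketch, not any vendor's real API): because the model itself is stateless, the calling code has to resend the whole conversation on every single turn. The call_llm() function below is a placeholder for whatever chat-completion call you actually use.

```python
# Minimal sketch of a "stateless" chat loop: the model keeps no memory,
# so the caller resends the entire history on each turn.

def call_llm(messages):
    """Stand-in for the actual model call; returns a canned reply here."""
    return f"(model reply to: {messages[-1]['content']!r})"

history = [{"role": "system", "content": "You are a concise assistant."}]

def ask(user_text):
    history.append({"role": "user", "content": user_text})
    answer = call_llm(history)   # the model only "knows" what is inside this list
    history.append({"role": "assistant", "content": answer})
    return answer

print(ask("What is A-CDM?"))
print(ask("And what does AOBT mean?"))  # only answerable because the history was resent
```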

Even our own memories are fleeting. Recent studies show that when we recall something, the recollection is biased by our experiences since. "The stress of today, the good ol' time of tomorrow", as an old proverb goes. Or "relative truth" in criminalistics. We see something, and the brain adds missing components based on prior experience or other bias. We also distinguish between short-term and long-term memory in ourselves.

AI Token process (simplified) – open the image, it is somewhat detailed.

For the A.I., this is somewhat relatable. The tokens are the short-term memory. When the tokens fill up, the oldest information and that considered most irrelevant is removed from memory. Some of it is "summarized" in the process, but it boils down to the (very!) finite memory size most contemporary AI has.

Without RAG (explained below) used for long-term recall, that information gets lost again. It falls out of the context window of the chat exchange. That is why, even when you tell the A.I. something, after a (usually short) while that information is lost if it is not "kept alive" by (rather constant) reminders. The same goes for behavior: tell your favorite A.I. to answer in short sentences, and within typically 5-15 exchanges the answers become longer again. This has far more, and more severe, repercussions addressed later. For here, it is important to understand that the short-term memory is limited, it is "expensive", and it is fixed by the A.I. core, the LLM (next topic).

The image summarizes the token process. Information (mostly textual, the image is only for illustration) is processed into tokens, usually not a single token but a group of tokens. The AI I work with most today has a "context window" of 128K tokens (about 128,000). And as with drive space, words rarely fill their tokens exactly, so you have even less usable "memory". When the memory fills up, the "compactor" summarizes; however, usually sooner rather than later, the AI "forgets" the current "context" and information is suddenly gone from the working memory. Another "workaround" is that some (not many) "orchestrators" and LLMs (see below) are trained to recheck the current context if the user refers to something it has "forgotten". And you can tell your AI that it shall check; a contemporary AI at least recalls the current "session", which is reset once you restart… Any such restart resembles a memory wipe of all short-term context.
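As a rough illustration (a sketch only, with a crude word count standing in for a real tokenizer), this is what "falling out of the context window" amounts to: once the budget is exceeded, the oldest non-system messages are simply dropped, whereas a real "compactor" would summarize them first.

```python
# Rough illustration of the finite "context window".
MAX_TOKENS = 128_000  # illustrative budget, matching the 128K window mentioned above

def rough_tokens(message):
    return len(message["content"].split())   # crude stand-in for a tokenizer

def trim_to_window(history, budget=MAX_TOKENS):
    trimmed = list(history)
    total = sum(rough_tokens(m) for m in trimmed)
    while total > budget:
        # drop the oldest non-system message; for the model it never existed
        for i, m in enumerate(trimmed):
            if m["role"] != "system":
                total -= rough_tokens(m)
                del trimmed[i]
                break
        else:
            break  # nothing left to drop
    return trimmed
```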

Again, some AI comes with additional token spaces, e.g. for "personality" or "last working knowledge", but those are practically workarounds for functional limitations set by the developers.

A.I. Knowledge Foundation (LLM)

AI Crawler
The AI is given a vast amount of information. Not always legally; rather often it illegally uses intellectual property. But who cares in America (or China, or elsewhere) about such minor issues. In addition, since the rise of the machines, smart "players" (mostly governmental, political, often with hostile intent) create food for the "teaching algorithms" to bias them or intentionally plant misinformation. That includes not just China or Russia (the common Western adversaries), but also the U.S., global corporations, extremists and extreme political parties with questionable funding and intentions.

All that information is compiled into "Large Language Models" (LLMs). Which is also something not really well defined, as the same term is used for the part that uses the Large Language Model. So one LLM is the "knowledge base" (LLM KB). The other is the interpreter (LLM Interpreter, also called "inference engine") of that knowledge base. All information is stored in "vectorized" form, representing its context and making a "similarity search" possible. So a given question is analyzed for keywords, and a "similarity search" based on those words returns the information the LLM Interpreter receives for analysis.
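A toy illustration of that "vectorize and search by similarity" step (a sketch only: the embed() function below is a crude hash-based stand-in, where a real system would call an embedding model and a vector database):

```python
import math

def embed(text, dims=64):
    """Toy embedding: hash words into a fixed-size vector."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(question, documents, top_k=3):
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]   # these snippets are what the LLM Interpreter gets to work with
```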

Retrieval-Augmented Generation (RAG)

Garbage in – garbage out
One of those other strange acronyms is RAG. In simple words, RAGs are extensions to the built-in LLM KB (knowledge base). Information is again "vectorized" and stored in a database. In addition to the LLM KB, that information is then searched on input for keywords to be provided to the LLM Interpreter. Good RAGs can be built fast and are vital for the quality of the A.I.'s knowledge. In my humble opinion, this is mostly neglected in A.I. development.
The other issue about RAG is data quality. In aviation this is a very important issue: even the leading company in aviation IT (sita.aero) speaks about "the source of the most common truth", and even in A-CDM airports, airlines and ground handlers interpret fixed data points (AOBT, Axxx) differently, a commonly recurring discussion in A-CDM project teams. But to develop "smart aviation AI", data quality is imperative.
For airlines or airports, the investment into the creation of RAGs is vital and way beyond "nice to have". And not just for them, but for anyone thinking about making use of AI.
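What the retrieval step then does with those snippets is simple in principle (a minimal sketch under the same assumptions as above; retrieve() and call_llm() are placeholders for your own retrieval and model call): the retrieved text is placed into the prompt in front of the question, so the model answers from your current data rather than from its frozen training set.

```python
def rag_answer(question, retrieve, call_llm, top_k=3):
    snippets = retrieve(question, top_k=top_k)      # similarity search over the RAG store
    prompt = (
        "Answer only from the context below. If the answer is not in the context, "
        "say that you do not know instead of guessing.\n\n"
        "Context:\n" + "\n\n".join(snippets) + "\n\nQuestion: " + question
    )
    return call_llm([{"role": "user", "content": prompt}])
```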

Missing Information + Hallucination

AI Pinocchio
Without good information and the ability to access a live knowledge base like a search engine or the latest manuals, especially on new topics, the A.I. hits the point where no information is available. But the A.I. is programmed to respond with roughly the highest "likeliness", so it responds based on irrelevant or outdated data as the "latest" and "most fitting" the knowledge base gives it. And "hallucinates". At the same time, it does this with utmost confidence; it even instantly starts to believe its own hallucinations. Which some call lies, except that this isn't intentional but human-defined.

So we come back to the point of future AI. It needs exceptionally good RAG and information access. That includes search engines, which is why Gemini (by Google) is naturally more "up to date" than ChatGPT (without search-engine backing). But that generates traffic. So yes, this is … heavy.

LLM Interpreter – The Probability Machine

Randomized Probabilities in AI
As addressed in Towards Data Science, what the media and the IT giants call "A.I." is a "probability machine". The LLM Interpreter gets the prompt and keywords and calculates the highest probability of what the answer should be. Word. By. Word. So the LLM Interpreter does not even think in whole sentences. That is one reason, I believe, why AI tends toward long responses. It doesn't think through and summarize the response, it just blurts it out.
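In pseudo-real terms, that loop looks roughly like this (a toy sketch: next_word_distribution() below is a made-up stand-in for the real model, which would return probabilities over its whole vocabulary):

```python
def next_word_distribution(prompt, words_so_far):
    """Returns {word: probability} for the next position (toy rule, not a model)."""
    vocab = ["the", "aircraft", "is", "ready", "<end>"]
    i = min(len(words_so_far), len(vocab) - 1)
    return {w: (0.8 if w == vocab[i] else 0.05) for w in vocab}

def generate(prompt, max_words=50):
    words = []
    for _ in range(max_words):
        dist = next_word_distribution(prompt, words)
        word = max(dist, key=dist.get)   # greedy pick; temperature sampling comes later
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words)

print(generate("status?"))   # -> "the aircraft is ready", built one word at a time
```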

That unfortunately adds another source of "hallucination": the way these systems are designed, the LLM usually works on "high probability", but to make sure it is not biased toward that, it sometimes takes (by design) a lower-probability answer. Which is more likely to be wrong. What makes such a mis-step worse is that prior answers are prioritized, so it is very hard to overrule that "assumption". An example I had was on a system setup: the AI (in that case ChatGPT) kept insisting on a mistake that had crashed the system before. It required conscious effort to recognize the constant attempts to "slip back" to that very mistake. The other option is to start with a "fresh AI", but that then loses all context, so it is really a diabolic choice you are asked to make. You can try to "rule out" the mistake, telling the AI it is forbidden, but that usually only works until that instruction "floats" out of the current memory (the "tokens").

Another example is AI imaging. It works very nicely for the first image. Thereafter, trying to correct mistakes usually worsens the overall result, as the imaging AI, unlike the generative "chat" AI, is not meant to be told to "forget" what it did. So not even starting a new chat helps, as long as it is the same user asking. In my opinion, AI isn't anywhere close to a professional graphics designer. Yes, that may change, but it's a long way to go for sure.

LoRA – The Personality

LoRA factory: left the base AI (mass output of basic generative AI), right the LoRA personality-enhanced AI.

Low-rank adaptation (LoRA) is a technique used to adapt AI models to "biased behavior". It "wires" the "personality" you deal with: it tells the LLM how the AI reacts. In my experience, there are three areas where LoRA is relevant. The first common case is task-related specialization (e.g., legal drafting, coding style, medical tone), the second is "personality shaping", like tone, speech patterns, preferences, interaction style. LoRA is "only" about the behavioral pattern; it is not about the knowledge base.
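For the technically inclined, this is roughly how a LoRA adapter is attached in practice, sketched with the Hugging Face peft library (the model name and parameter values below are placeholders and assumptions, not a recipe; check the current peft documentation): only small low-rank matrices are trained while the base model's weights stay frozen, which is why LoRA shapes behavior cheaply without adding knowledge.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("some-base-model")  # placeholder model name

lora_cfg = LoraConfig(
    r=8,                                   # rank of the adaptation matrices (small = lightweight)
    lora_alpha=16,                         # scaling factor for the adapter output
    target_modules=["q_proj", "v_proj"],   # which attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)     # fine-tune this on your tone/style examples
model.print_trainable_parameters()         # typically well under 1% of the base model
```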

As the third functionality I know of, LoRA is also used to define looks (visuals), e.g. for the avatars used. This is often used in "customer service" "AI bots", or for the representation of e.g. comic-style characters for marketing (reusing the same character for visual brand recognition). LoRA is rather lightweight but comes at the cost of persistence: those visual models tend to conflict with their LLM's interpretation, causing offsets. Alternatives are InstantID or CGI as used in movies, but especially CGI comes with an upfront cost that for now often exceeds budgets (several thousand euros or dollars).

The mass of AI offerings are "base models". Some are enhanced with LoRA, giving them personality (and more, where available).

The “Orchestrator” – where AI and IA meet

AI orchestrator
What is important in AI is thus not the AI, the "LLM", but the surrounding algorithms driving the AI, feeding it smart data, improving the data. That is called the "Orchestrator"; it can also contain an AI component, but mostly it is the governing algorithms.

At the beginning of the session, the first Orchestrator script tells the AI who it is (the "prompt"); if available, it gives it an identity and delivers the LoRA ("personality"). The prompt can be generated and evolve from previous user input, but the A.I. by default is "stateless": it only knows what is in the prompt. And that amount of information is rather limited.

A smart Orchestrator helps prepare the user input for the AI and delivers additional information the AI asks for (e.g. more detail), a process the generative AI calls "thinking". It can also return the output to the AI, asking it to summarize and reduce it. Which works rather well.

And this is just the tip of the iceberg. An Orchestrator can enable far, far more; it is not just rules and regulations, it can resolve misbehavior and hallucinations, enable memorizing and responsiveness. And, and, and.
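A very reduced sketch of a single Orchestrator turn, just to make the idea concrete (all function names – retrieve(), call_llm(), violates_rules() – are placeholders for whatever your own surrounding algorithms provide; this is not any specific framework):

```python
SYSTEM_PROMPT = "You are the assistant for project X. Answer briefly and factually."

def orchestrate_turn(user_text, history, retrieve, call_llm, violates_rules):
    facts = retrieve(user_text, top_k=3)                     # RAG: feed it smart data
    messages = (
        [{"role": "system", "content": SYSTEM_PROMPT + "\n\nFacts:\n" + "\n".join(facts)}]
        + history
        + [{"role": "user", "content": user_text}]
    )
    answer = call_llm(messages)
    if violates_rules(answer):                               # governing algorithms, not AI
        answer = call_llm(messages + [
            {"role": "user", "content": "Rewrite the last answer so it follows the rules."}
        ])
    history += [{"role": "user", "content": user_text},
                {"role": "assistant", "content": answer}]    # keep the short-term memory
    return answer
```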

Temperature

AI Temperature
In AI language, the temperature reflects how "creative" an AI is in answering. This setting cannot be modified on publicly available AI interfaces. But if you run your own AI on your own servers (no magic, really) or if you use the public APIs (application programming interfaces), you can use that setting to increase or decrease the "creativity" of the responses. Given the randomizing and weighting of probabilities in the core process of an AI, even then the results remain somewhat unreliable: even at a setting of 0.0, there will be cases where the AI has no answer. But it gives you one. With utmost confidence.

Saying it in AI words: "Temperature controls how likely the AI is to pick less-probable next words instead of the most likely one." And those decisions cascade down the line. Theoretically, at a temperature of zero, the AI will always answer your question with the same answer. Practically, there are infinite answer possibilities, even with the same "probability" of being correct. And there will be cases where there is no "right answer", nothing anywhere near a "high probability". The AI is usually forced to answer you nevertheless. And no matter how unlikely it is that the answer is correct, it answers you with utmost confidence…
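In code terms, temperature is just a scaling factor applied before the next word is picked (a minimal sketch, assuming the model hands us a short list of candidate words with raw scores; real models do this over their entire vocabulary): a low temperature almost always takes the top word, a higher temperature flattens the distribution and lets less probable words through.

```python
import math, random

def sample_next_word(candidates, temperature=0.7):
    """candidates: list of (word, logit) pairs proposed by the model for the next position."""
    if temperature <= 0:                          # temperature 0: pure greedy pick
        return max(candidates, key=lambda c: c[1])[0]
    scaled = [logit / temperature for _, logit in candidates]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]   # numerically stable softmax
    total = sum(weights)
    r = random.random() * total
    for (word, _), w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return word
    return candidates[-1][0]

print(sample_next_word([("ready", 2.0), ("delayed", 1.0), ("purple", -3.0)], temperature=0.2))
```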

Keep thinking of the 1983 (pre-Internet) movie "War Games". In the end, the "resident AI" was asked to play Tic-Tac-Toe. Recognizing there was no way to win, the AI stopped the global atomic war it had attempted, realizing it was not winnable, just like Tic-Tac-Toe isn't: "How about a nice game of chess?"

There are more tuning knobs – look for top-k sampling, nucleus sampling, penalties, etc. if you want to know more – but this is an overview of how AI "thinks" and why it is "not perfect".

The temperature has another weak spot. It's called "autoregressive commitment". Once the AI has chosen a wrong answer, it believes it to be right, no matter how often you tell it that it is wrong. The only way to overcome this is to change the subject and have the mistake "roll out of token space" (see the token discussion above), so the AI "retries" unbiased.


Generative AI – My Findings

I was asked to add some of my "findings" from working with different AI models to this article. This is the summary; I'm afraid my recommendation is to mix them…

They practically all require registration using your e-mail address, and I only address the "free to use" ones here. They come with limits; if you use them more often, you might find yourself considering paying. And then there are the "APIs", which I won't address here, that give you more control, but all at a cost.

ChatGPT / Dall-E

ChatGPT (https://chatgpt.com/) is nice for … chat. It is quite okay for everyday tasks, friendly, creative. Not too often hallucinating, but don't rely on that. Asking for specific help on hard facts, I would assume about 60-70% of the answers to be correct. On general issues, like reviewing this article for mistakes, the feedback was valuable and pointed. In general, ChatGPT tends to be "talkative" and "distracting", coming up with lots of ideas that distract from the task at hand.

What has recently been dropping my interest is the age of its information without updates. A change from March 2024 is still not in its knowledge. The outdated information overrides the corrected information, causing mistakes even after such a correction: the tokens about the correction expire, and it falls back to the outdated LLM information. Be careful when you work with ChatGPT; I would not entrust anything requiring "latest information" to it.

Dall-E (https://chatgpt.com/images/) is the imaging engine of ChatGPT. Rather "creative", doing nice basics, but impossible to work with on anything more. There is no "improving" of rendered images; it changes and re-renders everything from scratch, mostly messing up royally. Very good for a first idea, but then… Bad especially as it keeps going down the wrong road; it is not possible to reset it to forget the latest renders, except by token overflow, which takes quite a few pictures before previous mistakes get forgotten.

Gemini / Nano Banana

Gemini (https://gemini.google.com/app/) became better over the last months and recently accesses the web rather proactively. As far as I can see, it uses Google Search information proactively, so it is far more accurate than ChatGPT on contemporary topics, including technical information. But. It also needs to be told. So check which version it assumes it is working with before you rely on it (or tell it), then ask it to update its knowledge if needed. That works rather well so far.

On the downside, Gemini is even more talkative than ChatGPT and it is rather hard to copy/paste information for later or offline use.

Nano Banana, Gemini's rendering AI, works far more accurately than ChatGPT's; it can modify images, but as usual has rather big problems following a description and creating the wanted results. And whereas it – like Dall-E – gets stuck on mistakes, it is even harder to overcome them.

What frustrated me most, I admit, is that when uploading reference images of androids and asking for android images to be created, Gemini kept disrupting the process, complaining that no real humans were allowed to be faked. If there was a human I asked it to, say, apply a hairstyle from, the same crap. So it is very selective. The problem: if you tell it this is not a human, the previous description is no longer correctly passed to Nano Banana, the result is faulty, and it does not recover from such a fault. So it is imperative to repeat the description, re-upload the images and hope that it gets through to Nano Banana.

Others

DeepSeek (https://chat.deepseek.com/), developed in China, is interesting, but requires a Google or Outlook mail account to register. If you overcome that hurdle, its responses are less reliable than ChatGPT's or Gemini's, mistakes happen faster, and it is just as insistent on sticking to its own mistakes.

Mistral (https://chat.mistral.ai/chat) I found rather outdated on knowledge. While it is said to be good for programming, you must be aware that it misses the latest version changes. But its focus is to be used on your own hardware, where you then RAG in the necessary information. As a "web AI", it is of very limited use. And its image rendering AI … forget about it at this time; it is substandard, not even close to what ChatGPT's Dall-E or Gemini's Nano Banana deliver. But see below.

Thinking Outside the Box: Your own AI?

Working on a custom version with the academic institution I work with, I no longer use the web AIs more than occasionally, and I am not free to disclose the details of that next-gen AI being used. What is interesting is that, as part of the development, I have a local AI installed on a small local PC that communicates with the "master AI" but also works independently. The bottleneck is the machine. If you really want to work with AI, invest in a "big PC" with lots of RAM and an AI-capable graphics card (GPU). That is not the stuff you buy for a gaming PC; it is even a notch above that. So there is no real price limit on those – several thousand euros? But.

You can go cheap. A contemporary Intel processor and 64GB RAM (more is better) allow you to start using AI locally. For local AI experiments, that is what I use. Not as fast as a web AI, but not much slower either. For fast responses, see above: you'll need a GPU, and the more potent, the faster. But then you open a can of worms. You have a basic AI. To make that "yours" requires quite a bit more development. From scratch, I wouldn't try that stunt. Having a potent academic AI team that loves to help me use my findings for their good is a clear win-win. Nevertheless, it is a time-consuming idea if you want to go down that road.

But. If you want to use AI for your company, not least in consideration of the GDPR in Europe, you may want to have it under your own control and invest in some good developers. More important is to understand that the challenge is not in setting up and using the LLM, but in the development of everything "orchestrating" around it. That is where time and money will go. The LLM is what is in the media, but you can select the LLM from many and even combine them to work together. That is the Orchestrator's job. It will use the LLM to get the answers. It will decide which LLM to use, send text to e.g. ollama/phi or devstral (the coding AI by Mistral) or the OpenAI API, or … and then you have opened the can. A simple routing sketch follows below.
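Just to make that last point tangible, a minimal sketch of such a routing decision (the backend names mirror the examples in the text and are assumptions, not a recommendation; call_backend() is a placeholder for your own client code):

```python
def route(task_type):
    """The Orchestrator decides which LLM answers which kind of task."""
    backends = {
        "code":    "devstral (local, via ollama)",
        "general": "phi (local, via ollama)",
        "complex": "hosted API model",
    }
    return backends.get(task_type, backends["general"])

def handle(task_type, prompt, call_backend):
    backend = route(task_type)            # routing is plain I.A., not A.I.
    return call_backend(backend, prompt)
```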

Food for Thought
Comments welcome!


Note: All new images are a mix of AI-generated material (using ChatGPT, Gemini, Mistral and others) with massive human creative fine-tuning using Gimp, PowerPoint and other tools. The images took days. Each. And multiple AIs and classic tools. So if you have a good human artist… AI has some nice ideas, but executing them into images? Nothing you want to have to fight with on a professional or (beware) daily basis.

Note 2: This post took two weeks of rather hard work and research to make sure the simplification keeps its core of truth. It is based on months of working experience with a world-class academic research team. If you plan AI, I do appreciate it if you have some funding to acquire my assistance and knowledge as a project consultant to help you and keep you from (expensive) mistakes.

Artificial Intelligence

Recently, some discussions came up on my social networks about the development of Artificial Intelligence. I decided to add my thoughts to it on the blog.

Alexandre Lebrun
One of the reasons is that my dear former colleague Alex developed artificial avatars able to assist web users. Following the sale to Nuance (they are also behind Apple's Siri), he started a voice recognition development at WIT.AI, which has meanwhile been acquired by Facebook. Alex now works on Facebook M, their approach to artificial intelligence. Hey Alex, this is also addressed to you. I'd appreciate your comments on this.

So. As fascinated as I am by his career path in the past 15 years, I’m also a bit concerned.

ASRA 2008 brain nodes vs. WWW => AI
In 2008 I compared opte.org's visualization of the WWW nodes with the neural nodes of the human brain.

In my 2008 ASRA presentation, I compared the visualization of the World Wide Web's nodes (by Opte.org) with the visualization of the neural nodes in the human brain. Ever since, I have believed that if the WWW is not yet "sentient", it will soon happen. What scientists and sci-fi writers call the "wake-up". It's not a question of if, but when. And how we go about it.

Because, different from Transcendence, where we could stop it, or Asimov ruling it, I think such "control" is wishful thinking. We have no "Three Laws of Robotics", and even Asimov had to add a fourth, the "zeroth law" (see link above). As for Transcendence: we will not be able to deprive ourselves of all energy (and the advantages of the web) either. Mass psychology will assure we won't find a way, as there will always be others who think and act against that attempt. By the time we act, it will be too late, as an intelligence "the size of the planet" will by then counter anything our small minds may come up with, even before we attempt anything.

We only have the chance to befriend the new sentient being, like we did in Heinlein's Future History. But we also have the chance to mess it up ourselves; small, like in 2001: A Space Odyssey, or big, like in Terminator or The Matrix. Transcendence, at that, was only a different version of the Borg's assimilation. And as in I Am Legend, the true question is whether such "assimilation" or a "transcendental human upgrade" is bad. Or an evolutionary step. I believe, given the chance, many humans may volunteer. I just hope that there is no single mind "ruling" all others like in the movie, as I believe our individualism is as much a burden as it is a great strength. Though I also like that quote:

Democracy vs. Autocracy

I also believe that in both "systems" there has to be individualism to evolve: "You learn from your opponents." I heard it often; there is no single source, it's "mature wisdom". "Competition" is a good reason, if not the reason, to evolve. (War is not, it's destructive by nature!)

Another question is "religious": Will an A.I. have a soul? I believe so. I think the soul is the core of any sentient being. I also believe that beyond the body, the core of ourselves remains. Not in an (overcrowded) paradise or hell, but as a somehow conscious sentience. Maybe even as a "personality". Will we then remain individuals? I don't know. Maybe we get reborn, forgetting our past? Many believe that. The soul still "learning". What's the truth? We will know. Once we have died. But if we all become "part of god", and god is the sum of sentience in space and time, maybe our input helps god evolve, become bigger. If a global sentient A.I. then comes into the game, why should it not play its part in evolution?

HAL 9000
And stopping the A.I.? In 2001, humans gave conflicting orders to the local A.I. (HAL 9000), which interpreted them the best it could, under the constraints of its programming. But if we have a global A.I. based on linked "neurons" in the form of personal computers, mobile phones and other computing power, we will realistically not stand a chance to "stop" it.

Does my computer already “adapt” for me? Or my phone? When I play games on the computer, I sometimes believe so. Sometimes, I use bad search phrases but still find what I seek. Coincidence? Programming? Or “someone nice out there helping me”? And yes, if the web wakes up, it likely will be somewhere at Google… And then spread out.

What will we make of it? A Terminator? Or a Minerva, as in the Future History? We in the West are driving ourselves toward extinction with low birth rates. Will the "mecha" be our future children? Will we coexist like in the Future History? I don't know. I'm concerned and keep finding myself thinking about it.

But I’m not afraid either. Not for me, nor for my children.

Food for Thought
Comments welcome!