A client raised a timely question: what added value does Athena Decision Systems, which I represent, bring compared to the many AI agents offered by other parties? Can the risk of generative artificial intelligence (AI) hallucinations really be eliminated with the help of AI agents, as Athena's value proposition promises?
The question is a good one, because the matter is anything but straightforward. The starting point for AI agents is generative AI: without it, there would be no agents capable of verbal and text-based communication. But generative AI also brings uncertainty, caused by the randomness inherent in its algorithms.
The rapid development of generative AI constantly provides new ways to improve reliability. These include, among others, careful design of the prompts given to the model (prompt design) and grounding answers in curated supporting documents (retrieval-augmented generation, RAG).
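To make the RAG idea concrete, here is a minimal sketch. Everything in it is illustrative: the documents, the naive word-overlap retriever, and the `call_llm` helper are placeholders for a real retriever and model API, not any particular product.

```python
# Minimal sketch of retrieval-augmented generation (RAG):
# ground the model's answer in trusted documents instead of
# letting it answer from its parametric memory alone.

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (an assumption, not a real API)."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"

# A toy in-memory "knowledge base" of trusted company documents.
DOCUMENTS = [
    "Refunds are granted within 30 days of purchase with a receipt.",
    "Support is available on weekdays from 9:00 to 17:00.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer_with_rag(question: str) -> str:
    # Build a prompt that constrains the model to the retrieved context.
    context = "\n".join(retrieve(question, DOCUMENTS))
    prompt = (
        "Answer ONLY from the context below. If the context does not "
        "contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(answer_with_rag("When can I get a refund?"))
```

Grounding the prompt this way narrows what the model can claim, but the generation step itself is still stochastic.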
Additionally, responses can be generated through collaboration among agents: one proposes an answer, another checks it, and a third improves it. This, too, raises quality but does not remove the fundamental problem. All of these methods can improve the quality and relevance of responses, yet the risk of error cannot be fully eliminated, because the stochastic algorithms of generative AI still operate underneath. Of course, you can keep refining the prompts (prompt design) or tuning the documents that support the language model (RAG) until the correct answer appears, but then the correctness of the conclusion is determined by the human modifying the prompts, not by the AI or the agent relying on it. Nothing guarantees that the next, differently phrased question will still yield a correct answer.
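The propose-check-improve pattern is easy to picture as a pipeline. The sketch below is purely illustrative: `call_llm` again stands in for a real model API, and the three "agents" are simply three roles handed to the same model.

```python
# Propose / check / improve: three agent roles passed over the same draft.

def call_llm(role: str, prompt: str) -> str:
    """Placeholder for a real LLM API call; returns a canned string here."""
    return f"[{role} output for: {prompt[:40]}...]"

def answer_with_review(question: str) -> str:
    # Agent 1 proposes a draft answer.
    draft = call_llm("proposer", f"Answer the question: {question}")
    # Agent 2 checks the draft for factual problems.
    critique = call_llm("checker", f"List factual problems in this answer: {draft}")
    # Agent 3 rewrites the draft using the critique.
    return call_llm(
        "improver",
        f"Rewrite the answer, fixing these problems. Answer: {draft} Problems: {critique}",
    )

print(answer_with_review("What is Athena Decision Systems' value proposition?"))
```

Each hand-off can catch errors, but every step still runs the same stochastic machinery, which is why the fundamental problem remains.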
The best way to eliminate the problem would be to remove the root cause of hallucinations. The development of large language models (LLMs) is leading to large reasoning models (LRMs), which are able to critically examine their own responses and correct errors in them. Unfortunately, with current technology this does not seem to eliminate hallucinations either. Read about it: The AI Reasoning Paradox: Why Agents FAIL | AIGuys. The article is on Medium (medium.com); if you hit the paywall, a subscription is required.
Using the methods described above to improve reliability significantly increases the cost of developing the service. In addition, the cost of using a public cloud language model service grows when multiple AI agents work on the same response.
If we want AI agents to eliminate the risk of incorrect answers, they must be given completely reliable supporting information and guidelines. The latter can also be called business rules: rules that describe how the business operates. Read more about hybrid artificial intelligence, which combines stochastic and deterministic approaches: Responsible AI Decision Making. And when a decision is so important that a human must evaluate the answer regardless of how reliable the solution is, the rule engine still provides guidelines that support the expert or customer service representative.
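As a sketch of that division of labor, consider the hypothetical loan example below: the decision comes from a deterministic rule, and generative AI is used only to phrase the response. The loan scenario, the 40% threshold, and the `call_llm` helper are all invented for illustration; a real deployment would use a full rule engine, but the split is the point.

```python
# Hybrid sketch: a deterministic business rule decides, and generative AI
# only phrases the explanation. The rule never depends on the model,
# so the decision itself cannot hallucinate.

from dataclasses import dataclass

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (an assumption, not a real API)."""
    return f"[friendly wording of: {prompt}]"

@dataclass
class LoanApplication:
    income: float
    debt: float

def decide(app: LoanApplication) -> bool:
    """Deterministic business rule: debt-to-income ratio must stay under 40%."""
    return app.debt / app.income < 0.40

def respond(app: LoanApplication) -> str:
    approved = decide(app)  # the decision comes from the rule, not the model
    outcome = "approved" if approved else "declined"
    return call_llm(f"Explain politely that the loan was {outcome}.")

print(respond(LoanApplication(income=4000, debt=1200)))
```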
Only 100% accurate information guarantees a fully reliable result, whether or not AI agents are used in the service. On that foundation, services can be built that combine answers created by generative AI with the trustworthy business information found in the company's systems and databases. It is equally essential to capture the organization's rules, regardless of whether the final decision is made by an AI agent or a human. And what better place to manage the guidelines governing generative AI than a rule engine?
I am looking for customers and use cases where Athena can assist in improving efficiency and building reliable AI agents. Read more here: Athena Decision Systems