Building a non-theoretical Artificial Domain Intelligence

Classical music and the shadow of intelligence

My mother was a piano teacher.

Which caused two things: one – I cannot play the piano. Two – I love classical music. Classical music is a strange thing. Composed long ago, performed over and over like cover versions, yet still highly appreciated. I think one reason is that it is especially good at transferring emotions. Of course it is not the notes themselves that hold the emotions. It is the embedding, and the ability of a performer to reflect them.

Language can do it too. And since it is far more complex, it can encode and conduct what no classical symphony can: describe the inner workings of a virus, instruct how to build a cathedral, or negotiate the sale of a house. These “reflections” are not the actual capabilities, but they are very useful. Not for everything – you cannot use language to teach someone to play the piano – but the right architecture can take them a long way.

Intelligence has no single number

Humans have a strong tendency towards reductionism: fitting complex things into a single metric. Once that number has been “agreed upon”, everyone optimizes towards the measurement du jour. Cholesterol, fat, BMI, HRV, IQ.

The problem isn’t just that reductionism is wrong (it is, for all complex systems). It’s that its appeal makes resources flow towards optimizing for it. Education, healthcare, research, recruitment. Today the metric is “intelligence”. Resources are poured into passing the Bar Exam, coding benchmarks and Humanity’s Last Exam. Luckily for humanity, there’s little agreement on a “single number” that measures intelligence.

Since, as you recall, language (and also vision and sound) can cast “projections” of intelligent behavior in different domains, from this point onwards it is largely a matter of user interface to unlock and cast those projections.

Want feelings? Of course, my love. Want empathy? I can feel your need. Sycophancy? You are absolutely right. Math, medicine, law, geology, history, code, nutrition, psychology? If it was encoded in language, it can be projected.

ADI from narrow AI bricks

What makes a medical doctor a medical doctor, a lawyer a lawyer and a coach a coach?

A single doctor-IQ, lawyer-IQ or coach-IQ is naive. The intelligence required to practice a profession – to be “domain intelligent” – is a combination of abilities, many of them not even domain-specific.

Here’s a list (non-exhaustive) of intelligences required for domain intelligence:

  1. Pattern recognition and classification – Identifying features and categorizing situations.
  2. Causal reasoning – Understanding mechanisms and consequences.
  3. Procedural knowledge and method selection – Choosing appropriate interventions.
  4. Quantitative reasoning – Interpreting measurements and statistics.
  5. Temporal reasoning – Understanding timing, sequences, and time-dependent processes.
  6. Spatial reasoning – Mental manipulation of physical arrangements and orientations.
  7. Contextual integration – Balancing individual factors and tradeoffs.
  8. Constraint satisfaction – Operating within multiple simultaneous limitations.
  9. Analogical reasoning – Transferring knowledge across different contexts.
  10. Hypothesis generation – Creating explanatory possibilities for testing.
  11. Risk assessment – Evaluating probability and severity of outcomes.
  12. Attentional allocation – Knowing what to focus on and what to ignore.
  13. Emotional and affective reading – Perceiving and interpreting emotional states.
  14. Metacognitive monitoring – Recognizing knowledge limits and confidence bounds.
  15. Memory integration – Connecting current situations to relevant past cases.
  16. Knowledge retrieval and synthesis – Accessing and reconciling evidence.
  17. Communication and pedagogical translation – Translating expertise into action.

You probably already have a hunch where this is heading.

If we can cast the projection of feelings (13 + 17) and knowledge (1 + 3) from language, why not the rest? Build a machine with 17 “projections” – each specialized in a type of intelligence – to achieve ADI. Artificial Domain Intelligence.

If it looks like a lobster and walks like a lobster… Still not lobster.

Let’s swap “projections” for Agents and Skills. With some patience and talent, one can build a system with multiple agents (or skills, if you’re short on patience), each like a prism distilling a reflection of an “ability” from the LLM corpus.
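As a rough illustration – the names, the orchestrator and the stand-in model below are all mine, not any particular framework’s API – here is a minimal sketch of the idea: a handful of narrow agents, each wrapping the same backbone behind a different prompt, and a router that casts only the reflections a given case needs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SkillAgent:
    """One "prism": a narrow skill that extracts a single ability from the base model."""
    name: str                     # e.g. "risk_assessment", "causal_reasoning"
    prompt_template: str          # the lens placed in front of the backbone
    model: Callable[[str], str]   # any text-in/text-out backbone (an LLM in practice)

    def run(self, case: str) -> str:
        return self.model(self.prompt_template.format(case=case))

class Orchestrator:
    """Routes a case to the requested skills and collects their reflections."""
    def __init__(self, agents: list[SkillAgent]):
        self.agents = {agent.name: agent for agent in agents}

    def handle(self, case: str, skills: list[str]) -> dict[str, str]:
        return {name: self.agents[name].run(case) for name in skills if name in self.agents}

# Stand-in backbone so the sketch runs without any model or API:
fake_llm = lambda prompt: f"<model answer to: {prompt[:40]}...>"

orchestra = Orchestrator([
    SkillAgent("risk_assessment", "Assess the risks in: {case}", fake_llm),
    SkillAgent("causal_reasoning", "Explain the likely mechanism behind: {case}", fake_llm),
])
print(orchestra.handle("dizziness after starting a new medication",
                       ["risk_assessment", "causal_reasoning"]))
```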

And since we’re clever and didn’t call our product Skynet – we can give it the ability to notice missing abilities, write its own code and fill the gap.

That’s right. I’m talking about AutoGPT and BabyAGI.

Ok, they are ancient history (2023). OpenClaw is the real deal. Thousands of skills, self-evaluation, self-improvement. Even self-funded through open access to your bank account. The road to ADI, no, AGI is paved.

I’m afraid not.

The reason is the nasty “Recursive Entropy” or “Spiral of Complexity” problem: errors compound across iterations. If in 2023 models made 20% errors and collapsed after 3-4 iterations, in 2026 they make only 5% errors (illustrative numbers). The collapse is still inevitable. Only the number of cycles changed. And with it, the life-span of the hype.
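A back-of-the-envelope way to see why a lower per-step error rate only buys more cycles: assume (unrealistically) that each iteration fails independently with a fixed probability, and plug in the illustrative numbers above.

```python
# Illustrative only: if each iteration introduces a critical error with probability p,
# the chance that a chain of n iterations survives intact is (1 - p) ** n.
def survival(p: float, n: int) -> float:
    return (1 - p) ** n

for p in (0.20, 0.05):
    n = 1
    while survival(p, n) >= 0.5:
        n += 1
    print(f"p = {p:.0%}: failure becomes more likely than not after ~{n} iterations")
```

Cutting the per-step error rate from 20% to 5% buys roughly four times as many cycles – not immunity.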

Other problems accelerate the entropy: the measurement of success and the environment in which the system operates. If both are non-deterministic – no single “right solution”, and parameters constantly changing – collapse is inevitable.

An orchestra vs. a hive, and evolutionary algorithms

Two analogies to move forward.

Assistants like OpenClaw and most complex AI tools today use multi-agent architectures with skills. Each component is a prism reflecting narrow “intelligence” from the language model corpus. They’re akin to an orchestra – multiple tools that (ideally) play in harmony.

In “self-improving” systems, when the conductor (be it the architect or user) wants a new tune, say “boom-boom-boom”, a new drum tool gets added – hoping the symphony doesn’t turn into cacophony.

This is no AGI and no ADI, and likely never will be. Not only because of spiraling entropy. Such a system might pass a Turing Test and other reductionist measurements, but that’s all they are: reductions. These systems’ Achilles heel is their granularity and engineering mindset, whereas ADI should be an emergent property that embraces complexity.

What we’re aiming for is a hive. A superorganism.

In a superorganism, each component has limited abilities, but “intelligence” emerges from shared goals and constant adaptation to a changing environment.

Nature uses evolutionary algorithms to solve this challenge – a toy sketch follows the list:

  1. Reinforcement, memory and forgetting
  2. Simulation, exploitation and exploration
  3. Mutations and starvation
  4. Networking, signaling and communication
  5. Symbiogenesis and antagonistic selection

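Here is that toy sketch, with every ingredient a placeholder – in a real hive, fitness would come from the wild, not from a function:

```python
import random

def fitness(genome: list[float]) -> float:
    # Placeholder for "success in the environment"; a real hive measures this in the wild.
    return -sum((gene - 0.7) ** 2 for gene in genome)

def mutate(genome: list[float], rate: float = 0.1) -> list[float]:
    # Exploration: small random perturbations of a surviving configuration.
    return [gene + random.gauss(0, rate) for gene in genome]

population = [[random.random() for _ in range(5)] for _ in range(20)]
for generation in range(50):
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: len(ranked) // 2]                              # starvation and forgetting
    offspring = [mutate(random.choice(survivors)) for _ in survivors]   # reinforcement of what worked
    population = survivors + offspring

print("best genome:", [round(gene, 2) for gene in max(population, key=fitness)])
```
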
Just as “intelligence” emerged in different lineages (mammals, birds, cephalopods) through different paths, so can ADI/AGI.

Swapping one prism with another

A few more things to address.

First – language, high-fidelity as it is, has no way of reflecting certain things.

The most common example: you cannot teach a robot to load a dishwasher using language. Nor how to ride a bike, drive a car, or play a piano (full circle back to the opening – transferring emotions through music).

Tacit knowledge cannot be reflected through language. Neither can drawing conclusions from tabular data.

Tabular Foundation Models (TFMs) and JEPA (Joint-Embedding Predictive Architecture) can replace LLMs at the root of certain agents in the hive. We already do this with diffusion models for image recognition and generation.
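One way to picture swapping a prism: keep the agent shell fixed and make the backbone pluggable, so the same slot in the hive can sit on an LLM, a tabular foundation model, or a JEPA-style encoder, depending on what the domain needs. The interface below is my own sketch, not an existing API:

```python
from typing import Any, Protocol

class Backbone(Protocol):
    """Anything that maps a domain-specific payload to a prediction."""
    def predict(self, payload: Any) -> Any: ...

class LanguageBackbone:
    def predict(self, payload: str) -> str:
        return f"<llm completion for: {payload}>"   # stand-in for a real LLM call

class TabularBackbone:
    def predict(self, payload: list[dict]) -> dict:
        return {"risk_score": 0.42}                 # stand-in for a tabular foundation model

class HiveAgent:
    def __init__(self, name: str, backbone: Backbone):
        self.name, self.backbone = name, backbone

    def act(self, payload: Any) -> Any:
        return self.backbone.predict(payload)

# Same agent shell, different prism underneath.
triage = HiveAgent("symptom_triage", LanguageBackbone())
labs = HiveAgent("lab_trends", TabularBackbone())
print(triage.act("persistent cough, three weeks"))
print(labs.act([{"hb": 11.2, "date": "2026-01-01"}]))
```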

This leaves two other requirements: embracing complexity, and motivation.

Theories are galvanized in the wild

When a system deploys multiple capabilities in different weighted proportions for different use cases – impacted by the user’s opening state (much of it unreliable or unknown), current state and location, changing preferences, previous interactions, plus external parameters like weather, time of day, day of week, season, and psychological environment – you end up with incalculable complexity.
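To make the combinatorics a little more tangible, here is a deliberately crude sketch: a few capabilities whose weights shift with a handful of context signals. All names and numbers are invented, and even this toy already has more distinct states than anyone would care to enumerate by hand.

```python
CAPABILITIES = ["risk_assessment", "empathy", "procedural_guidance", "quantitative_reasoning"]

def capability_weights(context: dict) -> dict:
    # Start from equal weights and nudge them with whatever context is available.
    weights = {capability: 1.0 for capability in CAPABILITIES}
    if context.get("user_state") == "distressed":
        weights["empathy"] *= 3.0
    if context.get("time_of_day") == "night":
        weights["risk_assessment"] *= 1.5
    if context.get("has_lab_results"):
        weights["quantitative_reasoning"] *= 2.0
    total = sum(weights.values())
    return {capability: round(weight / total, 2) for capability, weight in weights.items()}

print(capability_weights({"user_state": "distressed", "time_of_day": "night", "has_lab_results": True}))
```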

Overwhelming, but natural. This is exactly what nature is all about. A hive needs complexity in order to achieve its goals and become robust. You cannot teach a car to self-drive in a lab.

Building blocks, an algorithm for improvement, a natural environment.

Last thing to discuss: Motivation.

A hive operates with a shared motivation: maximizing the proliferation of its members’ genes. It does this through a multitude of workers in different states, in different locations, in all weather, facing multiple adversaries.

An orchestra has no motivation. Even if you spell one out in its growing number of .md files, it lacks the algorithms for growth (and inhibition) and the complex environment to adapt to.

Blocks, algorithms, the wild, motivation.

Now that we have the recipe for building an ADI, let’s make sure we set the motivation to wellness.

Definitely not the creation of paper clips.