Anthropic admits it doesn’t understand how its AI works: CEO Dario Amodei makes a surprising revelation

Artificial intelligence is making ever greater strides, but its inventors are the first to admit that they don’t fully understand how it works. Dario Amodei, CEO of Anthropic, the company developing the AI assistant Claude, recently admitted in a surprising statement that his team still doesn’t know how its systems actually work. A revelation that raises questions about the future of this technology.

The mysterious “black box” of artificial intelligence

The world of artificial intelligence sometimes resembles a giant scientific experiment, the results of which are beyond our comprehension. And that is no exaggeration. The head of Anthropic, one of the leading companies in the field of generative artificial intelligence, has just made a confession that might surprise you: He doesn’t really understand how his own creation works.

In his blog, Dario Amodei talks about this fact with a candor that is rare in the industry. He believes that the public is right to be concerned about this situation. “People outside the industry are often surprised and worried when they learn that we don’t understand how our artificial intelligence works. And they are right to be concerned: this lack of understanding is unprecedented in the history of technology,” he explains.

Can you imagine using a tool every day whose own inventors can’t explain how it works? That is the current situation with artificial intelligence models such as Claude, ChatGPT and Gemini.

This opacity is not the result of carelessness, but rather the consequence of a unique development approach. Generative AI is not developed like conventional software with precisely defined lines of code. It is created in a complex learning process that sometimes gives it unforeseen properties.
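A minimal sketch can make that contrast concrete (an illustrative toy in Python, not how Anthropic actually builds its models): in conventional software the rule is written out by a programmer, whereas in a trained model the behavior lives in parameters that were adjusted automatically to fit examples.

```python
import numpy as np

# Conventional software: the rule is stated explicitly by a programmer.
def fahrenheit_explicit(celsius):
    return celsius * 9 / 5 + 32

# A "learned" version of the same rule: two parameters (w, b) are fitted to
# example data by gradient descent. Nobody writes 1.8 or 32 into the code;
# those values emerge from the training process.
rng = np.random.default_rng(0)
celsius = rng.uniform(-40, 100, size=1_000)
fahrenheit = celsius * 9 / 5 + 32            # the examples the model learns from

w, b = 0.0, 0.0                              # parameters start out knowing nothing
lr = 1e-4
for _ in range(100_000):
    error = (w * celsius + b) - fahrenheit
    w -= lr * 2 * np.mean(error * celsius)   # gradient of mean squared error w.r.t. w
    b -= lr * 2 * np.mean(error)             # gradient of mean squared error w.r.t. b

print(round(w, 3), round(b, 3))              # approaches 1.8 and 32.0
```

With two parameters, the result is easy to interpret after the fact; with billions of them, as in Claude or ChatGPT, that kind of inspection becomes the open research problem Amodei describes.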

Why AI developers don’t understand their own systems

This situation seems paradoxical: how can companies like Anthropic develop AI systems without understanding how they work? Dario Amodei offers an illuminating explanation for this phenomenon.

“When a generative AI system performs an action, such as summarizing a financial document, we don’t know exactly why it makes certain decisions: why it chooses certain words and not others, or why it sometimes makes a mistake despite being generally accurate,” he explains.

The CEO of Anthropic uses a biological metaphor to explain the situation: Generative artificial intelligence systems are cultivated rather than engineered. Their internal mechanisms emerge rather than being directly designed. It is comparable to cultivating a plant or a bacterial colony: the general conditions that control growth are fixed, but the exact structure that develops remains unpredictable and difficult to explain.

To better visualize this phenomenon, imagine a gardener planting a seed and creating the ideal conditions for its growth. He can influence the general shape through his actions, but he cannot determine exactly how each individual branch will develop.

The riddle of the matrices

What makes AI so difficult to understand is its internal structure. When researchers look inside these systems, they see huge matrices with billions of numbers. These matrices perform important cognitive tasks, but how they do this remains unclear.
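To picture what researchers see when they open the box, here is a deliberately tiny sketch (random weights, nothing like Claude’s real architecture): even in a toy network, the “knowledge” is spread across grids of floating-point numbers, and reading those numbers directly tells you very little about why a given input produces a given output.

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy two-layer network. Frontier models have the same kind of structure,
# but with billions of entries spread across many such matrices.
W1 = rng.standard_normal((4, 16))    # input -> hidden weights
W2 = rng.standard_normal((16, 3))    # hidden -> output weights

def forward(x):
    hidden = np.maximum(0, x @ W1)   # matrix multiply followed by a ReLU nonlinearity
    return hidden @ W2               # another matrix multiply

x = np.array([0.2, -1.3, 0.7, 0.05])
print(forward(x))    # the output is perfectly well defined...
print(W1[:2, :5])    # ...but the raw numbers do not explain why it is what it is
```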

According to Amodei, this opacity underlies many of the risks and concerns associated with generative AI. These problems would be much easier to solve if the models were interpretable.

The race for interpretability: a crucial question for the future

Given this situation, Anthropic has made interpretability a top priority. But what exactly is interpretability? It is the ability to understand and explain the decisions of an artificial intelligence system, to see what is happening in the “black box”.
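One very crude way to get a feel for what interpretability work tries to do (a toy probe only; Anthropic’s actual techniques are far more sophisticated) is to perturb parts of the input and watch how the output of the “black box” shifts:

```python
import numpy as np

rng = np.random.default_rng(7)

# A toy "black box": a tiny network with fixed random weights.
W1 = rng.standard_normal((5, 8))
W2 = rng.standard_normal((8, 1))

def black_box(x):
    return (np.maximum(0, x @ W1) @ W2).item()

x = np.array([1.0, -0.5, 2.0, 0.3, -1.2])
baseline = black_box(x)

# Ablation probe: zero out one input feature at a time and measure how much
# the output moves. A large shift hints that the feature mattered for this
# particular decision -- a first, very limited step toward an explanation.
for i in range(len(x)):
    ablated = x.copy()
    ablated[i] = 0.0
    print(f"feature {i}: output shifts by {black_box(ablated) - baseline:+.3f}")
```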

The CEO of Anthropic believes that interpretability of the most advanced AI could become a reality in the next five to ten years. That may seem like a long time, but it reflects the complexity of the challenge.

Amodei also calls for a general mobilization on this topic, urging other players in the field, such as OpenAI and Google, to devote more resources to it.

“Interpretability gets less attention than the constant stream of new models, but it is arguably more important,” he emphasizes. This observation reflects a growing self-awareness in an industry that is often driven by a race for performance rather than deep understanding.

But how can this quest for interpretability be pursued? Amodei suggests several ways:

  • Greater mobilization of resources within AI companies
  • Flexible government rules that encourage interpretability research
  • Export controls on advanced chips, to give interpretability researchers more time to catch up with model capabilities

These proposals reflect an approach that combines private sector efforts with a public regulatory framework.

Encouraging progress despite the challenge

Despite the scale of the challenge, significant progress has already been made. Last March, Anthropic researchers shared their findings on how their AI “thinks”. A small victory in the great battle for interpretability.

This research is comparable to the invention of an “MRI of artificial intelligence”, a tool that makes it possible to visualize what is going on in AI models.

The efforts to understand these systems are all the more urgent as the models are becoming more powerful every day. The current situation is quite unique: we are developing tools whose capabilities are increasing rapidly, while our understanding of how they work is progressing more slowly.

A plea for a more measured approach to AI development

With his statements, Dario Amodei seems to be advocating a more considered approach to AI development. Instead of rushing to develop ever more powerful models, he suggests taking the time to understand the models that already exist.

This position goes against the grain of an industry often characterized by a frantic race for innovation. It raises the question of what responsibility AI developers bear for the technologies they put in our hands.

Have you ever wondered what guarantees we have that these systems will work as they should? Without a thorough understanding of their inner workings, this question remains unanswered.

Implications for users

For us users of these technologies, Amodei’s revelations are both fascinating and unsettling. They remind us that despite its impressive capabilities, artificial intelligence is still an evolving technology with gray areas and uncertainties.

This situation calls for caution when integrating AI into our lives and decision-making processes. If even its creators do not fully understand it, perhaps we should keep a critical eye on its results:

  • Scrutinize the information provided by AI instead of accepting it blindly
  • Be aware of the limitations of these systems
  • Follow progress in interpretability to better understand the tools we use

I find the gap between our expectations of AI and the reality of its development particularly interesting. We often imagine perfectly mastered machines, when in reality they are partly the result of emergent processes that their creators struggle to explain.

The future of interpretability: a puzzle to be solved

Interpretability is one of the most fascinating challenges of modern AI. It means, in effect, decoding the “brain” of these systems to understand how they reach their decisions.

Advances in this area could radically change our relationship with AI. A better understanding would improve the reliability of the systems, reduce the risks associated with their use and pave the way for even more sophisticated applications.

Amodei’s candor about the current limitations is refreshing in a field that often extols the virtues of AI without mentioning its downsides. It suggests that the industry’s approach to the challenges posed by this technology is maturing.

There is still a lot of work to be done, but the teams at Anthropic and other companies in the industry are actively working on it. The black box of AI will not stay that way forever.

In the meantime, these revelations remind us that despite its name, AI remains a human creation with its mysteries and shortcomings. A fascinating technology that sometimes advances faster than our ability to understand it.