Emergent behaviors: when artificial intelligence surprises even its creators

Since ChatGPT appeared, a debate that had been somewhat dormant in recent years has reignited: when will artificial intelligences reach the level of human beings? Is there already a trace of consciousness in them? Is this a technology that risks proving dangerous, to the point of representing, as some currents of thought maintain, an “existential risk”?

These are expectations and fears which, barring sensational surprises, can be considered excessive: ChatGPT and the other LLMs (the acronym stands for Large Language Model) are not endowed with any form of intelligence, or even of understanding. What they do have is the ability to find, often very accurately, the sequence of sentences that is statistically most likely to answer our requests or to follow on coherently from one of our statements.
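To make that idea concrete, here is a deliberately tiny sketch of what “predicting the most likely continuation” means. It uses a toy bigram count over an invented corpus rather than a real neural network, so every name and sentence in it is illustrative; an actual LLM does the same kind of job with billions of learned parameters instead of a frequency table.

```python
from collections import Counter

# Toy illustration of "predict the most likely continuation":
# count which word follows which in a tiny corpus, then turn the
# counts into probabilities. Real LLMs learn far richer statistics,
# but the principle of ranking continuations by likelihood is the same.

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram frequencies: how often word B follows word A
bigrams = Counter(zip(corpus, corpus[1:]))

def next_word_probs(word):
    """Return the probability distribution over words that follow `word`."""
    candidates = {b: c for (a, b), c in bigrams.items() if a == word}
    total = sum(candidates.values())
    return {w: c / total for w, c in candidates.items()}

print(next_word_probs("the"))
# e.g. {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```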

This, however, does not mean that everything always goes according to plan. Not only because ChatGPT and its peers still make plenty of mistakes and are prone to various kinds of hallucinations (the jargon term for situations in which the artificial intelligence confidently produces completely invented claims), but also because the behavior of the more powerful models is often able to surprise their own programmers.

For example, during a test organized by, among others, Ethan Dyer, a computer scientist at Google Research, sequences of emojis reproducing a film title were shown to various LLMs. While the more limited models gave nonsensical answers, the more powerful ones were able to guess the film in question, often on the first try (for example, correctly recognizing that the emojis of a fish and a little girl could point to the film Finding Nemo).

A surprising ability, considering that these models are supposed to have only one goal: to take a sequence of text as input and predict (on a purely statistical basis) what a coherent continuation of that text might be. Nothing suggested they would develop the ability to interpret emojis: “Despite trying to prepare for surprises, I’m always amazed at what these models are capable of,” Dyer commented.

These unexpected skills are called “emergent”: capabilities that only the largest models suddenly develop and which often have little to do with text analysis. In some cases, for example, asking the algorithms to explain the reasons for their decisions (developing a chain of thought) allows them to solve mathematical problems they would otherwise have been unable to solve correctly.
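As a concrete illustration of the chain-of-thought idea, here is a minimal sketch of the same question phrased two ways; the question and wording are invented for the example, and both prompts would simply be passed to whichever model one wants to test.

```python
# A minimal sketch of chain-of-thought prompting: the same arithmetic question
# asked two ways. The question is invented for illustration; both prompts would
# be sent to whichever LLM one wants to test.

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

# Direct prompt: just ask for the answer.
direct_prompt = f"{question}\nAnswer with a single number."

# Chain-of-thought prompt: ask the model to spell out its reasoning first.
chain_of_thought_prompt = (
    f"{question}\n"
    "Let's think step by step: explain your reasoning, then give the final answer."
)

print("--- direct ---")
print(direct_prompt)
print("--- chain of thought ---")
print(chain_of_thought_prompt)
```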

But there are also downsides. “As models improve their performance by growing in size, the likelihood of harmful or biased behavior grows with it,” Quanta Magazine reports. “Larger models can suddenly become more biased and distorted,” confirmed Deep Ganguli, a researcher at the startup Anthropic, again speaking to Quanta.

More generally, this type of behavior points to so-called zero-shot or few-shot learning: the ability to solve certain problems even though the system has never (or only rarely) encountered them before. It is a skill long sought after in the field of AI, whose systems had never before been able to learn anything on their own.
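A minimal sketch of the difference, using an invented sentiment-labelling task purely for illustration: in the zero-shot case the model only receives the instruction and the new case, while in the few-shot case a handful of solved examples precede it.

```python
# Zero-shot vs few-shot prompting, with an invented sentiment-labelling task.
# The example sentences are made up; the prompts would be sent to the LLM under test.

task = "Classify the sentiment of the sentence as positive or negative."
query = "Sentence: The film was a complete waste of time.\nSentiment:"

# Zero-shot: only the instruction and the new case.
zero_shot_prompt = f"{task}\n{query}"

# Few-shot: a handful of solved examples precede the new case.
few_shot_prompt = (
    f"{task}\n"
    "Sentence: I loved every minute of it.\nSentiment: positive\n"
    "Sentence: The plot made no sense at all.\nSentiment: negative\n"
    f"{query}"
)

print(zero_shot_prompt)
print("---")
print(few_shot_prompt)
```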

But now comes the difficult part: researchers need to figure out why this is happening. One possible explanation is that the more complex models really are spontaneously gaining new abilities: “It could very well be that these systems have learned something fundamentally new and different, which they did not possess when they were smaller,” explained computer scientist Ellie Pavlick.

The second possibility is less glamorous, but just as important: emergent skills could be the culmination of a purely internal statistical process. According to this interpretation, the more complex systems become so good at finding correlations within enormous amounts of data (exploiting the larger number of parameters they are equipped with, and perhaps their better quality too) that they achieve results that were not initially foreseen.

Which of the two explanations is correct? “As long as we don’t know exactly what goes on inside these systems, we can’t even say which of the two scenarios it is,” Pavlick also explained. In short, at least for the moment, artificial intelligence will keep surprising us, without our even being able to understand what is really happening.
