“Artificial intelligence creates a video for you”: here is Meta's new service


You type a few sentences and an artificial intelligence turns them into a video, for professional use. This is Meta's latest move, which renews the contest around the new "generative" AI models known as large language models, systems able to create text, images and now even video from user input. The best-known service in this area is OpenAI's DALL-E 2, but the field is quickly getting crowded and pushing toward new horizons, as Meta's announcement demonstrates.
The service is called Make-A-Video. It builds on Meta AI's recent advances in generative technology and is aimed at creators and artists. The system learns what the world looks like from text data paired with images, and how the world moves from video footage with no associated text. "As part of our ongoing commitment to open science, we are sharing the details in a paper and plan to release a demo experience," Meta writes in the announcement.
Earlier this year Meta presented Make-A-Scene, based on the same family of generative AI models, which creates photorealistic and artistic illustrations from words, lines of text and freehand sketches.
DALL-E, now in version 2, generates images from text and is available in beta. Unlike Meta's products, it is already commercial: a $15 monthly subscription buys credits with which to create a few hundred images. Access currently requires getting through a waitlist.
OpenAI now benefits from Microsoft's backing, while Google presented Imagen a few months ago, without however giving many details on what it intends to do with it.
There are also Craiyon and Nvidia's GauGAN, both of which turn text into realistic images.
And in August an open-source model, Stable Diffusion, was released.
Underlying these services are text-driven generative adversarial networks, commonly called GANs. A GAN consists of two competing neural networks: a generator, whose goal is to create images that are as realistic as possible, and a discriminator, whose task is to recognize whether the images produced by the generator are fake. The same technique is used to produce deepfakes.
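To make the generator-versus-discriminator idea concrete, here is a minimal sketch in PyTorch. The framework choice, network shapes and hyperparameters are illustrative assumptions only, not the architecture of any system named in this article:

```python
# Toy GAN sketch: a generator learns to produce realistic samples,
# a discriminator learns to tell real samples from generated ones.
# All sizes and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # e.g. a flattened 28x28 image

# Generator: maps random noise to a fake sample.
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),
)

# Discriminator: outputs the probability that a sample is real.
D = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_batch: torch.Tensor) -> None:
    batch_size = real_batch.size(0)
    ones = torch.ones(batch_size, 1)    # label for "real"
    zeros = torch.zeros(batch_size, 1)  # label for "fake"

    # 1) Train the discriminator: real samples -> 1, fakes -> 0.
    noise = torch.randn(batch_size, latent_dim)
    fake_batch = G(noise).detach()  # don't backprop into G here
    d_loss = loss_fn(D(real_batch), ones) + loss_fn(D(fake_batch), zeros)
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # 2) Train the generator: try to make D label its fakes as real.
    noise = torch.randn(batch_size, latent_dim)
    g_loss = loss_fn(D(G(noise)), ones)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

# One step on dummy "real" data standing in for a training set.
train_step(torch.randn(32, data_dim))
```

The adversarial pressure is the point: the discriminator's feedback pushes the generator toward ever more convincing output, the same dynamic that enables both high-quality image synthesis and deepfakes.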
The models, and the companies behind them, are racing to improve the quality and reliability of these products. The new version of DALL-E, for example, maintains strong semantic consistency in understanding the relationships between the objects in an image: "a person sitting on a horse" produces a rider on the saddle, not on the horse's head. The models also progress thanks to large databases of correctly captioned images.
The idea behind these efforts is to create products that could profoundly change the trades and markets tied to the production of text, images and video, such as advertising and marketing.
However, the companies' approaches differ.
Meta, as mentioned, currently follows the principles of open science, in order to share the benefits of the technology with more people.
At the same time, all of the players are aware of the associated challenges, such as not enabling the production of dangerous or misleading material (deepfakes above all); for this reason they use algorithms and teams of human moderators to monitor how these products are used.
The question of work is even more complex: how can these systems coexist with today's professions without destroying human value? The concern surfaced in recent days with the protests of various graphic artists against works created with Stable Diffusion. Meanwhile, as these social challenges come into focus with no clear solution yet, the technology continues to evolve.
