Google Unveils VideoPoet: A New AI Video Generation Model

  • New AI technology by Google offers advanced video generation capabilities.
  • The cutting-edge method turns a language model into a video generator.
Posted: 12.25.2023
Google VideoPoet: AI Revolution in Dynamic Video Creation

Last week Google unveiled a new technology called VideoPoet, reflecting the company’s ongoing work on AI development. According to Google, the method can turn any autoregressive language model or large language model (LLM) into a high-quality video generator. VideoPoet delivers state-of-the-art video generation, especially when it comes to producing a wide range of high-quality motions.

VideoPoet's multifunctional model can perform many tasks, including animating still images, editing videos through inpainting or outpainting, and generating audio from video. It accepts text, images, or videos as input and supports text-to-video, image-to-video, and video-to-audio conversion. Its main benefit is versatility: combining several capabilities in a single model simplifies a variety of video production tasks. Unlike many other video generation systems, VideoPoet works with discrete tokens, as language models do, relying on tokenizers such as SoundStream for audio and MAGVIT V2 for images and video.
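To make the idea of discrete tokens more concrete, here is a minimal sketch of how token IDs from separate tokenizers might be placed into one sequence that a language model can read. Everything in it is an assumption made for illustration: the codebook sizes, the offset scheme, and the function names are hypothetical and are not taken from Google's implementation, and the integer lists merely stand in for the output of real tokenizers such as MAGVIT V2 or SoundStream.

```python
from typing import Dict, List, Tuple

# Hypothetical codebook sizes; the real tokenizers use their own sizes.
VOCAB_SIZES: Dict[str, int] = {"text": 1000, "video": 4096, "audio": 1024}


def build_offsets(sizes: Dict[str, int]) -> Tuple[Dict[str, int], int]:
    """Give each modality its own ID range inside one shared vocabulary."""
    offsets: Dict[str, int] = {}
    start = 0
    for name, size in sizes.items():
        offsets[name] = start
        start += size
    return offsets, start


OFFSETS, TOTAL_VOCAB = build_offsets(VOCAB_SIZES)


def to_shared_ids(modality: str, local_ids: List[int]) -> List[int]:
    """Shift tokenizer-local IDs into the modality's range of the shared vocabulary."""
    size = VOCAB_SIZES[modality]
    base = OFFSETS[modality]
    for token in local_ids:
        if not 0 <= token < size:
            raise ValueError(f"token {token} is out of range for {modality}")
    return [base + token for token in local_ids]


def build_sequence(segments: List[Tuple[str, List[int]]]) -> List[int]:
    """Concatenate segments from several modalities into one token sequence."""
    sequence: List[int] = []
    for modality, local_ids in segments:
        sequence.extend(to_shared_ids(modality, local_ids))
    return sequence


# Stand-in token IDs: a text prompt, a few video tokens, one audio token.
sequence = build_sequence([("text", [5, 17]), ("video", [0, 4095]), ("audio", [3])])
print(sequence)  # [5, 17, 1000, 5095, 5099]
```

Because every modality ends up as plain integers in one sequence, the same next-token prediction used for text can, in principle, be applied to video and audio as well.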

Animating Still Images, Altering Movies for Inpainting or Outpainting

VideoPoet's strong grasp of context and content shows in its ability to create videos with a variety of motions and styles from specific text prompts. Whether animating a painting or generating a clip from a descriptive text, the model is notably good at preserving the integrity and appearance of objects over longer durations. Google AI reports that the model can produce videos in either portrait or square orientation, which suits short-form content. It can also generate audio from a video input.

One noteworthy feature of VideoPoet is interactive video editing. Users can direct the model to change motions or actions within a video, which gives them a great deal of creative control. The model can also accurately follow camera motion commands, which makes it more useful for producing dynamic and visually appealing footage. In addition, VideoPoet can generate plausible audio for a generated video without any user guidance, which demonstrates its strong multimodal understanding.

VideoPoet produces 2-second videos by default. However, it can also predict one second of video output given a one-second video clip as input. Repeating this process makes it possible to create a video of any length.
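The chaining described above can be sketched as a simple loop. This is only an illustration under stated assumptions: `predict_next_second` is a hypothetical stand-in for the model, frames are represented as plain integers, and the frame rate of 8 frames per second is an assumed value chosen for the example.

```python
from typing import Callable, List

Frame = int  # stand-in for a real video frame


def extend_video(
    seed_clip: List[Frame],
    predict_next_second: Callable[[List[Frame]], List[Frame]],
    target_seconds: int,
    fps: int = 8,  # assumed frame rate for this sketch
) -> List[Frame]:
    """Grow a clip one second at a time, conditioning on its last second."""
    if len(seed_clip) < fps:
        raise ValueError("seed clip must contain at least one second of frames")
    video = list(seed_clip)
    while len(video) < target_seconds * fps:
        context = video[-fps:]  # the most recent second of video
        video.extend(predict_next_second(context))  # append the predicted second
    return video[: target_seconds * fps]


def dummy_model(context: List[Frame]) -> List[Frame]:
    """Placeholder 'model' that simply continues the frame counter."""
    last = context[-1]
    return [last + i + 1 for i in range(len(context))]


clip = extend_video(list(range(8)), dummy_model, target_seconds=5)
print(len(clip))  # 40
```

Each pass of the loop only sees the most recent second, so the cost per step stays constant no matter how long the final video becomes.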

Although VideoPoet's output still falls well behind the video generation tools from Runway AI and Pika, it shows how far Google has come in AI-based video creation and editing.

To demonstrate the capabilities of VideoPoet, Google’s team has put together a short film made up of numerous brief clips generated by the model. They asked the Bard AI chatbot to write a set of prompts for the script, telling a short story about a traveling raccoon. Here, you can see the AI-generated video that VideoPoet was able to produce.

An AI-generated movie by VideoPoet, a large language model
Nataliia Huivan
Professional author in IT Industry

