2023 was the year of generative AI, but more specifically, the year we witnessed the power and potential of LLMs, large language models. A lot of the world of work is based around text: documents, email, content, media. Both startups and big tech companies leaned in hard, incorporating automation tools and generative AI applications across verticals.
Visual generative AI made strides as well. Midjourney V6, which was released in December 2023, and OpenAI’s DALL-E 3 both offered a step change in image creation.
But the next frontier is video. Progress in generative AI technologies for video has also been moving very fast, but it’s often less talked about than text and images, which already have products with massive consumer adoption.
Generative AI in video consists of several buckets:
- Automated video editing (includes Descript)
- Talking avatars – text to video (includes companies like HourOne, Synthesia, HeyGen)
- Video footage generation (i.e. moving images) from a prompt
This post focuses on video footage generation.
Timeline of Generative AI for video progress in 2023
a16z partner Justine Moore posted an excellent X thread on the advances in generative AI for video right before the end of the year.
As Justine’s timeline shows, the big players in this space are the large tech platforms: Google, Meta and Nvidia in the US, and ByteDance, Alibaba and Baidu in China. While Google and Meta have shared that they’re working on AI video generation, they have yet to release their products to the public.
The big tech players are well positioned to lead in this space given their access to deep learning talent, unlimited cloud resources and deep pockets. Google Brain recently open-sourced Phenaki, a video diffusion model that points towards YouTube’s internal capabilities. It’s capable of producing a two-minute AI-generated video using a sequence of prompts. Meta’s Make-A-Video builds on the recent progress in text-to-image generation technology to enable text-to-video generation. Many other papers in this space were published in 2023.
On the startup front, up-and-coming players like PikaAI and RunwayML offer very short but high-quality video creation tools. And then there are open-source options like Stability.ai’s Stable Video Diffusion, released in November 2023.
RunwayML is targeting Hollywood and AI filmmaking
Another tool worth calling out, which generates videos from images, is FinalFrame. Here’s my video for “Panda bear surfing in Hawaii”
AI that makes everybody dance, using a picture
Justine Moore tracked 21 publicly available products that let users generate AI video footage (you can check them out in this Google doc created by Justine). Note that most of these tools generate very short videos (up to 16 seconds).
With enough data and compute, photorealistic, interactive video generation seems within reach. As an investor in generative AI and interactive entertainment, I find this an incredibly exciting time for the generative AI video field, as these models begin crossing the threshold of usefulness. However, significant challenges remain around bias, misinformation, and intellectual property, in addition to the as yet unknown impact of incoming regulation. Investors also have a tough question to ask: is generative AI a real platform shift, or are we in a bubble?