Rhymes AI releases Allegro

🎬 Rhymes AI just released Allegro, the very first open-source commercial-grade video generation model!

It matches or beats most commercial solutions, ranking just behind Hailuo and Kling in overall quality.

Key insights:

🌟 Innovative two-stage architecture

▸ Video VAE converts pixels to compressed visual tokens (4x8x8 compression)

▸ Video Diffusion Transformer predicts frames with self- and cross-attention
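
To make the 4x8x8 figure concrete, here is a minimal sketch of the arithmetic, assuming the three factors apply to time, height and width in that order and that sizes are simply floor-divided. The clip size used below (88 frames at 720x1280) and the function names are illustrative assumptions, not values from this post, and the real VAE may handle the first frame or the channel count differently.

```python
def latent_shape(frames, height, width, factors=(4, 8, 8)):
    """Approximate latent grid size for a clip under T x H x W compression.

    Assumption: plain floor division on each axis; channels are ignored.
    """
    ft, fh, fw = factors
    return (frames // ft, height // fh, width // fw)


def position_reduction(factors=(4, 8, 8)):
    """How many pixel positions collapse into one latent position."""
    ft, fh, fw = factors
    return ft * fh * fw


# Hypothetical clip: 88 frames at 720x1280 (illustrative only).
print(latent_shape(88, 720, 1280))
print(position_reduction())
```

Under these assumptions, each latent position stands in for 4 x 8 x 8 = 256 pixel positions, which is what keeps the diffusion transformer's sequence length manageable.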

📚 Training dataset

▸ 106M images, 48M videos with closely matched captions

▸ Complex filtering pipeline for top quality: aesthetics, motion, semantics…

🚀 Progressive training in 4 stages

▸ Image pre-training

▸ Basic video at 360p

▸ HD video at 720p

▸ Fine-tuning on premium data

📈 Results put it among the very best

▸ Beats all open-source models on the VBench benchmark

▸ Only behind Hailuo & Kling among commercial models

▸ 91% of users preferred it to Dream Machine, 96% preferred it to OpenSora

This is huge: we now have detailed insights into what makes high-end video generation tick, from data curation to architecture choices. And it’s all open-source under Apache 2.0, available on GitHub and HF!

📸 Demo gallery 👉 rhymes.ai/allegro_gallery

Model on the Hub 👉 https://huggingface.co/rhymes-ai/Allegro
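
If you want the files locally, one possible way is the `snapshot_download` helper from the `huggingface_hub` library. This is a sketch, not the official instructions: the repo ID is read off the URL above, while the `download_allegro` wrapper and the `allegro_weights` folder name are hypothetical choices made for illustration.

```python
REPO_ID = "rhymes-ai/Allegro"  # taken from the Hub URL in this post


def download_allegro(local_dir: str = "allegro_weights") -> str:
    """Download the full model repo and return the local folder path.

    Assumption: `huggingface_hub` is installed (pip install huggingface_hub).
    The import lives inside the function so the module loads without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_allegro()` fetches every file in the repo, so expect a large download; check the model card for the intended loading code.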
