AI & Video | Technical Development | May 27, 2026
Meituan opens LongCat-Video-Avatar 1.5 with 8-step lip sync
Meituan has open-sourced version 1.5 of its LongCat-Video-Avatar framework, which generates photorealistic digital human videos. This updated framework achieves state-of-the-art lip-sync accuracy using only eight inference steps.
Key Takeaways
- Meituan has released LongCat-Video-Avatar 1.5 as open source.
- The framework generates photorealistic digital human videos.
- Version 1.5 reaches state-of-the-art lip-sync accuracy.
- The model reaches that lip-sync accuracy with only 8 inference steps.
Why It Matters
Meituan’s release makes a photorealistic digital-human video framework available as open source, with state-of-the-art lip-sync accuracy reached in just 8 inference steps. That matters for streaming and video tooling because fewer inference steps lower the compute burden of producing avatar-based video, while better lip sync addresses one of the most visible failure modes in synthetic presenters. The key signal to watch next is whether Meituan publishes additional benchmarks or demos beyond the lip-sync result for LongCat-Video-Avatar 1.5.
Read full article at pandaily.com
