Meta Builds “Mango” to Scale Image and Video Automation

The next automation wave won't be judged by how quickly a model writes an email. It'll be measured by how fast it can turn a single idea into endless video for feeds, ads, training, and product communication, all without a studio, a crew, or a calendar full of edits.
That is the bet behind Meta's newest rumored roadmap. The Wall Street Journal reports that Meta is developing a "Mango" model, with an internal target of releasing the image- and video-generation model in the first half of 2026. In the same internal Q&A described by the Journal, Meta leaders also discussed a text model named "Avocado" that's being tuned to be stronger at coding, plus early work on "world models" aimed at interpreting visual data and understanding environments.
Meta is treating this like a flagship push, not a side project. Mango and Avocado sit inside the company's new Superintelligence Labs, led by Chief AI Officer Alexandr Wang.
Why “Mango” matters more than a codename
If Mango ships as a high-quality image+video model, it’s not just a creator novelty. It becomes a production engine that changes the cost and speed of communication across industries:
Marketing: Always-on creative testing, dozens of video variants per campaign (hooks, formats, seasonal swaps, localized versions) instead of monthly batches.
Sales: Personalized demo videos generated from a product catalog + CRM context, updated instantly as pricing, features, or competitors change.
Healthcare: Faster production of patient-education and staff-training content; useful, but only if paired with strict review and governance.
Legal: A bigger compliance surface covering rights, provenance, disclosures, and internal approval flows for synthetic media.
Industry: Training and SOP videos at scale for field teams and operations, where consistency matters as much as speed.
Avocado is the companion bet. If Mango automates what gets produced, Avocado is meant to automate more of the code-heavy workflow behind shipping and maintaining it at scale.
That’s the broader trend: automation is moving from “text that helps work” to media that is the work.
The Vibes connection isn’t cosmetic
Meta has already been laying the product runway for AI video. In September 2025, Meta announced an early preview of Vibes, a feed in the Meta AI app and on meta.ai where people can create and share short-form, AI-generated videos.
Why does that matter for Mango? Because a model doesn't win just by existing; it wins by being deployed where creation happens, where Meta can measure retention, remix behavior, failure cases, and creator preferences. A stronger underlying model turns Vibes from a demo into a scalable loop: generate → share → learn → improve.
This also fits Meta’s longer media-model arc
Meta’s research has been pointing toward multimodal generation for a while. In 2024, it introduced Movie Gen as a “media foundation models” research direction spanning image, video, and audio generation and editing. Mango reads like the productized, competitive next step.
The competitive pressure is clear
Barron’s frames Mango as Meta trying to catch up in an AI media arms race where Google, OpenAI, and Adobe have already pushed hard on video and creative tooling. Whether Mango beats rivals on quality is unknown today, but Meta’s timeline says it can’t afford to sit out 2026.
What remains is to watch how Meta packages Mango. If it lands inside Meta AI products (and eventually business tooling), the automation story shifts again: from "generate content" to "generate content continuously," with the speed and volume that modern distribution demands.
Y. Anush Reddy is a contributor to this blog.