Wan 2.2 Image Juiced: The Next Level for Efficient 2-Megapixel Image Generation

Announcement

Aug 4, 2025

John Rachwan

Cofounder & CTO

Nils Fleischmann

ML Research Engineer

Johanna Sommer

ML Research Engineer

Bertrand Charpentier

Cofounder, President & Chief Scientist

Most AI image generation models are optimized for 1-megapixel resolution, which falls short for many high-demand applications. The need for 2-megapixel image generation is clear. Following the success of Wan 2.1 Image, based on Wan 2.1 Video (see our previous blog), we’re proud to introduce Wan 2.2 Image, now based on Wan 2.2 Video. This new version is faster, cheaper, and even more efficient. With Wan 2.2 Image Juiced, you can generate 2-megapixel images in just 3.1 seconds for only $0.02, all on a single H100.

Want to Try Wan 2.2 Image?

You can experiment with Wan 2.2 Image on our official Replicate endpoint: Wan 2.2 Image on Replicate.
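If you prefer calling the endpoint from code, a minimal sketch with the Replicate Python client could look like the following. The model slug and the input field are assumptions made for illustration, not values taken from this post: check the model page on Replicate for the real identifier and input schema.

```python
# Minimal sketch, assuming the `replicate` package is installed
# (pip install replicate) and REPLICATE_API_TOKEN is set in the environment.

# Hypothetical placeholder: replace with the identifier shown on the
# Replicate model page. It is NOT taken from the article.
MODEL_SLUG = "your-org/wan-2.2-image"


def build_input(prompt: str) -> dict:
    """Build the request payload.

    Assumption: the endpoint accepts a single "prompt" field. Any other
    fields depend on the endpoint's schema and are left out here.
    """
    return {"prompt": prompt}


def generate(prompt: str):
    """Run the model once and return whatever the endpoint returns."""
    import replicate  # imported lazily so the helpers work without the package

    return replicate.run(MODEL_SLUG, input=build_input(prompt))


# Example usage (requires network access and a valid API token):
# output = generate("a lighthouse on a cliff at golden hour")
# print(output)
```

The payload builder is kept separate from the network call so the request can be inspected or tested without spending credits.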

From Wan 2.1 Image, to Wan 2.2 Image…

Since Wan 2.2 Video surpasses its predecessor, Wan 2.1 Video, we decided to build a Wan 2.2 Image model that surpasses Wan 2.1 Image. Just as we turned Wan 2.1 Video into the powerful Wan 2.1 Image model, we turned Wan 2.2 Video into the powerful Wan 2.2 Image model by pruning the video components and applying the best compression methods from the Pruna package.

What’s new? While Wan 2.1 Image was known for its cinematic realism, it sometimes leaned toward a cartoonish style. Wan 2.2 Image solves this problem, offering even more diverse and artistic image generation.

What is Wan 2.2 Image good for?

Ultra-high-quality images at up to 2-megapixel resolution.
Capable of high-quality realistic results as well as artistic ones!

Check the quality yourself below! If you want more, you can view a full grid of 100 images generated by Wan Image and Wan Image Juiced on this page.

Wan 2.2 Image Juiced: Faster and Cheaper 2-Megapixel Images!

While still running on a single H100, Wan 2.2 Image Juiced is even faster than its predecessors. Wan 2.2 Image generates one 2-megapixel image 2.4x faster than Seedream, 1.8x faster than FLUX 1.1 Pro, and 1.1x faster than the already fast Wan 2.1 Image. For comparison, Wan 2.2 Image can generate two 2-megapixel images in the time it takes Seedream to generate a single 1-megapixel image. This means you don’t have to choose between speed and resolution.
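To put these ratios in concrete terms, here is a rough back-of-the-envelope sketch. It assumes that "Nx faster" means the other model's latency is N times ours, and that the 3.1-second figure quoted above is the reference for all three ratios. The implied latencies are therefore estimates, not measured values.

```python
# Rough estimate only: latencies are derived from the quoted speedups,
# not measured directly.
WAN22_LATENCY_S = 3.1  # seconds per 2-megapixel image on a single H100

# Speedup of Wan 2.2 Image relative to each model, as quoted in the text.
SPEEDUPS = {
    "Seedream": 2.4,
    "FLUX 1.1 Pro": 1.8,
    "Wan 2.1 Image": 1.1,
}


def implied_latency(speedup: float, base_s: float = WAN22_LATENCY_S) -> float:
    """Latency of the slower model, assuming latency scales with the speedup."""
    return base_s * speedup


def images_per_hour(latency_s: float) -> float:
    """Sequential throughput for one worker generating back to back."""
    return 3600.0 / latency_s


for name, speedup in SPEEDUPS.items():
    print(f"{name}: ~{implied_latency(speedup):.2f} s per 2 MP image (estimate)")

print(f"Wan 2.2 Image Juiced: ~{images_per_hour(WAN22_LATENCY_S):.0f} images/hour")
```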

With Wan 2.2 Image, generating a 2-megapixel image now costs just $0.02 on the Replicate endpoint, making it the most cost-efficient solution for high-resolution image generation.
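For budgeting, the per-image price translates into batch costs as sketched below. This assumes a flat $0.02 per 2-megapixel image with no volume discounts or extra fees. Actual billing on Replicate may differ.

```python
from decimal import Decimal

# Flat price per 2-megapixel image, as quoted in the text.
PRICE_PER_IMAGE_USD = Decimal("0.02")
MEGAPIXELS_PER_IMAGE = Decimal("2")


def batch_cost(num_images: int) -> Decimal:
    """Total cost in USD for a batch, assuming a flat per-image price."""
    return PRICE_PER_IMAGE_USD * num_images


def cost_per_megapixel() -> Decimal:
    """Price normalized by resolution, useful when comparing against 1 MP models."""
    return PRICE_PER_IMAGE_USD / MEGAPIXELS_PER_IMAGE


print(f"1,000 images: ${batch_cost(1000)}")
print(f"Per megapixel: ${cost_per_megapixel()}")
```

Decimal is used instead of float so that currency amounts stay exact.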

In addition to the qualitative image assessments you can see on this page, we conducted a comprehensive benchmark comparing Seedream, FLUX 1.1 Pro, Wan 2.1 Image, and Wan 2.2 Image Juiced. Using the first 100 prompts from the GenAI-Bench dataset, we generated 2-megapixel images with the default settings for each model. The results show that Wan 2.2 Image is competitive with the state of the art, ranking second or third on every metric, while being both faster and more affordable than its competitors.

API endpoint	VQA	ARNIQA	CLIP	CLIP IQA	Image Reward
Seedream	0.9171 (1)	0.5480 (4)	28.1910 (1)	0.8181 (2)	1.5846 (1)
FLUX 1.1 Pro	0.8644 (2)	0.5794 (3)	27.7835 (2)	0.7175 (4)	0.8711 (4)
Wan 2.1 image	0.8303 (4)	0.6559 (1)	26.8412 (4)	0.8281 (1)	1.0957 (3)
Wan 2.2 image	0.8405 (3)	0.5994 (2)	27.1220 (3)	0.7368 (3)	1.1447 (2)

For each combination of model and metric, we show the score, followed by the model’s rank on that metric in parentheses.
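The ranks in the table can be reproduced from the scores with a few lines of Python. This sketch assumes that higher is better for all five metrics, which matches the published ranks. The mean rank at the end is an illustrative aggregate that we add here for convenience and is not part of the benchmark itself.

```python
# Scores copied from the table above.
SCORES = {
    "Seedream":      {"VQA": 0.9171, "ARNIQA": 0.5480, "CLIP": 28.1910, "CLIP IQA": 0.8181, "Image Reward": 1.5846},
    "FLUX 1.1 Pro":  {"VQA": 0.8644, "ARNIQA": 0.5794, "CLIP": 27.7835, "CLIP IQA": 0.7175, "Image Reward": 0.8711},
    "Wan 2.1 image": {"VQA": 0.8303, "ARNIQA": 0.6559, "CLIP": 26.8412, "CLIP IQA": 0.8281, "Image Reward": 1.0957},
    "Wan 2.2 image": {"VQA": 0.8405, "ARNIQA": 0.5994, "CLIP": 27.1220, "CLIP IQA": 0.7368, "Image Reward": 1.1447},
}


def rank_models(scores: dict, metric: str) -> dict:
    """Rank models on one metric; rank 1 is the highest score (assumed best)."""
    ordered = sorted(scores, key=lambda model: scores[model][metric], reverse=True)
    return {model: position + 1 for position, model in enumerate(ordered)}


def mean_rank(scores: dict) -> dict:
    """Average rank across all metrics (illustrative aggregate, lower is better)."""
    metrics = list(next(iter(scores.values())))
    totals = {model: 0 for model in scores}
    for metric in metrics:
        for model, rank in rank_models(scores, metric).items():
            totals[model] += rank
    return {model: totals[model] / len(metrics) for model in scores}


for metric in ["VQA", "ARNIQA", "CLIP", "CLIP IQA", "Image Reward"]:
    print(metric, rank_models(SCORES, metric))
print("Mean rank:", mean_rank(SCORES))
```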

Disclaimers

Although we are releasing this practical research implementation with all good intentions, we do want to emphasise some disclaimers around the research tricks we implemented to empower this release.

The original works: Our Wan Image implementation is based on the original work of the Wan-Video team. All original limitations and licensing terms from the base model continue to apply to this adaptation.
Prompting considerations: We haven't conducted an extensive evaluation of how video-optimized prompts translate to static image generation. The original model was designed for video, so the output for a static prompt might not be fully represented by a single frame selected from a video model.
Try it in your specific use case: We recommend testing it thoroughly, validating your outputs, and reporting any unexpected behaviors or limitations to us.

In short, have fun, make the most of the model, and help us improve it when needed! Share your findings and use cases on socials too, to help advance Wan Image applications, and we would be happy to reshare!

Enjoy the Quality and Efficiency!

Want to take it further?

Compress your own models with Pruna and give us a ⭐ to show your support!
Try our Replicate endpoint with just one click.
Stay up to date with the latest AI efficiency research on our blog, explore our materials collection, or dive into our courses.
Join the conversation and stay updated in our Discord community.