AI Image & Video Models

Every frontier model in one canvas. Generate, edit, animate, and compare without juggling subscriptions.

15 models

Image models

Photorealism, typography, product shots, and editorial stills. Start with the models most likely to fit the brief.

Jul 15, 2026 (new)

Qwen Image 2.0

Qwen Image 2.0 is Alibaba's fast AI image generator and editor for readable text, reference-guided changes, 1K or 2K output, and up to six results.

Jul 15, 2026 (new)

Qwen Image 2.0 Pro

Qwen Image 2.0 Pro is Alibaba's AI image generator and editor for stronger text rendering, realistic textures, semantic adherence, and 2K output.

Jul 15, 2026 (new)

Wan 2.7 Image

Wan 2.7 Image is Alibaba's AI image generator and editor with up to nine references, 1K or 2K output, and coherent image sets of up to 12 images.

Jul 8, 2026 (new)

Seedream 5.0 Pro

Seedream 5.0 Pro is ByteDance Seed Team's multimodal image creation model for professional design work. It improves image-text alignment, structural coherence, text rendering, visual aesthetics, dense information layouts, precision editing, realistic textures, and multilingual generation.

Jul 1, 2026 (new)

Nano Banana 2 Lite

Nano Banana 2 Lite is the streamlined 1K edition of Nano Banana 2 for fast image generation and editing. It keeps the Gemini image workflow focused on responsive drafts, references, and everyday creative edits.

May 7, 2026

Grok Imagine Image Quality

Grok Imagine Image Quality is xAI's recommended higher-quality image model, replacing the Pro tier that is being retired. On Vofy, it supports prompt-based creation, image edits, broad style transfer, and multi-turn refinement at up to 2K, with up to 10 outputs per run.

17 models

Video models

From quick social cuts to multi-shot scenes, use leading video models without leaving your project.

Jul 15, 2026 (new)

Wan 2.6 Flash

Wan 2.6 Flash is a fast AI reference-to-video generator that creates consistent, multi-shot videos from image or video references, with optional audio.

Jul 15, 2026 (new)

Wan 2.7

Alibaba's Wan 2.7 AI video generator supports text-to-video and reference-to-video workflows, 720p/1080p output, five aspect ratios, and clips up to 15 seconds.

Jul 7, 2026

Gemini Omni Flash

Gemini Omni Flash brings Gemini's native multimodal intelligence to AI video creation and editing. Create 10-second videos from text, photos, references, or source footage, then refine the result with natural-language direction, native audio, and step-by-step creative control.

Jun 15, 2026

Seedance 2.0 Mini

Seedance 2.0 Mini is a streamlined Seedance 2.0 video model for short-form generation, reference-guided motion, video edits, and clip extension. It supports text-to-video, image-to-video, first-and-last-frame generation, multimodal references, and audio-visual sync workflows.

May 27, 2026

Kling 2.6

Kling 2.6 is a balanced Kling video model on Vofy for short clips, motion-controlled video, interpolation, and optional audio workflows. The current Vofy setup supports text-to-video, image-to-video, interpolation, and motion control at 720p or 1080p.

May 27, 2026

Sora 2 Pro

Sora 2 Pro is OpenAI's higher-quality Sora 2 tier on Vofy for longer, more polished AI video. The current Vofy setup supports text-to-video and image-to-video in 16:9 or 9:16, at 720p, 1024p, or 1080p, with clip lengths from 4 to 20 seconds.

Built for creators and creative teams

From daily social posts to campaign visuals, use effects and workflows that move from idea to shareable asset fast.

Marketing teams

Launch creative and product photography in minutes.

Indie filmmakers

Storyboards and B-roll with motion control.

E-commerce

On-model try-on and lifestyle shots at catalog scale.

Social creators

Short-form video and daily posts across every format.

Designers

Concept art, moodboards, and surgical inpaint edits.

Agencies

Client pitches with motion-ready, brand-safe frames.

How it works

Start with an effect, add your own idea or media, then generate something ready to share.

Step 01

Choose an effect

Start with a fresh preset for a video, image, edit, or social visual.

Step 02

Add your media or prompt

Upload a reference, write a short idea, or adjust the model settings.

Step 03

Generate and share

Create variations, refine the result, and save something ready for friends, followers, or your next post.

FAQ

Which AI image model is best for product photos?

GPT Image and Nano Banana Pro are strong defaults for photorealism, clean composition, and text-heavy layouts. Seedream is useful when you need fast lifestyle or product variations.

Which AI video model should I use for short films?

Sora 2 Pro is a strong choice for narrative scenes with audio. Kling is useful for multi-shot sequences, while Veo and Seedance are practical for polished image-to-video motion.

Can I switch models in the middle of a project?

Yes. Vofy keeps your prompts, references, and project context together so you can compare models without rebuilding the brief.

Do all models support reference images?

Support varies by model. Image models commonly support reference-based edits, and several video models support image-to-video workflows. Each model detail page lists the relevant inputs.

How are credits charged across models?

Credits depend on model tier, resolution, duration, and generation mode. The pricing page explains the credit cost for common image and video workflows.

How quickly do new models arrive on Vofy?

Vofy is designed to add new frontier model releases quickly, so teams can try new OpenAI, Google, xAI, ByteDance, and Kuaishou models from the same workspace.

One subscription, every frontier model

Switch between image and video models from a single workspace, then keep creating in Studio.
