Models

GPT Image 2

GPT Image 2 is OpenAI's state-of-the-art image generation model for fast, high-quality image generation and editing. OpenAI positions it as a major step forward in instruction following, dense text rendering, multilingual layouts, stylistic fidelity, flexible sizing, and stronger world knowledge.

Veo 3.1 Lite

Veo 3.1 Lite is Google's lower-cost Veo video model on Vofy for high-volume short-form generation. In the current Vofy setup it supports text-to-video, image-to-video, and interpolation in 16:9 or 9:16, with 720p at 4, 6, or 8 seconds and 1080p at 8 seconds.
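The resolution and duration limits above can be written down as a small lookup table. The following is only a sketch: the limits are copied from the description, but the function and constant names are hypothetical and do not correspond to any Vofy or Google API.

```python
# Limits copied from the Veo 3.1 Lite description; all names are hypothetical.
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS_BY_RESOLUTION = {
    "720p": {4, 6, 8},   # seconds allowed at 720p
    "1080p": {8},        # seconds allowed at 1080p
}


def check_veo_lite_settings(resolution: str, duration_s: int, aspect_ratio: str) -> list[str]:
    """Return a list of problems; an empty list means the settings fit the stated limits."""
    problems = []
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    allowed = DURATIONS_BY_RESOLUTION.get(resolution)
    if allowed is None:
        problems.append(f"unsupported resolution: {resolution}")
    elif duration_s not in allowed:
        problems.append(f"{resolution} does not support {duration_s}s clips")
    return problems


if __name__ == "__main__":
    print(check_veo_lite_settings("720p", 6, "9:16"))
    print(check_veo_lite_settings("1080p", 4, "16:9"))
```

Checking settings against a table like this before submitting a job makes it easy to catch combinations such as a 4-second 1080p clip, which the description does not list as supported.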

Kling 3.0

Kling 3.0 is Kuaishou's video generation family combining Video 3.0, Video 3.0 Omni, and Motion Control 3.0. Generate up to 15-second clips at 1080p with multi-shot storytelling, frame interpolation, lip-sync, and audio-aware workflows.

Nano Banana 2

Nano Banana 2 combines Pro-level image quality with Gemini Flash speed. It offers advanced world knowledge, subject consistency across up to 5 characters, precise text rendering and translation, and output resolutions from 512px up to 4K, backed by real-time web search.

Seedream 5.0 Lite

Seedream 5.0 Lite is ByteDance's latest AI image creation model and the first to integrate real-time web search during generation. It draws on live web information to keep results current, and it improves on parsing of complex instructions and visual content, breadth of world knowledge, cross-image consistency, and the quality of enterprise-grade scene generation.

Seedance 2.0

Seedance 2.0 is ByteDance's multimodal AI video model on Vofy for reference-driven creation and video editing workflows. In the current Vofy setup it supports text-to-video, image-to-video, interpolation, reference-image, multimodal-reference, video-to-video, and video-extension generation at 480p or 720p for 4 to 15 seconds, with optional audio and web-search controls.

Grok Imagine Image Pro

Grok Imagine Image Pro is the Pro image model in xAI's Grok Imagine family. On Vofy, the Grok Imagine image workflow covers prompt-based creation, image edits, broad style transfer, and multi-turn refinement at up to 2K across a wide set of aspect ratios.

Grok Imagine Video

Grok Imagine Video is xAI's short-form video model on Vofy. Create clips from prompts, still images, reference imagery, or short source footage for social content, product motion, and lightweight marketing workflows.

GPT Image 1.5

GPT Image 1.5 is OpenAI's flagship image generation model, pitched as a creative studio in your pocket. It makes precise edits that keep lighting, composition, and likeness intact, handles creative transformations such as turning a photo into a movie poster or painting, and offers stronger instruction following, denser text rendering, and 4x faster generation.

Seedance 1.5 Pro

Seedance 1.5 Pro is ByteDance's first AI video model to generate audio and video natively in a single pass — dialogue, sound effects, and ambient audio produced simultaneously with the visuals, with millisecond-precise lip-sync across 9+ languages.

Nano Banana Pro

Nano Banana Pro is Google's AI image generator with Thinking Mode, offering 4K resolution, precise detail control, reference-based style transfer, and multilingual text placement, all powered by Gemini.

Veo 3.1 & Veo 3.1 Fast

Veo 3.1 is Google DeepMind's video model family for sharper visuals, built-in audio, vertical output, and longer scene building. Standard Veo 3.1 is the higher-fidelity option, while Veo 3.1 Fast is tuned for quicker, lower-cost iteration.

Sora 2 & Sora 2 Pro

Sora 2 is OpenAI's AI video generator. Create cinematic videos of up to 20 seconds at 720p with native audio, accurate physics simulation, and coherent multi-shot storytelling. Sora 2 Pro extends this to 1080p and 25 seconds for production-ready output.
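As a rough illustration of how the two tiers differ, the sketch below picks the smallest tier that covers a requested clip. The limits come from the description above; the function name, the tuple layout, and the assumption that Sora 2 Pro also accepts anything Sora 2 does are hypothetical, and this is not an OpenAI or Vofy API.

```python
from typing import Optional

# (model name, max output height in pixels, max clip length in seconds),
# ordered from the smaller tier to the larger one. Hypothetical layout.
SORA_TIERS = [
    ("Sora 2", 720, 20),
    ("Sora 2 Pro", 1080, 25),
]


def pick_sora_model(height_px: int, duration_s: float) -> Optional[str]:
    """Return the first tier whose stated limits cover the request, or None."""
    for name, max_height, max_seconds in SORA_TIERS:
        if height_px <= max_height and duration_s <= max_seconds:
            return name
    return None


if __name__ == "__main__":
    print(pick_sora_model(720, 15))
    print(pick_sora_model(1080, 10))
    print(pick_sora_model(1080, 30))
```

A request that exceeds both tiers, such as a 30-second clip, returns None so the caller can shorten the clip or split it into several shots.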