Multimodal video creation, reimagined
Seedance 2.0 accepts four types of input — images, videos, audio, and text — enabling cinematic generation with precise reference control. Set the visual style with an image, define motion with a video clip, establish the mood with audio, and direct it all through natural language.
From choreography to action sequences — explore what Seedance 2.0 can generate across different creative scenarios.
Rhythm + camera choreography
High-tempo movement with beat-aware cuts and character continuity under fast choreography.
Dialogue-driven short film
Emotion-driven scene pacing with close-up performance and natural conversational timing.
Audio-visual narrative
Stylized visual motifs synchronized with music cadence and atmosphere shifts.
Action choreography
Complex martial arts movement, directional camera transitions, and dramatic fight choreography.
Large-scale VFX
Wide-scene composition, destruction sequences, and continuity in effects-heavy narrative.
Creative scene flow
Smooth scene transitions, creative visual effects, and narrative continuity across shots.
Choose your entry mode, bind references with @, and iterate without starting over — all on one canvas.
Use First & Last Frames for anchor-driven shots, or All-in-One Reference for multimodal composition.
Upload assets that define composition, motion rhythm, and tone first — then add secondary references.
Assign clear roles — first frame, camera language, character identity, BGM — to each material in your prompt.
Use continuation and timeline-style edits to preserve narrative momentum across versions.
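As a concrete sketch of the role-assignment tip above, the snippet below builds a prompt that binds each uploaded material to a role with an @ reference. Seedance 2.0's actual binding syntax has not been published, so the handles and phrasing here are purely illustrative.

```python
# Hypothetical prompt showing how roles could be bound to uploaded
# materials with @ references. The handles (@img1, @vid1, @img2, @aud1)
# are illustrative placeholders, not documented Seedance 2.0 syntax.
prompt = (
    "@img1 is the first frame, @vid1 defines the camera language, "
    "@img2 fixes the character identity, @aud1 is the BGM. "
    "A dancer spins under neon light, cutting on each downbeat."
)
print(prompt)
```

The point is the structure, not the syntax: each reference is given exactly one clear role before the free-form scene description begins.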
Supported interaction modes
First & Last Frames
Best for anchor-frame generation with structured opening and ending control.
All-in-One Reference
Designed for multimodal mixing and precise asset orchestration in a single prompt.
A comprehensive upgrade in generation quality, controllability, and creative expression.
Combine text, images, videos, and audio in a single generation — each modality enriches the final output.
Upload reference images for composition, videos for camera language and motion, and audio for rhythm — the model understands and reproduces them.
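One way to picture a four-modal generation call is as a single request that carries all four input types at once. The payload below is a minimal sketch; the field names and file names are assumptions, since Seedance 2.0's public API has not yet been released.

```python
# Hypothetical four-modal request payload. Field names and file names
# are illustrative only; no public Seedance 2.0 API schema exists yet.
request = {
    "text": "Slow push-in on the dancer; cut on each downbeat.",
    "images": ["style_ref.png"],      # composition and visual style
    "videos": ["camera_move.mp4"],    # camera language and motion
    "audio": ["bgm.mp3"],             # rhythm and mood
}
```

Each modality carries a distinct role, mirroring the division of labor described above: images for composition, video for camera language, audio for rhythm, and text to direct the whole.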
Significantly smoother physical dynamics, more stable movement, and stronger scene-level visual coherence across frames.
Extend existing footage and build multi-shot narrative sequences — pick up where the last clip left off.
Support for character replacement, clip insertion, and timeline-style iteration — create and refine in the same workflow.
Optimized for choreography, emotional dialogue, music videos, action scenes, and VFX-driven micro stories.
Input limits, generation parameters, and interaction controls at a glance.
Seedance 2.0 is not yet available — join the waitlist to be notified at launch.
Pricing and availability will be announced after integration and quality validation are complete.
Seedance 2.0 is ByteDance's next-generation video model that supports four-modal input — images, videos, audio, and text — for controllable, cinematic video generation.