openai/sora-2
Sora 2 is OpenAI's new powerful media generation model, generating videos with synced audio. It can create richly detailed, dynamic clips from natural language or images.
Sora 2 is more physically accurate, realistic, and more controllable than prior systems. It also features synchronized dialogue and sound effects.
With Sora 2, OpenAI is jumping straight to what they think may be the GPT‑3.5 moment for video. Sora 2 can do things that are exceptionally difficult—and in some instances outright impossible—for prior video generation models: Olympic gymnastics routines, backflips on a paddleboard that accurately model the dynamics of buoyancy and rigidity, and triple axels while a cat holds on for dear life.
Prior video models are overoptimistic—they will morph objects and deform reality to successfully execute upon a text prompt. For example, if a basketball player misses a shot, the ball may spontaneously teleport to the hoop. In Sora 2, if a basketball player misses a shot, it will rebound off the backboard. Interestingly, “mistakes” the model makes frequently appear to be mistakes of the internal agent that Sora 2 is implicitly modeling; though still imperfect, it is better about obeying the laws of physics compared to prior systems. This is an extremely important capability for any useful world simulator—you must be able to model failure, not just success.
The model is also a big leap forward in controllability, able to follow intricate instructions spanning multiple shots while accurately persisting world state. It excels at realistic, cinematic, and anime styles.
As a general purpose video-audio generation system, it is capable of creating sophisticated background soundscapes, speech, and sound effects with a high degree of realism.
You can also directly inject elements of the real world into Sora 2. For example, by observing a video of one of our teammates, the model can insert them into any Sora-generated environment with an accurate portrayal of appearance and voice. This capability is very general, and works for any human, animal or object.