Sora: Creating video from text
davidbarker
3642 HN Points
Unlock the power of AI storytelling with Sora, OpenAI's revolutionary text-to-video tool.
Sora empowers creators, educators, and businesses alike to effortlessly bring their narratives to life. With just a few lines of text, you can generate stunning videos that captivate your audience, enhance your message, and elevate your digital presence.
Experience the seamless fusion of creativity and technology, and let Sora transform your imaginative concepts into visual masterpieces. Whether you're crafting educational content, marketing materials, or personal stories, Sora is your gateway to a world where your ideas are not just heard, but seen and felt.
Today, Sora is becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.
We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon.
Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.
The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.
The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.
The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.
OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos. OpenAI's entry into generative AI video is an impressive first step. The model can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.
It’s not perfect. Only when you watch the clip a few times do you realize that the main characters—a couple strolling down the snow-covered sidewalk—would have faced a dilemma had the virtual camera kept running. The sidewalk they occupy seems to dead-end; they would have had to step over a small guardrail to a weird parallel walkway on their right. Despite this mild glitch, the Tokyo example is a mind-blowing exercise in world-building. Down the road, production designers will debate whether it’s a powerful collaborator or a job killer. Also, the people in this video—who are entirely generated by a digital neural network—aren’t shown in close-up, and they don’t do any emoting. But the Sora team says that in other instances they’ve had fake actors showing real emotions.
The other clips are also impressive, notably one asking for “an animated scene of a short fluffy monster kneeling beside a red candle,” along with some detailed stage directions (“wide eyes and open mouth”) and a description of the desired vibe of the clip. Sora produces a Pixar-esque creature that seems to have DNA from a Furby, a Gremlin, and Sulley in Monsters, Inc. I remember that when the latter film came out, Pixar made a huge deal of how difficult it was to render the ultra-complex texture of a monster’s fur as the creature moved around. It took all of Pixar’s wizards months to get it right. OpenAI’s new text-to-video machine … just did it.