SORA Video To Video Is Literally Mind Blowing - 12 HD Demos - Changes Industry Forever For Real
I have combined all 12 video-to-video #SORA demos released by #OpenAI into one video, together with the prompts used and some amazing background music. You won’t believe how video-to-video will change the movie, animation, and social media industries forever. The results are simply astonishing.
[AI video generation] An explanation of Sora’s core technologies
Sora’s technical configuration
No paper has been published yet, but OpenAI has published a page explaining the core technologies, so I will refer to that page.
If you would like to see the original text, please click here
Overall structure
Sora is said to consist of the following technical elements.
Turning visual data into patches
Video compression network
Spacetime latent patches
Scaling transformers for video generation
Variable durations, resolutions, aspect ratios
Sampling flexibility
Improved framing and composition
Language understanding
To summarize very simply, there are three main elements:
A technique that compresses video data into a latent space and then converts it into “spatiotemporal latent patches” that a Transformer can use as tokens
A Transformer-based video diffusion model
Dataset creation with highly descriptive video captions, using the captioning technique from DALL·E 3
Looked at this way, they do not seem to be using any particularly new technology.
It is the “level up and win by brute force” approach: what clearly matters is your level (money and compute resources) rather than small tricks.
Turning visual data into patches
First, let’s look at how “spatiotemporal latent patches” are created.
As a preprocessing step for creating spatiotemporal latent patches, the input video is compressed into a latent space.
It is mostly correct to think of this as the equivalent of the VAE used in image generation.
(In fact, since the VAE paper is cited, I think it is safe to assume that it is simply a VAE.)
This greatly reduces the amount of computation, and Sora is trained in this compressed latent space.
In image generation, training begins right after encoding with the VAE, but Sora adds one more conversion step to create what are called spatiotemporal latent patches.
These seem to correspond to text tokens in an LLM.
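To get a feel for why compressing before patching matters, here is a rough token count in Python. Every number in it (clip size, compression factors, patch size) is made up for illustration; OpenAI has not published Sora's actual values, and the helper names are hypothetical.

```python
def latent_shape(frames, height, width, time_factor, space_factor):
    """Shape of the clip after a (hypothetical) compression network."""
    return frames // time_factor, height // space_factor, width // space_factor


def count_patches(frames, height, width, t, h, w):
    """Number of t x h x w patches (tokens) needed to tile the clip."""
    return (frames // t) * (height // h) * (width // w)


# Assumed values, NOT Sora's: a 64-frame 512x512 clip, compressed
# 4x in time and 8x in each spatial direction, then cut into 1x2x2 patches.
raw = (64, 512, 512)
latent = latent_shape(*raw, time_factor=4, space_factor=8)

tokens_raw = count_patches(*raw, t=1, h=2, w=2)
tokens_latent = count_patches(*latent, t=1, h=2, w=2)

print(latent)                        # (16, 64, 64)
print(tokens_raw, tokens_latent)     # 4194304 16384
print(tokens_raw // tokens_latent)   # 256
```

Since the cost of self-attention grows roughly with the square of the sequence length, a shorter token sequence saves far more computation than the raw ratio suggests.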
An image is worth 16x16 words: Transformers for image recognition at scale.
The patching method divides the image into patches by position (patching) and converts each patch into a one-dimensional vector (flattening).
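The patching and flattening described above can be sketched in plain Python as follows. This is a minimal illustration on a toy single-channel image, not ViT's actual implementation; the function name patchify and the toy sizes are my own assumptions, and the learned linear projection that ViT applies to each flattened patch is left out.

```python
def patchify(image, patch_size):
    """Split an H x W image (a list of rows) into flattened patch vectors.

    Patches are visited left to right, top to bottom, so a patch's index
    in the output list tells you where it came from in the image.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a 1-D vector.
            patch = [image[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel values are 0..15.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
toy_patches = patchify(toy_image, patch_size=2)
print(toy_patches)
# [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
```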
ViViT: A video vision transformer.
Two patching methods are proposed here:
As in ViT, patch each frame by position and concatenate the patches in frame order (Figure 2)
Treat the input video as a three-dimensional volume, extract blocks (tubes) of t (number of frames) x h (patch height) x w (patch width), and flatten each one into a one-dimensional vector
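The second method (tube extraction) can be sketched like this. It is only an illustration under my own assumptions: the video is a nested list indexed as [frame][row][column] with a single channel, the name extract_tubelets is hypothetical, and the learned projection that turns each flattened tube into a token embedding is omitted.

```python
def extract_tubelets(video, t, h, w):
    """Cut a video [frame][row][col] into flattened t x h x w tubes."""
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % t or height % h or width % w:
        raise ValueError("video dimensions must be multiples of t, h, w")
    tubelets = []
    for f0 in range(0, frames, t):
        for y0 in range(0, height, h):
            for x0 in range(0, width, w):
                # Each tube spans t frames, so one token mixes space and time.
                tube = [video[f0 + df][y0 + dy][x0 + dx]
                        for df in range(t)
                        for dy in range(h)
                        for dx in range(w)]
                tubelets.append(tube)
    return tubelets


# Toy video: 2 frames of 4x4 pixels, each value encodes (frame, row, col).
toy_video = [[[f * 16 + y * 4 + x for x in range(4)] for y in range(4)]
             for f in range(2)]
toy_tubelets = extract_tubelets(toy_video, t=2, h=2, w=2)
print(len(toy_tubelets))   # 4
print(toy_tubelets[0])     # [0, 1, 4, 5, 16, 17, 20, 21]
```

Setting t=1 reduces this to the first method: every frame is patched on its own and the patches come out in frame order.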
Masked autoencoders are scalable vision learners.
Rather than a patching method, this paper is about learning efficiently from patched images.
It is effective as pre-training for ViT.
Part of the patch tokens is masked out, the rest is given as input, and the model is trained on the task of reconstructing the masked part.
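The masking step of that task can be sketched as below. This only shows how the patch tokens are split into a visible part (the encoder input) and a masked part (the reconstruction target); the encoder, decoder, and loss are not shown. The function name, the seed handling, and the toy tokens are my own assumptions. The default ratio of 0.75 follows the high masking ratio reported in the MAE paper.

```python
import random


def mask_patches(patches, mask_ratio=0.75, seed=0):
    """Randomly split patch tokens into visible and masked subsets."""
    rng = random.Random(seed)
    order = list(range(len(patches)))
    rng.shuffle(order)
    num_masked = int(len(patches) * mask_ratio)
    masked_idx = sorted(order[:num_masked])
    visible_idx = sorted(order[num_masked:])
    # The encoder only sees the visible tokens; the model is trained to
    # reconstruct the tokens at the masked positions.
    encoder_input = [patches[i] for i in visible_idx]
    targets = [patches[i] for i in masked_idx]
    return visible_idx, masked_idx, encoder_input, targets


toy_tokens = [f"patch{i}" for i in range(16)]
visible_idx, masked_idx, encoder_input, targets = mask_patches(toy_tokens)
print(len(encoder_input), len(targets))   # 4 12
```

Because the encoder processes only a quarter of the tokens in this setting, pre-training is much cheaper than running it on every patch.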
Patch n’Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution.
This paper makes it possible to freely change the resolution and aspect ratio of the input data.
Because ViT can accept input sequences of varying length, sequences from different images can be packed together, which makes it possible to input any resolution or aspect ratio.
Using this technique, Sora can be trained on videos and images of varying resolutions, durations, and aspect ratios, and the size of the generated video can be controlled at inference time.
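The packing idea can be sketched as follows. This is a simple first-fit packer written for illustration, not the code used by NaViT or Sora; the function names, the patch size of 16, and the row length of 24 are all assumptions. Inputs with different resolutions yield different numbers of patch tokens, and several of them are placed in one fixed-length row, with a per-token example ID so that attention can later be restricted to tokens of the same example.

```python
def num_tokens(height, width, patch_size):
    """Number of patch tokens produced by an image of this size."""
    return (height // patch_size) * (width // patch_size)


def pack_sequences(lengths, max_len):
    """First-fit packing of variable-length sequences into rows of max_len."""
    packs, free = [], []   # free[p] is the remaining room in packs[p]
    for index, length in enumerate(lengths):
        if length > max_len:
            raise ValueError("sequence longer than max_len")
        for p, room in enumerate(free):
            if length <= room:
                packs[p].append((index, length))
                free[p] -= length
                break
        else:   # no existing row had room, so open a new one
            packs.append([(index, length)])
            free.append(max_len - length)
    return packs


def segment_ids(pack, max_len, pad_id=-1):
    """Per-token example IDs for one packed row; pad_id marks padding."""
    ids = []
    for index, length in pack:
        ids.extend([index] * length)
    return ids + [pad_id] * (max_len - len(ids))


# Four toy images with different resolutions and aspect ratios.
sizes = [(64, 64), (32, 96), (48, 32), (16, 16)]
lengths = [num_tokens(hgt, wid, patch_size=16) for hgt, wid in sizes]
packs = pack_sequences(lengths, max_len=24)
print(lengths)   # [16, 12, 6, 1]
print(packs)     # [[(0, 16), (2, 6), (3, 1)], [(1, 12)]]
```

The example IDs returned by segment_ids are what a model would use to build an attention mask, so that tokens from different images in the same row do not attend to each other.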
Song: Unknown Brain - MATAFAKA (feat. Marvin Divine) [NCS Release]
Music provided by NoCopyrightSounds
Song: Warriyo - Mortals (feat. Laura Brehm) [NCS Release]
Music provided by NoCopyrightSounds
Song: Egzod, Maestro Chives, Neoni - Royalty [NCS Release]
Music provided by NoCopyrightSounds