A fun project that generates videos from text using whiteboard animations.
YouTube Video: https://youtube.com/watch?v=iSb1HJXRO04&si=Nw0glViIoChYRghI
HOW THE VIDEO WAS MADE?
- The story was taken from the internet.
- A 4-line summary of the story was generated using ChatGPT
- For each summary line, a corresponding image was generated using Stable Diffusion
- In each image (4 in our case), the important objects were masked using MetaAI SAM
- I developed custom image-to-whiteboard-animation code that converted the images to videos (This was the most time-consuming part of the development process).
- The audio was generated using gTTS (Google Text-to-Speech)
- The subtitles were generated with the help of OpenAI Whisper
- All the separate audio clips, videos and subtitles were somehow synchronized using FFMPEG and OpenCV (This part gave a lot of pain🥲... ALL HAIL FFMPEG🙌)
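To give a rough idea of the last two steps, here is a minimal sketch, not the code from the repo. It assumes the transcription comes back as segments with `start`, `end` and `text` fields (the shape Whisper's `transcribe` result uses), turns them into an SRT file body, and builds an FFMPEG command that muxes one video with one audio track and burns the subtitles in. The helper names and file names (`scene.mp4`, `narration.mp3`, `subs.srt`, `final.mp4`) are made up for illustration, and the `subtitles` filter needs an FFMPEG build with libass.

```python
import shlex


def to_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn a list of {'start', 'end', 'text'} dicts into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{to_srt_time(seg['start'])} --> {to_srt_time(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def build_ffmpeg_command(video, audio, subtitles, output):
    """Build (but do not run) an ffmpeg call: video + audio + burned-in subs."""
    return [
        "ffmpeg", "-y",
        "-i", video,                      # input 0: silent animation video
        "-i", audio,                      # input 1: narration audio
        "-vf", f"subtitles={subtitles}",  # burn subtitles into the frames
        "-map", "0:v:0", "-map", "1:a:0",
        "-c:v", "libx264", "-c:a", "aac",
        "-shortest",                      # stop at the shorter of the two inputs
        output,
    ]


if __name__ == "__main__":
    # Hypothetical segments, only to show the expected input shape.
    demo_segments = [
        {"start": 0.0, "end": 2.5, "text": " Once upon a time"},
        {"start": 2.5, "end": 5.0, "text": " there was a fox."},
    ]
    print(segments_to_srt(demo_segments))
    command = build_ffmpeg_command("scene.mp4", "narration.mp3", "subs.srt", "final.mp4")
    print(shlex.join(command))
```

The command is returned as a list so it can be passed straight to `subprocess.run(command, check=True)` without shell quoting problems.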
GitHub: https://github.com/yogendra-yatnalkar/storyboard-ai (PS: Sorry, I just hacked up the end-to-end stuff for now. I am in the process of cleaning up the codebase and documenting it. Will do that in the next 2-4 days.)
The image-to-whiteboard animation code: https://github.com/yogendra-yatnalkar/storyboard-ai/blob/main/draw-video/draw-u5.py (This works even if the important objects are not masked by MetaAI SAM)
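For anyone curious how a "drawing" effect can be faked, here is a toy sketch of one possible approach. It is an illustration only and not necessarily what the linked script does. It assumes the image has already been reduced to a small grid where 1 means "this cell contains ink", orders the inked cells by always moving the pen to the nearest undrawn cell, and then reveals a few cells per frame. The function names and the `cells_per_frame` parameter are invented for this example.

```python
import math


def cell_draw_order(mask, start=(0, 0)):
    """Order inked (row, col) cells by greedy nearest-neighbour from `start`."""
    remaining = {
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    }
    order = []
    pen = start
    while remaining:
        # Pick the closest undrawn cell; ties are broken by (row, col).
        nearest = min(remaining, key=lambda cell: (math.dist(cell, pen), cell))
        order.append(nearest)
        remaining.remove(nearest)
        pen = nearest
    return order


def reveal_frames(mask, order, cells_per_frame=2):
    """Yield canvases (0 = blank, 1 = drawn) as cells are revealed in order."""
    canvas = [[0] * len(row) for row in mask]
    for count, (r, c) in enumerate(order, start=1):
        canvas[r][c] = 1
        # Emit a frame every few cells, plus one final frame at the end.
        if count % cells_per_frame == 0 or count == len(order):
            yield [row[:] for row in canvas]


if __name__ == "__main__":
    # Tiny hypothetical ink mask: 1 = cell has strokes to draw.
    demo_mask = [
        [1, 0, 1],
        [0, 0, 0],
        [1, 0, 0],
    ]
    demo_order = cell_draw_order(demo_mask)
    for frame in reveal_frames(demo_mask, demo_order):
        print(frame)
```

In a real pipeline each frame would be an image written out with OpenCV rather than a nested list, but the ordering idea stays the same: nearby strokes get drawn together, which makes the result look more like a hand sketching.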