Operation
The software idles while the backlog of compilations is full.
Whenever it detects a gap in the backlog, it tries to make a new compilation. If there is not enough source material for one, it triggers the pipeline.
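As a rough illustration, one scheduling step could look like the sketch below. The names `run_once`, `backlog_capacity`, `try_make_compilation` and `run_pipeline` are made up for this example and are not taken from the project.

```python
from typing import Callable


def run_once(
    backlog_size: int,
    backlog_capacity: int,
    try_make_compilation: Callable[[], bool],
    run_pipeline: Callable[[], None],
) -> str:
    """Perform one scheduling step and report what happened."""
    # Backlog is full: nothing to do.
    if backlog_size >= backlog_capacity:
        return "idle"
    # There is a gap: try to fill it with a new compilation.
    if try_make_compilation():
        return "compiled"
    # Not enough source material: fetch more through the pipeline.
    run_pipeline()
    return "pipeline"
```

A real loop would call something like this repeatedly, sleeping between calls while the result is "idle".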
Modules
The software has the following main modules:
- clip maker
- compilation maker
- video collector
- video downloader
Video Collector
This module is responsible for collecting potential videos for future use.
This is the part that searches YouTube for videos. Every potential video is saved in a database, together with its topic (e.g. cats).
Later, the video downloader will come in and download any of those saved videos.
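Here is a minimal sketch of the "save every potential video with its topic" part, assuming a SQLite database. The table name `candidates`, its columns, and the helper names are assumptions for illustration. The YouTube search itself is left out, so the caller simply passes in the URLs it found.

```python
import sqlite3
from typing import Iterable


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS candidates ("
        " url TEXT PRIMARY KEY,"
        " topic TEXT NOT NULL,"
        " downloaded INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_candidates(conn: sqlite3.Connection, topic: str, urls: Iterable[str]) -> int:
    """Store candidate videos for a topic; return how many were new."""
    before = conn.execute("SELECT COUNT(*) FROM candidates").fetchone()[0]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            # OR IGNORE skips URLs that were already collected earlier.
            "INSERT OR IGNORE INTO candidates (url, topic) VALUES (?, ?)",
            [(url, topic) for url in urls],
        )
    after = conn.execute("SELECT COUNT(*) FROM candidates").fetchone()[0]
    return after - before
```

Using the URL as the primary key is one simple way to keep the same video from being collected twice.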
Video Downloader
This module is responsible for downloading the collected videos. It looks at the database for potential videos and downloads any of them.
Once a video is downloaded, the module saves the file and adds a database entry for it. The entry includes the topic of the video, which is important because later steps select material by topic.
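The bookkeeping around a download could be sketched like this. The schema, the `download_pending` function and the `fetch` callable (which stands in for the real download and returns the saved file path) are all assumptions, not the project's actual code.

```python
import sqlite3
from typing import Callable, List

# Hypothetical schema: collected candidates and downloaded videos.
SCHEMA = """
CREATE TABLE IF NOT EXISTS candidates (
    url TEXT PRIMARY KEY,
    topic TEXT NOT NULL,
    downloaded INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS videos (
    path TEXT PRIMARY KEY,
    url TEXT NOT NULL,
    topic TEXT NOT NULL
);
"""


def download_pending(
    conn: sqlite3.Connection, topic: str, fetch: Callable[[str], str]
) -> List[str]:
    """Download every not-yet-downloaded candidate for a topic."""
    rows = conn.execute(
        "SELECT url FROM candidates WHERE topic = ? AND downloaded = 0",
        (topic,),
    ).fetchall()
    saved = []
    for (url,) in rows:
        path = fetch(url)  # the real download would happen here
        with conn:  # record the video and mark the candidate together
            conn.execute(
                "INSERT OR REPLACE INTO videos (path, url, topic) VALUES (?, ?, ?)",
                (path, url, topic),
            )
            conn.execute("UPDATE candidates SET downloaded = 1 WHERE url = ?", (url,))
        saved.append(path)
    return saved
```

Carrying the topic over into the `videos` row is what lets later steps look up material by topic.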
Clip Maker
Because a lot of videos on YouTube are already compilations, it would be stupid to take a whole compilation and use it as a single part of our compilation.
This module is responsible for taking the downloaded videos and slicing them into a lot of smaller clips, which are now much more usable for a compilation video.
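The text does not say how the cut points are chosen, so the sketch below just assumes fixed-length clips and only plans the (start, end) times. The function name and the default lengths are invented for the example. The actual cutting would be done by a video tool.

```python
from typing import List, Tuple


def plan_clips(
    duration_s: float, clip_len_s: float = 10.0, min_len_s: float = 3.0
) -> List[Tuple[float, float]]:
    """Split a video of duration_s seconds into (start, end) intervals."""
    if clip_len_s <= 0:
        raise ValueError("clip_len_s must be positive")
    clips = []
    start = 0.0
    while start < duration_s:
        end = min(start + clip_len_s, duration_s)
        # Drop a leftover tail that is too short to be a useful clip.
        if end - start >= min_len_s:
            clips.append((start, end))
        start = end
    return clips
```

For instance, a 25-second video with the defaults gives three clips: 0 to 10, 10 to 20, and 20 to 25 seconds.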
Compilation Maker
This module is the one responsible for making the final video. It does the following steps:
- Choose one of the predefined topics in which we are interested.
- Choose clips from the database for the specific topic.
- Calculate how long a compilation it can make from those clips.
- If it's enough for the specific platform - great, continue.
- If it's not enough for the specific platform, trigger the Pipeline to get more clips.
- When enough clips are accumulated, start stitching them together.
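The "is there enough material?" check from the steps above could be sketched as follows. The `Clip` class, the `select_clips` function and the idea of a single minimum length per platform are assumptions for illustration. Stitching is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Clip:
    path: str
    topic: str
    duration_s: float


def select_clips(
    clips: Iterable[Clip], topic: str, min_total_s: float
) -> Optional[List[Clip]]:
    """Pick clips for a topic until the platform minimum is reached.

    Returns None when there is not enough material, which is the
    caller's cue to trigger the pipeline.
    """
    chosen: List[Clip] = []
    total = 0.0
    for clip in clips:
        if clip.topic != topic:
            continue
        chosen.append(clip)
        total += clip.duration_s
        if total >= min_total_s:
            return chosen
    return None
```

Returning `None` instead of a too-short list keeps the "not enough" case explicit for whoever calls it.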
The Pipeline
The so-called Pipeline consists of three modules, run in this order: video collector, video downloader, and clip maker. It requires one parameter, the topic. The Compilation Maker calls it when it does not have enough source material for a compilation.
The Pipeline runs each module once for the given topic.
When the pipeline finishes, it returns control to the Compilation Maker, which tries to make a compilation again. If it still does not have enough material, it triggers the pipeline again. This continues until it has enough to finish its job.
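The retry loop between the Compilation Maker and the Pipeline can be sketched like this. The function names are made up, and `max_rounds` is a safety limit added for the example; the description above has no such limit.

```python
from typing import Callable, Optional, Sequence

Stage = Callable[[str], None]


def run_pipeline(topic: str, stages: Sequence[Stage]) -> None:
    """Run each stage once, in order, for the given topic."""
    for stage in stages:
        stage(topic)


def compile_with_retries(
    topic: str,
    try_make: Callable[[str], Optional[str]],
    stages: Sequence[Stage],
    max_rounds: int = 10,  # assumption: guard against looping forever
) -> Optional[str]:
    """Try to make a compilation, running the pipeline whenever it fails."""
    for _ in range(max_rounds):
        result = try_make(topic)
        if result is not None:
            return result
        run_pipeline(topic, stages)
    # One last attempt after the final pipeline run.
    return try_make(topic)
```

Here `try_make` returns `None` when there is not enough material, and `stages` would be the collector, downloader and clip maker, in that order.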
Backlog
Every module does more work than what is required. That's by design.
Everything that a module creates but does not use right away is kept in a backlog.
For example, the clip maker makes a lot of clips from a video, most of which won't be used immediately. However, everything is saved in the database as backlog!
The next time the Compilation Maker tries to run, it may have enough material to create a compilation immediately, without the need to run the pipeline!
Language
The short answer is Python and JavaScript.
There are a lot of great libraries for both Python and JavaScript, but for some of the things I needed, the Python libraries were much better.
So, the main project is in JavaScript, but some modules are written in Python.
In the JavaScript world, I created a simple mechanism to bridge my main application with my Python scripts, so the code remains clean and reusable.
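The bridge itself is not described in detail, so the following is only one possible shape for the Python side of it. It assumes the JavaScript application starts a Python script and exchanges one JSON message per line over stdin and stdout. The message format, `HANDLERS`, and the two example actions are all invented for illustration.

```python
import json
import sys
from typing import Any, Callable, Dict

# Hypothetical actions the JavaScript side could request.
HANDLERS: Dict[str, Callable[..., Any]] = {
    "ping": lambda: "pong",
    "describe": lambda topic: "topic=" + topic,
}


def handle_request(raw: str) -> str:
    """Turn one JSON request line into one JSON response line."""
    try:
        request = json.loads(raw)
        handler = HANDLERS[request["action"]]
        result = handler(**request.get("args", {}))
        return json.dumps({"ok": True, "result": result})
    except Exception as exc:  # report any failure back to the caller
        return json.dumps({"ok": False, "error": type(exc).__name__ + ": " + str(exc)})


def main() -> None:
    """Serve requests line by line; call this from the script entry point."""
    for line in sys.stdin:
        if line.strip():
            print(handle_request(line), flush=True)
```

With a scheme like this, the JavaScript side only needs to write a line such as `{"action": "ping"}` and read one line back, which keeps the Python scripts independent of the main application.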
Why the need for this Frankenstein? Well, this way I got the best of both worlds!