I’m getting into video encoding for a personal hobby project, and I need to use low-level tools since everything has to run automated and unattended. (Basically, I’m doing a 24/7 timelapse out my Roosevelt Island, NYC window using a Raspberry Pi, to watch all the supertall construction that’s going up across the river over the next decade…)
For my own edification, and to better understand what I’m doing, I’m hoping someone with expert knowledge can (without speculating…I can do that perfectly well myself) explain exactly how video codec formats interact with AV container formats, and what the division of responsibilities is between the two.
My own guess (basically, how I would do it if I were designing things, given the little I know) is that the “codec” defines only a compression algorithm and a binary format for a stream of “fields” (not frames, but “fields”, since interlaced frames have two fields), and that all the other information needed to play back the video is metadata encoded into the container format (like AVI, MKV, etc.). Furthermore, I’d guess that time indexing information (for random-access seeking) and the interleaving of bytes between the different streams (video, audio, subtitles, etc.) are handled by the container, and that the codecs themselves don’t natively know anything about that.
Does anyone know if that’s accurate, or do the codecs handle more than that? And if they do, does anyone know why? I assume there’s a good reason for it.
Finally, does anyone know how header (or footer?) formats work for common AV containers? I know there are a lot of GUI programs you can use to split video files, but they usually involve making entire copies of the data. Are there container formats where one could conceivably just take a subset of the bytes, make absolutely no changes to the bytes themselves, synthesize a new header/footer around them, and get a valid file that will play in a regular video player?
(Basically, I’m thinking about writing my own FUSE filesystem that will essentially use the equivalent of hard links at an intrafile level, so I can quickly get read-only versions of subsets of the whole video without any extra copying…)
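To make the idea concrete, here is a rough sketch (in Python) of the byte mapping I have in mind. Everything in it is hypothetical: the names `Segment` and `VirtualSlice` are made up for illustration, and the header/footer bytes are placeholders. Whether any real container lets you synthesize valid ones around untouched payload bytes is exactly what I’m asking. It only shows how a read-only virtual file could be served from a synthesized header, an unmodified byte range of the original file, and a synthesized footer, with no copy of the payload stored anywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One piece of the virtual file: literal bytes OR a byte range of the source."""
    literal: bytes = b""   # synthesized header/footer bytes (placeholder content)
    src_offset: int = 0    # start of the range inside the source file
    src_length: int = 0    # length of the range; 0 means "use literal instead"

    @property
    def size(self) -> int:
        return self.src_length if self.src_length else len(self.literal)


class VirtualSlice:
    """Read-only view: segments laid end to end, payload read straight from the source."""

    def __init__(self, source_path, segments):
        self.source_path = source_path
        self.segments = list(segments)
        self.size = sum(seg.size for seg in self.segments)

    def read(self, offset, size):
        """Return up to `size` bytes of the virtual file starting at `offset`."""
        end = min(offset + size, self.size)
        out = bytearray()
        pos = 0  # virtual offset where the current segment starts
        with open(self.source_path, "rb") as src:
            for seg in self.segments:
                seg_start, seg_end = pos, pos + seg.size
                pos = seg_end
                # Overlap between the requested window and this segment.
                lo, hi = max(offset, seg_start), min(end, seg_end)
                if lo >= hi:
                    continue
                rel = lo - seg_start
                if seg.src_length:
                    # Payload bytes come unmodified from the original file.
                    src.seek(seg.src_offset + rel)
                    out += src.read(hi - lo)
                else:
                    out += seg.literal[rel:rel + (hi - lo)]
        return bytes(out)
```

A FUSE read handler would then just forward its `(offset, size)` arguments to something like `VirtualSlice.read`, so the only per-slice storage is the small header/footer plus two integers describing the payload range.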