YouTube audio creeps out of sync

Hello,

I tried to google this and couldn’t find an answer that made sense to me.

I’m working through an online recording studio class and the audio gets out of sync with the people teaching it. I noticed that it does this for all videos on YouTube. The video doesn’t seem to be bad quality, but it seems to drift and fall further behind the audio the longer it runs.

Is there a way to adjust this? I am using Firefox as my browser, and it would appear that my Java is up to date; it doesn’t offer to update again after I tell it to.

Here are the specs of my FrankenTop laptop:

Operating System
Windows 10 Home 64-bit
CPU
Intel Mobile Core 2 Duo T9300 @ 2.50GHz 28 °C
Penryn 45nm Technology
RAM
3.00GB Dual-Channel DDR2 @ 332MHz (5-5-5-15)
Motherboard
ASUSTeK Computer Inc. K50IJ (Socket 478)
Graphics
Generic PnP Monitor (1366x768@60Hz)
Intel Mobile Intel 4 Series Express Chipset Family (ASUStek Computer Inc)
Intel Mobile Intel 4 Series Express Chipset Family (ASUStek Computer Inc)
Storage
111GB KINGSTON SV300S37A120G (SSD) 22 °C
968MB Multiple Card Reader (USB)
Optical Drives
Slimtype DVD A DS8A4S
Audio
VIA HD Audio

Thanks for the help I appreciate the time anyone takes to answer this for me.

Try clearing your browser cache:
Tools > Options > Advanced > Cached Web Content > Clear Now

Also, unless you actually need it, consider removing Java from your computer.

From my experience, it will be difficult for you to fix this. There can be many root causes.

I’m going to skip the legal and ethical issues and suggest that you consider using a Firefox add-on to download the video(s). The resolution will be lower, but the audio should stay in sync. Plus, it’s much easier to rewind and fast-forward a local file if you need to.

I use Firefox most of the time, but for YouTube videos I switch to Chrome. They run much more smoothly there.

To provide a better understanding of why this happens:

When sound was added to film, it was recorded on separate media. Clapper boards were used at the start of each shot so there was a distinct mark to line up the audio and the picture. For the final release, it was logical to store the audio alongside each frame, so each 1/25th-of-a-second visual frame carried the matching 1/25th-of-a-second slice of audio, and the images and sound stayed naturally synchronized.

Early digital media files worked the same way - fixed-bitrate, uncompressed video and audio. But they were slow to access and far too large for the low-bandwidth internet of the time.

Modern digital media is different - both the picture and the audio are compressed, and the algorithms for each are quite different. The work to decompress and display/play the material is also inconsistent; some material doesn’t compress well, or decompresses slowly. So you need explicit timing points (timestamps) within both the video and audio streams to keep playback accurate.

This is the job of a “container” format - it holds both compressed video and compressed audio data, the information about how the streams have been compressed and should be played back, and timing information.

A naïve container format might start with a header saying there are 100 seconds of H.264 video and 100 seconds of MP3 audio, then append the two streams one after the other.
But then you would have to read the whole file, separate out the data, decompress it, and play the video and audio together. And it would be really easy for them to drift out of sync if the MP3 audio plays back a bit faster than the video. It would also be really bad for internet streaming.
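To make that concrete, here is a toy sketch in Python of such a naïve layout. The format, field sizes, and function names are all invented for illustration - no real container works exactly like this:

```python
import struct

# Toy "naive" container: a fixed header holding the two stream lengths,
# followed by ALL the video bytes, then ALL the audio bytes.
def pack_naive(video: bytes, audio: bytes) -> bytes:
    header = struct.pack(">II", len(video), len(audio))  # big-endian lengths
    return header + video + audio

def unpack_naive(blob: bytes):
    vlen, alen = struct.unpack(">II", blob[:8])
    video = blob[8:8 + vlen]
    audio = blob[8 + vlen:8 + vlen + alen]
    return video, audio

blob = pack_naive(b"V" * 20, b"A" * 20)
video, audio = unpack_naive(blob)
# The problem: the first byte of audio sits AFTER the entire video
# stream, so a player must download everything before it can line the
# two up - and nothing in the file ties a slice of audio to a frame.
```

The layout is easy to write but hopeless for streaming, which is exactly why real containers don’t do this.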

A better “container” would split the video and audio into small chunks (say, 1/25th of a second each), and interleave those chunks of video and audio. That would be great for synchronized playback and internet streaming, but it breaks up the data at really inconvenient points for both video and audio compression - losing a chunk or two will produce really bad artifacts/distortion. Also, the overhead of the container format itself makes the final file quite a bit bigger, which is non-optimal for internet streaming.

So container formats are a balancing act between synchronization and file size/streaming capability.
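As a rough illustration of the interleaved approach (the chunk size, field names, and `Chunk` type are made up for the example):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    kind: str          # "video" or "audio"
    timestamp: float   # presentation time in seconds
    payload: bytes

def interleave(video_chunks, audio_chunks):
    # Merge the two streams into one, ordered by timestamp, so a player
    # reading front-to-back always has the audio and video for "now".
    return sorted(video_chunks + audio_chunks,
                  key=lambda c: (c.timestamp, c.kind))

video = [Chunk("video", i / 25, b"V") for i in range(3)]
audio = [Chunk("audio", i / 25, b"A") for i in range(3)]
stream = interleave(video, audio)
# At each 1/25th-second tick, the audio and video chunks sit side by side,
# so playback can stay in lockstep even over a slow connection.
```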

Then you get Quality of Service. Both content providers and your ISP monitor the rate at which data is streamed to you. If the connection cannot keep up, the content provider is signalled to send a less detailed stream that requires less bandwidth. So a good streaming container allows for bandwidth changes mid-stream without interruption or desynchronization.
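The bandwidth-adaptation step can be sketched like this. The bitrate ladder and the 0.8 safety factor are made-up numbers for illustration, not anything YouTube actually uses:

```python
# Hypothetical rendition ladder, highest quality first (kbit/s).
RENDITIONS_KBPS = [4500, 2500, 1000, 400]

def pick_rendition(measured_kbps: float, safety: float = 0.8) -> int:
    # Leave headroom below the measured throughput so momentary
    # dips don't stall playback.
    budget = measured_kbps * safety
    for rate in RENDITIONS_KBPS:
        if rate <= budget:
            return rate
    return RENDITIONS_KBPS[-1]  # connection too slow: lowest quality
```

A player would re-run something like this every few seconds and, when the answer changes, switch streams at the next chunk boundary - which is exactly the moment things can desynchronize if the container handles the switch badly.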

The process of converting video to a different container or video/audio format is called “transcoding”, and doing it well is very computationally expensive. The YouTube videos you are having issues with were probably poorly containerized when uploaded, but use a container format that looks OK to YouTube, so it doesn’t transcode them into better containers. If you hit a bandwidth or QoS issue during playback, changing streams part-way through may cause the audio desynchronization you are experiencing.


[clipped for length]

Good summary. I would just add that, for technical and practical reasons, the audio stream is usually less affected by errors and QoS. Dropping a frame or two of video, or speeding it up a bit to regain sync, is not very objectionable to most viewers. But a stutter or gaps in the audio stream can make it unintelligible. If your audio is out of sync by, say, 1.5 seconds, it is probably more accurate to say that the video is failing to sync with the audio than vice versa. The audio you hear at 1:32.033 into the video clip is probably right where it should be, but the video is not.
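To put the “video chases the audio” idea in code: here is a minimal sketch of an audio-master-clock loop. The 40 ms threshold is an assumption (roughly one frame at 25 fps); real players are far more elaborate:

```python
def sync_action(video_pts: float, audio_clock: float,
                threshold: float = 0.040) -> str:
    # The audio clock is the master: decide what to do with the next
    # video frame based on how far its timestamp (PTS) has drifted.
    drift = video_pts - audio_clock
    if drift < -threshold:
        return "drop frame"    # picture is late: skip to catch up
    if drift > threshold:
        return "repeat frame"  # picture is early: hold the last frame
    return "show frame"        # close enough: display normally
```

So a well-behaved player silently drops or repeats video frames to track the audio, and you only notice desync when it can no longer keep up.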