One of the problems with mental models of how the sea surface changes is that we often don’t account for how shallow the sea is compared to its extent.
The bathtub example fails because the water has a depth of a similar magnitude to its extent. Change the example to one that is representative of the real geometry, with an aspect ratio of the order of a thousand to one, and it becomes apparent that mixing at the edges of regions of different sea surface temperature is going to be very slow relative to the extent of the water. There is unlikely to be much of any flow where the warmer water overtops the cooler water and drives large-scale mixing. Even if there were, the region of mixing would only have an extent similar to the depth of the water.
But a few centimetres of height difference between bodies of water a couple of kilometres deep is only going to drive very slow mixing.
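To put rough numbers on the geometry argument above, here is a minimal sketch of the arithmetic. The bathtub and ocean dimensions are round figures assumed purely for illustration, and `aspect_ratio` is a hypothetical helper, not something taken from any measurement or library.

```python
def aspect_ratio(extent_m: float, depth_m: float) -> float:
    """Horizontal extent divided by depth, both in metres."""
    return extent_m / depth_m


# Assumed, illustrative dimensions (round numbers, not measurements).
bathtub = aspect_ratio(extent_m=1.5, depth_m=0.5)            # a domestic bathtub
ocean = aspect_ratio(extent_m=4_000_000.0, depth_m=4_000.0)  # 4000 km wide, 4 km deep

# If a mixing zone is only about as wide as the water is deep,
# this is the share of the basin width it would cover.
mixing_fraction = 1.0 / ocean

print(f"bathtub: about {bathtub:.0f} to 1")
print(f"ocean basin: about {ocean:.0f} to 1")
print(f"mixing zone as a share of basin width: {mixing_fraction:.1%}")
```

With these assumed figures the bathtub is only a few times wider than it is deep, while the ocean basin is about a thousand times wider, so a mixing zone one depth wide would span roughly a thousandth of the basin.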
The thermohaline circulations take of the order of hundreds to a thousand years to complete a circuit. They are driven in part by temperature, but mostly by differences in density due to different salt concentrations. The circuit time gives us an insight into the timescale over which we might expect any levelling of inhomogeneities to occur.