Short version: There’s no one hard-and-fast policy. It’s context-sensitive. In general, you do the best you can given time/space constraints.
In ordinary subtitling, by which I mean a job where there was ample time to get it done, you’d never leave it blank. You’ve got the mood of the scene, the intent of the writers, and the importance of the wordplay itself to weigh, as well as the time constraints. So you pick the most important elements and make it fit as best you can. (I saw a lot of English shows subtitled into Quebecois French in my youth, so I have an extensive frame of reference, though a dated one.)
A more recent case in point: In the movie Amélie, the mentally challenged fellow is going off about the awful greengrocer Colignon and getting some good rhyming burns in. “Colignon, tête de fion,” got translated to, “Colignon, big moron.”
The character was firing off sick burns in a rapid-fire, rhyming kind of way. The English translation keeps that general feel but is a swing and a miss for meaning. Still, it generally fit the scene, all other things considered.
If you’re talking about real-time subtitling or live interpretation, I guess that’ll depend on the wit of the person doing it. As came up in another thread (it’s recently active, but it’s faster to recall from memory than to look up), an interpreter might just say, “He has made a joke, please laugh,” rather than even try to get the joke across in real time.
That’s about as precise an answer as I can give. I mean, other than, “They do the best they can,” which isn’t a satisfying response.
As an aside, my version of Velocity’s cat joke would have been, “purchatoire”.