I don’t have a website reference for you, sorry. If you’re interested in visual effects, there used to be a magazine on the stands called VFX that did a pretty good job covering the new techniques.
But if I understand what the OP is referring to, this isn’t a visual effect. As others have suggested, it’s achieved by removing frames. I most frequently see it used in trailers; there’s a shot in the movie that scored highly with test audiences, so the producer wants it in the trailer, but it’s too long, so it’s compressed in the middle.
Hypothetical example: Imagine a shot of Lara Croft Tomb Raider where the camera looks over her shoulder from behind, seeing what she sees as she enters a spectacular space. The camera then dollies forward, panning sideways to stay focused on Lara, so that at the end we’re looking back on her face when she says, “Impressive.” In the actual (hypothetical) movie, this shot is seven seconds long, which doesn’t sound like much but is in reality way too long for a thirty-second TV commercial. So the editor modifies the shot so we get about a second at the beginning, showing the space; the camera move is highly compressed, so we zoom forward; and then we go back to normal frame rate at the end for the second it takes to see Lara’s face and hear the line.
This isn’t accomplished by removing just every tenth frame or so. It’s done by removing nine out of ten frames, or however many are necessary to compress the shot to the desired length without making the contents unrecognizable.
(Do the math: 24 frames per second in Western film. A seven second shot, 168 frames, shrinks to, say, three seconds, or 72 frames. The first 24 and the last 24 are intact. That means the 120 frames in the middle of the shot are compressed to 24; out of 120 frames, 96 are removed. That’s four out of five.)
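If it helps to see that arithmetic spelled out, here is a rough Python sketch. It is not how any editing system actually works internally; the function name and its parameters are made up for illustration, and it assumes whole-second lengths and evenly spaced picks from the middle of the shot.

```python
FPS = 24  # film frame rate used in the example above


def compress_middle(total_seconds, target_seconds,
                    head_seconds=1, tail_seconds=1, fps=FPS):
    """Return the indices of the frames kept when only the middle
    of a shot is compressed (hypothetical helper, evenly spaced picks)."""
    total = total_seconds * fps              # 7 s -> 168 frames
    head = head_seconds * fps                # first 24 frames stay intact
    tail = tail_seconds * fps                # last 24 frames stay intact
    middle_in = total - head - tail          # 120 frames to squeeze
    middle_out = target_seconds * fps - head - tail  # 24 frames of room
    if middle_out <= 0:
        raise ValueError("target is too short to keep the head and tail")
    step = middle_in / middle_out            # 5.0 -> keep one frame in five
    kept_middle = [head + int(i * step) for i in range(middle_out)]
    return (list(range(head))
            + kept_middle
            + list(range(total - tail, total)))


kept = compress_middle(7, 3)
removed = 7 * FPS - len(kept)
print(len(kept), removed)  # 72 96
```

The 96 frames dropped all come out of the 120 in the middle, which is the four-out-of-five figure.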
The reason this has become popular is that with the new digital editing tools it’s relatively easy to accomplish. In the old days, where you were physically cutting film, this would have been a nightmare. But now, on an Avid deck (or whatever), you can easily go through a snippet of film and tell the machine to take just this frame and just this frame and just this frame and stick them all together. It can be time-consuming, but we’re talking a couple of minutes, whereas before it might have been half an hour.
Even better, if the resulting “swoop” is unreadable to the eye, you can modify your choices by pushing the frame selections forward or backward a couple to maintain enough coherence in the image that the audience can tell what it is. In other words, rather than blindly choosing every fifth frame (1, 6, 11, 16, 21, 26, etc.), if that makes the sequence too choppy, you can try slipping alternating selections closer together: 1, 5, 12, 15, 22, 25, and so on. The slight increase in “persistence of image” will help your eye decipher it, but it’s going so fast you don’t notice the herky-jerk timing.
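The slipping idea can be sketched the same way. This is only my guess at a rule that reproduces the numbers above (leave the first pick alone, then nudge alternate picks one frame earlier and one frame later); in practice an editor does this by eye, and both function names here are invented for the example.

```python
def every_nth(first, last, n):
    """Blindly pick every n-th frame number from first to last, inclusive."""
    return list(range(first, last + 1, n))


def slip_alternating(frames, slip=1):
    """Assumed rule: keep the first pick, then pull odd-positioned picks
    earlier and push even-positioned picks later by `slip` frames."""
    if not frames:
        return []
    out = [frames[0]]
    for i, frame in enumerate(frames[1:], start=1):
        out.append(frame - slip if i % 2 == 1 else frame + slip)
    return out


base = every_nth(1, 26, 5)
slipped = slip_alternating(base)
print(base)     # [1, 6, 11, 16, 21, 26]
print(slipped)  # [1, 5, 12, 15, 22, 25]
```

The total number of frames kept doesn’t change, so the shot stays the same length; only the spacing between picks does.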
Frames are sometimes removed for other reasons, too, which is perhaps where the original idea for the technique came from. I can’t tell you how many times I’ve seen a fight scene where a frame or two is removed at the moment somebody gets struck in the face, for example. Similarly, slapstick comedy, especially hits and pratfalls, can sometimes be punched up by the careful removal of a frame at a particular point.
And slow-motion can sometimes be “recovered” using the same method. If you’ve shot a sequence at 48 frames per second (twice normal film speed, resulting in half normal speed on playback), it’s pretty straightforward, albeit laborious, to go through and pull out every other frame to get it back to the 24-frame standard. I seem to recall some major film in the last couple of years doing this: There was an inexperienced director who had shot some important moment in slow-motion to emphasize the lead character’s emotional transition, but in the editing room it didn’t work, so the editor had to strip out alternating frames at some bizarre ratio to make the shot look “normal.” I want to say it was Bridget Jones’s Diary, but I may be misremembering.
Oh, one more thing:
A “crash zoom” doesn’t involve tracking. All it means is a very fast zoom. If you’ve seen the trailer for Tarantino’s upcoming Kill Bill, there’s a crash zoom at the point all the bad guys come running into some room; the camera starts wide so we see the crowd dashing in, then they stop and there’s a crash zoom to the guy in the center so we have an extreme closeup of him screaming, and then there’s a reverse crash zoom back to the wide angle so we see the crowd charging forward again.
What you’re thinking of is the “tracking zoom,” or the “Vertigo effect” after its prominent use in the Hitchcock film. The idea is (almost) as you have described it: The camera tracks or dollies forward while the zoom backs out, or vice versa. It works because of the way different lenses see the same material. With a short lens, foreground material pops out while the background recedes; with a long lens, the background comes forward, creating a flattened look.
A famous recent use of this technique is in Goodfellas, where Ray Liotta and Robert De Niro are sitting at the table in the diner, with a window in the wall behind them framed between the two. The camera slowly tracks backward while the camera operator zooms in at a matching rate. The effect is to keep the foreground characters in the same position, but to slowly make the background swell in significance, as if to suggest that reality is overtaking these two criminals and they’ll no longer be able to live in their own world.
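For anyone who wants the geometry behind why the background swells, here is a back-of-the-envelope Python sketch. It assumes a simple pinhole-style model where an object’s size on film is roughly focal length times real size divided by distance; the function names and every number in it (distances, focal lengths, object sizes) are made up for illustration and have nothing to do with the actual shot.

```python
def focal_length_for(distance, start_distance, start_focal):
    """Focal length that keeps the subject the same size on film as the
    camera moves from start_distance to distance (simple model)."""
    return start_focal * distance / start_distance


def image_size(real_size, distance, focal):
    """Approximate size on film: focal length * real size / distance."""
    return focal * real_size / distance


start_d, start_f = 2.0, 25.0    # hypothetical: 2 m from the actors, 25 mm lens
face, window = 0.25, 3.0        # hypothetical real sizes in metres
behind = 10.0                   # hypothetical: background is 10 m past the actors

sizes = []
for d in (2.0, 3.0, 4.0):       # camera tracking backward
    f = focal_length_for(d, start_d, start_f)   # zooming in to compensate
    sizes.append((image_size(face, d, f),
                  image_size(window, d + behind, f)))

for subject, background in sizes:
    print(round(subject, 3), round(background, 3))
```

In this model the first column stays constant while the second keeps growing: the foreground holds its size and the background gets bigger in frame as the camera pulls back and the lens gets longer.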
Anyway, let me know if you have any questions or if any of this is unclear.