Well, actually it did. That’s why Mr. Swirlyface’s restored image has some artifacts in it: some of the original data was lost because the effect DID change the data. You can try this yourself in Photoshop or GIMP. Depending on the offset between the center of the original effect and the counter-effect you can get some interesting combinations, but even with no offset you will still lose data, like in this D-bag’s nose area. But there was still enough left to identify him.
After looking through your examples a couple of times, I have a better idea of what you’re talking about. I’m going from step 1 to step 3 in your example. You’re going via some really heavily compressed JPEGs, and with a blacking-out procedure that basically encapsulates the very edges of letter boundaries, thus leaving behind a little bit of JPEG compression. In a very specific example such as yours, you leave behind some evidence of the underlying text.
But if you look in your example 3, your orange box is not covering all the underlying information, and you can quite clearly see this in Photoshop, if you look at your channels or whatever. Your method may possibly yield the vaguest insights if somebody redacts a single line of heavily compressed text so precisely that they leave behind some JPEG artifacts.
If someone is blocking out text in a document, it’s possible that they might do so quite tightly, so as not to destroy text on adjacent lines.
I’m sure nobody here imagines a complete reconstruction of the destroyed data would be possible, and for sure, threemae’s example is constructed so as to make the effect grossly visible, but even in normal cases, the artefacts are still there, just not contrasty enough to be visible to casual observation.
Also, if the image has been compressed several times (and resized so as to shift the compression blocks), it’s possible that several generations of artefacts could have spread out further than one block from the original data. Such spreading would be even harder, perhaps impossible, to work backwards, but I suppose there are cases where even the slightest clue about the original contents could prove useful.
Definitely not the best way. What the scanner sees and what you see could be quite different.
Back when Dick Cheney shot that one guy, cnn.com posted a scanned copy of the incident report. Mr. Cheney’s personal information had been blacked out with a marker prior to scanning, and I happened to download a copy almost immediately after it was posted. You could totally read half of his driver’s license number, among other things. I couldn’t believe what I was seeing. Less than an hour later, cnn.com had added digital blackout. I always wondered what happened to that poor intern who did the original scanning. (I have a copy of the original PDF on one of my machines somewhere. I’ll see if I can find it.)
This is a really bad idea. In the past I’ve used my very high-end (for its time) scanner to scan photographs and effectively re-develop them. The scanning head was so sensitive that, combined with Photoshop, I could lighten apparently solid black areas and clearly show incredible detail that wasn’t visible on the original photograph.
It would be quite simple to use this with your method. For instance, if you take a laser printed document and jiffy through it, by holding it just so wrt the lights you can see different sheens and surface textures caused by the jiffy ink and the toner particles. If you can see that with your eye, you’d be amazed at what I can grab with the scanner.
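To make the "lighten apparently solid black areas" step concrete, here is a minimal sketch of a levels stretch in plain Python. The pixel values are made up for illustration (marker over bare paper reading 9, marker over toner reading 4 on a 0-255 scale), and stretch_levels is a hypothetical helper, not anything Photoshop exposes under that name. The point is only that a difference far too small to see by eye becomes full black versus full white once the range is stretched.

```python
def stretch_levels(pixels):
    """Linearly stretch the darkest value to 0 and the brightest to 255."""
    flat = [v for row in pixels for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Truly uniform black: there is nothing to recover.
        return [[0] * len(row) for row in pixels]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in pixels]


# Hypothetical 8-bit scan of a strip covered with marker.
# Assumed readings: 9 = marker over bare paper, 4 = marker over toner.
scan = [
    [9, 4, 9, 9, 4, 9],
    [9, 4, 4, 9, 4, 9],
    [9, 4, 9, 4, 4, 9],
]

recovered = stretch_levels(scan)
for row in recovered:
    # Print stretched pixels as '#' (dark, toner) or '.' (light, paper).
    print("".join("#" if v < 128 else "." for v in row))
```

Whether a real scanner separates the two cases this cleanly depends on the ink, the toner and the scanning head, so treat the numbers as an assumption.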
tim
ps Loved the Daily WTF reference. Intentional or otherwise?
Exactly. I’m not trying to scare anyone or say that blacking out JPEGs is a complete failure. Instead, I’m just saying that “JPEGs are flat, it is absolutely impossible for information under the black boxes to leak out” isn’t strictly correct. As I’ve said before, it depends on the route the image takes to get to the final .jpg. If it’s scanned and saved as a .jpg, then blacked out and saved as a .jpg again, it’s an issue. If it’s a blacked-out screen capture of a vector PDF containing the sensitive information, it’s not a problem. And aren’t rigorously correct answers what The Dope is about?
Again, whether it’s a practical concern has everything to do with the importance of privacy and the sophistication of the audience that might be trying to learn the removed information. It probably isn’t a problem for what the OP intends to use it for, but again, I wouldn’t use it to black out information on the machining of plutonium nuclear warheads.
Agreed, however, I wonder how often that will tend to be the case nowadays.
An image scanned directly into an image program (well, any decent one) and edited to add the black bar will not have experienced compression until after the data has been destroyed (unless there are some scanners/scanner drivers that compress on the fly or something). But some of the other ways of getting an image into a computer (most digital cameras, under their default settings, for example) will automatically involve some kind of compression before you even get your hands on the image to edit it.
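For anyone who wants to see why the order matters, here is a rough sketch in plain Python of a single 8x8 block going through a JPEG-like lossy save. It is a simplification under stated assumptions: one greyscale block, a uniform quantization step of 40 instead of a real JPEG quantization table, and made-up pixel values (a grey background of 200 with or without a one-pixel black stroke). All the function names are hypothetical helpers. Both routes black out the left four columns; the question is whether the four columns left visible still depend on what was underneath.

```python
import math

N = 8  # JPEG transforms the image in 8x8 pixel blocks


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct_1d(v):
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]


def idct_1d(c):
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]


def transpose(block):
    return [list(col) for col in zip(*block)]


def apply_2d(block, transform):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [transform(r) for r in block]
    return transpose([transform(c) for c in transpose(rows)])


def lossy_save(block, step=40):
    """DCT, coarse quantization, inverse DCT. `step` is an assumed uniform step."""
    coeffs = apply_2d(block, dct_1d)
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    pixels = apply_2d(quantized, idct_1d)
    return [[min(255, max(0, round(p))) for p in row] for row in pixels]


def black_out(block, width=4):
    """Paint the leftmost `width` columns solid black."""
    return [[0] * width + row[width:] for row in block]


def visible_part(block, width=4):
    """The columns a viewer can still see next to the black box."""
    return [row[width:] for row in block]


GREY = 200  # assumed background level
blank = [[GREY] * N for _ in range(N)]                      # nothing under the box
stroke = [[GREY, 0] + [GREY] * (N - 2) for _ in range(N)]   # a black stroke under the box

# Route 1: compressed first (camera, or scan saved as .jpg), blacked out afterwards.
leaky_blank = visible_part(black_out(lossy_save(blank)))
leaky_stroke = visible_part(black_out(lossy_save(stroke)))

# Route 2: blacked out first, compressed afterwards.
safe_blank = visible_part(lossy_save(black_out(blank)))
safe_stroke = visible_part(lossy_save(black_out(stroke)))

print("compress then redact, visible pixels differ:", leaky_blank != leaky_stroke)
print("redact then compress, visible pixels differ:", safe_blank != safe_stroke)
```

In the first route the quantization error from the stroke is smeared across the whole block, so the pixels beside the box carry a faint trace of it. In the second route the stroke is gone before any compression happens, so both inputs are identical by the time they are saved. Real files involve many blocks, colour channels and real quantization tables, so this only illustrates the mechanism, not how much could actually be recovered.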
That’s what I thought it was. I often find myself adding extra qualifiers just because I know if I don’t someone will nitpick me. There’s one in my previous post - the bit about default settings - if I just said “most digital cameras compress images”, someone would point out that quite a few offer the option of not compressing.