Is it safe to black personal information on a jpeg?

Stan_Shmenge · May 12, 2008, 4:25am

Well, actually it did. That’s why Mr. Swirlyface’s restored image does have some artifacts in it. Some original data was removed because it DID change the data. You can do this yourself in Photoshop or Gimp and depending on the offset between the center of the original effect and counter effect you can get some interesting combinations, but if you don’t offset you will still lose data. Like on this D-bag’s nose area. But it was still enough to identify him.

/pet peeve

pulykamell · May 12, 2008, 5:04am

After looking through your examples a couple of times, I have a better idea of what you’re talking about. I’m going from step 1 to step 3 in your example. You’re going via some really heavily compressed JPEGs, and with a blacking-out procedure that basically encapsulates the very edges of letter boundaries, thus leaving behind a little bit of JPEG compression. In a very specific example such as yours, you leave behind some evidence of the underlying text.

But if you look in your example 3, your orange box is not covering all the underlying information, and you can quite clearly see this in Photoshop, if you look at your channels or whatever. Your method may possibly yield the vaguest insights if somebody redacts a single line of heavily compressed text so precisely that they leave behind some JPEG artifacts.

si_blakely · May 12, 2008, 6:52am

But you can identify the age of the table from the pattern of tree rings exposed in the photo, and couple that with the exif data from the camera…

It’s Worse Than Failure (or at least, it was for a while).

Si

Mangetout · May 12, 2008, 8:38am

pulykamell:

After looking through your examples a couple of times, I have a better idea of what you’re talking about. I’m going from step 1 to step 3 in your example. You’re going via some really heavily compressed JPEGs, and with a blacking-out procedure that basically encapsulates the very edges of letter boundaries, thus leaving behind a little bit of JPEG compression. In a very specific example such as yours, you leave behind some evidence of the underlying text.

But if you look in your example 3, your orange box is not covering all the underlying information, and you can quite clearly see this in Photoshop, if you look at your channels or whatever. Your method may possibly yield the vaguest insights if somebody redacts a single line of heavily compressed text so precisely that they leave behind some JPEG artifacts.

If someone is blocking out text in a document, it’s possible that they might do so quite tightly, so as not to destroy text on adjacent lines.

I’m sure nobody here imagines a complete reconstruction of the destroyed data would be possible, and for sure, threemae’s example is constructed so as to make the effect grossly visible, but even in normal cases, the artefacts are still there, just not contrasty enough to be visible to casual observation.

Also, if the image has been compressed several times (and resized so as to shift the compression blocks), it’s possible that several generations of artefacts could have spread out further than one block from the original data. Such spreading would be even harder, perhaps impossible, to work backwards, but I suppose there are cases where even the slightest clue about the original contents could prove useful.
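A minimal sketch of the single-generation effect, in pure Python. It is not a real JPEG codec: it assumes one grayscale 8x8 block, a plain DCT, and a single uniform quantization step (`step=40`, a made-up value) in place of JPEG's quantization tables. The pixel values and the one-column "stroke" are also hypothetical. The point is only that lossy compression pushes traces of the stroke into neighbouring pixels of the same block, so a tightly fitted black bar leaves them behind.

```python
import math

N = 8  # JPEG transforms the image in 8x8 pixel blocks

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct_1d(v):
    """Orthonormal 8-point DCT-II."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]

def idct_1d(c):
    """Inverse of dct_1d (DCT-III)."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]

def transpose(block):
    return [list(col) for col in zip(*block)]

def transform_2d(block, f):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [f(r) for r in block]
    return transpose([f(c) for c in transpose(rows)])

def lossy_roundtrip(block, step=40):
    """Assumed stand-in for JPEG: DCT, uniform quantization, inverse DCT."""
    coeffs = transform_2d(block, dct_1d)
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    pixels = transform_2d(quantized, idct_1d)
    return [[min(255, max(0, round(p))) for p in row] for row in pixels]

BACKGROUND, INK, STROKE_COL = 200, 40, 3  # hypothetical gray levels and position

def make_block(with_stroke):
    """Flat gray block, optionally with a dark vertical stroke in one column."""
    return [[INK if (with_stroke and c == STROKE_COL) else BACKGROUND
             for c in range(N)] for _ in range(N)]

def redact_tightly(block):
    """Black out only the stroke column, as a tight redaction would."""
    return [[0 if c == STROKE_COL else p for c, p in enumerate(row)]
            for row in block]

def residue(block):
    """Count pixels outside the black bar that differ from the clean background."""
    return sum(1 for row in block for c, p in enumerate(row)
               if c != STROKE_COL and p != BACKGROUND)

leaky = redact_tightly(lossy_roundtrip(make_block(True)))
blank = redact_tightly(lossy_roundtrip(make_block(False)))
print("residue with text under the bar:", residue(leaky))
print("residue with nothing under the bar:", residue(blank))
```

The two redacted blocks have the same black bar, but only the one that had a stroke underneath carries disturbed pixels next to it. Recovering what the stroke actually was from that residue is a much harder problem, which this sketch does not attempt.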

Pasta · May 12, 2008, 9:13am

Definitely not the best way. What the scanner sees and what you see could be quite different.

Back when Dick Cheney shot that one guy, cnn.com posted a scanned copy of the incident report. Mr. Cheney’s personal information had been blacked out with a marker prior to scanning, and I happened to download a copy almost immediately after it was posted. You could totally read half of his driver’s license number, among other things. I couldn’t believe what I was seeing. Less than an hour later, cnn.com had added digital blackout. I always wondered what happened to that poor intern who did the original scanning. (I have a copy of the original PDF on one of my machines somewhere. I’ll see if I can find it.)

PuddingCat · May 12, 2008, 9:54am

This is a really bad idea. In the past I’ve used my very high end (for its time) scanner to scan photographs and effectively re-develop them. It had such a high sensitivity in the scanning head that combined with photoshop I could lighten apparently solid black areas and clearly show incredible detail that wasn’t present on the original photograph.

It would be quite simple to use this with your method. For instance, if you take a laser printed document and jiffy through it, by holding it just so wrt the lights you can see different sheens and surface textures caused by the jiffy ink and the toner particles. If you can see that with your eye, you’d be amazed at what I can grab with the scanner.
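A rough sketch of the "lighten the black" step, assuming the scanner records marker over bare paper and marker over toner as slightly different near-black values. The readings below (6 and 3 on a 0 to 255 scale) are invented for illustration, and `stretch_levels` is a hypothetical helper that does roughly what a levels adjustment in an image editor does.

```python
def stretch_levels(pixels, low, high):
    """Linearly remap the range [low, high] onto [0, 255], clipping outside it."""
    span = high - low
    return [[min(255, max(0, round((p - low) * 255 / span))) for p in row]
            for row in pixels]

# Hypothetical scan of a blacked-out area: everything looks black to the eye,
# but marker over toner (3) reads slightly darker than marker over paper (6).
scan = [
    [6, 6, 6, 6, 6, 6, 6],
    [6, 3, 3, 3, 6, 3, 6],
    [6, 6, 3, 6, 6, 3, 6],
    [6, 6, 3, 6, 6, 3, 6],
    [6, 6, 6, 6, 6, 6, 6],
]

revealed = stretch_levels(scan, low=3, high=6)

# Show the stretched result as text: '#' for dark pixels, '.' for light ones.
for row in revealed:
    print("".join("#" if p < 128 else "." for p in row))
```

A three-level difference that is invisible on screen becomes full black against full white once the range is stretched, which is why a marker alone is a weak redaction for anything that will be scanned.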

tim

ps Loved the Daily WTF reference. Intentional or otherwise?

threemae · May 12, 2008, 3:39pm

Exactly, I’m not trying to scare anyone or say that blacking out JPEGs is a complete failure. Instead, I’m just saying that “JPEGs are flat, it is absolutely impossible for information under the black boxes to leak out” isn’t strictly correct. As I’ve said before, it depends on the route the image takes to get to the final .jpg. If it’s scanned and saved as a .jpg, blacked out, and then saved as a .jpg again, it’s an issue. If it’s a screen capture of a vector PDF containing the sensitive information, blacked out before it is ever saved as a .jpg, it’s not a problem. And aren’t rigorously correct answers what The Dope is about?

Again, whether it’s a practical concern has everything to do with the importance of privacy and the sophistication of the audience that might be trying to learn the removed information. It probably isn’t a problem for what the OP intends to use it for, but again, I wouldn’t use it to black out information on the machining of plutonium nuclear warheads.

pulykamell · May 12, 2008, 3:41pm

If you’re starting from a non-compressed or losslessly compressed file, it is impossible for the information to leak out.
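One way to see why, as a small sketch: if the black box fully covers the sensitive pixels and the source was never lossily compressed, two pages with different secrets become pixel-for-pixel identical after redaction, so any file saved from them is identical too. The page size, box position, and bit-string "secrets" below are made up, and `zlib` is used only as a stand-in for a lossless image format.

```python
import zlib

WIDTH, HEIGHT = 12, 4
BOX = (1, 2, 3, 10)  # top, left, bottom, right (exclusive); hypothetical redaction area

def make_page(secret):
    """White page (255) with the secret's bits drawn as dark pixels inside BOX."""
    page = [[255] * WIDTH for _ in range(HEIGHT)]
    top, left, bottom, right = BOX
    box_width = right - left
    for i, bit in enumerate(secret):
        if bit == "1":
            page[top + i // box_width][left + i % box_width] = 0
    return page

def redact(page):
    """Overwrite every pixel inside BOX with black."""
    top, left, bottom, right = BOX
    return [[0 if top <= r < bottom and left <= c < right else p
             for c, p in enumerate(row)] for r, row in enumerate(page)]

def save_lossless(page):
    """Stand-in for saving to a lossless format: compress the raw pixel bytes."""
    return zlib.compress(bytes(p for row in page for p in row))

page_a = make_page("1011001110001101")
page_b = make_page("0100110001110010")

file_a = save_lossless(redact(page_a))
file_b = save_lossless(redact(page_b))

print("pages differ before redaction:", page_a != page_b)
print("files identical after redaction:", file_a == file_b)
```

Since the two saved files are byte-for-byte the same, nothing in them can distinguish one secret from the other. The argument depends on the box covering every pixel the secret touched and on no lossy step having happened before the redaction.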

Mangetout · May 12, 2008, 9:28pm

Agreed, however, I wonder how often that will tend to be the case nowadays.

An image scanned directly into an image program (well, any decent one) and edited to add the black bar will not have experienced compression until the data has been destroyed (unless there are some scanners/scanner drivers that compress on the fly or something), but some of the other ways of getting an image into a computer (most digital cameras, under their default settings, for example) will automatically involve some kind of compression, before you even get your hands on the image to edit it.

Mangetout · May 12, 2008, 9:31pm

That’s what I thought it was. I often find myself adding extra qualifiers just because I know if I don’t someone will nitpick me. There’s one in my previous post - the bit about default settings - if I just said “most digital cameras compress images”, someone would point out that quite a few offer the option of not compressing.
