Is it safe to black personal information on a jpeg?

si_blakely · May 10, 2008, 11:13pm

One of the problems with releasing redacted material is not understanding how the redacting tool works. In a number of highly public instances, PDFs containing redacted material have been released, only for the obscured material to be extracted. This usually happens because someone uses a PDF editing tool that creates a black rectangle over the offending material. If you printed this, you could not see the hidden data. But the data is still present within the PDF. All you have to do is open and edit the PDF to remove the obscuring rectangles, and the data is now visible.

It may be that this has been addressed in the latest versions with the Redaction tool, but I would be very careful about using PDFs. Likewise (as mentioned) Word can retain historic editing information which may inadvertently leak unintended information. Even using a black felt pen on a printed document is not safe - the inks may have different fluorescence spectra that can be used to see the redacted info, and toner can leave a shape on the paper’s surface that can be read using specific light sources.

The scan, edit as a bitmap and flatten to a JPEG scheme is probably safe (actually safer than printing and using a black felt pen), but don’t ever make assumptions unless you understand the process and the tools.

Si

Mangetout · May 10, 2008, 11:26pm

Only if the image was not previously a jpeg - I believe threemae is referring to jpeg artefacts that become permanently engrained in an image - you can black out part of it, editing as a bitmap, but if those artefacts are already there, they will contain some information about the object that originally provoked them.

Not enough, I’d wager, to be able to retrieve anything meaningful in anything but the most exotic of cases.

Mangetout · May 10, 2008, 11:30pm

Also, hasn’t there been some flawed implementation of some lossy compressed format that didn’t truncate a file if it was opened and edited in such a way as to compress to a smaller size? I doubt if such a thing survives today, but I’m sure I remember hearing about such a thing happening once.

kellner · May 11, 2008, 12:44am

That’s generally true but this interdependence has its limits. JPEG images are compressed in blocks of 8x8 pixels. Those are processed almost - but not quite - independently. This means that by and large artefacts remain within the 8x8 blocks where they originated. In your example some artefacts show more than 8 pixels from the apparent edge of the letters but that’s because the background close to your smoothed letters isn’t exactly the same green as the rest (but that’s a separate issue unrelated to JPEG.) IIRC the only information shared between blocks in a JPEG image is the so-called “DC coefficient.” Very, very roughly, the average of one block is expressed relative to the average of the preceding block. Anyway this has to be computed anew each time the image is saved. So the blocks are not theoretically independent but the possibility of useful data bleeding across block boundaries is very limited.
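A rough Python sketch of the DC-coefficient point, for anyone who wants to see it in numbers. It skips the DCT, quantization and Huffman coding entirely and treats each block's DC value as its plain average, so it is only an illustration of the "relative to the preceding block" idea, not real JPEG. All function names are made up for this example.

```python
def block_averages(blocks):
    """Stand-in for the DC term: the integer mean of each 8x8 block (64 values)."""
    return [sum(block) // len(block) for block in blocks]


def dc_differences(dc_values):
    """Express each DC value relative to the previous one (the first relative to 0)."""
    diffs = []
    previous = 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_from_differences(diffs):
    """Undo dc_differences by accumulating the differences."""
    values = []
    previous = 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


# Three neighbouring blocks: background, a block containing dark text, background.
before = [[120] * 64, [120] * 32 + [30] * 32, [120] * 64]
# The same three blocks after the middle one has been painted solid black.
after = [[120] * 64, [0] * 64, [120] * 64]

diffs_before = dc_differences(block_averages(before))
diffs_after = dc_differences(block_averages(after))

print(diffs_before)  # differences computed from the original pixels
print(diffs_after)   # differences recomputed from the painted pixels only
```

The differences are derived from whatever pixels are present when the file is saved, so in this toy model the neighbouring blocks end up coded relative to the black block, with nothing of the old middle block involved.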

Mister_Rik · May 11, 2008, 2:01am

I think that the accepted terminology is confusing this issue. We use terms like “painting over” or “blacking out”, and our brains associate these terms with the “physical world” acts of the same name. For example, using a paintbrush and a bucket of paint to paint over the graffiti on the side of a building.

However, this is a case where the terminology is inadequate. When working with a flat image format there is really no such thing as “painting over”. When you select your “paintbrush” tool and use it to “paint over” something in the image, what you’re actually doing is “completely replacing” what was there before.

In real world terms, it’s more akin to using a stick to write in the sand, and then using a broom or a rake to wipe the writing away. When you do that, it’s literally as if the writing had never been there.

I’m unfamiliar with Paint and PSP, but using the paintbrush tool in Photoshop does not create a new layer. Using the shape tool to draw a rectangle does create a new layer, but flattening the image again completely replaces the previously existing pixels with the rectangle’s pixels, and as has been mentioned already, simply saving the file as a JPEG will automatically flatten the image.
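To make the layer point concrete, here is a toy model in Python. It is not how Photoshop stores anything; it just assumes a "document" is a list of pixel grids where None means transparent, and that flattening copies the topmost opaque pixel into a single grid. The names (flatten, background, rectangle) are invented for the example.

```python
BLACK = 0


def flatten(layers):
    """Composite layers bottom to top into one grid; None means transparent."""
    height = len(layers[0])
    width = len(layers[0][0])
    result = [[None] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    # An opaque pixel replaces whatever was below it.
                    result[y][x] = layer[y][x]
    return result


background = [
    [200, 200, 200, 200],
    [200, 17, 42, 200],  # 17 and 42 stand in for the sensitive pixels
    [200, 200, 200, 200],
]

rectangle = [
    [None, None, None, None],
    [None, BLACK, BLACK, None],
    [None, None, None, None],
]

layered_document = [background, rectangle]  # the secret is still in layer 0
flat = flatten(layered_document)            # the secret is gone from this grid

print(flat)
```

As long as the layered version is what gets saved, the values 17 and 42 are still sitting in the bottom layer. Once only the flattened grid is saved, there is nowhere for them to be.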

Edit: I’ll add that you might feel more comfortable using the Eraser tool to “erase” the sensitive information.

GuanoLad · May 11, 2008, 2:05am

Nothing like a group of experts to overcomplicate what’s a very simple issue.

Black out what needs blacking out and save as a new JPEG file, keeping the original safe. Done.

Colophon · May 11, 2008, 2:07am

I can’t believe how many complicated answers have been given to this question.

The simple answer is yes, it is completely safe. JPEGs don’t have layers. What’s under the rectangle is gone for ever.

Gukumatz · May 11, 2008, 2:13am

In Paint, yes.

However, you might also want to use the same technique to obscure information like document identification number, etc. Particularly if it’s from a document that’s available online, or subject to public disclosure.

crowmanyclouds · May 11, 2008, 6:54am

This [del]delightfully[/del] tragically happened to TechTv’s Cat Schwartz!

CMC +fnord!

Bob55 · May 11, 2008, 2:45pm

If you really want to be sure it’s lost, paint over it like you suggested, and then do a screenshot (printscreen), paste that into a new picture-editing program, crop it, and save it as a new file. No way any data would stick through that. But I’m sure Paintbrush is just fine.

drhess · May 11, 2008, 3:47pm

If we are discussing JPGs, then I have an even simpler way: select the space you want to remove and then “cut” it out. Presto. (Actually this *is* the same as painting over, but for the non-technical, it might seem more like what would work best with real paper and pen: cut out the text, not marker over it.)

alice_in_wonderland · May 11, 2008, 4:05pm

Well, this isn’t nearly as sophisticated as what you guys are doing, but I would just print out the document, use a Jiffy marker to cover the info you want covered, and scan it back in as a new .jpg.

Pretty well 100% fool proof, but not as high tech as these answers.

Derleth · May 11, 2008, 4:21pm

I use Linux and the JPEG format doesn’t change based on OS. My information and CookingWithGas’s information are correct as regards all OSes.

And, in what way did you write your own screenshot script? Does it talk Xlib or Gtk or is it a Perl or shell script that invokes xwd? I’m just curious.

You can do that with other tools. This project looks promising, for example.

I do a lot of work with digital image files too. You can save it with a more descriptive name and erase the original later.

I don’t know what tools you’re using, I suppose.

It’s an open standard and there are third-party PDF readers, but that is not relevant if Acrobat does what minor7flat5 says it does. Since Acrobat is closed-source, there is no way to verify that.

By which you mean “no”. It isn’t possible except in a Tom Clancy novel.

Fubaya · May 11, 2008, 5:05pm

Well yeah, of course it doesn’t change. I was just pointing out that I’d rather not discuss MS Paint or whatever.

It just uses the import command but does a little other work on them and names them sequentially.

Sure, for anything you want to do on a computer, you can probably do it with other tools. That’s one program to read exif data, jhead will actually delete it.
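For the curious, a minimal Python sketch of what "deleting the EXIF data" amounts to at the byte level. This is not how jhead works internally (I have not checked); it assumes the usual JPEG layout where metadata sits in marker segments before the start-of-scan marker, and that EXIF lives in APP1 (0xFFE1) segments. The function name strip_app1 and the sample bytes are invented for the example, and real files can have quirks this ignores (fill bytes, metadata or thumbnails in other segments such as APP13).

```python
import struct


def strip_app1(jpeg_bytes):
    """Return a copy of a JPEG byte string without its APP1 (0xFFE1) segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is image data, copy it as-is.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if marker != 0xE1:
            out += jpeg_bytes[pos:pos + 2 + length]
        pos += 2 + length
    out += jpeg_bytes[pos:]
    return bytes(out)


# A tiny synthetic "JPEG" (not a decodable image), just to exercise the parser.
soi = b"\xff\xd8"
app0 = b"\xff\xe0" + struct.pack(">H", 2 + 5) + b"JFIF\x00"
app1 = b"\xff\xe1" + struct.pack(">H", 2 + 10) + b"Exif\x00\x00" + b"thum"
scan = b"\xff\xda" + struct.pack(">H", 2) + b"\x12\x34" + b"\xff\xd9"

sample = soi + app0 + app1 + scan
cleaned = strip_app1(sample)
print(b"Exif" in sample, b"Exif" in cleaned)
```

The relevance to this thread is that an embedded thumbnail normally lives in that metadata, so blacking out the main image does nothing to it unless the editor regenerates it or the segment is removed.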

Uh yeah, I know you can name files anything you want. I was just mentioning the convenience, this really has nothing to do with blacking out jpgs. I tested it, it took 12 seconds to open a photo from a separate partition, draw a square over an object, fill it with black, and save a snapshot where I know its name and don’t have to look for it. It took nearly 40 seconds to open, edit, click save, navigate to my home folder, rename and click ok.

Derleth · May 11, 2008, 5:27pm

This doesn’t answer my question at all. In fact, it doesn’t even make that much sense.

So you don’t need to do the screenshot rigamarole just to get rid of EXIF data. We’ve established that.

As I said, I don’t know what tools you’re using, but they must be pretty odd if making a screenshot of a program displaying an image is the easiest way to do anything with that image. Try playing with the Gimp for a while.

Ah, the “wooden table” solution arrives. (Anyone else read TDWTF?)

Fubaya · May 11, 2008, 6:05pm

Try “man import” in a shell.

Of course not, you can always use another program to do it.

I’ve been using Gimp for 8 years. Using Gimp, open an image from /usr/share/wallpapers and black out a section. Now, check your watch and crop the image and save it in your home directory with a different name, then erase the embedded information with whatever program you like. It took me 2.5 seconds by taking a screenshot, and most of that time was used drawing a box around the area I wanted to crop.

Screenshots aren’t necessary but they’re foolproof and work for me. If only Cat Schwartz had known.

pulykamell · May 11, 2008, 7:29pm

I agree that it is practically foolproof, but there is still a possibility with a sensitive scanner that there is a difference in the blacked out pixels and the writing underneath. If all the black areas are clipped to 0 in RGB channels, then there is no information. But you could, in theory, have a scan that looks completely black on the screen, but the blacked out data may have tonal values of 1 or 2, which can be converted to higher brightness values. On an 8-bit scan, it’s not terribly likely, but I’d hesitate in calling it foolproof.
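Here is a small Python sketch of that scenario. The pixel values are hypothetical: it assumes marker-only pixels scanned as 0 and marker-over-toner pixels scanned as 2, which looks uniformly black on screen. The function names and the threshold of 8 are made up for illustration.

```python
def stretch_levels(pixels, max_input):
    """Map the range 0..max_input onto 0..255, clipping anything brighter."""
    return [[min(255, value * 255 // max_input) for value in row] for row in pixels]


def clip_to_black(pixels, threshold):
    """Force every pixel at or below the threshold to exactly 0."""
    return [[0 if value <= threshold else value for value in row] for row in pixels]


# Looks solid black on screen, but the 2s trace out a letter "A".
scan = [
    [0, 0, 2, 0, 0],
    [0, 2, 0, 2, 0],
    [0, 2, 2, 2, 0],
    [0, 2, 0, 2, 0],
]

revealed = stretch_levels(scan, max_input=2)
safe = stretch_levels(clip_to_black(scan, threshold=8), max_input=2)

for row in revealed:
    print("".join("#" if value else "." for value in row))
```

Stretching the untouched scan turns the 2s into full white and the shape reappears. After clipping everything near black to 0 first, the same stretch has nothing left to amplify.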

The digital blacking-out methods would be the most secure, in my opinion.

Mangetout · May 11, 2008, 10:41pm

This board would be a dull place were it not so. The simple question has been answered, then we started discussing the interesting fringe issues.

threemae · May 12, 2008, 3:51am

Take a careful look at the examples I posted again. You’ll notice that the e’s, s’s, and i’s all produce distinct interference patterns on the surrounding background. It is essentially a direct blob-for-letter replacement, and could easily yield quite a decent amount of information to a person with minimal technical skills. It’s not exactly Tom Clancy novel level stuff here.

pulykamell · May 12, 2008, 4:18am

A healthy caution is good when it comes to these sorts of things, but Derleth is correct. If you draw a black box over a piece of text (and flatten it, if you’re working with layers), and then save it as a JPEG, there will be no JPEG artifacts that will allow you to reverse engineer the original text. As far as the JPEG knows, you’re just trying to compress a black box. A black box that used to have an “A” underneath it and a black box that used to have an “X” underneath it all look the same at the time of compression.
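A quick way to convince yourself of the "A" versus "X" point, sketched in Python. It uses zlib as a stand-in for "some deterministic encoder", which is an assumption made purely for convenience; it is not JPEG, and the tiny 5x6 letter grids and function names are invented for the example.

```python
import zlib


def black_out(pixels, top, left, bottom, right):
    """Return a copy with rows top..bottom-1 and columns left..right-1 set to 0."""
    result = [row[:] for row in pixels]
    for y in range(top, bottom):
        for x in range(left, right):
            result[y][x] = 0
    return result


def encode(pixels):
    """Stand-in for a deterministic image encoder (zlib here, NOT real JPEG)."""
    return zlib.compress(bytes(value for row in pixels for value in row))


W = 255  # white background, 0 is black ink

letter_a = [
    [W, W, W, W, W],
    [W, W, 0, W, W],
    [W, 0, W, 0, W],
    [W, 0, 0, 0, W],
    [W, 0, W, 0, W],
    [W, W, W, W, W],
]

letter_x = [
    [W, W, W, W, W],
    [W, 0, W, 0, W],
    [W, W, 0, W, W],
    [W, W, 0, W, W],
    [W, 0, W, 0, W],
    [W, W, W, W, W],
]

redacted_a = black_out(letter_a, 1, 1, 5, 4)
redacted_x = black_out(letter_x, 1, 1, 5, 4)

print(redacted_a == redacted_x)
print(encode(redacted_a) == encode(redacted_x))
```

Once the box is drawn, the two grids are identical, so any encoder that only sees the pixels has to produce identical output for both. This only holds when the box covers every pixel the letter touched, which is where the earlier point about artefacts already baked in around the text by a previous JPEG save comes in.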