Protect online content

For a website where public information is summarized and posted by the site’s owner, what are some good ways to protect the content?

Examples so far:
Post a GIF of the document instead of the document itself (but how does this stop OCR? With a watermark?)
Disable right-click

If you want to protect the content, why are you putting it online in the first place? If you want people to be able to read it, you can take steps to make it more difficult for them to copy it, but you’re not going to prevent it. If you post images instead of text, OCR and manual transcription are still options. If you disable right-clicks to stop people from saving the image, they can retrieve it from their browser cache or take a screenshot. If you watermark the image or use a busy, low-contrast background to defeat OCR, many people won’t be able to read it either. Any measure you take may protect the content from some people by raising the level of effort required, but you can’t protect it completely.

If you don’t want the content copied, you need to seriously rethink why you’d make it available in the first place.

ANYTHING online can be copied. If nothing else, a person can take screenshots of it. As stated, if you don’t want it copied, do not put it online. You can, however, limit access to documents to, say, paying customers through the use of htaccess and other security measures.
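For what it’s worth, limiting access with htaccess usually means HTTP Basic authentication: the browser sends an "Authorization: Basic ..." header and the server checks it against a list of users. Here is a rough Python sketch of that check, just to show the idea. It is not how Apache implements it, and the PAYING_CUSTOMERS table and the is_authorized and make_header names are made up for illustration.

```python
import base64
import hmac

# Hypothetical credential table, for illustration only. A real site would
# store hashed passwords in a file or database, never plain text in code.
PAYING_CUSTOMERS = {"alice": "s3cret"}


def is_authorized(authorization_header):
    """Return True if an HTTP Basic Authorization header matches a known customer."""
    prefix = "Basic "
    if not authorization_header or not authorization_header.startswith(prefix):
        return False
    try:
        # The payload is base64("username:password").
        raw = base64.b64decode(authorization_header[len(prefix):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    username, sep, password = decoded.partition(":")
    if not sep:
        return False
    expected = PAYING_CUSTOMERS.get(username)
    if expected is None:
        return False
    # Constant-time comparison, so timing does not reveal how much matched.
    return hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8"))


def make_header(username, password):
    """Build the header value a browser would send (handy for trying this out)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")
```

Note that this only controls who gets to see the document. Once a paying customer has it on screen, everything said above about copying still applies.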

And disabling right-click is just obnoxious and easy to get around anyway, either by taking a screenshot or by saving the whole page to the hard drive and then editing the code.

Not to mention that most right-click-disabling scripts have to pop up a stupid box telling you the content is “protected,” which leaves them open to this trick:

  1. Hold down the right mouse button.
  2. When the stupid box pops up, hit the space bar to dismiss it.
  3. Release the right mouse button.

Lo and behold, there’s your usual right-click menu!

Honestly, these people who think their precious content can somehow be protected while still remaining available on the web must be smoking crack or something. The most they can do is make it slightly inconvenient for someone to copy their content.

The right-click blocker does not work in Mozilla. The alert pops up, but so does the context menu.

The closest you can get to protecting documents is to use the Adobe PDF format. That said, you really just need to post it clean. Trying to frustrate illegitimate users just makes it too much bother for legitimate users.

Protected PDFs would probably be about the most secure you could get. At the very extreme end of the spectrum, a person could just manually retype what you wrote, so there’s no way to prevent copying entirely.

If you really want to make the content difficult to copy, you could not only post the text as a bitmap or in a PDF file, but also use a font that most OCR applications would have trouble recognizing.

I’ve done quite a bit of work with OCR, not developing it but using commercially available OCR engines in other applications. I also know several vision scientists who study the readability of text and how font, color, and background texture influence it. Based on this experience (and as noted above), it’s pretty clear to me that once you obfuscate the text enough to prevent OCR, you’ve also made it unreadable to a large number of humans. There’s a whole lot of content on the web that is pretty unreadable if you’re not a 20-something with good vision. A lot of people routinely cut and paste text into something like Notepad just to be able to read it, because the web designer thought it looked cool to put dark red text on a black background or yellow text on a textured beige background. Many of these situations are readily handled by OCR even when many people can’t make them out.
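To put a rough number on the dark-red-on-black example: the WCAG accessibility guidelines define a contrast ratio between two colors and recommend at least 4.5:1 for normal body text. The sketch below computes that ratio in Python. The RGB value picked for "dark red" is my own assumption, and the metric only covers color contrast, not fonts or background textures.

```python
def _linearize(channel):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color: 0.0 for black, about 1.0 for white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colors) up to about 21 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


DARK_RED = (139, 0, 0)  # assumed value for "dark red"; the post gives no exact color
BLACK = (0, 0, 0)
WCAG_AA_NORMAL_TEXT = 4.5  # recommended minimum ratio for normal-size text

ratio = contrast_ratio(DARK_RED, BLACK)
print(f"dark red on black: {ratio:.2f}:1, passes AA: {ratio >= WCAG_AA_NORMAL_TEXT}")
```

With the color assumed here, the ratio falls well short of the 4.5:1 guideline, which fits the point that such pages are hard for many people to read even though OCR often copes with them.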

I’m not saying you’re wrong. You can certainly choose fonts, foreground/background colors, and background textures to confuse OCR. I’m just pointing out that if you do that, a lot of humans won’t be able to read it either. It all depends on who your target audience is.