I found this product on eBay. If you look at the source code you can see the page has been converted to JavaScript, but how does it work? Is it just a big graphic? Would links still work? There are none in that page.
It doesn’t look like encryption, more like obfuscation, but the result is similar. I’m guessing this is how it works: You write your HTML page in the normal way with links, graphics, etc. You run it through the software, which takes your HTML and produces the obfuscated JavaScript. When you do a view source, all you see is the script, but executing the script returns your original HTML page.
Just had a quick look at this.
The software package takes the original HTML page and Base64 encrypts it. It creates a JavaScript block and drops the encrypted data in as a variable followed by a CRC for the decrypted data. On the end of the same line it tags on a simple decryption routine (not Base64; this one uses plain hex values for each character, so it is little more than obfuscation) and a small JavaScript library encrypted in this manner.
It evaluates the results of decrypting the JavaScript library into the document. This library contains a few small functions such as the Base64 decryption routine (including the hardcoded key), a CRC check and some other bits and pieces.
The code decrypts the encrypted document and runs the CRC against it. If the CRC checks out it writes the document to the browser. If it fails then it flashes up an error message and suggests that the page has become corrupt in downloading and the user should reload the page. If the user clicks the “Yes” button the page is automatically refreshed.
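To make the steps above concrete, here is a rough Python sketch of the same round trip. It is not the package's actual code (that is JavaScript, and I am not reproducing it here). The function names are made up, the hex layer is assumed to be two hex digits per character, the CRC is assumed to be CRC-32, and the standard Base64 alphabet stands in for whatever hardcoded key the tool really uses.

```python
import base64
import zlib


def hex_encode(text: str) -> str:
    """Assumed form of the 'simple hex' layer: two hex digits per character."""
    return text.encode("latin-1").hex()


def hex_decode(blob: str) -> str:
    """Undo hex_encode; this is all it takes to read the bootstrap library."""
    return bytes.fromhex(blob).decode("latin-1")


def protect(html: str) -> tuple[str, int]:
    """Return the Base64 payload plus a CRC of the original (decoded) page."""
    raw = html.encode("utf-8")
    return base64.b64encode(raw).decode("ascii"), zlib.crc32(raw)


def recover(payload: str, expected_crc: int) -> str:
    """Decode the payload and refuse to return it if the CRC does not match."""
    raw = base64.b64decode(payload)
    if zlib.crc32(raw) != expected_crc:
        # The real page shows an error and offers to reload at this point.
        raise ValueError("CRC mismatch: page may have been corrupted in transit")
    return raw.decode("utf-8")


page = '<html><body><a href="next.html">Next</a></body></html>'
payload, crc = protect(page)
print(recover(payload, crc) == page)  # the original markup, links included
```

The point is that nothing is lost or hidden for good: anyone who can run the decoder can also call it by hand and keep the output.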
Once the document is down the JavaScript determines the browser type (it recognises IE, Netscape 4- and post-Gecko Netscapes only. No Mozilla, Opera etc). It uses a few well-known tricks to disable stuff like selecting, context menus, dragging etc. It also ensures that the URL of the document begins with “http://” - if not, it navigates away. On IE it also stops the page being printed.
Finally it does a quick check for frames and breaks out of them if it finds it’s running in one.
So, basically, the page that eventually arrives at the browser is the one that was originally written. Links will work and it’s not just a big graphic.
Does it really protect your pages? Well, sort of. Recovering the original source and disabling the protection is the work of a few minutes, so it’s not really much good for stopping people pinching bits of your script library or single images. If, however, you’ve got a whole load of images on your site (photography, art etc) that you want to protect then this package will make it much more time consuming to steal them.
It’ll also keep out anyone who’s not familiar with stuff like JavaScript.
On the downside, pages will fail to display if the user has turned off JavaScript. Some protection will fail to work in browsers like Opera or Mozilla.
The bottom line is that if you want me to be able to see your data then there’s almost no way you can stop me from stealing it. I went through this whole thing with trying to prevent a financial website being hit by scraper tools. There are ways to render data human-visible while remaining useless to scraper tools, but that’s a whole other can o’worms.
I don’t know what’s more impressive - the length someone goes to to hide their page generation, or the clear and informative manner in which Armilla described it.
My hat is off to Armilla.
Since when is Base64 encoding considered encryption?
In my mind, and in the rest of the world, apparently, encryption involves making information secret by applying an algorithm to it that needs a key to correctly reverse. Base64 and CRC don’t fit that description. RC4 and DES fit that description to a tee, however.
So, can you explain your use of that word?
On the subject of Base64 encryption:
I call it encryption because it can be used as such. The key you say it lacks is actually the definition of the characters it uses to represent 0-63. Most Base64 algorithms have, when you look at the source, a string that goes something like “0123456789abcdefghij…xyzABCDEFG…XYZ+/” that is this definition. The sequential definition is the one you see most often, but it needn’t be so. Without knowing the definition key used to encrypt the data you can’t decrypt it correctly.
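A quick Python sketch of what I mean by the definition acting as a key. The rotated alphabet below is an arbitrary example I picked, not the one the product uses, and the helper names are hypothetical. It leans on the standard library's Base64 routines and simply swaps characters afterwards.

```python
import base64

# The standard Base64 alphabet (RFC 4648 ordering).
STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Hypothetical "key": the same 64 characters in a different order.
KEY = STANDARD[13:] + STANDARD[:13]


def encode(data: bytes, key: str) -> str:
    """Base64-encode, then relabel each character using the custom alphabet."""
    std = base64.b64encode(data).decode("ascii")
    return std.translate(str.maketrans(STANDARD, key))  # '=' padding is untouched


def decode(text: str, key: str) -> bytes:
    """Map the custom alphabet back to the standard one, then Base64-decode."""
    std = text.translate(str.maketrans(key, STANDARD))
    return base64.b64decode(std)


secret = b"<html>hello</html>"
scrambled = encode(secret, KEY)
print(decode(scrambled, KEY) == secret)       # right definition: round trip works
print(decode(scrambled, STANDARD) == secret)  # wrong definition: garbage comes out
```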
Is it good encryption? No, nowhere near. It’s basically a substitution cipher and they’re notoriously easy to attack with statistical analysis.
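As a small illustration of how weak this is, here is a Python sketch of a known-plaintext shortcut rather than full statistical analysis (a different, even simpler attack). It assumes the attacker guesses that the page starts with "<html>"; the secret alphabet and the function name are invented for the example.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def partial_key(ciphertext: str, known_prefix: bytes) -> dict[str, str]:
    """Recover cipher-char -> standard-char pairs from a guessed plaintext prefix."""
    # Only whole 3-byte groups map cleanly onto 4 Base64 characters.
    usable = len(known_prefix) - len(known_prefix) % 3
    std = base64.b64encode(known_prefix[:usable]).decode("ascii")
    return dict(zip(ciphertext, std))


# Set up a victim message with a made-up secret alphabet.
secret_key = STANDARD[13:] + STANDARD[:13]
page = b"<html><body>private stuff</body></html>"
ciphertext = base64.b64encode(page).decode("ascii").translate(
    str.maketrans(STANDARD, secret_key)
)

# The attacker never sees secret_key, only the ciphertext and a guess.
mapping = partial_key(ciphertext, b"<html>")
print(mapping)
```

Every pair recovered this way is a piece of the definition string, and each further guess (closing tags, common attribute names) fills in more of it.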
I’m not sure why you seem so offended by my definition, but if it makes you feel any better feel free to read “encryption” as “encoding” in my post above.
Armilla, thanks for the very interesting (and over my head) analysis. If I have any difficulty saving a graphic from a webpage I’ll just do a screen capture, so this would not stop me there, but it would prevent me from capturing text.
I do not believe there is anything wrong with your use of the word “encryption” at all. Words are used in context and in this context it is appropriate. Encrypt means to hide and the purpose of that software is to hide the original code. That is also the word the seller uses. Of course, this would not be considered valid encryption for communication purposes but the word is still correct in this context.
More importantly, if one is viewing the graphic, one has already downloaded it. All those silly ‘context menu’ tricks do is prevent you from renaming the temporary file from within that browser session. (Granted, screen captures beat any of these types of protection generally, not just this one type of obfuscation.)
Yes, I found all the graphics from that page in the “temporary internet files” cache folder. I sometimes get images from there anyway because there’s a known issue with IE which sometimes, when you right click an image, only lets you save it as BMP. I remember there was a thread about this and someone posted a workaround but I don’t remember.
Clear the cache.
Since when was I offended? Where do you get off, thinking you offended me? I’m never offended! Ever!