As I understand it, when performing data compression there is always a trade-off between how much you can compress the data and the amount of time/calculations it will take to compress and decompress it. For instance, compressing an image using JPEG2000 will result in a smaller file than if it were compressed using JPEG, but decoding that image from the JPEG2000 format will (I presume) take more CPU resources than decoding the JPEG file.
I was wondering if there is some theorem in information theory that states this always has to be the case, something analogous to a conservation law in physics. Like, is there some quantity of ‘energy’ associated with data that is the same after we compress the data, as long as we take into account the ‘energy gains’ or ‘losses’ that took place in the process of compression. Or is my hypothesis simply not true?
Hmm… I’m not sure; you could probably find some examples that at least seem to contradict it, but then, in computation as in thermodynamics, there are always processes that expend ‘energy’ without having much effect. (It’s very easy to write an algorithm that will use a lot of CPU resources and not compress a file at all.)
Part of the effect is that, out of all possible compression schemes, some will always be too CPU-intensive to run given the current state of technology, and a few of those would compress very well. The most efficient of the schemes that are both feasible and known will get implemented, and as better CPUs are invented, uses may be found for the ones that aren’t feasible now.
Sorry that I can’t provide more useful answers. I’ll be watching this thread when I can.
One thing to keep in mind is that you can trade off compression time and decompression time separately, to an extent. So decompressing your JPEG2000 might not take longer than decompressing a regular JPEG (but I don’t know whether it does).
You’re right, that was a bad example on my part, (1) for the reason you point out and (2) because I should have specified lossless compression, to avoid muddying the waters. So, to remedy things:
Imagine losslessly compressing an image file using JPEG2000 vs. JPEG. I presume the JPEG2000-encoded image will be a smaller file than if it were compressed using JPEG, but the overall CPU resources required to both compress and decompress that image using JPEG2000 will (I presume, again) be greater than what is needed to compress and decompress the JPEG file.
FWIW, this is the nugget of info I was going off of:
Well, except for the business about “energy gains” or “losses” that take place in the process of compression. There’s none of that with information entropy, at least not as I understand the intent of those terms. And therefore, I don’t think it relates very directly to the OP’s concerns; one can apply a very complex function to a source without changing its information entropy at all, but while making it terribly difficult to recover (you could even use uncomputable transformations).
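As a crude illustration of that last point (just a sketch; the XOR keystream here is something I’m making up for the example, not a transformation anyone actually uses for this): a reversible scramble leaves the information content of a message exactly as it was, yet hides all the structure a general-purpose compressor relies on.

    import zlib
    from hashlib import sha256

    def keystream(n, seed=b"example-seed"):
        # Deterministic pseudo-random bytes: anyone who knows the seed can
        # regenerate them and undo the XOR below, so no information is lost.
        out = bytearray()
        counter = 0
        while len(out) < n:
            out += sha256(seed + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(out[:n])

    plain = b"a very redundant message " * 4000
    scrambled = bytes(p ^ k for p, k in zip(plain, keystream(len(plain))))

    print(len(zlib.compress(plain, 9)))      # small: the redundancy is visible
    print(len(zlib.compress(scrambled, 9)))  # about len(plain): it looks random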
Agreed. The OP even agreed that his JPEG2000 example wasn’t on point. But if you omit the qualifying phrase, the fundamental concept he seemed to be approaching was that of a minimal expression, in any general domain, to represent a message.
I disagree. While it’s true that you can apply a very complex function without changing the entropy, and make it terribly difficult to recover the original information, it is not true that you can apply that function to compress a general message (i.e. no private shared key) beyond a certain limit. That limit is the measure of the information entropy of the message.
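If it helps to put a rough number on that limit, here’s a quick sketch (with the caveat that the order-0 byte entropy below is only a model; a compressor like zlib also exploits repeated substrings, so it can land under this particular figure, but no lossless code beats the true entropy of the source on average):

    import math
    import zlib
    from collections import Counter

    def entropy_bits_per_byte(data):
        # Empirical entropy of the byte frequencies (order-0: it ignores any
        # correlation between neighbouring bytes).
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

    message = open(__file__, "rb").read()   # any file will do as the "message"

    h = entropy_bits_per_byte(message)
    print(f"order-0 entropy estimate: {h * len(message) / 8:.0f} bytes")
    print(f"zlib, level 9:            {len(zlib.compress(message, 9))} bytes")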
Well, right, but I don’t think I said differently. But perhaps all we disagree on is whether this is the sort of thing the OP was getting at. I guess, even if not exactly what he was asking for, it’s in the area, but it does seem to me the OP’s question is very specifically about the trade-off between the efficiency of an encoding/compression scheme (in the “How large is the encoded/compressed message?” sense) and the computational complexity of its compression/decompression. And the issue of computational complexity is not something which the plain-vanilla concept of information entropy addresses at all.
I don’t know about JPEG/JPEG2000 particularly, but I don’t think that would necessarily hold in a more general case. Improvements in the compression phase can be made by more detailed analysis of the data to find the most appropriate method, leading to an increase in complexity. During decompression the processor is told explicitly which method to employ so no further analysis or complexity is required.
I think this is more of an economic theory than an information theory.
Imagine the reverse of your situation:
For instance, compressing an image using JPEG will result in a larger file than if it were compressed using JPEG2000, and decoding that image from a JPEG format will take more CPU resources than decoding the JPEG2000 file.
Who would use JPEG? Why would anyone choose to use the algorithm that is less effective AND uses more resources? They wouldn’t; JPEG would simply be the old, crappy predecessor of JPEG2000.
Economically, the result will be that the cheaper process will also be the less effective process, because if it were just as effective, or more effective, it would be the ONLY process.
I’ve actually published a paper on this topic. Well, a theoretical version. In short, you can define compression methods associated with different categories of Computational Complexity, e.g., Poly-time (P), Nondeterministic Poly-time (NP), Exponential time, etc., as well as a myriad of in-between complexities. (Doing this is not as easy as it might seem. Some very famous Computer Scientists have erred in this regard.) As you move up to higher complexity levels, you can achieve better compression.
But that’s theoretical. In practice, the usual methods are comparatively ad hoc. No one yet knows how to turn the theoretical ideas into practice. (It sort of has to do with all the usual open problems in Complexity Theory.)
The OP asks a lot about decompression time, and the above applies to decompression time as well. The better the method, the longer it takes. Provably.
While there’s a lot of Information Theory stuff that could be applied to the problem, I used straight Complexity Theory methods.
(To give an idea: Take a string that has 1’s separated by a lot of 0’s. I mean a truly immense number. Exponentially increasing is nothing compared to the spacing I’m talking about. You can show such strings compress better using an Exponential time algorithm than any possible Poly time algorithm.)
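To give a very watered-down flavor of that in code (a toy only, not the actual construction, and the doubly exponential gap() below is just a stand-in for the truly immense spacings I mean): if the positions of the 1’s come from a short program, the whole string compresses down to that program, but the decompressor still has to spend the time to write all the 0’s back out.

    def gap(i):
        # Hypothetical fast-growing gap function; a stand-in for the truly
        # immense spacings described above.
        return 2 ** (2 ** i)

    def decompress(num_ones):
        # The "compressed form" is just num_ones plus the definition of gap();
        # decompression time is dominated by writing the zeros back out.
        return "".join("0" * gap(i) + "1" for i in range(num_ones))

    print(len(decompress(5)))   # a tiny description, a 65,000+ character string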
Having said all that, for most stuff the limit on decompression speed is still the hard drive or the network. (You have to get up to HD video to seriously load a modern processor.) If your JPEG decoder is taking a long time, it’s probably not reading data in very efficiently.
Thanks ftg. In fact, the theoretical version is more what I was interested in. I had a hunch this idea might be related to computational complexity, but I’m not well enough versed in the area to invoke the jargon.
Could you give a cite or a link for your article? (That is, if you feel comfortable doing so; I understand any privacy concerns you may have.) I would be very interested in reading it, even if it’s likely I won’t understand a whole lot.
Would I be correct in guessing that you’re talking about gaps of zeros so wide that the limiting factor becomes the speed with which one can write down the number of zeros?
This is precisely the case. Compression/decompression time is generally not symmetric, and is weighted heavily toward the compression side. In fact there are many compression schemes that are asymmetric, usually because there is some real-time requirement for the data, e.g. a video or audio stream being played back. Depending on the hardware available, various flavors of MPEG, H.264, MP3 (MPEG audio), and many more take more resources to compress than to decompress.
I wasn’t aware that JPEG could compress losslessly at all; it’s by definition a lossy compression scheme.
Whoops, my bad. Okay, officially throwing out example.
New example: gzip vs. bzip2. Compressing a file using bzip2 results in a smaller file size than gzip, but takes much longer to compress (though decompression is still relatively fast).
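If you want to see the trade-off yourself, here’s a rough sketch using Python’s built-in zlib and bz2 modules (the synthetic word-soup input is just for illustration; exact sizes and timings depend heavily on the data, so point it at a real file for anything meaningful):

    import bz2
    import random
    import time
    import zlib

    # Roughly text-like test data: random words from a small vocabulary.
    random.seed(0)
    words = [b"compression", b"entropy", b"theorem", b"image", b"jpeg",
             b"data", b"the", b"of", b"and", b"a", b"to", b"in"]
    data = b" ".join(random.choice(words) for _ in range(500_000))

    for name, compress in (("gzip/zlib", lambda d: zlib.compress(d, 9)),
                           ("bzip2",     lambda d: bz2.compress(d, 9))):
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        print(f"{name:9s}  {len(out):8d} bytes  {elapsed:.2f} s")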
Yes, the JPEG specification includes a lossless compression mode. But it’s not used very often. I don’t know of any image manipulation programs that allow the user to save lossless JPEGs.
Wrong track. Remember, the goal is to compress the string as small as possible. The extremely low density of 1’s means that the original string has virtually no information, so it should compress to a tiny amount. Low-complexity algorithms basically “get lost” and can’t accurately keep track of where the 1’s originally were if they try to compress as much as the higher-complexity algorithms do.
ultrafilter: Wrong function. Think Busy Beaver (but time or space bounded). Ackermann’s is so simply defined it is “too easy” in terms of compression complexity!
My understanding is that “better compression requires a more complex algorithm” is true at two extremes. At one extreme, there is a definite limit to the compression you can get with an algorithm of a given complexity, and if you’ve reached that limit, then the only way to improve compression efficiency is to use a more complex algorithm. At the other extreme, a fairly trivial way to get slightly better compression is to run an iterative process many times, each time producing a slight refinement, or to try several different compression algorithms and use whichever compresses the data in question best (sketched at the end of this post).
In the middle, however, there is definitely space for a new, clever algorithm to be both better and simpler than what came before. I don’t know enough about the state of the art in compression to know how much middle there is left, but it’s certainly possible that one of our current algorithms is both less efficient and more complicated than it needs to be, and that a refinement would run faster and compress better.
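Here’s the “try several and keep the best” trick from above in miniature (a minimal sketch; the three codecs and the one-byte tag format are just made up for the example). It also shows the earlier point that the decompressor can simply be told which method was used, so it needs no analysis of its own:

    import bz2
    import lzma
    import zlib

    CODECS = {
        0: (zlib.compress, zlib.decompress),
        1: (bz2.compress, bz2.decompress),
        2: (lzma.compress, lzma.decompress),
    }

    def compress_best(data):
        # Try every codec and keep whichever output is smallest, prefixed
        # with a one-byte tag identifying the codec that produced it.
        tagged = [bytes([tag]) + comp(data) for tag, (comp, _) in CODECS.items()]
        return min(tagged, key=len)

    def decompress(blob):
        # The tag tells us exactly which codec to run; no guessing required.
        _, decomp = CODECS[blob[0]]
        return decomp(blob[1:])

    payload = b"an example payload, repeated a few thousand times. " * 3000
    packed = compress_best(payload)
    assert decompress(packed) == payload
    print(f"best of three: {len(packed)} bytes (codec tag {packed[0]})")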