# Is there a limit to how computationally expensive a function can be?

Let’s say I have some function that maps a string of n bits to a string of m bits, and I want to implement it on some computer. Obviously, there are an infinite number of ways to implement it, and some will be more expensive than others, which implies that there is an optimal implementation (for a given computer architecture, of course).

Now, suppose I have an implementation of my function, but it seems to be quite a lot more expensive than it needs to be. Is there any algorithm I could apply to my function to make it cheaper (without blowing up memory requirements to 2**n, that is)? I don’t think that the algorithm necessarily needs to operate on recursively enumerable functions; the function will be limited to looping, basic arithmetic and if statements.

Is what I’m asking for hopelessly vague (or just plain hopeless)?

I suppose another way of putting this is:

a) Is there any way I can even approximate the cost of the optimal implementation of a given function that corresponds to my restrictions?

b) How would one go about actually writing a near-optimal implementation?

I think this is hopelessly vague, but as a start, here are two ways to reduce your computation time these days:

1. Parallelize it across multiple processors.
2. Use SIMD instructions if possible.
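To illustrate point 1, here’s a minimal sketch in Python (the per-item function f is made up; the point is just that independent inputs can be farmed out to worker processes):

```python
from multiprocessing import Pool

def f(x):
    # Stand-in for an expensive per-item computation (hypothetical).
    return bin(x ^ (x >> 1)).count("1")

if __name__ == "__main__":
    inputs = list(range(1000))
    with Pool() as pool:                       # one worker process per core
        parallel = pool.map(f, inputs, chunksize=100)
    assert parallel == [f(x) for x in inputs]  # same answers as a serial map
```

For work this tiny the process overhead dominates; the technique pays off when each call to f is genuinely expensive.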

You can be bound by CPU, memory, or I/O, so optimal implementations may vary. In some instances it’ll be cheaper to buy more memory, in others it’ll be cheaper to get a faster CPU, etc. There’s no way to approach this purely generically.

You are completely out of luck.

The areas of interest in Computer Science are Algorithms and Complexity Theory. Algorithms deals with finding the best solutions to particular problems and Complexity Theory deals with how problems fit into general categories. There’s a lot of overlap, as knowing the most efficient way to solve many problems in turn determines the complexity of entire classes. The most famous open problem is of course “P=NP?” and your example of course contains that problem (and many more).

Problems with provable optimal solutions are limited to a number of special cases. E.g.,

1. Trivial lower bounds like linear time. E.g., taking the average of n numbers must take on the order of n steps since each input has to be read.

2. The problem is complete for a class with a bound that gives you your answer. E.g., if it is complete for exponential time, then you cannot do better than exponential time. Solving formulas in many simple logics is known to be exponential time (or worse) for this reason.

Outside of such cases, there are pretty much no known non-trivial matching upper and lower bounds.

So if humans don’t know how such things might be solved, then there is no way a computer can be programmed to do it.

Umm, but you see that is all you need to implement a Turing Machine and therefore (in terms of computational power) you have already hit the maximum.

I think you are in over your head by several hundred feet.

As ftg points out, looping, basic arithmetic, and if statements are enough to implement the whole class of computable functions.

If the bitlength n of the input, from your problem, is supposed to be a constant, then you can always make a constant time implementation, by just constructing a lookup table for the whole thing. (Blowing the memory requirements up to 2^n, as you noted). Clearly, you don’t want this, but you’ll have to be more specific about what it is you do want.
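In case it helps, the lookup-table construction looks like this in Python (slow_f is a hypothetical stand-in for the expensive function):

```python
def slow_f(x, n=8):
    # Hypothetical stand-in for an expensive n-bit -> m-bit function.
    y = x
    for _ in range(1000):
        y = (y * 31 + 7) % (1 << n)
    return y

N = 8
TABLE = [slow_f(x, N) for x in range(1 << N)]  # 2**n entries, paid for once

def fast_f(x):
    return TABLE[x]                            # constant time thereafter
```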

There is a clear, brute-force way of finding the absolute optimal implementation, whatever your metric is, if it’s reasonable; look at all the possible computational runs that take at most 1 instruction and at most whatever amount of memory you’re looking for on all the possible inputs of n bits, then all the ones that take at most 2 instructions on all the possible inputs of n bits, etc., till you find the smallest such sequence implementing the desired behavior. Of course, this method is entirely impractical. Hideously, monstrously so.
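Just to make the brute-force idea concrete, here’s a toy Python version over a made-up three-instruction machine on 3-bit values:

```python
from itertools import product

# Toy machine: 3-bit values, three single-operand instructions (made up).
OPS = {
    "inc": lambda x: (x + 1) & 0b111,
    "dbl": lambda x: (x * 2) & 0b111,
    "neg": lambda x: (-x) & 0b111,
}

def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x

def target(x):
    return (2 * x + 2) & 0b111      # the behaviour we want to implement

def shortest_program(max_len=4):
    # All length-1 programs, then length-2, ..., checked on every input.
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            if all(run(program, x) == target(x) for x in range(8)):
                return program
    return None
```

It returns ("inc", "dbl"), i.e. increment then double. Here the search space up through length 2 is only 3 + 9 = 12 programs; with a realistic instruction set and word size it is astronomical, which is the impracticality mentioned above.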

You can take various solutions and figure out how many times they need to process each item of data per the total size of the data. If you have 100 bytes to process and you only need to look at each byte once, then your run time is going to be 100 times the time it takes to look at a byte. Whereas if each time you see a byte, you run back to the beginning and scan forward to the next byte (1, 1-2, 1-2-3, 1-2-3-4, etc.) then you’ve got an algorithm that’s going to take quadratically more time to process the data as the number of bytes grows. These two examples might be represented as the following:

O(n)
O(n^2)

Obviously the first one is better, unless you’ll only ever be dealing with very small n’s and the second solution has a cost per look that is so much lower than the first’s that, even given its quadratic growth, it will still outperform for small enough data sets.
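The two access patterns might look like this in Python (summing bytes is just a stand-in task; both return the same answer, at very different cost):

```python
def scan_once(data):
    # Look at each byte exactly once: O(n).
    total = 0
    for b in data:
        total += b
    return total

def scan_rescan(data):
    # Run back to the start for every byte (1, 1-2, 1-2-3, ...): O(n^2).
    total = 0
    for i in range(len(data)):
        for j in range(i):          # redundant re-reading of the prefix
            _ = data[j]
        total += data[i]
    return total
```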

Taking an algorithm and figuring out what sort of formula it will follow as data grows is essentially something you just need to be able to do as the programmer, having an intimate understanding of how you’re processing the data. Reducing the program into math and reducing that down to a formula would be prohibitive. If you don’t know how the algorithm works, though, you can just pipe different-sized data into it, graph the line, and then work back from that.
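Here’s a sketch of that pipe-data-in-and-measure approach in Python (the quadratic scanner is a made-up example):

```python
import time

def quadratic_scan(data):
    # Deliberately O(n^2): re-sums the prefix for every byte.
    total = 0
    for i in range(len(data)):
        total += sum(data[:i + 1])
    return total

def measure(fn, sizes):
    # Feed in inputs of growing size; the growth of the timings
    # hints at the asymptotic behaviour.
    timings = []
    for n in sizes:
        data = bytes(n)
        t0 = time.perf_counter()
        fn(data)
        timings.append(time.perf_counter() - t0)
    return timings
```

Doubling n should roughly double an O(n) routine’s time and roughly quadruple quadratic_scan’s, though constant factors and caches make small-n measurements noisy.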

Now none of this gives you an optimal solution, but as a human you should be able to figure out what is the best possible graph one could conceivably have, and then figure out which algorithms actually match that.

You have some high expectations for humans. What’s the best conceivable “graph” for solving an NP-complete problem, for example?

(Not to be snarky, I just don’t like glib appeals to human intuition)

Hey if you got something better, go ahead and propose it. Till then, you’re just going to have to go by what your own capabilities tell you is possible and what isn’t (as well as some good ol’ research.)

Sure. I’m just saying, sometimes those capabilities say “I dunno…”; that this is a class of some extremely challenging questions.

1. IIRC for mathematical functions that are continuous in some senses, Taylor series expansions often offer faster means of computation.

2. Some functions can be solved with iterative methods like Newton-Raphson. I think that’s how Pentium FPUs implement division, in fact (it is NOT a computerized version of the way children are taught in school). Some methods (Marquardt (sp?)) require the derivatives of the function with respect not to x but to each of its parameters. You can often speed execution by replacing the correct derivative with an expression that’s close and easier to calculate.

3. If you are doing an iterative calculation of a continuous function, you can store a lookup table and eliminate the earlier iterations.
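Whatever the Pentium actually does, Newton-Raphson reciprocal iteration is a standard way to divide using only multiplies and subtracts. A Python sketch (the initial-guess constants are ad hoc, and the input is assumed pre-scaled):

```python
def reciprocal(b, iterations=6):
    # Newton-Raphson on f(x) = 1/x - b: the update x <- x * (2 - b*x)
    # needs only multiplication and subtraction, never division.
    assert 0.5 <= b < 1.0, "assume b pre-scaled into [0.5, 1), as hardware does"
    x = 2.88 - 2.0 * b          # rough linear first guess (ad hoc constants)
    for _ in range(iterations):
        x = x * (2.0 - b * x)   # error roughly squares on each pass
    return x

def divide(a, b):
    return a * reciprocal(b)
```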

:smack:
I should have mentioned: it only uses a fixed amount of memory. And I don’t mean that in the sense of it’s a 32-bit machine and can only address 4GB. It only uses a kilobyte or so of memory. Besides, given the fact that we’re limiting the size of the input to n (n is fixed for our function here), the function can only ever be operating on n bits of information. Technically speaking, that falls in the realm of deterministic finite automata.

Anyway, let’s back up a bit. Let’s say I wrote a program that outputted 1 million random arithmetic instructions – forget about any if statements or loops. If there were a small enough number of input bits to this randomly generated function, then a lot of those instructions would be redundant somehow. The trick is finding and eliminating them.
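Something along those lines, sketched in Python: generate a random straight-line program over a toy accumulator machine, then run a peephole pass that drops no-ops (+0, *1) and merges adjacent additions and subtractions. (The instruction set is made up; real optimizers do far more.)

```python
import random

def random_program(length, seed=0):
    # A straight-line "program": each step applies (op, constant) to an
    # accumulator. Toy instruction set, purely for illustration.
    rng = random.Random(seed)
    return [(rng.choice("+-*"), rng.randint(0, 3)) for _ in range(length)]

def run(program, x):
    for op, k in program:
        if op == "+":
            x = x + k
        elif op == "-":
            x = x - k
        else:
            x = x * k
    return x

def simplify(program):
    # Peephole pass: drop no-ops (+0, -0, *1), merge adjacent +/- pairs.
    out = []
    for op, k in program:
        if (op in "+-" and k == 0) or (op == "*" and k == 1):
            continue
        if out and op in "+-" and out[-1][0] in "+-":
            prev_op, prev_k = out.pop()
            net = (prev_k if prev_op == "+" else -prev_k) + (k if op == "+" else -k)
            if net > 0:
                out.append(("+", net))
            elif net < 0:
                out.append(("-", -net))
            continue
        out.append((op, k))
    return out
```

The simplified program computes the same function on every input, usually in fewer steps; finding the globally shortest equivalent program is the hard part.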

Does this clarify things at all?

This is a very common misconception. Everything for everybody is always fixed. But that doesn’t mean spit.

Extremely simple computations on extremely small inputs can be fantastically expensive. (Google “Busy Beaver function”.)

If you think “but I’m only doing it for a single n therefore it’s an FSA” You Are Doing It Wrong.

I’d be very surprised if division were implemented with Newton-Raphson in the Pentium FPU, since there are already nice binary division algorithms that use a fixed number of steps (for a given desired precision). Also, Newton-Raphson, when applied to f(x) = x - a/b = 0, doesn’t have a very helpful form for finding x.

But, I could believe the square root function is implemented with Newton-Raphson in the Pentium.

Optimizing compilers do this trick all the time of course. They’re not guaranteed to find the absolute best re-write of any expression you throw at them, nor are they meant to especially. But the absolute best one can’t be found anyway in the general case.

Suppose you have a function is_prime that takes a machine-sized integer and returns a boolean indicating whether the number is prime or not. Suppose your main program takes a whole number as input, adds 1, doubles it, and outputs the result of calling is_prime on this number. You and I know that the output is always going to be False (for any positive input, 2(x+1) is an even number greater than 2). But where are you going to find the Ultimate Omniscient Optimizer that will infer this fact on its own, and substitute print False for the entire program?
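For concreteness, a Python sketch of that program (naive trial division stands in for whatever is_prime implementation you like):

```python
def is_prime(k):
    # Naive trial division; a stand-in for any real is_prime.
    if k < 2:
        return False
    return all(k % d != 0 for d in range(2, int(k ** 0.5) + 1))

def program(x):
    # Add 1, double, test primality. For x >= 1 the argument is an even
    # number greater than 2, so the answer is always False.
    return is_prime(2 * (x + 1))
```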

is_prime isn’t a fundamental arithmetic operation, though.

Whatever it is you consider to be the fundamental arithmetic operations, is_prime could presumably be built up from them.

Fine, but compiler optimizers don’t look into the implementation of functions that get called to optimize the callers.

I’m having difficulty understanding this sentence. All I meant to do was clarify Bytegeist’s point: you can write some very hairy programs (using only your fundamental arithmetic operations) that simplify to some very simple things, but without there being any reasonable expectation for a compiler to be able to deduce the simplification.

For example, you might write from scratch the code for a function that takes in an input of four integers, and outputs True if they are a counterexample to Fermat’s Last Theorem and False otherwise. A smart compiler could deduce that the output will always be False, and thus simplify your code tremendously, but… it would have to be very smart. In general, no compiler is going to be able to figure out all the potential valid optimizations of this sort.
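Such a function is easy to write, and by Wiles’s proof it always returns False, but no compiler is going to deduce that. A sketch:

```python
def is_flt_counterexample(a, b, c, n):
    # True iff (a, b, c, n) would refute Fermat's Last Theorem:
    # positive integers with n > 2 and a**n + b**n == c**n.
    if n <= 2 or a < 1 or b < 1 or c < 1:
        return False
    return a ** n + b ** n == c ** n
```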

You did specify a fixed amount of memory in your above clarification, which means that you could theoretically brute-force solve your problem, as I mentioned above. But not realistically; a kilobyte of memory is still a shitload of possibilities to sort through.

Perhaps if you stated precisely what your fundamental arithmetic operations are, then we could say something more meaningful about simplifications.

For example, if your fundamental arithmetic operations are just addition, multiplication, and introduction of constants, then every calculable function is just a polynomial over its inputs, and the question becomes “Given a polynomial, how do we deduce the most efficient way to compute it?”, under some weighting of addition and multiplication operations (and either ignoring or not ignoring the effects of overflow). This doesn’t seem like it would be so hard to solve, though I can’t come up with anything particularly pretty off the top of my head, just brute-force. Still, is this the sort of thing you’re looking for?
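For the single-variable case, Horner’s rule is the classic answer, using n multiplications and n additions for a degree-n polynomial (which is known to be optimal for general coefficients); multivariate polynomials are where it gets hard. A sketch:

```python
def eval_naive(coeffs, x):
    # coeffs[i] is the coefficient of x**i; recomputes each power.
    return sum(c * x ** i for i, c in enumerate(coeffs))

def eval_horner(coeffs, x):
    # Horner's rule: n multiplications and n additions for degree n.
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc
```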

This is a reasonable statement if you’re coming from the point of view of recursive function theory, but when most people talk about the fundamental arithmetic operations, they mean addition, subtraction, multiplication and division. I don’t think you can get every recursive function from those.

Nah, just the rational functions, of course. When I made that statement, I had missed the part where Rysto had removed loops (and presumably unrestricted recursion) from consideration; I had assumed he meant to deal with functions using arithmetic operations in addition to, say, while loops.