Splitting Efficiency Hairs -- trig functions or powers/roots?

We had an example problem in my Graphics course today, just the standard “rotate something around an arbitrary line through (0,0,0) and (a,b,0).” The insight is that there’s some angle -beta you can rotate the line by to get it onto the x-axis, after which you can rotate about the x-axis. Either way, that’s not the point; the idea is that beta = tan[sup]-1[/sup](b/a), or the equivalent for cosine and sine. Then you can get a rotation matrix that needs the elements cos(-beta), sin(-beta), -sin(-beta), and cos(-beta). He asserted that since you know a and b, you can make the following substitution instead of taking the time to compute beta:

cos(beta) = a/sqrt(a[sup]2[/sup] + b[sup]2[/sup])
sin(beta) = b/sqrt(a[sup]2[/sup] + b[sup]2[/sup])

My question is this: from previous threads here and from searching elsewhere, I gather that the trig functions and their inverses generally use a lookup table. So one division and four trig lookups (assuming naive computation) would likely be cheaper than four divisions, four square roots, eight squarings, and four additions (again assuming naive computation).

I asked, and he said he researches and has a degree in computational geometry, which intersects with graphics; but since he doesn’t focus on graphics, he’s not really overly knowledgeable about the intricacies of implementation details as far as graphics processing goes (so we focus more on theory), and this suggestion just comes from a book he read. Which is fine, but now I want to know :p.

I understand it probably depends on the language and compiler in question (whether it does Taylor-series expansions vs. lookups for trig functions), but in general, or if you want to get specific, say gcc with the standard-issue Ubuntu Linux <math.h> functions, which would likely be cheaper?

For the hardware I typically work with, sin/cos/divide/sqrt are all “special” functions and take about the same amount of time to compute. You can do something like 8 a*b+c ops per special op. I don’t know the numbers on x86 but the basic principle is the same; adds/multiplies are cheap and other ops are more expensive.

So with the breakdown you mentioned, it’s tough to say; it could go either way, depending on the architecture. You’d definitely want to share the 1/sqrt(a^2 + b^2) computation, though.

Overall, though, I’d say it’s irrelevant. In practice, no one ever generates a transformation matrix and then transforms two vertices with it. More likely, you will transform thousands or millions of vertices, and in that case the cost of constructing the matrix is infinitesimal compared to the other costs.

I’m no expert at these things, but I have some not-quite-wild-ass guesses.

First, I doubt that trig functions are computed purely by table look-up. Do you imagine there is a built-in table of radians at increments of, say, 0.001 and their functions? Or at increments of 0.0001, which would be quite a huge table? How would you compute sin(1.3745285743)? Just find the nearest adjoining entries in the table and do linear interpolation?

I think that the functions are actually computed by their series, requiring some iterative algorithm for each one. (I understand that there are very efficient algorithms known for many kinds of computations like this, even though I don’t know any of the details.) But still, each call to sin(x), cos(x), or atan2(b, a) is computationally expensive compared to simply computing a single sum or product or quotient, or even several of those.

Computing something like a / sqrt(a^2 + b^2) must certainly be faster than that, especially if the compiler is smart enough to compute it as a / sqrt(a*a + b*b) rather than invoking the generalized exponentiation routine. If in doubt, write it explicitly as a / sqrt(a*a + b*b) in the first place.

Then only the sqrt(x) function requires some kind of iterative computation, and even I know of algorithms to do that reasonably quickly.
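One such algorithm is plain Newton’s method, which converges quadratically (roughly doubling the correct digits each step). A minimal sketch, with a deliberately crude initial guess and a function name of my own invention:

```c
#include <assert.h>
#include <math.h>

/* Newton's method for sqrt(x), x > 0: iterate g <- (g + x/g) / 2.
   A better initial guess (e.g. from the exponent bits) would cut
   the iteration count; real libraries do exactly that. */
static double newton_sqrt(double x) {
    double g = x > 1.0 ? x : 1.0;      /* crude starting guess */
    for (int i = 0; i < 60; i++) {
        double next = 0.5 * (g + x / g);
        if (next == g) break;          /* converged to machine precision */
        g = next;
    }
    return g;
}
```

Even starting from a terrible guess, it reaches full double precision in a couple dozen iterations at most; with a decent starting guess, three or four suffice.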

Next, we take up your comment about “naive” computations. Just how “naive” are we talking? Note first that of the four entries you need to compute:
cos(-beta), sin(-beta), -sin(-beta), and cos(-beta)
two of those are the same! So you only need to compute cos(-beta) once.
And once you compute sin(-beta), you can get -sin(-beta) with a simple sign flip, without having to compute the sine again.

And, furthermore, this part: sqrt(a^2 + b^2) occurs repeatedly, so you only have to compute all that once.

My strong intuition here is that the short-cut computations suggested by your professor or textbook will be computationally much more efficient than using arctangent.

And, building upon Dr. Strangelove’s remark that you build your transformation matrix just once and then apply it to a bazillion points:

Note further that matrix multiplication is associative, so the matrix product
A(B(CD)) is the same as ((AB)C)D

Well suppose you have several distinct transforms (say, a rotation, then a translation, then a scaling, then another rotation) that you want to apply to a zillion points. Instead of transforming each point by four separate matrix multiplies, you can instead multiply all of the transform matrices to produce just one all-in-one-combined transform matrix and then transform each of your zillion points by that. More efficient indeed!

By naive, I mean the compiler is dumb as a rock: it computes sqrt(pow(a,2) + pow(b,2)) afresh every time, computes cos(-beta) every time, doesn’t realize that -sin(-beta) is just the negation of a value I already have, etc.

I was operating under what I’ve always been told: that multiplication, division, pow, and sqrt are ridiculously inefficient, and that most implementations of trig functions use a table lookup to a certain precision (with a check for whether your number falls outside the range of common values, which then kicks in alternate methods). If either of those “facts” is incorrect, I can see how my reasoning falls apart.

Sorry, I didn’t mean to sound pompous, it was just a question that got me curious, I wasn’t trying to posit that I’m smarter than the professor and books or whatever, it was just something that I thought might be true based on what was apparently fallacious information. Sorry.

PS: I know it’s trivial compared to the number of vertices you’re transforming; I was just wondering.

Multiplication is cheap these days. The others are more expensive, but exactly how expensive depends. Pow, for instance, is typically done as a log-mul-exp sequence, so it’s at least as expensive as any of those components. Divide and square root often aren’t so bad, and sometimes you can do 1/sqrt(x) as a cheap, single op.

Yes, many of these functions are partially implemented via a lookup table for the initial approximation; sqrt(x) is sometimes implemented as a lookup to get the first few bits, followed by Newton-Raphson iterations to reach full precision.
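The famous “fast inverse square root” from the Quake III source is the same idea with a bit-twiddle standing in for the table: a crude initial approximation from the float’s raw bits, refined by one Newton-Raphson step. A sketch (using memcpy for the bit reinterpretation to stay within defined behavior; on modern x86 you’d just use the hardware rsqrt/sqrt instructions instead):

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for x > 0: bit-level initial guess
   (the well-known 0x5f3759df constant) plus one Newton-Raphson
   step.  Accurate to roughly 0.2% after the single step. */
static float fast_rsqrt(float x) {
    float half = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);          /* reinterpret float bits */
    i = 0x5f3759df - (i >> 1);         /* initial approximation */
    float y;
    memcpy(&y, &i, sizeof y);
    y = y * (1.5f - half * y * y);     /* one Newton-Raphson step */
    return y;
}
```

A second Newton-Raphson step tightens it further, which is exactly the lookup-then-refine pattern described above.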

Special functions are also sometimes implemented via a lookup table to get coefficients for a Chebyshev polynomial or similar.

Isn’t that the sort of thing people use quaternions for? That way, you can get away with only doing multiplication…

Quaternions are fun, but in my experience they come after learning transformation matrices (which can also handle scaling, translation, shearing, etc.). Only after matrices are understood do you move on to quaternions, which are more limited but offer some advantages. I was first introduced to them in terms of rotating an object on-screen via mouse, where the scalar+vector notation is much more convenient than Euler angles.

Besides, most of the time the quaternion gets turned into a matrix anyway, and you can’t get away without some trig functions in any case.