What is the difference between a regular Fourier Transform and an FFT (Fast Fourier Transform)? I gather that an FFT is some kind of approximation that is much easier to calculate, given a set of conditions which must be true. Can anyone clue me in on the details?
The Fourier Transform works on continuous waveforms. The Discrete Fourier Transform is for things that are sampled in time. The FFT is a special case of the DFT. Someone figured out a much more efficient way to calculate DFTs, under certain conditions. One major condition is that the number of sample points has to be a power of two (128, 256, 512, 1024, etc.).
The FFT is an efficient method of calculating a DFT. (Some call the algorithm FDFT to highlight this.)
It can be calculated for an arbitrary number of points. (Good old Al Borodin figured that one out.) It is somewhat more efficient for certain “nice” values (roots of unity for your field) and for point counts that are powers of 2.
The sampling space is problem dependent. It may be time, or it could be any other measure, depending on your application. Note that one of its common applications is convolution. That stuff is really cool. It has an amazing number of applications. Unfortunately, it has become one of those “if your only tool is a hammer…” things and is used too often.
I wrote a solution to FFT at one point in MatLab (yes, I know it has its own – that’s not the point). It worked flawlessly, provided the number of samples was a power of 2.
I failed to get it to work on data sets of any other size.
In doing some research, I discovered that it is quite common to take a data set of n points and pad it out to the next power of 2 either by padding with zeroes or by padding with a repeat of the data set.
Given the improvement of the FFT over the DFT, this doesn’t result in any serious time wastage. The DFT could operate directly on the n points, but the FFT on the n+k points (n+k being the next power of 2 above n) is still faster, even when n is just one more than a power of 2 and the padding nearly doubles the data.
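In case it helps, the padding itself is only a few lines of C. This is just a sketch (next_pow2 and pad_to_pow2 are names I made up, and the error handling is minimal):

#include <stdlib.h>
#include <string.h>

/* Smallest power of 2 that is >= n. */
static size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Return a newly allocated copy of x, zero-padded out to the next power of 2.
   The padded length is written to *padded_len; the caller frees the result. */
double *pad_to_pow2(const double *x, size_t n, size_t *padded_len)
{
    size_t m = next_pow2(n);
    double *y = calloc(m, sizeof *y);   /* calloc zero-fills the padding for us */
    if (y != NULL)
        memcpy(y, x, n * sizeof *x);
    *padded_len = m;
    return y;
}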
I discovered that I could use the FFT as long as I could divide the data set into two equal-sized sets (an even number of data points), and since the FFT is recursive in nature, I gained its benefit on each of the two halves. The moment I had an odd number of data points, I found I could only use a DFT.
So my worst case scenario is where n itself is odd (or, from what I understand, even worse, prime).
I was under the impression, like CurtC, that the FFT only really works for powers of 2 (or at least a very large number of factors of 2).
What is the worst case scenario, and what solutions are there to help deal with it?
The original FFT algorithm worked with vectors of length N = 2[sup]n[/sup] (and ran in time ~Nn, much faster than the naive N[sup]2[/sup] algorithm). But that technique can be adapted to more efficiently transform vectors of length N whenever N is not prime. The best-case improvement is still when N is a power of two, but the FFT is still very efficient as long as N has no large prime factors. (Such numbers are called “smooth.”) The DFT of length N, when N is prime, can still be approximated as closely as desired by padding the vector to a power of 2 and then resampling.
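For comparison, the naive N[sup]2[/sup] algorithm really is just two nested loops. A quick C sketch (C99 complex.h; I’ve followed the w = exp(2 pi i / N) convention of the formulas below, though many texts put a minus sign in the exponent):

#include <complex.h>
#include <math.h>
#include <stddef.h>

/* Naive O(N^2) DFT: one full sum per output element.
   V_b = sum_a w^(ab) v_a with w = exp(2*pi*i/N). */
void naive_dft(const double complex *v, double complex *V, size_t N)
{
    const double two_pi = 2.0 * acos(-1.0);
    for (size_t b = 0; b < N; b++) {
        double complex sum = 0;
        for (size_t a = 0; a < N; a++)
            sum += cexp(I * two_pi * (double)a * (double)b / (double)N) * v[a];
        V[b] = sum;
    }
}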
So if I had, for example, a data set of size n=3^p for some integer p, I have no large primes – what do I do now? I presume there’s some way to break this up into equal sets of 3 instead of equal sets of 2. And now the problem becomes factoring n into primes.
OK, here’s a sketch. First, let’s review the FFT for even numbers. Suppose we want to find the DFT V of a vector v of length 2N. We can write this as
V[sub]b[/sub] = (sum[sub]a=0[/sub][sup]2N-1[/sup]) w[sup]ab[/sup] v[sub]a[/sub] ,
where w = exp(pi i / N). We break v into two interleaved pieces (the odd elements and the even elements):
V[sub]b[/sub] = (sum[sub]a=0[/sub][sup]N-1[/sup]) w[sup]2ab[/sup] v[sub]2a[/sub] + (sum[sub]a=0[/sub][sup]N-1[/sup]) w[sup]2ab[/sup] w[sup]b[/sup] v[sub]2a+1[/sub] .
Each of these two sums is a DFT of a vector of length N; the second one has a phase shift w[sup]b[/sup] applied to it. So we can recursively compute the DFT.
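Here is roughly what that recursion looks like in C (C99 complex.h; fft_pow2 is my own name, it assumes n is a power of 2, and it is written for clarity rather than speed):

#include <complex.h>
#include <math.h>
#include <stdlib.h>

/* In-place recursive radix-2 FFT for the even/odd split described above.
   n must be a power of 2; w = exp(2*pi*i/n) as in the formulas.
   Purely illustrative: no error handling, not optimized. */
void fft_pow2(double complex *v, size_t n)
{
    if (n < 2)
        return;

    /* De-interleave: even elements into the lower half, odd elements into the upper half. */
    double complex *tmp = malloc((n / 2) * sizeof *tmp);
    for (size_t a = 0; a < n / 2; a++)
        tmp[a] = v[2 * a + 1];
    for (size_t a = 0; a < n / 2; a++)
        v[a] = v[2 * a];
    for (size_t a = 0; a < n / 2; a++)
        v[n / 2 + a] = tmp[a];
    free(tmp);

    /* Recursively transform each half. */
    fft_pow2(v, n / 2);
    fft_pow2(v + n / 2, n / 2);

    /* Combine with the phase shift w^b on the odd half:
       V_b = E_b + w^b O_b,  V_(b+n/2) = E_b - w^b O_b. */
    const double two_pi = 2.0 * acos(-1.0);
    for (size_t b = 0; b < n / 2; b++) {
        double complex e = v[b];
        double complex o = cexp(I * two_pi * (double)b / (double)n) * v[b + n / 2];
        v[b]         = e + o;
        v[b + n / 2] = e - o;
    }
}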
Now suppose v has length 3N, with w = exp(2 pi i / 3N) this time (the root of unity corresponding to the new length). We break v into three interleaved pieces instead of two:
V[sub]b[/sub] = (sum[sub]a=0[/sub][sup]N-1[/sup]) w[sup]3ab[/sup] v[sub]3a[/sub] + (sum[sub]a=0[/sub][sup]N-1[/sup]) w[sup]3ab[/sup] w[sup]b[/sup] v[sub]3a+1[/sub] + (sum[sub]a=0[/sub][sup]N-1[/sup]) w[sup]3ab[/sup] w[sup]2b[/sup] v[sub]3a+2[/sub] .
Again you see that the three sums are DFTs of length N; now the second and third sums have phase shifts applied.
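Here is the same idea pushed a little further, as a sketch: split on the smallest prime factor p of n, transform the p interleaved sub-vectors recursively, and recombine with the phase shifts as in the formulas above; when n is prime, fall back to the naive O(n^2) DFT. (mixed_radix_dft and twiddle are names I made up; no error handling, not optimized.)

#include <complex.h>
#include <math.h>
#include <stdlib.h>

/* w^(num) for w = exp(2*pi*i/n). */
static double complex twiddle(size_t num, size_t n)
{
    const double two_pi = 2.0 * acos(-1.0);
    return cexp(I * two_pi * (double)num / (double)n);
}

void mixed_radix_dft(const double complex *v, double complex *V, size_t n)
{
    /* Find the smallest prime factor of n by trial division. */
    size_t p = n;
    for (size_t d = 2; d * d <= n; d++) {
        if (n % d == 0) { p = d; break; }
    }

    if (p == n) {                       /* n is prime (or 1): naive DFT */
        for (size_t b = 0; b < n; b++) {
            double complex sum = 0;
            for (size_t a = 0; a < n; a++)
                sum += twiddle(a * b % n, n) * v[a];
            V[b] = sum;
        }
        return;
    }

    size_t m = n / p;                   /* length of each sub-vector */
    double complex *sub  = malloc(n * sizeof *sub);
    double complex *subV = malloc(n * sizeof *subV);

    /* De-interleave: sub-vector r holds v[r], v[p+r], v[2p+r], ... */
    for (size_t r = 0; r < p; r++)
        for (size_t a = 0; a < m; a++)
            sub[r * m + a] = v[p * a + r];

    /* Transform each sub-vector recursively. */
    for (size_t r = 0; r < p; r++)
        mixed_radix_dft(sub + r * m, subV + r * m, m);

    /* Recombine: V_b = sum_r w^(r*b) * (DFT of sub-vector r) at index b mod m. */
    for (size_t b = 0; b < n; b++) {
        double complex sum = 0;
        for (size_t r = 0; r < p; r++)
            sum += twiddle(r * b % n, n) * subV[r * m + (b % m)];
        V[b] = sum;
    }

    free(sub);
    free(subV);
}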
The problem of factoring N is not really as bad as it sounds. It’s true that factoring a large number is (believed to be) hard, but what this actually means is that no factoring algorithm is known that’s polynomial in log N. Even the naive check-every-possibility factoring algorithm only takes time ~sqrt(N), while the FFT takes time ~N log N, longer than the factoring takes. The point is that the parameter used to measure the difficulty of a problem is the length of the input, which is different in the two cases: ~log(N) for factoring but ~N for the DFT.
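For what it’s worth, the check-every-possibility factoring is just trial division, something like this sketch:

#include <stdio.h>

/* Trial-division factoring: ~sqrt(N) divisions in the worst case, which,
   as noted above, is already cheaper than the ~N log N work of the FFT.
   Prints the prime factorization of n. */
void trial_factor(unsigned long n)
{
    for (unsigned long d = 2; d * d <= n; d++) {
        while (n % d == 0) {
            printf("%lu ", d);
            n /= d;
        }
    }
    if (n > 1)
        printf("%lu", n);   /* whatever is left over is prime */
    printf("\n");
}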
I recommend A. B. Borodin and I. Munro, The Computational Complexity of Algebraic and Numerical Problems, American Elsevier, New York, 1975, for a general (vs. ad hoc) method. Note that it is O(N log[sup]2[/sup]N) vs. O(N log N) but, as indicated above, one can usually avoid the extra log factor.
[hijack]
This reminds me of a “paradox” from a professor whose algorithms class I graded for. Suppose you have an algorithm to fill an array with the first n Fibonacci numbers. This is pretty straightforward: just fill in the first two values, and obtain the others from the recurrence. What’s the running time?
Well, it depends on whether the array is passed in as part of the input. If it is, the algorithm is linear in the size of the input. If not, all you have is the integer n, and the algorithm is exponential in the input size!
Of course, it’s not really a paradox, as the two inputs are drastically different in size. If you start from two different points, you might get different results.
Still, this is kind of an interesting case.
[/hijack]
IIRC (and that’s a big “if”), Cormen et al. have some material on the FFT. Might be worth looking at, just because it’s a standard reference.
[hijack]
Mm, I fail to see why the input form affects the big-O in the Fibonacci hijack.
The problem requires you to fill an array with the Fibonacci sequence numbers in some fashion.
The order of magnitude of difficulty of this is independent of the inputs. If your input is the integer n, then you have a few options:
1. Use the recurrence relation F(n)=F(n-1)+F(n-2) directly as a recursive function to compute each number. This is O(2^n) for large n, so O(n 2^n) for the whole array.
2. To acquire the nth item, start with the lowest two (0 and 1), add pairs to produce a third, and toss out the lowest, until you’ve added n-2 times. Any n can be found in O(n) this way, so the whole array can be filled in O(n^2) time.
3. Same as 2, but store the sums as you go, since they’re all part of the sequence. This means the whole array can be done in O(n) time.
The only requirement for #3 is that there be some way to pass an array back to the caller. The mechanism for doing this is language dependent, and there’s usually more than one technique. One that immediately leaps to mind, and that’s pretty universal, is to dynamically allocate some memory, use it as an array, and pass the memory address back to the caller (sketched below).
None of this requires any sort of change to the inputs.
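Something like this, for instance, is what I have in mind (just a sketch; fib_array is my own name, and plain ints overflow somewhere in the mid-40s):

#include <stdlib.h>

/* Option 3 sketch: build the first n Fibonacci numbers in O(n) additions,
   storing each one as it is produced, and hand the caller a pointer to a
   dynamically allocated array (f[0] = 0, f[1] = 1).  Caller frees it. */
int *fib_array(int n)
{
    if (n < 1)
        return NULL;
    int *f = malloc((size_t)n * sizeof *f);
    if (f == NULL)
        return NULL;
    f[0] = 0;
    if (n > 1)
        f[1] = 1;
    for (int i = 2; i < n; i++)
        f[i] = f[i - 1] + f[i - 2];     /* each value comes from the recurrence */
    return f;
}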
[hijack]
Of course, we would use option 3). Dynamically allocating the array and returning the address seems like the best option. But in that case, the array is not part of the input, and so its size is not relevant to the input size.
Consider if instead we were told to generate the first n Fibonacci numbers and not store them. n would be the only input there, and the algorithm would be exponential.
Remember, a number is exponential in the size of its representation (unless you’re using unary).
[/hijack]
int F(int n)
{
    int c;
    int a = 0, b = 1;

    if (n == 1)
        return a;
    if (n == 2)
        return b;

    n -= 3;          /* n now counts the additions still needed once c = F(3) */
    c = a + b;       /* c = F(3) */
    while (n > 0)
    {
        a = b;
        b = c;
        c = a + b;
        n--;         /* count down so the loop terminates */
    }
    return c;
}
This C source is clearly linear in time to produce the nth fibonacci value, and produces all the others along the way. Even if you were to then use this to fill the kth spot in an array of n items by calling F(k), you still would only get a quadratic algorithm.
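In other words, the quadratic version is just this (reusing the F() above):

/* Fill arr[0..n-1] by calling the linear F() above once per slot:
   n calls of O(k) work each, so O(n^2) overall. */
int F(int n);   /* the iterative function posted above */

void fill_by_calling_F(int *arr, int n)
{
    for (int k = 1; k <= n; k++)
        arr[k - 1] = F(k);
}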
[hijack]
Actually, a number’s storage requirement is logarithmic. It takes k bits to store a number as large as 2^k, so a number n takes log2(n) bits to store. Okay, depending on how you state it, you could say that the largest number representable is exponential in the number of bits used to store it. Fine.
None of this changes the order of magnitude of computing a Fibonacci number. Whether you store the array of all Fibonacci values or you don’t, you don’t change the big-O of doing it, and it’s not exponential. Even if you go outside the integer boundaries of a particular processor, adding together two w-word values is still only O(w), which is still proportional to O(log n) where the biggest value is the integer n, so the whole Fibonacci algorithm is, even using multiword integers, no worse than O(n log n) using a variation on the code I just posted. Which is still not an exponential algorithm.
Running time is computed wrt the size of the input. In terms of arithmetic, the size is measured by the number of bits required to represent the number.
To count from 0 to n is exponential in the number of bits representing n. That’s why the complexity of numerical algorithms is usually measured wrt log(n)–that’s how much space the input actually takes up.
But with an array (or other container) of size n, you need one slot for each element. Counting the slots is linear in the number of slots. Cormen et al. discuss this in their section on arithmetical algorithms, and they probably explain it more clearly than I have.
Mmm, interesting. Okay, well, I suppose it does depend very highly on what you’re measuring with respect to. The size of the input’s very logical. I’ll buy that. Very universal means of measurement.
On the other hand, this means that if you passed in an array plus the integer n, you’d say you had a running time that was a higher order of magnitude for generating the nth number than if you just passed in the integer n.
To me this seems a counterintuitive measurement. I would want to know the order of magnitude of running time based on needing to know the nth number, the purpose of the algorithm, measured in terms of operations needed to solve the problem for the given input. In this case, there is no difference in big-O except for a constant factor.
Besides, if you’re measuring based on a strict definition of input size, you’ve got two problems:
1. The storage requirements for integers are quantized (not continuous).
2. The storage requirements for each element of the array grow in the same manner as the integer n, since the elements are themselves integers, so they are subject to any sort of sizing you care to place on the integer n.
I don’t recall whether or not the algorithms text I used in university analyzed integers as flexibly sized (except in the sense of developing multiword integer algorithms themselves).