# Linear least squares optimization

I need to compute a set of coefficients c that best minimizes e for:
x[sub]n[/sub] - e = c[sub]1[/sub]x[sub]n-1[/sub] + c[sub]2[/sub]x[sub]n-2[/sub] + … + c[sub]l[/sub]x[sub]n-l[/sub]

I have quite a lot of data points (millions) which I can easily access in C++, though (at least with my skill level) not MatLab. I’ve been able to perform a decent analysis in MatLab on short sets of data points, but I’d like to go further.

The algorithms for doing this analysis seem to involve pretty heavy matrix math, such as a [n, m] * [m, n] matrix multiply.

The answer doesn’t have to be perfect, but I want good coverage of my data set. I don’t want to have to load everything into memory at once if possible. Anyone know of techniques that converge on the right answer given enough samples? A progressive technique that operates one sample at a time would be optimal (especially if it works in a single pass).

Also, are there any techniques that minimize a different falloff function, such as exponential? Minimizing the squares isn’t necessarily optimal for my purposes.

This isn’t my area of expertise but the key step for this minimization is just to invert a lxl matrix where l is the number of coefficients. The millions of data points need not be present in the memory; just make a single pass through them to build the matrix.

I’ll e-mail you the C code I use if you wish.

God gives burdens, also LAPACK.

Main download site. It ships with C APIs, which you can use from C++.

Example code.

More example code.

Intel’s documentation.

As for not loading everything into RAM, I dunno. Buy an SSD and transfer the data to that? There’s probably a clever solution which is elegant, efficient, and wrong the first few dozen times, which is why we lean on library code so heavily in the first place. Sidestepping the problem with hardware is a reasonable option at this point.

This is probably a bit opaque even to CS geeks.

I mean, load the data onto an SSD, mmap(2) it all into your address space, and let the OS take care of moving pieces of it in and out of RAM with the normal page cache mechanisms. That’s why we have OS kernels, to abstract away these kinds of details. It will entail a performance hit relative to the tuned-and-optimized code you haven’t written and probably shouldn’t try to write, but if the data’s on an SSD it shouldn’t be that bad.

(mmap(2) is a Unix system call. Windows has… something… hand-wave… check MSDN.)

Aha! That you told me it was possible led me to the solution.

I was thinking that I needed to do the A[sup]T[/sup]A multiply in memory, but it’s really just the repeated sum of a matrix with x[sub]i[/sub]x[sub]j[/sub] as elements. That’s easy enough, and from there I can do the inversion and the rest. Thanks!

BTW, I meant to thank you as well, even if I didn’t use that technique. Windows does in fact support memory-mapped files (and my SSD happens to be particularly fast).

I ended up coding it all up in C++, using the Eigen library for the matrix math. Nice and easy to use. The results match Matlab (more or less), so I’m pretty confident in the results.

I do find the results very curious. I was expecting the coefficients to go down roughly monotonically with their distance from the predicted sample. And this is true to a broad extent (the closest two samples dominate), but there are some interesting exceptions.