Anybody here have any experience with tweaking the Linux kernel for heavy duty calculations? Specifically Nastran. Anybody know of any special math libraries that can speed up solving time? Any memory management tweaks or anything of the sort?
Sorry about the hijack, but googling for "nastran cluster" does return some hits, although not nearly as many as I would have expected. I have also seen it mentioned in stories of the “Chinese build yet another big cluster” variety. So I guess running finite element analysis on large clusters is not yet a commoditized service, for whatever reason.
What sort of calculations are you trying to do?
Unfortunately, I think ANSYS has only one of their solvers ready for GPU computing. We are doing mostly static and dynamic solves. On the other hand, we also use quite a bit of Fluent.
Assuming you only have an executable, you’re pretty limited. On x86 it is worth looking at the Intel math libraries. They are pretty much as good as it gets. Whether existing compiled code can or will make use of them is another matter.
As to kernel tweaks, the kernel doesn’t have a lot of influence. A few things are worth doing.
Turn off hyperthreading if you have it. (This is a BIOS thing often.)
Try to avoid context switching. This means keeping the number of threads the same as the number of CPU cores. You could try playing with thread affinity to keep each thread associated with a particular core. What you want is to prevent a thread from losing its cached data, which is what a context switch causes. Disabling all the background cruft in the OS may help a bit too. It isn’t the time taken by these tasks that matters, it is the disruption they cause.
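As a rough illustration of the affinity idea, here is a minimal Python sketch, assuming a Linux box where `os.sched_getaffinity` and `os.sched_setaffinity` are available. The helper names `plan_pinning` and `pin_current_process` are made up for this example, and it says nothing about how Nastran itself starts its threads; it only shows one worker per core with each worker restricted to a single core.

```python
import os


def plan_pinning(n_workers, cores):
    """Map each worker index to one core, round-robin over the allowed cores.

    Hypothetical helper: one worker per core is the ideal case, extra
    workers simply wrap around and share cores.
    """
    cores = sorted(cores)
    if not cores:
        raise ValueError("no cores available")
    return {w: cores[w % len(cores)] for w in range(n_workers)}


def pin_current_process(core):
    """Restrict the calling process to a single core.

    Returns False on platforms without sched_setaffinity (it is Linux-only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means "the calling process"
    return True


if __name__ == "__main__":
    # Cores this process is currently allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)
    else:
        allowed = {0}  # assumption: fall back to a single core elsewhere
    # One worker per allowed core, so nobody has to be switched out.
    print(plan_pinning(len(allowed), allowed))
```

For a closed executable, the same effect can usually be had from outside the process with `taskset` from util-linux, e.g. `taskset -c 0-7 ./solver`, where `./solver` stands in for whatever binary you actually run.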
It might help if you reduce the amount of RAM the kernel uses by recompiling it without any modules you don’t use. This won’t do anything at all if you’re truly compute-bound as opposed to being RAM-bound and thinking you’re compute-bound. Other than that, the kernel doesn’t have much of anything to do with your software’s numerical performance. If you are truly compute-bound, the only thing you can do is go with more and/or different processing hardware.
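One quick way to get a hint about whether a run is RAM-bound is to look at swap usage while the solver is working. The following is only a sketch, assuming Linux and the usual `/proc/meminfo` fields (`MemAvailable`, `SwapTotal`, `SwapFree`); the function names are invented for the example, and nonzero swap use is a hint rather than proof.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines that look like "Name:   12345 kB".
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB (0 if the fields are missing)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("available kB:", info.get("MemAvailable"))
    print("swap used kB:", swap_used_kb(info))
```

If swap use climbs during a solve, the job is probably short of RAM and trimming the kernel or adding memory may help; if it stays at zero with plenty available, memory capacity is less likely to be the bottleneck.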
Search for Beowulf