Java memory handling-- arrays of arrays of...

The biggest improvement you can make in the program is using a sieve algorithm instead of a one-at-a-time algorithm… Just like you already did before you asked us the question. The things like using a square instead of a square root in comparisons are good habits to get into, but only really make a big difference if it’s in the middle of a loop, not something that you only do once.

One other thing you can do in this specific program is to consider more of the earlier primes as “special cases”, and deal in blocks of numbers. For instance, you could consider 2,3, and 5 to all be special cases, and then your only candidate primes are 30n - 23, 30n - 29, 30n - 17, 30n - 13, 30n - 11, 30n - 7, 30n - 1, and 30n + 1, with n an integer. It makes your indexing a headache, but it saves you the trouble of eliminating all the multiples of 2, 3, and 5. How many numbers to treat as special cases depends on how high you want to take your list, and how much you value programmer time vs. processor time.

erislover wrote:

No surprise there, since it’s already been pointed out that it only ran once. :slight_smile:

Now go the next step, turn those booleans into bytes, use a bitmask, and you’ll multiply the total number of primes you can seive by an additional factor of 8 (if booleans are, indeed, bytes on the system you’re using). Of course, this will cost you some computation time, but that’s a standard trade-off where you need to pick for yourself what’s more important: the number of primes you can do at one time, or the total time taken to seive those primes.

This brings up another question (to me, at least): given that most of the math texts I’ve seen talk about primes with the notation P[sub]n[/sub] (the nth prime), what is generally recognized to be P[sub]1[/sub]? I’d guess that it’s 2. However, since 1 technically fits the defintion of a prime, does that make it P[sub]0[/sub]? Or is 1 actually P[sub]1[/sub], and there is no P[sub]0[/sub]? (Zero obviously not being a prime.)

Perhaps the compiler you’re using is smart enough to replace such an initialization loop with the equivalent of a C-style memset().

No surprise there, either, given how many times that inner loop runs. Without resorting to writing a program of my own (again), my gut feeling is that the first 10,000 or so primes cause that inner loop to iterate many times more than the value of ceiling, while the number of iterations after that point (to the end) is very much smaller. If the 10,000th prime is around 100,000 or so, no prime after that can make the inner loop run more through more than 250 iterations (ceiling/prime*2), whereas, again, the first prime, 3, forces over 16,000 times as many iterations as that.

When I worked on my big prime-finder, I ‘cheated’ a lot. The first program was nothing but a brute-force, divide-by-every-odd-number-up-to-sqrt(n), up to 65535 (16 bits). The output from this program was a text file, a C-style initialized array of the first 6,000 or so primes. This I cut-and-pasted into the second program, which started at 65537, and did a divide-by-every-prime-up-to-sqrt(n) algorithm (still only checking odd numbers, of course).

The output from this program was a binary file, which contained modified Huffman codes (I was doing a lot of work with faxes at the time) for the distances between all the primes starting with 2. The codes I used were based on the most-common distances observed in the first 6,000 primes (the output of the first program). Still, when the second program ended at 2[sup]32[/sup]-1, the output file was over 80 meg.

Considering that 4 gig over 16 (2[sup]32[/sup] bits, bit masking only odd numbers, 1=prime, 0=not prime) is 250 meg, the big pains I went through to encode all that crap only bought me a 3x compression. I was, to say the least, disappointed, since at the time, I was working on what would now be considered an antiquated 486 machine, and the program took many weeks to complete (and I had code in there so that if the process was interrupted, it would start over where it left off - a big pain by itself - and the program got terminated due to power glitches and actual work quite a lot).

The break even point is probably treating 2 and 3 as special cases. I was going to bring this gimmick up. All primes other than 2 or 3 are = 1 or 5 mod 6. Anything = 0,2,3, or 4 mod 6 has a factor of 2 or 3 in it.

The “isPrime()” routine I’ve used for along time makes use of this by simply looking for factors less than the square root of the number in a loop where two variables are incremented by 6, representing the possible factors mod 1 and mod 5. For reasonable sorts of numbers (I’m usually doing this to find a hash table size or some such thing), you aren’t going to be testing that many redundant factors (25, 35, etc), and I can’t justify a “cleverer” test or caching results or anything like that.

The reason I say 2 and 3 might be the break even point, is that, at that point, you are eliminating 2/3 of the numbers, a significant gain over just knocking out the even numbers. If you go to to 2,3,5 you wind up with 8 sets of numbers mod 30 as you noted with your indices (1, 7, 11, 13, 17, 19, 23 and 29 mod 30). You are eliminating 22/30 of the numbers, not that much better than 2/3, and probably not worth the added complication (If you REALLY want complication, try coding this sort of thing for the general case of the first N primes).

I think we all agree that if the OP wants to extend the range, if not the performance, of his sieve, he can go to explicit bit masks for the array, rather than booleans, and get an 8 times compression, as at least 3 of us have suggested.

And thank you DaveW, for looking closer than I did, and suggesting, along with other things, that the OP didn’t need to index from 2 every time on the inner loop. That alone should save some cycles, in addition to avoiding even multiples.

yabob wrote:

I’m not sure I did any such thing. The original inner loop went from just-found-prime-times-2 to the end of the array. My code does that, too. My only real improvements were skipping the even numbers, and switching to just addition, instead of that nasty multiply (even if it is a multiply-by-two, usually optimized to a shift-left).

Oh, just making 3 a special case would turn the indexing into the array into a royal pain, and probably negate any speed increase. After all, the array would then go 5-7-11-13-17-19-23-25, etc. There doesn’t appear to be a single, single function to go from some number n to an index - f(19) would be 5, or vice versa, where g(6) would return 23. Creating such functions wouldn’t be difficult, but is the expense at run-time at all worth the extra 1/6th memory savings (16% more numbers could be held in primes compared to only dropping storage of even numbers alone).

Also, yabob I wouldn’t be too sure that the priority is calculating more primes. One of erislover’s latest comments was: “Damnit, I WILL speed this program up.”

You could simply not worry about space compression and use the “increment two indices by 6” logic to ignore the ones that didn’t matter. If you bitmasked, you’d pick up enough extra space that you might not worry about squeezing out the evens.

Actually, you could even do a hybrid solution where you used two indices bumped by 3 into an array representing only the odd values. Same space compression you already have, the index calculation is not complex, and you still knock out the multiples of 3 as well as 2.

Anyway, we’re probably straining everybody’s patience with this question at this point, and have answered the OP’s prime question.

For my level of programming I think I’ll need to rewrite the entire code to get a clearer grasp of what is going on, because my two attempts to half the array size only resulted in a number of primes being skipped, and it also including numbers which weren’t prime :frowning:

Every definition of prime numbers I’ve seen involves stating “positive integers greater than one…[blah blah blah]” Sorta like 0! is just defined to be one, as there isn’t really a better way of doing it.

I’ll be back to programming tommorrow night so we’ll see how halving the array speeds things up. It should be at least by a couple of seconds.

Which was the limits on arrays! lol, we haven’t even touched that, but I suppose we are eating up board time for chatting about primes. Not that this is necessarily a bad thing…

So long as the thread stays open I’ll post any updates which actually increase speed.

Wha?!

erislover wrote:

Heck, post your modified loop. “Where’s The Error?” is a fun and instructive game for all. :slight_smile:

Uh, no. Quite the opposite. Since it just about doubles the number of primes the program will find, that’s more work for the inner loop to do, not less. If you’re halving the value of ceiling, though, then, yeah, compared to your original code, it won’t have to do the (ceiling/2) tests of “if primes[even number]” which would increase the speed ever so slightly.

Good grief! Okay, here goes: as far as I know, arrays in Java are limited only by the resources the Virtual Machine has available to it, or the limits a particular VM establishes on its own. Try switching to a byte array (just change ‘true’ to 1 and ‘false’ to 0 in your code - this shouldn’t change the speed at all), and see if the 50,000,000 value changes. As for multiple arrays, the limit should again be “available resources.” Do I understand the OP correctly in that you’re already allocating multiple 50-million boolean arrays? I’ve never heard of a limit on the number of variables for any function, so if there is one, it’s probably really, really high (like 32 or 64 thousand - anyone who writes a single function with 32,000 individually-named variables simply has waaaay too much time on their hands). But if you’re dynamically allocating variables, then once again, the limit should be “available resources.”

Well, I recently got into designing window based apps in Java, and my third test of ability was to make the primes application into a window application.

Knowing the absolute mess the last one was, I decided to start from scratch. interestingly enough I did get the code to run faster… a blazin’ 54 seconds now! :slight_smile:

The MUCH simplified loop runs thus:


public void findPrimes(int length) {
  boolean[] primeArray = new boolean[length+1];
  int i = 0;
  numPrimes = 0;
  for (i = 2; i < primeArray.length; i++) {
    if (!primeArray[**i]) { // if the number is false/prime
                      // then apply sieve with it
      numPrimes++;
      // [error checking]
      //result.append("
" + i);
      // result.append("" + numPrimes);
      // [/error checking]
      for (int t = 1; i*t < primeArray.length; t++) {
        primeArray[i*t] = true;
        }
      }
  }

:smiley: You might note the vB-styled comment tags there! lol But the core loop really is very simple, and it starts from 0, and it doesn’t initialize the array with new values. This makes it pretty simple.

I think that working with bits could make the total number of primes go on for a long, long time.

Consider that I create an array as big as it can get with ints. Each entry is initialized to be -2147483646 (which should be all on bits, and ints are 32 bits long) and I then manipulate those. I would be limited by maxInts * 32 primes to find… not too shabby.

I wonder how it stores booleans… if a boolean takes up more than a single bit (which it probably does, but I don’t know for sure) then this method would certainly return a higher number than 50 million primes. I shudder to think how long that would take. :wink:

I’m afraid this method is as pared down as it is going to get. Java is just that slow. :frowning: Any C coders want to whip this up quick and use the same loop to tell me how long it takes them to run? Interesting test…

If so, thanks :slight_smile:

You can probably get a speedup by using addition rather than multiplication in your inner loop. Multiplication takes a few cycles, up to 32, depending upon the architecture, while addition takes ony 1. You can change the above code to:


for(int t = i; t < primeArray.length; t += i) {
  primeArray[t] = true;
}

Do you know about binary trees, specifically, heaps? You can use much less memory and even get a speedup if you only store the primes, and test for primes in a breadthwise fashion, rather than in a depthwise fashion.