# Sierpinski Triangle - why does it form in the 'chaos game'?

The Sierpinski triangle is (for me at least) a neat and simple way of illustrating the concept of self-similarity and the nature of fractals.

One way of generating the Sierpinski triangle is by means of the so-called “Chaos Game” (which, by the way, is more than easy to program in Basic. Even I was able to).

Here is a much more detailed and generic reference for the Chaos Game. Rather than me listing the game’s very few and very simple rules, just peek at the first few lines of the that link.

One thing I’ve never understood, and the reason why I’m opening this thread, is just how choosing the vertices at random causes the triangle to form. There are explanations given at a number of sites including the same one I linked to in the preceding paragraph. I’ve tried, but just don’t understand it.

So, I ask, is there a simpler or perhaps more intuitive way of demonstrating how/why the randomness of choosing the vertices comes into play in the generation of the Sierpinski triangle?

Don’t think in terms of where the dots are landing. Think in terms of where the dots aren’t landing.

Care to explain a bit more thoroughly?

Let’s assume that you start with a point that’s on the Sierpinski triangle. The chaos game picks a new point that’s halfway between the first point and one of the vertices.

Can you see that, as long as point N is on the triangle, then N+1 is as well? Let’s reorient things such that the chosen vertex is at the top. Since the upper triangle is similar to the triangle as a whole, except scaled by 1/2 in each dimension, the new point will be in the same place relative to the scaled triangle as the original point was to the whole triangle. It is like a pantograph at work.

You might ask how our random initial point gets to be on the triangle, even if it didn’t start on one. Well, suppose our original point was some distance e to its nearest point on the triangle. On the next iteration, that distance can be no more than e/2. On the one after that, e/4, and so on. This exponential decay means that it doesn’t take long for our points to get very close to the original triangle.

To add a bit regarding randomness:

Hopefully I convinced you that the chaos game iteration quickly gives you a sequence of points that lie on the triangle. This is true regardless of which vertices you pick. However, the randomness helps you fill out the triangle somewhat evenly. Clearly, if you pick the same vertex over and over, your points will just converge to that vertex, and you’ll only fill out a tiny portion of the triangle. Choose randomly, though, and the points will tend to spread out more.

The sequence doesn’t actually have to be random. The digits of pi in base 3 would work fine. I suspect you could also just run through all the integers in base 3 (0, 1, 2, 10, 11, 12, 20, etc.).

Dr. Strangelove: Thank you so much! What you said, and how you said it, has finally made things a lot clearer for me. A lot! Much obliged.

Here’s one way to think about it (re-expressing the observations that Dr. Strangelove made):

Any point within a triangle can be uniquely expressed as some weighted average of its vertices: if the triangle’s vertices are A, B, and C, then any point within it is of the form aA + bB + cC for some unique weights a, b, and c which sum to 1.

When we split our large ABC triangle up into four sub-triangles in the Sierpinski way, the sub-triangle containing A consists of those points which give more than half their weight to A, the sub-triangle containing B consists of those points which give more than half their weight to B, and similarly for C, while the remaining, central sub-triangle (the one we exclude from the Sierpinski fractal) is where all the individual weights are below 1/2.

In other words, if we look at some point’s weights a, b, and c in binary (so we have a = 0.a[sub]1[/sub]a[sub]2[/sub]a[sub]3[/sub]…, b = 0.b[sub]1[/sub]b[sub]2[/sub]b[sub]3[/sub]…, and similarly for c), since these weights sum to 1, at most one of a[sub]1[/sub], b[sub]1[/sub], and c[sub]1[/sub] is 1. If all three are 0, the point being described lies in the excluded central triangle and is not part of the Sierpinski fractal. Otherwise, knowing which of these is 1 tells us whether we are in the sub-triangle containing A, B, or C, while the remaining bits 0.a[sub]2[/sub]a[sub]3[/sub]a[sub]4[/sub]…, 0.b[sub]2[/sub]b[sub]3[/sub]b[sub]4[/sub]…, and 0.c[sub]2[/sub]c[sub]3[/sub]c[sub]4[/sub]… describe how we are situated within that sub-triangle (note that these will also sum to 1).

Accordingly, a point is in the Sierpinski fractal just in case, when we write its weights a, b, and c in binary, we find that there is no bit-position at which they are all simultaneously 0 (rather, at each bit-position, precisely one of them is 1 and the others are 0).

Note what happens to these binary expansions when you take the half-and-half average of some arbitrary point aA + bB + cC with one of A, B, or C: it’s as though you stick the bit 1 in front of one of the weights a, b, and c [according as to whether it was A, B, or C you were moving towards], and you stick the bit 0 in front of the other two.

Now, if we assume lower-order bits are visually insignificant, so that only the first N bits of the weights matter, then we will find that when we go to draw a point, the only thing that matters are the last N vertices randomly chosen (the last vertex chosen determining the first bit of our weights [one weight receiving here a bit 1 and the others receiving bit 0 according as to which vertex was chosen], the second last vertex chosen determining the second bit of our weights in the same way, and so on).

In this way, we find that at each moment, of the about 3[sup]N[/sup] many possible points in the Sierpinski fractal (up to the indistinguishability of our not caring about weights beyond the first N bits), we pick one at random. There will be some correlation between the point chosen at one moment and the point chosen in the very next iteration, but there will be no correlation between the point chosen at one moment and the point chosen N or more iterations later. Accordingly, we expect that almost certainly, over time, we will end up choosing each of these points about equally often, thus drawing out the Sierpinski fractal to suitable resolution.

It’s not been proven that the digits of pi in base 3 (or any other base for that matter) comprise a normal, or even disjunctive, sequence. Accordingly, it’s not known that the digits of pi in base 3 would work fine at arbitrary resolution. They may act in a manner very far from random, never containing some particular string of digits.

Yes, this would work fine, as this does comprise a normal sequence.

If one is dead-set on drawing the Sierpinski triangle in this particular manner, continuing over time to achieve as good a resolution as possible as soon as possible, though, one should use a base 3 infinite de Bruijn sequence to select the vertices (that is, a sequence in which each of the 3[sup]n[/sup] many possible n-length strings of 3 symbols first appears with start within the first 3[sup]n[/sup] symbols of the sequence, for each n).

Neat one. I like it especially because my company sells more barycentric interpolators than anyone else on earth.

It also proves my suspicion about just running through the base-3 integers. If we decide that points are visually indistinguishable if the first N points are the same, then running through the integers of length N+1 will be guaranteed to cover all possible points (we only need to keep the last point after feeding in the number). If we keep going, we can reach arbitrarily high visual precision.

Ninjaed!

I should have known that someone would nitpick my pi comment :).

I wonder, actually, what the highest number is such that pi contains all sequences less than that. I’m guessing it’s large enough that it would be sufficient for any ordinary visualization.

What would happen if instead of each vertex having the same chance of being selected, one was, say, twice as likely as the other two (i.e. probs of 0.5, 0.25, and 0.25)?

A Sierpinski triangle would still form, but less rapidly, no?

And, in general, so long as no vertex had probability of 0, it would form, right?

I know the above is hardly profound but it helps me be certain that I understand the role of randomness in the exercise (that is, assuming I’m correct )

ETA: IIRC, when I wrote my little BASIC program decades ago, and allowed the number of vertices to go to infinity, not only did a ‘circle’ form, but the distribution of points within it seemed to show a pattern; a pattern reminiscent of the distribution of galaxies

ETA: I think

Uh, I forgot a few words here. I meant to say such that pi is known to contain all sequences less than that. That is, take the latest record in number of computed digits of pi, and see what the largest number is such that all numbers less than that are included. You would expect that number to be the same basic order of magnitude as the number of digits, but I wonder what the number actually is given the current state-of-the-art.

Sure. As noted above, we can identify suitable regions of the Sierpinski fractal with finite sequences of choices of vertex (longer sequences corresponding to smaller regions, at some point becoming single pixels), and a particular such region will be hit once its corresponding sequence describes the last however many vertices randomly chosen.

If some vertices are more likely than others to be chosen, then, in the same way, some sequences will be more likely than others to arise (e.g., if A is twice as likely as B or C, then ABA is twice as likely to arise at any given moment than ABC, and four times as likely as BBC), but each sequence still has some positive probability of arising, and thus will almost certainly arise eventually. Accordingly, we will still end up drawing each bit of the Sierpinski fractal eventually; however, we’ll hit some parts more frequently than others (thus filling them out in higher resolution faster than others).

Gotcha! Thank you.