For completeness, note that the statistical error on your number (binomial statistics) is 5000, so you can say that you’ve measured the probability to be:
p = 0.49999 +/- 0.00005
which is indeed consistent with 50%.
For completeness, note that the statistical error on your number (binomial statistics) is 5000, so you can say that you’ve measured the probability to be:
p = 0.49999 +/- 0.00005
which is indeed consistent with 50%.
Of course it’s 50%. The shape either is or isn’t concave. Since you have two possibilities, it’s 50-50!
I know it’s already been answered to satisfaction, but I thought of an analytical method using angles. The ‘bounding shape’ doesn’t matter, since all angles are equally likely.
Briefly sketched,
You have a line with points A and B on it, and point C and D not on the line. For the shape to be convex, angle ACD >= ACB (more or less; there’s symmetry involved here so we can draw this how we want to).
Call angle ACB x, then the probability the shape is convex = ( pi - x ) / pi .
Integrating this from 0 to pi, and considering that x is uniformly distributed on (0,pi) so we divide by pi to normalize, we get
poof
( pi * pi - pi [sup]2[/sup] / 2) / pi * pi = 1/2
This assumes an ordering of the points, just like my test does. If A and B are not adjacent vertices, then C and D are on opposite sides of the line through A and B, and your test breaks.
I didn’t state that at all clearly. Put another way, your test is verifying that polygon ACDB is convex, but it doesn’t work for the case where ACBD is (picture a segment AB with C and D on opposite sides).
I’m not sure whether this helps, but if I recall correctly, the convex hull generated by x_1, …, x_n is defined by
{ sum_{i=1}^n a_i x_i | a_i is in [0,1] for all i from 1 to n and sum_{i=1}^n a_i = 1}
In other words, a convex combination of x_1,…,x_n is a linear combination of those points where the coefficients are in [0,1] and sum to 1. The convex hull generated by those points is the set of all convex combinations of those points.
I’m not sure if an easier test can be derived from this.
Question: If you have a computer, what is the problem with tests that depend on the order of the points. You can simply run the test for all permutations.
Also, I believe yabob’s suggestion would work; I seem to recall a theorem about supporting hyperplanes or something.
And apparently so do I. :smack:
Given 4 points (1,2,3,4), there are only 2 ways to split it into 2 triangles. Now there are 6 ways to choose the vertices for the triangles from the 4 points. Yes I know that this will make some duplicates but it avoids having to know if the diagonals are (1,2),(3,4); (1,3), (2,4); or (1,4), (2,3).
Here is the key. Given the points we can find the lengths of the sides of each triangle, thus we can fing the area of each triangle using Hero’s Formula, and from here we add the two area’s together to find the area of the quadrilateral.
In a convex quadrilateral, all 6 combinations of choosing vertices will give the same area. In a non-convex quadrilateral, some combinations will give the true area of the quadrilateral while some will give a larger value since it will not account for the indentation as a negative value but rather as a positive value. In fact, you’ll know the true area of the quadrilateral (the smaller of the two answer) and the area of the indentation ((bigger answer - smaller answer) / 2).
I’m sure that someone can make Excel do these calculations, but I think using this for Monte-Carlo would be better suited for a programming task.
Hm. That Mathworld page is convincing (despite the inconsistent notation), and it looks like it’s the same problem we’re asking here… But that’s a pretty big discrepancy from the 50% ntucker’s code is indicating. On the other hand, it actually agrees pretty well with SaintCad’s result, once one takes into account the missing factor of 3. Then again, though, n=100 is awfully low…
25/36 = 0.69444…
Me: 0.21
with Chronos Correction: 0.63
Error: 10.2%
Not bad for only 100 tests
We need to coordinate better. We wrote the last 2 posts simultaneously.
I think this is how the formula in the mathworld article was derived. (Actually, Squink’s suggestion was quite close.)
Let A, B, C, and D be randomly chosen points from the rectangle. Then, the probability that they form a convex quadrilateral is the 1 minus the probability that any of the four points lies in the triangle (ignoring degenerate triangles for now) formed by the other three; in other words, it is
1 - Pr((A in triangle BCD) or (B in triangle ACD) or (C in triangle ABD) or (D in triangle ABC))
Note that if the triangles are not degenerate, then the events are disjoint, so you have
1 - (Pr(A in triangle BCD) + Pr(B in triangle ACD) + Pr(C in triangle ABD) + Pr(D in triangle ABC))
By symmetry, the above four probabilities should be equal, so you get
1 - 4 p, where p is equal to one of the probabilities in the above sum.
The probability p should be equal to the expected area of a triangle in the rectangle divided by the area of the rectangle. This gives you the answer in the mathworld article. I think degenerate triangles can be ignored because they give zero contribution to the probability, but I am not quite sure how to show this rigorously.
Okay, so I just did a Monte Carlo with a completely different convexity test. My convexity test is this:
The images are available here. By connecting the points with lines, it’s actually quite fast to determine visually if the outermost contour has three line segments or four. If a graphical determination was too difficult (due to resolution), I flipped a coin. My results:
p = 0.660 +/- 0.046 (if correct answer of 0.6944), or
p = 0.660 +/- 0.050 (if correct answer of 0.5000).
(Recall that the binomial error depends on the actual probability.) Thus, if the answer were really p=0.5, my run would be a 3.2-sigma fluctuation, which has a 1.4-out-of-1000 chance of occuring.
Nothing in particular wrong with it, so long as you realize that’s what you need to do. We’re just trying to be strictly correct about it – my “isconvex” code does not answer the question as it was posed in the OP; it answers a subtly different question, and answering the OP with it requires it to be applied repeatedly in all permutations. It’s what we programmers tend to call “inelegant.”
May I suggest an algorithm:
So, anyone got any clue as to why the math disagrees with the Monte Carlo method? Flaw in the Monte Carlo test?
SaintCad, I’m dismissing your results because you used a ridiculously small run. You seem to think that the numbers have proven your method was sound, but all you’ve proven is that they’re so sloppy, they can be used to support any of the answers arrived at so far. I just did 20 runs of my test with a sample size of 100 and got results ranging from 38% to 64%. It’s not useful data.
The math agrees with my Monte Carlo test.
Nothing in inherently wrong with any particular sample size. You just need to calculate the uncertainty in the result and ask if the measurement is useful. For your 100-sample example:
standard deviation of probability = sqrt((1/6)(5/6)/100) = 0.0373
after factor-of-three scaling: p = 0.5 +/- 0.11
0.694 is only 1.76 standard deviations above 0.5, and you should get an upward fluctuation at least that large about 4% of the time (in agreement with the spread you see in your 20 runs.)
SaintCad’s method also takes the factor-of-three hit, so his 0.63 still has the large 0.11 uncertainty. His result is inconsistent with p=0.5 at the 89% level. I agree that that is not conclusive.
However, the the number 100 isn’t to blame. I used 100 samples, but I didn’t have to take the factor-of-three hit. My result is inconsistent with p=0.5 at the 99.93% level. (In my last post I used both sides of the normal distribution to calculate the confidence level, but really only the upward fluctuation counts here.) And, it is quite consistent with p=0.694, being only 0.74 standard deviations away.
My algorithm used a column matrix M but as multiplied the inverse by a row-vector. If you change M to be a row matrix it works fine. Running it over 400 million samples gives a probability of 0.69444, pretty close to the expected probability 25/36.
OK, I think I can safely say I’m convinced of the 25/36 answer now. And I must add, while Pasta’s n is a bit lower than one might like, we at least know that his convexity test is robust! And it’s good enough statistics to serve to confirm 25/36, even if it wouldn’t have been enough to find it a priori. Thanks, all… I’m still curious about the bug in ntucker’s code, but I’m willing to consider the main question answered.
Still bugging me. There must be something wrong with my concavity test, but I don’t see it. I even wrote a little quick-and-dirty interactive drag-points-around-thingy to test it (warning, probably only works in IE: http://www.area.com/ntucker/test/concquad.html) and all seems ok.
(there’s another at http://www.area.com/ntucker/test/conc.html which allows you to specify how many points, but it doesn’t test all permutations of point ordering, so you have to keep the points in order)