Rigorous proof for the VC-dimension of circles and triangles in the plane

I have seen hand-wavy arguments that the VC-dimension of circles is 3 and the VC-dimension of triangles is 7. But I cannot find a rigorous proof of the result for triangles. As for circles, although I did find a proof for a more general result about balls in R^n, I can’t seem to find it now.

I guess the problem is that most machine learning theorists are not geometers, but neither am I. Can someone provide a rigorous proof for either result?

Don’t know what you mean by rigorous proof.

Try this one.

Could you provide a reference to the notion that VC-dimension applies to “geometry”? All the references I can quickly find talk about modelling data sets, and I don’t at all see how circles or triangles come into it.

As a side question, do you mean filled circles and triangles (as most nonmathematicians seem to use the terms)?

Let X be a set, and H (the set of hypotheses) be a collection of subsets of X. Given a finite subset S of X, we say that H shatters S if for every subset T of S, there exists a hypothesis H in H such that T is a subset of H and S\T is a subset of X\H.

The VC-dimension of H (assuming it exists) is the defined as

VC(H) := max {card S | S is a finite subset of X shattered by H}

Oops, I posted that last one before I was ready.

The VC-dimension of H is n if there exists a set of n points in X shattered shattered by H, but no set of n+1 points in X can be shattered by H.

  1. If X is a plane and H is the collection of circles in the plane, show that the VC-dimension of H is 3 (unless you think this is wrong).
  2. If X is a plane and H is the collection of triangles in the plane, show that the VC-dimension of H is 7 (unless you think this is wrong).

And yes, when I say “circles” and “triangles”, I mean for the interiors to be included; I don’t think it matters whether or not the boundaries are.

So you’re asking if there’s a set of seven points such that for any subset there’s a triangle containing exactly those points, and further that there is no set of eight points for which this is true?

Yes, that is correct.

I can verify by doodling on paper that there exists seven points that can be shattered by triangles. But how do you show that there are no eight points that can so shattered?

Similarly, I can draw three points that can be shattered by circles, but how do you show no four points can be shattered?

I can outline a proof that there exists no four points that can be shattered by circles (that’s the easy one):

  1. Suppose there are four points that can be shattered by circles.
  2. You can prove that three points on the same line cannot be shattered by circles, therefor the four points must form a quadrilateral.
  3. You can prove that if the quadrilateral is concave, it cannot be shattered (one circle cannot contain the three outside points without containing the one in the middle).
  4. A convex quadrilateral must have two diagonals. Let these be AB and CD. If A and B are to be separated from D and C with a circle, D and C must lie outside the minimum circle containing A and B (that is, a circle with diameter AB. But if C and D both lie outside of the circle with diameter AB (and ACBD is a convex quadrilateral), then both A and B cannot lie outside the circle with diameter CD, therefore the four points are not shattered by circles (contradiction).

The triangle one is considerably harder, but in structure, it would be similar. You want to limit the possible configurations of 8 points down to a convex irregular octagon (?) and then show that there must exist four points in it that cannot be separated from the other four points.

I thought of this, too, but it’s a little more complicated than that. Let’s say that our quadrilateral is a kite-shape, with AB the short diagonal, and CD the long diagonal and axis of symmetry. Let’s further say that the angles at A, B, and C are all obtuse. In this case, the circle with diameter AB will contain point C, but it will still be possible to draw a circle which contains AB but not C or D. One might, for instance, take the circle with diameter CD and shrink it a small amount (since A and B are obtuse angles, they’ll be inside this circle).

You’re right. The devil is in the details. Nice counterexample.

So after thinking about it a little bit more about what Chronos said, perhaps the best way to approach it is, for quadrilateral ACBD, look at the circumscriptions of each combination of three points. To separate A and B from C and D, then the circumscription of ACB must not contain D, or the circumscription of ADB must not contain C. This is possible as Chronos pointed out, but if this is true, it won’t be possible for the circumscription of CDA to not contain B or the circumscription of CBD to not contain A. It’s much trickier than I thought.

By this same reasoning, consider the circle with diameter that is exactly CD (C and D lie on the circle). Since angles at A and B are obtuse, they must lie inside this circle (if they laid on the circle they would have to form right angles, and outside the circle they would have to be acute). Thus, a circle with diameter CD would contain A and B. Any larger circle would necessarily contain this circle, therefore any circle containing C and D must also contain A and B, so the subset C and D cannot be isolated.

I know nothing about VC dimensions, but am interested in this topic from a geometric point of view. If I have misunderstood the definitions of shattering or the like, please excuse me.

The problem as Chronos pointed out, is here. There ARE larger circles that contain C and D that do not contain all the points in the circle with diameter CD! As it turns out, some of those larger circles will contain A, and some of them will contain B, but it is harder to prove that all of them contain A or B. I’ve been looking at it on paper, and it looks obvious, but I already made the “looks obvious” mistake today, so that’s all I can say ;).

The case with shattering by “circles” (disks) has already been reduced to considering shatterings of convex quadrilaterals. For this case, use the fact that for two intersecting chords of a circle, the segments of each chord cut by the other have the same geometric mean. That is, let P, Q, R, S be points on a circle such that PR and QS intersect at point X within the circle. Then PXXR = QXXS.

Now let ABCD be a convex quadrilateral, and let X be the point of intersection of the diagonals AC and BD. Assume without loss of generality that AXXC >= BXXD (swapping the labels of A<->B and C<->D if necessary). Now any disk containing points A and C must contain at least one disk with A and C on its boundary, and thus having the chord AC. The line BD intersects this disk in a second chord EF, which cuts chord AC at X. Now by the theorem stated above, we have AXXC = EXXF >= BXXD. But if both B and D were outside this circle, then we would necessarily have BXXD > EX*XF, contrary to the conclusion. So either B or D (or both) lie within the disk, and the set of disks cannot shatter the set {A,B,C,D}.

(This is much easier to convey with a picture.)

Yes, but you’ve done quite well without one. Beautiful proof. That geometric mean theorem nailed it! I’ll have to remember that one.

Oops; should have said "Any larger circle containing C and D would necessarily contain one of the semicircles lying on the diameter C and D. Since ABCD is convex, A and B lie in separate semicircles, and therefore any circle containing C and D must also contain A or B, so the subset C and D cannot be isolated.

The case with shattering by filled triangles involves several special cases. As jawdirk notes, we can immediately reduce to the case of a convex polygon; if D is contained within or on the triangle ABC, then any convex figure containing A, B, C must also contain D, so any set containing A, B, C, and D cannot be shattered by convex regions (including circles and triangles). We will use the fact that for a convex polygon P[sub]1[/sub]P[sub]2[/sub]P[sub]3[/sub]…P[sub]n[/sub], the line P[sub]i[/sub]P[sub]j[/sub] (i<j) has P[sub]i+1[/sub],…,P[sub]j-1[/sub] on one side and P[sub]1[/sub],…,P[sub]i-1[/sub],P[sub]j+1[/sub],…,P[sub]n[/sub] on the other.

So consider a convex octagon ABCDEFGH. The shattering we prove impossible is the separation of T={A,C,E,G} from S\T={B,D,F,H}. Consider a triangle which contains the points of T. As before, we can simplify matters by considering only the “minimal” hypotheses: those hypotheses H containing T for which there exist no hypotheses H’ contained in H which also contain T. In the case where H consists of all triangular regions, it is clear that the minimal hypotheses are triangles with at least one point of T on the interior of each edge, or with points of T at each endpoint of the edge. (If not, then we could shrink the triangle by adjusting this edge while maintaining a valid hypothesis.) [So if the triangle PQR is a minimal hypothesis, then either P and Q are in T or there is a point on the interior of segment PQ in T.]

One can consider four cases, depending on whether the minimal hypothesis PQR under consideration has three, two, one, or zero of its vertices P,Q,R in T. In each case we can use the convexity of ABCDEFGH to show that at least one element of S\T must be in the triangle as well, so that the hypothesis is not a valid separation of T from S\T. This is the sort of geometric argument that’s much easier to understand when you draw your own picture to see where the points have to go, so I’m not going to do all the cases. Here’s an example, for the case in which exactly one of the vertices is in T:

The triangle has the form APQ, with C on AP, E on PQ, and G on AQ. By convexity, points D and F must be on the same side of line AC (= line AP) as Q and on the same side of line AG (= line AQ) as P. But by assumption D and F lie outside of the triangle APQ, so they must lie on the side of line PQ opposite A. But D and F must be on opposite sides of line AE, so it’s easy to see that the triangular region ADF must contain E. So the octagon is not convex, a contradiction.