My answer is "Ignore technical formalizations of probability theory. Let’s look at all the possibilities, and how good it is to switch in each possibility. It looks like this:
                YOUR ENVELOPE                    THE OTHER
| 1/8 | 1/4 | 1/2 |  1  |  2  |  4  |  8  |      ENVELOPE
+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |  4  |     |         8
+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |  2  |     | -4  |         4
+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |  1  |     | -2  |     |         2
+-----+-----+-----+-----+-----+-----+-----+
|     |     | 1/2 |     | -1  |     |     |         1
+-----+-----+-----+-----+-----+-----+-----+
|     | 1/4 |     |-1/2 |     |     |     |        1/2
+-----+-----+-----+-----+-----+-----+-----+
| 1/8 |     |-1/4 |     |     |     |     |        1/4
+-----+-----+-----+-----+-----+-----+-----+
|     |-1/8 |     |     |     |     |     |        1/8
+-----+-----+-----+-----+-----+-----+-----+
Is it good or bad, on average, to switch? Well, on the one hand, in each column, we can see that switching is good on average. On the other hand, in each row, we can see that switching is bad on average. On yet another hand, along each \-diagonal (running from upper left to lower right), we can see that switching is a wash, on average. That’s odd; how can the analysis change based on how we group things? Well, this square shows us how; it’s just a brute fact that regrouping can change things in this sort of way, in this sort of example. We know that regrouping won’t change things if there are only finitely many possibilities, but clearly, it can in cases like this. None of the paradoxical reasoning is fallacious; this is just the sort of case where there is no well-defined answer as to how good switching is on average, independently of how we group the calculation."
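
The three groupings can be checked with a few lines of arithmetic. The following is only a sketch, assuming the grid extends forever in both directions and that each filled cell counts equally within its group. The function names (gain, column_total, row_total, diagonal_total) are made up for illustration and do not come from the quoted answer.

```python
from fractions import Fraction

def gain(yours, other):
    """Gain from switching: what you receive minus what you give up."""
    return other - yours

def column_total(yours):
    # Your amount is fixed; the other envelope holds double or half of it.
    return gain(yours, 2 * yours) + gain(yours, yours / 2)

def row_total(other):
    # The other amount is fixed; your envelope holds half or double of it.
    return gain(other / 2, other) + gain(2 * other, other)

def diagonal_total(smaller):
    # The pair {smaller, 2 * smaller} is fixed; you hold one or the other.
    return gain(smaller, 2 * smaller) + gain(2 * smaller, smaller)

# The amounts shown in the grid: 1/8, 1/4, 1/2, 1, 2, 4, 8.
amounts = [Fraction(2) ** k for k in range(-3, 4)]

for x in amounts:
    print(f"{str(x):>4}  column {str(column_total(x)):>5}"
          f"  row {str(row_total(x)):>5}  diagonal {diagonal_total(x)}")
```

Every column total comes out positive (half the column's amount), every row total negative (minus half the row's amount), and every diagonal total zero. Each grouping is a different way of bracketing the same infinite collection of cell values, which is why the three answers can disagree.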