In the end, math is just:
[ol]
[li]Make up a couple of things[/li][li]Make up rules for the things to follow[/li][li]See where those rules take you[/li][/ol]
The universality of math (well, one kind of universality) lies in the fact that, given the same things and the same rules, everybody gets to the same places. There’s no guesswork, no human input, no intuition needed, at least in principle. Of course, if you use different rules or things, you’ll end up different places; but that’s empty, it’s nothing else but saying that if you don’t agree, you disagree.
The point is that if you have a set of axioms A that prove some theorem T, then the proposition that A proves T (A -> T) is provable from no axioms at all; it’s as universal a truth as you’ll get anywhere.
However, there is another kind of universality to math that’s somewhat more subtle: if the system of things and rules I use is sufficiently complex, then, for any system of things and rules you might use, I can encode your things and rules in mine, and use my system to prove everything yours can; this is, for instance, the basis of Gödel numbering, which encodes a mathematical system of arbitrary symbols and their manipulation-rules into the natural numbers. The requisite complexity is reached precisely if my system can be used to encode a Turing machine. (Though ‘complexity’ is really a bit of a red herring, or at least used in a somewhat unfamiliar way, here: very simple systems can be Turing complete (cf. Rule 110), while it is possible to design systems of near-impenetrable intricacy that fail to be.)
So all those ‘sufficiently complex’ systems are in a well defined sense ‘the same’: For any two systems A and B, both of which are Turing complete, both A can encode B and B can encode A (I sometimes prefer to say ‘emulate’ instead of ‘encode’, since it’s basically the same: two different computational architectures – say, Mac and PC – can be made to emulate one another for much the same reason two mathematical systems can).
This also gives a nice answer to Wigner’s ‘unreasonable effectiveness’: in the end, the world around us is only composed out of certain things that follow certain rules, as well; that one thus can find embeddings of these rules and things within mathematics is perfectly unsurprising (well, at least as long as reality isn’t hypercomputational, but that’s not something to go into right now); at least as unsurprising as it is that one can create computer simulations of aspects of the real world (to my knowledge, nobody has ever marvelled at the unreasonable effectiveness of computers in modelling – that’s because the notion of universality, or Turing completeness, was built into the field from the beginning).