Well, on that basis human vision is of course going to be superior to all other animals’. But this is quite a stretch from how we’d normally define vision and “better”.
Insofar as a bird knows anything, it does appear at least capable of parsing its world into obstacles, predators, prey, rivals and mates. And birds make use of their superior visual acuity, movement detection and colour gamut in doing this.
You may consider our vision to be superior because of our superior post-processing, but I for one would disagree with you.
Nor does the shrimp, of course, but that is not so much because humans have a larger visual cortex (even when you count in other parts of the human brain not traditionally considered to be visual cortex, but still largely devoted to vision, such as the LGN and the frontal eye fields) but because of other parts and functions of the human brain.
Anyway, I did not really mean to disagree with you: more just to point out that things are a good bit more complex (and less well understood) than your post might have been taken to imply. Yes, (brain) size probably does matter, to some extent, but not in a very straightforward or linear way. Men’s brains are, on average, markedly bigger than women’s, yet men are not systematically more intelligent than women. Elephants and whales have brains considerably larger than humans do, but, although they seem to be quite intelligent as animals go, there is no reliable evidence at all to suggest that their intelligence comes anywhere near human levels. (Chimps, with much smaller brains, may well come closer.) You just cannot safely infer all that much just on the basis of relative brain size (or brain-area size).
Simple illustration of this: You know what a penny looks like, right? Of course you do; everyone knows that. So… Which way is Lincoln facing? If we actually saw images, it would be easy to answer this: Just look at your memory of the image of a penny, and see. But we don’t: We see (and remember) according to our models. So, when we see a penny, we model it as “side view of a man with a beard, on a brownish metal disk”. But the model might not even include which side it is, or if it does, it’s a minor detail that can be easily lost.
It seems pretty simple to me. Animals are more concerned than humans with the recognition of food or predator. Both are more likely to move than to be stationary, so animals are conditioned to pay attention to movement, a clue that the object is either food or predator.
They see exactly the same things that humans see, but every species evolves, through natural selection, to pay attention to what is relevant in its life. Looking at it from the opposite viewpoint, the praying mantis can catch prey only by remaining very still, so that it doesn’t attract attention as a predator. The prey sees it, but does not interpret it as a threat because it is motionless, and so ignores it.
Humans have not been in a state of comfort, with little fear of predators and with easily accessible food, for very long, geologically speaking. It is not likely that such large physiological changes could happen in that time.