There are a lot of interrelated issues at play here.
Propagation of signals is a very difficult one. It is true that the velocity of light is about one foot per nanosecond in a vacuum. However, that isn’t really the issue in chip design. When sending a signal you are concerned with the time it takes the signal to rise (or fall) to the point at which it switches the receiver’s logic. At chip scales a conductor behaves not as an ideal wire but as a delay line. The resistance of the conductor is significant and depends upon the material it is made from (metal is good, polysilicon is dreadful), and the conductor is capacitively coupled to its surroundings. The effective propagation speed of signals, especially on very small feature-size chips, is astonishingly slow.
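To make the “delay line” behaviour concrete, here is a rough back-of-envelope sketch in Python. It uses the standard distributed-RC estimate that wire delay grows with the square of wire length; the resistance and capacitance per millimetre are assumed, order-of-magnitude values, not measurements of any real process:

```python
# Rough distributed-RC wire delay estimate (Elmore-style), illustrative numbers only.
# delay ~ 0.5 * (resistance per mm) * (capacitance per mm) * length^2

def wire_delay_ns(length_mm, r_ohm_per_mm, c_fF_per_mm):
    """Approximate 50% rise delay of a distributed RC wire, in nanoseconds."""
    r_total = r_ohm_per_mm * length_mm          # ohms
    c_total = c_fF_per_mm * length_mm * 1e-15   # farads
    return 0.5 * r_total * c_total * 1e9        # seconds -> nanoseconds

# Assumed, order-of-magnitude values for a narrow on-chip metal wire.
R_PER_MM = 1000.0   # ohms per mm (assumed)
C_PER_MM = 200.0    # femtofarads per mm (assumed)

for length in (1, 2, 5, 10):  # mm
    d = wire_delay_ns(length, R_PER_MM, C_PER_MM)
    print(f"{length:2d} mm wire: ~{d:.2f} ns  (light in vacuum: {length / 300:.3f} ns)")
```

Even with generous numbers, a signal crossing a 10mm die on a thin wire takes nanoseconds rather than the picoseconds the “foot per nanosecond” intuition suggests, which is why designers insert repeaters and keep critical paths short.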
However, the same problems exist when coupling chips together with wires. Once you drop off the die, signalling speeds are a couple of orders of magnitude slower, and the latency of an off-chip transaction is a significant limitation on speed. That is one of the reasons the current generation of processors has moved the memory controller onto the die; moving caches onto the die was one of the key enablers of current processor performance.
As the feature size shrinks, the clock speeds can increase, which is one of the reasons we saw ever-increasing clock speeds. However, as the feature size falls, the signal propagation speed falls even faster. There comes a time when the potential increase in clock speed can’t be realised because of the falling signal speed. The signal speed is (very roughly) inversely proportional to the wire resistance, and the resistance increases as the square of the shrink factor. So if you halve the feature size, your signal propagation time may quadruple, and the distance across your design a signal can propagate in one clock actually falls. Clearly there is a crossover point: a point where chips can’t clock faster even if they shrink. And guess what? We crossed that point a couple of years ago. (This is all very rough; there is devil in the details.)
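A crude scaling sketch of that argument, under the same rough assumptions (wire resistance per millimetre goes up with the square of the shrink, capacitance per millimetre stays roughly constant, and the clock period shrinks along with the feature size; all numbers are illustrative):

```python
# Crude scaling sketch: halve the feature size, watch the reachable distance per clock fall.
# Assumptions (illustration only): wire R per mm scales as 1/shrink^2, C per mm constant,
# clock period scales linearly with the shrink factor.

def reach_mm(shrink, base_r=1000.0, base_c_fF=200.0, base_period_ns=1.0):
    """Distance (mm) a signal can cross in one clock, using delay = 0.5 * r * c * L^2."""
    r = base_r / shrink**2                    # ohms per mm, grows as wires shrink
    c = base_c_fF * 1e-15                     # farads per mm, assumed constant
    period = base_period_ns * 1e-9 * shrink   # faster clock at smaller feature size
    # Solve 0.5 * r * c * L^2 = period for L.
    return (period / (0.5 * r * c)) ** 0.5

for shrink in (1.0, 0.5, 0.25):
    print(f"feature size x{shrink}: signal reaches ~{reach_mm(shrink):.1f} mm per clock")
```

Under these assumptions the reachable distance per clock drops from a few millimetres to well under a millimetre after two halvings, which is the crossover the paragraph above describes.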
So why keep shrinking? Well, you get more on a chip, and you do still keep making speed gains. The individual processors are not much faster, and we are at the point where it is hard to throw more logic at a processor design to make it faster. One, we are running out of ideas, and two, the propagation time to cross all that extra logic slows the chip down, so you don’t win. But coupling the processors to lots of fast cache on the chip works wonders, and ever more nearby cache is one of the reasons we still see increasing performance from modern processors. The same goes for keeping all the other logic nearby: bus controllers, memory controllers, and more processors. A parallel algorithm is often limited by the speed at which its parallel components can communicate (even if it is to do nothing more than synchronise), and going off chip to do this is very expensive. If your peer processors are on the same chip you are an order of magnitude better off; if they share cache, even more so.
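As a rough illustration of that communication limit, here is a toy model with assumed, order-of-magnitude latencies (tens of nanoseconds to synchronise through a shared on-die cache versus hundreds of nanoseconds to go off chip) showing how per-iteration synchronisation eats into parallel speedup:

```python
# Toy model: parallel speedup when every chunk of work ends with a synchronisation.
# All latencies are assumed, illustrative figures, not measurements.

def speedup(cores, work_ns, sync_ns):
    """Speedup over one core when work is split evenly but each iteration must synchronise."""
    serial_time = work_ns
    parallel_time = work_ns / cores + sync_ns
    return serial_time / parallel_time

WORK_NS = 1000.0          # assumed work per iteration on one core
ON_CHIP_SYNC_NS = 30.0    # assumed: synchronise through shared on-die cache
OFF_CHIP_SYNC_NS = 300.0  # assumed: synchronise over an off-chip interconnect

for cores in (2, 4, 8, 16):
    on = speedup(cores, WORK_NS, ON_CHIP_SYNC_NS)
    off = speedup(cores, WORK_NS, OFF_CHIP_SYNC_NS)
    print(f"{cores:2d} cores: on-chip sync ~{on:.1f}x, off-chip sync ~{off:.1f}x")
```

With off-chip synchronisation the speedup flattens out after a handful of cores, while on-chip synchronisation keeps scaling much further, which is the point of putting the peers and their shared cache on the same die.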
The big limit to the physical size of a chip is the yield rate. This has been a constant for decades, going back even to the days of discrete transistors. As you increase the size of the chip, the probability of including a defect grows very quickly, and there comes a point where you simply never get a working chip. It isn’t just defects in the wafer, but the chance of a tiny imperfection in any part of the fabrication process. The actual sizes of chips sold have changed almost not at all over the last couple of decades. Essentially you get small, very cheap processors used for embedded control and the like, a few millimetres across; consumer-level processors that go into most people’s PCs and laptops, about 10mm across; and huge, very expensive processors at the top end of the ecosystem, up to 2cm across. They cost a few dollars, a hundred or so dollars, and a thousand dollars or more, respectively.
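The “grows very quickly” part can be made concrete with the classic Poisson yield model, yield ≈ exp(−area × defect density). The defect density below is an assumed, illustrative figure, not data from any real process:

```python
import math

# Classic Poisson yield model: yield = exp(-area * defect_density).
# The defect density is an assumed, illustrative figure.

DEFECT_DENSITY = 0.2  # defects per cm^2 (assumed)

def poisson_yield(die_side_mm, defect_density_per_cm2=DEFECT_DENSITY):
    """Fraction of dies expected to be defect-free for a square die of the given side."""
    area_cm2 = (die_side_mm / 10.0) ** 2
    return math.exp(-area_cm2 * defect_density_per_cm2)

for side_mm in (3, 10, 20, 50):  # embedded, consumer, high-end, CCD-scale
    y = poisson_yield(side_mm)
    print(f"{side_mm:3d} mm square die: ~{100 * y:.1f}% yield")
```

Under this deliberately simple model a wafer-scale die of several hundred square centimetres would essentially never come out defect-free; real processes also use redundancy and repair, which the model ignores, but the exponential trend is the core of the yield argument against very large chips.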
On the other hand, how big a chip could we make? Or do we make? Current mainstream fabrication is on 300mm diameter wafers. Could we make one chip per wafer? One thing we do make is CCD sensors nearly that big. They are bespoke items, and it is hard to put a price on them, but for astronomers, and probably for some security applications, they are great. As a production item you could think of Kodak’s KAF-50100 CCD, which is 49.1x36.8mm and delivers 50 megapixels.
Two decades ago the Sematech consortium started out, and one of its initial goals was wafer-scale integration. So the idea isn’t new, but for the above reasons, and many more, it didn’t work out. Sematech still exists.