Basically, microprocessors are successions of logic gates (transistorized ANDs, ORs, flip-flops, etc.). With each clock cycle, the output of one logic circuit may cause the next gate to change state. (Very simplified.)
So the limiting factors are fairly obvious - how fast can the clock go? How fast can a transistor change state? How long is the distance from one logic circuit to the next? (Resistance can slow a signal down, but the theoretical upper limit is of course the speed of light.)
I once saw Admiral Grace Hopper (“Grandma Cobol”) on the Dave Letterman show. One of the items she brought was picoseconds. Actually, it was a packet of pepper, but her point was - this is how far electricity travels in a picosecond.
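Hopper's demo is easy to sanity-check with a few lines of arithmetic. This sketch just multiplies the speed of light by a nanosecond and a picosecond (in real copper, signals travel at roughly half this speed, so the true distances are even shorter):

```python
# Back-of-the-envelope check of Hopper's demo: how far a signal travels
# in one nanosecond and one picosecond at the speed of light in a vacuum.

C = 299_792_458  # speed of light, m/s

ns_distance = C * 1e-9   # metres per nanosecond
ps_distance = C * 1e-12  # metres per picosecond

print(f"1 ns: {ns_distance * 100:.1f} cm")   # ~30 cm - Hopper's famous "nanosecond" wire
print(f"1 ps: {ps_distance * 1000:.2f} mm")  # ~0.3 mm - about the size of a grain of pepper
```

So a packet of pepper really is a fair prop: a picosecond of light travel is roughly one pepper grain.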
So-
Transistor speed is getting faster as transistors get smaller. But obviously, the theoretical limit is the minimal size of a transistor - too small, and there are not enough atoms for the transistor effect. (Also, too small and random cosmic ray particle collisions can flip the transistor.) In fact, the real problem is the technique for fabricating transistors this small. The separation distance between elements on a chip is on the order of nanometers (IIRC, there was an item about 7nm fabrication in the news recently).
Smaller transistors mean shorter distances between elements. Plus, part of the design is to arrange the elements - registers, bus, gates to bus, ALU (arithmetic logic unit), etc. - so that the typical paths are as short as possible. Shorter distance - less delay in transferring data between elements.
As more of the components are moved from separate support chips onto the single chip, this eliminates a lot of fairly long (many centimeters) paths that the signal has to traverse - and also means the signal does not need as much power (voltage, current) to function.
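To put some rough numbers on that: here is a sketch comparing a 10 cm board trace to a 1 mm on-chip path, measured in 4 GHz clock cycles. The half-the-speed-of-light propagation figure is a common rule of thumb for copper interconnect, not a measured value for any particular chip:

```python
# How long a signal takes to cross a board-level trace vs. an on-chip
# path, expressed in 4 GHz clock cycles. Assumes signals propagate at
# roughly half the speed of light (a rule-of-thumb figure).

C = 299_792_458          # speed of light, m/s
SIGNAL_SPEED = 0.5 * C   # assumed propagation speed in a copper trace
CLOCK_HZ = 4e9           # 4 GHz clock
cycle_s = 1 / CLOCK_HZ   # one clock period = 250 ps

for name, metres in [("10 cm board trace", 0.10), ("1 mm on-chip path", 0.001)]:
    delay_s = metres / SIGNAL_SPEED
    print(f"{name}: {delay_s * 1e12:.0f} ps = {delay_s / cycle_s:.2f} clock cycles")
```

The board-level hop costs multiple clock cycles just in travel time; the on-chip path is a small fraction of one cycle. That is the whole argument for integration in two lines of output.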
We also read about stacked or multi-layer chips, where the pieces of the processor are layered in 3 dimensions rather than laid out in plain 2 dimensions. But then, fabrication processes and removing heat are issues.
Each time a transistor changes state from on to off, it generates heat. The determining element is how much power is involved. IIRC, the defining factors are the capacitance being switched (which depends on size), the voltage, and the speed of the change of state - dynamic switching power scales roughly with capacitance times voltage squared times frequency. Hence, the answers above about heat - change a transistor's state (on-off or off-on) too many times a second, and even a low-power transistor will generate heat.
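The standard approximation here is P = C x V^2 x f. A quick sketch shows why clock speed is so expensive in heat and why lowering the voltage helps so much - the capacitance and voltage figures below are made-up round numbers for illustration, not data for any real chip:

```python
# Dynamic (switching) power via the standard CMOS approximation
# P = C * V^2 * f. All figures are hypothetical round numbers.

def dynamic_power(cap_farads, volts, freq_hz):
    """Approximate switching power of a CMOS circuit, in watts."""
    return cap_farads * volts**2 * freq_hz

CAP = 1e-9  # 1 nF of total switched capacitance (made-up figure)

print(dynamic_power(CAP, 1.2, 2e9))  # 2 GHz at 1.2 V -> 2.88 W
print(dynamic_power(CAP, 1.2, 4e9))  # double the clock -> double the power (5.76 W)
print(dynamic_power(CAP, 0.6, 4e9))  # halve the voltage -> a quarter of that (1.44 W)
```

Doubling the clock doubles the heat; halving the voltage cuts it to a quarter. That asymmetry is why chips kept shrinking their supply voltage while clock rates plateaued.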
Notice that chip speed has stalled at about 4 GHz or less? That now seems to be a practical limit for heat and switching speed.
So - future enhancements - still smaller transistors, more dense packing, multi-layer; fitting more and more of the entire PC on one chip.
Open Task Manager in Windows and see how many parallel processes are running at once. There's a real advantage to having multiple cores, because many of these processes are relatively independent of each other. Rather than waiting a turn for one processor, why not have several?
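The idea is easy to demonstrate in a few lines: hand several independent CPU-bound jobs to a pool of worker processes instead of queueing them behind a single core. The busy-loop workload here is a made-up stand-in for real work:

```python
# Toy illustration of independent tasks on multiple cores: four
# CPU-bound jobs run in a pool of worker processes rather than
# waiting their turn on one core. The workload is a made-up busy loop.

from concurrent.futures import ProcessPoolExecutor

def busy_work(n):
    """A CPU-bound task: sum the first n integers the slow way."""
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    tasks = [200_000] * 4  # four independent jobs
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(busy_work, tasks))
    print(results)  # all four jobs ran concurrently instead of serially
```

On a four-core machine the four jobs run in roughly the time of one; on a single core they would simply take turns, which is exactly the wall the text describes.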
This is the same route older computers took. IBM mainframes and DEC VAXes both hit this wall: processors could only go so fast, and any further increase came in very small increments, but adding multiple processors or cores made the computer that much faster.
Another trick is to offload tasks to other chips - disk I/O, graphics, networking. This is essentially a variant on parallel processing.
So have we hit a limit? No, but this particular configuration of PC processors is reaching its technical limit; the rate of increase, as noted in the OP, is slowing down. My current PC is 5 years old. 10 years ago, 20 years ago, a 5-year-old machine was barely usable with modern software. Now, it's not unusual. I have helped some clients who are just now dumping XP and still run Windows Server 2003.