You seem to have a fuzzy understanding of what microcode is.
All you need to make a very primitive processor is a couple of registers and some control logic. You can implement your control logic using a simple ROM and a register to hold the “state” of the machine. In this sense, a processor is just a glorified state machine.
Taking a very simple example, most processors go through four basic states, called instruction fetch, instruction decode, execute, and write back. Instruction fetch simply fetches the next instruction from memory. Decode fetches all of the necessary data for the instruction, so if the instruction is “add A and B together” it goes out to memory and grabs the values of A and B. Execute is where the data gets fed through the ALU, and write back stores the result of the ALU somewhere.
To create a four-state machine using a ROM and a register, your ROM would have the following values:
Address 0: 1
Address 1: 2
Address 2: 3
Address 3: 0
When it starts with an address of 0, the output of the ROM is 1, which gets latched into the state register on the next clock cycle, changing the state to 1. This state is then used as the next address into the ROM, and the value at address 1 is 2. So on the next clock cycle, the state changes to 2, the output of the ROM changes to 3, and on the next clock the value 3 is latched into the state. At this point, the output of the ROM is 0, so on the next clock it goes back to the initial state.
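That clock-by-clock behavior is easy to simulate. Here's a minimal Python sketch of the ROM-plus-register machine just described (the variable names are mine, just for illustration):

```python
# The ROM contents are our "microcode": address -> next state.
rom = [1, 2, 3, 0]

# The state register starts at 0; each "clock cycle" latches
# the ROM output back into the register.
state = 0
trace = []
for _ in range(8):          # run eight clock cycles
    trace.append(state)
    state = rom[state]      # ROM output becomes the next state

print(trace)  # -> [0, 1, 2, 3, 0, 1, 2, 3]
```

Reprogramming the `rom` list changes the sequence of states without touching anything else, which is exactly the appeal of microcode.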
This is a bit of a stupid state machine, since it basically emulates a simple counter circuit, but the ROM code is our “microcode”, and if we want to change the way the state machine operates all we need to do is reprogram the ROM.
In a real processor, the next “state” would be determined not only by the output of the ROM but also possibly by other external signals, such as the value in the instruction register. Back in the old days, dedicated hardware for multiply and divide was too expensive, so when one of those instructions was being executed, it wasn’t done in a single pass through the ALU. Instead, multiplies were done using cycles of adds and shifts (look up Booth’s algorithm) and divides were similarly done using subtracts and shifts. This makes the state machines much more complicated, and the microcode gets similarly much more complex.
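To make the add-and-shift idea concrete, here's a sketch of an unsigned shift-and-add multiply, the kind of loop a microcoded multiplier would step through one ALU pass per cycle (Booth's algorithm is a signed refinement of the same idea; the function name and `width` parameter are mine, just for illustration):

```python
def shift_add_multiply(a, b, width=8):
    """Multiply two unsigned ints using only adds and shifts."""
    product = 0
    for _ in range(width):   # one microcode cycle per multiplier bit
        if b & 1:            # low bit of the multiplier set?
            product += a     # ...then add the shifted multiplicand
        a <<= 1              # shift multiplicand left
        b >>= 1              # shift multiplier right
    return product

print(shift_add_multiply(13, 11))  # -> 143
```

Each iteration is one trip through the ALU, so an 8-bit multiply costs up to eight cycles instead of one, which is why the state machine (and its microcode) grows so much.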
Around the time of the 386 CPU, most processors ditched the microcode model and started using RISC style pipelines, in essence “unrolling the loop” (as they call it). Instead of using just a couple of registers and a microcode ROM to cycle through states, each state was given its own separate piece of hardware. While one instruction was being fetched, another was being decoded, a third was being executed, and a fourth was being written back (actual processor pipelines are often much more complex than this, but you get the idea). Microcode mostly went away, but is still used in some cases, as was mentioned by previous posters.
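The overlap is easier to see laid out cycle by cycle. Here's a toy Python sketch (stage and instruction names are my own shorthand) showing four instructions each advancing one stage per clock, so all four stages are busy at once:

```python
stages = ["IF", "ID", "EX", "WB"]          # fetch, decode, execute, write back
instructions = ["i1", "i2", "i3", "i4"]

schedule = []
for cycle in range(len(instructions) + len(stages) - 1):
    # An instruction issued on cycle i is in stage (cycle - i).
    in_flight = [f"{instr}:{stages[cycle - i]}"
                 for i, instr in enumerate(instructions)
                 if 0 <= cycle - i < len(stages)]
    schedule.append(in_flight)
    print(f"cycle {cycle}:", " ".join(in_flight))
```

By cycle 3 the pipeline is full: i1 is writing back while i4 is being fetched, with no microcode sequencer cycling a shared datapath through the four states.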
An FPGA is completely different from microcode. With microcode you typically have fixed hardware with a programmable ROM to control it. In an FPGA, you have programmable interconnects between the gates. When you program an FPGA it kinda looks like it has an instruction set, especially if you are programming it using VHDL, but you are really programming connections, not a ROM (although your “program” will typically end up being stored in a ROM).
If you wanted to be really tricky about it, you could program a processor into an FPGA with microcode in a ROM inside the FPGA. Then you could have microcode inside VHDL code.