Does the CPU stop accessing memory (RAM) or stop operating while a PCIe device accesses RAM?

On electronics.stackexchange.com, a similar question and its answer read as follows:

I am wondering whether the current computer architecture on desktop PCs (x86 PCs) completely stops accessing memory, or stops processing any instruction, when it receives an “interrupt notification”. Or does the CPU continue accessing RAM in parallel with other PCIe devices?

An interrupt notification doesn’t cause the CPU to stop working; it just gets it to work on something new (the interrupt servicing code). If a (non-DMA) device sends an interrupt to signal that data is ready to be read, the CPU will often be the one doing the reading, so it stops what it’s doing (as described in your quote) and starts executing code that fetches the data from the device using I/O instructions and stores it somewhere in memory. Other types of interrupt, such as timers, trigger different code. Note that, in a multiprocessor or multi-core system, only one of the cores handles each interrupt; the other cores continue doing their own thing.
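As a rough illustration, here is a minimal sketch in C (with x86 inline assembly) of what such an interrupt service routine could look like for a hypothetical port-mapped device. The port number, buffer, and `device_isr` name are invented for the example; a real handler would be registered with, and dispatched by, the operating system.

```c
/* Hedged sketch: an interrupt handler for a hypothetical port-mapped
 * x86 device.  The port number, buffer, and function name are made up
 * for illustration; a real driver is registered through the OS. */
#include <stdint.h>
#include <stddef.h>

#define DEV_DATA_PORT 0x3F8u   /* hypothetical device data register */
#define BUF_SIZE      256u

static volatile uint8_t rx_buf[BUF_SIZE];
static volatile size_t  rx_head;

/* Read one byte from an x86 I/O port (a privileged I/O instruction). */
static inline uint8_t inb(uint16_t port)
{
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}

/* Called when the device raises its interrupt: the core suspends the
 * code it was running, moves the byte from the device into RAM, then
 * resumes whatever it was doing before. */
void device_isr(void)
{
    uint8_t byte = inb(DEV_DATA_PORT);
    rx_buf[rx_head % BUF_SIZE] = byte;
    rx_head++;
    /* A real handler would also acknowledge the interrupt at the
     * device and at the interrupt controller (e.g. an APIC EOI). */
}
```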

Some devices have the ability to access memory directly (DMA). In such cases, there is hardware (or hardware-like) functionality in the bus itself to synchronise accesses between the various devices and processors (and/or cache handlers) that want to access each section of memory. When the device’s DMA operation is finished, it sends an interrupt to the CPU so that it will take action: for input, process the newly arrived data and perhaps request another read; for output, prepare the next block of data to be written and perhaps request another write.
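A minimal sketch of that DMA round-trip, assuming a hypothetical memory-mapped device with made-up `DMA_ADDR`/`DMA_LEN`/`DMA_GO` registers (real devices each define their own programming model): the CPU only programs the transfer and handles the completion interrupt, while the device moves the data into RAM by itself.

```c
/* Hedged sketch of the DMA round-trip, for a hypothetical device with
 * invented DMA_ADDR / DMA_LEN / DMA_GO registers mapped at dev_regs.
 * Buffer mapping and interrupt registration are assumed to be done
 * elsewhere by the OS / driver framework. */
#include <stdint.h>
#include <stddef.h>

#define DMA_ADDR 0x00  /* bus address the device should write to */
#define DMA_LEN  0x08  /* number of bytes to transfer            */
#define DMA_GO   0x10  /* write 1 to start the transfer          */

static volatile uint8_t *dev_regs;      /* mapped device registers (BAR)  */
static uint64_t          dma_bus_addr;  /* bus address of the DMA buffer  */
static uint8_t          *dma_buf;       /* CPU's view of the same buffer  */
static size_t            dma_buf_len;

void process_block(const uint8_t *buf, size_t len);  /* hypothetical consumer */

static void mmio_write64(size_t off, uint64_t val)
{
    *(volatile uint64_t *)(dev_regs + off) = val;
}

/* Program the device to DMA the next block directly into RAM.
 * The CPU is free to run other code while the transfer proceeds. */
static void start_dma_read(void)
{
    mmio_write64(DMA_ADDR, dma_bus_addr);
    mmio_write64(DMA_LEN,  dma_buf_len);
    mmio_write64(DMA_GO,   1);
}

/* Completion interrupt: the data is already sitting in memory, so the
 * CPU only processes it and queues the next transfer. */
void dma_done_isr(void)
{
    process_block(dma_buf, dma_buf_len);
    start_dma_read();
}
```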

It’s possible for a core to stop processing anything until a new interrupt is received; there are halt instructions (such as HLT on Intel x86) for this purpose. The core doesn’t busy-wait in that state; it sits halted, in a low-power state, until the next interrupt wakes it. In a modern multitasking operating system, the OS will normally find something else for that core to do before letting it sit idle waiting for something to wake it up. In practice, each thread of each process is suspended a good fraction of the time waiting for some I/O operation to complete, and that’s when other threads get to run (which keeps the cores busy).
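A sketch of that idle behaviour in kernel-style C, with `pick_next_thread()` and `run()` standing in for whatever the real scheduler provides:

```c
/* Hedged sketch of an OS idle loop on x86.  pick_next_thread() and
 * run() are placeholders for the real scheduler's interfaces. */
struct thread;                           /* opaque; defined by the OS   */
struct thread *pick_next_thread(void);   /* hypothetical scheduler call */
void run(struct thread *t);              /* hypothetical: switch to t   */

static inline void cpu_halt(void)
{
    /* STI;HLT re-enables interrupts and halts this core in a low-power
     * state until the next interrupt arrives; it is not a busy-wait. */
    __asm__ volatile ("sti; hlt" ::: "memory");
}

void idle_loop(void)
{
    for (;;) {
        struct thread *t = pick_next_thread();
        if (t)
            run(t);      /* there is runnable work: do that instead */
        else
            cpu_halt();  /* nothing to do: sleep until an interrupt */
    }
}
```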

PCIe is not like the old interrupt-driven, bus-mastering DMA scheme on a shared parallel bus. It is a serial, packet-switched, point-to-point architecture: PCIe connects each device with a dedicated, bi-directional link to a PCIe switch (or to the root complex). As a result, PCIe supports full-duplex DMA transfers by multiple devices at the same time.
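On Linux you can inspect those per-device links from user space. The sketch below just walks `/sys/bus/pci/devices` and prints the standard `current_link_speed` and `current_link_width` sysfs attributes where they exist (legacy PCI functions won’t expose them):

```c
/* Hedged sketch, Linux-specific: list PCI(e) devices and the negotiated
 * link speed/width of each one, read from standard sysfs attributes.
 * Devices without a PCIe link simply won't have these files, so they
 * are skipped silently. */
#include <stdio.h>
#include <string.h>
#include <dirent.h>

static void print_attr(const char *bdf, const char *attr)
{
    char path[512], value[128];
    snprintf(path, sizeof path, "/sys/bus/pci/devices/%s/%s", bdf, attr);
    FILE *f = fopen(path, "r");
    if (!f)
        return;                          /* attribute not present */
    if (fgets(value, sizeof value, f)) {
        value[strcspn(value, "\n")] = '\0';
        printf("  %-20s %s\n", attr, value);
    }
    fclose(f);
}

int main(void)
{
    DIR *d = opendir("/sys/bus/pci/devices");
    if (!d) {
        perror("opendir");
        return 1;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;                    /* skip "." and ".." */
        printf("%s\n", e->d_name);       /* domain:bus:device.function */
        print_attr(e->d_name, "current_link_speed");
        print_attr(e->d_name, "current_link_width");
    }
    closedir(d);
    return 0;
}
```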

Memory is not on the PCIe bus, but on a separate, dedicated memory bus on the motherboard. The CPU itself has several levels of on-chip instruction and/or data cache. The goal is to satisfy as many fetches as possible from those caches, and only go out on the memory bus (which is separate from PCIe) when necessary.
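A crude way to watch the cache hierarchy doing that job: the sketch below performs the same number of element reads over a small working set (which fits in on-chip cache) and over a large one (which keeps going out on the memory bus). The array sizes are arbitrary and exact numbers vary a lot between machines, but the gap is usually obvious.

```c
/* Hedged sketch: same number of element reads, two working-set sizes.
 * The small set fits in on-chip cache after the first pass; the large
 * one has to keep streaming over the memory bus.  Sizes are arbitrary;
 * compile with optimization (e.g. gcc -O2). */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SMALL_N (16 * 1024)          /* 64 KiB of ints: fits in cache   */
#define LARGE_N (64 * 1024 * 1024)   /* 256 MiB of ints: far exceeds it */

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

static long long sum_array(const int *a, size_t n, int passes)
{
    long long sum = 0;
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            sum += a[i];
    return sum;
}

int main(void)
{
    int *small = malloc(SMALL_N * sizeof *small);
    int *large = malloc((size_t)LARGE_N * sizeof *large);
    if (!small || !large)
        return 1;
    for (size_t i = 0; i < SMALL_N; i++) small[i] = 1;
    for (size_t i = 0; i < LARGE_N; i++) large[i] = 1;

    double t0 = now_sec();
    long long s1 = sum_array(small, SMALL_N, (LARGE_N / SMALL_N) * 4);
    double t1 = now_sec();
    long long s2 = sum_array(large, LARGE_N, 4);
    double t2 = now_sec();

    printf("small working set: %.3f s (sum %lld)\n", t1 - t0, s1);
    printf("large working set: %.3f s (sum %lld)\n", t2 - t1, s2);
    free(small);
    free(large);
    return 0;
}
```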

There were several evolutionary steps from the old system of memory on a generalized bus to the latest scheme of a dedicated memory bus with the memory controller integrated into the CPU.

This was an early interim step: Front-side bus - Wikipedia

The post below covers the more recent architectural developments.

Memory bandwidth is part of the “von Neumann bottleneck”, so it’s advantageous to streamline this path and use instruction/data caching: von Neumann architecture - Wikipedia

Even though single-chip CPUs in the mid-to-late-1990s equaled the computational performance of a late-1970s supercomputer, they did not have the sustained memory bandwidth to maintain good performance on large arrays.

This gradually changed with improving memory architecture, so today high-end Xeon configurations may have 6-7 terabytes/sec of memory bandwidth. By contrast, in 1993 the prototype Cray-3, built from gallium arsenide logic, “only” achieved 128 GB/sec of memory bandwidth: Cray-3 - Wikipedia
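If you want a rough, single-core feel for sustained memory bandwidth on your own machine, a STREAM-style “triad” loop like the sketch below is the usual starting point; published figures come from the full multi-threaded STREAM benchmark run across all memory channels, so a toy single-threaded run will land well below a platform’s aggregate number. Sizes here are arbitrary; compile with optimization (e.g. gcc -O2).

```c
/* Hedged sketch of a STREAM-style triad on a single core.  The array
 * size is arbitrary (just "much bigger than the caches"), and the
 * byte count ignores write-allocate traffic, as STREAM does. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N      (32 * 1024 * 1024)   /* 32 Mi doubles = 256 MiB per array */
#define SCALAR 3.0

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c)
        return 1;
    for (size_t i = 0; i < N; i++) {
        b[i] = 1.0;
        c[i] = 2.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < N; i++)              /* triad: a = b + s*c */
        a[i] = b[i] + SCALAR * c[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double bytes = 3.0 * N * sizeof(double);    /* read b, read c, write a */
    printf("triad: %.2f GB/s (a[0] = %.1f)\n", bytes / secs / 1e9, a[0]);

    free(a); free(b); free(c);
    return 0;
}
```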