No it isn't; it's ~1.5x the speed of DDR4: 4800 MT/s vs. 3200 MT/s.
The 8400 is a hypothetical module that they _plan_ to make, not one they've actually shipped. And the first generation of CPUs with DDR5 support is unlikely to immediately support the maximum speed the DDR5 spec aims for. Just like CPUs only very recently officially supported DDR4-3200, despite it having been on the market for years and years (the 9900K only officially supports up to DDR4-2666, even).
> AMD could add extra memory channels, a 50% increase is reasonable.
Say what now? A 50% increase is reasonable? You're expecting 12-channel memory? The eight channels in Epyc Rome are already the most of any CPU on the market. I don't see any chance at all that this jumps to 12 in a single generation.
That's not going to happen. Extra memory channels are very expensive die-wise. Nvidia and AMD achieve these rates with HBM, which has very wide buses (4096 bits) and short traces thanks to stacking. I can't see any way CPU memory will compete until it moves to HBM. Keep in mind GDDR6 is available in GPUs now, and is faster than DDR5, but much slower than HBM.
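The bus-width point can be made concrete with back-of-envelope math: peak bandwidth is roughly bus width (in bytes) times per-pin data rate. The figures below are nominal spec numbers for illustration, not measured throughput:

```python
# Rough peak-bandwidth arithmetic: (bus width in bytes) x (per-pin data rate).
# Nominal spec figures only; real sustained throughput is lower.

def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak bandwidth in GB/s for a given bus width and data rate."""
    return bus_width_bits / 8 * data_rate_gtps

# One DDR4-3200 channel: 64-bit bus at 3.2 GT/s
ddr4_channel = peak_bandwidth_gbs(64, 3.2)    # 25.6 GB/s

# Eight channels, as in Epyc Rome
epyc_rome_peak = 8 * ddr4_channel             # 204.8 GB/s

# Four HBM2 stacks: 4096-bit bus at ~2 GT/s (Radeon VII class)
hbm2_four_stacks = peak_bandwidth_gbs(4096, 2.0)  # 1024 GB/s

print(ddr4_channel, epyc_rome_peak, hbm2_four_stacks)
```

The ~5x gap between an 8-channel DDR4 setup and a 4-stack HBM2 package comes almost entirely from the 8x wider bus.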
So can Intel, but they don't. HBM would likely require them to sell a fixed memory capacity, which can be severely limiting for server applications. Not to mention it's extremely power-hungry compared to DDR, so you won't get anywhere near the capacities DDR offers without power consumption going way up.
EPYC is already a modular architecture; literally nothing stops AMD from replacing a couple of "compute" dies with HBM2 stacks. They could release CPUs that don't require DIMM sockets at all. E.g., instead of 2 sockets plus a bunch of DIMM slots, the same motherboard space could hold 4 sockets with embedded memory.
They could, but then you're cutting your FLOPS down to get your memory bandwidth up. And HBM2 doesn't get you much capacity: the 7nm Instinct MI50 has 4 stacks of HBM2 to reach 32GB. So, other than as a joke toy, what would you do with a 32-core/64-thread CPU with 32GB of RAM? That's what you'd end up with if you swapped out 4 compute dies for 4 HBM2 stacks.
Assume that in 1-2 years HBM capacity doubles, and it's a quad-socket motherboard. You'd have 64GB per socket, or 256GB total.
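The capacity math above, spelled out (the doubling is this comment's assumption, not an announced product):

```python
# Capacity arithmetic from the thread. Today's HBM2 gives 8 GB per stack
# (4 stacks = 32 GB on the MI50); assume density doubles in 1-2 years.
stacks_per_socket = 4
gb_per_stack_today = 8
gb_per_stack_future = gb_per_stack_today * 2   # assumed doubling

per_socket_gb = stacks_per_socket * gb_per_stack_future  # 64 GB per socket
quad_socket_gb = 4 * per_socket_gb                       # 256 GB total

print(per_socket_gb, quad_socket_gb)
```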
Remind me how much memory an NVIDIA accelerator has?
To play Devil's advocate, putting HBM2 in the package doesn't magically solve everything. The intra-socket bandwidth could be enormous, but the inter-socket bandwidth would still be whatever it is now, and would be difficult to increase.
Epyc doesn't do quad sockets. Is this just another hypothetical "what if" at this point with no basis in reality?
Because sure, a hypothetical non-existent Epyc re-designed to compete in the double precision floating point space favoring memory bandwidth above all else could be really cool. Then again, so could anything else custom designed exclusively for that use case.
> but the inter-socket bandwidth would still be whatever it is now, and would be difficult to increase.
Currently, 64 PCI-E 4.0 lanes form the CPU-to-CPU interconnect.
Since we're making up stuff, why not assume that's doubled next generation, along with being PCI-E 5.0? So that'd be 500GB/s, give or take.
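The "500GB/s give or take" figure follows from rough per-lane throughput (~2 GB/s for PCIe 4.0 x1, ~4 GB/s for PCIe 5.0 x1, ignoring encoding and protocol overhead); the doubled lane count is this comment's hypothetical:

```python
# Rough interconnect math. Per-lane rates are approximate usable throughput,
# not raw signaling rates; the 128-lane figure is a made-up assumption.
lanes_today = 64
gbs_per_lane_gen4 = 2.0                       # ~GB/s, PCIe 4.0 x1
today = lanes_today * gbs_per_lane_gen4       # ~128 GB/s

lanes_future = 128                            # hypothetical doubling
gbs_per_lane_gen5 = 4.0                       # ~GB/s, PCIe 5.0 x1
future = lanes_future * gbs_per_lane_gen5     # ~512 GB/s, i.e. "500 give or take"

print(today, future)
```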
That would be great, but to date HBM has always come fixed on the board or package. I'm all for selling motherboards with the RAM already on them if it means higher bandwidth, but it's just never happened before.
1) DDR5 is about 2.5x the speed of DDR4: https://www.anandtech.com/show/15699/sk-hynix-ddr5-8400

2) Dual socket roughly doubles the bandwidth. Measurements are showing something like 300GB/s in practice: https://www.anandtech.com/show/14694/amd-rome-epyc-2nd-gen/6

3) AMD could add extra memory channels, a 50% increase is reasonable.
300GB/s x 1.5 for more channels x 2.5 for DDR5 ≈ 1.1 TB/s.
Not too shabby! As you said, it would likely be eclipsed by the next-gen NVIDIA accelerator, but... damn, over a terabyte per second for general-purpose compute is just nuts.
In principle, AMD could go even higher if they really tried to optimise the platform for this one metric, but server CPUs tend to be "balanced", so I doubt this will happen. One can dream...