The technology allows more cache memory to be allocated to each CPU core. This is particularly useful in certain applications: AMD mentions computational fluid dynamics (CFD), finite element analysis (FEA), electronic design automation (EDA) and structural analysis as the use cases likely to benefit most.
A leaked spec sheet mentions the 100-000000892-04, a 96-core/192-thread processor likely to be called the EPYC 9684X, which sits in an SP5 socket and can work in a dual-CPU configuration.
The total amount of cache available (L1, L2 and L3) stands at more than 1.25GB; that’s more than 10 times what Intel’s top-of-the-range Xeon, the Platinum 8490H, can offer and would be a new world record. The non-3D members of the Genoa family, like the EPYC 9654, have around a third of the L3 cache.
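For readers who want to see where a figure just above 1.25GB could come from, here is a rough back-of-the-envelope sketch. The per-core L1 and L2 sizes and the L3 total are assumptions chosen for illustration (they are not taken from the leaked spec sheet), and the conversion counts 1GB as 1,000MB.

```python
# Rough cache arithmetic for a 96-core part.
# All per-core and L3 figures below are assumptions for illustration,
# not values taken from the leaked spec sheet.
CORES = 96
L1_KB_PER_CORE = 64   # assumed: 32KB instruction + 32KB data per core
L2_MB_PER_CORE = 1    # assumed: 1MB of L2 per core
L3_MB_TOTAL = 1152    # assumed: three times the L3 of a 384MB non-3D part


def total_cache_mb(cores, l1_kb_per_core, l2_mb_per_core, l3_mb_total):
    """Return the combined L1 + L2 + L3 capacity in MB."""
    l1_mb = cores * l1_kb_per_core / 1024
    l2_mb = cores * l2_mb_per_core
    return l1_mb + l2_mb + l3_mb_total


total_mb = total_cache_mb(CORES, L1_KB_PER_CORE, L2_MB_PER_CORE, L3_MB_TOTAL)
print(f"Total cache: {total_mb:.0f}MB (~{total_mb / 1000:.2f}GB)")
```

Under these assumptions, L3 makes up almost all of the total, which is why tripling it matters far more than anything happening at L1 or L2.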
Three other SKUs are expected, with TDP likely to be 400W (up from 280W for the current “Milan-X” processors and up from 360W for the non-3D cache EPYC 9654). Turbo speeds match those of the latter and are 200MHz higher than on the 64-core EPYC 7773X.
L4 in the future?
The role of cache cannot be overstated in modern computing: of the many levers available to processor designers to improve performance, it is one of the most important, and it is not surprising that AMD may consider rolling out L4 cache in the future. In general, more cores mean slower speeds and the need for fast memory (aka cache) closer to these cores to keep them working (rather than idling). Extra cache, however, also brings added latency and other compromises on the CPU floor plan.
With 96 cores now and a planned 128-core version landing in the near future, L4 cache is likely to come sooner rather than later. A 2020 patent called “Steering Tag support in virtualized environments” points to AMD’s willingness to explore exotic solutions like multi-gigabyte L4 cache.
Of course, none of this is new. Others have done it before AMD: IBM’s Z-series, for example, has 128MB of L4 cache and, as reported by AnandTech back in September 2021, Big Blue was working on a new chip (Telum, now present in the z16) where each private L2 cache could house its own virtual L4 memory (note the use of the word virtual).