Arm announced today the first two chips for its Neoverse “cloud-to-edge” platform, the Neoverse N1 and Neoverse E1. The N1 is a high-performance processor meant for the data center. The Neoverse N1 platform, the first built on the 7nm “Ares” core, scales up to 128 cores and delivers a 2.5x performance improvement on key cloud workloads, according to Arm. The company’s Neoverse E1 platform, also announced today, debuts as a high-efficiency throughput platform, promising a 2.7x improvement in throughput performance.
Arm has just announced two new processors for compute workloads: the power-efficient Neoverse E1 platform, targeting edge devices like 5G base stations, and the more powerful Neoverse N1 platform, designed for the cloud and aimed at challenging Intel Xeon processors.
Arm Neoverse E1
Key specifications and features of Arm Neoverse E1:
- Simultaneous Multithreading (SMT) supporting two threads concurrently
- Up to 8 cores (16 threads) per cluster
- Superscalar, out-of-order pipeline
- Configurable private L2 cache
- Configurable L3 cache
- Low-latency Accelerator Coherency Port (ACP) for closely coupled accelerator integration
- Support for cache stashing into L2/L3 cache
Arm Neoverse E1 is the first Arm processor to support SMT and is best suited for data plane compute workloads such as 4G/5G transport, software-defined networking, software-defined storage, and SD-WAN.
The platform features a scalable architecture suitable for anything from 10Gb wireless/wireline devices to high-performance 100G+ Dataplane Processing Units (DPU).
Arm developed a 5G small cell transport software prototype, which simulates packet processing workloads at a 5G base station, in order to evaluate the platform, and found that Neoverse E1 boosts throughput performance by 2.7 times over Cortex-A53 and throughput efficiency by 2.4 times.
The prototype ran on the Neoverse E1 Edge Reference Design, which includes sixteen Neoverse E1 cores arranged in two clusters of eight cores, connected through the high-performance CMN-600 mesh interconnect, MMU-600 system MMU, and 2-channel DDR4-3200, and consumes less than 4W at 2.3GHz.
A lot of code dealing with packet processing is in fact very simple and depends more on frequency than on the ability to parallelize instructions.
When you parse a packet, you spend time incrementing offsets, comparing values and jumping. An A53/A55 will not be far behind an A72 on this task, and accesses quickly become memory-bound. So it makes sense to use a larger number of dumber cores, with SMT to improve the execution units' usage while waiting for memory. I really think that this approach makes a lot of sense.
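To make the "incrementing offsets, comparing values and jumping" point concrete, here is a minimal sketch of what such parsing code tends to look like. It is not taken from Arm's prototype; the function name `parse_udp_ports`, the choice of an Ethernet/IPv4/UDP frame, and the sample port numbers are all assumptions made for illustration. Python is used only for readability, since real data plane code would normally be written in C.

```python
import struct

# Standard protocol constants (Ethernet ethertypes and the IPv4 protocol number for UDP).
ETHERTYPE_VLAN = 0x8100
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17


def parse_udp_ports(frame: bytes):
    """Return (src_port, dst_port) for an IPv4/UDP Ethernet frame, else None.

    Hypothetical helper: every step is an offset bump, a compare, or a branch.
    """
    offset = 12                          # skip destination and source MAC addresses
    if len(frame) < offset + 2:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETHERTYPE_VLAN:      # single 802.1Q tag: skip TCI, read inner type
        if len(frame) < offset + 4:
            return None
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if ethertype != ETHERTYPE_IPV4:
        return None
    if len(frame) < offset + 20:         # minimal IPv4 header is 20 bytes
        return None
    version_ihl = frame[offset]
    if version_ihl >> 4 != 4:
        return None
    header_len = (version_ihl & 0x0F) * 4    # IHL field counts 32-bit words
    if header_len < 20:
        return None
    if frame[offset + 9] != IPPROTO_UDP:     # protocol field sits at byte 9
        return None
    offset += header_len
    if len(frame) < offset + 8:          # UDP header is 8 bytes
        return None
    src_port, dst_port = struct.unpack_from("!HH", frame, offset)
    return src_port, dst_port


# Hypothetical sample frame: zeroed MACs, IPv4 ethertype, minimal IPv4 header, UDP header.
ipv4_header = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, IPPROTO_UDP]) + bytes(10)
udp_header = struct.pack("!HHHH", 5000, 2152, 8, 0)
sample = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + ipv4_header + udp_header
print(parse_udp_ports(sample))  # (5000, 2152)
```

There is almost no arithmetic here to run in parallel: each branch depends on a value just loaded from the packet, so a wide out-of-order core has little to exploit, while a second hardware thread can use the core whenever one thread stalls on a load.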