12 CPU Chiplets On 18A Node, 288 Darkmont Cores, 17% IPC Increase, 2x L2 Cache Bandwidth, DDR5-8000 Support

Intel has just unveiled new information on its next-gen Clearwater Forest “E-Core” Xeon CPUs with up to 288 cores, based on the 18A process node.

Intel’s First 18A Xeon CPU, Clearwater Forest, With 288 E-Cores Unveiled, Full-On Upgrade With 12 Compute Chiplets

Intel’s next-gen E-Core-only Xeon family, codenamed Clearwater Forest, is making its way to servers soon. Just as the Xeon 6 lineup was segmented into P-Core and E-Core flavors (Granite Rapids and Sierra Forest, respectively), the next-gen Xeon family will come in P-Core-only “Diamond Rapids” and E-Core-only “Clearwater Forest” lineups. The P-Core family is optimized for performance and targets compute-intensive and AI workloads, while the E-Core family is optimized for efficiency and targets high-density, scale-out workloads.

In its Hot Chips 2025 presentation, Intel outlined that Clearwater Forest Xeon CPUs will be fabricated on the company’s latest and greatest 18A process node, which is also being used by Panther Lake on the client side, arriving later this year. Some of the main highlights of the new Xeon E-Core CPU include:

  • Intel’s latest process node, 18A: Improved performance and power efficiency
  • Intel’s latest Efficiency Core Architecture: IPC uplift tuned for 18A process
  • Intel Foveros Direct 3D Construction: Shorter, power-efficient routes, larger LLC
  • Increased Memory Bandwidth: 12-Channel DDR5-8000

Starting with the process technology, Clearwater Forest is based on the aforementioned 18A node, which combines backside metal with gate-all-around transistors to provide benefits beyond FET Z scaling alone. 18A brings lower gate capacitance, which improves core logic power efficiency; higher cell density, with cell utilization rates above 90%; improved signal routing, which reduces RC delay and further improves efficiency; and low-loss power delivery, with losses reduced by 4-5%.

Coming to the architecture, Intel is leveraging its Darkmont E-Core design for Clearwater Forest, which is an update to Sierra Glen E-Cores used by Sierra Forest. These cores offer:

  • Smarter Front-End
  • Deeper out of Order Engine
  • Bigger Scalar & Vector Execution
  • Enhanced Memory Subsystem

The front end features a 64 KB instruction cache, three 3-wide instruction decoders for nine decodes per cycle (50% more instruction bandwidth), and a much more accurate branch predictor, possibly using deeper branch history and larger structure sizes.

The OOE (Out-of-Order Engine) sees an upgrade too, with 8-wide allocation (a 60% increase) and 16-wide retire (a 2x increase) for greater execution parallelism. The out-of-order window grows by 60% to 416 entries, while the 26 execution ports represent a 50% increase over the prior generation.
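
As a rough sanity check, the uplifts quoted for the front end and out-of-order engine can be inverted to see what prior-generation figures they imply. This is only a back-of-the-envelope sketch: the baselines are derived purely from the numbers quoted in this article, not taken from Intel’s slides, and the helper name `implied_baseline` is hypothetical.

```python
# Back-derive the prior-generation figures implied by the quoted uplifts.
# Inputs come from the article; the baselines are inferred, not official.

def implied_baseline(new_value: float, uplift: float) -> float:
    """Return the old value given the new value and a fractional uplift (0.6 = +60%)."""
    return new_value / (1.0 + uplift)

# (new value, fractional uplift) as quoted in the text
quoted = {
    "decodes per cycle": (9, 0.50),
    "allocation width": (8, 0.60),
    "retire width": (16, 1.00),
    "out-of-order window entries": (416, 0.60),
    "execution ports": (26, 0.50),
}

baselines = {
    name: implied_baseline(new, uplift) for name, (new, uplift) in quoted.items()
}

for name, old in baselines.items():
    new, uplift = quoted[name]
    print(f"{name}: {new} now, ~{old:.1f} before (+{uplift:.0%})")
```

Some of the implied baselines do not come out as whole numbers (execution ports, for example), which suggests the quoted percentages are rounded.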

The Execution Engine’s 26 execution ports address a range of workloads, while dedicated hardware offers improved efficiency. The integer and vector execution units are doubled, load address generation sees a 1.5x increase, and store address generation sees a 2x uplift.

The core memory subsystem moves to three loads per cycle (a 50% increase), while store throughput remains at two per cycle. Issuing loads earlier could help reduce latency. Deep buffering supports up to 128 outstanding L2 misses (a 2x increase). Clearwater Forest also has advanced prefetchers, and the list of Xeon E-Core-specific features includes:

  • L1 Data Cache ECC
  • Data Poisoning Support
  • Recoverable Machine Check
  • Local Machine Check
  • 52 physical address bits
  • Core Lockstep

Intel is also leveraging a new modular architecture with Clearwater Forest “E-Core” Xeon CPUs. Each four-core cluster shares 4 MB of unified L2 cache with a 17-cycle latency, for up to 288 MB of L2 across the chip. The L2 cache also offers much higher bandwidth, with up to a 2x increase to 400 GB/s. The IPC increase is rated at 17%, as measured in SPECintRate 2017. Each core has 200 GB/s of bandwidth to the shared L2 cache, while a 35 GB/s fabric interconnect links the clusters together.
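
The L2 total follows directly from the cluster layout. A minimal sketch of that arithmetic, using only the figures quoted above (the variable names are illustrative):

```python
# Cluster count and L2 totals from the figures quoted in the article.
CORES = 288              # maximum core count
CORES_PER_CLUSTER = 4    # cores sharing one L2
L2_PER_CLUSTER_MB = 4    # unified L2 per cluster

clusters = CORES // CORES_PER_CLUSTER
total_l2_mb = clusters * L2_PER_CLUSTER_MB
l2_per_core_mb = L2_PER_CLUSTER_MB / CORES_PER_CLUSTER

print(f"{clusters} clusters, {total_l2_mb} MB L2 total, {l2_per_core_mb} MB L2 per core")
```

With 72 clusters of 4 MB each, the total matches the 288 MB quoted, which works out to 1 MB of L2 per core on average.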

Intel went all 3D when building Clearwater Forest, with a total of 12 CPU chiplets fabricated on the 18A process node. These sit on three base tiles, which include the fabric, LLC, memory controllers, and I/O, and are built on the Intel 3 process node. The interposer houses two I/O chiplets based on Intel 7, which feature high-speed I/O, fabric, and accelerators. Communication between tiles is handled by Intel’s EMIB interconnect solution.

So in total:

  • 12 E-Core CPU Chiplets (Intel 18A)
  • 3 Base Tiles (Intel 3)
  • 2 IO Chiplets (Intel 7)
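
The tile breakdown above can be written down as a small data structure to see how the cores might be distributed. This is a sketch under one loudly stated assumption: that the 288 cores and the 12 CPU chiplets are split evenly across chiplets and base tiles, which the article does not state explicitly. The `TileGroup` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileGroup:
    name: str
    count: int
    process: str


# Tile counts and process nodes as listed in the article.
PACKAGE = [
    TileGroup("E-Core CPU chiplet", 12, "Intel 18A"),
    TileGroup("Base tile", 3, "Intel 3"),
    TileGroup("I/O chiplet", 2, "Intel 7"),
]

MAX_CORES = 288
CORES_PER_CLUSTER = 4

cpu, base, io = PACKAGE

# Assumption: cores and chiplets are distributed evenly.
cores_per_chiplet = MAX_CORES // cpu.count
clusters_per_chiplet = cores_per_chiplet // CORES_PER_CLUSTER
chiplets_per_base_tile = cpu.count // base.count
total_tiles = sum(group.count for group in PACKAGE)

print(f"{cores_per_chiplet} cores ({clusters_per_chiplet} clusters) per CPU chiplet")
print(f"{chiplets_per_base_tile} CPU chiplets per base tile, {total_tiles} tiles in total")
```

Under that even-split assumption, each CPU chiplet would carry 24 cores in six four-core clusters, with four chiplets stacked on each base tile.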

Clearwater Forest also uses a monolithic mesh coherent fabric, which uses shorter routes, more metal resources, and a high-density interconnect for improved power efficiency.

Finally, Intel shared some platform figures for a 2S Clearwater Forest E-Core Xeon solution. The CPUs support 12-channel DDR5-8000 memory with up to 3 TB of capacity in a dual-socket server and up to 1300 GB/s of memory bandwidth. For comparison, Intel’s Sierra Forest supports up to DDR5-6400 across 12 channels. The platform supports 2 x 96 PCIe Gen5 lanes, 64 CXL lanes, and 144 UPI lanes (576 GB/s). With 576 cores and 1152 MB of LLC, the 2S solution reaches up to 59 TF/s and 5000 GB/s of raw bandwidth.
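
For context, the quoted memory figure can be compared against the theoretical peak of the DRAM interface. The sketch below assumes a 64-bit (8-byte) data path per DDR5 channel and decimal gigabytes; the per-core LLC and per-lane UPI numbers are simple divisions of the quoted totals, not figures Intel stated.

```python
# Theoretical peak DDR5 bandwidth versus the quoted 2S figure.
CHANNELS_PER_SOCKET = 12
TRANSFER_RATE_MT_S = 8000   # DDR5-8000
BYTES_PER_TRANSFER = 8      # assumption: 64-bit data path per channel
SOCKETS = 2
QUOTED_2S_GB_S = 1300       # figure quoted in the article

per_socket_gb_s = CHANNELS_PER_SOCKET * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000
peak_2s_gb_s = per_socket_gb_s * SOCKETS
quoted_fraction = QUOTED_2S_GB_S / peak_2s_gb_s

# Derived ratios from the quoted 2S totals (not official per-unit figures).
llc_per_core_mb = 1152 / 576      # LLC MB per core
upi_gb_s_per_lane = 576 / 144     # UPI GB/s per lane

print(f"Peak per socket: {per_socket_gb_s:.0f} GB/s, 2S peak: {peak_2s_gb_s:.0f} GB/s")
print(f"Quoted 1300 GB/s is {quoted_fraction:.0%} of the 2S theoretical peak")
print(f"LLC per core: {llc_per_core_mb:.0f} MB, UPI per lane: {upi_gb_s_per_lane:.0f} GB/s")
```

The quoted 1300 GB/s sits below the 1536 GB/s theoretical peak, which would be consistent with it being an achievable rather than a raw interface figure, though the article does not say how it was measured.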

Intel’s Clearwater Forest Xeon family is expected in the coming quarters, so stay tuned for a bigger launch soon.

Intel Xeon CPU Families (Preliminary):

| Family Branding | Diamond Rapids | Clearwater Forest | Granite Rapids | Sierra Forest | Emerald Rapids | Sapphire Rapids | Ice Lake-SP | Cooper Lake-SP | Cascade Lake-SP/AP | Skylake-SP |
|---|---|---|---|---|---|---|---|---|---|---|
| Process Node | TBD | Intel 18A | Intel 3 | Intel 3 | Intel 7 | Intel 7 | 10nm+ | 14nm++ | 14nm++ | 14nm+ |
| Platform Name | Intel Oak Stream | Intel Birch Stream | Intel Birch Stream | Intel Mountain Stream / Intel Birch Stream | Intel Eagle Stream | Intel Eagle Stream | Intel Whitley | Intel Cedar Island | Intel Purley | Intel Purley |
| Core Architecture | Panther Cove-X | Darkmont | Redwood Cove | Sierra Glen | Raptor Cove | Golden Cove | Sunny Cove | Cascade Lake | Cascade Lake | Skylake |
| MCP (Multi-Chip Package) SKUs | Yes | TBD | Yes | Yes | Yes | Yes | No | No | Yes | No |
| Socket | LGA XXXX / 9324 | LGA 4710 / 7529 | LGA 4710 / 7529 | LGA 4710 / 7529 | LGA 4677 | LGA 4677 | LGA 4189 | LGA 4189 | LGA 3647 | LGA 3647 |
| Max Core Count | TBD | Up To 288 | Up To 128 | Up To 288 | Up To 64? | Up To 56 | Up To 40 | Up To 28 | Up To 28 | Up To 28 |
| Max Thread Count | TBD | Up To 288 | Up To 256 | Up To 288 | Up To 128 | Up To 112 | Up To 80 | Up To 56 | Up To 56 | Up To 56 |
| Max L3 Cache | TBD | TBD | 480 MB L3 | 108 MB L3 | 320 MB L3 | 105 MB L3 | 60 MB L3 | 38.5 MB L3 | 38.5 MB L3 | 38.5 MB L3 |
| Memory Support | Up To 16-Channel DDR5? | TBD | Up To 12-Channel DDR5-6400 / MCR-8800 | Up To 12-Channel DDR5-6400 | Up To 8-Channel DDR5-5600 | Up To 8-Channel DDR5-4800 | Up To 8-Channel DDR4-3200 | Up To 6-Channel DDR4-3200 | DDR4-2933 6-Channel | DDR4-2666 6-Channel |
| PCIe Gen Support | PCIe 6.0? | TBD | PCIe 5.0 (136 Lanes) | PCIe 5.0 (88 Lanes) | PCIe 5.0 (80 Lanes) | PCIe 5.0 (80 Lanes) | PCIe 4.0 (64 Lanes) | PCIe 3.0 (48 Lanes) | PCIe 3.0 (48 Lanes) | PCIe 3.0 (48 Lanes) |
| TDP Range (PL1) | TBD | TBD | Up To 500W | Up To 350W | Up To 350W | Up To 350W | 105-270W | 150W-250W | 165W-205W | 140W-205W |
| 3D XPoint Optane DIMM | TBD | TBD | Donahue Pass | TBD | Crow Pass | Crow Pass | Barlow Pass | Barlow Pass | Apache Pass | N/A |
| Competition | AMD EPYC Venice | AMD EPYC Zen 5C | AMD EPYC Turin | AMD EPYC Bergamo | AMD EPYC Genoa ~5nm | AMD EPYC Genoa ~5nm | AMD EPYC Milan 7nm+ | AMD EPYC Rome 7nm | AMD EPYC Rome 7nm | AMD EPYC Naples 14nm |
| Launch | 2025-2026 | 2025 | 2024 | 2024 | 2023 | 2022 | 2021 | 2020 | 2018 | 2017 |

