NVIDIA Demos RTX PRO 6000 Running 4 Instances of Cyberpunk 2077 Using MIG

NVIDIA has once again highlighted the neural rendering & gaming innovations it delivered with Blackwell RTX GPUs such as RTX 5090 & RTX PRO 6000.

NVIDIA’s Hot Chips 2025 Presentation Focuses On Blackwell RTX & Its Various Innovations In The Realm of Neural Rendering & Gaming

NVIDIA’s Blackwell RTX was introduced in January this year, and since then, we have learned a lot about what makes it tick. AI is the fundamental feature of Blackwell, but this line of innovation goes back to 2006 and the introduction of CUDA, the company’s software architecture for accelerated computing and AI.

The company states that compute density has outpaced Moore’s Law scaling, & that this was enabled by sparsity, a new ISA, lower-precision formats, and rigorous work on architectural efficiency. In 2018, NVIDIA brought real-time ray tracing, and just a year later, DLSS was launched.

These were made possible using a combination of new technologies such as RT cores and Tensor cores. Today, Blackwell introduces the most advanced version of these technologies and then some more.

In the data center, NVIDIA introduced FP4 precision, which brought a 4x improvement in dense workloads. NVIDIA says Jensen himself wanted to bring AI back to its home ground, which is graphics, hence the advent of the neural rendering and graphics era with Blackwell RTX. As the company puts it, RTX is its brand for simulation, for content creation, and for gaming, so the goal was to take everything being done in the data center and scale it to consumer RTX GPUs.

So what does Blackwell bring to the table? Technologies such as DLSS 4, MFG, ACE, Path Tracing, and more all lead to faster performance and beautiful visuals. NVIDIA itself quotes a “10x amplification in performance, footprint, design cycle” with Blackwell RTX. DLSS 4 also leverages AI to generate 100% of the pixels rendered after the initial frame, leading to shorter render times and more battery life on mobile platforms.

The main design principles with RTX Blackwell were:

  • Optimize for new Neural Workloads
  • Reduce Memory Footprint
  • Quality of service for neural+graphics
  • Energy efficiency that scales

At the highest level, RTX Blackwell is an engineering marvel with 4000 AI TOPS & high-speed FP4 support thanks to 5th Gen Tensor Cores, up to 360 RT TFLOPs designed for Mega Geometry using 4th Gen RT Cores, an AI Management Processor (AMP) that simultaneously handles AI models and graphics, up to 125 TFLOPS of compute with Neural Shaders within the Blackwell SM, 2x the MaxQ power efficiency, and the world’s fastest memory solution in 30 Gbps GDDR7. The architecture is also infused with display, video, and I/O innovations such as DP2.1 UHBR20, PCIe Gen5, and 4x NVDEC/NVENC with 4:2:2.
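To give a feel for how coarse a 4-bit float is, here is a minimal sketch of rounding values to FP4. The article does not say which FP4 encoding is used, so the E2M1 layout (1 sign, 2 exponent, 1 mantissa bit) and the per-block scaling helper below are illustrative assumptions, not NVIDIA’s implementation.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
# Assumption: E2M1 is used purely for illustration; the article names no encoding.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_fp4(x: float) -> float:
    """Round x to the nearest E2M1 value, saturating at +/-6.

    Ties go to the smaller magnitude, a simplification of real rounding modes.
    """
    sign = -1.0 if x < 0 else 1.0
    magnitude = min(abs(x), FP4_E2M1_MAGNITUDES[-1])
    nearest = min(FP4_E2M1_MAGNITUDES, key=lambda v: abs(v - magnitude))
    return sign * nearest


def quantize_block(values: list[float]) -> list[float]:
    """Quantize a block with one shared scale (hypothetical helper).

    The scale maps the block's largest magnitude onto 6.0, the largest
    E2M1 value, so that the few available levels cover the block's range.
    """
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return [0.0 for _ in values]
    scale = peak / FP4_E2M1_MAGNITUDES[-1]
    return [quantize_fp4(v / scale) * scale for v in values]


if __name__ == "__main__":
    print(quantize_fp4(2.4))
    print(quantize_block([3.0, -12.0, 0.6]))
```

With only eight magnitudes available, most of the accuracy has to come from how values are scaled before rounding, which is why a shared per-block scale is sketched alongside the rounding step.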

NVIDIA then gives a rundown of the RTX Blackwell SM, which is vastly different from the Blackwell SM used for Data Centers. One big change is that it unifies the FP32/INT32 units, versus the split design on the previous-gen Ada SM.

RTX Blackwell also offers SER (Shader Execution Reordering) improvement of up to 2x.

Then we have the 5th Gen Tensor Core, which adds FP4 support and, with it, MFG mode in DLSS 4, enabling the GPU to render four frames using AI acceleration.
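The arithmetic behind multi frame generation can be sketched as follows. The text above only says four frames; this sketch assumes that in the 4x mode each conventionally rendered frame is followed by three AI-generated frames, and it ignores the cost of generating them. The function names are hypothetical.

```python
def displayed_fps(rendered_fps: float, generated_per_rendered: int) -> float:
    """Frames shown per second when every rendered frame is followed by
    `generated_per_rendered` AI-generated frames (generation cost ignored)."""
    if generated_per_rendered < 0:
        raise ValueError("generated_per_rendered must be non-negative")
    return rendered_fps * (1 + generated_per_rendered)


def generated_share(generated_per_rendered: int) -> float:
    """Fraction of displayed frames that were produced by frame generation."""
    if generated_per_rendered < 0:
        raise ValueError("generated_per_rendered must be non-negative")
    return generated_per_rendered / (1 + generated_per_rendered)


if __name__ == "__main__":
    # Hypothetical input: a game rendering 30 frames per second natively.
    print(displayed_fps(30.0, 3))
    print(generated_share(3))
```

In practice the generated frames are not free, so real-world gains are lower than this upper bound.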

With DLSS 4, Frame Gen cuts the time spent rendering each individual frame. On the power management side, RTX Blackwell delivers 10x faster core rail gating and 100x faster DRAM transitions to self-refresh, and mobile platforms see up to a 2x reduction in GPU power, which translates into longer battery life.

With GDDR7, RTX Blackwell offers twice the data rate of GDDR6, at speeds of up to 30 Gbps. On mobile platforms, the same memory standard enables up to 2x efficiency.
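As a rough illustration of what a per-pin data rate means for total bandwidth, the sketch below multiplies the data rate by the bus width and converts bits to bytes. The 512-bit bus width is an assumed example value, not a figure from the presentation, and the 15 Gbps GDDR6 figure is simply half of the 30 Gbps quoted above.

```python
def memory_bandwidth_gb_s(data_rate_gbps_per_pin: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    Each pin of the bus moves `data_rate_gbps_per_pin` gigabits per second,
    so total bits per second is rate * width; dividing by 8 gives bytes.
    """
    if bus_width_bits <= 0:
        raise ValueError("bus_width_bits must be positive")
    return data_rate_gbps_per_pin * bus_width_bits / 8


if __name__ == "__main__":
    ASSUMED_BUS_WIDTH_BITS = 512  # hypothetical bus width, for illustration only
    print(memory_bandwidth_gb_s(30.0, ASSUMED_BUS_WIDTH_BITS))  # GDDR7 at 30 Gbps
    print(memory_bandwidth_gb_s(15.0, ASSUMED_BUS_WIDTH_BITS))  # half that data rate
```

At a fixed bus width, doubling the per-pin data rate doubles the peak bandwidth, which is the whole benefit being claimed here.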

NVIDIA’s AMP unit allows simultaneous processing of AI and graphics workloads. This ensures smoother and evenly paced frames and faster model responses.

Moving from the gaming side of RTX Blackwell to the PRO side, NVIDIA showcases some of its newest features, such as Universal MIG, which is enabled on GPUs such as the RTX PRO 6000. With this, users can partition a single RTX PRO GPU into up to four instances, each with 24 GB of VRAM and its own subset of cores and hardware units, which run in parallel with predictable latency and throughput.
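The partitioning described above can be modeled with a small sketch. The 96 GB total is inferred from four instances of 24 GB each, and the equal split of compute is an assumption; `GpuInstance` and `partition` are hypothetical names for illustration, not part of any NVIDIA tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuInstance:
    index: int
    vram_gb: int
    compute_fraction: float  # share of the GPU's cores/hw units (assumed equal split)


def partition(total_vram_gb: int, instances: int) -> list[GpuInstance]:
    """Split a GPU into equally sized, isolated instances (illustrative model)."""
    if instances < 1:
        raise ValueError("need at least one instance")
    if total_vram_gb % instances != 0:
        raise ValueError("VRAM must divide evenly across instances")
    per_instance_vram = total_vram_gb // instances
    return [
        GpuInstance(index=i, vram_gb=per_instance_vram, compute_fraction=1 / instances)
        for i in range(instances)
    ]


if __name__ == "__main__":
    # 96 GB is inferred from the four 24 GB instances mentioned in the text.
    for instance in partition(total_vram_gb=96, instances=4):
        print(instance)
```

Unlike timeslicing, where workloads take turns on the whole GPU, each instance in this model owns a fixed slice of memory and compute, which is where the predictable latency comes from.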

In a cool demo, NVIDIA showcases RTX PRO 6000’s scaling using MIG. The workload used was four instances of Cyberpunk 2077 running at 1080p using max settings. This is a relatively low graphics workload for a graphics card like the RTX PRO 6000.

A standard timesliced configuration was used as the baseline and compared against the MIG 2x and 4x modes, which deliver up to 60% better scaling. So yeah, if you want to run four instances of Cyberpunk 2077 at the same time, the RTX PRO 6000 Blackwell GPU is going to be a good fit for this task.
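A scaling figure like this is typically the relative gain in aggregate throughput over the baseline. The sketch below shows that calculation; the per-instance frame rates are made-up placeholder values chosen to produce a 60% gain, not measurements from NVIDIA’s demo.

```python
def scaling_gain_percent(baseline_fps: list[float], partitioned_fps: list[float]) -> float:
    """Percent gain in total frames per second across all instances,
    relative to the timesliced baseline."""
    baseline_total = sum(baseline_fps)
    if baseline_total <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (sum(partitioned_fps) - baseline_total) / baseline_total


if __name__ == "__main__":
    # Placeholder numbers for four game instances (not from the demo).
    timesliced = [25.0, 25.0, 25.0, 25.0]
    mig_4x = [40.0, 40.0, 40.0, 40.0]
    print(scaling_gain_percent(timesliced, mig_4x))
```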

Overall, NVIDIA’s RTX Blackwell GPU architecture has been out for a few months now and is being further tuned for consumer and pro applications. Several upcoming games and content creator apps are starting to harness RTX Blackwell’s huge array of AI and Neural enhancements, and we can’t wait to see how devs expand their apps with these features in the coming years.

