Haswell, SLI, and Memory Bandwidth

By Dustin Sklavos, on November 6, 2013

Of the three primary components of the enthusiast PC that can be overclocked (the CPU, the graphics card, and the system memory), the results from overclocking system memory can often be the hardest to quantify. A synthetic test like AIDA64 or SiSoft Sandra will bear out substantial differences in memory bandwidth, latency, and overall speed, but in practice things can be a bit blurrier. Balancing an overclocked enthusiast system is an exercise in isolating and moving bottlenecks, and it’s easy to forget about memory as a bottleneck when many applications don’t show a performance benefit from going to faster RAM. What you must remember is that while many applications will get very little benefit, some will see a real one.

Our friends over at AnandTech posted a very interesting piece about memory scaling on Haswell. One of the bigger takeaways is that on the Intel side, the era of 1600MHz being enough is basically over. It’s gone from being the price-performance sweet spot to being almost a bare minimum, and at least 1866MHz is really more desirable. Another is that memory bandwidth can come into play in a much more profound way with multi-GPU configurations. The good Dr. Ian Cutress did fantastic work, but I felt like there was a blank spot: he went straight from a single Radeon HD 6950 to a Radeon HD 5970 and 5870 GPU trifecta, leaving the most common multi-GPU configuration, a pair of cards, out of the mix. Interested in seeing if his results would indeed trickle down to a dual-GPU system, I went ahead and tested my overclocked i7-4770K with a pair of overclocked GeForce GTX 780s and a 32GB set of Dominator Platinums (4x8GB) able to hit 2400MHz at CAS 10.

The testing proved very illuminating. This was the testing that led to my find with Battlefield 4, but the other games I ran also showed measurable performance improvements when making the jump from 1600MHz CAS 9 to 2400MHz CAS 10, at least at the single-display resolution of 1920x1200.
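To put the jump from 1600MHz CAS 9 to 2400MHz CAS 10 in rough numbers, here is a minimal sketch of the theoretical figures. It assumes a dual-channel configuration with a 64-bit (8-byte) bus per channel and treats the quoted "MHz" ratings as effective transfer rates in MT/s; the function names are purely illustrative, and these are paper peaks, not the measured results from my testing.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (assumes dual channel, 64-bit bus)."""
    return transfer_rate_mts * bus_bytes * channels / 1000


def cas_latency_ns(transfer_rate_mts, cas_cycles):
    """CAS latency in nanoseconds. DDR transfers twice per clock,
    so the actual memory clock is half the effective transfer rate."""
    clock_mhz = transfer_rate_mts / 2
    return cas_cycles / clock_mhz * 1000


if __name__ == "__main__":
    for rate, cas in [(1600, 9), (2400, 10)]:
        bandwidth = peak_bandwidth_gbs(rate)
        latency = cas_latency_ns(rate, cas)
        print(f"{rate} CAS {cas}: {bandwidth:.1f} GB/s peak, {latency:.2f} ns CAS latency")
```

On paper, the faster kit offers about 50% more peak bandwidth, and its CAS latency in nanoseconds is actually lower despite the higher cycle count, because each cycle is shorter.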


Futuremark’s 3DMark Fire Strike and 3DMark Fire Strike Extreme both demonstrated minor but measurable performance increases. As it turns out, these minor improvements are the exception, not the rule.

In practice, average framerates saw fairly small jumps in performance, but the arguably more important minimum framerates received very healthy boosts. BioShock Infinite gained a beefy 30% on its minimum framerate, while GRiD 2 got a more modest 7% bump. The traditionally GPU-limited Tomb Raider still saw a 13.1% increase. Moving to 5760x1200 produced negligible improvements with high-speed memory in these games, as they were uniformly GPU-limited at that point, but remember that they’re also almost a year old each. Battlefield 4 still gets a sizable performance boost even at 5760x1200, as its 64-bit executable taxes system memory much harder than these 32-bit games do. With the Xbox One and PlayStation 4 both sporting 64-bit processors and 8GB of shared system and graphics memory, I would expect situations like Battlefield 4 to become more of the rule and less of the exception in the future. It’s one thing when you only have 2GB of system memory to work with and have to bus information in and out the way 32-bit games do; it’s another entirely when suddenly games are using three or even four times that.

If you have a high-performance graphics subsystem, upgrading your system memory can definitely yield measurable improvements in gaming performance, especially since these improvements are most substantial at the coveted minimum framerates, where the fluidity of the gaming experience is largely defined. Going from a 120fps to 130fps average isn’t really going to be noticeable in practice, but raising your performance floor from 19.9fps to 25.9fps is a big deal. Ultimately, these results echo my sentiments from the Battlefield 4 memory testing on a smaller scale: 1600MHz may simply not be fast enough anymore.
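To see why the jump at the floor matters more than the jump in the average, it helps to convert both to frame times. The sketch below uses only the two framerate pairs quoted above; the helper names are illustrative, and the arithmetic is simple back-of-the-envelope math rather than anything drawn from the benchmark logs.

```python
def percent_gain(before_fps, after_fps):
    """Relative improvement in framerate, as a percentage."""
    return (after_fps - before_fps) / before_fps * 100


def frame_time_ms(fps):
    """Time spent on each frame, in milliseconds."""
    return 1000 / fps


if __name__ == "__main__":
    cases = [("average", 120.0, 130.0), ("minimum", 19.9, 25.9)]
    for label, before, after in cases:
        gain = percent_gain(before, after)
        saved = frame_time_ms(before) - frame_time_ms(after)
        print(f"{label}: {before} -> {after} fps, "
              f"{gain:.1f}% faster, {saved:.2f} ms less per frame")
```

The average case shaves well under a millisecond off each frame, while the minimum case cuts more than eleven milliseconds from the slowest frames, which is the difference you actually feel.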

