To deliver a full-featured article for launch, my look at AMD's Ryzen Threadripper 2990WX and 2950X combined Windows and Linux performance in the same piece. As it turns out, that was a mistake, since few people noticed that we had Linux benchmarks at all, despite the obvious demand for them.
Before publishing, I debated whether to break Linux performance out into its own article, but I chose the combined approach because I felt the bigger picture was necessary. That's because in Windows, performance scaling on such a large CPU is hit-or-miss, while the Linux kernel appears to handle AMD's biggest chip without trouble.
I won't stand here (or sit) and pretend to understand why the 2990WX doesn't perform well in all Windows tests, because it's tough to get a clear answer from anyone. Nobody wants to pass the blame, but in all cases it seems that a large part of the problem is Windows. This article exists not only to raise awareness of that, but also to better highlight what the 2990WX is capable of.
Instead of copying and pasting past Linux results into this new article, I retested both the 2990WX and Intel's Core i9-7980XE with a newer kernel (4.18.5 versus 4.15.0) and added a few more tests. For more thorough results covering six other CPUs, I recommend checking out the 2990WX launch article.
Almost all of the Linux benchmarks I run use the Phoronix Test Suite, which makes it almost too easy to generate many useful results very quickly (depending on the hardware, of course). Unlike the launch article, this one has 13 results in total (up from 8), which cover a few angles of where a 64-thread CPU can shine if it gets the right opportunity.
Before testing, I run this command as sudo to enforce the performance profile:
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
As I mentioned in the 2990WX launch article, this command returned an error on the 2990WX saying the file did not exist. Interestingly (or is it?), that error no longer occurs on the newer kernel, so that's a plus. At the same time, performance hasn't changed much between 4.18.5 and 4.15, so the error clearly didn't mean much.
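For anyone who hits the same "file does not exist" error, the governor command above can be wrapped so that it checks for each core's scaling_governor file before writing to it. This is only a minimal sketch: the function name `set_governor` and the `SYSFS_CPU_ROOT` override are my own assumptions (the override exists purely so the function can be tried against a scratch directory), and it is not what was run for the results in this article.

```shell
#!/usr/bin/env bash
# set_governor [GOVERNOR]
# Writes GOVERNOR (default: performance) to every per-core scaling_governor
# file that exists, and prints how many cores were updated.
# SYSFS_CPU_ROOT is a hypothetical override for testing; by default the
# function targets the real sysfs tree.
set_governor() {
    local governor="${1:-performance}"
    local root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    local updated=0 file

    for file in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # An unmatched glob stays literal, so skip anything that is not there.
        [ -e "$file" ] || continue
        if printf '%s\n' "$governor" > "$file"; then
            updated=$((updated + 1))
        fi
    done

    if [ "$updated" -eq 0 ]; then
        echo "no scaling_governor files could be updated under $root" >&2
        return 1
    fi
    echo "$updated"
}
```

Called as root, `set_governor performance` either prints the number of cores it updated or returns a non-zero status with a single message, rather than an error from tee.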
One of the most interesting uses for high-performance hardware is rendering, and believe me, there is no shortage of renderers that can take advantage of every core you can give them (be it CPU or GPU). The Blender test uses the Cycles renderer on the CPU only, and in that case, AMD's many cores help it overtake Intel's top chip, and easily.
Blender is optimized quite well for these many-core workloads, and seeing this performance makes me even more excited for version 2.80 of the suite, since it will introduce heterogeneous rendering (CPU + GPU) capabilities.
Despite the redundancy, I tested HandBrake in both Linux and Windows for the launch article, because it's always interesting (to me) to see how performance differs for the same test on two different operating systems. Going that route turned out to be a blessing, because while I used one version in Linux, I used a newer one in Windows, and that version had major problems with Zen.
That version is 1.1.1, which is still the current stable release available on HandBrake's official website. I used 1.1.1 for Windows testing, which resulted in Zen-based chips taking twice as long for each encode, while 1.1.0 in Linux had no such issue and gave us decent scaling. As it turns out, a newer nightly build brings further improvements, and not just for Zen.
For x264, performance between 1.1.0 and the nightly is the same for both AMD and Intel. x265 shakes things up quite a bit, delivering improvements on both chips, although Intel doesn't gain nearly as much as AMD here. The difference is simply amazing, so it goes without saying: if you're a HandBrake user who encodes to H.265, you'll want the latest nightly.
Rendering and ray tracing are two peas in a pod, so it's no surprise to see a large gap between the two CPUs in these tests. I should note that none of these ray tracers are considered "current", although some have been carried over into newer renderers. That doesn't mean the performance in these tests is useless, because like any good renderer worth its salt, ray tracers are built to scale, and all three of these tests scale very well.
It's hard to ignore the fact that AMD only edges out Intel in the smallpt test, which has not been uncommon ever since Zen launched last year. I'm not sure how well that particular test reflects performance in today's landscape; the safer scaling to look at is with Tachyon or C-Ray. Still, as Blender has shown, perfect synthetic scaling does not always translate to perfect scaling in the real world.
Scaling is the name of the game with scientific tests, so this Rodinia suite utilizes the 2990WX's 64 threads no problem. Scaling is better in the LavaMD test than in the other one, but a 33% gain in the weaker of the two is hardly anything to scoff at.
AMD is great at encryption, and that's easily proven here. That said, this is a test Intel usually cuts through like a hot knife through butter. The sheer brute-force performance AMD brings to bear, however, makes it the clear winner. With a result like this, it's clear that Intel could compete extremely well if it decided to break through its 18-core barrier on the desktop (we're still waiting for the 28-core part shown at Computex).
This set of results may be the most interesting in the article. In the launch review, the 2990WX beat Intel by about 2,000 MIPS, but with the kernel upgrade, AMD's performance actually dropped. I had to sanity-check this, and the result stuck. For some reason, 4.15 delivers better performance in this test. That's no good reason to use an older kernel, and it may be that a future kernel will fix it. This isn't the last time I'll be testing the 2990WX, so I'll test again once 4.20 arrives (or 5.0 if it's skipped).
For the sake of interest, the same compression test in Windows yields only 55K MIPS on the 2990WX, so even with the performance drop, Linux is still far ahead of Windows here.
This result requires some explanation, because it isn't fair to take it as final. AMD beats Intel by about 10% here, which is notable, but that's also because AVX-512 isn't being used. In Windows, SiSoftware's Sandra will use the most appropriate instruction set, and so on some CPUs and in some tests, Intel can come out ahead. Only the top of Intel's stack has AVX-512, but for a more comprehensive look, I'd recommend checking the launch article.
These results also don't highlight the 2990WX's memory latency issue. Overall bandwidth is good, but latency is what makes the chip less than ideal for less intensive purposes, like gaming. Due to time, I haven't explored the 2990WX's memory as much as I'd like, and it may be a while before I can dive in because of other content that needs attention. Ultimately, the 2990WX delivers good bandwidth, but this result only paints part of the picture. So it really pays to know your workload: you shouldn't jump on a 2990WX over a chip like the 2950X or 7960X / 7980XE unless you know your workload can take advantage of it.
For this set of results, I decided to chart relative performance, since that made it easier to keep everything on the same graph. In the case of Darktable and Hackbench, the results are in seconds and lower is better. The others use different units, but higher is always better.
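To make the relative-performance idea concrete, here is a minimal sketch of how mixed results can be normalized against a baseline chip so that a bigger number always means faster, whether the raw result is a time or a score. The function name `relative_perf` and the sample values in the comments are illustrative assumptions, not the method or the figures behind the graph.

```shell
#!/usr/bin/env bash
# relative_perf VALUE BASELINE DIRECTION
# Prints VALUE's performance as a percentage of BASELINE (baseline = 100.0).
# DIRECTION is "lower" for results where lower is better (seconds) and
# "higher" for results where higher is better (scores, FPS).
relative_perf() {
    LC_ALL=C awk -v v="$1" -v b="$2" -v d="$3" 'BEGIN {
        if (v <= 0 || b <= 0) exit 1
        # For timed tests the ratio is inverted, so faster still reads higher.
        ratio = (d == "lower") ? b / v : v / b
        printf "%.1f\n", ratio * 100
    }'
}

# Made-up examples:
#   relative_perf 80 100 lower    -> 125.0  (80 s against a 100 s baseline)
#   relative_perf 155 100 higher  -> 155.0  (score of 155 against 100)
```

Inverting the ratio for timed tests is what lets seconds-based and score-based results share a single axis.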
With Darktable, an Adobe Lightroom alternative, and IndigoBench, a renderer, the overall gains are smaller, but still notable. With Darktable, Intel's chip comes super-close to AMD's much larger one. Things change quite dramatically with the other tests, though.
With HPC Challenge, the raw compute power of the 2990WX helps it achieve a 55% gain over Intel's chip. With Hackbench, a benchmark for system calls, AMD performs extremely well. Even the 44% gain in the chess engine test is impressive. I really can't see too many chess fans building a 2990WX into their setups, but how amazing would it be if one were actually put to that use?
As you can tell, AMD's 32-core Ryzen Threadripper 2990WX can kick some serious ass in some tests, and still impress in the others. You may have noticed a lack of single-threaded tests here, and that's because I focused on multi-threaded tests to show what happens when the going gets tough. It's a secret to nobody that the single-threaded performance of a 32-core processor won't be close to the market leaders (single-threaded Windows tests were published in the launch article). That won't affect regular use, but the fact is: you will see lower performance in scenarios that depend heavily on single-threaded performance.
This is the perfect example of a product that underscores how important it is to know your workload. There are some programs that just don't scale as well as they seem to. I haven't run into that problem in Linux that I can remember, but I did in Windows with a few tests where, although 100% of the CPU was in use, the gain over half the thread count was minimal.
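One rough way to check whether a program really benefits from every thread is to time the same job pinned to half of the logical CPUs and then to all of them, and compare the two. The sketch below assumes GNU `date`, `taskset` from util-linux, and logical CPUs numbered from 0; both helper names are hypothetical, and because the mapping of logical CPUs to cores and dies varies by system, the pinning is only an approximation of "half the chip".

```shell
#!/usr/bin/env bash
# time_on_cpus CPU_LIST COMMAND [ARGS...]
# Prints wall-clock seconds for COMMAND when pinned to CPU_LIST (e.g. 0-31).
time_on_cpus() {
    local cpus="$1"; shift
    local start end
    start=$(date +%s.%N)
    taskset -c "$cpus" "$@" > /dev/null 2>&1
    end=$(date +%s.%N)
    LC_ALL=C awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# scaling_gain SECONDS_HALF SECONDS_FULL
# Speedup from doubling the thread count: 2.00 would be perfect linear
# scaling, while a value near 1.00 means the extra threads added little.
scaling_gain() {
    LC_ALL=C awk -v half="$1" -v full="$2" 'BEGIN {
        if (half <= 0 || full <= 0) exit 1
        printf "%.2f\n", half / full
    }'
}
```

With `n=$(nproc)`, running `half=$(time_on_cpus "0-$((n / 2 - 1))" ./my_benchmark)` and `full=$(time_on_cpus "0-$((n - 1))" ./my_benchmark)`, then `scaling_gain "$half" "$full"`, gives the speedup; `./my_benchmark` is a placeholder for whatever workload you care about. A program can show 100% CPU usage in both runs and still return a figure close to 1.00.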
With HandBrake here in Linux, not all threads are used, so the 64-thread chip doesn't show its strength as well as it could. To get around that, I could have run the FFmpeg test that generates an FPS result, but I'm not quite sure how relevant that benchmark is to reality, because I haven't seen the same type of scaling elsewhere. The same can be said of Darktable: the benchmark will use the full CPU, but I've never managed to get more than half of the 2990WX's threads used in actual use (that's not to say it isn't possible; I'd welcome feedback).
If this article doesn't cover the type of performance you were after, please leave a comment and I'll see if I can tackle it the next time the machine is hooked up (both the 7980XE and 2990WX live in their own dedicated PCs, so they're generally handy). You may also want to check out the 2990WX launch article; although it largely focuses on Windows, the overall performance picture can still carry over to your Tux setup.
There's more 2990WX testing to come in time, but other content and launches will prevent me from diving in too much in the near future. Suggestions for relevant tests are appreciated, as is feedback in general.