
Instability on recent Intel Processors



While investigating crashes in Warframe we came across a particular series that were not crashing in our code (they were crashing in nvgpucomp64.dll, a component of Nvidia drivers). After aggregating hundreds of reports from helpful players we discovered a pattern: almost all were coming from systems with 13th and 14th generation Intel processors.
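
For anyone curious how a pattern like this can fall out of a pile of reports, here is a minimal sketch of grouping reports by Intel Core generation. It is not our actual pipeline: the sample CPU strings, `intel_generation`, and `tally_by_generation` are all hypothetical, and the regular expression only understands desktop-style model numbers such as i9-13900K.

```python
import re
from collections import Counter

# Hypothetical CPU name strings as they might appear in crash reports.
SAMPLE_REPORTS = [
    "13th Gen Intel(R) Core(TM) i9-13900K",
    "Intel(R) Core(TM) i9-14900KF",
    "Intel(R) Core(TM) i7-14700K",
    "AMD Ryzen 9 5950X 16-Core Processor",
    "Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz",
]

# Matches desktop-style Intel Core model numbers such as i9-13900K or i7-4790.
_INTEL_MODEL = re.compile(r"\bi[3579]-(\d{4,5})([a-z]*)\b", re.IGNORECASE)


def intel_generation(cpu_name):
    """Return the Intel Core generation (e.g. 13), or None if not recognised."""
    match = _INTEL_MODEL.search(cpu_name)
    if match is None:
        return None
    digits = match.group(1)
    # The last three digits are the model within the generation:
    # 13900 -> generation 13, 4790 -> generation 4.
    return int(digits[:-3])


def tally_by_generation(cpu_names):
    """Count reports per Intel generation; anything else is filed under 'other'."""
    counts = Counter()
    for name in cpu_names:
        generation = intel_generation(name)
        counts[generation if generation is not None else "other"] += 1
    return counts


for generation, count in tally_by_generation(SAMPLE_REPORTS).most_common():
    print(generation, count)
```

Raw counts like these only show where crashes cluster; they say nothing about how many players own each CPU.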

nvgpucomp64.dll crashes.png

 

Luckily we found a staff member who would encounter these crashes on his home computer. Curiously, his computer at the office was fine: he was playing with the same loadout, the same customizations, with the same people, but he would only crash at home.

He wasn’t over-clocking anything and it was a new machine so there was no reason to expect problems. We tried all of the usual fixes: he got the latest Windows Updates, he updated all his drivers, he disabled all third-party overlays being injected, he tested his RAM, and by all accounts everything was fine.

We ran aggressive stress-tests on similar machines: we used scripts to repeatedly open and close various user-interface components that were mentioned in crash reports, we ran endless simulated battles between squads of NPCs, and we even made a test that would load up random levels, teleport around quickly to a whole bunch of vantage points to exercise the graphics driver, and then move on.

Everything was fine for us and yet he kept crashing doing the most basic things like launching the game and flying to a mission.

Because the crash wasn’t in our code it was hard to guess what we could be doing wrong but as we looked over the reports we noticed that these crashes tended to occur when the graphics driver was working very hard on all CPU-cores. The penny dropped when we realized that this was a particularly power-hungry state for the processor to be in and we were reminded of a recent report from Intel that suggested that a BIOS update might help.

BIOS updates aren’t usually delivered automatically by Windows Update, although they are for certain OEMs: many of our office machines get regular updates from the vendor, but the person who was crashing was using a custom-built gaming rig at home. He checked and it turned out that it was running the stock BIOS from 2022 and was missing over a dozen updates, including one that “replaced tweaked system power settings.”

After updating his BIOS to the latest he hasn’t crashed in nvgpucomp64.dll since and we’re optimistic that the weird crashes that only he was getting won’t be back either. We’re not positive that it was the issue described by the report linked above but we’re happy that updating the BIOS helped.

Updating the BIOS is usually a simple process but it’s not something we would normally encourage people to do – usually the advice is “if it ain’t broke don’t fix it” – however if you’re crashing playing Warframe and other games, you have a 13th or 14th generation Intel processor, and you’ve updated everything else, then it’s something to consider (check with your motherboard vendor for updates and instructions).

If you happen to be playing on an AMD CPU or aren’t lucky enough to have a recent Intel processor, don’t worry: we have a bunch of fixes for crashes unrelated to this issue coming soon – we’re just waiting to get through cert on all platforms.

 


2 minutes ago, [DE]Glen said:

If you happen to be playing on an AMD CPU or aren’t lucky enough to have a recent Intel processor, don’t worry: we have a bunch of fixes for crashes unrelated to this issue coming soon – we’re just waiting to get through cert on all platforms.

Haven't seen this issue on the AMD side, with either Zen2, Zen4 or Zen4 3D. Absolute stability on my side, and I'm probably the only person here running a 7950x3D direct-die cooled with an NH-D15, CCD1 disabled, and extremely undervolted. I don't think I've touched a UEFI update since the SoC voltage issue on 7000x3D, so ~1.5 years here.


For AMD (what I am currently using), I've experienced nothing like this. I have only experienced ultra-low 1% lows on framerates and weird RAM-based crashes.
For reference, I have a Ryzen 5950x CPU and an RTX 3090, and I am running the game with high-ultra settings—still nothing like you've encountered.


For those who are reluctant to update their BIOS for whatever reason, another potential fix for this issue is to manually cap the maximum power draw of the CPU in the BIOS to whatever is appropriate for your CPU (if you've got an adequately specced power supply, anything up to around 500W should be fine, though most CPUs will do fine at a tenth of that).

The issue most likely arises from motherboard vendors setting the maximum power for these CPUs to 4096 watts, which the CPU then attempts to draw. Since the motherboard and power supply are both incapable of providing this amount, it can result in instability.

Looking at the pie chart above, we can see that the most common crashes were on high-end K-series CPUs, such as the 13900K and 14900KF, which further supports the theory that this was caused by motherboard power limit settings, as well as the fact that these issues do not seem to be present on OEM systems (which have no reason to boost to excessively high power limits in an attempt to eke out a few more points of performance). Hope this helps someone!


2 hours ago, Razgarize said:

For AMD (what I am currently using), I've experienced nothing like this. I have only experienced ultra-low 1% lows on framerates and weird RAM-based crashes.
For reference, I have a Ryzen 5950x CPU and an RTX 3090, and I am running the game with high-ultra settings—still nothing like you've encountered.

Dual-CCD Ryzen CPUs aren't great for gaming past Ryzen 3000, so R9 5000 series and later. The 1% low drops you're experiencing are either the result of the game spilling onto the other CCD, which increases frame-time latency dramatically, or a combination of that and FCLK:MCLK:UCLK instability; those three clocks are all supposed to run 1:1:1. More than likely, you're running either 1600MHz or 1800MHz, depending on what DOCP profile your RAM is running (DDR4-3200 = 1600MHz, DDR4-3600 = 1800MHz). Ryzen 5000 is perfectly stable at 1800MHz, but maybe not when jumping CCDs intermittently, since that'll operate off the FCLK to do so.
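
To spell out the arithmetic behind those numbers: DDR memory transfers data twice per clock, so the real memory clock is half the advertised rating, and that is the figure the fabric clock is meant to match. A small sketch, where `memory_clock_mhz` and `is_one_to_one` are names made up for illustration:

```python
def memory_clock_mhz(ddr_rating):
    """DDR transfers twice per clock, so the real clock is half the rating."""
    return ddr_rating / 2


def is_one_to_one(fclk_mhz, mclk_mhz, uclk_mhz):
    """True when fabric, memory and memory-controller clocks all match (1:1:1)."""
    return fclk_mhz == mclk_mhz == uclk_mhz


for rating in (3200, 3600):
    print(f"DDR4-{rating}: memory clock {memory_clock_mhz(rating):.0f} MHz")

# A DDR4-3600 kit with everything in sync, and one where the memory
# controller has fallen back to half speed.
print(is_one_to_one(1800, memory_clock_mhz(3600), 1800))
print(is_one_to_one(1800, memory_clock_mhz(3600), 900))
```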

There are some scenarios where Warframe's CPU utilization spikes massively, even up to 100% on my CPU. In the case of any dual-CCD Ryzen CPU, you'd be bridging the Infinity Fabric to the other CCD, which is going to increase frame-time latency.

It's a major reason why I disabled CCD1 on my 7950x3D: even fully tuned to mitigate it, once you bridge from your 3D V-Cache CCD0 to CCD1, the benefits of 3D V-Cache are eliminated. In this case, direct-die cooling a 7950x3D with CCD1 disabled lets me operate at ~400MHz higher than a 7800x3D under load, maintaining >5.2GHz on multiple cores. With Warframe, that's running the game at 240 fps versus 160 fps, since Warframe loves 3D V-Cache. I saw the same behavior with the 5800x3D I had previously, after upgrading from a 3950x.

 

TLDR: Don't use dual-CCD Ryzen CPUs for gaming. Use Process Lasso or set processor affinity on your 5950x if you need those extra 8 cores; otherwise, just disable CCD1. The other solution would be to sell your 5950x and buy a 5700x3D/5800x3D. You might even be able to trade it 1:1, since the 5950x is still a capable productivity/server CPU given how abundant DDR4 ECC still is.


6 hours ago, [DE]Glen said:

After aggregating hundreds of reports from helpful players we discovered a pattern: almost all were coming from systems with 13th and 14th generation Intel processors.

Funny. I have a Ryzen.

Looks like there are no crashes on the newest drivers.

Or maybe the Matrix just has me and I really have overheated, while you

6 hours ago, [DE]Glen said:

Everything was fine for us and yet he kept crashing doing the most basic things like launching the game and flying to a mission.

Yeah, maybe it's unrelated: I only crash on the Event mission.

 

 

Edited by -JT-_-R3W1ND

16 hours ago, Demigirlboss said:

For those who are reluctant to update their BIOS for whatever reason, another potential fix for this issue is to manually cap the maximum power draw of the CPU in the BIOS to whatever is appropriate for your CPU (if you've got an adequately specced power supply, anything up to around 500W should be fine, though most CPUs will do fine at a tenth of that).

The issue most likely arises from motherboard vendors setting the maximum power for these CPUs to 4096 watts, which the CPU then attempts to draw. Since the motherboard and power supply are both incapable of providing this amount, it can result in instability.

Looking at the pie chart above, we can see that the most common crashes were on high-end K-series CPUs, such as the 13900K and 14900KF, which further supports the theory that this was caused by motherboard power limit settings, as well as the fact that these issues do not seem to be present on OEM systems (which have no reason to boost to excessively high power limits in an attempt to eke out a few more points of performance). Hope this helps someone!

The 13000 and 14000 series are even known to die because Intel messed up the internal power management. With the original settings, the CPU locally draws more than the silicon can handle. I would strongly recommend the BIOS update, though it does have an impact on performance.

If anyone is interested in a long read: https://community.intel.com/t5/Processors/June-2024-Guidance-regarding-Intel-Core-13th-and-14th-Gen-K-KF/m-p/1607807

Edited by kadlis12

On 2024-07-09 at 11:54 AM, Agall said:

Dual-CCD Ryzen CPUs aren't great for gaming past Ryzen 3000, so R9 5000 series and later. The 1% low drops you're experiencing are either the result of the game spilling onto the other CCD, which increases frame-time latency dramatically, or a combination of that and FCLK:MCLK:UCLK instability; those three clocks are all supposed to run 1:1:1. More than likely, you're running either 1600MHz or 1800MHz, depending on what DOCP profile your RAM is running (DDR4-3200 = 1600MHz, DDR4-3600 = 1800MHz). Ryzen 5000 is perfectly stable at 1800MHz, but maybe not when jumping CCDs intermittently, since that'll operate off the FCLK to do so.

There are some scenarios where Warframe's CPU utilization spikes massively, even up to 100% on my CPU. In the case of any dual-CCD Ryzen CPU, you'd be bridging the Infinity Fabric to the other CCD, which is going to increase frame-time latency.

It's a major reason why I disabled CCD1 on my 7950x3D: even fully tuned to mitigate it, once you bridge from your 3D V-Cache CCD0 to CCD1, the benefits of 3D V-Cache are eliminated. In this case, direct-die cooling a 7950x3D with CCD1 disabled lets me operate at ~400MHz higher than a 7800x3D under load, maintaining >5.2GHz on multiple cores. With Warframe, that's running the game at 240 fps versus 160 fps, since Warframe loves 3D V-Cache. I saw the same behavior with the 5800x3D I had previously, after upgrading from a 3950x.

TLDR: Don't use dual-CCD Ryzen CPUs for gaming. Use Process Lasso or set processor affinity on your 5950x if you need those extra 8 cores; otherwise, just disable CCD1. The other solution would be to sell your 5950x and buy a 5700x3D/5800x3D. You might even be able to trade it 1:1, since the 5950x is still a capable productivity/server CPU given how abundant DDR4 ECC still is.

This issue only occurs when playing the DirectX 12 version of the game. I am using a multipurpose workstation-grade desktop that I built myself. It has 64 GB of RAM running at 3600MHz CL 14 and a CPU with an automated overclock averaging 4.8GHz. I built this system not only for gaming but also for animation, game development, and coding projects.

Regarding the game's performance, the DirectX 12 version runs at an average of 290 fps on high-ultra (custom) settings, with 1% lows around 60 fps. After experimenting with the game settings, I found that these changes do not affect the game's stability. When reverting to DirectX 11, the game's stability improves, with the 1% lows increasing overall, but the average fps drops to around 140 fps. I'm confident the issue is software-related, but I do not have the time to fully investigate it.
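
For readers unfamiliar with the terms, here is a rough sketch of how average fps and 1% lows can be derived from a frame-time log. Benchmark tools differ in how they define 1% lows; this version assumes "the average fps over the slowest 1% of frames", and `fps_summary` plus the sample frame times are made up for illustration.

```python
def fps_summary(frame_times_ms):
    """Return (average fps, 1% low fps) from per-frame times in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    average_fps = 1000 * len(frame_times_ms) / sum(frame_times_ms)
    # Take the slowest 1% of frames (at least one) and average them the same way.
    slowest_first = sorted(frame_times_ms, reverse=True)
    worst = slowest_first[: max(1, len(slowest_first) // 100)]
    one_percent_low = 1000 * len(worst) / sum(worst)
    return average_fps, one_percent_low


# Made-up capture: 99 smooth frames at 4 ms and a single 20 ms hitch.
average, low = fps_summary([4] * 99 + [20])
print(f"average: {average:.1f} fps, 1% low: {low:.1f} fps")
```

A single long hitch barely moves the average but dominates the 1% low, which is why a high average with a low 1% figure still feels like stutter.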

As for the original post, game crashes do occur, but there is no conclusive evidence that the cause is hardware-related on the AMD side. If I had more time to conduct more studies and in-depth tinkering with the game's stability, I could potentially find an exact fix to report to DE. However, I believe DE is likely working on moving from DirectX 11 to DirectX 12, as this has been their historical approach over the past eight years. I hope that we will see more APIs in the future for better stability.


10 hours ago, Razgarize said:

This issue only occurs when playing the DirectX 12 version of the game. I am using a multipurpose workstation-grade desktop that I built myself. It has 64 GB of RAM running at 3600MHz CL 14 and a CPU with an automated overclock averaging 4.8GHz. I built this system not only for gaming but also for animation, game development, and coding projects.

Regarding the game's performance, the DirectX 12 version runs at an average of 290 fps on high-ultra (custom) settings, with 1% lows around 60 fps. After experimenting with the game settings, I found that these changes do not affect the game's stability. When reverting to DirectX 11, the game's stability improves, with the 1% lows increasing overall, but the average fps drops to around 140 fps. I'm confident the issue is software-related, but I do not have the time to fully investigate it.

As for the original post, game crashes do occur, but there is no conclusive evidence that the cause is hardware-related on the AMD side. If I had more time to conduct more studies and in-depth tinkering with the game's stability, I could potentially find an exact fix to report to DE. However, I believe DE is likely working on moving from DirectX 11 to DirectX 12, as this has been their historical approach over the past eight years. I hope that we will see more APIs in the future for better stability.

I would try adjusting processor affinity; it takes effect live as you change it. You get to it via Task Manager > Details > right-click Warframe.exe > Set affinity, then limit the game to logical cores 0-15 to restrict it to CCD0. I could see DX12 using more than 8c/16t and bridging to CCD1.
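
If you'd rather use a tool that takes an affinity bitmask than tick boxes in Task Manager, the mask is one bit per logical processor. A small sketch, assuming (as above) that logical processors 0-15 belong to CCD0; `affinity_mask` is a hypothetical helper, not part of any tool.

```python
def affinity_mask(first_cpu, last_cpu):
    """Bitmask with one bit set per logical processor in [first_cpu, last_cpu]."""
    if first_cpu < 0 or last_cpu < first_cpu:
        raise ValueError("invalid logical processor range")
    count = last_cpu - first_cpu + 1
    return ((1 << count) - 1) << first_cpu


# Logical processors 0-15: assumed here to be the 8 cores / 16 threads of CCD0.
ccd0_mask = affinity_mask(0, 15)
print(hex(ccd0_mask))  # 0xffff
```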

If that doesn't resolve it, then you probably just need to reinstall chipset drivers from AMD.com.


This is truly funny. I'm rocking my old i7 4790 at 4GHz with a second-hand RX 6600 XT, 16GB RAM DDR3 at 2133MHz under CachyOS (Arch Linux) using Proton-GE and DirectX 12.

Warframe runs above 90fps in all instances, with stable, smooth frame times and no crashes. I have no idea what's going on with the latest Intel processors.


On 2024-07-09 at 7:39 AM, Demigirlboss said:

The issue most likely arises from motherboard vendors setting the maximum power for these CPUs to 4096 watts, which the CPU then attempts to draw. Since the motherboard and power supply are both incapable of providing this amount, it can result in instability.

4096W is 4.1 kilowatts. No household wall outlet is going to supply that; remember that Volts * Amperes = Watts. To use American standards, it's usually 120V with a 15A circuit breaker, so a single house circuit will usually provide up to 1800W in total across all its outlets. A "4096W" setting is basically lazy programmer speak for "unlimited power draw".
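
The arithmetic, as a quick sketch (`circuit_watts` is just an illustrative helper; the 120V / 15A figures are the American example above):

```python
def circuit_watts(volts, amps):
    """Power available from a circuit: Volts * Amperes = Watts."""
    return volts * amps


POWER_LIMIT_W = 4096  # the motherboard setting discussed above

household = circuit_watts(120, 15)
print(household)                  # 1800
print(POWER_LIMIT_W / household)  # more than twice what the whole circuit can supply
print(POWER_LIMIT_W == 2 ** 12)   # True: a round number in binary, not a measured limit
```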

Edited by Dalewyn

Hey, thank you guys for taking a look at this and releasing some of the data! If this is a hardware degradation issue as speculated elsewhere hopefully this will help get to the bottom of it.

 

I can't help but notice, though, that there are duplicate entries for the i7-14700k and i7-14700kf in the pie chart. Is that a simple mistake in the plotting or was there some differentiator in the source dataset?

 

I'd also love to see some normalized data showing what proportion of each CPU type is experiencing issues, if you’re able to share the data. If one or more CPU models have a high enough error rate it might even be worth showing players an in-game message letting them know they might want to update their BIOS.
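
To be concrete about what I mean by normalized, here is a minimal sketch that divides crash counts by the number of players on each CPU. Every number below is invented and `crash_rate_by_model` is a hypothetical helper; only DE has the real data.

```python
def crash_rate_by_model(crash_counts, player_counts):
    """Return crashes per 1,000 players for each CPU model with known players."""
    rates = {}
    for model, players in player_counts.items():
        if players <= 0:
            continue  # no population data, so no meaningful rate
        rates[model] = 1000 * crash_counts.get(model, 0) / players
    return rates


# Invented numbers, purely to show why normalizing matters.
crashes = {"i9-13900K": 90, "i7-14700K": 60, "Ryzen 7 5800X3D": 30}
players = {"i9-13900K": 3000, "i7-14700K": 4000, "Ryzen 7 5800X3D": 30000}

rates = crash_rate_by_model(crashes, players)
for model, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: {rate:.1f} crashes per 1,000 players")
```

A popular CPU can take a big slice of a pie chart simply because many people own it; a per-player rate separates popularity from an actual problem.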


7 hours ago, sopibar said:

I can't help but notice, though, that there are duplicate entries for the i7-14700k and i7-14700kf in the pie chart. Is that a simple mistake in the plotting or was there some differentiator in the source dataset?

 

It is not duplication; these are CPUs from different SKUs.

14700 - base model

14700k - overclockable model

14700f - base model without iGPU

14700kf - overclockable model without iGPU.

There are also the 14900K, 14900KF and 14900KS (which isn't sold in a KFS variant). The KS model is the one that was pushed to the limit (the 6 GHz boost one, IIRC), but those are in limited supply and were released later.

What is interesting, though, is that I don't see many F models (the ones without an iGPU). And there are no base models either, only overclockable ones.


7 hours ago, DimkaTsv said:

It is not duplication; these are CPUs from different SKUs.

14700 - base model

14700k - overclockable model

14700f - base model without iGPU

14700kf - overclockable model without iGPU.

There are also the 14900K, 14900KF and 14900KS (which isn't sold in a KFS variant). The KS model is the one that was pushed to the limit (the 6 GHz boost one, IIRC), but those are in limited supply and were released later.

What is interesting, though, is that I don't see many F models (the ones without an iGPU). And there are no base models either, only overclockable ones.

Sorry, I meant that each SKU has multiple entries in the chart.

dupes.png


All last month I had weird issues with some new work PCs whenever a workload put a lot of pressure on the CPU, such as programs creating PDFs. The only fix I found was to apply the latest BIOS from the motherboard manufacturer.

You can use tools like CPU-Z to identify your motherboard and which BIOS version it is running; a quick Google search with that info should help you find the most up-to-date BIOS from the manufacturer. If you have a Dell or HP system, you can always check their respective support sites for BIOS updates.
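
If you are on Linux (like the CachyOS poster above) and don't have CPU-Z, the kernel exposes the same details as plain files. A minimal sketch, assuming the standard `/sys/class/dmi/id` directory is present and readable; `read_dmi_info` is a hypothetical helper, and any field that can't be read comes back as `None`.

```python
from pathlib import Path

# Files the Linux kernel exposes under /sys/class/dmi/id.
DMI_FIELDS = ("board_vendor", "board_name", "bios_vendor", "bios_version", "bios_date")


def read_dmi_info(dmi_dir="/sys/class/dmi/id"):
    """Return motherboard and BIOS details; unreadable fields map to None."""
    info = {}
    for field in DMI_FIELDS:
        try:
            info[field] = (Path(dmi_dir) / field).read_text().strip()
        except OSError:
            info[field] = None
    return info


for field, value in read_dmi_info().items():
    print(f"{field}: {value or 'unknown'}")
```

Compare `bios_version` and `bios_date` with the newest release listed on your board vendor's support page.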


On 2024-07-15 at 9:35 PM, sopibar said:

Sorry, I meant that the each SKU has multiple entries in the chart.

Oh, now I see what you meant. Could they be entries from DX11 and DX12?

Or some internal reporting differentiator, who knows?

Or Windows vs Linux (nah, the percentages are definitely off for that).

 

4 hours ago, RoStarGamerX said:

Does the 13600K/14600K or lower SKUs have this issue?

Right now it doesn't seem so, but in the future, who knows; maybe they will be next in line. Until then, don't panic, but stay alert for potential issues (basically, for each issue, consider the possibility that the CPU is the cause).

Edited by DimkaTsv

I have had several system restarts that happen only in Warframe, and the last one physically killed my laptop. I was unable to repair it, so I need to buy a new one.

The strangest thing is that here in Brazil there is currently only one model of gaming laptop with Ryzen being sold. It is leftover stock from two years ago, and after it runs out we will only have Intel 13th and 14th gen on the shelves. No website is announcing the new AMD versions; I feel like something is being held back.

I was having a lot of problems with Intel 12th gen, and my laptop had no XMP settings or processing limits. For me this is an old issue, and the chart in the report is not showing all the processor models. The youtuber himself reports that it is a very random problem, and there are reports on the internet of people seeing it on previous generations, but Intel is only concerned with the 13th and 14th generation processors.


I have an i9 14900K and can't play a mission for longer than 20 minutes. I've reinstalled, verified and optimized the cache, and I have no Windows or GPU driver updates pending! I just want to play the game. Can anyone help me?
