
Hunting Hitches


[DE]Glen

I've spent a few days over the last few weeks investigating some performance problems that some people have reported recently. Specifically I've been looking at small pauses or stutters that usually only last a few frames (maybe a tenth of a second at most); some people call these “frame drops” but I prefer to just call them hitches.

 

The tricky part about this problem is that we never see it to the degree that some people are experiencing. I personally put quite a lot of time into playing the game and although I've seen this problem on occasion it's usually only enough to notice maybe once in a full day of playing.

 

We have several members of the team dedicated to testing the game on low-end systems; they are constantly on the lookout for performance problems and work closely with the art and programming teams to optimize slow areas they find. A few weeks ago they discovered a performance issue that would occur whenever you received an in-game transmission from the Lotus or an enemy.

 

As soon as we heard about this issue we had our performance guys looking for it and they did not get anything as bad as people were reporting even on their min-spec machines. We worked with one user through our customer support desk and got a system in our lab set up with the same GPU and the same driver installed and still couldn't replicate his problems, even on an old dual-core machine!

 

Ruling out Suspects

 

We do have some ideas, though, and I'm going to describe one theory we have and what you can do to help test it. In short, assuming you've covered the obvious “exit other programs you've left running”:

 

1) Make sure you're running the latest drivers for your video card.

 

2) Don't mess with your driver control panel.

 

I want to stress the first point because in several situations we've had users insist that they were running the latest drivers only to have their DxDiag show that there are newer drivers available. We've had several success stories that seem to have required nothing more than installing updated drivers so if you're having problems please check carefully for updates.

 

Regarding your driver control panel (eg: Catalyst Control Center, NVIDIA Control Panel, etc): if you've tried to tune options in there please try resetting it to defaults – there should be an option to do this somewhere; if this miraculously fixes your hitches I want to hear about it, specifically what the bad option was!

 

Prior Bad Acts

 

Historically when we loaded into a new level we would make sure that all the textures in the target area were loaded before closing the loading screen (this helps avoid distracting texture pops); when I was adding diagnostics to measure for hitches caused by texture streaming I discovered two things: texture streaming wasn't causing any hitches that I could measure, and pre-fetching wasn't working properly.

 

The fix for pre-fetching went out earlier this week and while it's possible that with it some people will experience fewer hitches right after entering the level there is no hard evidence linking this to the crime; if you're playing this weekend and your hitches are gone I'd love to hear about it!

 

The Scene of the Crime

 

There is one area I've been working on optimizing for several days now; you'll notice this hitch or stutter if you run up to a closed door and have it open to show you a new area with materials you haven't seen before (I'm testing this using the Earth defence map which is a pretty good test case for me). You actually get a similar stutter when the loading screen starts fading away but you usually don't notice (I added diagnostics to measure this so I know it's there).

 

The reason for this hitch will take some time to explain but before I go on I want to stress that the underlying technology has been the same in our engine for many years and shipped without complaint in three other games. We currently have no working theory that would explain why this suddenly got worse (and in my personal experience as a player I've been aware of it for many months and haven't seen it any more frequently). Nevertheless while investigating possible explanations for people's hitches we realized that this could be improved and are working on it.

 

Some people have speculated that hitches coincident with doors opening were caused by content being loaded; this is extremely unlikely as we do not stream worlds this way. Tiles may go to sleep but their geometry and low-resolution textures stay resident; furthermore we don't actually respect doors when deciding if something should sleep so any issues related to it would not occur right as the door opens.

 

The Prime Suspect

 

Modern GPUs use small programs called shaders to render almost everything you see in-game; these programs tell the video-card how to transform geometry, how to color pixels, how to apply lighting, etc. We build these shaders into a compact and optimized platform-independent format that we deliver with the game.

 

The problem is that there are many, many different kinds of GPUs out there and they all have different technology for running these shaders. When we upload a new shader to the GPU the graphics driver has to first translate the program into a dialect that the GPU understands; this translation can take time and, if you upload a whole bunch of shaders at once, this time can add up to a hitch.

 

Some drivers are actually really clever: on my system the DirectX 11 drivers will use multiple CPU cores to do this translation in the background where possible; the DirectX 9 drivers are not so clever and I hitch 1-10ms per shader uploaded.
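
A rough sketch of the arithmetic behind that last figure, not taken from the engine: if translation is synchronous and costs 1-10ms per shader, a batch of newly needed shaders stalls the frame for the sum of those costs. The batch size of 20 and the 60 FPS target below are assumptions chosen purely for illustration.

```python
def upload_stall_ms(shader_count, per_shader_ms):
    """Total stall if every shader is translated synchronously, one after another."""
    return shader_count * per_shader_ms


def frames_missed(stall_ms, fps=60):
    """Number of whole frame slots (at the given frame rate) consumed by the stall."""
    return int(stall_ms * fps // 1000)


# Hypothetical door opening onto 20 shaders that have not been uploaded yet.
best = upload_stall_ms(20, 1.0)    # 20 ms at 1 ms per shader
worst = upload_stall_ms(20, 10.0)  # 200 ms at 10 ms per shader

print(f"best case:  {best:.0f} ms, {frames_missed(best)} frame(s) missed")
print(f"worst case: {worst:.0f} ms, {frames_missed(worst)} frame(s) missed")
```

Even the best case overruns a single 60 FPS frame, and the worst case is in the range of a very noticeable hitch.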

 

The Criminal Record

 

Astute readers might be wondering at this point why we don't just upload all possible shaders while the loading screen is up; the problem is we have shaders for every possible situation that could arise (but often doesn't); a typical level might have upwards of 15,000 shaders ready to go at a moment's notice even though we might only use 300 to 500 of them depending on what's needed.

 

If we uploaded the whole library at once this would add a brutal amount of time to the loading screen: on my workstation this added a full minute to the loading-screen when running DirectX 9 or made my game run at 10 FPS for 2 minutes when running DirectX 11 (while shaders were translated in the background).
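
As a sanity check on those numbers, here is a small back-of-the-envelope calculation. It assumes the 1-10ms DirectX 9 figure quoted earlier applies to every shader in the library, which is a simplification; the constants come from the post, the function name is made up for this sketch.

```python
LIBRARY_SIZE = 15_000        # shaders available for a typical level (from the post)
USED_RANGE = (300, 500)      # shaders a level actually ends up using (from the post)
PER_SHADER_MS = (1.0, 10.0)  # DirectX 9 cost per uploaded shader (from the post)


def full_upload_seconds(count, per_shader_ms):
    """Time to translate `count` shaders back to back, in seconds."""
    return count * per_shader_ms / 1000.0


low = full_upload_seconds(LIBRARY_SIZE, PER_SHADER_MS[0])   # 15.0 s
high = full_upload_seconds(LIBRARY_SIZE, PER_SHADER_MS[1])  # 150.0 s

# Average cost per shader implied by the observed "full minute".
implied_avg_ms = 60.0 * 1000.0 / LIBRARY_SIZE  # 4.0 ms

# Share of the library a level actually needs.
used_fraction = tuple(n / LIBRARY_SIZE for n in USED_RANGE)

print(f"full upload: {low:.0f}-{high:.0f} s, implied average {implied_avg_ms:.1f} ms")
print(f"fraction used: {used_fraction[0]:.1%} to {used_fraction[1]:.1%}")
```

The observed minute sits inside the 15-150 second range, and only about 2-3% of that work would be for shaders the level actually uses.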

 

Instead of uploading all shaders at startup we upload shaders on-demand (much the same way we stream textures as they're needed); this approach isn't great but it's definitely the lesser of two evils. My theory is that on some systems these little hitches are actually unusually long hitches.

 

What I've been working on for several days is various ways to upload these shaders ahead of time (ideally during the loading screen so that they can be translated in the background); the difficulty is finding a balance between loading too much and wasting time at the loading screen, or not loading enough and hitching.
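
The balance described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the engine's approach: shaders are ranked by some guess at how likely they are to be needed, the first N are uploaded at the loading screen, and anything needed but missed is paid for in-game. The shader names, the flat 4 ms cost, and the "every fourth shader is needed" pattern are all hypothetical.

```python
def prewarm_tradeoff(ranked_shaders, needed, prewarm_count, per_shader_ms):
    """Return (loading_screen_ms, in_game_hitch_ms) when the first
    `prewarm_count` ranked shaders are uploaded ahead of time."""
    prewarmed = set(ranked_shaders[:prewarm_count])
    loading_ms = len(prewarmed) * per_shader_ms
    missed = [s for s in needed if s not in prewarmed]
    hitch_ms = len(missed) * per_shader_ms
    return loading_ms, hitch_ms


# Hypothetical library of 1000 shaders; 250 of them are actually needed,
# scattered evenly through the ranking (an imperfect guess).
ranked = [f"shader_{i}" for i in range(1000)]
needed = {f"shader_{i}" for i in range(0, 1000, 4)}

for n in (0, 250, 500, 1000):
    loading_ms, hitch_ms = prewarm_tradeoff(ranked, needed, n, per_shader_ms=4.0)
    print(f"prewarm {n:4d}: loading +{loading_ms:6.0f} ms, in-game stalls {hitch_ms:6.0f} ms")
```

The in-game figure is the total across the whole mission, spread over many small stalls. A better ranking moves the needed shaders to the front, which is what lets a small prewarm set remove most of the hitching without a long loading screen.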

 

I've been working on adding diagnostics to help catch this problem in the act; I didn't get a chance to finish them in time for a hotfix this week but I'm hoping to get them out soon. When I'm ready I'll post again to explain what you can do to help gather evidence and build a case. In the meantime please make sure your drivers are at the latest, reset your driver control panel to defaults, and rest assured that we are working on this problem.
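
For readers curious what a hitch diagnostic might look like in principle, here is a minimal sketch. It is not the diagnostic described in this post; it simply flags frames that took far longer than the typical frame in a recorded list of frame times. The 3x factor and the 33 ms floor are assumptions picked for illustration.

```python
from statistics import median


def find_hitches(frame_times_ms, factor=3.0, min_ms=33.0):
    """Return (frame_index, duration_ms) for frames far slower than the typical frame.

    A frame counts as a hitch when it exceeds both `factor` times the median
    frame time and the absolute floor `min_ms`.
    """
    if not frame_times_ms:
        return []
    threshold = max(median(frame_times_ms) * factor, min_ms)
    return [(i, t) for i, t in enumerate(frame_times_ms) if t > threshold]


# Hypothetical capture: steady 60 FPS with one long stall and one mildly slow frame.
samples = [16.7] * 50 + [95.0] + [16.7] * 50 + [41.0] + [16.7] * 20

for index, duration in find_hitches(samples):
    print(f"hitch at frame {index}: {duration:.1f} ms")
```

Using the median rather than the mean keeps a handful of slow frames from raising the baseline they are compared against.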

 

Edit:

 

Shader upload seems to be substantially faster for me in Dx11 mode because the driver seems to do most of the work asynchronously; I've been testing some prototypes here and the differences are quite significant on both AMD and NVIDIA hardware. So, in addition to the prescription of latest-drivers I would suggest you leave DirectX 11 enabled (if you have performance problems in DirectX 11 mode just turn off runtime tessellation and it should fix it).
Link to comment
Share on other sites

Very interesting read about some of the under the hood processes involved with how the game renders stuff. I wish I could help gather evidence when you'll explain how, but sadly I've never experienced these hitches so I'm not sure if I'd be able to gather any worthwhile reports or such.

 

I was wondering though, if/when users experience performance issues (hitches, etc), does it reflect in the EE.log? Or is the lack of such a thing the reason why you're working on adding diagnostics?

Edited by Letter13
Link to comment
Share on other sites

This will probably seem like a really simplistic question since I'm not a graphics engine programmer guru, but...   If it takes so much time to translate your shaders through the video driver, can you capture the "compiled" version and save that for later use?  Stamp your compiled shader with the driver version it was compiled under, and if anything changes on the user's video hardware, rebuild your cache?

Link to comment
Share on other sites

I don't think I've ever experienced anything super horrendous hitch wise either, and I have a high end desktop, a low end newer laptop, and a medium power older laptop. I wonder if some of the reports are blown a little out of proportion due to frustration. Glad you have an idea of where the answers lie though.

Link to comment
Share on other sites

This will probably seem like a really simplistic question since I'm not a graphics engine programmer guru, but...   If it takes so much time to translate your shaders through the video driver, can you capture the "compiled" version and save that for later use?  Stamp your compiled shader with the driver version it was compiled under, and if anything changes on the user's video hardware, rebuild your cache?

 

Sadly the process occurs inside the driver -- we don't have any control or visibility into the system. The feature you're describing might be great except I'd have to work at NVIDIA or AMD to implement it.

Link to comment
Share on other sites

Hmm, I haven't really seen any hitches with my game personally, though occasionally I get a massive lag spike that drops me to 2 FPS for 10 seconds or so, but that mostly seems to be due to running stuff in the background.

 

Anyways anything that boosts performance is a great update in my books.

 

 

(I run Warframe on my mid-range Laptop, at 40-60 fps)

Edited by Sixty5
Link to comment
Share on other sites

The Scene of the Crime

 
There is one area I've been working on optimizing for several days now; you'll notice this hitch or stutter if you run up to a closed door and have it open to show you a new area with materials you haven't seen before (I'm testing this using the Earth defence map which is a pretty good test case for me). You actually get a similar stutter when the loading screen starts fading away but you usually don't notice (I added diagnostics to measure this so I know it's there).

 

An issue that I have had in small amounts for a long time is that when turning quickly, bringing a different part of the tile into view, or opening a door, the other side is white. This has gotten far worse with this week's patches, to the point where it happens every time. I assumed it had something to do with culling planes. I don't notice any stuttering along with it, but I do wonder if it has a connection.

Edited by egregiousRac
Link to comment
Share on other sites

Even with a high-end PC I've been experiencing rare uber lags (less than 1 FPS) for maybe 4 or 5 seconds before going back to the normal 60 FPS.

I experienced something similar back before Nova's Mprime was optimized so that it didn't do that big 'crunch'

 

What sort of setup are you using? 

Link to comment
Share on other sites

I get repeatable horrible FPS in certain rooms. The main culprits are the Orokin room with the white tree in the middle and the 4 staircases with the colored pad "puzzle" that opens 1-2 treasure rooms, and the Orokin Derelict room that starts as a long corridor but is broken in the middle, with a path going down with a lot of tentacles around it. I get 50-60 FPS pretty solidly until I get to those rooms, then around 10 FPS in those rooms, alone or with enemies.

 

And if WF has been running for a couple of hours I get multi-second complete freezes in multiplayer as host, which drops everyone else. I've never had it happen soon after a restart of the game, so I make a point of restarting the game before doing more involved runs.

Edited by KriLL3
Link to comment
Share on other sites

I get repeatable horrible FPS in certain rooms. The main culprits are the Orokin room with the white tree in the middle and the 4 staircases with the colored pad "puzzle" that opens 1-2 treasure rooms, and the Orokin Derelict room that starts as a long corridor but is broken in the middle, with a path going down with a lot of tentacles around it. I get 50-60 FPS pretty solidly until I get to those rooms, then around 10 FPS in those rooms, alone or with enemies.

 

And if WF has been running for a couple of hours I get multi-second complete freezes in multiplayer as host, which drops everyone else. I've never had it happen soon after a restart of the game, so I make a point of restarting the game before doing more involved runs.

 

Have you got runtime tessellation enabled?

Link to comment
Share on other sites

I experienced something similar back before Nova's Mprime was optimized so that it didn't do that big 'crunch'

 

What sort of setup are you using? 

 

i5-3570k @ 5.13GHz

 

16G DDR3 Ram

 

64G SSD

 

GTX670 2GB

 

Windows 8

 

Those lags happen rarely and randomly...

 

 

EDIT: I run @ 1080p with everything maxed.

Edited by ChinaTercel
Link to comment
Share on other sites

I've seen similar events to ChinaTercel.  It seems to happen most often running ODD though I've also seen it in T3S and a couple of missions.  Usually the frame drop lasts for 3-5 seconds, more often on the lower end.  They seem random though they happen more often when there are large numbers of enemies on the screen.

 

My system: 

i5-3570k @ 4.2GHz

 

8GB DDR3-1333 RAM

 

64 GB SSD

 

Radeon HD 7870 2GB @ 1100mhz Core & 1250mhz memory clocks.

 

Windows 7

 

1080p everything maxed as well. 

Link to comment
Share on other sites

If we uploaded the whole library at once this would add a brutal amount of time to the loading screen: on my workstation this added a full minute to the loading-screen when running DirectX 9 or made my game run at 10 FPS for 2 minutes when running DirectX 11 (while shaders were translated in the background).

 

Why not make an option for people who are having this problem to load it all at once? Some people wouldn't mind a longer load time if it meant less hitching. It would also help the people affected to see if that is actually the cause.

Link to comment
Share on other sites

Why not make an option for people who are having this problem to load it all at once? Some people wouldn't mind a longer load time if it meant less hitching. It would also help the people affected to see if that is actually the cause.

I imagine one reason is that then anyone else in the game with them would have to wait while they load for three days. And the extra shaders might use up more GPU memory, I don't know how that works.

Link to comment
Share on other sites

 

Why not make an option for people who are having this problem to load it all at once? Some people wouldn't mind a longer load time if it meant less hitching. It would also help the people affected to see if that is actually the cause.

I think the problem here would be that if you go into an online match, everyone has to sit on a loading screen for the same amount of time. If one person has this enabled and the other three don't it could quickly become a point of contention among the community.

Link to comment
Share on other sites

Thanks for the post, it's interesting to read about the methodology of bug/hitch hunting from someone who's probably never going to run out of optimisation issues to solve.

 

Do you know off the top of your head what the specs of your highest low-end testing system are?

Link to comment
Share on other sites

I get this problem as well. My setup is kinda old though:

 

AMD Phenom 2 X4 965 (3.4 GHz quad core)

Dual Nvidia GTX 470 (running in SLI mode, or it should be in SLI but every so often I find SLI disabled)

6 GB RAM

Windows 7 64-bit

Nvidia nForce 980a SLI Motherboard

 

Wait....

 

Just noticed the Nvidia control panel won't open and Device Manager says there's no driver for a Coprocessor, which Google says refers to a processor on the motherboard so I apparently need to update the chipset drivers.  

 

Downloading and installing drivers now.

 

EDIT: 

 

Ok, apparently after I originally bought the Nvidia motherboard and assembled my computer, Nvidia got forced out of the motherboard making business and there are no drivers on their website for that board.  Should I even try random other websites who claim to have the driver?  

 

I mean, bundled rootkits and all.

 

Edit^2:  I'm an idiot.  Legacy Drivers.

Edited by KnotOfMetal
Link to comment
Share on other sites

Question: The game has like 15,000 possible shaders, but only uses 500(on average)...

 

Why not pre-calculate them, and load the highest-probability ones during the loading screen, and then let it go?

 

That is, certain shaders obviously work with certain rooms/features. If the game decides the first 5 rooms generated will include enemies 1, 2, 3, then pre-load the shaders for those enemies, the first 5 rooms, and the tenno we're bringing in.

 

Then, have the game decide probabilistically what else to stream up as the players are going... what rooms can connect to the ones at the end of the currently generated map, what enemies could spawn there, and load for that, while the game is still chugging along. Obviously, it's possible for something unexpected to happen, which would then cause the current issues seen, to some degree, but for the most part, it seems entirely plausible to predict what's most likely to be ahead, and prepare for it silently.

 

This would make more sense than the current chunky situation. It might cause a slight performance drop early in the map(before the game's done figuring out what's possible), but instead of facing the large "hitches" at critical points, it would just be one long, very slight slow down, during the least-important section of the game(the first rooms are always empty anyway, until someone patrols in or an alarm goes off). I think a mild performance loss while walking empty rooms, done that way, would be superior to the "I need to flying leap over these grineer and now I'm freezing in mid-air" that happens now.
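
To make the suggestion concrete, here is a rough sketch of what "load the most likely shaders first, a little at a time" could look like. It is purely illustrative: the candidate names, probabilities, costs, and the per-frame budget are all made up, and nothing here reflects how the engine actually schedules uploads.

```python
import heapq


def plan_background_uploads(candidates, per_frame_budget_ms):
    """Group shader uploads into per-frame batches, most likely shaders first.

    `candidates` is a list of (name, probability, cost_ms) tuples. Each batch
    stays within the budget, except that a single shader costing more than
    the budget gets a frame to itself.
    """
    # heapq is a min-heap, so negate the probability to pop the likeliest first.
    heap = [(-probability, name, cost) for name, probability, cost in candidates]
    heapq.heapify(heap)

    frames, current, used = [], [], 0.0
    while heap:
        _, name, cost = heapq.heappop(heap)
        if current and used + cost > per_frame_budget_ms:
            frames.append(current)
            current, used = [], 0.0
        current.append(name)
        used += cost
    if current:
        frames.append(current)
    return frames


# Hypothetical candidates for the next few rooms.
candidates = [
    ("grineer_lancer", 0.90, 2.0),
    ("door_frame", 0.80, 1.0),
    ("corridor_wall", 0.95, 2.0),
    ("rare_boss", 0.05, 3.0),
]

for frame, batch in enumerate(plan_background_uploads(candidates, per_frame_budget_ms=3.0)):
    print(f"frame {frame}: {batch}")
```

Where the driver translates synchronously, each batch still costs frame time; the budget only spreads that cost out into the "one long, very slight slow down" described above instead of a single large stall.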

Link to comment
Share on other sites

Very interesting read about some of the under the hood processes involved with how the game renders stuff. I wish I could help gather evidence when you'll explain how, but sadly I've never experienced these hitches so I'm not sure if I'd be able to gather any worthwhile reports or such.

 

I was wondering though, if/when users experience performance issues (hitches, etc), does it reflect in the EE.log? Or is the lack of such a thing the reason why you're working on adding diagnostics?

 

 

lol you hardly play man.

Link to comment
Share on other sites

Did anybody discuss the pros/cons of not removing some of these things from memory in the first place?  Be it textures from RAM, shaders from the GPU, or whatever; memory is a lot cheaper than processing power, and is fairly rarely a computer's choke point (and if it is you probably shouldn't be playing on said settings anyway).

Link to comment
Share on other sites

lol you hardly play man.

The hundreds of hours I've spent in game seem to disagree with your statement (though over the past few weeks I've been busy with final projects & papers for university, as well as preoccupied with Dark Souls 2). I've been very lucky in that I've never really been one of the users who're affected by performance issues. There have been some instances in the past (i.e. PhysX performance issues), but my experience with the game has been more or less completely smooth and hitch free on both my desktop and laptop.

Link to comment
Share on other sites

This topic is now closed to further replies.