Experiences with D912PXY — Guild Wars 2 Forums

Experiences with D912PXY

KrHome.1920 Member ✭✭✭✭
edited April 6, 2020 in Account & Technical Support

In basically every single performance thread in this subforum someone shows up and recommends this wrapper to improve the CPU performance of Guild Wars 2.

Personally I have always stated that a wrapper can't improve performance by definition, as it puts another abstraction layer on top of the existing layers. A wrapper is usually only used to improve compatibility, e.g. when you want to run a DX9 game on Linux, you need a DX9 to Vulkan/OpenGL wrapper as Linux does not support DX9. This costs performance. But some performance cost may still be better than not being able to run the game at all on the target OS.

I bought a new 8-core / 16-thread CPU a few days ago and wanted to give D912PXY a try, since GW2 does not even utilize 40% of my new CPU. My expectations were low, given that this is only a wrapper.

Long story short: The wrapper is a double-edged sword and I would never recommend using it in general.

Results:

  • Heavy draw call limit, e.g. Lion's Arch, Heart of the Mists, or any other place with lots of players but no combat on screen: huge performance gains of at least 33%
  • Heavy CPU calculation limit, e.g. WvW blob fights where the CPU has to do a lot of game logic: no performance effect at all, as draw calls are not the limiting factor in these scenarios
  • GPU limit, e.g. solo exploring the Path of Fire maps at 4K resolution, or with the in-game supersampling option enabled on a midrange GPU: huge performance loss of at least 20%
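
One way to read these three results is with a toy frame-time model in which each frame costs the larger of the CPU time (game logic plus draw call submission) and the GPU time. The sketch below is only an illustration under that assumption: the function names, all millisecond figures, and the two scaling factors are invented placeholders, not measurements of d912pxy.

```python
def fps(logic_ms, submit_ms, gpu_ms):
    """Toy model: a frame takes as long as the slower of CPU and GPU."""
    cpu_ms = logic_ms + submit_ms
    return 1000.0 / max(cpu_ms, gpu_ms)


def with_wrapper(logic_ms, submit_ms, gpu_ms,
                 submit_scale=0.4, gpu_scale=1.1):
    """Hypothetical wrapper effect: cheaper draw call submission,
    slightly more GPU work. Both factors are illustrative guesses."""
    return fps(logic_ms, submit_ms * submit_scale, gpu_ms * gpu_scale)


# (game logic ms, draw call submission ms, GPU ms), all invented numbers
scenarios = {
    "draw call bound (city hub)": (4.0, 16.0, 8.0),
    "logic bound (blob fight)": (30.0, 2.0, 8.0),
    "GPU bound (4K solo)": (3.0, 5.0, 20.0),
}

for name, (logic, submit, gpu) in scenarios.items():
    before = fps(logic, submit, gpu)
    after = with_wrapper(logic, submit, gpu)
    print(f"{name}: {before:.0f} -> {after:.0f} fps")
```

In this model the wrapper only helps while draw call submission dominates the frame; once game logic or the GPU is the longest stage, cheaper submission changes little, and any extra GPU cost shows up directly as lost frames.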

The last two points are what I expected. If you want to maximize your framerate in Lion's Arch, then you can use this wrapper. But keep in mind that you will lose a significant amount of performance every time you have a GPU limit.

Comments

  • TinkTinkPOOF.9201 Member ✭✭✭✭

    I have seen some people on here with lower end PCs claim large gains, though I have not seen any hard proof.

    Testing on my old build (9900K @ 5.1 GHz, 980 Ti), I saw FPS cut in half at best: at WvW spawn I got 160-180 fps before and 60-80 fps with the wrapper. That was not the worst part, though. The worst part was the stuttering, which can happen without showing up in a simple FPS counter. Combat was unplayable: FPS didn't move much, but the stuttering got worse. I also removed the CPU and GPU overclocks after seeing this, just in case they were not stable with the wrapper; no change.
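
To illustrate why stutter can hide behind an FPS counter, here is a minimal sketch that compares the average frame rate of a capture with its "1% low" (the frame rate implied by the slowest 1% of frames). The capture is invented and the helper names are hypothetical; real frame times would come from a frame-time logging tool.

```python
def average_fps(frame_times_ms):
    """Average FPS over the capture: frames divided by total seconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms):
    """FPS implied by the mean of the slowest 1% of frames (at least one)."""
    slowest = sorted(frame_times_ms, reverse=True)
    count = max(1, len(slowest) // 100)
    worst = slowest[:count]
    return 1000.0 * len(worst) / sum(worst)


# Invented capture: 198 smooth frames at 10 ms plus two 100 ms hitches.
capture = [10.0] * 198 + [100.0] * 2

print(f"average: {average_fps(capture):.1f} fps")
print(f"1% low:  {one_percent_low_fps(capture):.1f} fps")
```

With this made-up capture the average stays above 90 fps while the 1% low drops to 10 fps, which is the kind of gap a plain FPS counter does not show.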

    "When you power creep the game and make it so that spam gameplay is nearly as effective as deep knowledge and nuance, the quality of players will decrease." -Exedore

  • Infusion.7149 Member ✭✭✭✭
    edited April 6, 2020

    I don't use d912pxy because it's for GW2 only (I use dxvk), but if you have stuttering it is highly likely you have not built a shader cache or you're hitting a RAM limit with your CPU. While a shader cache is being built there will be stuttering regardless of whether you use d912pxy or dxvk.

    If you read the actual GitHub page you would know this:
    https://github.com/megai2/d912pxy

    Pop-ins/slow loading/missing objects

    Troubleshooting

    This is normal for clean/first install as tool generates shader cache from ground up. After shader cache is generated, load times will be much faster. If you want to eleminate this problem once and for all, use PSO precompile and/or ready-to-use shader packs

    https://github.com/megai2/d912pxy/wiki/Reporting-performance-issues

    Check that you do not hit VRAM or RAM limits. If you hit it, do not report anything about performance.

    set nv_disable_throttle value to 1 in config file if you have NVidia GPU

    Be sure that you properly configured vertical sync/variable refresh rate behaivour/triple buffering both in game settings and in driver settings. This options define frame synchronization, making different frame limiting situations. All "off" should yield uncapped fps.

    https://github.com/megai2/d912pxy/wiki/HLSL-recompilation-and-loading

    Shader recompilation creates huge lag if implemented as is.
    Adding fact that DX12 uses monolitic PSO, which includes all shaders, there are even more objects to load, in difference to DX9.
    This makes DX9-style immediate shader load and even draw command execution too time consuming.
    d912pxy solves this problem by asynchronous shader recompilation and loading.
    This means that draw calls with newly loaded shader are working properly in DX9, while d912pxy skips them until they are loaded.

    TL;DR; d912pxy loads shaders asynchrounosly (sic) , because there is no efficient way to load them instantly.

    This can create some visual errors, but results in much better (200% of min. FPS boost) performance and smooth frame rates.
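
The skip-until-loaded behaviour described in the quoted wiki text can be pictured with a small sketch. This is not d912pxy's code; it is a Python toy with hypothetical names (`AsyncShaderCache`, `render_frame`, `slow_compile`) that only shows the general idea of compiling in the background and skipping draws whose shader is not ready yet.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class AsyncShaderCache:
    """Toy stand-in for a cache that compiles shaders in the background."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = {}  # shader id -> Future

    def get_if_ready(self, shader_id):
        """Return the compiled shader, or None while it is still compiling."""
        future = self._pending.get(shader_id)
        if future is None:
            # First sighting: start compiling without blocking the caller.
            self._pending[shader_id] = self._pool.submit(
                self._compile_fn, shader_id)
            return None
        return future.result() if future.done() else None

    def wait_all(self):
        """Block until every compile requested so far has finished."""
        for future in self._pending.values():
            future.result()


def render_frame(cache, draw_calls):
    """Issue draws whose shader is ready; skip the rest for this frame."""
    drawn, skipped = [], []
    for shader_id in draw_calls:
        if cache.get_if_ready(shader_id) is None:
            skipped.append(shader_id)
        else:
            drawn.append(shader_id)
    return drawn, skipped


def slow_compile(shader_id):
    time.sleep(0.05)  # pretend compilation is expensive
    return f"compiled:{shader_id}"


cache = AsyncShaderCache(slow_compile)
first = render_frame(cache, ["water", "armor"])   # nothing compiled yet
cache.wait_all()                                  # let the cache warm up
second = render_frame(cache, ["water", "armor"])  # everything is ready
print(first)
print(second)
```

The first frame skips both draws (the "pop-in" effect) but never waits on the compiler; once the cache is warm, the same draws go through immediately.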

    In addition, there's variance based on AMD / Nvidia drivers, so if you had issues it is likely because d912pxy uses async. Maxwell (900 series) GPUs have much lower asynchronous compute potential as they cannot process things concurrently. In fact it has been disabled in some cases, see: https://www.nvidia.com/en-us/geforce/forums/discover/236511/async-compute-disabled-at-driver-level-for-maxwell/

    For GCN-based GPUs from AMD, however, there's no doubt about it, since asynchronous compute has been present since 2012.

    If you really want an accurate comparison for d912pxy you need to use the flags "perf_graph=1" to generate a performance_graph for dx12 and "use_dx9=1" to generate a performance_graph for dx9.

    For D9VK (rolled into DXVK now) on Windows:

    Also, not every 8-core, 16-thread CPU is built the same. For example, an Intel i9-9900K uses a ring bus, so each core has equal access to the others; the i7-7820X uses a mesh; and the Ryzen 7 3800X / 3700X / 2700X use Infinity Fabric, which is similar to a mesh, but cores in different CCXs have slightly higher latency than on a ring bus while being similar in latency otherwise. Each CPU also has different cache sizes, which have an impact on performance. Unless you specify the CPU, the clock speed, the memory used, and the NVMe SSD / SATA SSD used (which will be a limitation for shaders), just stating "8 core 16 thread" is pointless. Not to mention that if you are talking about GPU "bottlenecking", it helps to specify which GPU you are using along with the driver version.

    If you're really skeptical about draw call limits on DX9 / DX11 you can download 3DMark and use the API overhead test.

    edit: see also https://software.intel.com/en-us/articles/understanding-directx-multithreaded-rendering-performance-by-experiments

  • KrHome.1920 Member ✭✭✭✭
    edited April 6, 2020

    @Infusion.7149 said:
    Also, not every 8-core, 16-thread CPU is built the same. For example, an Intel i9-9900K uses a ring bus, so each core has equal access to the others; the i7-7820X uses a mesh; and the Ryzen 7 3800X / 3700X / 2700X use Infinity Fabric, which is similar to a mesh, but cores in different CCXs have slightly higher latency than on a ring bus while being similar in latency otherwise. Each CPU also has different cache sizes, which have an impact on performance. Unless you specify the CPU, the clock speed, the memory used, and the NVMe SSD / SATA SSD used (which will be a limitation for shaders), just stating "8 core 16 thread" is pointless. Not to mention that if you are talking about GPU "bottlenecking", it helps to specify which GPU you are using along with the driver version.

    I did not post my PC specs as they do not matter. The differences between the architectures are negligible in terms of D912PXY (and you know that, if you have just half as much knowledge about the topic as you pretend). The only important information was that my CPU has tons of resources in addition to the main thread (which puts one CPU core at 100% load and creates the bottleneck), which is the critical point in this whole topic.

    Ryzen 7 3700X
    X570 Chipset
    32GiB DDR4 3200 CL16 Ram
    Radeon RX 590 (overclocked by 15%)

    The latest AGESA update, the latest Win10 x64 version, the latest chipset drivers and the latest GPU drivers are installed.

    The game is running at max details in various resolutions (to create a CPU or GPU limit on demand).

    And you know what: The results on my old Intel 6 core rig are exactly the same. Who would have thought that...

    In addition, there's variance based on AMD / Nvidia drivers, so if you had issues it is likely because d912pxy uses async. Maxwell (900 series) GPUs have much lower asynchronous compute potential as they cannot process things concurrently.

    The draw call improvements of the "real" DX12 are not bound to asynchronous computing. If the wrapper relies on AC to produce its results, then it works differently from DX12 and the name of the wrapper is misleading. I would not be surprised if this is the case, because it is impossible to eliminate the main thread bottleneck (that's what DX12 does) with a wrapper.

  • Well, as a Linux user:
    plain wine: almost unplayable
    wine/mesa with the gallium dx9 state tracker: playable
    wine with dxvk: very smooth except during WvW blob fights.

    And that's it.

  • Rukario.1695 Member ✭✭✭
    edited April 6, 2020

    Being a user of D912Pxy for the last year or so, along with working on ReShade to support it, I can't comprehend how your framerate is getting cut by any amount at any point - unless you had made no changes to your config.ini to match specifications for your GPU. The last few months I've been running at 4k with a GTX1080 and an i7-2700k @ 4.3GHz and I have never experienced this. The configurations, if done manually, will require some technical finesse and quite a lot of trial and error. There are users on the Discord server that can assist you to an extent with this.

    Note that, as another person pointed out, it sounds like you have not yet built a sizable PSO cache. Until the cache exists, its entries have to be generated and saved to your disk so they can simply be recalled later, and that causes performance drops. It significantly increases the amount of data that has to be processed from scratch whenever you encounter new players, who all have different skins, skills, flashes, and effects. It all takes time to render, convert, and save to your HDD.

    Regarding performance, my frame rates at all points during gameplay (WvW, World Bosses, general exploration) are all higher and significantly more stable than without D912Pxy: FPS is increased by roughly 8 at the lowest, and by up to 100 in a non-intensive environment. In my opinion, running standard DX9 GW2 feels like the equivalent of running a game at 30 FPS when it allows native 60.

    I'm not going to tell you to keep using it, but your original post reads as rather biased with a hint of not knowing the configurations; I'm not sure if you're using the non-AVX, AVX, or AVX2 builds which are all separate, nor have you listed which revision of D912Pxy you're using; needless to say that being reluctant to share your own specs doesn't bode well for the sake of this argument, either.

    It doesn't seem like you want to find out what could be wrong, rather it seems you're jumping to the forum out of frustration - and I'll just say that the Developer and other contributors do not visit here very often, if at all. If you want to provide constructive feedback and see if anything can be done to alleviate any issues you're facing, I would recommend joining the Guild Wars 2 Development Community Discord Server.

    https://discord.gg/YatuAm

    This link expires in one day.

    More violence, less violets I say. I'm rich you know, because I watch the ledges.

  • KrHome.1920 Member ✭✭✭✭
    edited April 6, 2020

    @Rukario.1695 said:
    Being a user of D912Pxy for the last year or so, along with working on ReShade to support it, I can't comprehend how your framerate is getting cut by any amount at any point

    If you can't reproduce an fps decrease, you are not GPU-limited, which is not surprising since your CPU is 9 years old and very slow by today's standards. A 2700K combined with a 1080 is CPU-bound pretty much 100% of the time, unless you run the game at 8K resolution.

    DX9 (80 fps)

    D912PXY (68 fps)
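
Taking the two figures above at face value, the relative change can be worked out in terms of both frame rate and frame time. The helper names below are hypothetical; only the 80 and 68 fps values come from the post.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


dx9_fps = 80.0      # figure quoted for DX9
wrapper_fps = 68.0  # figure quoted for D912PXY

fps_change = percent_change(dx9_fps, wrapper_fps)
time_change = percent_change(frame_time_ms(dx9_fps),
                             frame_time_ms(wrapper_fps))

print(f"frame rate: {fps_change:+.1f}%")
print(f"frame time: {frame_time_ms(dx9_fps):.1f} ms -> "
      f"{frame_time_ms(wrapper_fps):.1f} ms ({time_change:+.1f}%)")
```

For this single sample the frame rate is 15% lower with the wrapper, which corresponds to each frame taking about 17.6% longer (12.5 ms versus roughly 14.7 ms).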

  • @KrHome.1920 said:
    Personally I have always stated that a wrapper can't improve performance by definition, as it puts another abstraction layer on top of the existing layers. A wrapper is usually only used to improve compatibility, e.g. when you want to run a DX9 game on Linux, you need a DX9 to Vulkan/OpenGL wrapper as Linux does not support DX9. This costs performance. But some performance cost may still be better than not being able to run the game at all on the target OS.

    While the general idea is that a wrapper creates an overhead, there are certain cases where wrappers can actually improve performance, see here:

  • Mack.3045 Member ✭✭✭

    It's good to get everyone's feedback.
    I've only noticed performance gains, no losses. I'm running an RX 5700 XT.
    I still find D912Pxy superior to running the game natively in DX9 or using the Vulkan wrapper, D9VK.

    Perhaps there's an impact for midrange GPUs, as you've suggested, when GPU bound. I'm sure some tweaking in the D912Pxy settings would help.
    Are you running the game with Vsync off both in-game and in the Nvidia/AMD control panel?

  • DemonSeed.3528 Member ✭✭✭✭

    I am also having luck with d912pxy. I have not tried D9VK yet, but d912pxy is definitely helping compared to running without it. i7-4790K with a GTX 1070. I use it with frame limits.

  • jokke.6239 Member ✭✭✭

    Wow, just wow. I just tried this out.
    I have a rather curious setup. I built a budget PC just before Ryzen, 3-4 years ago. I went with a budget CPU to save money and be able to afford a GTX 1060, and was planning to upgrade to a better CPU later on, but never got around to it. And tbh it has still been pretty sweet in most games, but GW2 is another story.
    So if you have a weak CPU and an okay-ish graphics card like me, this is SO worth it.

    Specs
    Pentium G4560
    GTX 1060 6 GB
    8 GB Ram

    Went from 30-55 fps in 1080p with a lot of settings turned down, to 60 fps (I have a 60 fps cap on because of my 60 Hz screen) in 1440p with ultra settings.
    I also just tested it in 4K Ultra, and I get 30-60 fps (yeah, mostly just 30, but it jumps up and down and does not go below 30, except for a world boss test I did, where it dropped to 27 fps once).
    Thank you for this!

  • Junkpile.7439 Member ✭✭✭✭

    Do any other dx12 users have the problem that the game crashes when you, for example, teleport into a keep with a big zerg in WvW?

    Low quality trolling since launch
    Seafarer's Rest EotM Hero

  • Astyrah.4015 Member ✭✭✭✭

    @Junkpile.7439 said:
    Do any other dx12 users have the problem that the game crashes when you, for example, teleport into a keep with a big zerg in WvW?

    Without the d912pxy logs we can only guess. My advice is to join their Discord or go to their GitHub, reproduce your crash, and send them the logs or make a bug report over there.

    As a quick guess (without knowing your system specs, etc.), I assume you might need to edit the default configuration and allow d912pxy to utilise more of your hardware -- but it's just better to ask the devs directly, as this is more of a mod issue than an issue with GW2 itself.

    Github
    https://github.com/megai2/d912pxy/wiki/Reporting-crashes

    Discord
    https://discord.gg/zqeHCEg