
Want to use DirectX 12 [DX12] for GW2? Here's a guide on using the D912PXY on Windows 10


Recommended Posts

  • 3 weeks later...

Hi guys. Not sure where I should drop my question, so I'll try here (where else?). I'm running d912pxy + some other addons through GW2 Addon Manager, as well as GShade. All of this works perfectly.

My problem is that I'm unable to game capture or window capture GW2.exe with OBS Studio. The capture screen is blank, then sometimes OBS crashes. The only way is screen capture, which is annoying. Can anyone successfully capture GW2.exe with d912pxy + GShade through OBS?


@Quadeard.8457 said: I'm unable to game capture or window capture GW2.exe with OBS Studio. Can anyone successfully capture GW2.exe with d912pxy + GShade through OBS?

Hi, that would be an issue with OBS specifically and how it hooks into what's being rendered. I'd jump on the d912pxy Discord channel and ask there. Otherwise, ask the people who author the OBS software. Good luck!


@"Quadeard.8457" said: Can anyone successfully capture GW2.exe with d912pxy + GShade through OBS?

OBS has several known conflicts; the one you are experiencing might just be another of them.

https://obsproject.com/wiki/Known-Conflicts


@Ashantara.8731 said:

@"Mack.3045" said: I'd jump on the ReShade Discord channel and check there

So you are telling me that everyone on here who is using ReShade is doing so without a "Skip UI" option? :o The text labels are so blurry, it's unbearable to the eye.

This is a bit late, but I made my own UIMask .png file and replaced the blank one inside the "reshade-shaders/textures" folder. You can use a program like GIMP (GNU Image Manipulation Program; these forums block the acronym :s ): import a screenshot from the game, paint the areas used by your static UI elements white, and paint the rest of the image black (I may have the colours reversed). It was tricky, but with some GIMP tutorials I found online I got it done.

Then, while in game, you place "UIMask-Top" above the active effects you don't want affecting your UI, and "UIMask-Bottom" after said effects.

https://github.com/crosire/reshade-shaders/blob/master/Shaders/UIMask.fx
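If you'd rather script the mask than paint it by hand, here is a minimal sketch of the same idea. The box coordinates below are hypothetical examples (you'd measure your own UI regions from a screenshot), and it writes a plain binary PGM file, which GIMP can open and export as the UIMask PNG:

```python
# Sketch: generate a UI mask as a binary PGM (P5) image.
# The coordinates below are hypothetical examples -- measure your own
# UI regions from an in-game screenshot first.

WIDTH, HEIGHT = 1920, 1080

# (left, top, right, bottom) boxes covering static UI elements
UI_BOXES = [
    (0, 930, 1920, 1080),   # e.g. skill bar / chat strip along the bottom
    (1620, 0, 1920, 300),   # e.g. minimap corner
]

def build_mask(width, height, boxes):
    """Return one byte per pixel: 255 (white) inside a UI box, 0 (black) outside."""
    mask = bytearray(width * height)  # all black by default
    for left, top, right, bottom in boxes:
        for y in range(top, bottom):
            row = y * width
            for x in range(left, right):
                mask[row + x] = 255
    return bytes(mask)

def write_pgm(path, width, height, pixels):
    """Write a binary PGM; GIMP opens these and can export them to PNG."""
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, height))
        f.write(pixels)

if __name__ == "__main__":
    write_pgm("UIMask.pgm", WIDTH, HEIGHT, build_mask(WIDTH, HEIGHT, UI_BOXES))
```

PGM is used here only because it can be written without any imaging library; GIMP converts it to PNG in one export step.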


@Quadeard.8457 said: Can anyone successfully capture GW2.exe with d912pxy + GShade through OBS?

Hello guys, I solved my problem! OBS uses the BitBlt capture method for game capture and the default window capture. This method doesn't handle hardware-accelerated apps; the same thing happens when you try to capture Chromium or Discord, for example. You have to capture the game with window capture and select the Windows Graphics Capture (WGC) method. The downside is that WGC adds a yellow border to your game window (though not to the stream) and always records the cursor position. This is by design in Windows, for security reasons.


  • 2 weeks later...

@Polish Hammer.6820 said: So I had this installed before, but it was an old version and I want to update to the newest one. However, the zip file I download doesn't include an installer. How do I install the latest release? Windows Defender didn't remove it either; it just isn't included.

Hello, I just downloaded a fresh zip of 2.4 and unpacked it. The installer file is there. Not sure what is happening on your end!


@Mack.3045 said: I just downloaded a fresh zip of 2.4 and unpacked it. The installer file is there.

I don't see the install.exe file in the new 2.4 build either, even before unpacking it (so it's definitely not deleted by Defender):

Edit: NVM - by mistake, I downloaded the source code instead of the release... :#


  • 2 weeks later...

@"alcopaul.2156" said: So the client hooks up to DX9, which hooks up to an interface that hooks up to DX12 libraries, and from there you do the reverse just for the client to display graphics?

DX9 is backwards compatible, and it is assumed that old software runs fast on new software, right?

Hi :)

Good question!

I'll give an in-depth technical explanation to answer your question and explain the mechanisms used to go from DX9 > translation layer > DX12 render pipeline when using d912pxy with GW2.

D912pxy and the D3D12 translation layer

What does this do?
This translation layer provides the following high-level constructs or components (and more) for use in the GW2 rendering pipeline.

Resource binding
The D3D12 resource binding model is quite different from D3D9 and prior. Rather than having a flat array of resources set on the pipeline which map 1:1 with shader registers, D3D12 takes a more flexible approach which is also closer to modern hardware. The translation layer takes care of figuring out which registers a shader needs, managing root signatures, populating descriptor heaps/tables, and setting up null descriptors for unbound resources.
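As a toy model of that difference (invented names, nothing like the layer's real data structures): D3D9-style binding is a flat slot array mapping 1:1 to registers, while a D3D12-style table packs only the registers the shader declares and fills unbound ones with null descriptors:

```python
# Toy model: flat D3D9-style slots vs. a D3D12-style descriptor table
# built from only the registers a shader declares. All names invented
# for illustration.

def build_descriptor_table(shader_registers, bound_resources, null_descriptor="NULL_DESC"):
    """Pack just the registers the shader needs; unbound slots get a null descriptor."""
    return {
        reg: bound_resources.get(reg, null_descriptor)
        for reg in shader_registers
    }

# D3D9 style: a flat array where slot i maps 1:1 to shader register i
flat_slots = {0: "diffuse_tex", 1: "normal_tex", 7: "shadow_map"}

# This shader declares registers 0, 3, and 7; register 3 is unbound.
table = build_descriptor_table([0, 3, 7], flat_slots)
```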

Resource renaming
D3D9 and older have a concept of DISCARD CPU access patterns, where the CPU populates a resource, instructs the GPU to read from it, and then immediately populates new contents without waiting for the GPU to read the old ones. This pattern is typically implemented via a pattern called "renaming", where new memory is allocated during the DISCARD operation, and all future references to that resource in the API will point to the new memory rather than the old. The translation layer provides a separation of a resource from its "identity," which enables cheap swapping of the underlying memory of a resource for that of another one without having to recreate views or rebind them. It also provides easy access to rename operations (allocate new memory with the same properties as the current, and swap their identities).
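A toy sketch of the rename idea (purely illustrative, not d912pxy's actual code): the resource's identity stays stable while its backing memory is swapped, so existing views and bindings keep working:

```python
# Toy model of DISCARD renaming: a resource keeps a stable identity
# while its backing memory is swapped out, so views/bindings need not
# be recreated. Purely illustrative.

class Resource:
    def __init__(self, size):
        self.identity = id(self)        # stable handle the API hands out
        self.memory = bytearray(size)   # backing allocation

    def discard_rename(self):
        """Allocate fresh memory with the same properties; the old memory
        must stay alive until the GPU is done reading it (not modeled here)."""
        old = self.memory
        self.memory = bytearray(len(old))  # new allocation, same size
        return old                         # caller defers freeing this

r = Resource(16)
r.memory[:4] = b"AAAA"
handle = r.identity
old_mem = r.discard_rename()   # CPU can write new contents immediately
r.memory[:4] = b"BBBB"         # GPU may still be reading old_mem
```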

Resource sub-allocation, pooling, and deferred destruction
D3D9-style apps can destroy objects immediately after instructing the GPU to do something with them. D3D12 requires applications to hold on to memory and GPU objects until the GPU has finished accessing them. Additionally, D3D9 apps suffer no penalty from allocating small resources (e.g. 16-byte buffers), where D3D12 apps must recognize that such small allocations are infeasible and should be sub-allocated from larger resources. Furthermore, constantly creating and destroying resources is a common pattern in D3D9, but in D3D12 this can quickly become expensive. The translation layer handles all of these abstractions seamlessly.
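Deferred destruction is often implemented with a fence-tagged queue; here is an illustrative sketch (names invented) of how objects the app has "destroyed" are only released once the GPU's completed fence value passes theirs:

```python
# Toy model of deferred destruction: objects are queued with the fence
# value of the last GPU work that uses them, and only freed once the
# GPU has signalled past that point. Illustrative only.

from collections import deque

class DeferredDeleter:
    def __init__(self):
        self.pending = deque()  # (fence_value, object) pairs, in submit order

    def defer(self, obj, fence_value):
        """App 'destroys' obj, but we hold it until the GPU passes fence_value."""
        self.pending.append((fence_value, obj))

    def collect(self, completed_fence):
        """Release everything the GPU is provably done with."""
        freed = []
        while self.pending and self.pending[0][0] <= completed_fence:
            freed.append(self.pending.popleft()[1])
        return freed

d = DeferredDeleter()
d.defer("small_buffer_a", fence_value=3)
d.defer("small_buffer_b", fence_value=5)
released = d.collect(completed_fence=4)   # GPU has finished up to fence 4
```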

Batching and threading
Since D3D9 patterns generally require applications to record all graphics commands on a single thread, there are often other CPU cores that are idle. To improve utilization, the translation layer provides a batching layer which can sit on top of the immediate context, moving the majority of work to a second thread so it can be parallelized. It also provides threadpool-based helpers for offloading PSO compilation to worker threads (d912pxy 2.4.1 uses a configurable PSO cache size plus native DX12 caching at the driver level). Combining these means that compilations can be kicked off at draw time on the application thread, and only the batching thread needs to wait for them to be completed. Meanwhile, other PSO compilations are starting or completing, minimizing the wall-clock time spent compiling shaders.
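The batching idea can be sketched as a simple producer/consumer pair: the application thread records commands into a queue while a worker thread replays them, standing in for submission to the immediate context (illustrative only):

```python
# Toy model of the batching layer: the app thread records commands into
# a queue, and a worker thread replays them against the "immediate
# context", freeing the app thread to keep recording. Illustrative only.

import queue
import threading

def batching_worker(commands, executed, done):
    while True:
        cmd = commands.get()
        if cmd is None:          # sentinel: no more commands this frame
            done.set()
            return
        executed.append(cmd)     # stand-in for submitting to the GPU

commands = queue.Queue()
executed = []
done = threading.Event()
threading.Thread(
    target=batching_worker, args=(commands, executed, done), daemon=True
).start()

# "App thread": record draws without waiting on submission
for cmd in ["set_pso", "draw(mesh_a)", "draw(mesh_b)"]:
    commands.put(cmd)
commands.put(None)
done.wait(timeout=5)
```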

Residency management (memory management)
This layer incorporates the open-source residency management library to improve utilization on low-memory systems.

The other component here to consider is user hardware.

If someone is already CPU/GPU bound in DX9 natively with GW2, then I wouldn't expect d912pxy to help. The end user also needs a GPU capable of rendering with the DX12 API.

If there are untapped CPU/RAM/GPU/VRAM resources available, then definitely expect an uplift in performance. Ultimately, try it for yourself and see :)


@Mack.3045 said: (quoting the in-depth DX9 > translation layer > DX12 explanation above)

I like the part where it goes multicore when rendering.

And by your description, there's a lot going on with DX12, and you say the interface/translation layer handles it all meticulously.

But say the client goes DX9 -> interface/translation layer -> DX12; will the graphics rendering then flow from DX12 to the interface/translation layer and finally to the GW2 client?


@alcopaul.2156 said: But say the client goes DX9 -> interface/translation layer -> DX12; will the graphics rendering then flow from DX12 to the interface/translation layer and finally to the GW2 client?

No, the rendering pipeline is: GW2 client > game .dat > DX9 > d912pxy > DX12 API > your monitor :)


@Mack.3045 said: No, the rendering pipeline is: GW2 client > game .dat > DX9 > d912pxy > DX12 API > your monitor :)

That's pretty much the flow I wanted to see. ;)


  • 1 month later...

@alcopaul.2156 said: I like the part that it goes multicore when rendering.

It doesn't. Whoever says that is simply lying.

My CPU utilisation looks exactly the same with and without D912PXY.

D912PXY is a wrapper that adds a local (hard drive / SSD) shader cache to the game. This is where all the performance comes from. The game normally compiles shaders at runtime, which causes a lot of performance issues, as it puts even more load on the already heavily utilized main thread.

The author calls this DX12 because a shader cache is part of DX12, which means nothing, as older games have used such caches with any API. Nothing would have stopped you from programming a shader cache for your DX9 or DX11 game.

The problem with shader caches is their inflexibility: each time you install a new graphics card, or even a new graphics driver, the cache can cause issues and has to be rebuilt to maintain stability and functionality. That's why compiling shaders at runtime can be a good idea; the benefits can actually outweigh the drawbacks.
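The invalidation point can be sketched as follows (an illustrative model, not d912pxy's actual cache code): if compiled blobs are keyed by shader source plus driver identity, a driver update simply misses the cache and forces recompilation:

```python
# Toy model of a disk shader cache and why it must be rebuilt: compiled
# blobs are keyed by a hash of the shader source *and* the GPU/driver
# identity, so a driver update misses the cache and recompiles.
# Illustrative only; names are invented.

import hashlib

class ShaderCache:
    def __init__(self, driver_id):
        self.driver_id = driver_id
        self.blobs = {}

    def _key(self, source):
        return hashlib.sha256((self.driver_id + source).encode()).hexdigest()

    def get_or_compile(self, source, compile_fn):
        key = self._key(source)
        if key not in self.blobs:              # cache miss: pay compile cost
            self.blobs[key] = compile_fn(source)
        return self.blobs[key]

compiles = []                                  # record every real compile
def slow_compile(src):
    compiles.append(src)
    return "blob(" + src + ")"

cache = ShaderCache(driver_id="driver-531.0")
cache.get_or_compile("ps_main", slow_compile)  # compiles
cache.get_or_compile("ps_main", slow_compile)  # cache hit, no compile

new_driver = ShaderCache(driver_id="driver-532.0")
new_driver.get_or_compile("ps_main", slow_compile)  # recompiled after update
```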

D912PXY specifically lowers your GPU performance, as it adds another layer to the rendering pipeline. Most users won't notice that, as they are CPU bound in this game, but it is worth mentioning.


@KrHome.1920 said: D912PXY is a wrapper that adds a local (hard drive / SSD) shader cache to the game. This is where all the performance comes from.

Please stop spreading misinformation in this thread. I suggest you become better educated on the matter. Quite simply, your subjective experience with d912pxy is not quantifiable in a way that lets you make blanket statements representing others' experience. The asynchronous shader component is simply one aspect of d912pxy. If you want to discuss this further, I suggest you speak with Megai on the d912pxy Discord channel. I hope you are familiar with C++ programming.

Regards.

