NVIDIA DLSS 5 Takes 2D Frame and Motion Vectors as Input
Below, you can see the complete Q&A exchange between Daniel Owen and Jacob Freeman, which was nicely compiled by VideoCardz. Numbers represent questions from Daniel Owen, while answers are provided by NVIDIA’s Jacob Freeman.
1: My biggest question is whether or not this model is effectively taking a single 2D frame as an input (with motion vectors) to create the output frame? This then relates to some clarifying questions about the information you provided, which I ask below:
Answer: Yes, DLSS 5 takes a 2D frame plus motion vectors as input.

2: You say the underlying geometry and textures are unchanged. Is the model actually aware of things like 3D geometry, 3D depth, etc., or is it essentially just looking at a 2D image and then reinterpreting it, similar to taking a screenshot of a game and asking an AI model to enhance the screenshot to make it more realistic?
Answer: DLSS 5 is trained end to end to understand complex scene semantics such as characters, hair, fabric and translucent skin, along with environmental lighting conditions like front-lit, back-lit or overcast, all by analyzing a single frame.

3: You say that “With DLSS 5 the underlying geometry and textures are unchanged.” However, in one of the publicly available examples, it looks like a character grew hair in a place where it didn’t exist before. This seems to be a change to the model geometry and/or textures rather than just an enhancement of the lighting. In other words, it seems like an AI image generator reinterpreting the character, which can cause actual details to be changed. It also appears that some characters gain makeup the original model did not seem to be wearing, which would be a change to the underlying texture.
Answer: The underlying geometry is unchanged. It is also worth mentioning that this is a very early preview of the tech.

4: You say it enhances PBR properties on materials. Is it aware of the PBR properties by reading them from the game engine in some way, or is it just “looking” at the output image and inferring them? In other words, it is my understanding that game artists traditionally tell a shader what specific parts of a 3D model look like by giving it properties like metallic, roughness, normal maps (to simulate “bumpiness”), etc. Is DLSS 5 taking those artist-created values as inputs, or is it inferring them from something more like the final output image?
Answer: DLSS 5 only takes the rendered frame and motion vectors as inputs. Materials are inferred from the rendered frame.

5: When you describe the tools developers have to control the model, it does not sound like there is actually much control over the DLSS 5 model’s interpretation of the artistic intent of a scene beyond color grading (or turning DLSS 5 “off” or “down” on some or all elements of the scene). For example, in the opening scene of Resident Evil Requiem, Grace is a character with a lot of trauma, who is reluctantly on her way to investigate a murder in the same location where her mother was murdered several years earlier. With DLSS 5 off, I get more of that impression of the character. With DLSS 5 on, while she does seem to have more photorealistic lighting and skin, it also looks like she is wearing makeup that was not present in the original version. This gives a very different impression of the character and of the emotional impact of the scene. Can developers actually control anything about the output image besides color grading? If they see the DLSS 5 output and want to tweak it, such as asking it not to apply makeup to the character, are they able to do so?
Answer: Developers will have detailed controls such as intensity and color grading. Artists can use these controls to adjust blending, contrast, saturation and gamma, and determine where and how enhancements are applied to maintain the game’s unique aesthetic. Developers can also mask specific objects or areas to be excluded from enhancement. We continue to talk to developers to understand all the ways they would like to control the technology. Ultimately, we see DLSS 5 as a tool for them to achieve their artistic vision, rather than be limited by the capabilities of traditional real-time rendering.

6: Is this model limited to screen space? Or does it have any awareness of the environment outside of what is visible in an individual game frame?
Answer: DLSS 5 only takes the rendered frame and motion vectors as inputs.

7: With the introduction of DirectML to DirectX allowing for hardware-agnostic acceleration of ML tasks in games, does NVIDIA plan to continue with vendor-locked versions of its machine-learning-based game technology like DLSS 5? Or will NVIDIA allow these models to run through DirectML on any hardware?
Answer: We have nothing to announce on this one at this time.
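Taken together, the answers describe a simple contract: the model sees only a rendered 2D frame plus per-pixel motion vectors, and developers steer the result through intensity/grading controls and exclusion masks rather than by feeding in engine data. The sketch below illustrates that contract in Python. All names here (`FrameInputs`, `DeveloperControls`, `composite`) are hypothetical illustrations of the concepts in the Q&A, not NVIDIA's actual API, and the blend math is a deliberately toy stand-in for whatever DLSS 5 really does.

```python
from dataclasses import dataclass

@dataclass
class FrameInputs:
    """Per the Q&A, the model's only inputs (hypothetical container)."""
    frame: list           # H x W pixels, each an (r, g, b) tuple in [0, 1]
    motion_vectors: list  # H x W (dx, dy) screen-space offsets

@dataclass
class DeveloperControls:
    """Illustrative stand-ins for the controls NVIDIA describes."""
    intensity: float = 1.0        # 0 = original frame, 1 = full enhancement
    gamma: float = 1.0            # toy gamma applied to the enhanced pixel
    exclusion_mask: list = None   # H x W booleans; True = keep original pixel

def composite(original, enhanced, controls):
    """Blend an enhanced frame over the original under developer controls.

    Masked pixels are excluded from enhancement entirely; unmasked pixels
    are a linear blend weighted by `intensity`, with a simple gamma tweak.
    """
    h, w = len(original), len(original[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            if controls.exclusion_mask and controls.exclusion_mask[y][x]:
                row.append(original[y][x])  # masked: untouched by enhancement
                continue
            blended = tuple(
                (1 - controls.intensity) * o
                + controls.intensity * (e ** controls.gamma)
                for o, e in zip(original[y][x], enhanced[y][x])
            )
            row.append(blended)
        out.append(row)
    return out
```

For example, masking the top-left pixel of a 2x2 frame and setting `intensity=0.5` leaves that pixel identical to the original while every other pixel moves halfway toward the enhanced value, which is the kind of per-object opt-out and blending control the answer to question 5 describes.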