12.1 Overview of This Chapter
The previous chapters described UE's rendering system in terms of the deferred rendering pipeline on PC; in particular, Analyze the Unreal Rendering System (04) - Deferred Rendering Pipeline detailed the flow and steps of the PC deferred pipeline.
This chapter describes UE's rendering pipeline on mobile, and at the end compares the rendering differences between mobile and PC, as well as the special optimizations involved. It mainly covers the following topics:
- The main flow and steps of FMobileSceneRenderer.
- The forward and deferred rendering pipelines on mobile.
- Lighting and shadows on mobile.
- Similarities and differences between mobile and PC, and the special optimization techniques involved.
Note that the UE source code analyzed in this article has been upgraded to 4.27.1; readers following along in the source should keep this in mind.
To enable the mobile rendering pipeline preview in the UE editor on PC, use the following menu:
Once the shaders finish compiling, the mobile preview appears in the UE editor viewport.
12.1.1 Characteristics of Mobile Devices
Compared with desktop PC platforms, mobile devices differ significantly in size, power, hardware performance and many other respects:
- Smaller size. Portability requires the whole device to be lightweight and to fit in a palm or pocket, so it is confined to a very small volume.
- Limited energy and power. Constrained by battery technology, mainstream lithium batteries hold around 10000 mAh, yet the resolution and image quality of mobile devices keep rising. To meet battery-life and heat-dissipation requirements, overall power must be strictly controlled, usually within 5 W (a quick estimate follows this list).
- Limited heat dissipation. PCs can be equipped with cooling fans or even water cooling, while mobile devices have no such active cooling and rely on heat conduction alone. If heat dissipation is inadequate, the CPU and GPU throttle their frequencies and run with very limited performance, to avoid damaging components through overheating.
- Limited hardware performance. The performance of each component of a mobile device (CPU, bandwidth, memory, GPU, etc.) is only around one tenth of its PC counterpart.
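A quick sanity check on those numbers: a 10000 mAh cell at the nominal 3.7 V of lithium batteries stores about 37 Wh, so a sustained 5 W draw yields roughly 37 / 5 ≈ 7.4 hours of heavy use, which is in line with what high-load mobile gaming achieves in practice.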
Performance comparison of a mainstream PC GPU (NVIDIA GV100-400-A1, Titan V) and a mainstream mobile SoC (Samsung Exynos 9 8895) in 2018. Much of the mobile hardware's performance is only about one tenth of the PC's, yet its resolution is close to half of the PC's, which highlights the challenges and dilemmas mobile devices face.
By 2020, the performance of mainstream mobile devices was as follows:
- Special hardware architecture. For example, the CPU and GPU share the same memory (a coupled, or unified, architecture), and the GPU uses a TB(D)R architecture; both are designed to complete as much work as possible at low power.
Comparison of the decoupled hardware architecture of PC devices and the coupled architecture of mobile devices.
In addition, unlike CPUs and GPUs on PC, which purely pursue compute performance, mobile chips are measured by three metrics: Performance, Power and Area, commonly known as PPA (figure below).
Three basic parameters measure mobile devices: Performance, Area and Power. Compute density is Performance relative to Area, and energy efficiency is Performance relative to Power; for both, higher is better.
With the rise of mobile devices, XR devices have become an important branch of them. Today there are XR devices of various sizes, feature sets and application scenarios:
Various forms of XR devices.
With the recent explosion of the Metaverse, Facebook's renaming to Meta, and tech giants such as Apple, Microsoft, NVIDIA and Google accelerating their investment in future-oriented immersive experiences, XR devices, as the carrier and entry point closest to the Metaverse vision, have naturally become a brand-new track with great potential to produce the giants of the future.
12.2 UE Mobile Rendering Features
This section describes the rendering features of UE 4.27 on mobile.
12.2.1 Feature Level
UE supports the following graphics APIs (feature levels) on mobile:
Feature Level | Description |
---|---|
OpenGL ES 3.1 | The default feature level on Android. Specific material parameters can be configured in Project Settings > Platforms > Android Material Quality - ES31. |
Android Vulkan | A high-end renderer available on certain Android devices, supporting the Vulkan 1.2 API. Thanks to its lightweight design, Vulkan is more efficient than OpenGL in most cases. |
Metal 2.0 | The feature level dedicated to iOS devices. Material parameters can be configured in Project Settings > Platforms > iOS Material Quality. |
On current mainstream Android devices, better performance is obtained with Vulkan, because Vulkan's lightweight design allows applications such as UE to optimize more precisely. The following table compares Vulkan and OpenGL:
Vulkan | OpenGL |
---|---|
Object-based state; no global state. | A single global state machine. |
All state concepts are encapsulated in command buffers. | State is bound to a single context. |
Command buffers can be recorded from multiple threads (see the sketch after this table). | Rendering operations can only be performed sequentially. |
GPU memory and synchronization can be managed precisely and explicitly. | GPU memory and synchronization details are usually hidden by the driver. |
The driver performs no runtime error detection; a validation layer is available for developers. | Extensive runtime error detection. |
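To make the multi-threading row concrete, below is a minimal Vulkan sketch (not UE code; the function name and the error-handling-free style are for illustration only) of the pattern that makes this possible: each thread records into its own VkCommandBuffer allocated from its own VkCommandPool, and only the final queue submission is serialized.

```cpp
#include <vulkan/vulkan.h>

// Record commands on a worker thread: one VkCommandPool per thread, no shared mutable state.
void RecordOnWorkerThread(VkDevice Device, uint32_t QueueFamilyIndex, VkCommandBuffer& OutCmd)
{
    VkCommandPoolCreateInfo PoolInfo{VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO};
    PoolInfo.queueFamilyIndex = QueueFamilyIndex;
    VkCommandPool Pool = VK_NULL_HANDLE;
    vkCreateCommandPool(Device, &PoolInfo, nullptr, &Pool);

    VkCommandBufferAllocateInfo AllocInfo{VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO};
    AllocInfo.commandPool        = Pool;
    AllocInfo.level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
    AllocInfo.commandBufferCount = 1;
    vkAllocateCommandBuffers(Device, &AllocInfo, &OutCmd);

    VkCommandBufferBeginInfo BeginInfo{VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
    vkBeginCommandBuffer(OutCmd, &BeginInfo);
    // ... record this thread's draws/dispatches into OutCmd ...
    vkEndCommandBuffer(OutCmd);
    // The recorded command buffers are later gathered and submitted once via vkQueueSubmit.
}
```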
On Windows, the UE editor can also launch OpenGL, Vulkan and Metal preview emulation inside the editor, but the result may differ from what an actual device renders, so this feature cannot be relied upon entirely.
Before enabling Vulkan, some project parameters need to be configured; see the official documentation Android Vulkan Mobile Renderer for details.
In addition, OpenGL support on Windows was removed in earlier UE versions; although the OpenGL emulation option still exists in the UE editor, the underlying rendering actually uses D3D.
12.2.2 Deferred Shading
UE's mobile deferred shading was added only in 4.26. It enables developers to achieve more complex lighting and shadow effects on mobile, such as high-quality reflections, multiple dynamic lights, decals and advanced lighting features.
Top: forward rendering; bottom: deferred rendering.
To enable deferred rendering on mobile, add the r.Mobile.ShadingPath=1 field to DefaultEngine.ini under the project Config directory, then restart the editor.
12.2.3 Ground Truth Ambient Occlusion
Ground Truth Ambient Occlusion (GTAO) is an ambient occlusion technique that approaches ground truth. As a form of shadow compensation, it occludes part of the indirect lighting, producing a good soft-shadow effect.
The effect with GTAO enabled. Note that as the robot approaches the wall, it leaves a gradual soft shadow on it.
To enable GTAO, check the following options:
In addition, GTAO depends on the Mobile HDR option. To enable it on the corresponding target devices, you also need to add the r.Mobile.AmbientOcclusionQuality field in the [Platform]Scalability.ini configuration with a value greater than 0; otherwise GTAO will be disabled.
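For instance, a minimal sketch of such an entry (the file name and quality-group section here are assumptions following the usual scalability-ini conventions; only the cvar itself comes from the text above):

```ini
; [Platform]Scalability.ini, e.g. AndroidScalability.ini (illustrative)
[PostProcessQuality@3]
r.Mobile.AmbientOcclusionQuality=2   ; 0 disables GTAO; values > 0 enable it
```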
It is worth noting that GTAO has performance problems on Mali devices, because their maximum number of Compute Shader threads per group is less than 1024.
12.2.4 Dynamic Lighting and Shadows
The lighting features UE implements on mobile include:
- HDR lighting in linear space.
- Directional lightmaps (taking normals into account).
- The sun (directional light) supports distance field shadows plus analytically resolved specular highlights.
- IBL lighting: each object samples the nearest Reflection Capture, without parallax correction.
- Dynamic objects correctly receive lighting and cast shadows.
The types, maximum counts, shadow support and other details of the dynamic lights supported by UE on mobile are as follows:
Light type | Max count | Shadows | Description |
---|---|---|---|
Directional light | 1 | CSM | CSM defaults to 2 cascades and supports up to 4. |
Point light | 4 | Not supported | Point light shadows require a cube shadow map, and single-pass cube shadow rendering (OnePassPointLightShadow) requires GS (SM5). |
Spot light | 4 | Supported | Disabled by default; must be enabled in the project settings. |
Area light | 0 | Not supported | Dynamic area lighting is not currently supported. |
Dynamic spot light shadows need to be explicitly enabled in the project settings:
In the mobile BasePass pixel shader, the spot light shadow map shares the same texture sampler with CSM, and spot light shadows and CSM use the same shadow map atlas. CSM is guaranteed enough atlas space first, and the spot lights are then sorted by shadow resolution.
By default, the number of visible movable spot light shadows is capped at 8; the cap can be changed via r.Mobile.MaxVisibleMovableSpotLightShadows. The resolution of spot light shadows is based on the screen size and r.Shadow.TexelsPerPixelSpotlight.
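As a hedged sketch of how these knobs could be raised in a project (the section name follows the standard cvar-override convention; the values are examples, not recommendations):

```ini
; DefaultEngine.ini (illustrative overrides)
[SystemSettings]
r.Mobile.MaxVisibleMovableSpotLightShadows=16
r.Shadow.TexelsPerPixelSpotlight=2.0
```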
The total number of local lights (point lights and spot lights) in the forward rendering path cannot exceed 4.
Mobile also supports a special shadow mode, Modulated Shadows, which can only be used with Stationary directional lights. The effect with modulated shadows enabled is shown below:
Modulated shadows also support changing the shadow color and blend ratio:
Left: dynamic shadows; right: modulated shadows.
Mobile shadows also support parameters such as self-shadowing, shadow quality (r.ShadowQuality), depth bias and more.
In addition, mobile uses GGX specular by default. To switch to the legacy specular shading model, modify the following configuration:
12.2.5 Pixel Projected Reflection
UE provides an optimized alternative to SSR for mobile, called Pixel Projected Reflection (PPR); its core idea is likewise to reuse screen-space pixels.
PPR rendering result.
To enable the PPR effect, the following conditions must be met:
- Enable the MobileHDR option.
- Set r.Mobile.PixelProjectedReflectionQuality to a value greater than 0.
- In Project Settings > Mobile, set the Planar Reflection Mode to the appropriate mode:
The Planar Reflection Mode setting has three options:
- Usual: the planar reflection Actor behaves the same on all platforms.
- MobilePPR: the planar reflection Actor works normally on PC/console platforms, but uses PPR on mobile.
- MobilePPRExclusive: the planar reflection Actor is used only for PPR on mobile platforms, leaving room for PC and console projects to use traditional SSR.
By default, only high-end mobile device profiles enable r.Mobile.PixelProjectedReflectionQuality in [Project]Scalability.ini.
12.2.6 Mesh Auto-Instancing
The mesh drawing pipeline on the PC side (see the earlier article on the mesh drawing pipeline) already supports automatic mesh instancing and merged rendering, which can greatly improve rendering performance. Since 4.27 this feature is also supported on mobile.
To enable it, open DefaultEngine.ini under the project Config directory and add the following fields:
```ini
r.Mobile.SupportGPUScene=1
r.Mobile.UseGPUSceneTexture=1
```
Restart the editor and wait for the shaders to compile to preview the effect.
GPUScene support is required, and since the maximum uniform buffer size on Mali devices is only 64 KB, a buffer cannot provide enough space; Mali devices therefore use a texture instead of a buffer to store GPUScene data.
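A rough back-of-the-envelope calculation shows why 64 KB is too small; the per-primitive stride below is an assumption for illustration, not the engine's exact constant:

```cpp
// If each primitive's GPUScene record were ~37 float4s (592 bytes),
// a 64 KB uniform buffer would hold only ~110 primitives -- far too few for a real scene.
constexpr int MaxUniformBufferBytes = 64 * 1024;
constexpr int BytesPerPrimitive     = 37 * 16;  // illustrative stride
constexpr int MaxPrimitives         = MaxUniformBufferBytes / BytesPerPrimitive; // ~110
```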
However, there are some limitations:
- Auto-instancing on mobile mainly benefits CPU-bound projects rather than GPU-bound ones. Although enabling it is unlikely to hurt a GPU-bound project, such a project is unlikely to see significant gains from it.
- If a game or application requires a lot of memory, it may be better to turn off r.Mobile.UseGPUSceneTexture and use the buffer instead, with the understanding that the buffer path does not work properly on Mali devices.
Alternatively, device profiles can be used to turn off r.Mobile.UseGPUSceneTexture for devices from other GPU vendors while leaving it on for Mali devices, which require it; a sketch follows below.
The effectiveness of auto-instancing largely depends on the exact specification and positioning of the project. It is recommended to create a build with auto-instancing enabled and profile it to determine whether a substantive performance improvement is actually observed.
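Per-vendor control of this kind is usually expressed through device profiles. A minimal sketch (the profile name and value are illustrative; the cvar is the one discussed above):

```ini
; DefaultDeviceProfiles.ini (illustrative)
[Android_Mali_G72 DeviceProfile]
+CVars=r.Mobile.UseGPUSceneTexture=1
```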
12.2.7 Post Processing
Because mobile devices suffer from slower dependent texture reads, limited hardware features, special hardware architectures, extra render target resolves, limited bandwidth and so on, post-processing is expensive on mobile, and in extreme cases can stall the rendering pipeline.
Nevertheless, games and applications with high image-quality requirements still rely heavily on the expressiveness of post-processing, and UE does not restrict developers from using it.
To enable post-processing, the MobileHDR option must be enabled first:
After enabling it, the various post-processing effects can be set in a Post Process Volume.
The post-processing effects supported on mobile include Mobile Tonemapper, Color Grading, Lens, Bloom, Dirt Mask, Auto Exposure, Lens Flares, Depth of Field, etc.
For better performance, the official recommendation is to enable only Bloom and TAA on mobile.
12.2.8 Other Features and Limitations
- Reflection Capture Compression
Mobile supports compressing Reflection Captures, which reduces their runtime memory and bandwidth and improves rendering efficiency. It needs to be enabled in the project settings:
When enabled, ETC2 compression is used by default. It can also be adjusted per Reflection Capture component:
- Material Properties
Materials on mobile platforms (feature level OpenGL ES 3.1) use the same node-based creation process as on other platforms, and most nodes are supported on mobile.
The material inputs supported on mobile include BaseColor, Roughness, Metallic, Specular, Normal, Emissive and Refraction; the Scene Color expression, Tessellation inputs and the Subsurface Scattering shading model are not supported.
Materials on mobile platforms have some limitations:
- Due to hardware limitations, only 16 texture samplers can be used.
- Only the DefaultLit and Unlit shading models are available.
- Custom UVs should be used to avoid dependent texture reads (i.e., no math on texture UVs in the pixel shader).
- Translucent and Masked materials are very expensive; opaque materials are recommended wherever possible.
- Depth fade can be used in translucent materials on iOS; on hardware that cannot read from the depth buffer it is either unsupported or incurs an unacceptable performance cost.
The material properties panel has some options specific to mobile:
These options are described as follows:
- Mobile Separate Translucency: whether to render translucency into a separate render texture on mobile.
- Use Full Precision: whether to use full precision. If unchecked (half precision), bandwidth and power consumption drop and performance improves, but distant objects may show artifacts:
Left: full-precision material; right: half-precision material, where the distant sun shows an artifact.
- Use Lightmap Directionality: whether to enable lightmap directionality. If checked, the lightmap direction and per-pixel normals are taken into account, at a higher performance cost.
- Use Alpha to Coverage: whether Masked materials use alpha-to-coverage when MSAA is enabled; if checked, the opacity mask is converted into MSAA coverage.
- Fully Rough: whether the material is forced fully rough. If checked, the rendering efficiency of the material improves considerably.
In addition, the mesh types supported on mobile include:
- Skeletal Mesh
- Static Mesh
- Landscape
- CPU particle sprites, particle meshes
Mesh types other than the above are not supported. Other restrictions include:
- A single mesh can contain at most ~65k vertices, because vertex indices are only 16 bits (see the one-line check after this list).
- The number of bones in a single Skeletal Mesh must be less than 75, due to hardware performance limitations.
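The 65k figure falls directly out of the index format; a one-line check (MAX_uint16 is UE's constant for the largest 16-bit value):

```cpp
// A 16-bit index addresses vertices 0..65535, hence the ~65k vertex cap per mesh.
static_assert(MAX_uint16 == 65535, "16-bit index buffers address at most 65536 vertices");
```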
12.3 FMobileSceneRenderer
FMobileSceneRenderer inherits from FSceneRenderer and is responsible for the mobile scene rendering flow; on PC, FDeferredShadingSceneRenderer likewise inherits from FSceneRenderer. Their inheritance relationship is as follows:
FDeferredShadingSceneRenderer was covered in previous articles; its rendering flow is particularly complex, with intricate lighting, shadowing and rendering steps. By contrast, the logic and steps of FMobileSceneRenderer are much simpler. Below is a RenderDoc capture:
The capture mainly contains the InitViews, ShadowDepths, PrePass, BasePass, OcclusionTest, ShadowProjectionOnOpaque, Translucency and PostProcessing steps. These steps also exist on the PC side, but their implementations may differ; see the analysis in the following sections.
12.3.1 Renderer Main Flow
The main flow of the mobile scene renderer happens in FMobileSceneRenderer::Render. The code and analysis are as follows:
```cpp
// Engine\Source\Runtime\Renderer\Private\MobileShadingRenderer.cpp

void FMobileSceneRenderer::Render(FRHICommandListImmediate& RHICmdList)
{
    // Update primitive scene infos.
    Scene->UpdateAllPrimitiveSceneInfos(RHICmdList);

    // Prepare the view rects for rendering.
    PrepareViewRectsForRendering(RHICmdList);

    // Prepare sky atmosphere data.
    if (ShouldRenderSkyAtmosphere(Scene, ViewFamily.EngineShowFlags))
    {
        for (int32 LightIndex = 0; LightIndex < NUM_ATMOSPHERE_LIGHTS; ++LightIndex)
        {
            if (Scene->AtmosphereLights[LightIndex])
            {
                PrepareSunLightProxy(*Scene->GetSkyAtmosphereSceneInfo(), LightIndex, *Scene->AtmosphereLights[LightIndex]);
            }
        }
    }
    else
    {
        Scene->ResetAtmosphereLightsProperties();
    }

    if (!ViewFamily.EngineShowFlags.Rendering)
    {
        return;
    }

    // Wait for occlusion tests.
    WaitOcclusionTests(RHICmdList);
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Initialize views: find visible primitives and prepare RTs and buffers for rendering.
    InitViews(RHICmdList);

    if (GRHINeedsExtraDeletionLatency || !GRHICommandList.Bypass())
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FMobileSceneRenderer_PostInitViewsFlushDel);
        // Occlusion queries may be in flight, so it is best to let the RHI thread and GPU work while waiting.
        // Also, when the RHI thread is running, this is the only place pending deletions are processed.
        FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
        FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThreadFlushResources);
    }

    GEngine->GetPreRenderDelegate().Broadcast();

    // Commit global dynamic buffers before rendering starts.
    DynamicIndexBuffer.Commit();
    DynamicVertexBuffer.Commit();
    DynamicReadBuffer.Commit();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_SceneSim));

    if (ViewFamily.bLateLatchingEnabled)
    {
        BeginLateLatching(RHICmdList);
    }

    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Handle virtual textures.
    if (bUseVirtualTexturing)
    {
        SCOPED_GPU_STAT(RHICmdList, VirtualTextureUpdate);
        FVirtualTextureSystem::Get().Update(RHICmdList, FeatureLevel, Scene);

        // Clear the virtual texture feedback to its default value.
        FUnorderedAccessViewRHIRef FeedbackUAV = SceneContext.GetVirtualTextureFeedbackUAV();
        RHICmdList.Transition(FRHITransitionInfo(FeedbackUAV, ERHIAccess::SRVMask, ERHIAccess::UAVMask));
        RHICmdList.ClearUAVUint(FeedbackUAV, FUintVector4(~0u, ~0u, ~0u, ~0u));
        RHICmdList.Transition(FRHITransitionInfo(FeedbackUAV, ERHIAccess::UAVMask, ERHIAccess::UAVMask));
        RHICmdList.BeginUAVOverlap(FeedbackUAV);
    }

    // Sorted light info.
    FSortedLightSetSceneInfo SortedLightSet;

    // Deferred shading path.
    if (bDeferredShading)
    {
        // Gather and sort lights.
        GatherAndSortLights(SortedLightSet);
        int32 NumReflectionCaptures = Views[0].NumBoxReflectionCaptures + Views[0].NumSphereReflectionCaptures;
        bool bCullLightsToGrid = (NumReflectionCaptures > 0 || GMobileUseClusteredDeferredShading != 0);
        FRDGBuilder GraphBuilder(RHICmdList);
        // Compute the light grid.
        ComputeLightGrid(GraphBuilder, bCullLightsToGrid, SortedLightSet);
        GraphBuilder.Execute();
    }

    // Generate the sky/atmosphere LUTs.
    const bool bShouldRenderSkyAtmosphere = ShouldRenderSkyAtmosphere(Scene, ViewFamily.EngineShowFlags);
    if (bShouldRenderSkyAtmosphere)
    {
        FRDGBuilder GraphBuilder(RHICmdList);
        RenderSkyAtmosphereLookUpTables(GraphBuilder);
        GraphBuilder.Execute();
    }

    // Notify the FX system that the scene is about to be rendered.
    if (FXSystem && ViewFamily.EngineShowFlags.Particles)
    {
        FXSystem->PreRender(RHICmdList, NULL, !Views[0].bIsPlanarReflection);
        if (FGPUSortManager* GPUSortManager = FXSystem->GetGPUSortManager())
        {
            GPUSortManager->OnPreRender(RHICmdList);
        }
    }

    // Poll occlusion queries.
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Shadows));

    // Render shadows.
    RenderShadowDepthMaps(RHICmdList);
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Collect the view list.
    TArray<const FViewInfo*> ViewList;
    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
    {
        ViewList.Add(&Views[ViewIndex]);
    }

    // Render custom depth.
    if (bShouldRenderCustomDepth)
    {
        FRDGBuilder GraphBuilder(RHICmdList);
        FSceneTextureShaderParameters SceneTextures = CreateSceneTextureShaderParameters(GraphBuilder, Views[0].GetFeatureLevel(), ESceneTextureSetupMode::None);
        RenderCustomDepthPass(GraphBuilder, SceneTextures);
        GraphBuilder.Execute();
    }

    // Render the depth PrePass.
    if (bIsFullPrepassEnabled)
    {
        // SDF and AO require a full PrePass depth.
        FRHIRenderPassInfo DepthPrePassRenderPassInfo(
            SceneContext.GetSceneDepthSurface(),
            EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil);
        DepthPrePassRenderPassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        DepthPrePassRenderPassInfo.bOcclusionQueries = DepthPrePassRenderPassInfo.NumOcclusionQueries != 0;

        RHICmdList.BeginRenderPass(DepthPrePassRenderPassInfo, TEXT("DepthPrepass"));
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));

        // Render the full depth PrePass.
        RenderPrePass(RHICmdList);

        // Issue occlusion queries.
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
        RenderOcclusion(RHICmdList);

        RHICmdList.EndRenderPass();

        // SDF shadows.
        if (bRequiresDistanceFieldShadowingPass)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderSDFShadowing);
            RenderSDFShadowing(RHICmdList);
        }

        // HZB.
        if (bShouldRenderHZB)
        {
            RenderHZB(RHICmdList, SceneContext.SceneDepthZ);
        }

        // AO.
        if (bRequiresAmbientOcclusionPass)
        {
            RenderAmbientOcclusion(RHICmdList, SceneContext.SceneDepthZ);
        }
    }

    FRHITexture* SceneColor = nullptr;
    // Deferred shading path.
    if (bDeferredShading)
    {
        SceneColor = RenderDeferred(RHICmdList, ViewList, SortedLightSet);
    }
    // Forward shading path.
    else
    {
        SceneColor = RenderForward(RHICmdList, ViewList);
    }

    // Render the velocity buffer.
    if (bShouldRenderVelocities)
    {
        FRDGBuilder GraphBuilder(RHICmdList);
        FRDGTextureMSAA SceneDepthTexture = RegisterExternalTextureMSAA(GraphBuilder, SceneContext.SceneDepthZ);
        FRDGTextureRef VelocityTexture = TryRegisterExternalTexture(GraphBuilder, SceneContext.SceneVelocity);
        if (VelocityTexture != nullptr)
        {
            AddClearRenderTargetPass(GraphBuilder, VelocityTexture);
        }

        // Velocities of movable opaque objects.
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_Velocity));
        RenderVelocities(GraphBuilder, SceneDepthTexture.Resolve, VelocityTexture, FSceneTextureShaderParameters(), EVelocityPass::Opaque, false);
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_AfterVelocity));

        // Velocities of translucent objects.
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_TranslucentVelocity));
        RenderVelocities(GraphBuilder, SceneDepthTexture.Resolve, VelocityTexture, GetSceneTextureShaderParameters(CreateMobileSceneTextureUniformBuffer(GraphBuilder, EMobileSceneTextureSetupMode::SceneColor)), EVelocityPass::Translucent, false);

        GraphBuilder.Execute();
    }

    // Handle the logic after scene rendering.
    {
        FRendererModule& RendererModule = static_cast<FRendererModule&>(GetRendererModule());
        FRDGBuilder GraphBuilder(RHICmdList);
        RendererModule.RenderPostOpaqueExtensions(GraphBuilder, Views, SceneContext);

        if (FXSystem && Views.IsValidIndex(0))
        {
            AddUntrackedAccessPass(GraphBuilder, [this](FRHICommandListImmediate& RHICmdList)
            {
                check(RHICmdList.IsOutsideRenderPass());
                FXSystem->PostRenderOpaque(
                    RHICmdList,
                    Views[0].ViewUniformBuffer,
                    nullptr,
                    nullptr,
                    Views[0].AllowGPUParticleUpdate()
                );
                if (FGPUSortManager* GPUSortManager = FXSystem->GetGPUSortManager())
                {
                    GPUSortManager->OnPostRenderOpaque(RHICmdList);
                }
            });
        }

        GraphBuilder.Execute();
    }

    // Flush / submit the command buffer.
    if (bSubmitOffscreenRendering)
    {
        RHICmdList.SubmitCommandsHint();
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }

    // Transition the scene color to SRV so subsequent passes can read it.
    if (!bGammaSpace || bRenderToSceneColor)
    {
        RHICmdList.Transition(FRHITransitionInfo(SceneColor, ERHIAccess::Unknown, ERHIAccess::SRVMask));
    }

    if (bDeferredShading)
    {
        // Release the original reference on the scene render targets.
        SceneContext.AdjustGBufferRefCount(RHICmdList, -1);
    }

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Post));

    // Handle virtual texture feedback.
    if (bUseVirtualTexturing)
    {
        SCOPED_GPU_STAT(RHICmdList, VirtualTextureUpdate);

        // No pass after this should make VT page requests.
        RHICmdList.EndUAVOverlap(SceneContext.VirtualTextureFeedbackUAV);
        RHICmdList.Transition(FRHITransitionInfo(SceneContext.VirtualTextureFeedbackUAV, ERHIAccess::UAVMask, ERHIAccess::SRVMask));

        TArray<FIntRect, TInlineAllocator<4>> ViewRects;
        ViewRects.AddUninitialized(Views.Num());
        for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ++ViewIndex)
        {
            ViewRects[ViewIndex] = Views[ViewIndex].ViewRect;
        }

        FVirtualTextureFeedbackBufferDesc Desc;
        Desc.Init2D(SceneContext.GetBufferSizeXY(), ViewRects, SceneContext.GetVirtualTextureFeedbackScale());
        SubmitVirtualTextureFeedbackBuffer(RHICmdList, SceneContext.VirtualTextureFeedback, Desc);
    }

    FMemMark Mark(FMemStack::Get());
    FRDGBuilder GraphBuilder(RHICmdList);
    FRDGTextureRef ViewFamilyTexture = TryCreateViewFamilyTexture(GraphBuilder, ViewFamily);

    // Resolve the scene.
    if (ViewFamily.bResolveScene)
    {
        if (!bGammaSpace || bRenderToSceneColor)
        {
            // Finish rendering for each view, or the full stereo buffer if enabled.
            {
                RDG_EVENT_SCOPE(GraphBuilder, "PostProcessing");
                SCOPE_CYCLE_COUNTER(STAT_FinishRenderViewTargetTime);

                TArray<TRDGUniformBufferRef<FMobileSceneTextureUniformParameters>, TInlineAllocator<1, SceneRenderingAllocator>> MobileSceneTexturesPerView;
                MobileSceneTexturesPerView.SetNumZeroed(Views.Num());
                const auto SetupMobileSceneTexturesPerView = [&]()
                {
                    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ++ViewIndex)
                    {
                        EMobileSceneTextureSetupMode SetupMode = EMobileSceneTextureSetupMode::SceneColor;
                        if (Views[ViewIndex].bCustomDepthStencilValid)
                        {
                            SetupMode |= EMobileSceneTextureSetupMode::CustomDepth;
                        }
                        if (bShouldRenderVelocities)
                        {
                            SetupMode |= EMobileSceneTextureSetupMode::SceneVelocity;
                        }
                        MobileSceneTexturesPerView[ViewIndex] = CreateMobileSceneTextureUniformBuffer(GraphBuilder, SetupMode);
                    }
                };
                SetupMobileSceneTexturesPerView();

                FMobilePostProcessingInputs PostProcessingInputs;
                PostProcessingInputs.ViewFamilyTexture = ViewFamilyTexture;

                // Post-processing passes.
                for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
                {
                    RDG_EVENT_SCOPE_CONDITIONAL(GraphBuilder, Views.Num() > 1, "View%d", ViewIndex);
                    PostProcessingInputs.SceneTextures = MobileSceneTexturesPerView[ViewIndex];
                    AddMobilePostProcessingPasses(GraphBuilder, Views[ViewIndex], PostProcessingInputs, NumMSAASamples > 1);
                }
            }
        }
    }

    GEngine->GetPostRenderDelegate().Broadcast();

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_SceneEnd));

    if (bShouldRenderVelocities)
    {
        SceneContext.SceneVelocity.SafeRelease();
    }

    if (ViewFamily.bLateLatchingEnabled)
    {
        EndLateLatching(RHICmdList, Views[0]);
    }

    RenderFinish(GraphBuilder, ViewFamilyTexture);
    GraphBuilder.Execute();

    // Poll occlusion queries.
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}
```
Readers who have gone through Analyze the Unreal Rendering System (04) - Deferred Rendering Pipeline will notice that the mobile scene rendering flow simplifies many steps and is essentially a subset of the PC scene renderer. Of course, to suit the distinctive GPU hardware architectures of mobile, mobile scene rendering also differs from PC in places; these differences are analyzed in detail later. The main steps of mobile scene rendering are as follows:
A few notes on the flowchart above:
- The flowchart nodes bDeferredShading and bDeferredShading2 refer to the same variable; they are distinguished only to avoid Mermaid syntax errors.
- Nodes marked with * are conditional and are not always executed.
UE 4.26 added the mobile deferred rendering pipeline, which is why the code above has both the forward branch RenderForward and the deferred branch RenderDeferred, each returning the rendering result SceneColor.
Mobile also supports rendering features such as primitive GPUScene, SDF shadows, AO, sky atmosphere, virtual texturing and occlusion culling.
Since UE 4.26 the rendering system has made extensive use of the RDG system, and the mobile scene renderer is no exception: several FRDGBuilder instances are declared in the code above, used for computing the light grid, rendering the sky atmosphere LUTs, custom depth, the velocity buffer, post-render extensions, post-processing and so on; each is a relatively independent module or rendering stage.
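For readers less familiar with RDG, here is a minimal sketch of the recurring FRDGBuilder pattern in the code above (the clear-velocity pass is lifted from the Render function itself; only the surrounding framing is illustrative):

```cpp
// Declare a graph builder on the immediate RHI command list.
FRDGBuilder GraphBuilder(RHICmdList);

// Register external (non-RDG) resources so passes can declare their usage.
FRDGTextureRef VelocityTexture = TryRegisterExternalTexture(GraphBuilder, SceneContext.SceneVelocity);
if (VelocityTexture != nullptr)
{
    // Each Add*Pass call records a pass plus its resource dependencies.
    AddClearRenderTargetPass(GraphBuilder, VelocityTexture);
}

// Compile the graph: RDG culls unused passes, inserts barriers, then executes.
GraphBuilder.Execute();
```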
12.3.2 RenderForward
RenderForward is the forward rendering branch of the mobile scene renderer. Its code and analysis are as follows:
```cpp
FRHITexture* FMobileSceneRenderer::RenderForward(FRHICommandListImmediate& RHICmdList, const TArrayView<const FViewInfo*> ViewList)
{
    const FViewInfo& View = *ViewList[0];
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    FRHITexture* SceneColor = nullptr;
    FRHITexture* SceneColorResolve = nullptr;
    FRHITexture* SceneDepth = nullptr;
    ERenderTargetActions ColorTargetAction = ERenderTargetActions::Clear_Store;
    EDepthStencilTargetActions DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_DontStoreDepthStencil;

    // Whether mobile MSAA is enabled.
    bool bMobileMSAA = NumMSAASamples > 1 && SceneContext.GetSceneColorSurface()->GetNumSamples() > 1;

    // Whether mobile multi-view is enabled.
    static const auto CVarMobileMultiView = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("vr.MobileMultiView"));
    const bool bIsMultiViewApplication = (CVarMobileMultiView && CVarMobileMultiView->GetValueOnAnyThread() != 0);

    // Gamma-space rendering branch.
    if (bGammaSpace && !bRenderToSceneColor)
    {
        // With MSAA, get the render textures (scene color and resolve texture) from the SceneContext.
        if (bMobileMSAA)
        {
            SceneColor = SceneContext.GetSceneColorSurface();
            SceneColorResolve = ViewFamily.RenderTarget->GetRenderTargetTexture();
            ColorTargetAction = ERenderTargetActions::Clear_Resolve;
            RHICmdList.Transition(FRHITransitionInfo(SceneColorResolve, ERHIAccess::Unknown, ERHIAccess::RTV | ERHIAccess::ResolveDst));
        }
        // Without MSAA, get the render texture from the view family.
        else
        {
            SceneColor = ViewFamily.RenderTarget->GetRenderTargetTexture();
            RHICmdList.Transition(FRHITransitionInfo(SceneColor, ERHIAccess::Unknown, ERHIAccess::RTV));
        }
        SceneDepth = SceneContext.GetSceneDepthSurface();
    }
    // Linear space, or rendering to the scene texture.
    else
    {
        SceneColor = SceneContext.GetSceneColorSurface();
        if (bMobileMSAA)
        {
            SceneColorResolve = SceneContext.GetSceneColorTexture();
            ColorTargetAction = ERenderTargetActions::Clear_Resolve;
            RHICmdList.Transition(FRHITransitionInfo(SceneColorResolve, ERHIAccess::Unknown, ERHIAccess::RTV | ERHIAccess::ResolveDst));
        }
        else
        {
            SceneColorResolve = nullptr;
            ColorTargetAction = ERenderTargetActions::Clear_Store;
        }

        SceneDepth = SceneContext.GetSceneDepthSurface();

        if (bRequiresMultiPass)
        {
            // Store targets after opaque so the translucency render pass can be restarted.
            ColorTargetAction = ERenderTargetActions::Clear_Store;
            DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil;
        }

        if (bKeepDepthContent)
        {
            // Store depth if post-processing/capture needs it.
            DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil;
        }
    }

    // Depth texture actions for the PrePass.
    if (bIsFullPrepassEnabled)
    {
        ERenderTargetActions DepthTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetDepthActions(DepthTargetAction)));
        ERenderTargetActions StencilTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetStencilActions(DepthTargetAction)));
        DepthTargetAction = MakeDepthStencilTargetActions(DepthTarget, StencilTarget);
    }

    FRHITexture* ShadingRateTexture = nullptr;
    if (!View.bIsSceneCapture && !View.bIsReflectionCapture)
    {
        TRefCountPtr<IPooledRenderTarget> ShadingRateTarget = GVRSImageManager.GetMobileVariableRateShadingImage(ViewFamily);
        if (ShadingRateTarget.IsValid())
        {
            ShadingRateTexture = ShadingRateTarget->GetRenderTargetItem().ShaderResourceTexture;
        }
    }

    // Render pass info for scene color rendering.
    FRHIRenderPassInfo SceneColorRenderPassInfo(
        SceneColor,
        ColorTargetAction,
        SceneColorResolve,
        SceneDepth,
        DepthTargetAction,
        nullptr, // we never resolve scene depth on mobile
        ShadingRateTexture,
        VRSRB_Sum,
        FExclusiveDepthStencil::DepthWrite_StencilWrite
    );
    SceneColorRenderPassInfo.SubpassHint = ESubpassHint::DepthReadSubpass;
    if (!bIsFullPrepassEnabled)
    {
        SceneColorRenderPassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        SceneColorRenderPassInfo.bOcclusionQueries = SceneColorRenderPassInfo.NumOcclusionQueries != 0;
    }
    // If the scene color target is not multi-view but the application is, the multi-view state must be single-view for the shader.
    SceneColorRenderPassInfo.MultiViewCount = View.bIsMobileMultiViewEnabled ? 2 : (bIsMultiViewApplication ? 1 : 0);

    // Begin rendering scene color.
    RHICmdList.BeginRenderPass(SceneColorRenderPassInfo, TEXT("SceneColorRendering"));

    if (GIsEditor && !View.bIsSceneCapture)
    {
        DrawClearQuad(RHICmdList, Views[0].BackgroundColor);
    }

    if (!bIsFullPrepassEnabled)
    {
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));
        // Render the depth PrePass.
        RenderPrePass(RHICmdList);
    }

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Opaque));
    // Render the BasePass: opaque and masked objects.
    RenderMobileBasePass(RHICmdList, ViewList);
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Render debug view modes.
#if !(UE_BUILD_SHIPPING || UE_BUILD_TEST)
    if (ViewFamily.UseDebugViewPS())
    {
        // Here we use the base pass depth result to get z culling for opaque and masked.
        // The color needs to be cleared at this point since shader complexity renders in additive.
        DrawClearQuad(RHICmdList, FLinearColor::Black);
        RenderMobileDebugView(RHICmdList, ViewList);
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }
#endif // !(UE_BUILD_SHIPPING || UE_BUILD_TEST)

    const bool bAdrenoOcclusionMode = CVarMobileAdrenoOcclusionMode.GetValueOnRenderThread() != 0;
    if (!bIsFullPrepassEnabled)
    {
        // Occlusion culling.
        if (!bAdrenoOcclusionMode)
        {
            // Issue occlusion queries.
            RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
            RenderOcclusion(RHICmdList);
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }
    }

    // Let view extension plugins render after the BasePass.
    {
        CSV_SCOPED_TIMING_STAT_EXCLUSIVE(ViewExtensionPostRenderBasePass);
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FMobileSceneRenderer_ViewExtensionPostRenderBasePass);
        for (int32 ViewExt = 0; ViewExt < ViewFamily.ViewExtensions.Num(); ++ViewExt)
        {
            for (int32 ViewIndex = 0; ViewIndex < ViewFamily.Views.Num(); ++ViewIndex)
            {
                ViewFamily.ViewExtensions[ViewExt]->PostRenderBasePass_RenderThread(RHICmdList, Views[ViewIndex]);
            }
        }
    }

    // If translucency or pixel-projected reflections must be rendered in a separate pass, split the pass here.
    if (bRequiresMultiPass || bRequiresPixelProjectedPlanarRelfectionPass)
    {
        RHICmdList.EndRenderPass();
    }

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Translucency));

    // Restart the translucency render pass if needed.
    if (bRequiresMultiPass || bRequiresPixelProjectedPlanarRelfectionPass)
    {
        check(RHICmdList.IsOutsideRenderPass());

        // If the hardware cannot read and write the same depth buffer, copy the scene depth.
        ConditionalResolveSceneDepth(RHICmdList, View);

        if (bRequiresPixelProjectedPlanarRelfectionPass)
        {
            const FPlanarReflectionSceneProxy* PlanarReflectionSceneProxy = Scene ? Scene->GetForwardPassGlobalPlanarReflection() : nullptr;
            RenderPixelProjectedReflection(RHICmdList, SceneContext, PlanarReflectionSceneProxy);

            FRHITransitionInfo TranslucentRenderPassTransitions[] = {
                FRHITransitionInfo(SceneColor, ERHIAccess::SRVMask, ERHIAccess::RTV),
                FRHITransitionInfo(SceneDepth, ERHIAccess::SRVMask, ERHIAccess::DSVWrite)
            };
            RHICmdList.Transition(MakeArrayView(TranslucentRenderPassTransitions, UE_ARRAY_COUNT(TranslucentRenderPassTransitions)));
        }

        DepthTargetAction = EDepthStencilTargetActions::LoadDepthStencil_DontStoreDepthStencil;
        FExclusiveDepthStencil::Type ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilRead;
        if (bModulatedShadowsInUse)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilWrite;
        }

        // Opaque meshes used for mobile pixel-projected reflection must write depth to the depth RT,
        // since the meshes are rendered only once (if the quality level is BestPerformance or lower).
        if (IsMobilePixelProjectedReflectionEnabled(View.GetShaderPlatform()) && GetMobilePixelProjectedReflectionQuality() == EMobilePixelProjectedReflectionQuality::BestPerformance)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        }

        if (bKeepDepthContent && !bMobileMSAA)
        {
            DepthTargetAction = EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil;
        }

#if PLATFORM_HOLOLENS
        if (bShouldRenderDepthToTranslucency)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        }
#endif

        // Render pass for translucent objects.
        FRHIRenderPassInfo TranslucentRenderPassInfo(
            SceneColor,
            SceneColorResolve ? ERenderTargetActions::Load_Resolve : ERenderTargetActions::Load_Store,
            SceneColorResolve,
            SceneDepth,
            DepthTargetAction,
            nullptr,
            ShadingRateTexture,
            VRSRB_Sum,
            ExclusiveDepthStencil
        );
        TranslucentRenderPassInfo.NumOcclusionQueries = 0;
        TranslucentRenderPassInfo.bOcclusionQueries = false;
        TranslucentRenderPassInfo.SubpassHint = ESubpassHint::DepthReadSubpass;

        // Begin rendering translucent objects.
        RHICmdList.BeginRenderPass(TranslucentRenderPassInfo, TEXT("SceneColorTranslucencyRendering"));
    }

    // Move to the subpass where scene depth is read-only and can be fetched.
    RHICmdList.NextSubpass();

    if (!View.bIsPlanarReflection)
    {
        // Render decals.
        if (ViewFamily.EngineShowFlags.Decals)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
            RenderDecals(RHICmdList);
        }

        // Render modulated shadow projections.
        if (ViewFamily.EngineShowFlags.DynamicShadows)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderShadowProjections);
            RenderModulatedShadowProjections(RHICmdList);
        }
    }

    // Draw translucency.
    if (ViewFamily.EngineShowFlags.Translucency)
    {
        CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
        SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
        RenderTranslucency(RHICmdList, ViewList);
        FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }

    if (!bIsFullPrepassEnabled)
    {
        // Adreno occlusion-culling mode.
        if (bAdrenoOcclusionMode)
        {
            RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
            // Flush.
            RHICmdList.SubmitCommandsHint();
            bSubmitOffscreenRendering = false; // submit once
            // Issue occlusion queries.
            RenderOcclusion(RHICmdList);
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }
    }

    // Pre-tonemap before the MSAA resolve (iOS only).
    if (!bGammaSpace)
    {
        PreTonemapMSAA(RHICmdList);
    }

    // End scene color rendering.
    RHICmdList.EndRenderPass();

    // Return the resolved scene color texture when MSAA is enabled.
    return SceneColorResolve ? SceneColorResolve : SceneColor;
}
```
The main steps of mobile forward rendering are similar to the PC side: the PrePass, BasePass, special passes (decals, AO, occlusion culling, etc.) and translucent objects are rendered in turn. The flow chart is as follows:
Occlusion culling, however, depends on the GPU vendor. For example, Qualcomm Adreno GPUs require the occlusion queries to be issued between flushing the rendering commands and switching the FBO:
Render Opaque -> Render Translucent -> Flush -> Render Queries -> Switch FBO
UE therefore follows this special requirement of the Adreno chips and handles their occlusion culling accordingly.
Adreno chips support mixing the binned rendering of the TBDR architecture with ordinary direct rendering, and automatically switch to direct mode during occlusion queries to reduce their cost. If the queries are not submitted between the command flush and the FBO switch, the whole rendering pipeline stalls and performance degrades.
MSAA is the preferred anti-aliasing for UE's mobile forward rendering, thanks to native hardware support and a good balance between quality and cost. Hence the code above contains a fair amount of MSAA handling, covering the color and depth textures and their resource states. With MSAA enabled, the scene color is by default resolved in RHICmdList.EndRenderPass() (at which point the on-chip tile data is written back to system memory), yielding the anti-aliased texture. MSAA is not enabled by default on mobile, but can be set in the following interface:
Forward rendering supports both Gamma-space and HDR (linear-space) color modes. In linear space, steps such as tone mapping are required during post-processing. The default is HDR, which can be changed in the project settings:
bRequiresMultiPass in the code above indicates whether a separate render pass is required to draw translucent objects. Its value is determined by the following code:
```cpp
// Engine\Source\Runtime\Renderer\Private\MobileShadingRenderer.cpp

bool FMobileSceneRenderer::RequiresMultiPass(FRHICommandListImmediate& RHICmdList, const FViewInfo& View) const
{
    // Vulkan uses subpasses.
    if (IsVulkanPlatform(ShaderPlatform))
    {
        return false;
    }

    // All iOS support frame_buffer_fetch.
    if (IsMetalMobilePlatform(ShaderPlatform))
    {
        return false;
    }

    if (IsMobileDeferredShadingEnabled(ShaderPlatform))
    {
        // TODO: add GL support.
        return true;
    }

    // Some Androids support frame_buffer_fetch.
    if (IsAndroidOpenGLESPlatform(ShaderPlatform) && (GSupportsShaderFramebufferFetch || GSupportsShaderDepthStencilFetch))
    {
        return false;
    }

    // Always render reflection capture in single pass.
    if (View.bIsPlanarReflection || View.bIsSceneCapture)
    {
        return false;
    }

    // Always render LDR in single pass.
    if (!IsMobileHDR())
    {
        return false;
    }

    // MSAA depth can't be sampled or resolved, unless we are on PC (no vulkan).
    if (NumMSAASamples > 1 && !IsSimulatedPlatform(ShaderPlatform))
    {
        return false;
    }

    return true;
}
```
Similar in name but different in meaning are the bIsMultiViewApplication and bIsMobileMultiViewEnabled flags, which indicate whether multi-view rendering is enabled and the number of views. Multi-view is used only for VR and is determined by the console variable vr.MobileMultiView, the graphics API and other factors. In XR, MultiView optimizes rendering the scene twice, and it comes in two modes, Basic and Advanced:
Comparison of MultiView modes for optimizing VR-style rendering. Top: without MultiView, each eye submits its draw commands separately; middle: Basic MultiView, which reuses the submitted commands and duplicates one more Command List on the GPU side; bottom: Advanced MultiView, which can reuse draw calls, Command Lists and geometry.
bKeepDepthContent indicates whether the depth content must be kept; the code that determines it:
```cpp
bKeepDepthContent =
    bRequiresMultiPass ||
    bForceDepthResolve ||
    bRequiresPixelProjectedPlanarRelfectionPass ||
    bSeparateTranslucencyActive ||
    Views[0].bIsReflectionCapture ||
    (bDeferredShading && bPostProcessUsesSceneDepth) ||
    bShouldRenderVelocities ||
    bIsFullPrepassEnabled;

// Depth with MSAA is never kept.
bKeepDepthContent = (NumMSAASamples > 1 ? false : bKeepDepthContent);
```
The code above also involves a special planar reflection technique on mobile: Pixel Projected Reflection (PPR). Its principle is similar to SSR, but it needs less data: only the scene color, the depth buffer and the reflection region. Its core steps:
- Compute the mirrored position of every scene color pixel with respect to the reflection plane (see the sketch after this list).
- Test whether the pixel's reflection falls within the reflection region.
- Cast rays toward the mirrored pixel positions.
- Test whether the intersection is within the reflection region.
- If an intersection is found, compute the mirrored screen position of the pixel.
- Write the color of the mirrored pixel at the intersection.
- If the intersection inside the reflection region is occluded by other objects, discard the reflection at that position.
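As a concrete illustration of the first step, a minimal sketch (UE-style C++; the function name is made up for illustration) of mirroring a world-space point across the reflection plane:

```cpp
// Mirror point P across the plane through point O with unit normal N.
// This is the geometric core of PPR's first step.
FVector MirrorAcrossPlane(const FVector& P, const FVector& O, const FVector& N)
{
    const float SignedDist = FVector::DotProduct(P - O, N); // signed distance to the plane
    return P - 2.0f * SignedDist * N;                       // reflected position
}
```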
PPR effect list.
PPR can be set in the project configuration:
12.3.3 RenderDeferred
UE added the deferred rendering branch to the mobile rendering pipeline in 4.26 and improved and optimized it in 4.27. Whether deferred shading is enabled on mobile is determined by the following code:
```cpp
// Engine\Source\Runtime\RenderCore\Private\RenderUtils.cpp

bool IsMobileDeferredShadingEnabled(const FStaticShaderPlatform Platform)
{
    // Deferred shading is disabled for OpenGL.
    if (IsOpenGLPlatform(Platform))
    {
        // Needs MRT framebuffer fetch or PLS.
        return false;
    }

    // The console variable "r.Mobile.ShadingPath" must be 1.
    static auto* MobileShadingPathCvar = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.Mobile.ShadingPath"));
    return MobileShadingPathCvar->GetValueOnAnyThread() == 1;
}
```
Simply put, it requires a non-OpenGL graphics API and the console variable r.Mobile.ShadingPath set to 1.
r.Mobile.ShadingPath cannot be set dynamically in the editor; it can only be enabled by adding the following fields to Config/DefaultEngine.ini in the project root directory:
```ini
[/Script/Engine.RendererSettings]
r.Mobile.ShadingPath=1
```
After adding these fields, restart the UE editor and wait for the shaders to compile to preview mobile deferred shading.
The following is the code and analysis of the deferred rendering branch FMobileSceneRenderer::RenderDeferred:
```cpp
FRHITexture* FMobileSceneRenderer::RenderDeferred(FRHICommandListImmediate& RHICmdList, const TArrayView<const FViewInfo*> ViewList, const FSortedLightSetSceneInfo& SortedLightSet)
{
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Prepare the GBuffer.
    FRHITexture* ColorTargets[4] = {
        SceneContext.GetSceneColorSurface(),
        SceneContext.GetGBufferATexture().GetReference(),
        SceneContext.GetGBufferBTexture().GetReference(),
        SceneContext.GetGBufferCTexture().GetReference()
    };

    // Whether the RHI must store the GBuffer to GPU system memory and shade in a separate render pass.
    ERenderTargetActions GBufferAction = bRequiresMultiPass ? ERenderTargetActions::Clear_Store : ERenderTargetActions::Clear_DontStore;
    EDepthStencilTargetActions DepthAction = bKeepDepthContent ? EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil : EDepthStencilTargetActions::ClearDepthStencil_DontStoreDepthStencil;

    // Load/store actions of the render targets.
    ERenderTargetActions ColorTargetsAction[4] = { ERenderTargetActions::Clear_Store, GBufferAction, GBufferAction, GBufferAction };
    if (bIsFullPrepassEnabled)
    {
        ERenderTargetActions DepthTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetDepthActions(DepthAction)));
        ERenderTargetActions StencilTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetStencilActions(DepthAction)));
        DepthAction = MakeDepthStencilTargetActions(DepthTarget, StencilTarget);
    }

    FRHIRenderPassInfo BasePassInfo = FRHIRenderPassInfo();
    int32 ColorTargetIndex = 0;
    for (; ColorTargetIndex < UE_ARRAY_COUNT(ColorTargets); ++ColorTargetIndex)
    {
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].RenderTarget = ColorTargets[ColorTargetIndex];
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ResolveTarget = nullptr;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ArraySlice = -1;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].MipIndex = 0;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].Action = ColorTargetsAction[ColorTargetIndex];
    }

    if (MobileRequiresSceneDepthAux(ShaderPlatform))
    {
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].RenderTarget = SceneContext.SceneDepthAux->GetRenderTargetItem().ShaderResourceTexture.GetReference();
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ResolveTarget = nullptr;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ArraySlice = -1;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].MipIndex = 0;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].Action = GBufferAction;
        ColorTargetIndex++;
    }

    BasePassInfo.DepthStencilRenderTarget.DepthStencilTarget = SceneContext.GetSceneDepthSurface();
    BasePassInfo.DepthStencilRenderTarget.ResolveTarget = nullptr;
    BasePassInfo.DepthStencilRenderTarget.Action = DepthAction;
    BasePassInfo.DepthStencilRenderTarget.ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;

    BasePassInfo.SubpassHint = ESubpassHint::DeferredShadingSubpass;
    if (!bIsFullPrepassEnabled)
    {
        BasePassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        BasePassInfo.bOcclusionQueries = BasePassInfo.NumOcclusionQueries != 0;
    }
    BasePassInfo.ShadingRateTexture = nullptr;
    BasePassInfo.bIsMSAA = false;
    BasePassInfo.MultiViewCount = 0;

    RHICmdList.BeginRenderPass(BasePassInfo, TEXT("BasePassRendering"));

    if (GIsEditor && !Views[0].bIsSceneCapture)
    {
        DrawClearQuad(RHICmdList, Views[0].BackgroundColor);
    }

    // Depth PrePass.
    if (!bIsFullPrepassEnabled)
    {
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));
        // Depth pre-pass.
        RenderPrePass(RHICmdList);
    }

    // BasePass: opaque and masked objects.
    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Opaque));
    RenderMobileBasePass(RHICmdList, ViewList);
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Occlusion culling.
    if (!bIsFullPrepassEnabled)
    {
        // Issue occlusion queries.
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
        RenderOcclusion(RHICmdList);
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }

    // Single-pass (subpass) mode.
    if (!bRequiresMultiPass)
    {
        // Next subpass: SceneColor + GBuffer writable, SceneDepth read-only.
        RHICmdList.NextSubpass();

        // Render decals.
        if (ViewFamily.EngineShowFlags.Decals)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
            RenderDecals(RHICmdList);
        }

        // Next subpass: SceneColor writable, SceneDepth read-only.
        RHICmdList.NextSubpass();

        // Deferred light shading.
        MobileDeferredShadingPass(RHICmdList, *Scene, ViewList, SortedLightSet);

        // Draw translucency.
        if (ViewFamily.EngineShowFlags.Translucency)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
            SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
            RenderTranslucency(RHICmdList, ViewList);
            FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }

        // End the render pass.
        RHICmdList.EndRenderPass();
    }
    // Multi-pass mode (e.g. mobile simulated on a PC device).
    else
    {
        // End the subpasses.
        RHICmdList.NextSubpass();
        RHICmdList.NextSubpass();
        RHICmdList.EndRenderPass();

        // SceneColor + GBuffer writable, SceneDepth read-only.
        {
            for (int32 Index = 0; Index < UE_ARRAY_COUNT(ColorTargets); ++Index)
            {
                BasePassInfo.ColorRenderTargets[Index].Action = ERenderTargetActions::Load_Store;
            }
            BasePassInfo.DepthStencilRenderTarget.Action = EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil;
            BasePassInfo.DepthStencilRenderTarget.ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilRead;
            BasePassInfo.SubpassHint = ESubpassHint::None;
            BasePassInfo.NumOcclusionQueries = 0;
            BasePassInfo.bOcclusionQueries = false;

            RHICmdList.BeginRenderPass(BasePassInfo, TEXT("AfterBasePass"));

            // Render decals.
            if (ViewFamily.EngineShowFlags.Decals)
            {
                CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
                RenderDecals(RHICmdList);
            }

            RHICmdList.EndRenderPass();
        }

        // SceneColor writable, SceneDepth read-only.
        {
            FRHIRenderPassInfo ShadingPassInfo(
                SceneContext.GetSceneColorSurface(),
                ERenderTargetActions::Load_Store,
                nullptr,
                SceneContext.GetSceneDepthSurface(),
                EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil,
                nullptr,
                nullptr,
                VRSRB_Passthrough,
                FExclusiveDepthStencil::DepthRead_StencilWrite
            );
            ShadingPassInfo.NumOcclusionQueries = 0;
            ShadingPassInfo.bOcclusionQueries = false;

            RHICmdList.BeginRenderPass(ShadingPassInfo, TEXT("MobileShadingPass"));

            // Deferred light shading.
            MobileDeferredShadingPass(RHICmdList, *Scene, ViewList, SortedLightSet);

            // Draw translucency.
            if (ViewFamily.EngineShowFlags.Translucency)
            {
                CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
                SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
                RenderTranslucency(RHICmdList, ViewList);
                FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
                RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
            }

            RHICmdList.EndRenderPass();
        }
    }

    return ColorTargets[0];
}
```
As can be seen, the mobile deferred pipeline resembles the PC one: the BasePass is rendered first to fill the GBuffer with geometric information, and lighting is computed afterwards. The flow chart is as follows:
Of course, there are differences from PC. The most obvious is that mobile uses subpass rendering adapted to the TB(D)R architecture, so that across the PrePass depth, BasePass and lighting stages the scene color, depth, GBuffer and other data stay in on-chip tile memory, improving rendering efficiency and reducing power consumption.
12.3.3.1 MobileDeferredShadingPass
Deferred lighting is performed by MobileDeferredShadingPass:
```cpp
void MobileDeferredShadingPass(
    FRHICommandListImmediate& RHICmdList,
    const FScene& Scene,
    const TArrayView<const FViewInfo*> PassViews,
    const FSortedLightSetSceneInfo& SortedLightSet)
{
    SCOPED_DRAW_EVENT(RHICmdList, MobileDeferredShading);

    const FViewInfo& View0 = *PassViews[0];
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Create the pass uniform buffer.
    FUniformBufferRHIRef PassUniformBuffer = CreateMobileSceneTextureUniformBuffer(RHICmdList);
    FUniformBufferStaticBindings GlobalUniformBuffers(PassUniformBuffer);
    SCOPED_UNIFORM_BUFFER_GLOBAL_BINDINGS(RHICmdList, GlobalUniformBuffers);

    // Set the viewport.
    RHICmdList.SetViewport(View0.ViewRect.Min.X, View0.ViewRect.Min.Y, 0.0f, View0.ViewRect.Max.X, View0.ViewRect.Max.Y, 1.0f);

    // The default material for light functions.
    FCachedLightMaterial DefaultMaterial;
    DefaultMaterial.MaterialProxy = UMaterial::GetDefaultMaterial(MD_LightFunction)->GetRenderProxy();
    DefaultMaterial.Material = DefaultMaterial.MaterialProxy->GetMaterialNoFallback(ERHIFeatureLevel::ES3_1);
    check(DefaultMaterial.Material);

    // Draw the directional light.
    RenderDirectLight(RHICmdList, Scene, View0, DefaultMaterial);

    if (GMobileUseClusteredDeferredShading == 0)
    {
        // Render simple lights without clustering.
        RenderSimpleLights(RHICmdList, Scene, PassViews, SortedLightSet, DefaultMaterial);
    }

    // Render non-clustered local lights.
    int32 NumLights = SortedLightSet.SortedLights.Num();
    int32 StandardDeferredStart = SortedLightSet.SimpleLightsEnd;
    if (GMobileUseClusteredDeferredShading != 0)
    {
        StandardDeferredStart = SortedLightSet.ClusteredSupportedEnd;
    }

    // Render the local lights.
    for (int32 LightIdx = StandardDeferredStart; LightIdx < NumLights; ++LightIdx)
    {
        const FSortedLightSceneInfo& SortedLight = SortedLightSet.SortedLights[LightIdx];
        const FLightSceneInfo& LightSceneInfo = *SortedLight.LightSceneInfo;
        RenderLocalLight(RHICmdList, Scene, View0, LightSceneInfo, DefaultMaterial);
    }
}
```
Next, let's analyze the functions that render the different light types:
// Engine\Source\Runtime\Renderer\Private\MobileDeferredShadingPass.cpp

// Render the directional light
static void RenderDirectLight(FRHICommandListImmediate& RHICmdList, const FScene& Scene, const FViewInfo& View, const FCachedLightMaterial& DefaultLightMaterial)
{
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Find the first directional light
    FLightSceneInfo* DirectionalLight = nullptr;
    for (int32 ChannelIdx = 0; ChannelIdx < UE_ARRAY_COUNT(Scene.MobileDirectionalLights) && !DirectionalLight; ChannelIdx++)
    {
        DirectionalLight = Scene.MobileDirectionalLights[ChannelIdx];
    }

    // Render state
    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    // Additively blend the lighting onto SceneColor (which already holds the emissive)
    GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One>::GetRHI();
    GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
    // Only pixels of the default shading model (MSM_DefaultLit) are shaded
    uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
    GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<
        false, CF_Always,
        true, CF_Equal, SO_Keep, SO_Keep, SO_Keep,
        false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
        GET_STENCIL_MOBILE_SM_MASK(0x7), 0x00>::GetRHI(); // 4 bits for shading models

    // Set up the VS
    TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);

    const FMaterialRenderProxy* LightFunctionMaterialProxy = nullptr;
    if (View.Family->EngineShowFlags.LightFunctions && DirectionalLight)
    {
        LightFunctionMaterialProxy = DirectionalLight->Proxy->GetLightFunctionMaterial();
    }
    FMobileDirectLightFunctionPS::FPermutationDomain PermutationVector = FMobileDirectLightFunctionPS::BuildPermutationVector(View, DirectionalLight != nullptr);
    FCachedLightMaterial LightMaterial;
    TShaderRef<FMobileDirectLightFunctionPS> PixelShader;
    GetLightMaterial(DefaultLightMaterial, LightFunctionMaterialProxy, PermutationVector.ToDimensionValueId(), LightMaterial, PixelShader);

    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;
    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

    // Set up the PS
    FMobileDirectLightFunctionPS::FParameters PassParameters;
    PassParameters.Forward = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
    PassParameters.MobileDirectionalLight = Scene.UniformBuffers.MobileDirectionalLightUniformBuffers[1];
    PassParameters.ReflectionCaptureData = Scene.UniformBuffers.ReflectionCaptureUniformBuffer;
    FReflectionUniformParameters ReflectionUniformParameters;
    SetupReflectionUniformParameters(View, ReflectionUniformParameters);
    PassParameters.ReflectionsParameters = CreateUniformBufferImmediate(ReflectionUniformParameters, UniformBuffer_SingleDraw);
    PassParameters.LightFunctionParameters = FVector4(1.0f, 1.0f, 0.0f, 0.0f);
    if (DirectionalLight)
    {
        const bool bUseMovableLight = DirectionalLight && !DirectionalLight->Proxy->HasStaticShadowing();
        PassParameters.LightFunctionParameters2 = FVector(DirectionalLight->Proxy->GetLightFunctionFadeDistance(), DirectionalLight->Proxy->GetLightFunctionDisabledBrightness(), bUseMovableLight ? 1.0f : 0.0f);
        const FVector Scale = DirectionalLight->Proxy->GetLightFunctionScale();
        // Switch x and z so that z of the user-specified scale affects the distance along the light direction
        const FVector InverseScale = FVector(1.f / Scale.Z, 1.f / Scale.Y, 1.f / Scale.X);
        PassParameters.WorldToLight = DirectionalLight->Proxy->GetWorldToLight() * FScaleMatrix(FVector(InverseScale));
    }
    FMobileDirectLightFunctionPS::SetParameters(RHICmdList, PixelShader, View, LightMaterial.MaterialProxy, *LightMaterial.Material, PassParameters);

    RHICmdList.SetStencilRef(StencilRef);

    const FIntPoint TargetSize = SceneContext.GetBufferSizeXY();

    // Draw a full-screen rectangle
    DrawRectangle(
        RHICmdList,
        0, 0,
        View.ViewRect.Width(), View.ViewRect.Height(),
        View.ViewRect.Min.X, View.ViewRect.Min.Y,
        View.ViewRect.Width(), View.ViewRect.Height(),
        FIntPoint(View.ViewRect.Width(), View.ViewRect.Height()),
        TargetSize,
        VertexShader);
}

// Render simple lights in non-clustered mode
static void RenderSimpleLights(
    FRHICommandListImmediate& RHICmdList,
    const FScene& Scene,
    const TArrayView<const FViewInfo*> PassViews,
    const FSortedLightSetSceneInfo& SortedLightSet,
    const FCachedLightMaterial& DefaultMaterial)
{
    const FSimpleLightArray& SimpleLights = SortedLightSet.SimpleLights;
    const int32 NumViews = PassViews.Num();
    const FViewInfo& View0 = *PassViews[0];

    // Set up the VS
    TShaderMapRef<TDeferredLightVS<true>> VertexShader(View0.ShaderMap);
    TShaderRef<FMobileRadialLightFunctionPS> PixelShaders[2];
    {
        const FMaterialShaderMap* MaterialShaderMap = DefaultMaterial.Material->GetRenderingThreadShaderMap();
        FMobileRadialLightFunctionPS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FMobileRadialLightFunctionPS::FSpotLightDim>(false);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FIESProfileDim>(false);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(false);
        PixelShaders[0] = MaterialShaderMap->GetShader<FMobileRadialLightFunctionPS>(PermutationVector);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(true);
        PixelShaders[1] = MaterialShaderMap->GetShader<FMobileRadialLightFunctionPS>(PermutationVector);
    }

    // Set up the PSOs
    FGraphicsPipelineStateInitializer GraphicsPSOLight[2];
    {
        SetupSimpleLightPSO(RHICmdList, View0, VertexShader, PixelShaders[0], GraphicsPSOLight[0]);
        SetupSimpleLightPSO(RHICmdList, View0, VertexShader, PixelShaders[1], GraphicsPSOLight[1]);
    }

    // Set up the stencil-mask PSO
    FGraphicsPipelineStateInitializer GraphicsPSOLightMask;
    {
        RHICmdList.ApplyCachedRenderTargets(GraphicsPSOLightMask);
        GraphicsPSOLightMask.PrimitiveType = PT_TriangleList;
        GraphicsPSOLightMask.BlendState = TStaticBlendStateWriteMask<CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE>::GetRHI();
        GraphicsPSOLightMask.RasterizerState = View0.bReverseCulling ? TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI() : TStaticRasterizerState<FM_Solid, CM_CW>::GetRHI();
        // Set stencil to 1 where the depth test fails
        GraphicsPSOLightMask.DepthStencilState = TStaticDepthStencilState<
            false, CF_DepthNearOrEqual,
            true, CF_Always, SO_Keep, SO_Replace, SO_Keep,
            false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
            0x00, STENCIL_SANDBOX_MASK>::GetRHI();
        GraphicsPSOLightMask.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
        GraphicsPSOLightMask.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
        GraphicsPSOLightMask.BoundShaderState.PixelShaderRHI = nullptr;
    }

    // Traverse all simple lights and perform the shading calculation
    for (int32 LightIndex = 0; LightIndex < SimpleLights.InstanceData.Num(); LightIndex++)
    {
        const FSimpleLightEntry& SimpleLight = SimpleLights.InstanceData[LightIndex];
        for (int32 ViewIndex = 0; ViewIndex < NumViews; ViewIndex++)
        {
            const FViewInfo& View = *PassViews[ViewIndex];
            const FSimpleLightPerViewEntry& SimpleLightPerViewData = SimpleLights.GetViewDependentData(LightIndex, ViewIndex, NumViews);
            const FSphere LightBounds(SimpleLightPerViewData.Position, SimpleLight.Radius);

            if (NumViews > 1)
            {
                // Set viewports only when we have more than one view;
                // otherwise it is set at the start of the pass
                RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 1.0f);
            }

            // Render the light's stencil mask
            SetGraphicsPipelineState(RHICmdList, GraphicsPSOLightMask);
            VertexShader->SetSimpleLightParameters(RHICmdList, View, LightBounds);
            RHICmdList.SetStencilRef(1);
            StencilingGeometry::DrawSphere(RHICmdList);

            // Render the light
            FMobileRadialLightFunctionPS::FParameters PassParameters;
            FDeferredLightUniformStruct DeferredLightUniformsValue;
            SetupSimpleDeferredLightParameters(SimpleLight, SimpleLightPerViewData, DeferredLightUniformsValue);
            PassParameters.DeferredLightUniforms = TUniformBufferRef<FDeferredLightUniformStruct>::CreateUniformBufferImmediate(DeferredLightUniformsValue, EUniformBufferUsage::UniformBuffer_SingleFrame);
            PassParameters.IESTexture = GWhiteTexture->TextureRHI;
            PassParameters.IESTextureSampler = GWhiteTexture->SamplerStateRHI;

            if (SimpleLight.Exponent == 0)
            {
                SetGraphicsPipelineState(RHICmdList, GraphicsPSOLight[1]);
                FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShaders[1], View, DefaultMaterial.MaterialProxy, *DefaultMaterial.Material, PassParameters);
            }
            else
            {
                SetGraphicsPipelineState(RHICmdList, GraphicsPSOLight[0]);
                FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShaders[0], View, DefaultMaterial.MaterialProxy, *DefaultMaterial.Material, PassParameters);
            }
            VertexShader->SetSimpleLightParameters(RHICmdList, View, LightBounds);
            // Only pixels of the default shading model (MSM_DefaultLit) are shaded
            uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
            RHICmdList.SetStencilRef(StencilRef);
            // Draw the light (point or spot) with a sphere to quickly reject pixels outside its influence
            StencilingGeometry::DrawSphere(RHICmdList);
        }
    }
}

// Render a local light
static void RenderLocalLight(
    FRHICommandListImmediate& RHICmdList,
    const FScene& Scene,
    const FViewInfo& View,
    const FLightSceneInfo& LightSceneInfo,
    const FCachedLightMaterial& DefaultLightMaterial)
{
    if (!LightSceneInfo.ShouldRenderLight(View))
    {
        return;
    }

    // Ignore non-local lights (anything other than point and spot lights)
    const uint8 LightType = LightSceneInfo.Proxy->GetLightType();
    const bool bIsSpotLight = LightType == LightType_Spot;
    const bool bIsPointLight = LightType == LightType_Point;
    if (!bIsSpotLight && !bIsPointLight)
    {
        return;
    }

    // Draw the light's stencil mask
    if (GMobileUseLightStencilCulling != 0)
    {
        RenderLocalLight_StencilMask(RHICmdList, Scene, View, LightSceneInfo);
    }

    // Handle IES light profiles
    bool bUseIESTexture = false;
    FTexture* IESTextureResource = GWhiteTexture;
    if (View.Family->EngineShowFlags.TexturedLightProfiles && LightSceneInfo.Proxy->GetIESTextureResource())
    {
        IESTextureResource = LightSceneInfo.Proxy->GetIESTextureResource();
        bUseIESTexture = true;
    }

    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;

    const FSphere LightBounds = LightSceneInfo.Proxy->GetBoundingSphere();
    // Set the light's rasterizer and depth-stencil state
    if (GMobileUseLightStencilCulling != 0)
    {
        SetLocalLightRasterizerAndDepthState_StencilMask(GraphicsPSOInit, View);
    }
    else
    {
        SetLocalLightRasterizerAndDepthState(GraphicsPSOInit, View, LightBounds);
    }

    // Set up the VS
    TShaderMapRef<TDeferredLightVS<true>> VertexShader(View.ShaderMap);

    const FMaterialRenderProxy* LightFunctionMaterialProxy = nullptr;
    if (View.Family->EngineShowFlags.LightFunctions)
    {
        LightFunctionMaterialProxy = LightSceneInfo.Proxy->GetLightFunctionMaterial();
    }
    FMobileRadialLightFunctionPS::FPermutationDomain PermutationVector;
    PermutationVector.Set<FMobileRadialLightFunctionPS::FSpotLightDim>(bIsSpotLight);
    PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(LightSceneInfo.Proxy->IsInverseSquared());
    PermutationVector.Set<FMobileRadialLightFunctionPS::FIESProfileDim>(bUseIESTexture);
    FCachedLightMaterial LightMaterial;
    TShaderRef<FMobileRadialLightFunctionPS> PixelShader;
    GetLightMaterial(DefaultLightMaterial, LightFunctionMaterialProxy, PermutationVector.ToDimensionValueId(), LightMaterial, PixelShader);

    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
    VertexShader->SetParameters(RHICmdList, View, &LightSceneInfo);

    // Set up the PS
    FMobileRadialLightFunctionPS::FParameters PassParameters;
    PassParameters.DeferredLightUniforms = TUniformBufferRef<FDeferredLightUniformStruct>::CreateUniformBufferImmediate(GetDeferredLightParameters(View, LightSceneInfo), EUniformBufferUsage::UniformBuffer_SingleFrame);
    PassParameters.IESTexture = IESTextureResource->TextureRHI;
    PassParameters.IESTextureSampler = IESTextureResource->SamplerStateRHI;
    const float TanOuterAngle = bIsSpotLight ? FMath::Tan(LightSceneInfo.Proxy->GetOuterConeAngle()) : 1.0f;
    PassParameters.LightFunctionParameters = FVector4(TanOuterAngle, 1.0f /*ShadowFadeFraction*/, bIsSpotLight ? 1.0f : 0.0f, bIsPointLight ? 1.0f : 0.0f);
    PassParameters.LightFunctionParameters2 = FVector(LightSceneInfo.Proxy->GetLightFunctionFadeDistance(), LightSceneInfo.Proxy->GetLightFunctionDisabledBrightness(), 0.0f);
    const FVector Scale = LightSceneInfo.Proxy->GetLightFunctionScale();
    // Switch x and z so that z of the user-specified scale affects the distance along the light direction
    const FVector InverseScale = FVector(1.f / Scale.Z, 1.f / Scale.Y, 1.f / Scale.X);
    PassParameters.WorldToLight = LightSceneInfo.Proxy->GetWorldToLight() * FScaleMatrix(FVector(InverseScale));
    FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShader, View, LightMaterial.MaterialProxy, *LightMaterial.Material, PassParameters);

    // Only pixels of the default shading model (MSM_DefaultLit) are shaded
    uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
    RHICmdList.SetStencilRef(StencilRef);

    // Point lights are drawn with a sphere
    if (LightType == LightType_Point)
    {
        StencilingGeometry::DrawSphere(RHICmdList);
    }
    // Spot lights are drawn with a cone
    else // LightType_Spot
    {
        StencilingGeometry::DrawCone(RHICmdList);
    }
}
Light rendering is thus divided into three steps by light type: the directional light, non-clustered simple lights, and local lights (point and spot lights). Note that the mobile end only supports lighting for the default shading model (MSM_DefaultLit); the more advanced shading models (hair, subsurface scattering, clear coat, eye, cloth, etc.) are not supported for now.
At most one directional light is drawn. It is rendered as a full-screen rectangle and supports several CSM shadow cascades.
Non-clustered simple lights, whether point or spot, are drawn with sphere geometry and do not support shadows.
Drawing a local light is more involved: first the light's stencil mask is drawn, then the rasterizer and depth states are set, and then the light itself is drawn. Point lights are drawn with spheres and do not support shadows; spot lights are drawn with cones and can support shadows. By default, dynamic spotlight shadows are disabled and must be enabled in the project settings:
In addition, whether stencil culling of pixels that do not intersect the light volume is enabled is controlled by GMobileUseLightStencilCulling, which is driven by the console variable r.Mobile.UseLightStencilCulling and defaults to 1 (enabled; its hookup is sketched after the code below). The code that renders a light's stencil mask is as follows:
static void RenderLocalLight_StencilMask(FRHICommandListImmediate& RHICmdList, const FScene& Scene, const FViewInfo& View, const FLightSceneInfo& LightSceneInfo)
{
    const uint8 LightType = LightSceneInfo.Proxy->GetLightType();

    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    // Apply the cached render targets (color / depth, etc.)
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;
    // Disable all render target writes
    GraphicsPSOInit.BlendState = TStaticBlendStateWriteMask<CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE>::GetRHI();
    GraphicsPSOInit.RasterizerState = View.bReverseCulling ? TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI() : TStaticRasterizerState<FM_Solid, CM_CW>::GetRHI();
    // Where the depth test fails, write 1 into the stencil buffer
    GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<
        false, CF_DepthNearOrEqual,
        true, CF_Always, SO_Keep, SO_Replace, SO_Keep,
        false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
        0x00,
        // Note that only this pass's dedicated sandbox bit (bit 0 of the stencil buffer) is written
        STENCIL_SANDBOX_MASK>::GetRHI();

    // The VS for drawing the light's stencil mask is TDeferredLightVS
    TShaderMapRef<TDeferredLightVS<true>> VertexShader(View.ShaderMap);
    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    // The PS is empty
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = nullptr;
    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
    VertexShader->SetParameters(RHICmdList, View, &LightSceneInfo);

    // The stencil reference value is 1
    RHICmdList.SetStencilRef(1);

    // Draw a different proxy shape depending on the light type
    if (LightType == LightType_Point)
    {
        StencilingGeometry::DrawSphere(RHICmdList);
    }
    else // LightType_Spot
    {
        StencilingGeometry::DrawCone(RHICmdList);
    }
}
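As an aside, GMobileUseLightStencilCulling is bound to its console variable with the engine's usual pattern; roughly (a sketch, not verbatim source):

// Sketch of the usual UE pattern for binding a global to a console variable;
// the real declaration lives in MobileDeferredShadingPass.cpp.
int32 GMobileUseLightStencilCulling = 1;
static FAutoConsoleVariableRef CVarMobileUseLightStencilCulling(
    TEXT("r.Mobile.UseLightStencilCulling"),
    GMobileUseLightStencilCulling,
    TEXT("Whether to use stencil to cull local lights. 0: off, 1: on (default)"),
    ECVF_RenderThreadSafe
);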
For each local light, the mask covering the light's extent is drawn first, and lighting is then computed only for the pixels that pass the stencil test (rejected early via Early-Z). The following figure of a spotlight illustrates the process:
Top: a spotlight in the scene waiting to be rendered. Middle: the stencil mask (white area) drawn by the stencil pass, marking the screen-space pixels that overlap the spotlight's shape but are closer in depth. Bottom: the result of computing lighting only on the valid pixels.
The depth-stencil state used when shading the valid pixels is shown below.
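A sketch of that state, reconstructed for illustration from the engine's SetLocalLightRasterizerAndDepthState_StencilMask (the exact template arguments may differ; consult the source):

// Lighting-pass state in stencil-mask mode (a reconstruction, not verbatim source).
// The back faces of the light volume are rasterized; the depth test passes where
// the scene is in front of the back face (i.e. inside the volume), and the stencil
// test rejects pixels whose sandbox bit was set by the masking pass. The stencil
// reference carries the MSM_DefaultLit shading-model bits and a zero sandbox bit.
GraphicsPSOInit.RasterizerState = View.bReverseCulling
    ? TStaticRasterizerState<FM_Solid, CM_CW>::GetRHI()
    : TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI();
GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<
    false, CF_DepthFartherOrEqual,                          // no depth write; keep pixels at or behind the back faces
    true, CF_Equal, SO_Keep, SO_Keep, SO_Keep,              // stencil: (value & read mask) must equal (ref & read mask)
    false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
    GET_STENCIL_MOBILE_SM_MASK(0x7) | STENCIL_SANDBOX_MASK, // read mask: shading-model bits + sandbox bit
    0x00>::GetRHI();                                        // write mask: nothing is written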
A pixel that gets shaded must lie inside the light's volume; pixels outside it are rejected. The stencil pass marks the pixels that are closer in depth than the light volume (i.e. in front of it, outside the volume); the lighting pass then rejects those marked pixels via the stencil test and uses the depth test to keep only the pixels inside the volume, which improves lighting efficiency.
This stencil-based light culling on mobile is similar to the stencil-based lighting described in Unity's SIGGRAPH 2020 talk Deferred Shading in Unity URP (the idea is the same, though the implementations may not be identical). That talk also proposes proxy geometry that better fits each light's shape:
It also compares the performance of the various light-culling approaches on PC and mobile; below is the comparison on a Mali GPU:
Performance comparison of different lighting techniques on a Mali GPU: on mobile, the stencil-culled lighting algorithm outperforms both the conventional and the tile-based approaches.
It is worth noting that stencil-based light culling combined with the GPU's Early-Z greatly improves lighting performance. Mainstream mobile GPUs all support Early-Z, which lays the groundwork for applying this technique.
There may still be room for improvement in UE's current light-culling implementation. For example, pixels facing away from the light (shown in the red box below) need not be shaded at all. (How to find such back-facing pixels quickly and cheaply is, however, another problem.)
12.3.3.2 MobileBasePassShader
This section describes the shaders involved in the mobile BasePass, covering both the VS and the PS. First, the VS:
// Engine\Shaders\Private\MobileBasePassVertexShader.usf

(......)

struct FMobileShadingBasePassVSToPS
{
    FVertexFactoryInterpolantsVSToPS FactoryInterpolants;
    FMobileBasePassInterpolantsVSToPS BasePassInterpolants;
    float4 Position : SV_POSITION;
};

#define FMobileShadingBasePassVSOutput FMobileShadingBasePassVSToPS
#define VertexFactoryGetInterpolants VertexFactoryGetInterpolantsVSToPS

// VS entry point
void Main(
    FVertexFactoryInput Input
    , out FMobileShadingBasePassVSOutput Output
#if INSTANCED_STEREO
    , uint InstanceId : SV_InstanceID
    , out uint LayerIndex : SV_RenderTargetArrayIndex
#elif MOBILE_MULTI_VIEW
    , in uint ViewId : SV_ViewID
#endif
    )
{
// Instanced stereo mode
#if INSTANCED_STEREO
    const uint EyeIndex = GetEyeIndex(InstanceId);
    ResolvedView = ResolveView(EyeIndex);
    LayerIndex = EyeIndex;
    Output.BasePassInterpolants.MultiViewId = float(EyeIndex);
// Multi-view mode
#elif MOBILE_MULTI_VIEW
    #if COMPILER_GLSL_ES3_1
    const int MultiViewId = int(ViewId);
    ResolvedView = ResolveView(uint(MultiViewId));
    Output.BasePassInterpolants.MultiViewId = float(MultiViewId);
    #else
    ResolvedView = ResolveView(ViewId);
    Output.BasePassInterpolants.MultiViewId = float(ViewId);
    #endif
#else
    ResolvedView = ResolveView();
#endif

// Initialize the packed interpolants
#if PACK_INTERPOLANTS
    float4 PackedInterps[NUM_VF_PACKED_INTERPOLANTS];
    UNROLL
    for (int i = 0; i < NUM_VF_PACKED_INTERPOLANTS; ++i)
    {
        PackedInterps[i] = 0;
    }
#endif

    // Process the vertex factory data
    FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);
    float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);
    float4 WorldPosition = WorldPositionExcludingWPO;

    // Fetch the material's vertex data, apply the world position offset, etc.
    half3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);
    FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);
    half3 WorldPositionOffset = GetMaterialWorldPositionOffset(VertexParameters);
    WorldPosition.xyz += WorldPositionOffset;

    float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
    Output.Position = mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip);
    Output.BasePassInterpolants.PixelPosition = WorldPosition;

#if USE_WORLD_POSITION_EXCLUDING_SHADER_OFFSETS
    Output.BasePassInterpolants.PixelPositionExcludingWPO = WorldPositionExcludingWPO.xyz;
#endif

// Clipping plane
#if USE_PS_CLIP_PLANE
    Output.BasePassInterpolants.OutClipDistance = dot(ResolvedView.GlobalClippingPlane, float4(WorldPosition.xyz - ResolvedView.PreViewTranslation.xyz, 1));
#endif

// Vertex fog
#if USE_VERTEX_FOG
    float4 VertexFog = CalculateHeightFog(WorldPosition.xyz - ResolvedView.TranslatedWorldCameraOrigin);

    #if PROJECT_SUPPORT_SKY_ATMOSPHERE && MATERIAL_IS_SKY==0 // Do not apply aerial perspective on sky materials
    if (ResolvedView.SkyAtmosphereApplyCameraAerialPerspectiveVolume > 0.0f)
    {
        const float OneOverPreExposure = USE_PREEXPOSURE ? ResolvedView.OneOverPreExposure : 1.0f;
        // Sample the aerial perspective (AP). It is also blended under the VertexFog parameter.
        VertexFog = GetAerialPerspectiveLuminanceTransmittanceWithFogOver(
            ResolvedView.RealTimeReflectionCapture,
            ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeSizeAndInvSize,
            Output.Position,
            WorldPosition.xyz * CM_TO_SKY_UNIT,
            ResolvedView.TranslatedWorldCameraOrigin * CM_TO_SKY_UNIT,
            View.CameraAerialPerspectiveVolume,
            View.CameraAerialPerspectiveVolumeSampler,
            ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthResolutionInv,
            ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthResolution,
            ResolvedView.SkyAtmosphereAerialPerspectiveStartDepthKm,
            ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthSliceLengthKm,
            ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthSliceLengthKmInv,
            OneOverPreExposure,
            VertexFog);
    }
    #endif

    #if PACK_INTERPOLANTS
    PackedInterps[0] = VertexFog;
    #else
    Output.BasePassInterpolants.VertexFog = VertexFog;
    #endif // PACK_INTERPOLANTS
#endif // USE_VERTEX_FOG

    (......)

    // Compute the data to be interpolated
    Output.FactoryInterpolants = VertexFactoryGetInterpolants(Input, VFIntermediates, VertexParameters);
    Output.BasePassInterpolants.PixelPosition.w = Output.Position.w;

// Pack the interpolants
#if PACK_INTERPOLANTS
    VertexFactoryPackInterpolants(Output.FactoryInterpolants, PackedInterps);
#endif // PACK_INTERPOLANTS

#if !OUTPUT_MOBILE_HDR && COMPILER_GLSL_ES3_1
    Output.Position.y *= -1;
#endif
}
As seen above, view instancing is handled differently for instanced stereo, multi-view, and normal modes. Vertex fog is supported but disabled by default; it must be enabled in the project settings.
There is also a packed-interpolant mode that compresses the interpolation cost and bandwidth between VS and PS. Whether it is enabled is decided by the macro PACK_INTERPOLANTS, defined as follows:
// Engine\Shaders\Private\MobileBasePassCommon.ush
#define PACK_INTERPOLANTS (USE_VERTEX_FOG && NUM_VF_PACKED_INTERPOLANTS > 0 && (ES3_1_PROFILE))
In other words, interpolant packing is enabled only when vertex fog is on, the vertex factory has packed interpolants, and the shading platform is OpenGL ES 3.1. Compared with the PC BasePass VS, the mobile version is greatly simplified and can roughly be regarded as a small subset of it. Next, the PS:
// Engine\Shaders\Private\MobileBasePassPixelShader.usf

#include "Common.ush"

// Various macro definitions
#define MobileSceneTextures MobileBasePass.SceneTextures
#define EyeAdaptationStruct MobileBasePass
(......)

// Pre-normalized reflection captures closest to the rendered object (not supported for fully rough materials)
#if !FULLY_ROUGH
#if HQ_REFLECTIONS
#define MAX_HQ_REFLECTIONS 3
TextureCube ReflectionCubemap0;
SamplerState ReflectionCubemapSampler0;
TextureCube ReflectionCubemap1;
SamplerState ReflectionCubemapSampler1;
TextureCube ReflectionCubemap2;
SamplerState ReflectionCubemapSampler2;
// x,y,z - inverted average brightness for 0, 1, 2; w - sky cube texture max mips.
float4 ReflectionAverageBrigtness;
float4 ReflectanceMaxValueRGBM;
float4 ReflectionPositionsAndRadii[MAX_HQ_REFLECTIONS];
#if ALLOW_CUBE_REFLECTIONS
float4x4 CaptureBoxTransformArray[MAX_HQ_REFLECTIONS];
float4 CaptureBoxScalesArray[MAX_HQ_REFLECTIONS];
#endif
#endif
#endif

// Reflection capture / IBL interfaces
half4 GetPlanarReflection(float3 WorldPosition, half3 WorldNormal, half Roughness);
half MobileComputeMixingWeight(half IndirectIrradiance, half AverageBrightness, half Roughness);
half3 GetLookupVectorForBoxCaptureMobile(half3 ReflectionVector, ...);
half3 GetLookupVectorForSphereCaptureMobile(half3 ReflectionVector, ...);
void GatherSpecularIBL(FMaterialPixelParameters MaterialParameters, ...);
void BlendReflectionCaptures(FMaterialPixelParameters MaterialParameters, ...);
half3 GetImageBasedReflectionLighting(FMaterialPixelParameters MaterialParameters, ...);

// Other interfaces
half3 FrameBufferBlendOp(half4 Source);
bool UseCSM();
void ApplyPixelDepthOffsetForMobileBasePass(inout FMaterialPixelParameters MaterialParameters, FPixelMaterialInputs PixelMaterialInputs, out float OutDepth);

// Accumulate lighting of a dynamic point/spot light
#if MAX_DYNAMIC_POINT_LIGHTS > 0
void AccumulateLightingOfDynamicPointLight(
    FMaterialPixelParameters MaterialParameters,
    FMobileShadingModelContext ShadingModelContext,
    FGBufferData GBuffer,
    float4 LightPositionAndInvRadius,
    float4 LightColorAndFalloffExponent,
    float4 SpotLightDirectionAndSpecularScale,
    float4 SpotLightAnglesAndSoftTransitionScaleAndLightShadowType,
#if SUPPORT_SPOTLIGHTS_SHADOW
    FPCFSamplerSettings Settings,
    float4 SpotLightShadowSharpenAndShadowFadeFraction,
    float4 SpotLightShadowmapMinMax,
    float4x4 SpotLightShadowWorldToShadowMatrix,
#endif
    inout half3 Color)
{
    uint LightShadowType = SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.w;
    float FadedShadow = 1.0f;

// Compute spotlight shadows
#if SUPPORT_SPOTLIGHTS_SHADOW
    if ((LightShadowType & LightShadowType_Shadow) == LightShadowType_Shadow)
    {
        float4 HomogeneousShadowPosition = mul(float4(MaterialParameters.AbsoluteWorldPosition, 1), SpotLightShadowWorldToShadowMatrix);
        float2 ShadowUVs = HomogeneousShadowPosition.xy / HomogeneousShadowPosition.w;
        if (all(ShadowUVs >= SpotLightShadowmapMinMax.xy && ShadowUVs <= SpotLightShadowmapMinMax.zw))
        {
            // Clamp pixel depth in light space for shadowing opaque, because areas of the shadow depth buffer that weren't rendered to will have been cleared to 1
            // We want to force the shadow comparison to result in 'unshadowed' in that case, regardless of whether the pixel being shaded is in front or behind that plane
            float LightSpacePixelDepthForOpaque = min(HomogeneousShadowPosition.z, 0.99999f);
            Settings.SceneDepth = LightSpacePixelDepthForOpaque;
            Settings.TransitionScale = SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.z;

            half Shadow = MobileShadowPCF(ShadowUVs, Settings);
            Shadow = saturate((Shadow - 0.5) * SpotLightShadowSharpenAndShadowFadeFraction.x + 0.5);
            FadedShadow = lerp(1.0f, Square(Shadow), SpotLightShadowSharpenAndShadowFadeFraction.y);
        }
    }
#endif

    // Compute the lighting
    if ((LightShadowType & ValidLightType) != 0)
    {
        float3 ToLight = LightPositionAndInvRadius.xyz - MaterialParameters.AbsoluteWorldPosition;
        float DistanceSqr = dot(ToLight, ToLight);
        float3 L = ToLight * rsqrt(DistanceSqr);
        half3 PointH = normalize(MaterialParameters.CameraVector + L);
        half PointNoL = max(0, dot(MaterialParameters.WorldNormal, L));
        half PointNoH = max(0, dot(MaterialParameters.WorldNormal, PointH));

        // Compute the light's attenuation
        float Attenuation;
        if (LightColorAndFalloffExponent.w == 0)
        {
            // Sphere falloff (technically just 1/d2 but this avoids inf)
            Attenuation = 1 / (DistanceSqr + 1);
            float LightRadiusMask = Square(saturate(1 - Square(DistanceSqr * (LightPositionAndInvRadius.w * LightPositionAndInvRadius.w))));
            Attenuation *= LightRadiusMask;
        }
        else
        {
            Attenuation = RadialAttenuation(ToLight * LightPositionAndInvRadius.w, LightColorAndFalloffExponent.w);
        }

#if PROJECT_MOBILE_ENABLE_MOVABLE_SPOTLIGHTS
        if ((LightShadowType & LightShadowType_SpotLight) == LightShadowType_SpotLight)
        {
            Attenuation *= SpotAttenuation(L, -SpotLightDirectionAndSpecularScale.xyz, SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.xy) * FadedShadow;
        }
#endif

        // Accumulate the lighting result
#if !FULLY_ROUGH
        FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, PointNoL, MaterialParameters.CameraVector, PointH, PointNoH);
        Color += min(65000.0, (Attenuation) * LightColorAndFalloffExponent.rgb * (1.0 / PI) * (Lighting.Diffuse + Lighting.Specular * SpotLightDirectionAndSpecularScale.w));
#else
        Color += (Attenuation * PointNoL) * LightColorAndFalloffExponent.rgb * (1.0 / PI) * ShadingModelContext.DiffuseColor;
#endif
    }
}
#endif

(......)

// Compute indirect lighting
half ComputeIndirect(VTPageTableResult LightmapVTPageTableResult, FVertexFactoryInterpolantsVSToPS Interpolants, float3 DiffuseDir, FMobileShadingModelContext ShadingModelContext, out half IndirectIrradiance, out half3 Color)
{
    // To keep the IndirectLightingCache coherent with PC, initialize IndirectIrradiance to zero.
    IndirectIrradiance = 0;
    Color = 0;

// Indirect diffuse
#if LQ_TEXTURE_LIGHTMAP
    float2 LightmapUV0, LightmapUV1;
    uint LightmapDataIndex;
    GetLightMapCoordinates(Interpolants, LightmapUV0, LightmapUV1, LightmapDataIndex);
    half4 LightmapColor = GetLightMapColorLQ(LightmapVTPageTableResult, LightmapUV0, LightmapUV1, LightmapDataIndex, DiffuseDir);
    Color += LightmapColor.rgb * ShadingModelContext.DiffuseColor * View.IndirectLightingColorScale;
    IndirectIrradiance = LightmapColor.a;
#elif CACHED_POINT_INDIRECT_LIGHTING
    #if MATERIALBLENDING_MASKED || MATERIALBLENDING_SOLID
    // For opaque and masked materials, take the normal into account
    FThreeBandSHVectorRGB PointIndirectLighting;
    PointIndirectLighting.R.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[0];
    PointIndirectLighting.R.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[0];
    PointIndirectLighting.R.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[0];
    PointIndirectLighting.G.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[1];
    PointIndirectLighting.G.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[1];
    PointIndirectLighting.G.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[1];
    PointIndirectLighting.B.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[2];
    PointIndirectLighting.B.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[2];
    PointIndirectLighting.B.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[2];

    FThreeBandSHVector DiffuseTransferSH = CalcDiffuseTransferSH3(DiffuseDir, 1);
    // Compute diffuse lighting with the normal taken into account
    half3 DiffuseGI = max(half3(0, 0, 0), DotSH3(PointIndirectLighting, DiffuseTransferSH));
    IndirectIrradiance = Luminance(DiffuseGI);
    Color += ShadingModelContext.DiffuseColor * DiffuseGI * View.IndirectLightingColorScale;
    #else
    // Translucency uses a non-directional term; diffuse is packed in xyz and the PI and SH diffuse factors were already applied on the CPU side
    half3 PointIndirectLighting = IndirectLightingCache.IndirectLightingSHSingleCoefficient.rgb;
    half3 DiffuseGI = PointIndirectLighting;
    IndirectIrradiance = Luminance(DiffuseGI);
    Color += ShadingModelContext.DiffuseColor * DiffuseGI * View.IndirectLightingColorScale;
    #endif
#endif

    return IndirectIrradiance;
}

// PS entry point
PIXELSHADER_EARLYDEPTHSTENCIL
void Main(
    FVertexFactoryInterpolantsVSToPS Interpolants
    , FMobileBasePassInterpolantsVSToPS BasePassInterpolants
    , in float4 SvPosition : SV_Position
    OPTIONAL_IsFrontFace
    , out half4 OutColor : SV_Target0
#if DEFERRED_SHADING_PATH
    , out half4 OutGBufferA : SV_Target1
    , out half4 OutGBufferB : SV_Target2
    , out half4 OutGBufferC : SV_Target3
#endif
#if USE_SCENE_DEPTH_AUX
    , out float OutSceneDepthAux : SV_Target4
#endif
#if OUTPUT_PIXEL_DEPTH_OFFSET
    , out float OutDepth : SV_Depth
#endif
    )
{
#if MOBILE_MULTI_VIEW
    ResolvedView = ResolveView(uint(BasePassInterpolants.MultiViewId));
#else
    ResolvedView = ResolveView();
#endif

#if USE_PS_CLIP_PLANE
    clip(BasePassInterpolants.OutClipDistance);
#endif

// Unpack the packed interpolants
#if PACK_INTERPOLANTS
    float4 PackedInterpolants[NUM_VF_PACKED_INTERPOLANTS];
    VertexFactoryUnpackInterpolants(Interpolants, PackedInterpolants);
#endif

#if COMPILER_GLSL_ES3_1 && !OUTPUT_MOBILE_HDR && !MOBILE_EMULATION
    // LDR Mobile needs the screen flipped vertically
    SvPosition.y = ResolvedView.BufferSizeAndInvSize.y - SvPosition.y - 1;
#endif

    // Fetch the material's pixel parameters
    FMaterialPixelParameters MaterialParameters = GetMaterialPixelParameters(Interpolants, SvPosition);
    FPixelMaterialInputs PixelMaterialInputs;
    {
        float4 ScreenPosition = SvPositionToResolvedScreenPosition(SvPosition);
        float3 WorldPosition = BasePassInterpolants.PixelPosition.xyz;
        float3 WorldPositionExcludingWPO = BasePassInterpolants.PixelPosition.xyz;
#if USE_WORLD_POSITION_EXCLUDING_SHADER_OFFSETS
        WorldPositionExcludingWPO = BasePassInterpolants.PixelPositionExcludingWPO;
#endif
        CalcMaterialParametersEx(MaterialParameters, PixelMaterialInputs, SvPosition, ScreenPosition, bIsFrontFace, WorldPosition, WorldPositionExcludingWPO);

#if FORCE_VERTEX_NORMAL
        // Quality level override of material's normal calculation, can be used to avoid normal map reads etc.
        MaterialParameters.WorldNormal = MaterialParameters.TangentToWorld[2];
        MaterialParameters.ReflectionVector = ReflectionAboutCustomWorldNormal(MaterialParameters, MaterialParameters.WorldNormal, false);
#endif
    }

// Pixel depth offset
#if OUTPUT_PIXEL_DEPTH_OFFSET
    ApplyPixelDepthOffsetForMobileBasePass(MaterialParameters, PixelMaterialInputs, OutDepth);
#endif

// Masked materials
#if !EARLY_Z_PASS_ONLY_MATERIAL_MASKING
    // Clip if the blend mode requires it.
    GetMaterialCoverageAndClipping(MaterialParameters, PixelMaterialInputs);
#endif

    // Compute and cache GBuffer data to avoid sampling the same textures repeatedly later
    FGBufferData GBuffer = (FGBufferData)0;
    GBuffer.WorldNormal = MaterialParameters.WorldNormal;
    GBuffer.BaseColor = GetMaterialBaseColor(PixelMaterialInputs);
    GBuffer.Metallic = GetMaterialMetallic(PixelMaterialInputs);
    GBuffer.Specular = GetMaterialSpecular(PixelMaterialInputs);
    GBuffer.Roughness = GetMaterialRoughness(PixelMaterialInputs);
    GBuffer.ShadingModelID = GetMaterialShadingModel(PixelMaterialInputs);
    half MaterialAO = GetMaterialAmbientOcclusion(PixelMaterialInputs);

// Apply AO
#if APPLY_AO
    half4 GatheredAmbientOcclusion = Texture2DSample(AmbientOcclusionTexture, AmbientOcclusionSampler, SvPositionToBufferUV(SvPosition));
    MaterialAO *= GatheredAmbientOcclusion.r;
#endif
    GBuffer.GBufferAO = MaterialAO;

    // The smallest positive subnormal representable in IEEE 754 FP16 is 2^-24 = 5.96e-8, and the lighting below involves 1.0 / Roughness^4;
    // to avoid division errors we must ensure Roughness^4 >= 5.96e-8, so roughness is clamped to 0.015625 (0.015625^4 = 5.96e-8).
    // Also, to match deferred rendering on PC (where roughness is stored in an 8-bit channel), it is effectively clamped to 1.0 as well.
    GBuffer.Roughness = max(0.015625, GetMaterialRoughness(PixelMaterialInputs));

    // Initialize the mobile shading model context FMobileShadingModelContext
    FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0;
    ShadingModelContext.Opacity = GetMaterialOpacity(PixelMaterialInputs);

// Thin translucency
#if MATERIAL_SHADINGMODEL_THIN_TRANSLUCENT
    (......)
#endif

    half3 Color = 0;

    // Custom data
    half CustomData0 = GetMaterialCustomData0(MaterialParameters);
    half CustomData1 = GetMaterialCustomData1(MaterialParameters);
    InitShadingModelContext(ShadingModelContext, GBuffer, MaterialParameters.SvPosition, MaterialParameters.CameraVector, CustomData0, CustomData1);
    float3 DiffuseDir = MaterialParameters.WorldNormal;

// Hair shading model
#if MATERIAL_SHADINGMODEL_HAIR
    (......)
#endif

    // Lightmap virtual texture
    VTPageTableResult LightmapVTPageTableResult = (VTPageTableResult)0.0f;
#if LIGHTMAP_VT_ENABLED
    {
        float2 LightmapUV0, LightmapUV1;
        uint LightmapDataIndex;
        GetLightMapCoordinates(Interpolants, LightmapUV0, LightmapUV1, LightmapDataIndex);
        LightmapVTPageTableResult = LightmapGetVTSampleInfo(LightmapUV0, LightmapDataIndex, SvPosition.xy);
    }
#endif

#if LIGHTMAP_VT_ENABLED
    // This must occur after CalcMaterialParameters(), which is required to initialize the VT feedback mechanism
    // Lightmap request is always the first VT sample in the shader
    StoreVirtualTextureFeedback(MaterialParameters.VirtualTextureFeedback, 0, LightmapVTPageTableResult.PackedRequest);
#endif

    // Compute indirect lighting
    half IndirectIrradiance;
    half3 IndirectColor;
    ComputeIndirect(LightmapVTPageTableResult, Interpolants, DiffuseDir, ShadingModelContext, IndirectIrradiance, IndirectColor);
    Color += IndirectColor;

    // Precomputed shadow mask
    half Shadow = GetPrimaryPrecomputedShadowMask(LightmapVTPageTableResult, Interpolants).r;

#if DEFERRED_SHADING_PATH
    float4 OutGBufferD;
    float4 OutGBufferE;
    float4 OutGBufferF;
    float4 OutGBufferVelocity = 0;

    GBuffer.IndirectIrradiance = IndirectIrradiance;
    GBuffer.PrecomputedShadowFactors.r = Shadow;

    // Encode the GBuffer data
    EncodeGBuffer(GBuffer, OutGBufferA, OutGBufferB, OutGBufferC, OutGBufferD, OutGBufferE, OutGBufferF, OutGBufferVelocity);
#else
    #if !MATERIAL_SHADINGMODEL_UNLIT
        // Sky light
        #if ENABLE_SKY_LIGHT
        half3 SkyDiffuseLighting = GetSkySHDiffuseSimple(MaterialParameters.WorldNormal);
        half3 DiffuseLookup = SkyDiffuseLighting * ResolvedView.SkyLightColor.rgb;
        IndirectIrradiance += Luminance(DiffuseLookup);
        #endif

        Color *= MaterialAO;
        IndirectIrradiance *= MaterialAO;

        float ShadowPositionZ = 0;
        // CSM shadows
        #if DIRECTIONAL_LIGHT_CSM && !MATERIAL_SHADINGMODEL_SINGLELAYERWATER
        if (UseCSM())
        {
            half ShadowMap = MobileDirectionalLightCSM(MaterialParameters.ScreenPosition.xy, MaterialParameters.ScreenPosition.w, ShadowPositionZ);
            #if ALLOW_STATIC_LIGHTING
            Shadow = min(ShadowMap, Shadow);
            #else
            Shadow = ShadowMap;
            #endif
        }
        #endif /* DIRECTIONAL_LIGHT_CSM */

        // Distance field shadows
        #if APPLY_DISTANCE_FIELD
        if (ShadowPositionZ == 0)
        {
            Shadow = Texture2DSample(MobileBasePass.ScreenSpaceShadowMaskTexture, MobileBasePass.ScreenSpaceShadowMaskSampler, SvPositionToBufferUV(SvPosition)).x;
        }
        #endif

        half NoL = max(0, dot(MaterialParameters.WorldNormal, MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz));
        half3 H = normalize(MaterialParameters.CameraVector + MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz);
        half NoH = max(0, dot(MaterialParameters.WorldNormal, H));

        // Directional light + IBL
        #if FULLY_ROUGH
        Color += (Shadow * NoL) * MobileDirectionalLight.DirectionalLightColor.rgb * ShadingModelContext.DiffuseColor;
        #else
        FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, NoL, MaterialParameters.CameraVector, H, NoH);
        // MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z stores the directional light's specular scale
        Color += (Shadow) * MobileDirectionalLight.DirectionalLightColor.rgb * (Lighting.Diffuse + Lighting.Specular * MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z);

        // Hair shading
        #if !(MATERIAL_SINGLE_SHADINGMODEL && MATERIAL_SHADINGMODEL_HAIR)
        (......)
        #endif
        #endif /* FULLY_ROUGH */

        // Local lights, up to 4
        #if MAX_DYNAMIC_POINT_LIGHTS > 0 && !MATERIAL_SHADINGMODEL_SINGLELAYERWATER
        if (NumDynamicPointLights > 0)
        {
            #if SUPPORT_SPOTLIGHTS_SHADOW
            FPCFSamplerSettings Settings;
            Settings.ShadowDepthTexture = DynamicSpotLightShadowTexture;
            Settings.ShadowDepthTextureSampler = DynamicSpotLightShadowSampler;
            Settings.ShadowBufferSize = DynamicSpotLightShadowBufferSize;
            Settings.bSubsurface = false;
            Settings.bTreatMaxDepthUnshadowed = false;
            Settings.DensityMulConstant = 0;
            Settings.ProjectionDepthBiasParameters = 0;
            #endif

            AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
            if (MAX_DYNAMIC_POINT_LIGHTS > 1 && NumDynamicPointLights > 1)
            {
                AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
                if (MAX_DYNAMIC_POINT_LIGHTS > 2 && NumDynamicPointLights > 2)
                {
                    AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
                    if (MAX_DYNAMIC_POINT_LIGHTS > 3 && NumDynamicPointLights > 3)
                    {
                        AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
                    }
                }
            }
        }
        #endif

        // Sky light
        #if ENABLE_SKY_LIGHT
        #if MATERIAL_TWOSIDED && LQ_TEXTURE_LIGHTMAP
        if (NoL == 0)
        {
        #endif
            #if MATERIAL_SHADINGMODEL_SINGLELAYERWATER
            ShadingModelContext.WaterDiffuseIndirectLuminance += SkyDiffuseLighting;
            #endif
            Color += SkyDiffuseLighting * half3(ResolvedView.SkyLightColor.rgb) * ShadingModelContext.DiffuseColor * MaterialAO;
        #if MATERIAL_TWOSIDED && LQ_TEXTURE_LIGHTMAP
        }
        #endif
        #endif
    #endif /* !MATERIAL_SHADINGMODEL_UNLIT */

    #if MATERIAL_SHADINGMODEL_SINGLELAYERWATER
    (......)
    #endif // MATERIAL_SHADINGMODEL_SINGLELAYERWATER
#endif // DEFERRED_SHADING_PATH

    // Vertex fog
    half4 VertexFog = half4(0, 0, 0, 1);
#if USE_VERTEX_FOG
    #if PACK_INTERPOLANTS
    VertexFog = PackedInterpolants[0];
    #else
    VertexFog = BasePassInterpolants.VertexFog;
    #endif
#endif

    // Emissive
    half3 Emissive = GetMaterialEmissive(PixelMaterialInputs);
#if MATERIAL_SHADINGMODEL_THIN_TRANSLUCENT
    Emissive *= TopMaterialCoverage;
#endif
    Color += Emissive;

#if !MATERIAL_SHADINGMODEL_UNLIT && MOBILE_EMULATION
    Color = lerp(Color, ShadingModelContext.DiffuseColor, ResolvedView.UnlitViewmodeMask);
#endif

    // Combine the fog color into the output color
#if MATERIALBLENDING_ALPHACOMPOSITE || MATERIAL_SHADINGMODEL_SINGLELAYERWATER
    OutColor = half4(Color * VertexFog.a + VertexFog.rgb * ShadingModelContext.Opacity, ShadingModelContext.Opacity);
#elif MATERIALBLENDING_ALPHAHOLDOUT
    // not implemented for holdout
    OutColor = half4(Color * VertexFog.a + VertexFog.rgb * ShadingModelContext.Opacity, ShadingModelContext.Opacity);
#elif MATERIALBLENDING_TRANSLUCENT
    OutColor = half4(Color * VertexFog.a + VertexFog.rgb, ShadingModelContext.Opacity);
#elif MATERIALBLENDING_ADDITIVE
    OutColor = half4(Color * (VertexFog.a * ShadingModelContext.Opacity.x), 0.0f);
#elif MATERIALBLENDING_MODULATE
    half3 FoggedColor = lerp(half3(1, 1, 1), Color, VertexFog.aaa * VertexFog.aaa);
    OutColor = half4(FoggedColor, ShadingModelContext.Opacity);
#else
    OutColor.rgb = Color * VertexFog.a + VertexFog.rgb;

    #if !MATERIAL_USE_ALPHA_TO_COVERAGE
    // Scene color alpha is not used yet so we set it to 1
    OutColor.a = 1.0;
        #if OUTPUT_MOBILE_HDR
        // Store depth in the FP16 alpha. This depth value can be fetched during translucency or sampled in post-processing
        OutColor.a = SvPosition.z;
        #endif
    #else
    half MaterialOpacityMask = GetMaterialMaskInputRaw(PixelMaterialInputs);
    OutColor.a = GetMaterialMask(PixelMaterialInputs) / max(abs(ddx(MaterialOpacityMask)) + abs(ddy(MaterialOpacityMask)), 0.0001f) + 0.5f;
    #endif
#endif

#if !MATERIALBLENDING_MODULATE && USE_PREEXPOSURE
    OutColor.rgb *= ResolvedView.PreExposure;
#endif

#if MATERIAL_IS_SKY
    OutColor.rgb = min(OutColor.rgb, Max10BitsFloat.xxx * 0.5f);
#endif

#if USE_SCENE_DEPTH_AUX
    OutSceneDepthAux = SvPosition.z;
#endif

    // Handle the color's alpha
#if USE_EDITOR_COMPOSITING && (MOBILE_EMULATION)
    // Editor primitive depth testing
    OutColor.a = 1.0;
    #if MATERIALBLENDING_MASKED
    // some material might have an opacity value
    OutColor.a = GetMaterialMaskInputRaw(PixelMaterialInputs);
    #endif
    clip(OutColor.a - GetMaterialOpacityMaskClipValue());
#else
    #if OUTPUT_GAMMA_SPACE
    OutColor.rgb = sqrt(OutColor.rgb);
    #endif
#endif

#if NUM_VIRTUALTEXTURE_SAMPLES || LIGHTMAP_VT_ENABLED
    FinalizeVirtualTextureFeedback(
        MaterialParameters.VirtualTextureFeedback,
        MaterialParameters.SvPosition,
        ShadingModelContext.Opacity,
        View.FrameNumber,
        View.VTFeedbackBuffer);
#endif
}
The mobile BasePass PS is complex and has many steps, mainly: unpacking the interpolants; fetching and computing material attributes; computing and adjusting GBuffer data; computing the lighting of the forward-rendering branch (directional light, local lights); computing distance-field, CSM, and other shadows; computing sky light; handling static lighting, indirect lighting, and IBL; computing fog; and handling special shading models such as single-layer water, hair, and thin translucency.
Since the smallest positive value representable by a standard 16-bit float (FP16) is \( \cfrac{1.0}{2^{24}} = 5.96 \cdot 10^{-8} \), and the subsequent lighting calculation involves the fourth power of roughness (\( \cfrac{1.0}{\text{Roughness}^4} \)), the roughness has to be clamped to \( 0.015625 \) (\( 0.015625^4 = 5.96 \cdot 10^{-8} \)) to avoid division errors:
GBuffer.Roughness = max(0.015625, GetMaterialRoughness(PixelMaterialInputs));
This is a reminder to pay close attention to numeric precision when developing mobile rendering features; otherwise, insufficient precision tends to produce all kinds of bizarre image artifacts on low-end devices.
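A tiny standalone check of these numbers (plain C++, illustration only):

#include <cmath>
#include <cstdio>

int main()
{
    // Smallest positive FP16 subnormal: 2^-24 ≈ 5.96e-8.
    const float Fp16Min = std::ldexp(1.0f, -24);
    std::printf("2^-24      = %e\n", Fp16Min);
    // The clamp value is chosen so that Roughness^4 lands exactly on that limit:
    // 0.015625 = 2^-6, and (2^-6)^4 = 2^-24.
    std::printf("0.015625^4 = %e\n", std::pow(0.015625f, 4.0f));
    // A smaller roughness would underflow to zero in half precision,
    // making the 1/Roughness^4 term in the specular math blow up.
    std::printf("0.01^4     = %e (below the FP16 limit)\n", std::pow(0.01f, 4.0f));
    return 0;
}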
Although the code above is long, it is gated by many macros, and the code actually compiled for a single material may be only a small subset of it. For example, up to four dynamic local lights are supported per pixel by default, but if the project settings (below) reduce this to 2 or fewer, far fewer lighting instructions are actually executed.
In the forward-rendering branch, much of the GBuffer handling is skipped; in the deferred branch, the directional light and local light calculations are skipped and are instead executed by the deferred lighting pass's shader.
The important EncodeGBuffer function deserves a closer look:
void EncodeGBuffer(
    FGBufferData GBuffer,
    out float4 OutGBufferA,
    out float4 OutGBufferB,
    out float4 OutGBufferC,
    out float4 OutGBufferD,
    out float4 OutGBufferE,
    out float4 OutGBufferVelocity,
    float QuantizationBias = 0 // -0.5 to 0.5 random float. Used to bias quantization.
)
{
    if (GBuffer.ShadingModelID == SHADINGMODELID_UNLIT)
    {
        OutGBufferA = 0;
        SetGBufferForUnlit(OutGBufferB);
        OutGBufferC = 0;
        OutGBufferD = 0;
        OutGBufferE = 0;
    }
    else
    {
// GBufferA: octahedron-compressed normal, precomputed shadow factor, per-object data
#if MOBILE_DEFERRED_SHADING
        OutGBufferA.rg = UnitVectorToOctahedron(normalize(GBuffer.WorldNormal)) * 0.5f + 0.5f;
        OutGBufferA.b = GBuffer.PrecomputedShadowFactors.x;
        OutGBufferA.a = GBuffer.PerObjectGBufferData;
#else
        (......)
#endif

        // GBufferB: metallic, specular, roughness, shading model, selective output mask
        OutGBufferB.r = GBuffer.Metallic;
        OutGBufferB.g = GBuffer.Specular;
        OutGBufferB.b = GBuffer.Roughness;
        OutGBufferB.a = EncodeShadingModelIdAndSelectiveOutputMask(GBuffer.ShadingModelID, GBuffer.SelectiveOutputMask);

        // GBufferC: base color, plus AO or indirect irradiance
        OutGBufferC.rgb = EncodeBaseColor(GBuffer.BaseColor);
#if ALLOW_STATIC_LIGHTING
        // No space for AO. Multiply IndirectIrradiance by AO instead of storing.
        OutGBufferC.a = EncodeIndirectIrradiance(GBuffer.IndirectIrradiance * GBuffer.GBufferAO) + QuantizationBias * (1.0 / 255.0);
#else
        OutGBufferC.a = GBuffer.GBufferAO;
#endif

        OutGBufferD = GBuffer.CustomData;
        OutGBufferE = GBuffer.PrecomputedShadowFactors;
    }

#if WRITES_VELOCITY_TO_GBUFFER
    OutGBufferVelocity = GBuffer.Velocity;
#else
    OutGBufferVelocity = 0;
#endif
}
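GBufferA.rg stores the world normal with octahedral encoding, which maps a unit vector to two values with little distortion. A standalone sketch of the encode/decode pair, written in plain C++ to mirror the shader's UnitVectorToOctahedron/OctahedronToUnitVector in spirit (illustration only):

#include <cmath>
#include <cstdio>

struct Vec2 { float X, Y; };
struct Vec3 { float X, Y, Z; };

static float SignNotZero(float V) { return V >= 0.0f ? 1.0f : -1.0f; }

// Encode: project onto the octahedron |x|+|y|+|z| = 1, then fold the lower hemisphere
Vec2 UnitVectorToOctahedron(Vec3 N)
{
    const float AbsSum = std::fabs(N.X) + std::fabs(N.Y) + std::fabs(N.Z);
    Vec2 P { N.X / AbsSum, N.Y / AbsSum };
    if (N.Z <= 0.0f)
    {
        P = { (1.0f - std::fabs(P.Y)) * SignNotZero(P.X),
              (1.0f - std::fabs(P.X)) * SignNotZero(P.Y) };
    }
    return P; // in [-1,1]^2; the GBuffer stores P * 0.5 + 0.5
}

// Decode: undo the fold and renormalize
Vec3 OctahedronToUnitVector(Vec2 P)
{
    Vec3 N { P.X, P.Y, 1.0f - std::fabs(P.X) - std::fabs(P.Y) };
    if (N.Z < 0.0f)
    {
        const float OldX = N.X;
        N.X = (1.0f - std::fabs(N.Y)) * SignNotZero(OldX);
        N.Y = (1.0f - std::fabs(OldX)) * SignNotZero(N.Y);
    }
    const float Len = std::sqrt(N.X * N.X + N.Y * N.Y + N.Z * N.Z);
    return { N.X / Len, N.Y / Len, N.Z / Len };
}

int main()
{
    const Vec3 N { 0.36f, -0.48f, 0.80f }; // already unit length
    const Vec2 Enc = UnitVectorToOctahedron(N);
    const Vec3 Dec = OctahedronToUnitVector(Enc);
    std::printf("encoded: (%f, %f)\n", Enc.X, Enc.Y);
    std::printf("decoded: (%f, %f, %f)\n", Dec.X, Dec.Y, Dec.Z);
}

Two channels instead of three free up GBufferA.ba for the shadow factor and per-object data, which is exactly how the layout above stays within a single RGBA8 target.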
Under the default shading model (DefaultLit), the BasePass outputs the following render targets:
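Derived from the EncodeGBuffer code above, the layout under MOBILE_DEFERRED_SHADING can be summarized as follows (SceneColor and SceneDepthAux are written alongside these targets):

- GBufferA: RG = octahedron-encoded world normal, B = precomputed shadow factor, A = per-object GBuffer data.
- GBufferB: R = metallic, G = specular, B = roughness, A = shading model ID + selective output mask.
- GBufferC: RGB = encoded base color, A = indirect irradiance × AO (when static lighting is allowed) or AO.
- GBufferD: custom data of the shading model.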
12.3.3.3 MobileDeferredShading
The VS used for deferred lighting on mobile is the same as on PC: both use DeferredLightVertexShaders.usf. The PS differs, however: mobile uses MobileDeferredShading.usf. Since the VS is identical to the PC version and does nothing special, it is skipped here; interested readers can refer to section 5.5.3.1 DeferredLightVertexShader in chapter 5.
Let's go straight to the PS code:
// Engine\Shaders\Private\MobileDeferredShading.usf (......) // Mobile end light source data structure struct FMobileLightData { float3 Position; float InvRadius; float3 Color; float FalloffExponent; float3 Direction; float2 SpotAngles; float SourceRadius; float SpecularScale; bool bInverseSquared; bool bSpotLight; }; // Get GBuffer data void FetchGBuffer(in float2 UV, out float4 GBufferA, out float4 GBufferB, out float4 GBufferC, out float4 GBufferD, out float SceneDepth) { // Vulkan's child pass gets data #if VULKAN_PROFILE GBufferA = VulkanSubpassFetch1(); GBufferB = VulkanSubpassFetch2(); GBufferC = VulkanSubpassFetch3(); GBufferD = 0; SceneDepth = ConvertFromDeviceZ(VulkanSubpassDepthFetch()); // The sub pass of Metal gets data #elif METAL_PROFILE GBufferA = SubpassFetchRGBA_1(); GBufferB = SubpassFetchRGBA_2(); GBufferC = SubpassFetchRGBA_3(); GBufferD = 0; SceneDepth = ConvertFromDeviceZ(SubpassFetchR_4()); // The sub pass of other platforms (DX, OpenGL) obtains data #else GBufferA = Texture2DSampleLevel(MobileSceneTextures.GBufferATexture, MobileSceneTextures.GBufferATextureSampler, UV, 0); GBufferB = Texture2DSampleLevel(MobileSceneTextures.GBufferBTexture, MobileSceneTextures.GBufferBTextureSampler, UV, 0); GBufferC = Texture2DSampleLevel(MobileSceneTextures.GBufferCTexture, MobileSceneTextures.GBufferCTextureSampler, UV, 0); GBufferD = 0; SceneDepth = ConvertFromDeviceZ(Texture2DSampleLevel(MobileSceneTextures.SceneDepthTexture, MobileSceneTextures.SceneDepthTextureSampler, UV, 0).r); #endif } // Decompress GBuffer data FGBufferData DecodeGBufferMobile( float4 InGBufferA, float4 InGBufferB, float4 InGBufferC, float4 InGBufferD) { FGBufferData GBuffer; GBuffer.WorldNormal = OctahedronToUnitVector( InGBufferA.xy * 2.0f - 1.0f ); GBuffer.PrecomputedShadowFactors = InGBufferA.z; GBuffer.PerObjectGBufferData = InGBufferA.a; GBuffer.Metallic = InGBufferB.r; GBuffer.Specular = InGBufferB.g; GBuffer.Roughness = max(0.015625, InGBufferB.b); // Note: must match GetShadingModelId standalone function logic // Also Note: SimpleElementPixelShader directly sets SV_Target2 ( GBufferB ) to indicate unlit. // An update there will be required if this layout changes. GBuffer.ShadingModelID = DecodeShadingModelId(InGBufferB.a); GBuffer.SelectiveOutputMask = DecodeSelectiveOutputMask(InGBufferB.a); GBuffer.BaseColor = DecodeBaseColor(InGBufferC.rgb); #if ALLOW_STATIC_LIGHTING GBuffer.GBufferAO = 1; GBuffer.IndirectIrradiance = DecodeIndirectIrradiance(InGBufferC.a); #else GBuffer.GBufferAO = InGBufferC.a; GBuffer.IndirectIrradiance = 1; #endif GBuffer.CustomData = HasCustomGBufferData(GBuffer.ShadingModelID) ? 
InGBufferD : 0; return GBuffer; } // Direct illumination half3 GetDirectLighting( FMobileLightData LightData, FMobileShadingModelContext ShadingModelContext, FGBufferData GBuffer, float3 WorldPosition, half3 CameraVector) { half3 DirectLighting = 0; float3 ToLight = LightData.Position - WorldPosition; float DistanceSqr = dot(ToLight, ToLight); float3 L = ToLight * rsqrt(DistanceSqr); // Light attenuation float Attenuation = 0.0; if (LightData.bInverseSquared) { // Sphere falloff (technically just 1/d2 but this avoids inf) Attenuation = 1.0f / (DistanceSqr + 1.0f); Attenuation *= Square(saturate(1 - Square(DistanceSqr * Square(LightData.InvRadius)))); } else { Attenuation = RadialAttenuation(ToLight * LightData.InvRadius, LightData.FalloffExponent); } // Spotlight attenuation if (LightData.bSpotLight) { Attenuation *= SpotAttenuation(L, -LightData.Direction, LightData.SpotAngles); } // If the attenuation is not 0, direct illumination is calculated if (Attenuation > 0.0) { half3 H = normalize(CameraVector + L); half NoL = max(0.0, dot(GBuffer.WorldNormal, L)); half NoH = max(0.0, dot(GBuffer.WorldNormal, H)); FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, NoL, CameraVector, H, NoH); DirectLighting = (Lighting.Diffuse + Lighting.Specular * LightData.SpecularScale) * (LightData.Color * (1.0 / PI) * Attenuation); } return DirectLighting; } // Illumination function half ComputeLightFunctionMultiplier(float3 WorldPosition); // Using light grids to add local lighting, dynamic shadows are not supported because a per light shadow map is required half3 GetLightGridLocalLighting(const FCulledLightsGridData InLightGridData, ...); // PS main entrance for directional light void MobileDirectLightPS( noperspective float4 UVAndScreenPos : TEXCOORD0, float4 SvPosition : SV_POSITION, out half4 OutColor : SV_Target0) { // Recover (read) GBuffer data FGBufferData GBuffer = (FGBufferData)0; float SceneDepth = 0; { float4 GBufferA = 0; float4 GBufferB = 0; float4 GBufferC = 0; float4 GBufferD = 0; FetchGBuffer(UVAndScreenPos.xy, GBufferA, GBufferB, GBufferC, GBufferD, SceneDepth); GBuffer = DecodeGBufferMobile(GBufferA, GBufferB, GBufferC, GBufferD); } // Calculate the base vector float2 ScreenPos = UVAndScreenPos.zw; float3 WorldPosition = mul(float4(ScreenPos * SceneDepth, SceneDepth, 1), View.ScreenToWorld).xyz; half3 CameraVector = normalize(View.WorldCameraOrigin - WorldPosition); half NoV = max(0, dot(GBuffer.WorldNormal, CameraVector)); half3 ReflectionVector = GBuffer.WorldNormal * (NoV * 2.0) - CameraVector; half3 Color = 0; // Check movable light param to determine if we should be using precomputed shadows half Shadow = LightFunctionParameters2.z > 0.0f ? 
1.0f : GBuffer.PrecomputedShadowFactors.r; // CSM shadows #if APPLY_CSM float ShadowPositionZ = 0; float4 ScreenPosition = SvPositionToScreenPosition(float4(SvPosition.xyz,SceneDepth)); float ShadowMap = MobileDirectionalLightCSM(ScreenPosition.xy, SceneDepth, ShadowPositionZ); Shadow = min(ShadowMap, Shadow); #endif // Shading model context FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0; { half DielectricSpecular = 0.08 * GBuffer.Specular; ShadingModelContext.DiffuseColor = GBuffer.BaseColor - GBuffer.BaseColor * GBuffer.Metallic; // 1 mad ShadingModelContext.SpecularColor = (DielectricSpecular - DielectricSpecular * GBuffer.Metallic) + GBuffer.BaseColor * GBuffer.Metallic; // 2 mad // BRDF of computing environment ShadingModelContext.SpecularColor = GetEnvBRDF(ShadingModelContext.SpecularColor, GBuffer.Roughness, NoV); } // Local light source float2 LocalPosition = SvPosition.xy - View.ViewRectMin.xy; uint GridIndex = ComputeLightGridCellIndex(uint2(LocalPosition.x, LocalPosition.y), SceneDepth); // Cluster light source #if USE_CLUSTERED { const uint EyeIndex = 0; const FCulledLightsGridData CulledLightGridData = GetCulledLightsGrid(GridIndex, EyeIndex); Color += GetLightGridLocalLighting(CulledLightGridData, ShadingModelContext, GBuffer, WorldPosition, CameraVector, EyeIndex, 0); } #endif // Calculate the directional light half NoL = max(0, dot(GBuffer.WorldNormal, MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz)); half3 H = normalize(CameraVector + MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz); half NoH = max(0, dot(GBuffer.WorldNormal, H)); FMobileDirectLighting Lighting; Lighting.Specular = ShadingModelContext.SpecularColor * CalcSpecular(GBuffer.Roughness, NoH); Lighting.Diffuse = ShadingModelContext.DiffuseColor; Color += (Shadow * NoL) * MobileDirectionalLight.DirectionalLightColor.rgb * (Lighting.Diffuse + Lighting.Specular * MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z); // Process reflections (IBL, reflection catcher) #if APPLY_REFLECTION uint NumCulledEntryIndex = (ForwardLightData.NumGridCells + GridIndex) * NUM_CULLED_LIGHTS_GRID_STRIDE; uint NumLocalReflectionCaptures = min(ForwardLightData.NumCulledLightsGrid[NumCulledEntryIndex + 0], ForwardLightData.NumReflectionCaptures); uint DataStartIndex = ForwardLightData.NumCulledLightsGrid[NumCulledEntryIndex + 1]; float3 SpecularIBL = CompositeReflectionCapturesAndSkylight( 1.0f, WorldPosition, ReflectionVector,//RayDirection, GBuffer.Roughness, GBuffer.IndirectIrradiance, 1.0f, 0.0f, NumLocalReflectionCaptures, DataStartIndex, 0, true); Color += SpecularIBL * ShadingModelContext.SpecularColor; #elif APPLY_SKY_REFLECTION float SkyAverageBrightness = 1.0f; float3 SpecularIBL = GetSkyLightReflection(ReflectionVector, GBuffer.Roughness, SkyAverageBrightness); SpecularIBL *= ComputeMixingWeight(GBuffer.IndirectIrradiance, SkyAverageBrightness, GBuffer.Roughness); Color += SpecularIBL * ShadingModelContext.SpecularColor; #endif // Diffuse reflection of sky light half3 SkyDiffuseLighting = GetSkySHDiffuseSimple(GBuffer.WorldNormal); Color+= SkyDiffuseLighting * half3(View.SkyLightColor.rgb) * ShadingModelContext.DiffuseColor * GBuffer.GBufferAO; half LightAttenuation = ComputeLightFunctionMultiplier(WorldPosition); #if USE_PREEXPOSURE // MobileHDR applies PreExposure in tonemapper LightAttenuation *= View.PreExposure; #endif OutColor.rgb = Color.rgb * LightAttenuation; OutColor.a = 1; } // PS main entrance of local light 
```hlsl
// PS main entry for local (radial) light sources
void MobileRadialLightPS(
    float4 InScreenPosition : TEXCOORD0,
    float4 SVPos : SV_POSITION,
    out half4 OutColor : SV_Target0)
{
    FGBufferData GBuffer = (FGBufferData)0;
    float SceneDepth = 0;
    {
        float2 ScreenUV = InScreenPosition.xy / InScreenPosition.w * View.ScreenPositionScaleBias.xy + View.ScreenPositionScaleBias.wz;
        float4 GBufferA = 0;
        float4 GBufferB = 0;
        float4 GBufferC = 0;
        float4 GBufferD = 0;
        FetchGBuffer(ScreenUV, GBufferA, GBufferB, GBufferC, GBufferD, SceneDepth);
        GBuffer = DecodeGBufferMobile(GBufferA, GBufferB, GBufferC, GBufferD);
    }

    // With a perspective projection, the clip space position is NDC * Clip.w
    // With an orthographic projection, clip space is the same as NDC
    float2 ClipPosition = InScreenPosition.xy / InScreenPosition.w * (View.ViewToClip[3][3] < 1.0f ? SceneDepth : 1.0f);
    float3 WorldPosition = mul(float4(ClipPosition, SceneDepth, 1), View.ScreenToWorld).xyz;
    half3 CameraVector = normalize(View.WorldCameraOrigin - WorldPosition);
    half NoV = max(0, dot(GBuffer.WorldNormal, CameraVector));

    // Assemble the light data structure
    FMobileLightData LightData = (FMobileLightData)0;
    {
        LightData.Position = DeferredLightUniforms.Position;
        LightData.InvRadius = DeferredLightUniforms.InvRadius;
        LightData.Color = DeferredLightUniforms.Color;
        LightData.FalloffExponent = DeferredLightUniforms.FalloffExponent;
        LightData.Direction = DeferredLightUniforms.Direction;
        LightData.SpotAngles = DeferredLightUniforms.SpotAngles;
        LightData.SpecularScale = 1.0;
        LightData.bInverseSquared = INVERSE_SQUARED_FALLOFF;
        LightData.bSpotLight = IS_SPOT_LIGHT;
    }

    FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0;
    {
        half DielectricSpecular = 0.08 * GBuffer.Specular;
        ShadingModelContext.DiffuseColor = GBuffer.BaseColor - GBuffer.BaseColor * GBuffer.Metallic; // 1 mad
        ShadingModelContext.SpecularColor = (DielectricSpecular - DielectricSpecular * GBuffer.Metallic) + GBuffer.BaseColor * GBuffer.Metallic; // 2 mad

        // Compute the environment BRDF
        ShadingModelContext.SpecularColor = GetEnvBRDF(ShadingModelContext.SpecularColor, GBuffer.Roughness, NoV);
    }

    // Compute direct lighting
    half3 Color = GetDirectLighting(LightData, ShadingModelContext, GBuffer, WorldPosition, CameraVector);

    // IES profile and light function
    half LightAttenuation = ComputeLightProfileMultiplier(WorldPosition, DeferredLightUniforms.Position, -DeferredLightUniforms.Direction, DeferredLightUniforms.Tangent);
    LightAttenuation *= ComputeLightFunctionMultiplier(WorldPosition);
#if USE_PREEXPOSURE
    // MobileHDR applies PreExposure in the tonemapper
    LightAttenuation *= View.PreExposure;
#endif

    OutColor.rgb = Color * LightAttenuation;
    OutColor.a = 1;
}
```
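One subtlety in MobileRadialLightPS is the View.ViewToClip[3][3] < 1.0f test when reconstructing the world position: in a standard perspective projection matrix that element is 0, so the interpolated screen position must be scaled by the scene depth (clip.xy = NDC.xy * w, with w = SceneDepth here), while in an orthographic projection the element is 1 and clip space already equals NDC. In both cases the reconstruction then reduces to the same transform used by the directional-light pass:

$$\mathbf{p}_{world} = \big(x_{clip},\; y_{clip},\; z_{scene},\; 1\big)\cdot M_{ScreenToWorld}$$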
As the code above shows, the directional light and local lights use separate PS entry points because their work differs considerably: the directional-light pass computes its lighting directly in the main entry, together with reflections (IBL, reflection captures) and the sky light's diffuse term, while the local-light pass assembles an FMobileLightData structure, calls the shared direct-lighting function, and finally applies the IES profile and light function that are specific to local lights.
In addition, the GBuffer is read back using subpass fetches, whose implementation differs per shading platform:
```hlsl
// Vulkan
[[vk::input_attachment_index(1)]] SubpassInput<float4> GENERATED_SubpassFetchAttachment0;
#define VulkanSubpassFetch0() GENERATED_SubpassFetchAttachment0.SubpassLoad()

// Metal
Texture2D<float4> gl_LastFragDataRGBA_1;
#define SubpassFetchRGBA_1() gl_LastFragDataRGBA_1.Load(uint3(0, 0, 0), 0)

// DX / OpenGL
Texture2DSampleLevel(GBufferATexture, GBufferATextureSampler, UV, 0);
```
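To make the dispatch concrete, below is a minimal sketch (not the engine's actual code; the platform macros VULKAN_PROFILE and METAL_PROFILE and the mapping of the fetch macros to GBufferA are assumptions for illustration) of how a FetchGBuffer-style helper could select between these three paths:

```hlsl
// Hypothetical per-platform fetch of a single GBuffer target, following the
// three paths above. On Vulkan/Metal the data never leaves tile memory
// (subpass / framebuffer fetch); on DX/OpenGL it falls back to a texture sample.
float4 FetchGBufferA(float2 UV)
{
#if VULKAN_PROFILE      // assumption: Vulkan shading platform define
    return VulkanSubpassFetch0();   // input attachment, current pixel only
#elif METAL_PROFILE     // assumption: Metal shading platform define
    return SubpassFetchRGBA_1();    // framebuffer fetch, current pixel only
#else
    return Texture2DSampleLevel(GBufferATexture, GBufferATextureSampler, UV, 0);
#endif
}
```

Because subpass/framebuffer fetch can only read the fragment currently being shaded, the Vulkan and Metal paths ignore the UV entirely; this is also why mobile deferred lighting must be expressed as per-pixel subpass reads rather than arbitrary neighborhood sampling.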
Team recruitment
The blogger's team is developing a new immersive-experience product with UE4 and is urgently looking for talented people to join and build it together. The following positions are open:
- UE gameplay/logic development.
- UE engine programming.
- UE graphics rendering.
- TA (Technical Art).
Requirements: enthusiasm for technology, a solid technical foundation, and good communication and teamwork skills; UE experience or mobile development experience is a plus.
If you are interested or want to know more, please add the blogger's WeChat: 81079389 (mention this cnblogs job post), or send your resume to the blogger's email: 81079389#qq.com (replace # with @).
Looking forward to hearing from you.
Special note
- This is the end of Part 1; Part 2 will cover:
  - Mobile rendering technologies
  - Mobile optimization techniques
- Thanks to the authors of all the references; some pictures come from the references or the Internet and will be removed upon request in case of infringement.
- This series of articles is original work by the author and is published only on cnblogs (博客园). You are welcome to share the link to this article, but reprinting without consent is not allowed!
- The series is ongoing; for the complete table of contents, please see the Content outline.
References
- Unreal Engine Source
- Rendering and Graphics
- Materials
- Graphics Programming
- Mobile Rendering
- Qualcomm® Adreno™ GPU
- PowerVR Developer Documentation
- Arm Mali GPU Best Practices Developer Guide
- Arm Mali GPU Graphics and Gaming Development
- Moving Mobile Graphics
- GDC Vault
- Siggraph Conference Content
- GameDev Best Practices
- Accelerating Mobile XR
- Frequently Asked Questions
- Google Developer Contributes Universal Bandwidth Compression To Freedreno Driver
- Using pipeline barriers efficiently
- Optimized pixel-projected reflections for planar reflectors
- UE4 rendering differences between mobile and PC, and how to minimize them
- Deferred Shading in Unity URP
- General techniques for mobile game performance optimization
- In-depth GPU hardware architecture and operation mechanism
- Adaptive Performance in Call of Duty Mobile
- Jet Set Vulkan : Reflecting on the move to Vulkan
- Vulkan Best Practices - Memory limits with Vulkan on Mali GPUs
- A Year in a Fortnite
- The Challenges of Porting Traha to Vulkan
- L2M - Binding and Format Optimization
- Adreno Best Practices
- Summary of mobile device GPU architecture knowledge
- Mali GPU Architectures
- Cyclic Redundancy Check
- Arm Guide for Unreal Engine
- Arm Virtual Reality
- Best Practices for VR on Unreal Engine
- Optimizing Assets for Mobile VR
- Arm® Guide for Unreal Engine 4 Optimizing Mobile Gaming Graphics
- Adaptive Scalable Texture Compression
- Tile-Based Rendering
- Understanding Render Passes
- Intro to Moving Mobile Graphics
- Mobile Graphics 101
- Vulkan API
- Best Practices for Shaders