Analyze the Unreal rendering system (12) - mobile special, Part 1 (UE mobile rendering analysis)

Posted by weezy8802 on Thu, 04 Nov 2021 19:40:27 +0100


12.1 Overview of This Chapter

All previous chapters described UE's rendering system based on the deferred rendering pipeline on PC; in particular, Analyze the Unreal rendering system (04) - deferred rendering pipeline described the flow and steps of the PC deferred pipeline in detail.

This article describes the rendering pipeline on the UE mobile side, and at the end compares the rendering differences between mobile and PC as well as mobile-specific optimizations. This chapter mainly covers the following aspects of the UE rendering system:

  • The main flow and steps of FMobileSceneRenderer.
  • Forward and deferred rendering pipelines on mobile.
  • Lighting and shadows on mobile.
  • The similarities and differences between mobile and PC, and the special optimization techniques involved.

In particular, the UE source code analyzed in this article has been upgraded to 4.27.1. Readers following along in the source should take note of the update.

To preview the mobile rendering pipeline in the UE editor on PC, select the following menu:

Once the shaders finish compiling, the mobile preview appears in the UE editor viewport.

12.1.1 Characteristics of Mobile Devices

Compared with desktop PC platforms, mobile devices differ significantly in size, power, hardware performance and many other aspects, specifically:

  • Smaller size. Portability requires the device to be lightweight and fit in a palm or pocket, so the whole device is limited to a very small volume.

  • Limited energy and power. Constrained by battery technology, mainstream lithium batteries today are generally around 10000 mAh, while the resolution and image quality of mobile devices keep rising. To meet battery-life and thermal requirements, the overall power of a mobile device must be strictly controlled, usually within 5 W.

  • Limited heat dissipation. PC devices can be equipped with cooling fans or even water cooling, while mobile devices have no such active cooling and can only rely on heat conduction. If heat is not dissipated properly, the CPU and GPU throttle their clocks and run at very limited performance, to avoid damaging components through overheating.

  • Limited hardware performance. The performance of a mobile device's components (CPU, bandwidth, memory, GPU, etc.) is only about a tenth of a PC's.

    Performance comparison of a mainstream PC GPU (NV GV100-400-A1, Titan V) and a mainstream mobile SoC (Samsung Exynos 9 8895) in 2018. Much of the mobile hardware's performance is only a tenth of the PC's, yet its resolution is close to half, which highlights the challenges and dilemmas mobile devices face.

    As of 2020, the performance of mainstream mobile devices was as follows:

  • Special hardware architecture. For example, the CPU and GPU share the same memory (a coupled, unified-memory architecture), and the GPU uses a TB(D)R architecture; both are designed to complete as much work as possible at low power.

    Comparison diagram of decoupled hardware architecture of PC device and coupled hardware architecture of mobile device.

In addition, unlike PC CPUs and GPUs, which purely pursue compute performance, mobile chips are measured by three metrics: Performance, Power and Area, commonly known as PPA. (figure below)

There are three basic metrics for mobile devices: Performance, Area and Power. Compute density relates Performance to Area, and energy efficiency relates Performance to Power; for both ratios, higher is better.

With the rise of mobile devices, XR devices have become an important branch of them. There are currently XR devices of different sizes, functions and application scenarios:

Various forms of XR equipment.

With the recent explosion of the Metaverse, Facebook's renaming to Meta, and tech giants such as Apple, Microsoft, NVIDIA and Google accelerating their plans for immersive experiences, XR devices, as the carrier and entrance closest to the Metaverse vision, have naturally become a brand-new track with great potential to produce future giants.

 

12.2 UE Mobile Rendering Features

This chapter describes the rendering features of UE 4.27 on mobile.

12.2.1 Feature Level

On mobile, UE supports the following feature levels / graphics APIs:

| Feature Level | Description |
| --- | --- |
| OpenGL ES 3.1 | The default feature level on Android. Specific material parameters can be configured in Project Settings > Platforms > Android Material Quality - ES31. |
| Android Vulkan | A high-end renderer available on certain Android devices; supports the Vulkan 1.2 API. With its lightweight design, Vulkan is more efficient than OpenGL in most cases. |
| Metal 2.0 | The feature level dedicated to iOS devices. Material parameters can be configured in Project Settings > Platforms > iOS Material Quality. |

On current mainstream Android devices, Vulkan usually yields better performance, because its lightweight design lets applications such as UE optimize more precisely. Below is a comparison of Vulkan and OpenGL:

| Vulkan | OpenGL |
| --- | --- |
| State is based on objects; there is no global state. | A single global state machine. |
| All state is recorded into command buffers. | State is bound to a single context. |
| Commands can be recorded from multiple threads. | Rendering operations can only be performed sequentially. |
| GPU memory and synchronization can be managed precisely and explicitly. | GPU memory and synchronization details are usually hidden by the driver. |
| The driver performs no runtime error detection; a validation layer is available for developers. | Extensive runtime error detection. |

On Windows, the UE editor can also launch emulation of OpenGL, Vulkan and Metal to preview the effect in the editor, but the result may differ from what the actual device renders, so this feature cannot be fully relied on.

Before enabling Vulkan, some project parameters need to be configured; see the official document Android Vulkan Mobile Renderer for details.

In addition, OpenGL support under Windows was removed in earlier versions of UE. Although the OpenGL emulation option still exists in the UE editor, the underlying rendering actually uses D3D.

12.2.2 Deferred Shading

Deferred shading on UE mobile is a feature added in 4.26. It enables developers to achieve more complex light and shadow effects on mobile, such as high-quality reflections, multiple dynamic lights, decals and advanced lighting features.

Top: forward rendering; bottom: deferred rendering.

To enable deferred rendering on mobile, add the r.Mobile.ShadingPath=1 field to DefaultEngine.ini under the project's Config directory, then restart the editor.
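A sketch of the corresponding DefaultEngine.ini entry (the section name below is the usual renderer-settings section; verify the exact placement for your project):

```ini
; DefaultEngine.ini -- enable the mobile deferred shading path
[/Script/Engine.RendererSettings]
r.Mobile.ShadingPath=1
```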

12.2.3 Ground Truth Ambient Occlusion

Ground Truth Ambient Occlusion (GTAO) is an ambient occlusion technique that approximates ground-truth results. As a form of shadow compensation, it occludes some indirect light, producing a good soft-shadow effect.

The effect with GTAO enabled. Note that as the robot approaches the wall, it casts a gradual soft shadow onto the wall.

To enable GTAO, check the following options:

In addition, GTAO depends on the Mobile HDR option. To enable it on the corresponding target devices, you also need to add the r.Mobile.AmbientOcclusionQuality field to the [Platform]Scalability.ini configuration; its value must be greater than 0, otherwise GTAO is disabled.
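A minimal sketch of the scalability entry (the exact section inside [Platform]Scalability.ini depends on how the project groups its scalability settings):

```ini
; e.g. AndroidScalability.ini -- a value > 0 enables GTAO, 0 disables it
r.Mobile.AmbientOcclusionQuality=2
```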

It is worth noting that GTAO has performance problems on Mali devices, because their maximum number of Compute Shader threads per group is less than 1024.

12.2.4 Dynamic Lighting and Shadow

The light source features UE implements on mobile include:

  • HDR lighting in linear space.
  • Directional lightmaps (taking normals into account).
  • The sun (directional light) supports distance-field shadows + analytical specular highlights.
  • IBL lighting: each object samples the nearest reflection capture, without parallax correction.
  • Dynamic objects correctly receive lighting and cast shadows.

The types, maximum counts, shadow support and other details of dynamic lights supported on UE mobile are as follows:

| Light type | Max count | Shadows | Description |
| --- | --- | --- | --- |
| Directional light | 1 | CSM | CSM defaults to 2 cascades and supports up to 4. |
| Point light | 4 | Not supported | Point light shadows require a cube shadow map, and single-pass cube shadow rendering (OnePassPointLightShadow) requires GS (SM5). |
| Spot light | 4 | Supported | Disabled by default; must be enabled in the project. |
| Area light | 0 | Not supported | Dynamic area lighting is not currently supported. |

Dynamic spotlights need to be explicitly enabled in the project settings:

In the mobile BasePass pixel shader, the spotlight shadow map shares the same texture sampler as CSM, and spotlight shadows and CSM use the same shadow map atlas. CSM is guaranteed enough space first, and spotlights are then sorted by shadow resolution.

By default, the maximum number of visible shadow-casting spotlights is 8; this limit can be changed via r.Mobile.MaxVisibleMovableSpotLightShadows. Spotlight shadow resolution is based on screen size and r.Shadow.TexelsPerPixelSpotlight.

The total number of local lights (point lights and spotlights) in the forward rendering path cannot exceed 4.
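A sketch of the DefaultEngine.ini fields involved (the cvar names below match the 4.2x source, but verify them against your engine version):

```ini
[/Script/Engine.RendererSettings]
; Allow movable (dynamic) spotlights on mobile
r.Mobile.EnableMovableSpotlights=1
; Allow those spotlights to cast shadows
r.Mobile.EnableMovableSpotlightsShadow=1
```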

Mobile also supports a special shadow mode, Modulated Shadows, which can only be used with stationary directional lights. The effect with modulated shadows enabled is shown below:

Modulated shadows also support changing the shadow color and blend ratio:

Left: dynamic shadow; right: modulated shadow.

Mobile shadows also support self-shadowing, shadow quality (r.ShadowQuality), depth bias and other settings.

In addition, mobile uses GGX specular by default. To switch to the traditional specular shading model, modify the following configuration:

12.2.5 Pixel Projected Reflection

UE provides an optimized variant of SSR for mobile, called Pixel Projected Reflection (PPR); its core idea is likewise to reuse screen-space pixels.

PPR rendering result.

To enable the PPR effect, the following conditions must be met:

  • Enable the MobileHDR option.

  • The value of r.Mobile.PixelProjectedReflectionQuality is greater than 0.

  • In Project Settings > Mobile, set Planar Reflection Mode to the appropriate mode:

    The Planar Reflection Mode has three options:

    • Usual: the planar reflection Actor behaves the same on all platforms.
    • MobilePPR: the planar reflection Actor works normally on PC/console platforms but uses PPR rendering on mobile platforms.
    • MobilePPRExclusive: the planar reflection Actor is used only for PPR on mobile platforms, leaving PC and console projects free to use traditional SSR.

By default, r.Mobile.PixelProjectedReflectionQuality is only enabled for high-end mobile devices in [Project]Scalability.ini.
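A sketch of the scalability entry (section placement is an assumption; values above 0 enable PPR, with higher values trading cost for quality):

```ini
; [Project]Scalability.ini -- enable PPR on a given device tier
r.Mobile.PixelProjectedReflectionQuality=1
```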

12.2.6 Mesh Auto-Instancing

The mesh draw pipeline on PC already supports automatic mesh instancing and merged rendering, which can greatly improve rendering performance. As of 4.27, this feature is also supported on mobile.

To enable it, open DefaultEngine.ini under the project's Config directory and add the following fields:

r.Mobile.SupportGPUScene=1
r.Mobile.UseGPUSceneTexture=1

Restart the editor and wait for the shaders to compile to preview the effect.

Because GPUScene support is required and the maximum uniform buffer size on Mali devices is only 64 KB, which is not enough space, Mali devices use a texture instead of a buffer to store GPUScene data.

However, there are some limitations:

  • Automatic instancing on mobile mainly benefits CPU-bound projects rather than GPU-bound ones. Although enabling it is unlikely to harm a GPU-bound project, such projects are unlikely to see significant performance gains from it.

  • If a game or application requires a lot of memory, it may be better to turn off r.Mobile.UseGPUSceneTexture and use the buffer instead, though this does not work properly on Mali devices.

    Alternatively, r.Mobile.UseGPUSceneTexture can be turned off per device (e.g. via device profiles), so that devices from other GPU vendors use the buffer while Mali devices keep the texture path.

The effectiveness of automatic instancing largely depends on the exact specifics and positioning of the project. It is recommended to create a build with automatic instancing enabled and profile it to determine whether there is a substantive performance improvement.

12.2.7 Post Processing

Because mobile devices suffer from restrictive factors such as slower dependent texture reads, limited hardware features, special hardware architectures, extra render target resolves, limited bandwidth and so on, post-processing on mobile is costly, and in some extreme cases can stall the rendering pipeline.

Nevertheless, some games and applications with high image-quality requirements still rely heavily on the expressive power of post-processing to achieve high quality, and UE does not restrict developers from using it.

To enable post-processing, you must first enable the MobileHDR option:
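In configuration-file form this corresponds to the following sketch (Mobile HDR maps to the r.MobileHDR console variable; section placement follows the usual renderer-settings convention):

```ini
[/Script/Engine.RendererSettings]
r.MobileHDR=1
```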

After enabling post-processing, you can set various post-processing effects in the Post Process Volume.

The post-processing effects supported on mobile include Mobile Tonemapper, Color Grading, Lens, Bloom, Dirt Mask, Auto Exposure, Lens Flares, Depth of Field, etc.

For better performance, the official recommendation is to enable only Bloom and TAA on mobile.

12.2.8 Other Features and Limitations

  • Reflection Capture Compression

Mobile supports compressing the Reflection Capture component, which reduces the runtime memory and bandwidth of reflection captures and improves rendering efficiency. It needs to be enabled in the project settings:

When enabled, ETC2 compression is used by default. In addition, it can be adjusted per Reflection Capture component:

  • Material properties

Materials on mobile platforms (feature level OpenGL ES 3.1) use the same node-based creation workflow as other platforms, and most nodes are supported on mobile.

The material properties supported on mobile include BaseColor, Roughness, Metallic, Specular, Normal, Emissive and Refraction, but the Scene Color expression, Tessellation inputs and the subsurface scattering shading model are not supported.

There are some limitations on the materials supported by mobile platforms:

  • Due to hardware limitations, only 16 texture samplers can be used.
  • Only the DefaultLit and Unlit shading models are available.
  • Custom UVs should be used to avoid dependent texture reads (i.e. no math on texture UVs in the pixel shader).
  • Translucent and Masked materials are very expensive; opaque materials are recommended wherever possible.
  • Depth fade can be used in translucent materials on iOS; on platforms whose hardware does not support reading from the depth buffer, it is unsupported or incurs unacceptable performance costs.

The material properties panel has some mobile-specific options:

These attributes are described as follows:

  • Mobile Separate Translucency: whether to render translucency into a separate render target on mobile.

  • Use Full Precision: whether to use full precision. If disabled, half precision is used, which reduces bandwidth and power consumption and improves performance, but distant objects may show artifacts:

    Left: full-precision material; right: half-precision material, where the distant sun shows an artifact.

  • Use Lightmap Directionality: whether to enable lightmap directionality. If checked, the lightmap direction and the per-pixel normal are taken into account, at a higher performance cost.

  • Use Alpha to Coverage: whether to enable alpha-to-coverage (under MSAA) for Masked materials. If checked, MSAA is applied to the masked edges.

  • Fully Rough: whether the material is fully rough. If checked, rendering efficiency for this material improves greatly.

In addition, mesh types supported on mobile include:

  • Skeletal Mesh
  • Static Mesh
  • Landscape
  • CPU particle sprites, particle meshes

Mesh types other than the above are not supported. Other restrictions include:

  • A single mesh can have at most 65k vertices, because vertex indices are only 16 bits.
  • The number of bones in a single Skeletal Mesh must be less than 75, due to hardware limitations.

 

12.3 FMobileSceneRenderer

FMobileSceneRenderer inherits from FSceneRenderer and is responsible for the scene rendering flow on mobile, while on PC, FDeferredShadingSceneRenderer also inherits from FSceneRenderer. Their inheritance relationship is as follows:

classDiagram-v2
    FSceneRenderer <|-- FMobileSceneRenderer
    FSceneRenderer <|-- FDeferredShadingSceneRenderer

FDeferredShadingSceneRenderer was covered in previous articles; its rendering flow is particularly complex, with elaborate lighting, shadowing and rendering steps. By contrast, the logic and steps of FMobileSceneRenderer are much simpler. Below is a RenderDoc capture:

The capture mainly includes the InitViews, ShadowDepths, PrePass, BasePass, OcclusionTest, ShadowProjectionOnOpaque, Translucency, PostProcessing and other steps. These steps also exist on PC, but their implementations may differ; see the analysis in the following chapters.

12.3.1 Renderer Main Flow

The main flow of the mobile scene renderer also happens in FMobileSceneRenderer::Render. The code and analysis are as follows:

// Engine\Source\Runtime\Renderer\Private\MobileShadingRenderer.cpp

void FMobileSceneRenderer::Render(FRHICommandListImmediate& RHICmdList)
{
    // Update primitive scene infos.
    Scene->UpdateAllPrimitiveSceneInfos(RHICmdList);

    // Prepare the rendered area of the view
    PrepareViewRectsForRendering(RHICmdList);

    // Prepare sky atmosphere data
    if (ShouldRenderSkyAtmosphere(Scene, ViewFamily.EngineShowFlags))
    {
        for (int32 LightIndex = 0; LightIndex < NUM_ATMOSPHERE_LIGHTS; ++LightIndex)
        {
            if (Scene->AtmosphereLights[LightIndex])
            {
                PrepareSunLightProxy(*Scene->GetSkyAtmosphereSceneInfo(), LightIndex, *Scene->AtmosphereLights[LightIndex]);
            }
        }
    }
    else
    {
        Scene->ResetAtmosphereLightsProperties();
    }

    if(!ViewFamily.EngineShowFlags.Rendering)
    {
        return;
    }

    // Wait for occlusion culling tests
    WaitOcclusionTests(RHICmdList);
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Initialize the view, find visible entities, and prepare RT and buffer data for rendering
    InitViews(RHICmdList);
    
    if (GRHINeedsExtraDeletionLatency || !GRHICommandList.Bypass())
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FMobileSceneRenderer_PostInitViewsFlushDel);

        // Occlusion queries may be suspended, so it is best to let RHI thread and GPU work while waiting. In addition, when RHI thread is executed, this is the only place to process pending deletion
        FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
        FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThreadFlushResources);
    }

    GEngine->GetPreRenderDelegate().Broadcast();

    // Commit global dynamic buffer before rendering starts
    DynamicIndexBuffer.Commit();
    DynamicVertexBuffer.Commit();
    DynamicReadBuffer.Commit();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_SceneSim));

    if (ViewFamily.bLateLatchingEnabled)
    {
        BeginLateLatching(RHICmdList);
    }

    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Working with virtual textures
    if (bUseVirtualTexturing)
    {
        SCOPED_GPU_STAT(RHICmdList, VirtualTextureUpdate);
        
        FVirtualTextureSystem::Get().Update(RHICmdList, FeatureLevel, Scene);
        
        // Clear virtual texture feedback to default value
        FUnorderedAccessViewRHIRef FeedbackUAV = SceneContext.GetVirtualTextureFeedbackUAV();
        RHICmdList.Transition(FRHITransitionInfo(FeedbackUAV, ERHIAccess::SRVMask, ERHIAccess::UAVMask));
        RHICmdList.ClearUAVUint(FeedbackUAV, FUintVector4(~0u, ~0u, ~0u, ~0u));
        RHICmdList.Transition(FRHITransitionInfo(FeedbackUAV, ERHIAccess::UAVMask, ERHIAccess::UAVMask));
        RHICmdList.BeginUAVOverlap(FeedbackUAV);
    }
    
    // Sorted lighting information
    FSortedLightSetSceneInfo SortedLightSet;
    
    // Deferred shading path
    if (bDeferredShading)
    {
        // Collect and sort lights
        GatherAndSortLights(SortedLightSet);
        
        int32 NumReflectionCaptures = Views[0].NumBoxReflectionCaptures + Views[0].NumSphereReflectionCaptures;
        bool bCullLightsToGrid = (NumReflectionCaptures > 0 || GMobileUseClusteredDeferredShading != 0);
        FRDGBuilder GraphBuilder(RHICmdList);
        // Calculate the light grid
        ComputeLightGrid(GraphBuilder, bCullLightsToGrid, SortedLightSet);
        GraphBuilder.Execute();
    }

    // Generate sky / atmosphere LUT
    const bool bShouldRenderSkyAtmosphere = ShouldRenderSkyAtmosphere(Scene, ViewFamily.EngineShowFlags);
    if (bShouldRenderSkyAtmosphere)
    {
        FRDGBuilder GraphBuilder(RHICmdList);
        RenderSkyAtmosphereLookUpTables(GraphBuilder);
        GraphBuilder.Execute();
    }

    // Inform the special effects system to prepare the scene for rendering
    if (FXSystem && ViewFamily.EngineShowFlags.Particles)
    {
        FXSystem->PreRender(RHICmdList, NULL, !Views[0].bIsPlanarReflection);
        if (FGPUSortManager* GPUSortManager = FXSystem->GetGPUSortManager())
        {
            GPUSortManager->OnPreRender(RHICmdList);
        }
    }
    // Polling occlusion culling requests
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Shadows));

    // Render shadows
    RenderShadowDepthMaps(RHICmdList);
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Collect a list of views
    TArray<const FViewInfo*> ViewList;
    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++) 
    {
        ViewList.Add(&Views[ViewIndex]);
    }

    // Render custom depth
    if (bShouldRenderCustomDepth)
    {
        FRDGBuilder GraphBuilder(RHICmdList);
        FSceneTextureShaderParameters SceneTextures = CreateSceneTextureShaderParameters(GraphBuilder, Views[0].GetFeatureLevel(), ESceneTextureSetupMode::None);
        RenderCustomDepthPass(GraphBuilder, SceneTextures);
        GraphBuilder.Execute();
    }

    // Render depth PrePass
    if (bIsFullPrepassEnabled)
    {
        // SDF and AO require full PrePass depth

        FRHIRenderPassInfo DepthPrePassRenderPassInfo(
            SceneContext.GetSceneDepthSurface(),
            EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil);

        DepthPrePassRenderPassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        DepthPrePassRenderPassInfo.bOcclusionQueries = DepthPrePassRenderPassInfo.NumOcclusionQueries != 0;

        RHICmdList.BeginRenderPass(DepthPrePassRenderPassInfo, TEXT("DepthPrepass"));

        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));
        
        // Render full depth PrePass
        RenderPrePass(RHICmdList);

        // Submit occlusion culling
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
        RenderOcclusion(RHICmdList);

        RHICmdList.EndRenderPass();

        // SDF shadow
        if (bRequiresDistanceFieldShadowingPass)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderSDFShadowing);
            RenderSDFShadowing(RHICmdList);
        }

        // HZB.
        if (bShouldRenderHZB)
        {
            RenderHZB(RHICmdList, SceneContext.SceneDepthZ);
        }

        // AO.
        if (bRequiresAmbientOcclusionPass)
        {
            RenderAmbientOcclusion(RHICmdList, SceneContext.SceneDepthZ);
        }
    }

    FRHITexture* SceneColor = nullptr;
    
    // Deferred shading path
    if (bDeferredShading)
    {
        SceneColor = RenderDeferred(RHICmdList, ViewList, SortedLightSet);
    }
    // Forward rendering
    else
    {
        SceneColor = RenderForward(RHICmdList, ViewList);
    }
    
    // Render velocity buffer
    if (bShouldRenderVelocities)
    {
        FRDGBuilder GraphBuilder(RHICmdList);

        FRDGTextureMSAA SceneDepthTexture = RegisterExternalTextureMSAA(GraphBuilder, SceneContext.SceneDepthZ);
        FRDGTextureRef VelocityTexture = TryRegisterExternalTexture(GraphBuilder, SceneContext.SceneVelocity);

        if (VelocityTexture != nullptr)
        {
            AddClearRenderTargetPass(GraphBuilder, VelocityTexture);
        }

        // Velocity buffer for movable opaque objects
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_Velocity));
        RenderVelocities(GraphBuilder, SceneDepthTexture.Resolve, VelocityTexture, FSceneTextureShaderParameters(), EVelocityPass::Opaque, false);
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_AfterVelocity));

        // Velocity buffer for translucent objects
        AddSetCurrentStatPass(GraphBuilder, GET_STATID(STAT_CLMM_TranslucentVelocity));
        RenderVelocities(GraphBuilder, SceneDepthTexture.Resolve, VelocityTexture, GetSceneTextureShaderParameters(CreateMobileSceneTextureUniformBuffer(GraphBuilder, EMobileSceneTextureSetupMode::SceneColor)), EVelocityPass::Translucent, false);

        GraphBuilder.Execute();
    }

    // Deal with the logic after scene rendering
    {
        FRendererModule& RendererModule = static_cast<FRendererModule&>(GetRendererModule());
        FRDGBuilder GraphBuilder(RHICmdList);
        RendererModule.RenderPostOpaqueExtensions(GraphBuilder, Views, SceneContext);

        if (FXSystem && Views.IsValidIndex(0))
        {
            AddUntrackedAccessPass(GraphBuilder, [this](FRHICommandListImmediate& RHICmdList)
            {
                check(RHICmdList.IsOutsideRenderPass());

                FXSystem->PostRenderOpaque(
                    RHICmdList,
                    Views[0].ViewUniformBuffer,
                    nullptr,
                    nullptr,
                    Views[0].AllowGPUParticleUpdate()
                );
                if (FGPUSortManager* GPUSortManager = FXSystem->GetGPUSortManager())
                {
                    GPUSortManager->OnPostRenderOpaque(RHICmdList);
                }
            });
        }
        GraphBuilder.Execute();
    }

    // Flush / commit command buffer
    if (bSubmitOffscreenRendering)
    {
        RHICmdList.SubmitCommandsHint();
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }
    
    // Convert the scene color to SRV for subsequent steps to read
    if (!bGammaSpace || bRenderToSceneColor)
    {
        RHICmdList.Transition(FRHITransitionInfo(SceneColor, ERHIAccess::Unknown, ERHIAccess::SRVMask));
    }

    if (bDeferredShading)
    {
        // Releases the original reference on the scene render target
        SceneContext.AdjustGBufferRefCount(RHICmdList, -1);
    }

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Post));

    // Work with virtual textures
    if (bUseVirtualTexturing)
    {    
        SCOPED_GPU_STAT(RHICmdList, VirtualTextureUpdate);

        // No pass after this should make VT page requests
        RHICmdList.EndUAVOverlap(SceneContext.VirtualTextureFeedbackUAV);
        RHICmdList.Transition(FRHITransitionInfo(SceneContext.VirtualTextureFeedbackUAV, ERHIAccess::UAVMask, ERHIAccess::SRVMask));

        TArray<FIntRect, TInlineAllocator<4>> ViewRects;
        ViewRects.AddUninitialized(Views.Num());
        for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ++ViewIndex)
        {
            ViewRects[ViewIndex] = Views[ViewIndex].ViewRect;
        }
        
        FVirtualTextureFeedbackBufferDesc Desc;
        Desc.Init2D(SceneContext.GetBufferSizeXY(), ViewRects, SceneContext.GetVirtualTextureFeedbackScale());

        SubmitVirtualTextureFeedbackBuffer(RHICmdList, SceneContext.VirtualTextureFeedback, Desc);
    }

    FMemMark Mark(FMemStack::Get());
    FRDGBuilder GraphBuilder(RHICmdList);

    FRDGTextureRef ViewFamilyTexture = TryCreateViewFamilyTexture(GraphBuilder, ViewFamily);
    
    // Resolve the scene
    if (ViewFamily.bResolveScene)
    {
        if (!bGammaSpace || bRenderToSceneColor)
        {
            // Complete rendering of each view or full stereo buffer (if enabled)
            {
                RDG_EVENT_SCOPE(GraphBuilder, "PostProcessing");
                SCOPE_CYCLE_COUNTER(STAT_FinishRenderViewTargetTime);

                TArray<TRDGUniformBufferRef<FMobileSceneTextureUniformParameters>, TInlineAllocator<1, SceneRenderingAllocator>> MobileSceneTexturesPerView;
                MobileSceneTexturesPerView.SetNumZeroed(Views.Num());

                const auto SetupMobileSceneTexturesPerView = [&]()
                {
                    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ++ViewIndex)
                    {
                        EMobileSceneTextureSetupMode SetupMode = EMobileSceneTextureSetupMode::SceneColor;
                        if (Views[ViewIndex].bCustomDepthStencilValid)
                        {
                            SetupMode |= EMobileSceneTextureSetupMode::CustomDepth;
                        }

                        if (bShouldRenderVelocities)
                        {
                            SetupMode |= EMobileSceneTextureSetupMode::SceneVelocity;
                        }

                        MobileSceneTexturesPerView[ViewIndex] = CreateMobileSceneTextureUniformBuffer(GraphBuilder, SetupMode);
                    }
                };

                SetupMobileSceneTexturesPerView();

                FMobilePostProcessingInputs PostProcessingInputs;
                PostProcessingInputs.ViewFamilyTexture = ViewFamilyTexture;

                // Post render effects
                for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
                {
                    RDG_EVENT_SCOPE_CONDITIONAL(GraphBuilder, Views.Num() > 1, "View%d", ViewIndex);
                    PostProcessingInputs.SceneTextures = MobileSceneTexturesPerView[ViewIndex];
                    AddMobilePostProcessingPasses(GraphBuilder, Views[ViewIndex], PostProcessingInputs, NumMSAASamples > 1);
                }
            }
        }
    }

    GEngine->GetPostRenderDelegate().Broadcast();

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_SceneEnd));

    if (bShouldRenderVelocities)
    {
        SceneContext.SceneVelocity.SafeRelease();
    }
    
    if (ViewFamily.bLateLatchingEnabled)
    {
        EndLateLatching(RHICmdList, Views[0]);
    }

    RenderFinish(GraphBuilder, ViewFamilyTexture);
    GraphBuilder.Execute();

    // Poll occlusion query results
    FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
    FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}
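The per-view SetupMode logic above composes an EMobileSceneTextureSetupMode bitmask before creating each uniform buffer. A minimal standalone sketch of this flag-composition pattern (the enum and helpers here are illustrative stand-ins, not the engine's actual definitions):

```cpp
#include <cstdint>

// Illustrative stand-in for the EMobileSceneTextureSetupMode bitmask pattern.
enum class ESetupMode : uint32_t
{
    None          = 0,
    SceneColor    = 1u << 0,
    CustomDepth   = 1u << 1,
    SceneVelocity = 1u << 2,
};

inline ESetupMode operator|(ESetupMode A, ESetupMode B)
{
    return static_cast<ESetupMode>(static_cast<uint32_t>(A) | static_cast<uint32_t>(B));
}
inline ESetupMode& operator|=(ESetupMode& A, ESetupMode B) { return A = A | B; }

inline bool HasFlag(ESetupMode Mode, ESetupMode Flag)
{
    return (static_cast<uint32_t>(Mode) & static_cast<uint32_t>(Flag)) != 0;
}

// Mirrors the per-view branch above: always SceneColor, plus optional flags.
inline ESetupMode ComputeSetupMode(bool bCustomDepthValid, bool bRenderVelocities)
{
    ESetupMode SetupMode = ESetupMode::SceneColor;
    if (bCustomDepthValid) { SetupMode |= ESetupMode::CustomDepth; }
    if (bRenderVelocities) { SetupMode |= ESetupMode::SceneVelocity; }
    return SetupMode;
}
```

The benefit of this pattern is that a single integer fully describes which scene textures a view's uniform buffer must reference.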

Readers of Analyze the Unreal Rendering System (04) - Deferred Rendering Pipeline will notice that the mobile scene rendering flow simplifies many steps and is effectively a subset of the PC scene renderer. Of course, to adapt to the distinctive GPU hardware architectures of mobile devices, mobile scene rendering also differs from the PC in ways that are analyzed in detail later. The main steps of the mobile scene renderer are as follows:

stateDiagram-v2
    state bDeferredShading <<choice>>
    state bIsFullPrepassEnabled <<choice>>
    [*] --> UpdateAllPrimitiveSceneInfos
    UpdateAllPrimitiveSceneInfos --> PrepareViewRectsForRendering
    PrepareViewRectsForRendering --> InitViews
    InitViews --> bDeferredShading
    bDeferredShading --> RenderSkyAtmosphereLookUpTables* : No
    bDeferredShading --> GatherAndSortLights : Yes
    GatherAndSortLights --> ComputeLightGrid
    ComputeLightGrid --> RenderSkyAtmosphereLookUpTables*
    RenderSkyAtmosphereLookUpTables* --> RenderShadowDepthMaps
    RenderShadowDepthMaps --> RenderCustomDepthPass*
    RenderCustomDepthPass* --> bIsFullPrepassEnabled
    bIsFullPrepassEnabled --> bDeferredShading2 : No
    bIsFullPrepassEnabled --> RenderPrePass : Yes
    RenderPrePass --> RenderOcclusion
    RenderOcclusion --> RenderSDFShadowing*
    RenderSDFShadowing* --> RenderHZB*
    RenderHZB* --> RenderAmbientOcclusion*
    RenderAmbientOcclusion* --> bDeferredShading2
    bDeferredShading2 --> RenderForward : No
    bDeferredShading2 --> RenderDeferred : Yes
    RenderDeferred --> RenderVelocities*
    RenderForward --> RenderVelocities*
    RenderVelocities* --> AddMobilePostProcessingPasses*
    AddMobilePostProcessingPasses* --> RenderFinish
    RenderFinish --> [*]

As for the above flow chart, the following points need to be explained:

  • The flowchart nodes bDeferredShading and bDeferredShading2 are the same variable; they are distinguished only to avoid Mermaid syntax errors.
  • Nodes marked with * are conditional steps that do not always execute.

UE4.26 added a deferred rendering pipeline for mobile, so the code above contains both a forward rendering branch, RenderForward, and a deferred rendering branch, RenderDeferred, each of which returns the rendering result SceneColor.

Mobile also supports rendering features such as primitive GPU Scene, SDF shadows, AO, sky atmosphere, virtual texturing, occlusion culling, and so on.

Since UE4.26, the rendering system has made extensive use of the RDG system, and the mobile scene renderer is no exception. Several FRDGBuilder instances are declared in the code above, used respectively for computing the light grid, rendering the sky atmosphere LUT, custom depth, the velocity buffer, post-render events, post-processing, and so on; each is a relatively independent functional module or rendering stage.

12.3.2 RenderForward

RenderForward is responsible for the branch of forward rendering in the mobile scene renderer. Its code and analysis are as follows:

FRHITexture* FMobileSceneRenderer::RenderForward(FRHICommandListImmediate& RHICmdList, const TArrayView<const FViewInfo*> ViewList)
{
    const FViewInfo& View = *ViewList[0];
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);
                
    FRHITexture* SceneColor = nullptr;
    FRHITexture* SceneColorResolve = nullptr;
    FRHITexture* SceneDepth = nullptr;
    ERenderTargetActions ColorTargetAction = ERenderTargetActions::Clear_Store;
    EDepthStencilTargetActions DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_DontStoreDepthStencil;

    // Whether mobile MSAA is enabled
    bool bMobileMSAA = NumMSAASamples > 1 && SceneContext.GetSceneColorSurface()->GetNumSamples() > 1;

    // Whether mobile multi-view mode is enabled
    static const auto CVarMobileMultiView = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("vr.MobileMultiView"));
    const bool bIsMultiViewApplication = (CVarMobileMultiView && CVarMobileMultiView->GetValueOnAnyThread() != 0);
    
    // The rendering branch of gamma space
    if (bGammaSpace && !bRenderToSceneColor)
    {
        // If MSAA is enabled, get the render textures (scene color and the resolve texture) from the SceneContext
        if (bMobileMSAA)
        {
            SceneColor = SceneContext.GetSceneColorSurface();
            SceneColorResolve = ViewFamily.RenderTarget->GetRenderTargetTexture();
            ColorTargetAction = ERenderTargetActions::Clear_Resolve;
            RHICmdList.Transition(FRHITransitionInfo(SceneColorResolve, ERHIAccess::Unknown, ERHIAccess::RTV | ERHIAccess::ResolveDst));
        }
        // Non MSAA, get render texture from view family
        else
        {
            SceneColor = ViewFamily.RenderTarget->GetRenderTargetTexture();
            RHICmdList.Transition(FRHITransitionInfo(SceneColor, ERHIAccess::Unknown, ERHIAccess::RTV));
        }
        SceneDepth = SceneContext.GetSceneDepthSurface();
    }
    // Linear space or render to scene texture
    else
    {
        SceneColor = SceneContext.GetSceneColorSurface();
        if (bMobileMSAA)
        {
            SceneColorResolve = SceneContext.GetSceneColorTexture();
            ColorTargetAction = ERenderTargetActions::Clear_Resolve;
            RHICmdList.Transition(FRHITransitionInfo(SceneColorResolve, ERHIAccess::Unknown, ERHIAccess::RTV | ERHIAccess::ResolveDst));
        }
        else
        {
            SceneColorResolve = nullptr;
            ColorTargetAction = ERenderTargetActions::Clear_Store;
        }

        SceneDepth = SceneContext.GetSceneDepthSurface();
                
        if (bRequiresMultiPass)
        {    
            // store targets after opaque so translucency render pass can be restarted
            ColorTargetAction = ERenderTargetActions::Clear_Store;
            DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil;
        }
                        
        if (bKeepDepthContent)
        {
            // store depth if post-processing/capture needs it
            DepthTargetAction = EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil;
        }
    }

    // Depth texture state of prepass
    if (bIsFullPrepassEnabled)
    {
        ERenderTargetActions DepthTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetDepthActions(DepthTargetAction)));
        ERenderTargetActions StencilTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetStencilActions(DepthTargetAction)));
        DepthTargetAction = MakeDepthStencilTargetActions(DepthTarget, StencilTarget);
    }

    FRHITexture* ShadingRateTexture = nullptr;
    
    if (!View.bIsSceneCapture && !View.bIsReflectionCapture)
    {
        TRefCountPtr<IPooledRenderTarget> ShadingRateTarget = GVRSImageManager.GetMobileVariableRateShadingImage(ViewFamily);
        if (ShadingRateTarget.IsValid())
        {
            ShadingRateTexture = ShadingRateTarget->GetRenderTargetItem().ShaderResourceTexture;
        }
    }

    // Scene color rendering Pass information
    FRHIRenderPassInfo SceneColorRenderPassInfo(
        SceneColor,
        ColorTargetAction,
        SceneColorResolve,
        SceneDepth,
        DepthTargetAction,
        nullptr, // we never resolve scene depth on mobile
        ShadingRateTexture,
        VRSRB_Sum,
        FExclusiveDepthStencil::DepthWrite_StencilWrite
    );
    SceneColorRenderPassInfo.SubpassHint = ESubpassHint::DepthReadSubpass;
    if (!bIsFullPrepassEnabled)
    {
        SceneColorRenderPassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        SceneColorRenderPassInfo.bOcclusionQueries = SceneColorRenderPassInfo.NumOcclusionQueries != 0;
    }
    // If the scene color is not multi-view but the application is, the shader needs to render multi-view as single-view
    SceneColorRenderPassInfo.MultiViewCount = View.bIsMobileMultiViewEnabled ? 2 : (bIsMultiViewApplication ? 1 : 0);

    // Start rendering scene colors
    RHICmdList.BeginRenderPass(SceneColorRenderPassInfo, TEXT("SceneColorRendering"));
    
    if (GIsEditor && !View.bIsSceneCapture)
    {
        DrawClearQuad(RHICmdList, Views[0].BackgroundColor);
    }

    if (!bIsFullPrepassEnabled)
    {
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));
        // Render depth pre pass
        RenderPrePass(RHICmdList);
    }

    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Opaque));
    // Render BasePass: opaque and masked objects
    RenderMobileBasePass(RHICmdList, ViewList);
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    //Render debug mode
#if !(UE_BUILD_SHIPPING || UE_BUILD_TEST)
    if (ViewFamily.UseDebugViewPS())
    {
        // Here we use the base pass depth result to get z culling for opaque and masked.
        // The color needs to be cleared at this point since shader complexity renders in additive.
        DrawClearQuad(RHICmdList, FLinearColor::Black);
        RenderMobileDebugView(RHICmdList, ViewList);
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }
#endif // !(UE_BUILD_SHIPPING || UE_BUILD_TEST)

    const bool bAdrenoOcclusionMode = CVarMobileAdrenoOcclusionMode.GetValueOnRenderThread() != 0;
    if (!bIsFullPrepassEnabled)
    {
        // Occlusion Culling 
        if (!bAdrenoOcclusionMode)
        {
            // Submit occlusion culling
            RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
            RenderOcclusion(RHICmdList);
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }
    }

    // Post event to handle plug-in rendering
    {
        CSV_SCOPED_TIMING_STAT_EXCLUSIVE(ViewExtensionPostRenderBasePass);
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FMobileSceneRenderer_ViewExtensionPostRenderBasePass);
        for (int32 ViewExt = 0; ViewExt < ViewFamily.ViewExtensions.Num(); ++ViewExt)
        {
            for (int32 ViewIndex = 0; ViewIndex < ViewFamily.Views.Num(); ++ViewIndex)
            {
                ViewFamily.ViewExtensions[ViewExt]->PostRenderBasePass_RenderThread(RHICmdList, Views[ViewIndex]);
            }
        }
    }
    
    // If multi-pass rendering or the pixel-projected planar reflection pass is required, the render pass must be split here
    if (bRequiresMultiPass || bRequiresPixelProjectedPlanarRelfectionPass)
    {
        RHICmdList.EndRenderPass();
    }
       
    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Translucency));

    // Reopen the translucency render pass if necessary
    if (bRequiresMultiPass || bRequiresPixelProjectedPlanarRelfectionPass)
    {
        check(RHICmdList.IsOutsideRenderPass());

        // If the current hardware does not support reading and writing the same depth buffer, the scene depth is copied
        ConditionalResolveSceneDepth(RHICmdList, View);

        if (bRequiresPixelProjectedPlanarRelfectionPass)
        {
            const FPlanarReflectionSceneProxy* PlanarReflectionSceneProxy = Scene ? Scene->GetForwardPassGlobalPlanarReflection() : nullptr;
            RenderPixelProjectedReflection(RHICmdList, SceneContext, PlanarReflectionSceneProxy);

            FRHITransitionInfo TranslucentRenderPassTransitions[] = {
            FRHITransitionInfo(SceneColor, ERHIAccess::SRVMask, ERHIAccess::RTV),
            FRHITransitionInfo(SceneDepth, ERHIAccess::SRVMask, ERHIAccess::DSVWrite)
            };
            RHICmdList.Transition(MakeArrayView(TranslucentRenderPassTransitions, UE_ARRAY_COUNT(TranslucentRenderPassTransitions)));
        }

        DepthTargetAction = EDepthStencilTargetActions::LoadDepthStencil_DontStoreDepthStencil;
        FExclusiveDepthStencil::Type ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilRead;
        if (bModulatedShadowsInUse)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilWrite;
        }

        // Opaque meshes used for mobile pixel-projected reflection must write depth to the depth RT, because they are rendered only once (when the quality level is BestPerformance or lower)
        if (IsMobilePixelProjectedReflectionEnabled(View.GetShaderPlatform())
            && GetMobilePixelProjectedReflectionQuality() == EMobilePixelProjectedReflectionQuality::BestPerformance)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        }

        if (bKeepDepthContent && !bMobileMSAA)
        {
            DepthTargetAction = EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil;
        }

#if PLATFORM_HOLOLENS
        if (bShouldRenderDepthToTranslucency)
        {
            ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        }
#endif

        // Transparent object rendering Pass
        FRHIRenderPassInfo TranslucentRenderPassInfo(
            SceneColor,
            SceneColorResolve ? ERenderTargetActions::Load_Resolve : ERenderTargetActions::Load_Store,
            SceneColorResolve,
            SceneDepth,
            DepthTargetAction, 
            nullptr,
            ShadingRateTexture,
            VRSRB_Sum,
            ExclusiveDepthStencil
        );
        TranslucentRenderPassInfo.NumOcclusionQueries = 0;
        TranslucentRenderPassInfo.bOcclusionQueries = false;
        TranslucentRenderPassInfo.SubpassHint = ESubpassHint::DepthReadSubpass;
        
        // Start rendering translucent objects
        RHICmdList.BeginRenderPass(TranslucentRenderPassInfo, TEXT("SceneColorTranslucencyRendering"));
    }

    // Next subpass: the scene depth becomes read-only and can be sampled
    RHICmdList.NextSubpass();

    if (!View.bIsPlanarReflection)
    {
        // Render decals
        if (ViewFamily.EngineShowFlags.Decals)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
            RenderDecals(RHICmdList);
        }

        // Render modulated shadow casting
        if (ViewFamily.EngineShowFlags.DynamicShadows)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderShadowProjections);
            RenderModulatedShadowProjections(RHICmdList);
        }
    }
    
    // Render translucency
    if (ViewFamily.EngineShowFlags.Translucency)
    {
        CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
        SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
        
        RenderTranslucency(RHICmdList, ViewList);
        
        FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }

    if (!bIsFullPrepassEnabled)
    {
        // Adreno occlusion culling mode
        if (bAdrenoOcclusionMode)
        {
            RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
            // flush
            RHICmdList.SubmitCommandsHint();
            bSubmitOffscreenRendering = false; // submit once
            // Issue occlusion queries
            RenderOcclusion(RHICmdList);
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }
    }

    // Pre-tonemap before the MSAA resolve (only effective on iOS)
    if (!bGammaSpace)
    {
        PreTonemapMSAA(RHICmdList);
    }

    // End scene color rendering
    RHICmdList.EndRenderPass();

    // Return the resolved scene color texture (only when MSAA is enabled)
    return SceneColorResolve ? SceneColorResolve : SceneColor;
}

The main steps of mobile forward rendering are similar to the PC: render the PrePass, the BasePass, special passes (decals, AO, occlusion culling, etc.), and translucent objects in turn. The flow chart is as follows:

stateDiagram-v2
    [*] --> DrawClearQuad*
    DrawClearQuad* --> RenderPrePass*
    RenderPrePass* --> RenderMobileBasePass
    RenderMobileBasePass --> RenderOcclusion*
    RenderOcclusion* --> RenderDecals*
    RenderDecals* --> RenderModulatedShadowProjections*
    RenderModulatedShadowProjections* --> RenderTranslucency*
    RenderTranslucency* --> PreTonemapMSAA*
    PreTonemapMSAA* --> [*]

Occlusion culling behavior is vendor-specific. For example, Qualcomm Adreno GPUs require occlusion queries to be submitted between flushing the rendering commands and switching the FBO:

Render Opaque -> Render Translucent -> Flush -> Render Queries -> Switch FBO

UE therefore follows this Adreno-specific requirement and handles occlusion queries specially for these chips.

Adreno chips support a hybrid of the TBDR architecture's binned rendering mode and the ordinary direct mode, and automatically switch to direct mode during occlusion queries to reduce their overhead. If the queries are not submitted between flushing the rendering commands and switching the FBO, the whole rendering pipeline stalls and performance degrades.

MSAA is the preferred anti-aliasing for UE's mobile forward rendering, thanks to native hardware support and a good balance between quality and cost. Hence the code above contains a lot of MSAA handling logic, covering the color and depth textures and their resource states. With MSAA enabled, the scene color is by default resolved in RHICmdList.EndRenderPass() (at which point the on-chip tile data is also written back to system memory), yielding the anti-aliased texture. MSAA is not enabled by default on mobile, but can be enabled in the project settings:
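The MSAA-dependent branching in RenderForward can be condensed into a small decision helper. A hedged sketch, using simplified stand-in types rather than the engine's actual ERenderTargetActions API:

```cpp
// Simplified stand-in for the engine's render target action enum.
enum class EColorAction { Clear_Store, Clear_Resolve };

struct FColorTargetSetup
{
    bool bNeedsResolveTexture; // separate single-sample texture to resolve into
    EColorAction Action;
};

// Mirrors RenderForward's choice: with MSAA, the multisampled surface is
// resolved at EndRenderPass (tile data written back to memory); without
// MSAA, the samples are stored directly.
inline FColorTargetSetup ChooseColorTargetSetup(int NumMSAASamples)
{
    if (NumMSAASamples > 1)
    {
        return { true, EColorAction::Clear_Resolve };
    }
    return { false, EColorAction::Clear_Store };
}
```

On tile-based GPUs the resolve variant is cheaper than store-then-downsample, because the resolve happens as the tile is written out.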

Forward rendering supports both Gamma-space and HDR (linear-space) color modes. In linear space, steps such as tone mapping are required during post-processing. The default is HDR, which can be changed in the project settings:

bRequiresMultiPass in the code above indicates whether a separate render pass is required to draw translucent objects. Its value is determined by the following code:

// Engine\Source\Runtime\Renderer\Private\MobileShadingRenderer.cpp

bool FMobileSceneRenderer::RequiresMultiPass(FRHICommandListImmediate& RHICmdList, const FViewInfo& View) const
{
    // Vulkan uses subpasses
    if (IsVulkanPlatform(ShaderPlatform))
    {
        return false;
    }

    // All iOS support frame_buffer_fetch
    if (IsMetalMobilePlatform(ShaderPlatform))
    {
        return false;
    }

    if (IsMobileDeferredShadingEnabled(ShaderPlatform))
    {
        // TODO: add GL support
        return true;
    }
    
    // Some Androids support frame_buffer_fetch
    if (IsAndroidOpenGLESPlatform(ShaderPlatform) && (GSupportsShaderFramebufferFetch || GSupportsShaderDepthStencilFetch))
    {
        return false;
    }
        
    // Always render reflection capture in single pass
    if (View.bIsPlanarReflection || View.bIsSceneCapture)
    {
        return false;
    }

    // Always render LDR in single pass
    if (!IsMobileHDR())
    {
        return false;
    }

    // MSAA depth can't be sampled or resolved, unless we are on PC (no vulkan)
    if (NumMSAASamples > 1 && !IsSimulatedPlatform(ShaderPlatform))
    {
        return false;
    }

    return true;
}

Similar in form but different in meaning are the bIsMultiViewApplication and bIsMobileMultiViewEnabled flags, which indicate whether multi-view rendering is enabled and the number of views. Multi-view is used only for VR and is determined by the console variable vr.MobileMultiView, the graphics API, and other factors. In XR, MultiView optimizes the two-eye rendering and has two modes, Basic and Advanced:

MultiView comparison for VR-style rendering. Top: without MultiView, each eye submits its draw commands separately. Middle: Basic MultiView, which reuses the submitted commands and duplicates the Command List on the GPU side. Bottom: Advanced MultiView, which can reuse draw calls, the Command List, and geometry data.
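The precedence between these two flags shows up in the MultiViewCount assignment in RenderForward; a standalone sketch of that rule:

```cpp
// Mirrors: View.bIsMobileMultiViewEnabled ? 2 : (bIsMultiViewApplication ? 1 : 0)
// 2 = render both eye views via multi-view,
// 1 = a multi-view application rendered as a single view,
// 0 = multi-view disabled entirely.
inline int ComputeMultiViewCount(bool bIsMobileMultiViewEnabled,
                                 bool bIsMultiViewApplication)
{
    return bIsMobileMultiViewEnabled ? 2 : (bIsMultiViewApplication ? 1 : 0);
}
```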

bKeepDepthContent indicates whether the depth content is kept; its value is determined by the following code:

bKeepDepthContent = 
    bRequiresMultiPass || 
    bForceDepthResolve ||
    bRequiresPixelProjectedPlanarRelfectionPass ||
    bSeparateTranslucencyActive ||
    Views[0].bIsReflectionCapture ||
    (bDeferredShading && bPostProcessUsesSceneDepth) ||
    bShouldRenderVelocities ||
    bIsFullPrepassEnabled;

// Depth is never kept when MSAA is enabled
bKeepDepthContent = (NumMSAASamples > 1 ? false : bKeepDepthContent);

The code above also reveals a special planar reflection technique on mobile: Pixel Projected Reflection (PPR). Its principle is similar to SSR but requires less data: only the scene color, the depth buffer, and the reflection region. Its core steps:

  • Compute the mirrored position of every scene-color pixel with respect to the reflection plane.
  • Test whether the pixel's reflection lies within the reflection region:
    • Cast a ray to the mirrored pixel position.
    • Test whether the intersection is within the reflection region.
  • If an intersection is found, compute the mirrored position of the pixel on screen.
  • Write the color of the mirrored pixel at the intersection.
  • If the intersection within the reflection region is occluded by another object, the reflection at that position is discarded.
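The first step, mirroring a pixel's world position across the reflection plane, is plain vector math: P' = P - 2·dot(P - Q, N)·N for a plane through point Q with unit normal N. A minimal sketch (illustrative, not the engine's shader code):

```cpp
struct FVec3 { float X, Y, Z; };

inline FVec3 operator-(FVec3 A, FVec3 B) { return { A.X - B.X, A.Y - B.Y, A.Z - B.Z }; }
inline FVec3 operator*(FVec3 A, float S)  { return { A.X * S, A.Y * S, A.Z * S }; }
inline float Dot(FVec3 A, FVec3 B)        { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }

// Mirror world-space point P across the plane through PlanePoint with unit normal N.
inline FVec3 MirrorAcrossPlane(FVec3 P, FVec3 PlanePoint, FVec3 N)
{
    const float SignedDist = Dot(P - PlanePoint, N); // signed distance to the plane
    return P - N * (2.0f * SignedDist);
}
```

For example, mirroring (0, 2, 0) across the plane y = 0 yields (0, -2, 0); PPR runs this per pixel and then projects the mirrored point back onto the screen.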

PPR effect comparison.

PPR can be set in the project configuration:

12.3.3 RenderDeferred

UE added a deferred rendering branch to the mobile rendering pipeline in 4.26 and improved and optimized it in 4.27. Whether deferred shading is enabled on mobile is determined by the following code:

// Engine\Source\Runtime\RenderCore\Private\RenderUtils.cpp

bool IsMobileDeferredShadingEnabled(const FStaticShaderPlatform Platform)
{
    // Deferred shading is disabled on OpenGL
    if (IsOpenGLPlatform(Platform))
    {
        // needs MRT framebuffer fetch or PLS
        return false;
    }
    
    // The console variable "r.Mobile.ShadingPath" should be 1
    static auto* MobileShadingPathCvar = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.Mobile.ShadingPath"));
    return MobileShadingPathCvar->GetValueOnAnyThread() == 1;
}

Simply put, deferred shading requires a non-OpenGL graphics API and the console variable r.Mobile.ShadingPath set to 1.

r.Mobile.ShadingPath cannot be changed dynamically in the editor; it can only be enabled by adding the following lines to the project's Config/DefaultEngine.ini:

[/Script/Engine.RendererSettings]

r.Mobile.ShadingPath=1

After adding these lines, restart the UE editor and wait for the shaders to compile to preview the mobile deferred shading effect.

The following is the code and analysis of the deferred rendering branch FMobileSceneRenderer::RenderDeferred:

FRHITexture* FMobileSceneRenderer::RenderDeferred(FRHICommandListImmediate& RHICmdList, const TArrayView<const FViewInfo*> ViewList, const FSortedLightSetSceneInfo& SortedLightSet)
{
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);
    
    // Prepare GBuffer
    FRHITexture* ColorTargets[4] = {
        SceneContext.GetSceneColorSurface(),
        SceneContext.GetGBufferATexture().GetReference(),
        SceneContext.GetGBufferBTexture().GetReference(),
        SceneContext.GetGBufferCTexture().GetReference()
    };

    // Whether the RHI needs to store the GBuffer in GPU system memory and shade in a separate render pass
    ERenderTargetActions GBufferAction = bRequiresMultiPass ? ERenderTargetActions::Clear_Store : ERenderTargetActions::Clear_DontStore;
    EDepthStencilTargetActions DepthAction = bKeepDepthContent ? EDepthStencilTargetActions::ClearDepthStencil_StoreDepthStencil : EDepthStencilTargetActions::ClearDepthStencil_DontStoreDepthStencil;
    
    // RT's load/store action
    ERenderTargetActions ColorTargetsAction[4] = {ERenderTargetActions::Clear_Store, GBufferAction, GBufferAction, GBufferAction};
    if (bIsFullPrepassEnabled)
    {
        ERenderTargetActions DepthTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetDepthActions(DepthAction)));
        ERenderTargetActions StencilTarget = MakeRenderTargetActions(ERenderTargetLoadAction::ELoad, GetStoreAction(GetStencilActions(DepthAction)));
        DepthAction = MakeDepthStencilTargetActions(DepthTarget, StencilTarget);
    }
    
    FRHIRenderPassInfo BasePassInfo = FRHIRenderPassInfo();
    int32 ColorTargetIndex = 0;
    for (; ColorTargetIndex < UE_ARRAY_COUNT(ColorTargets); ++ColorTargetIndex)
    {
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].RenderTarget = ColorTargets[ColorTargetIndex];
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ResolveTarget = nullptr;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ArraySlice = -1;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].MipIndex = 0;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].Action = ColorTargetsAction[ColorTargetIndex];
    }
    
    if (MobileRequiresSceneDepthAux(ShaderPlatform))
    {
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].RenderTarget = SceneContext.SceneDepthAux->GetRenderTargetItem().ShaderResourceTexture.GetReference();
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ResolveTarget = nullptr;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].ArraySlice = -1;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].MipIndex = 0;
        BasePassInfo.ColorRenderTargets[ColorTargetIndex].Action = GBufferAction;
        ColorTargetIndex++;
    }

    BasePassInfo.DepthStencilRenderTarget.DepthStencilTarget = SceneContext.GetSceneDepthSurface();
    BasePassInfo.DepthStencilRenderTarget.ResolveTarget = nullptr;
    BasePassInfo.DepthStencilRenderTarget.Action = DepthAction;
    BasePassInfo.DepthStencilRenderTarget.ExclusiveDepthStencil = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        
    BasePassInfo.SubpassHint = ESubpassHint::DeferredShadingSubpass;
    if (!bIsFullPrepassEnabled)
    {
        BasePassInfo.NumOcclusionQueries = ComputeNumOcclusionQueriesToBatch();
        BasePassInfo.bOcclusionQueries = BasePassInfo.NumOcclusionQueries != 0;
    }
    BasePassInfo.ShadingRateTexture = nullptr;
    BasePassInfo.bIsMSAA = false;
    BasePassInfo.MultiViewCount = 0;

    RHICmdList.BeginRenderPass(BasePassInfo, TEXT("BasePassRendering"));
    
    if (GIsEditor && !Views[0].bIsSceneCapture)
    {
        DrawClearQuad(RHICmdList, Views[0].BackgroundColor);
    }

    // Depth PrePass
    if (!bIsFullPrepassEnabled)
    {
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLM_MobilePrePass));
        // Depth pre-pass
        RenderPrePass(RHICmdList);
    }

    // BasePass: opaque and masked objects
    RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Opaque));
    RenderMobileBasePass(RHICmdList, ViewList);
    RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);

    // Occlusion culling
    if (!bIsFullPrepassEnabled)
    {
        // Issue occlusion queries
        RHICmdList.SetCurrentStat(GET_STATID(STAT_CLMM_Occlusion));
        RenderOcclusion(RHICmdList);
        RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }

    // Non multi Pass mode
    if (!bRequiresMultiPass)
    {
        // Next subpass: SceneColor + GBuffer writable, SceneDepth read-only
        RHICmdList.NextSubpass();
        
        // Render decals
        if (ViewFamily.EngineShowFlags.Decals)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
            RenderDecals(RHICmdList);
        }

        // Next subpass: SceneColor writable, SceneDepth read-only
        RHICmdList.NextSubpass();
        
        // Deferred lighting pass
        MobileDeferredShadingPass(RHICmdList, *Scene, ViewList, SortedLightSet);
        
        // Render translucency
        if (ViewFamily.EngineShowFlags.Translucency)
        {
            CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
            SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
            RenderTranslucency(RHICmdList, ViewList);
            FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
            RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
        }
        
        // End rendering Pass
        RHICmdList.EndRenderPass();
    }
    // Multi-pass mode (e.g. when simulating mobile on a PC device)
    else
    {
        // End the subpasses
        RHICmdList.NextSubpass();
        RHICmdList.NextSubpass();
        RHICmdList.EndRenderPass();
        
        // SceneColor + GBuffer write, SceneDepth is read only
        {
            for (int32 Index = 0; Index < UE_ARRAY_COUNT(ColorTargets); ++Index)
            {
                BasePassInfo.ColorRenderTargets[Index].Action = ERenderTargetActions::Load_Store;
            }
            BasePassInfo.DepthStencilRenderTarget.Action = EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil;
            BasePassInfo.DepthStencilRenderTarget.ExclusiveDepthStencil = FExclusiveDepthStencil::DepthRead_StencilRead;
            BasePassInfo.SubpassHint = ESubpassHint::None;
            BasePassInfo.NumOcclusionQueries = 0;
            BasePassInfo.bOcclusionQueries = false;
            
            RHICmdList.BeginRenderPass(BasePassInfo, TEXT("AfterBasePass"));
            
            // Render decals
            if (ViewFamily.EngineShowFlags.Decals)
            {
                CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderDecals);
                RenderDecals(RHICmdList);
            }
            
            RHICmdList.EndRenderPass();
        }

        // SceneColor write, SceneDepth is read only
        {
            FRHIRenderPassInfo ShadingPassInfo(
                SceneContext.GetSceneColorSurface(),
                ERenderTargetActions::Load_Store,
                nullptr,
                SceneContext.GetSceneDepthSurface(),
                EDepthStencilTargetActions::LoadDepthStencil_StoreDepthStencil, 
                nullptr,
                nullptr,
                VRSRB_Passthrough,
                FExclusiveDepthStencil::DepthRead_StencilWrite
            );
            ShadingPassInfo.NumOcclusionQueries = 0;
            ShadingPassInfo.bOcclusionQueries = false;
            
            RHICmdList.BeginRenderPass(ShadingPassInfo, TEXT("MobileShadingPass"));
            
            // Deferred lighting pass
            MobileDeferredShadingPass(RHICmdList, *Scene, ViewList, SortedLightSet);
            
            // Render translucency
            if (ViewFamily.EngineShowFlags.Translucency)
            {
                CSV_SCOPED_TIMING_STAT_EXCLUSIVE(RenderTranslucency);
                SCOPE_CYCLE_COUNTER(STAT_TranslucencyDrawTime);
                RenderTranslucency(RHICmdList, ViewList);
                FRHICommandListExecutor::GetImmediateCommandList().PollOcclusionQueries();
                RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
            }

            RHICmdList.EndRenderPass();
        }
    }

    return ColorTargets[0];
}

As can be seen from the above, the mobile deferred rendering pipeline resembles the PC's: first render the BasePass to fill the GBuffer with geometric information, then perform the lighting calculation. The flow chart is as follows:

stateDiagram-v2
    [*] --> DrawClearQuad*
    DrawClearQuad* --> RenderPrePass*
    RenderPrePass* --> RenderMobileBasePass
    RenderMobileBasePass --> RenderOcclusion*
    RenderOcclusion* --> RenderDecals*
    RenderDecals* --> MobileDeferredShadingPass*
    MobileDeferredShadingPass* --> RenderTranslucency*
    RenderTranslucency* --> PreTonemapMSAA*
    PreTonemapMSAA* --> [*]

Of course, there are differences from the PC. The most obvious is that the mobile renderer uses SubPasses adapted to the TB(D)R architecture, so that during PrePass depth rendering, BasePass and lighting calculation, the scene color, depth, GBuffer and other data stay in on-chip memory, which improves rendering efficiency and reduces power consumption.
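The bandwidth that on-chip storage saves is easy to estimate with back-of-the-envelope arithmetic. The sketch below is illustrative only (not UE code): it counts the main-memory traffic a tiler avoids by not storing the GBuffer after BasePass and re-loading it for the lighting pass.

```cpp
#include <cstdint>

// Illustrative estimate: bytes of main-memory traffic avoided per frame by
// keeping GBuffer attachments in on-chip tile memory instead of storing them
// after BasePass and loading them back for the lighting pass.
uint64_t GBufferTrafficSavedBytes(uint32_t Width, uint32_t Height,
                                  uint32_t NumTargets, uint32_t BytesPerPixel)
{
    const uint64_t OneTarget = uint64_t(Width) * Height * BytesPerPixel;
    // Each target would otherwise be written out once (store) and read back once (load).
    return OneTarget * NumTargets * 2;
}
```

For a 1920x1080 frame with four RGBA8 targets this is about 66 MB per frame, roughly 4 GB/s at 60 FPS — a significant share of a mobile SoC's memory bandwidth budget.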

12.3.3.1 MobileDeferredShadingPass

Deferred lighting is performed by MobileDeferredShadingPass:

void MobileDeferredShadingPass(
    FRHICommandListImmediate& RHICmdList, 
    const FScene& Scene, 
    const TArrayView<const FViewInfo*> PassViews, 
    const FSortedLightSetSceneInfo &SortedLightSet)
{
    SCOPED_DRAW_EVENT(RHICmdList, MobileDeferredShading);

    const FViewInfo& View0 = *PassViews[0];

    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);
    // Create a Uniform Buffer
    FUniformBufferRHIRef PassUniformBuffer = CreateMobileSceneTextureUniformBuffer(RHICmdList);
    FUniformBufferStaticBindings GlobalUniformBuffers(PassUniformBuffer);
    SCOPED_UNIFORM_BUFFER_GLOBAL_BINDINGS(RHICmdList, GlobalUniformBuffers);
    // Set the viewport
    RHICmdList.SetViewport(View0.ViewRect.Min.X, View0.ViewRect.Min.Y, 0.0f, View0.ViewRect.Max.X, View0.ViewRect.Max.Y, 1.0f);

    // The default material for lighting
    FCachedLightMaterial DefaultMaterial;
    DefaultMaterial.MaterialProxy = UMaterial::GetDefaultMaterial(MD_LightFunction)->GetRenderProxy();
    DefaultMaterial.Material = DefaultMaterial.MaterialProxy->GetMaterialNoFallback(ERHIFeatureLevel::ES3_1);
    check(DefaultMaterial.Material);

    // Draws a directional light
    RenderDirectLight(RHICmdList, Scene, View0, DefaultMaterial);

    if (GMobileUseClusteredDeferredShading == 0)
    {
        // Render simple non clustered lights
        RenderSimpleLights(RHICmdList, Scene, PassViews, SortedLightSet, DefaultMaterial);
    }

    // Render non clustered local lights
    int32 NumLights = SortedLightSet.SortedLights.Num();
    int32 StandardDeferredStart = SortedLightSet.SimpleLightsEnd;
    if (GMobileUseClusteredDeferredShading != 0)
    {
        StandardDeferredStart = SortedLightSet.ClusteredSupportedEnd;
    }

    // Render local lights
    for (int32 LightIdx = StandardDeferredStart; LightIdx < NumLights; ++LightIdx)
    {
        const FSortedLightSceneInfo& SortedLight = SortedLightSet.SortedLights[LightIdx];
        const FLightSceneInfo& LightSceneInfo = *SortedLight.LightSceneInfo;
        RenderLocalLight(RHICmdList, Scene, View0, LightSceneInfo, DefaultMaterial);
    }
}
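The index arithmetic above partitions the sorted light array into consecutive ranges: simple lights first, then clustered-capable lights, then standard deferred lights. A minimal stand-in sketch of that partitioning (field names follow FSortedLightSetSceneInfo; the struct here is a simplification):

```cpp
#include <utility>

// Stand-in for the index fields of FSortedLightSetSceneInfo.
struct FSortedLightRanges
{
    int SimpleLightsEnd;       // [0, SimpleLightsEnd) are simple lights
    int ClusteredSupportedEnd; // [SimpleLightsEnd, ClusteredSupportedEnd) can go clustered
    int NumLights;             // [start, NumLights) are standard deferred lights
};

// Returns the [first, last) range of lights drawn one-by-one as standard
// deferred local lights, mirroring the loop in MobileDeferredShadingPass.
std::pair<int, int> StandardDeferredRange(const FSortedLightRanges& Set,
                                          bool bClusteredShading)
{
    // With clustered shading enabled, simple and clustered-capable lights are
    // handled by the clustered path, so the per-light loop starts later.
    const int Start = bClusteredShading ? Set.ClusteredSupportedEnd
                                        : Set.SimpleLightsEnd;
    return { Start, Set.NumLights };
}
```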

Next, continue to analyze the interfaces for rendering different types of lights:

// Engine\Source\Runtime\Renderer\Private\MobileDeferredShadingPass.cpp

// Render directional light
static void RenderDirectLight(FRHICommandListImmediate& RHICmdList, const FScene& Scene, const FViewInfo& View, const FCachedLightMaterial& DefaultLightMaterial)
{
    FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(RHICmdList);

    // Find the first directional light
    FLightSceneInfo* DirectionalLight = nullptr;
    for (int32 ChannelIdx = 0; ChannelIdx < UE_ARRAY_COUNT(Scene.MobileDirectionalLights) && !DirectionalLight; ChannelIdx++)
    {
        DirectionalLight = Scene.MobileDirectionalLights[ChannelIdx];
    }

    // Render state
    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    // Additively accumulate lighting into SceneColor
    GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One>::GetRHI();
    GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
    // Only the pixels of the default lighting model (MSM_DefaultLit) are drawn
    uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
    GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<
                                        false, CF_Always,
                                        true, CF_Equal, SO_Keep, SO_Keep, SO_Keep,        
                                        false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
                                        GET_STENCIL_MOBILE_SM_MASK(0x7), 0x00>::GetRHI(); // 4 bits for shading models
    
    // Process VS
    TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);
    
    const FMaterialRenderProxy* LightFunctionMaterialProxy = nullptr;
    if (View.Family->EngineShowFlags.LightFunctions && DirectionalLight)
    {
        LightFunctionMaterialProxy = DirectionalLight->Proxy->GetLightFunctionMaterial();
    }
    FMobileDirectLightFunctionPS::FPermutationDomain PermutationVector = FMobileDirectLightFunctionPS::BuildPermutationVector(View, DirectionalLight != nullptr);
    FCachedLightMaterial LightMaterial;
    TShaderRef<FMobileDirectLightFunctionPS> PixelShader;
    GetLightMaterial(DefaultLightMaterial, LightFunctionMaterialProxy, PermutationVector.ToDimensionValueId(), LightMaterial, PixelShader);
    
    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;
    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

    // Process PS
    FMobileDirectLightFunctionPS::FParameters PassParameters;
    PassParameters.Forward = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
    PassParameters.MobileDirectionalLight = Scene.UniformBuffers.MobileDirectionalLightUniformBuffers[1];
    PassParameters.ReflectionCaptureData = Scene.UniformBuffers.ReflectionCaptureUniformBuffer;
    FReflectionUniformParameters ReflectionUniformParameters;
    SetupReflectionUniformParameters(View, ReflectionUniformParameters);
    PassParameters.ReflectionsParameters = CreateUniformBufferImmediate(ReflectionUniformParameters, UniformBuffer_SingleDraw);
    PassParameters.LightFunctionParameters = FVector4(1.0f, 1.0f, 0.0f, 0.0f);
    if (DirectionalLight)
    {
        const bool bUseMovableLight = DirectionalLight && !DirectionalLight->Proxy->HasStaticShadowing();
        PassParameters.LightFunctionParameters2 = FVector(DirectionalLight->Proxy->GetLightFunctionFadeDistance(), DirectionalLight->Proxy->GetLightFunctionDisabledBrightness(), bUseMovableLight ? 1.0f : 0.0f);
        const FVector Scale = DirectionalLight->Proxy->GetLightFunctionScale();
        // Switch x and z so that z of the user specified scale affects the distance along the light direction
        const FVector InverseScale = FVector(1.f / Scale.Z, 1.f / Scale.Y, 1.f / Scale.X);
        PassParameters.WorldToLight = DirectionalLight->Proxy->GetWorldToLight() * FScaleMatrix(FVector(InverseScale));
    }
    FMobileDirectLightFunctionPS::SetParameters(RHICmdList, PixelShader, View, LightMaterial.MaterialProxy, *LightMaterial.Material, PassParameters);
    
    RHICmdList.SetStencilRef(StencilRef);
            
    const FIntPoint TargetSize = SceneContext.GetBufferSizeXY();
    
    // Draw with a full screen rectangle
    DrawRectangle(
        RHICmdList, 
        0, 0, 
        View.ViewRect.Width(), View.ViewRect.Height(), 
        View.ViewRect.Min.X, View.ViewRect.Min.Y, 
        View.ViewRect.Width(), View.ViewRect.Height(),
        FIntPoint(View.ViewRect.Width(), View.ViewRect.Height()), 
        TargetSize, 
        VertexShader);
}

// Render simple lights in non-clustered mode
static void RenderSimpleLights(
    FRHICommandListImmediate& RHICmdList, 
    const FScene& Scene, 
    const TArrayView<const FViewInfo*> PassViews, 
    const FSortedLightSetSceneInfo &SortedLightSet, 
    const FCachedLightMaterial& DefaultMaterial)
{
    const FSimpleLightArray& SimpleLights = SortedLightSet.SimpleLights;
    const int32 NumViews = PassViews.Num();
    const FViewInfo& View0 = *PassViews[0];

    // Process VS
    TShaderMapRef<TDeferredLightVS<true>> VertexShader(View0.ShaderMap);
    TShaderRef<FMobileRadialLightFunctionPS> PixelShaders[2];
    {
        const FMaterialShaderMap* MaterialShaderMap = DefaultMaterial.Material->GetRenderingThreadShaderMap();
        FMobileRadialLightFunctionPS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FMobileRadialLightFunctionPS::FSpotLightDim>(false);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FIESProfileDim>(false);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(false);
        PixelShaders[0] = MaterialShaderMap->GetShader<FMobileRadialLightFunctionPS>(PermutationVector);
        PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(true);
        PixelShaders[1] = MaterialShaderMap->GetShader<FMobileRadialLightFunctionPS>(PermutationVector);
    }

    // Set PSO
    FGraphicsPipelineStateInitializer GraphicsPSOLight[2];
    {
        SetupSimpleLightPSO(RHICmdList, View0, VertexShader, PixelShaders[0], GraphicsPSOLight[0]);
        SetupSimpleLightPSO(RHICmdList, View0, VertexShader, PixelShaders[1], GraphicsPSOLight[1]);
    }
    
    // Set up the stencil-mask PSO
    FGraphicsPipelineStateInitializer GraphicsPSOLightMask;
    {
        RHICmdList.ApplyCachedRenderTargets(GraphicsPSOLightMask);
        GraphicsPSOLightMask.PrimitiveType = PT_TriangleList;
        GraphicsPSOLightMask.BlendState = TStaticBlendStateWriteMask<CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE>::GetRHI();
        GraphicsPSOLightMask.RasterizerState = View0.bReverseCulling ? TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI() : TStaticRasterizerState<FM_Solid, CM_CW>::GetRHI();
        // set stencil to 1 where depth test fails
        GraphicsPSOLightMask.DepthStencilState = TStaticDepthStencilState<
            false, CF_DepthNearOrEqual,
            true, CF_Always, SO_Keep, SO_Replace, SO_Keep,        
            false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
            0x00, STENCIL_SANDBOX_MASK>::GetRHI();
        GraphicsPSOLightMask.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
        GraphicsPSOLightMask.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
        GraphicsPSOLightMask.BoundShaderState.PixelShaderRHI = nullptr;
    }
    
    // Traverse the list of all simple lights and perform shading calculations
    for (int32 LightIndex = 0; LightIndex < SimpleLights.InstanceData.Num(); LightIndex++)
    {
        const FSimpleLightEntry& SimpleLight = SimpleLights.InstanceData[LightIndex];
        for (int32 ViewIndex = 0; ViewIndex < NumViews; ViewIndex++)
        {
            const FViewInfo& View = *PassViews[ViewIndex];
            const FSimpleLightPerViewEntry& SimpleLightPerViewData = SimpleLights.GetViewDependentData(LightIndex, ViewIndex, NumViews);
            const FSphere LightBounds(SimpleLightPerViewData.Position, SimpleLight.Radius);
            
            if (NumViews > 1)
            {
                // set viewports only when we have more than one view,
                // otherwise it is set at the start of the pass
                RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 1.0f);
            }

            // Render a light mask
            SetGraphicsPipelineState(RHICmdList, GraphicsPSOLightMask);
            VertexShader->SetSimpleLightParameters(RHICmdList, View, LightBounds);
            RHICmdList.SetStencilRef(1);
            StencilingGeometry::DrawSphere(RHICmdList);
                        
            // Render lights
            FMobileRadialLightFunctionPS::FParameters PassParameters;
            FDeferredLightUniformStruct DeferredLightUniformsValue;
            SetupSimpleDeferredLightParameters(SimpleLight, SimpleLightPerViewData, DeferredLightUniformsValue);
            PassParameters.DeferredLightUniforms = TUniformBufferRef<FDeferredLightUniformStruct>::CreateUniformBufferImmediate(DeferredLightUniformsValue, EUniformBufferUsage::UniformBuffer_SingleFrame);
            PassParameters.IESTexture = GWhiteTexture->TextureRHI;
            PassParameters.IESTextureSampler = GWhiteTexture->SamplerStateRHI;
            if (SimpleLight.Exponent == 0)
            {
                SetGraphicsPipelineState(RHICmdList, GraphicsPSOLight[1]);
                FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShaders[1], View, DefaultMaterial.MaterialProxy, *DefaultMaterial.Material, PassParameters);
            }
            else
            {
                SetGraphicsPipelineState(RHICmdList, GraphicsPSOLight[0]);
                FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShaders[0], View, DefaultMaterial.MaterialProxy, *DefaultMaterial.Material, PassParameters);
            }
            VertexShader->SetSimpleLightParameters(RHICmdList, View, LightBounds);
            
            // Only the pixels of the default lighting model (MSM_DefaultLit) are drawn
            uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
            RHICmdList.SetStencilRef(StencilRef);

            // Render light sources (point light and spotlight) with spheres to quickly eliminate pixels outside the influence of light sources
            StencilingGeometry::DrawSphere(RHICmdList);
        }
    }
}
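Note the pixel-shader selection in the loop above: SimpleLight.Exponent == 0 selects the inverse-squared permutation, otherwise the legacy exponent-falloff one. The two falloff models can be sketched as follows (the formulas approximate UE's conventions and are illustrative, not lifted from the engine):

```cpp
#include <algorithm>
#include <cmath>

static float Saturate(float X) { return std::min(std::max(X, 0.0f), 1.0f); }

// Legacy exponent-based radial falloff (used when Exponent != 0);
// approximately pow(1 - saturate((d/r)^2), FalloffExponent).
float ExponentFalloff(float Dist, float Radius, float Exponent)
{
    const float D2 = (Dist / Radius) * (Dist / Radius);
    return std::pow(1.0f - Saturate(D2), Exponent);
}

// Physically based inverse-squared falloff with a smooth radius window
// (selected when SimpleLight.Exponent == 0); illustrative of UE's formula.
float InverseSquaredFalloff(float Dist, float Radius)
{
    const float DR = Dist / Radius;
    const float Window = Saturate(1.0f - DR * DR * DR * DR);
    return (Window * Window) / (Dist * Dist + 1.0f);
}
```

Both models fade to zero exactly at the light radius, which is what allows the sphere proxy geometry to bound the light's influence.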

// Render local lights
static void RenderLocalLight(
    FRHICommandListImmediate& RHICmdList, 
    const FScene& Scene, 
    const FViewInfo& View, 
    const FLightSceneInfo& LightSceneInfo, 
    const FCachedLightMaterial& DefaultLightMaterial)
{
    if (!LightSceneInfo.ShouldRenderLight(View))
    {
        return;
    }

    // Ignore non-local lights (anything other than point lights and spot lights)
    const uint8 LightType = LightSceneInfo.Proxy->GetLightType();
    const bool bIsSpotLight = LightType == LightType_Spot;
    const bool bIsPointLight = LightType == LightType_Point;
    if (!bIsSpotLight && !bIsPointLight)
    {
        return;
    }
    
    // Draw the light's stencil mask
    if (GMobileUseLightStencilCulling != 0)
    {
        RenderLocalLight_StencilMask(RHICmdList, Scene, View, LightSceneInfo);
    }

    // Handle IES illumination
    bool bUseIESTexture = false;
    FTexture* IESTextureResource = GWhiteTexture;
    if (View.Family->EngineShowFlags.TexturedLightProfiles && LightSceneInfo.Proxy->GetIESTextureResource())
    {
        IESTextureResource = LightSceneInfo.Proxy->GetIESTextureResource();
        bUseIESTexture = true;
    }
        
    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;
    const FSphere LightBounds = LightSceneInfo.Proxy->GetBoundingSphere();
    
    // Sets the light rasterization and depth state
    if (GMobileUseLightStencilCulling != 0)
    {
        SetLocalLightRasterizerAndDepthState_StencilMask(GraphicsPSOInit, View);
    }
    else
    {
        SetLocalLightRasterizerAndDepthState(GraphicsPSOInit, View, LightBounds);
    }

    // Set VS
    TShaderMapRef<TDeferredLightVS<true>> VertexShader(View.ShaderMap);
        
    const FMaterialRenderProxy* LightFunctionMaterialProxy = nullptr;
    if (View.Family->EngineShowFlags.LightFunctions)
    {
        LightFunctionMaterialProxy = LightSceneInfo.Proxy->GetLightFunctionMaterial();
    }
    FMobileRadialLightFunctionPS::FPermutationDomain PermutationVector;
    PermutationVector.Set<FMobileRadialLightFunctionPS::FSpotLightDim>(bIsSpotLight);
    PermutationVector.Set<FMobileRadialLightFunctionPS::FInverseSquaredDim>(LightSceneInfo.Proxy->IsInverseSquared());
    PermutationVector.Set<FMobileRadialLightFunctionPS::FIESProfileDim>(bUseIESTexture);
    FCachedLightMaterial LightMaterial;
    TShaderRef<FMobileRadialLightFunctionPS> PixelShader;
    GetLightMaterial(DefaultLightMaterial, LightFunctionMaterialProxy, PermutationVector.ToDimensionValueId(), LightMaterial, PixelShader);
            
    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

    VertexShader->SetParameters(RHICmdList, View, &LightSceneInfo);

    // Set PS
    FMobileRadialLightFunctionPS::FParameters PassParameters;
    PassParameters.DeferredLightUniforms = TUniformBufferRef<FDeferredLightUniformStruct>::CreateUniformBufferImmediate(GetDeferredLightParameters(View, LightSceneInfo), EUniformBufferUsage::UniformBuffer_SingleFrame);
    PassParameters.IESTexture = IESTextureResource->TextureRHI;
    PassParameters.IESTextureSampler = IESTextureResource->SamplerStateRHI;
    const float TanOuterAngle = bIsSpotLight ? FMath::Tan(LightSceneInfo.Proxy->GetOuterConeAngle()) : 1.0f;
    PassParameters.LightFunctionParameters = FVector4(TanOuterAngle, 1.0f /*ShadowFadeFraction*/, bIsSpotLight ? 1.0f : 0.0f, bIsPointLight ? 1.0f : 0.0f);
    PassParameters.LightFunctionParameters2 = FVector(LightSceneInfo.Proxy->GetLightFunctionFadeDistance(), LightSceneInfo.Proxy->GetLightFunctionDisabledBrightness(),    0.0f);
    const FVector Scale = LightSceneInfo.Proxy->GetLightFunctionScale();
    // Switch x and z so that z of the user specified scale affects the distance along the light direction
    const FVector InverseScale = FVector(1.f / Scale.Z, 1.f / Scale.Y, 1.f / Scale.X);
    PassParameters.WorldToLight = LightSceneInfo.Proxy->GetWorldToLight() * FScaleMatrix(FVector(InverseScale));
    FMobileRadialLightFunctionPS::SetParameters(RHICmdList, PixelShader, View, LightMaterial.MaterialProxy, *LightMaterial.Material, PassParameters);

    // Only the pixels of the default lighting model (MSM_DefaultLit) are drawn
    uint8 StencilRef = GET_STENCIL_MOBILE_SM_MASK(MSM_DefaultLit);
    RHICmdList.SetStencilRef(StencilRef);

    // Point lights are drawn with spheres
    if (LightType == LightType_Point)
    {
        StencilingGeometry::DrawSphere(RHICmdList);
    }
    // Spotlights are drawn with cones
    else // LightType_Spot
    {
        StencilingGeometry::DrawCone(RHICmdList);
    }
}

Light sources are drawn in three steps according to their type: directional lights, non-clustered simple lights, and local lights (point lights and spot lights). Note that the mobile renderer only supports the default shading model (MSM_DefaultLit); the advanced shading models (hair, subsurface scattering, clear coat, eye, cloth, etc.) are not supported yet.
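The "DefaultLit only" restriction is enforced with a masked stencil compare: BasePass writes the shading model id into a few stencil bits, and the lighting pass does a CF_Equal test whose read mask covers only those bits. A simplified simulation (the bit layout here is illustrative; the exact GET_STENCIL_MOBILE_SM_MASK layout may differ):

```cpp
#include <cstdint>

// Illustrative stencil layout: shading model id packed into bits [1..3]
// (a stand-in for GET_STENCIL_MOBILE_SM_MASK, whose real layout may differ).
constexpr uint8_t StencilSMMask(uint8_t Value) { return uint8_t((Value & 0x7) << 1); }

constexpr uint8_t MSM_DefaultLit = 1; // stand-in shading model id

// CF_Equal stencil test with a read mask, as configured by
// TStaticDepthStencilState<..., GET_STENCIL_MOBILE_SM_MASK(0x7), 0x00>.
bool StencilEqualMasked(uint8_t Stored, uint8_t Ref, uint8_t ReadMask)
{
    return (Stored & ReadMask) == (Ref & ReadMask);
}
```

Pixels whose stencil carries a different shading model id fail the masked compare and are skipped by the light draw; bits outside the read mask (used for other purposes, such as the sandbox bit) do not affect the result.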

At most one directional light is drawn. It is rendered as a full-screen rectangle and supports multiple CSM shadow cascades.

Non-clustered simple lights, whether point or spot, are drawn with spheres and do not support shadows.

Drawing a local light is more involved: first the light's stencil mask is drawn, then the rasterizer and depth-stencil states are set, and finally the light itself is drawn. Point lights are drawn with spheres and do not support shadows; spot lights are drawn with cones and can support shadows. By default, spot lights do not support dynamic shadows; this must be enabled in the project settings:

In addition, whether stencil culling of pixels that do not intersect the light volume is enabled is controlled by GMobileUseLightStencilCulling, which is driven by the console variable r.Mobile.UseLightStencilCulling and defaults to 1 (on). The code that renders the light's stencil mask is as follows:

static void RenderLocalLight_StencilMask(FRHICommandListImmediate& RHICmdList, const FScene& Scene, const FViewInfo& View, const FLightSceneInfo& LightSceneInfo)
{
    const uint8 LightType = LightSceneInfo.Proxy->GetLightType();

    FGraphicsPipelineStateInitializer GraphicsPSOInit;
    // Apply cached RT (color / depth, etc.)
    RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
    GraphicsPSOInit.PrimitiveType = PT_TriangleList;
    // Disable all RT writes
    GraphicsPSOInit.BlendState = TStaticBlendStateWriteMask<CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE, CW_NONE>::GetRHI();
    GraphicsPSOInit.RasterizerState = View.bReverseCulling ? TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI() : TStaticRasterizerState<FM_Solid, CM_CW>::GetRHI();
    // Where the depth test fails, write 1 into the stencil buffer
    GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<
        false, CF_DepthNearOrEqual,
        true, CF_Always, SO_Keep, SO_Replace, SO_Keep,        
        false, CF_Always, SO_Keep, SO_Keep, SO_Keep,
        0x00, 
        // Note that only this pass's dedicated sandbox bit is written, i.e. bit 0 of the stencil buffer
        STENCIL_SANDBOX_MASK>::GetRHI();
    
    // The stencil-mask pass uses TDeferredLightVS as its VS
    TShaderMapRef<TDeferredLightVS<true> > VertexShader(View.ShaderMap);
    GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4();
    GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
    // PS is empty
    GraphicsPSOInit.BoundShaderState.PixelShaderRHI = nullptr;

    SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
    VertexShader->SetParameters(RHICmdList, View, &LightSceneInfo);
    // Stencil reference value is 1
    RHICmdList.SetStencilRef(1);

    // Draw with a different proxy shape depending on the light type
    if (LightType == LightType_Point)
    {
        StencilingGeometry::DrawSphere(RHICmdList);
    }
    else // LightType_Spot
    {
        StencilingGeometry::DrawCone(RHICmdList);
    }
}

For each local light, the mask covering the light's volume is drawn first, and lighting is then computed only for pixels that pass the stencil test (accelerated by Early-Z). The spot light in the following figure illustrates the process:

Top: a spot light in the scene waiting to be rendered; middle: the stencil mask (white area) drawn by the stencil pass, marking screen-space pixels that overlap the spot light's shape but are closer in depth; bottom: the result of lighting the remaining valid pixels.

The depth-stencil state used for lighting the valid pixels is as follows:

A pixel may only be lit if it lies inside the light's volume; pixels outside it are culled. The stencil pass marks pixels that are closer than the light volume (i.e. in front of it); the lighting pass then rejects those marked pixels via the stencil test and uses the depth test to keep only pixels inside the volume, improving lighting efficiency.
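The two passes can be simulated per pixel. In this simplified sketch (not UE code), NearFace and FarFace are the depths at which the light volume's front and back surfaces cover the pixel, and depths grow with distance from the camera:

```cpp
// Per-pixel simulation of two-pass stencil light culling (simplified).
struct PixelResult { bool StencilMarked; bool Lit; };

PixelResult CullLightPixel(float SceneDepth, float NearFace, float FarFace)
{
    PixelResult R{};
    // Pass 1 (stencil mask, no color writes): where the depth test fails
    // (scene geometry is in front of the volume surface), write stencil = 1.
    R.StencilMarked = SceneDepth < NearFace;
    // Pass 2 (lighting): the stencil test rejects marked pixels, and the depth
    // test then keeps only pixels whose scene depth lies inside the volume.
    R.Lit = !R.StencilMarked && SceneDepth <= FarFace;
    return R;
}
```

A pixel ends up lit exactly when NearFace <= SceneDepth <= FarFace, i.e. when the visible surface sits inside the light volume; pixels in front of or behind the volume are rejected before the expensive lighting shader runs.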

This stencil-based light culling on mobile is similar to the stencil-based lighting described in Unity's SIGGRAPH 2020 talk Deferred Shading in Unity URP (the idea is the same, though the implementations may differ). The talk also proposes proxy geometry that better fits each light's shape:

It also compares the performance of various lighting techniques on PC and mobile; below is the comparison on a Mali GPU:

The comparison on Mali GPUs shows that stencil-culled lighting outperforms both the conventional and tile-based approaches on mobile.

It is worth mentioning that stencil-based light culling combined with the GPU's Early-Z greatly improves lighting performance. Mainstream mobile GPUs all support Early-Z, which lays the foundation for applying this technique.

There may still be room for improvement in UE's current light culling: for example, pixels facing away from the light (the red boxes below) need not be shaded. (How to find such back-facing pixels quickly and cheaply is, however, another problem.)
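One cheap criterion for those back-facing pixels would be the sign of N·L: a surface whose normal points away from the light receives no direct lighting. This is a hypothetical per-pixel early-out, not something UE's implementation does:

```cpp
// Minimal vector type for the sketch.
struct Vec3 { float x, y, z; };

static float Dot(const Vec3& A, const Vec3& B)
{
    return A.x * B.x + A.y * B.y + A.z * B.z;
}

// Hypothetical early-out: N is the surface normal, L the unit direction from
// the surface toward the light. If N·L <= 0 the pixel gets no direct light
// and could in principle be skipped before the full lighting shader runs.
bool FacesLight(const Vec3& N, const Vec3& L)
{
    return Dot(N, L) > 0.0f;
}
```

The catch is that the stencil/depth hardware only sees depth, not normals, so applying this test would require either a shader-side branch (which still pays the wave occupancy cost) or encoding facing information into another stencil bit.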

12.3.3.2 MobileBasePassShader

This section describes the shaders involved in the mobile BasePass, covering both the VS and the PS. First, the VS:

// Engine\Shaders\Private\MobileBasePassVertexShader.usf

(......)

struct FMobileShadingBasePassVSToPS
{
    FVertexFactoryInterpolantsVSToPS FactoryInterpolants;
    FMobileBasePassInterpolantsVSToPS BasePassInterpolants;
    float4 Position : SV_POSITION;
};

#define FMobileShadingBasePassVSOutput FMobileShadingBasePassVSToPS
#define VertexFactoryGetInterpolants VertexFactoryGetInterpolantsVSToPS

// VS main entrance
void Main(
    FVertexFactoryInput Input
    , out FMobileShadingBasePassVSOutput Output
#if INSTANCED_STEREO
    , uint InstanceId : SV_InstanceID
    , out uint LayerIndex : SV_RenderTargetArrayIndex
#elif MOBILE_MULTI_VIEW
    , in uint ViewId : SV_ViewID
#endif
    )
{
// Stereo view mode
#if INSTANCED_STEREO
    const uint EyeIndex = GetEyeIndex(InstanceId);
    ResolvedView = ResolveView(EyeIndex);
    LayerIndex = EyeIndex;
    Output.BasePassInterpolants.MultiViewId = float(EyeIndex);
// Multi view mode
#elif MOBILE_MULTI_VIEW
    #if COMPILER_GLSL_ES3_1
        const int MultiViewId = int(ViewId);
        ResolvedView = ResolveView(uint(MultiViewId));
        Output.BasePassInterpolants.MultiViewId = float(MultiViewId);
    #else
        ResolvedView = ResolveView(ViewId);
        Output.BasePassInterpolants.MultiViewId = float(ViewId);
    #endif
#else
    ResolvedView = ResolveView();
#endif

    // Initialize the packed interpolation data
#if PACK_INTERPOLANTS
    float4 PackedInterps[NUM_VF_PACKED_INTERPOLANTS];
    UNROLL 
    for(int i = 0; i < NUM_VF_PACKED_INTERPOLANTS; ++i)
    {
        PackedInterps[i] = 0;
    }
#endif

    // Process vertex factory data
    FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);
    float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);
    float4 WorldPosition = WorldPositionExcludingWPO;

    // Get the vertex data of the material, process the coordinates, etc
    half3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);    
    FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);

    half3 WorldPositionOffset = GetMaterialWorldPositionOffset(VertexParameters);
    
    WorldPosition.xyz += WorldPositionOffset;

    float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
    Output.Position = mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip);
    Output.BasePassInterpolants.PixelPosition = WorldPosition;

#if USE_WORLD_POSITION_EXCLUDING_SHADER_OFFSETS
    Output.BasePassInterpolants.PixelPositionExcludingWPO = WorldPositionExcludingWPO.xyz;
#endif

    // Clip plane
#if USE_PS_CLIP_PLANE
    Output.BasePassInterpolants.OutClipDistance = dot(ResolvedView.GlobalClippingPlane, float4(WorldPosition.xyz - ResolvedView.PreViewTranslation.xyz, 1));
#endif

    // Vertex fog
#if USE_VERTEX_FOG
    float4 VertexFog = CalculateHeightFog(WorldPosition.xyz - ResolvedView.TranslatedWorldCameraOrigin);

    #if PROJECT_SUPPORT_SKY_ATMOSPHERE && MATERIAL_IS_SKY==0 // Do not apply aerial perspective on sky materials
        if (ResolvedView.SkyAtmosphereApplyCameraAerialPerspectiveVolume > 0.0f)
        {
            const float OneOverPreExposure = USE_PREEXPOSURE ? ResolvedView.OneOverPreExposure : 1.0f;
            // Sample the aerial perspective (AP). It is also blended under the VertexFog parameter.
            VertexFog = GetAerialPerspectiveLuminanceTransmittanceWithFogOver(
                ResolvedView.RealTimeReflectionCapture, ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeSizeAndInvSize,
                Output.Position, WorldPosition.xyz*CM_TO_SKY_UNIT, ResolvedView.TranslatedWorldCameraOrigin*CM_TO_SKY_UNIT,
                View.CameraAerialPerspectiveVolume, View.CameraAerialPerspectiveVolumeSampler,
                ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthResolutionInv,
                ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthResolution,
                ResolvedView.SkyAtmosphereAerialPerspectiveStartDepthKm,
                ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthSliceLengthKm,
                ResolvedView.SkyAtmosphereCameraAerialPerspectiveVolumeDepthSliceLengthKmInv,
                OneOverPreExposure, VertexFog);
        }
    #endif

    #if PACK_INTERPOLANTS
        PackedInterps[0] = VertexFog;
    #else
        Output.BasePassInterpolants.VertexFog = VertexFog;
    #endif // PACK_INTERPOLANTS
#endif // USE_VERTEX_FOG

    (......)

    // Obtain the data to be interpolated
    Output.FactoryInterpolants = VertexFactoryGetInterpolants(Input, VFIntermediates, VertexParameters);

    Output.BasePassInterpolants.PixelPosition.w = Output.Position.w;

    // Pack interpolation data
#if PACK_INTERPOLANTS
    VertexFactoryPackInterpolants(Output.FactoryInterpolants, PackedInterps);
#endif // PACK_INTERPOLANTS

#if !OUTPUT_MOBILE_HDR && COMPILER_GLSL_ES3_1
    Output.Position.y *= -1;
#endif
}

As can be seen above, views are resolved differently for instanced stereo, multi-view and normal rendering. Vertex fog is supported, but it is off by default and must be enabled in the project settings.

There is also a packed-interpolant mode that compresses the data interpolated between VS and PS to save bandwidth. Whether it is enabled is determined by the macro PACK_INTERPOLANTS, defined as follows:

// Engine\Shaders\Private\MobileBasePassCommon.ush

#define PACK_INTERPOLANTS (USE_VERTEX_FOG && NUM_VF_PACKED_INTERPOLANTS > 0 && (ES3_1_PROFILE))

In other words, packed interpolation is enabled only when vertex fog is on, the vertex factory provides packed interpolant data, and the OpenGL ES 3.1 shading profile is used. Compared with the PC BasePass VS, the mobile version is greatly simplified and can be regarded as a small subset of it. Next, the PS:

// Engine\Shaders\Private\MobileBasePassPixelShader.usf

#include "Common.ush"

// Various macro definitions
#define MobileSceneTextures MobileBasePass.SceneTextures
#define EyeAdaptationStruct MobileBasePass

(......)

// Pre-normalized reflection captures closest to the rendered object (not used for fully rough materials)
#if !FULLY_ROUGH
    #if HQ_REFLECTIONS
    #define MAX_HQ_REFLECTIONS 3
    TextureCube ReflectionCubemap0;
    SamplerState ReflectionCubemapSampler0;
    TextureCube ReflectionCubemap1;
    SamplerState ReflectionCubemapSampler1;
    TextureCube ReflectionCubemap2;
    SamplerState ReflectionCubemapSampler2;
    // x,y,z - inverted average brightness for 0, 1, 2; w - sky cube texture max mips.
    float4 ReflectionAverageBrigtness;
    float4 ReflectanceMaxValueRGBM;
    float4 ReflectionPositionsAndRadii[MAX_HQ_REFLECTIONS];
        #if ALLOW_CUBE_REFLECTIONS
        float4x4 CaptureBoxTransformArray[MAX_HQ_REFLECTIONS];
        float4 CaptureBoxScalesArray[MAX_HQ_REFLECTIONS];
        #endif
    #endif
#endif

// Reflection / IBL interfaces
half4 GetPlanarReflection(float3 WorldPosition, half3 WorldNormal, half Roughness);
half MobileComputeMixingWeight(half IndirectIrradiance, half AverageBrightness, half Roughness);
half3 GetLookupVectorForBoxCaptureMobile(half3 ReflectionVector, ...);
half3 GetLookupVectorForSphereCaptureMobile(half3 ReflectionVector, ...);
void GatherSpecularIBL(FMaterialPixelParameters MaterialParameters, ...);
void BlendReflectionCaptures(FMaterialPixelParameters MaterialParameters, ...);
half3 GetImageBasedReflectionLighting(FMaterialPixelParameters MaterialParameters, ...);

// Other interfaces
half3 FrameBufferBlendOp(half4 Source);
bool UseCSM();
void ApplyPixelDepthOffsetForMobileBasePass(inout FMaterialPixelParameters MaterialParameters, FPixelMaterialInputs PixelMaterialInputs, out float OutDepth);

// Accumulate lighting from dynamic point lights
#if MAX_DYNAMIC_POINT_LIGHTS > 0
void AccumulateLightingOfDynamicPointLight(
    FMaterialPixelParameters MaterialParameters, 
    FMobileShadingModelContext ShadingModelContext,
    FGBufferData GBuffer,
    float4 LightPositionAndInvRadius, 
    float4 LightColorAndFalloffExponent, 
    float4 SpotLightDirectionAndSpecularScale, 
    float4 SpotLightAnglesAndSoftTransitionScaleAndLightShadowType, 
    #if SUPPORT_SPOTLIGHTS_SHADOW
    FPCFSamplerSettings Settings,
    float4 SpotLightShadowSharpenAndShadowFadeFraction,
    float4 SpotLightShadowmapMinMax,
    float4x4 SpotLightShadowWorldToShadowMatrix,
    #endif
    inout half3 Color)
{
    uint LightShadowType = SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.w;
    float FadedShadow = 1.0f;

    // Calculate spotlight shadows
#if SUPPORT_SPOTLIGHTS_SHADOW
    if ((LightShadowType & LightShadowType_Shadow) == LightShadowType_Shadow)
    {

        float4 HomogeneousShadowPosition = mul(float4(MaterialParameters.AbsoluteWorldPosition, 1), SpotLightShadowWorldToShadowMatrix);
        float2 ShadowUVs = HomogeneousShadowPosition.xy / HomogeneousShadowPosition.w;
        if (all(ShadowUVs >= SpotLightShadowmapMinMax.xy && ShadowUVs <= SpotLightShadowmapMinMax.zw))
        {
            // Clamp pixel depth in light space for shadowing opaque, because areas of the shadow depth buffer that weren't rendered to will have been cleared to 1
            // We want to force the shadow comparison to result in 'unshadowed' in that case, regardless of whether the pixel being shaded is in front or behind that plane
            float LightSpacePixelDepthForOpaque = min(HomogeneousShadowPosition.z, 0.99999f);
            Settings.SceneDepth = LightSpacePixelDepthForOpaque;
            Settings.TransitionScale = SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.z;

            half Shadow = MobileShadowPCF(ShadowUVs, Settings);

            Shadow = saturate((Shadow - 0.5) * SpotLightShadowSharpenAndShadowFadeFraction.x + 0.5);

            FadedShadow = lerp(1.0f, Square(Shadow), SpotLightShadowSharpenAndShadowFadeFraction.y);
        }
    }
#endif

    // Calculate illumination
    if ((LightShadowType & ValidLightType) != 0)
    {
        float3 ToLight = LightPositionAndInvRadius.xyz - MaterialParameters.AbsoluteWorldPosition;
        float DistanceSqr = dot(ToLight, ToLight);
        float3 L = ToLight * rsqrt(DistanceSqr);
        half3 PointH = normalize(MaterialParameters.CameraVector + L);

        half PointNoL = max(0, dot(MaterialParameters.WorldNormal, L));
        half PointNoH = max(0, dot(MaterialParameters.WorldNormal, PointH));

        // Calculates the attenuation of the light source
        float Attenuation;
        if (LightColorAndFalloffExponent.w == 0)
        {
            // Sphere falloff (technically just 1/d2 but this avoids inf)
            Attenuation = 1 / (DistanceSqr + 1);

            float LightRadiusMask = Square(saturate(1 - Square(DistanceSqr * (LightPositionAndInvRadius.w * LightPositionAndInvRadius.w))));
            Attenuation *= LightRadiusMask;
        }
        else
        {
            Attenuation = RadialAttenuation(ToLight * LightPositionAndInvRadius.w, LightColorAndFalloffExponent.w);
        }

#if PROJECT_MOBILE_ENABLE_MOVABLE_SPOTLIGHTS
        if ((LightShadowType & LightShadowType_SpotLight) == LightShadowType_SpotLight)
        {
            Attenuation *= SpotAttenuation(L, -SpotLightDirectionAndSpecularScale.xyz, SpotLightAnglesAndSoftTransitionScaleAndLightShadowType.xy) * FadedShadow;
        }
#endif

        // Accumulate lighting results
#if !FULLY_ROUGH
        FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, PointNoL, MaterialParameters.CameraVector, PointH, PointNoH);
        Color += min(65000.0, (Attenuation) * LightColorAndFalloffExponent.rgb * (1.0 / PI) * (Lighting.Diffuse + Lighting.Specular * SpotLightDirectionAndSpecularScale.w));
#else
        Color += (Attenuation * PointNoL) * LightColorAndFalloffExponent.rgb * (1.0 / PI) * ShadingModelContext.DiffuseColor;
#endif
    }
}
#endif

(......)

// Calculate indirect illumination
half ComputeIndirect(VTPageTableResult LightmapVTPageTableResult, FVertexFactoryInterpolantsVSToPS Interpolants, float3 DiffuseDir, FMobileShadingModelContext ShadingModelContext, out half IndirectIrradiance, out half3 Color)
{
    //To keep IndirectLightingCache conherence with PC, initialize the IndirectIrradiance to zero.
    IndirectIrradiance = 0;
    Color = 0;

    // Indirect diffuse reflection
#if LQ_TEXTURE_LIGHTMAP
    float2 LightmapUV0, LightmapUV1;
    uint LightmapDataIndex;
    GetLightMapCoordinates(Interpolants, LightmapUV0, LightmapUV1, LightmapDataIndex);

    half4 LightmapColor = GetLightMapColorLQ(LightmapVTPageTableResult, LightmapUV0, LightmapUV1, LightmapDataIndex, DiffuseDir);
    Color += LightmapColor.rgb * ShadingModelContext.DiffuseColor * View.IndirectLightingColorScale;
    IndirectIrradiance = LightmapColor.a;
#elif CACHED_POINT_INDIRECT_LIGHTING
    #if MATERIALBLENDING_MASKED || MATERIALBLENDING_SOLID
        // Apply normals to translucent objects
        FThreeBandSHVectorRGB PointIndirectLighting;
        PointIndirectLighting.R.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[0];
        PointIndirectLighting.R.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[0];
        PointIndirectLighting.R.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[0];

        PointIndirectLighting.G.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[1];
        PointIndirectLighting.G.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[1];
        PointIndirectLighting.G.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[1];

        PointIndirectLighting.B.V0 = IndirectLightingCache.IndirectLightingSHCoefficients0[2];
        PointIndirectLighting.B.V1 = IndirectLightingCache.IndirectLightingSHCoefficients1[2];
        PointIndirectLighting.B.V2 = IndirectLightingCache.IndirectLightingSHCoefficients2[2];

        FThreeBandSHVector DiffuseTransferSH = CalcDiffuseTransferSH3(DiffuseDir, 1);

        // Calculates diffuse illumination with normal effects added
        half3 DiffuseGI = max(half3(0, 0, 0), DotSH3(PointIndirectLighting, DiffuseTransferSH));

        IndirectIrradiance = Luminance(DiffuseGI);
        Color += ShadingModelContext.DiffuseColor * DiffuseGI * View.IndirectLightingColorScale;
    #else 
        // Translucency uses the non-directional SH coefficient: diffuse lighting is packed into xyz, with everything except the 1/PI and SH diffuse transfer already applied on the CPU side
        half3 PointIndirectLighting = IndirectLightingCache.IndirectLightingSHSingleCoefficient.rgb;
        half3 DiffuseGI = PointIndirectLighting;

        IndirectIrradiance = Luminance(DiffuseGI);
        Color += ShadingModelContext.DiffuseColor * DiffuseGI * View.IndirectLightingColorScale;
    #endif
#endif

    return IndirectIrradiance;
}

// PS entry point
PIXELSHADER_EARLYDEPTHSTENCIL
void Main( 
    FVertexFactoryInterpolantsVSToPS Interpolants
    , FMobileBasePassInterpolantsVSToPS BasePassInterpolants
    , in float4 SvPosition : SV_Position
    OPTIONAL_IsFrontFace
    , out half4 OutColor    : SV_Target0
#if DEFERRED_SHADING_PATH
    , out half4 OutGBufferA    : SV_Target1
    , out half4 OutGBufferB    : SV_Target2
    , out half4 OutGBufferC    : SV_Target3
#endif
#if USE_SCENE_DEPTH_AUX
    , out float OutSceneDepthAux : SV_Target4
#endif
#if OUTPUT_PIXEL_DEPTH_OFFSET
    , out float OutDepth : SV_Depth
#endif
    )
{
#if MOBILE_MULTI_VIEW
    ResolvedView = ResolveView(uint(BasePassInterpolants.MultiViewId));
#else
    ResolvedView = ResolveView();
#endif

#if USE_PS_CLIP_PLANE
    clip(BasePassInterpolants.OutClipDistance);
#endif

    // Decompress the packed interpolation data
#if PACK_INTERPOLANTS
    float4 PackedInterpolants[NUM_VF_PACKED_INTERPOLANTS];
    VertexFactoryUnpackInterpolants(Interpolants, PackedInterpolants);
#endif

#if COMPILER_GLSL_ES3_1 && !OUTPUT_MOBILE_HDR && !MOBILE_EMULATION
    // LDR Mobile needs screen vertical flipped
    SvPosition.y = ResolvedView.BufferSizeAndInvSize.y - SvPosition.y - 1;
#endif

    // Gets the pixel properties of the material
    FMaterialPixelParameters MaterialParameters = GetMaterialPixelParameters(Interpolants, SvPosition);
    FPixelMaterialInputs PixelMaterialInputs;
    {
        float4 ScreenPosition = SvPositionToResolvedScreenPosition(SvPosition);
        float3 WorldPosition = BasePassInterpolants.PixelPosition.xyz;
        float3 WorldPositionExcludingWPO = BasePassInterpolants.PixelPosition.xyz;
        #if USE_WORLD_POSITION_EXCLUDING_SHADER_OFFSETS
            WorldPositionExcludingWPO = BasePassInterpolants.PixelPositionExcludingWPO;
        #endif
        CalcMaterialParametersEx(MaterialParameters, PixelMaterialInputs, SvPosition, ScreenPosition, bIsFrontFace, WorldPosition, WorldPositionExcludingWPO);

#if FORCE_VERTEX_NORMAL
        // Quality level override of material's normal calculation, can be used to avoid normal map reads etc.
        MaterialParameters.WorldNormal = MaterialParameters.TangentToWorld[2];
        MaterialParameters.ReflectionVector = ReflectionAboutCustomWorldNormal(MaterialParameters, MaterialParameters.WorldNormal, false);
#endif
    }

    // Pixel depth offset
#if OUTPUT_PIXEL_DEPTH_OFFSET
    ApplyPixelDepthOffsetForMobileBasePass(MaterialParameters, PixelMaterialInputs, OutDepth);
#endif
      
    // Mask material
#if !EARLY_Z_PASS_ONLY_MATERIAL_MASKING
    //Clip if the blend mode requires it.
    GetMaterialCoverageAndClipping(MaterialParameters, PixelMaterialInputs);
#endif

    // Compute and cache GBuffer data to avoid repeated texture fetches later
    FGBufferData GBuffer = (FGBufferData)0;
    GBuffer.WorldNormal = MaterialParameters.WorldNormal;
    GBuffer.BaseColor = GetMaterialBaseColor(PixelMaterialInputs);
    GBuffer.Metallic = GetMaterialMetallic(PixelMaterialInputs);
    GBuffer.Specular = GetMaterialSpecular(PixelMaterialInputs);
    GBuffer.Roughness = GetMaterialRoughness(PixelMaterialInputs);
    GBuffer.ShadingModelID = GetMaterialShadingModel(PixelMaterialInputs);
    half MaterialAO = GetMaterialAmbientOcclusion(PixelMaterialInputs);

    // Apply AO
#if APPLY_AO
    half4 GatheredAmbientOcclusion = Texture2DSample(AmbientOcclusionTexture, AmbientOcclusionSampler, SvPositionToBufferUV(SvPosition));

    MaterialAO *= GatheredAmbientOcclusion.r;
#endif

    GBuffer.GBufferAO = MaterialAO;

    // Since the smallest positive value representable in IEEE 754 half precision (FP16) is 2^-24 = 5.96e-8, and the lighting code later computes 1.0 / Roughness^4, we must guarantee Roughness^4 >= 5.96e-8 to avoid a division error. The roughness is therefore clamped to 0.015625 (0.015625^4 = 5.96e-8)
    // It is also effectively clamped to 1.0 to match deferred rendering on PC, where roughness is stored as an 8-bit value
    GBuffer.Roughness = max(0.015625, GetMaterialRoughness(PixelMaterialInputs));
    
    // Initialize the mobile end shading model context FMobileShadingModelContext
    FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0;
    ShadingModelContext.Opacity = GetMaterialOpacity(PixelMaterialInputs);

    // Thin translucency
#if MATERIAL_SHADINGMODEL_THIN_TRANSLUCENT
    (......)
#endif

    half3 Color = 0;

    // Custom data
    half CustomData0 = GetMaterialCustomData0(MaterialParameters);
    half CustomData1 = GetMaterialCustomData1(MaterialParameters);
    InitShadingModelContext(ShadingModelContext, GBuffer, MaterialParameters.SvPosition, MaterialParameters.CameraVector, CustomData0, CustomData1);
    float3 DiffuseDir = MaterialParameters.WorldNormal;

    // Hair model
#if MATERIAL_SHADINGMODEL_HAIR
    (......)
#endif

    // Lightmap virtual texture
    VTPageTableResult LightmapVTPageTableResult = (VTPageTableResult)0.0f;
#if LIGHTMAP_VT_ENABLED
    {
        float2 LightmapUV0, LightmapUV1;
        uint LightmapDataIndex;
        GetLightMapCoordinates(Interpolants, LightmapUV0, LightmapUV1, LightmapDataIndex);
        LightmapVTPageTableResult = LightmapGetVTSampleInfo(LightmapUV0, LightmapDataIndex, SvPosition.xy);
    }
#endif

#if LIGHTMAP_VT_ENABLED
    // This must occur after CalcMaterialParameters(), which is required to initialize the VT feedback mechanism
    // Lightmap request is always the first VT sample in the shader
    StoreVirtualTextureFeedback(MaterialParameters.VirtualTextureFeedback, 0, LightmapVTPageTableResult.PackedRequest);
#endif

    // Calculate indirect light
    half IndirectIrradiance;
    half3 IndirectColor;
    ComputeIndirect(LightmapVTPageTableResult, Interpolants, DiffuseDir, ShadingModelContext, IndirectIrradiance, IndirectColor);
    Color += IndirectColor;

    // Precomputed shadow map
    half Shadow = GetPrimaryPrecomputedShadowMask(LightmapVTPageTableResult, Interpolants).r;

#if DEFERRED_SHADING_PATH
    float4 OutGBufferD;
    float4 OutGBufferE;
    float4 OutGBufferF;
    float4 OutGBufferVelocity = 0;

    GBuffer.IndirectIrradiance = IndirectIrradiance;
    GBuffer.PrecomputedShadowFactors.r = Shadow;

    // Encode GBuffer data
    EncodeGBuffer(GBuffer, OutGBufferA, OutGBufferB, OutGBufferC, OutGBufferD, OutGBufferE, OutGBufferF, OutGBufferVelocity);
#else

#if !MATERIAL_SHADINGMODEL_UNLIT

    // Sky light.
#if ENABLE_SKY_LIGHT
    half3 SkyDiffuseLighting = GetSkySHDiffuseSimple(MaterialParameters.WorldNormal);
    half3 DiffuseLookup = SkyDiffuseLighting * ResolvedView.SkyLightColor.rgb;
    IndirectIrradiance += Luminance(DiffuseLookup);
#endif
            
    Color *= MaterialAO;
    IndirectIrradiance *= MaterialAO;

    float  ShadowPositionZ = 0;
#if DIRECTIONAL_LIGHT_CSM && !MATERIAL_SHADINGMODEL_SINGLELAYERWATER
    // CSM shadows
    if (UseCSM())
    {
        half ShadowMap = MobileDirectionalLightCSM(MaterialParameters.ScreenPosition.xy, MaterialParameters.ScreenPosition.w, ShadowPositionZ);
    #if ALLOW_STATIC_LIGHTING
        Shadow = min(ShadowMap, Shadow);
    #else
        Shadow = ShadowMap;
    #endif
    }
#endif /* DIRECTIONAL_LIGHT_CSM */

    // Distance field shadows
#if APPLY_DISTANCE_FIELD
    if (ShadowPositionZ == 0)
    {
        Shadow = Texture2DSample(MobileBasePass.ScreenSpaceShadowMaskTexture, MobileBasePass.ScreenSpaceShadowMaskSampler, SvPositionToBufferUV(SvPosition)).x;
    }
#endif

    half NoL = max(0, dot(MaterialParameters.WorldNormal, MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz));
    half3 H = normalize(MaterialParameters.CameraVector + MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz);
    half NoH = max(0, dot(MaterialParameters.WorldNormal, H));

    // Directional light + IBL
#if FULLY_ROUGH
    Color += (Shadow * NoL) * MobileDirectionalLight.DirectionalLightColor.rgb * ShadingModelContext.DiffuseColor;
#else
    FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, NoL, MaterialParameters.CameraVector, H, NoH);
    // MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z stores the specular scale of the directional light
    Color += (Shadow) * MobileDirectionalLight.DirectionalLightColor.rgb * (Lighting.Diffuse + Lighting.Specular * MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z);

    // Hair shading
#if    !(MATERIAL_SINGLE_SHADINGMODEL && MATERIAL_SHADINGMODEL_HAIR)
    (......)
#endif
#endif /* FULLY_ROUGH */

    // Local light sources, up to 4
#if MAX_DYNAMIC_POINT_LIGHTS > 0 && !MATERIAL_SHADINGMODEL_SINGLELAYERWATER

        if(NumDynamicPointLights > 0)
        {

            #if SUPPORT_SPOTLIGHTS_SHADOW
                FPCFSamplerSettings Settings;
                Settings.ShadowDepthTexture = DynamicSpotLightShadowTexture;
                Settings.ShadowDepthTextureSampler = DynamicSpotLightShadowSampler;
                Settings.ShadowBufferSize = DynamicSpotLightShadowBufferSize;
                Settings.bSubsurface = false;
                Settings.bTreatMaxDepthUnshadowed = false;
                Settings.DensityMulConstant = 0;
                Settings.ProjectionDepthBiasParameters = 0;
            #endif

            AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
        
            if (MAX_DYNAMIC_POINT_LIGHTS > 1 && NumDynamicPointLights > 1)
            {
                AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);

                if (MAX_DYNAMIC_POINT_LIGHTS > 2 && NumDynamicPointLights > 2)
                {
                    AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);

                    if (MAX_DYNAMIC_POINT_LIGHTS > 3 && NumDynamicPointLights > 3)
                    {
                        AccumulateLightingOfDynamicPointLight(MaterialParameters, ...);
                    }
                }
            }
        }

#endif

    // Sky light
#if ENABLE_SKY_LIGHT
    #if MATERIAL_TWOSIDED && LQ_TEXTURE_LIGHTMAP
    if (NoL == 0)
    {
    #endif

    #if MATERIAL_SHADINGMODEL_SINGLELAYERWATER
        ShadingModelContext.WaterDiffuseIndirectLuminance += SkyDiffuseLighting;
    #endif
        Color += SkyDiffuseLighting * half3(ResolvedView.SkyLightColor.rgb) * ShadingModelContext.DiffuseColor * MaterialAO;
    #if MATERIAL_TWOSIDED && LQ_TEXTURE_LIGHTMAP
    }
    #endif
#endif

#endif /* !MATERIAL_SHADINGMODEL_UNLIT */

#if MATERIAL_SHADINGMODEL_SINGLELAYERWATER
    (......)
#endif // MATERIAL_SHADINGMODEL_SINGLELAYERWATER

#endif// DEFERRED_SHADING_PATH

    // Handles vertex fog
    half4 VertexFog = half4(0, 0, 0, 1);
#if USE_VERTEX_FOG
#if PACK_INTERPOLANTS
    VertexFog = PackedInterpolants[0];
#else
    VertexFog = BasePassInterpolants.VertexFog;
#endif
#endif
    
    // Emissive.
    half3 Emissive = GetMaterialEmissive(PixelMaterialInputs);
#if MATERIAL_SHADINGMODEL_THIN_TRANSLUCENT
    Emissive *= TopMaterialCoverage;
#endif
    Color += Emissive;

#if !MATERIAL_SHADINGMODEL_UNLIT && MOBILE_EMULATION
    Color = lerp(Color, ShadingModelContext.DiffuseColor, ResolvedView.UnlitViewmodeMask);
#endif

    // Combine fog color to output color
    #if MATERIALBLENDING_ALPHACOMPOSITE || MATERIAL_SHADINGMODEL_SINGLELAYERWATER
        OutColor = half4(Color * VertexFog.a + VertexFog.rgb * ShadingModelContext.Opacity, ShadingModelContext.Opacity);
    #elif MATERIALBLENDING_ALPHAHOLDOUT
        // not implemented for holdout
        OutColor = half4(Color * VertexFog.a + VertexFog.rgb * ShadingModelContext.Opacity, ShadingModelContext.Opacity);
    #elif MATERIALBLENDING_TRANSLUCENT
        OutColor = half4(Color * VertexFog.a + VertexFog.rgb, ShadingModelContext.Opacity);
    #elif MATERIALBLENDING_ADDITIVE
        OutColor = half4(Color * (VertexFog.a * ShadingModelContext.Opacity.x), 0.0f);
    #elif MATERIALBLENDING_MODULATE
        half3 FoggedColor = lerp(half3(1, 1, 1), Color, VertexFog.aaa * VertexFog.aaa);
        OutColor = half4(FoggedColor, ShadingModelContext.Opacity);
    #else
        OutColor.rgb = Color * VertexFog.a + VertexFog.rgb;

        #if !MATERIAL_USE_ALPHA_TO_COVERAGE
            // Scene color alpha is not used yet so we set it to 1
            OutColor.a = 1.0;

            #if OUTPUT_MOBILE_HDR 
                // Store depth in FP16 alpha. This depth value can be fetched during translucency or sampled in post-processing
                OutColor.a = SvPosition.z;
            #endif
        #else
            half MaterialOpacityMask = GetMaterialMaskInputRaw(PixelMaterialInputs);
            OutColor.a = GetMaterialMask(PixelMaterialInputs) / max(abs(ddx(MaterialOpacityMask)) + abs(ddy(MaterialOpacityMask)), 0.0001f) + 0.5f;
        #endif
    #endif

    #if !MATERIALBLENDING_MODULATE && USE_PREEXPOSURE
        OutColor.rgb *= ResolvedView.PreExposure;
    #endif

    #if MATERIAL_IS_SKY
        OutColor.rgb = min(OutColor.rgb, Max10BitsFloat.xxx * 0.5f);
    #endif

#if USE_SCENE_DEPTH_AUX
    OutSceneDepthAux = SvPosition.z;
#endif

    // Process the alpha of the color
#if USE_EDITOR_COMPOSITING && (MOBILE_EMULATION)
    // Editor primitive depth testing
    OutColor.a = 1.0;
    #if MATERIALBLENDING_MASKED
        // some material might have an opacity value
        OutColor.a = GetMaterialMaskInputRaw(PixelMaterialInputs);
    #endif
    clip(OutColor.a - GetMaterialOpacityMaskClipValue());
#else
    #if OUTPUT_GAMMA_SPACE
        OutColor.rgb = sqrt(OutColor.rgb);
    #endif
#endif

#if NUM_VIRTUALTEXTURE_SAMPLES || LIGHTMAP_VT_ENABLED
    FinalizeVirtualTextureFeedback(
        MaterialParameters.VirtualTextureFeedback,
        MaterialParameters.SvPosition,
        ShadingModelContext.Opacity,
        View.FrameNumber,
        View.VTFeedbackBuffer
    );
#endif
}

The mobile BasePass PS is complex and involves many steps: unpacking the interpolant data; fetching and computing material attributes; filling and adjusting the GBuffer data; computing lighting for the forward-rendering branch (directional light and local lights); computing distance-field, CSM and other shadows; computing sky light; handling static lighting, indirect lighting and IBL; computing fog; and handling special shading models such as single-layer water, hair, and thin translucency.

Since the smallest positive value that can be represented by a standard 16-bit floating-point number (FP16) is \(\cfrac{1.0}{2^{24}} = 5.96 \cdot 10^{-8}\), and the subsequent lighting calculation involves the fourth power of roughness (\(\cfrac{1.0}{\text{Roughness}^4}\)), the roughness must be clamped to \(0.015625\) (\(0.015625^4 = 5.96 \cdot 10^{-8}\)) to prevent a division error.

GBuffer.Roughness = max(0.015625, GetMaterialRoughness(PixelMaterialInputs));

This is also a reminder to pay close attention to numeric precision when developing mobile rendering features; otherwise, low-end devices will often show all kinds of strange image artifacts caused by insufficient precision.
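The numbers behind this clamp can be verified directly. Below is a minimal numeric sketch in plain Python (the constant names are local, not engine code):

```python
FP16_MIN_POSITIVE = 2.0 ** -24   # 5.96e-8, smallest positive FP16 value
ROUGHNESS_FLOOR = 0.015625       # 2^-6, the clamp applied by the shader

def clamp_roughness(material_roughness: float) -> float:
    # Mirrors: GBuffer.Roughness = max(0.015625, GetMaterialRoughness(...))
    return max(ROUGHNESS_FLOOR, material_roughness)

# At the floor, Roughness^4 equals exactly the smallest positive FP16 value,
# so it cannot flush to zero and 1.0 / Roughness^4 never divides by zero.
assert clamp_roughness(0.0) ** 4 == FP16_MIN_POSITIVE
assert clamp_roughness(0.5) == 0.5   # values above the floor pass through
```

Note that 0.015625 is exactly \(2^{-6}\), so the fourth power lands precisely on \(2^{-24}\); the clamp is not an arbitrary magic number.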

Although the code above is long, it is gated by many macros, and the code actually executed to render a single material may be only a small subset of it. For example, four local lights are supported by default, but if the project settings (below) reduce this to 2 or fewer, far fewer lighting instructions are actually executed.
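The distance falloff those local lights apply (the inverse-squared branch seen earlier in AccumulateLightingOfDynamicPointLight) can be sketched numerically. This is plain Python with a hypothetical helper name, not engine code; the RadialAttenuation and SpotAttenuation paths are omitted:

```python
def local_light_attenuation(dist_sqr: float, inv_radius: float) -> float:
    """Sketch of the inverse-squared falloff with radius mask (not engine code)."""
    # 1/d^2 falloff; the +1 avoids infinity at d = 0
    atten = 1.0 / (dist_sqr + 1.0)
    # Radius mask: Square(saturate(1 - Square(DistanceSqr * Square(InvRadius)))),
    # which forces the contribution to reach exactly zero at the light radius
    ratio = dist_sqr * inv_radius * inv_radius          # (d / R)^2
    mask = max(0.0, min(1.0, 1.0 - ratio * ratio))      # saturate(1 - (d/R)^4)
    return atten * mask * mask

# A light with radius 10: full strength at the center, exactly zero at the edge
print(local_light_attenuation(0.0, 0.1))     # 1.0
print(local_light_attenuation(100.0, 0.1))   # 0.0
```

The mask is what lets the renderer cull the light by its bounding sphere without a visible cutoff seam, since the plain \(1/d^2\) term alone never reaches zero.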

In the forward-rendering branch, much of the GBuffer processing is skipped; in the deferred-rendering branch, the directional-light and local-light calculations are skipped here and performed instead by the deferred lighting pass shader.

The following is an analysis of the important interface EncodeGBuffer:

void EncodeGBuffer(
    FGBufferData GBuffer,
    out float4 OutGBufferA,
    out float4 OutGBufferB,
    out float4 OutGBufferC,
    out float4 OutGBufferD,
    out float4 OutGBufferE,
    out float4 OutGBufferVelocity,
    float QuantizationBias = 0        // -0.5 to 0.5 random float. Used to bias quantization.
    )
{
    if (GBuffer.ShadingModelID == SHADINGMODELID_UNLIT)
    {
        OutGBufferA = 0;
        SetGBufferForUnlit(OutGBufferB);
        OutGBufferC = 0;
        OutGBufferD = 0;
        OutGBufferE = 0;
    }
    else
    {
        // GBufferA: octahedron-encoded normal, precomputed shadow factor, per-object data
#if MOBILE_DEFERRED_SHADING
        OutGBufferA.rg = UnitVectorToOctahedron( normalize(GBuffer.WorldNormal) ) * 0.5f + 0.5f;
        OutGBufferA.b = GBuffer.PrecomputedShadowFactors.x;
        OutGBufferA.a = GBuffer.PerObjectGBufferData;
#else
        (......)
#endif

        // GBufferB: metallic, specular, roughness, shading model ID and selective output mask
        OutGBufferB.r = GBuffer.Metallic;
        OutGBufferB.g = GBuffer.Specular;
        OutGBufferB.b = GBuffer.Roughness;
        OutGBufferB.a = EncodeShadingModelIdAndSelectiveOutputMask(GBuffer.ShadingModelID, GBuffer.SelectiveOutputMask);

        // GBufferC: base color, and AO or indirect irradiance
        OutGBufferC.rgb = EncodeBaseColor( GBuffer.BaseColor );

#if ALLOW_STATIC_LIGHTING
        // No space for AO. Multiply IndirectIrradiance by AO instead of storing.
        OutGBufferC.a = EncodeIndirectIrradiance(GBuffer.IndirectIrradiance * GBuffer.GBufferAO) + QuantizationBias * (1.0 / 255.0);
#else
        OutGBufferC.a = GBuffer.GBufferAO;
#endif

        OutGBufferD = GBuffer.CustomData;
        OutGBufferE = GBuffer.PrecomputedShadowFactors;
    }

#if WRITES_VELOCITY_TO_GBUFFER
    OutGBufferVelocity = GBuffer.Velocity;
#else
    OutGBufferVelocity = 0;
#endif
}
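The octahedron encoding used for the normal in GBufferA (UnitVectorToOctahedron / OctahedronToUnitVector) can be sketched outside HLSL as follows. This is plain Python with local helper names; the real shader additionally remaps the encoded result from [-1, 1] to [0, 1] with * 0.5 + 0.5 before storage:

```python
import math

def sign_not_zero(v: float) -> float:
    # HLSL sign() returns 0 at 0, which would break the fold; treat 0 as +1
    return 1.0 if v >= 0.0 else -1.0

def unit_to_octahedron(n):
    """Project a unit vector onto the octahedron, then onto the z=0 plane."""
    x, y, z = n
    s = abs(x) + abs(y) + abs(z)
    px, py = x / s, y / s
    if z < 0.0:  # fold the lower hemisphere over the diagonals
        px, py = ((1.0 - abs(py)) * sign_not_zero(px),
                  (1.0 - abs(px)) * sign_not_zero(py))
    return px, py

def octahedron_to_unit(p):
    """Invert the mapping and renormalize."""
    px, py = p
    z = 1.0 - abs(px) - abs(py)
    if z < 0.0:  # undo the fold for lower-hemisphere normals
        px, py = ((1.0 - abs(py)) * sign_not_zero(px),
                  (1.0 - abs(px)) * sign_not_zero(py))
    length = math.sqrt(px * px + py * py + z * z)
    return px / length, py / length, z / length
```

Round-tripping a normal such as (0, 0.6, -0.8) through both functions returns the original vector (up to float precision), which is why two GBuffer channels are enough to store a full-sphere normal.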

Under the default lighting model (DefaultLit), BasePass outputs the following textures:

12.3.3.3 MobileDeferredShading

The deferred lighting VS on mobile is the same as on PC: both use DeferredLightVertexShaders.usf. The PS differs, however: mobile uses MobileDeferredShading.usf. Since the VS matches the PC version and does nothing special, it is skipped here; if you are interested, see section 5.5.3.1 DeferredLightVertexShader in Chapter 5.

The following is a direct analysis of PS Code:

// Engine\Shaders\Private\MobileDeferredShading.usf

(......)

// Mobile light data structure
struct FMobileLightData
{
    float3 Position;
    float  InvRadius;
    float3 Color;
    float  FalloffExponent;
    float3 Direction;
    float2 SpotAngles;
    float SourceRadius;
    float SpecularScale;
    bool bInverseSquared;
    bool bSpotLight;
};

// Get GBuffer data
void FetchGBuffer(in float2 UV, out float4 GBufferA, out float4 GBufferB, out float4 GBufferC, out float4 GBufferD, out float SceneDepth)
{
    // Fetch GBuffer data via a Vulkan subpass
#if VULKAN_PROFILE
    GBufferA = VulkanSubpassFetch1(); 
    GBufferB = VulkanSubpassFetch2(); 
    GBufferC = VulkanSubpassFetch3(); 
    GBufferD = 0;
    SceneDepth = ConvertFromDeviceZ(VulkanSubpassDepthFetch());
    // Fetch GBuffer data via a Metal subpass
#elif METAL_PROFILE
    GBufferA = SubpassFetchRGBA_1(); 
    GBufferB = SubpassFetchRGBA_2(); 
    GBufferC = SubpassFetchRGBA_3(); 
    GBufferD = 0; 
    SceneDepth = ConvertFromDeviceZ(SubpassFetchR_4());
    // Other platforms (D3D, OpenGL) sample the GBuffer textures directly
#else
    GBufferA = Texture2DSampleLevel(MobileSceneTextures.GBufferATexture, MobileSceneTextures.GBufferATextureSampler, UV, 0); 
    GBufferB = Texture2DSampleLevel(MobileSceneTextures.GBufferBTexture, MobileSceneTextures.GBufferBTextureSampler, UV, 0);
    GBufferC = Texture2DSampleLevel(MobileSceneTextures.GBufferCTexture, MobileSceneTextures.GBufferCTextureSampler, UV, 0);
    GBufferD = 0;
    SceneDepth = ConvertFromDeviceZ(Texture2DSampleLevel(MobileSceneTextures.SceneDepthTexture, MobileSceneTextures.SceneDepthTextureSampler, UV, 0).r);
#endif
}

// Decode GBuffer data
FGBufferData DecodeGBufferMobile(
    float4 InGBufferA,
    float4 InGBufferB,
    float4 InGBufferC,
    float4 InGBufferD)
{
    FGBufferData GBuffer;
    GBuffer.WorldNormal = OctahedronToUnitVector( InGBufferA.xy * 2.0f - 1.0f );
    GBuffer.PrecomputedShadowFactors = InGBufferA.z;
    GBuffer.PerObjectGBufferData = InGBufferA.a;  
    GBuffer.Metallic    = InGBufferB.r;
    GBuffer.Specular    = InGBufferB.g;
    GBuffer.Roughness    = max(0.015625, InGBufferB.b);
    // Note: must match GetShadingModelId standalone function logic
    // Also Note: SimpleElementPixelShader directly sets SV_Target2 ( GBufferB ) to indicate unlit.
    // An update there will be required if this layout changes.
    GBuffer.ShadingModelID = DecodeShadingModelId(InGBufferB.a);
    GBuffer.SelectiveOutputMask = DecodeSelectiveOutputMask(InGBufferB.a);
    GBuffer.BaseColor = DecodeBaseColor(InGBufferC.rgb);
#if ALLOW_STATIC_LIGHTING
    GBuffer.GBufferAO = 1;
    GBuffer.IndirectIrradiance = DecodeIndirectIrradiance(InGBufferC.a);
#else
    GBuffer.GBufferAO = InGBufferC.a;
    GBuffer.IndirectIrradiance = 1;
#endif
    GBuffer.CustomData = HasCustomGBufferData(GBuffer.ShadingModelID) ? InGBufferD : 0;
    return GBuffer;
}

// Direct illumination
half3 GetDirectLighting(
    FMobileLightData LightData, 
    FMobileShadingModelContext ShadingModelContext, 
    FGBufferData GBuffer, 
    float3 WorldPosition, 
    half3 CameraVector)
{
    half3 DirectLighting = 0;
    
    float3 ToLight = LightData.Position - WorldPosition;
    float DistanceSqr = dot(ToLight, ToLight);
    float3 L = ToLight * rsqrt(DistanceSqr);
    
    // Light attenuation
    float Attenuation = 0.0;
    if (LightData.bInverseSquared)
    {
        // Sphere falloff (technically just 1/d2 but this avoids inf)
        Attenuation = 1.0f / (DistanceSqr + 1.0f);
        Attenuation *= Square(saturate(1 - Square(DistanceSqr * Square(LightData.InvRadius))));
    }
    else
    {
        Attenuation = RadialAttenuation(ToLight * LightData.InvRadius, LightData.FalloffExponent);
    }

    // Spotlight attenuation
    if (LightData.bSpotLight)
    {
        Attenuation *= SpotAttenuation(L, -LightData.Direction, LightData.SpotAngles);
    }
    
    // If the attenuation is not 0, direct illumination is calculated
    if (Attenuation > 0.0)
    {
        half3 H = normalize(CameraVector + L);
        half NoL = max(0.0, dot(GBuffer.WorldNormal, L));
        half NoH = max(0.0, dot(GBuffer.WorldNormal, H));
        FMobileDirectLighting Lighting = MobileIntegrateBxDF(ShadingModelContext, GBuffer, NoL, CameraVector, H, NoH);
        DirectLighting = (Lighting.Diffuse + Lighting.Specular * LightData.SpecularScale) * (LightData.Color * (1.0 / PI) * Attenuation);
    }
    return DirectLighting;
}

// Light function
half ComputeLightFunctionMultiplier(float3 WorldPosition);
// Accumulate local lighting using the light grid; dynamic shadows are not supported because they would require a per-light shadow map
half3 GetLightGridLocalLighting(const FCulledLightsGridData InLightGridData, ...);

// PS entry point for the directional light
void MobileDirectLightPS(
    noperspective float4 UVAndScreenPos : TEXCOORD0, 
    float4 SvPosition : SV_POSITION, 
    out half4 OutColor : SV_Target0)
{
    // Recover (read) GBuffer data
    FGBufferData GBuffer = (FGBufferData)0;
    float SceneDepth = 0; 
    {
        float4 GBufferA = 0; 
        float4 GBufferB = 0; 
        float4 GBufferC = 0; 
        float4 GBufferD = 0;
        FetchGBuffer(UVAndScreenPos.xy, GBufferA, GBufferB, GBufferC, GBufferD, SceneDepth);
        GBuffer = DecodeGBufferMobile(GBufferA, GBufferB, GBufferC, GBufferD);
    }
    
    // Reconstruct the world position and view-related vectors
    float2 ScreenPos = UVAndScreenPos.zw;
    float3 WorldPosition = mul(float4(ScreenPos * SceneDepth, SceneDepth, 1), View.ScreenToWorld).xyz;
    half3 CameraVector = normalize(View.WorldCameraOrigin - WorldPosition);
    half NoV = max(0, dot(GBuffer.WorldNormal, CameraVector));
    half3 ReflectionVector = GBuffer.WorldNormal * (NoV * 2.0) - CameraVector;
    
    half3 Color = 0;
    // Check movable light param to determine if we should be using precomputed shadows
    half Shadow = LightFunctionParameters2.z > 0.0f ? 1.0f : GBuffer.PrecomputedShadowFactors.r;

    // CSM shadows
#if APPLY_CSM
    float  ShadowPositionZ = 0;
    float4 ScreenPosition = SvPositionToScreenPosition(float4(SvPosition.xyz,SceneDepth));
    float ShadowMap = MobileDirectionalLightCSM(ScreenPosition.xy, SceneDepth, ShadowPositionZ);
    Shadow = min(ShadowMap, Shadow);
#endif

    // Shading model context
    FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0;
    {
        half DielectricSpecular = 0.08 * GBuffer.Specular;
        ShadingModelContext.DiffuseColor = GBuffer.BaseColor - GBuffer.BaseColor * GBuffer.Metallic;    // 1 mad
        ShadingModelContext.SpecularColor = (DielectricSpecular - DielectricSpecular * GBuffer.Metallic) + GBuffer.BaseColor * GBuffer.Metallic;    // 2 mad
        // Compute the environment BRDF
        ShadingModelContext.SpecularColor = GetEnvBRDF(ShadingModelContext.SpecularColor, GBuffer.Roughness, NoV);
    }
    
    // Local light source
    float2 LocalPosition = SvPosition.xy - View.ViewRectMin.xy;
    uint GridIndex = ComputeLightGridCellIndex(uint2(LocalPosition.x, LocalPosition.y), SceneDepth);
    // Clustered local lights
#if USE_CLUSTERED
    {
        const uint EyeIndex = 0;
        const FCulledLightsGridData CulledLightGridData = GetCulledLightsGrid(GridIndex, EyeIndex);
        Color += GetLightGridLocalLighting(CulledLightGridData, ShadingModelContext, GBuffer, WorldPosition, CameraVector, EyeIndex, 0);
    }
#endif
            
    // Calculate the directional light
    half NoL = max(0, dot(GBuffer.WorldNormal, MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz));
    half3 H = normalize(CameraVector + MobileDirectionalLight.DirectionalLightDirectionAndShadowTransition.xyz);
    half NoH = max(0, dot(GBuffer.WorldNormal, H));
    FMobileDirectLighting Lighting;
    Lighting.Specular = ShadingModelContext.SpecularColor * CalcSpecular(GBuffer.Roughness, NoH);
    Lighting.Diffuse = ShadingModelContext.DiffuseColor;
    Color += (Shadow * NoL) * MobileDirectionalLight.DirectionalLightColor.rgb * (Lighting.Diffuse + Lighting.Specular * MobileDirectionalLight.DirectionalLightDistanceFadeMADAndSpecularScale.z);

    // Process reflections (IBL, reflection captures)
#if APPLY_REFLECTION
    uint NumCulledEntryIndex = (ForwardLightData.NumGridCells + GridIndex) * NUM_CULLED_LIGHTS_GRID_STRIDE;
    uint NumLocalReflectionCaptures = min(ForwardLightData.NumCulledLightsGrid[NumCulledEntryIndex + 0], ForwardLightData.NumReflectionCaptures);
    uint DataStartIndex = ForwardLightData.NumCulledLightsGrid[NumCulledEntryIndex + 1];

    float3 SpecularIBL = CompositeReflectionCapturesAndSkylight(
        1.0f,
        WorldPosition,
        ReflectionVector,//RayDirection,
        GBuffer.Roughness,
        GBuffer.IndirectIrradiance,
        1.0f,
        0.0f,
        NumLocalReflectionCaptures,
        DataStartIndex,
        0,
        true);
        
    Color += SpecularIBL * ShadingModelContext.SpecularColor;
#elif APPLY_SKY_REFLECTION
    float SkyAverageBrightness = 1.0f;
    float3 SpecularIBL = GetSkyLightReflection(ReflectionVector, GBuffer.Roughness, SkyAverageBrightness);
    SpecularIBL *= ComputeMixingWeight(GBuffer.IndirectIrradiance, SkyAverageBrightness, GBuffer.Roughness);
    Color += SpecularIBL * ShadingModelContext.SpecularColor;
#endif
    // Diffuse reflection of sky light
    half3 SkyDiffuseLighting = GetSkySHDiffuseSimple(GBuffer.WorldNormal);
    Color+= SkyDiffuseLighting * half3(View.SkyLightColor.rgb) * ShadingModelContext.DiffuseColor * GBuffer.GBufferAO;
    half LightAttenuation = ComputeLightFunctionMultiplier(WorldPosition);

#if USE_PREEXPOSURE
    // MobileHDR applies PreExposure in tonemapper
    LightAttenuation *= View.PreExposure;    
#endif
                    
    OutColor.rgb = Color.rgb * LightAttenuation;
    OutColor.a = 1;
}
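The diffuse/specular split in the shading-model context above (the "1 mad" / "2 mad" lines) is the standard metallic workflow: metals take their F0 from BaseColor and have no diffuse term, while dielectrics get F0 = 0.08 * Specular (0.04 at the default Specular of 0.5). A small Python sketch of just that math, for illustration (the helper name is hypothetical):

```python
def mobile_shading_colors(base_color, metallic, specular):
    """Mirror of the FMobileShadingModelContext setup in the HLSL above."""
    dielectric_specular = 0.08 * specular  # UE convention: Specular=0.5 -> F0=0.04
    # DiffuseColor = BaseColor - BaseColor * Metallic    (1 mad per channel)
    diffuse = tuple(c - c * metallic for c in base_color)
    # SpecularColor = (DielectricSpecular - DielectricSpecular * Metallic)
    #               + BaseColor * Metallic               (2 mad per channel)
    spec = tuple((dielectric_specular - dielectric_specular * metallic) + c * metallic
                 for c in base_color)
    return diffuse, spec
```

With Metallic = 0 the specular color collapses to a uniform 0.04 and the diffuse color is BaseColor; with Metallic = 1 the diffuse color is black and BaseColor becomes the specular (F0) color.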

// Pixel shader entry point for local (point/spot) lights
void MobileRadialLightPS(
    float4 InScreenPosition : TEXCOORD0,
    float4 SVPos            : SV_POSITION,
    out half4 OutColor        : SV_Target0
)
{
    FGBufferData GBuffer = (FGBufferData)0;
    float SceneDepth = 0; 
    {
        float2 ScreenUV = InScreenPosition.xy / InScreenPosition.w * View.ScreenPositionScaleBias.xy + View.ScreenPositionScaleBias.wz;
        float4 GBufferA = 0;  
        float4 GBufferB = 0; 
        float4 GBufferC = 0; 
        float4 GBufferD = 0;
        FetchGBuffer(ScreenUV, GBufferA, GBufferB, GBufferC, GBufferD, SceneDepth);
        GBuffer = DecodeGBufferMobile(GBufferA, GBufferB, GBufferC, GBufferD);
    }
    
    // With a perspective projection, the clip space position is NDC * Clip.w
    // With an orthographic projection, clip space is the same as NDC
    float2 ClipPosition = InScreenPosition.xy / InScreenPosition.w * (View.ViewToClip[3][3] < 1.0f ? SceneDepth : 1.0f);
    float3 WorldPosition = mul(float4(ClipPosition, SceneDepth, 1), View.ScreenToWorld).xyz;
    half3 CameraVector = normalize(View.WorldCameraOrigin - WorldPosition);
    half NoV = max(0, dot(GBuffer.WorldNormal, CameraVector));
    
    // Assemble the light source data structure
    FMobileLightData LightData = (FMobileLightData)0;
    {
        LightData.Position = DeferredLightUniforms.Position;
        LightData.InvRadius = DeferredLightUniforms.InvRadius;
        LightData.Color = DeferredLightUniforms.Color;
        LightData.FalloffExponent = DeferredLightUniforms.FalloffExponent;
        LightData.Direction = DeferredLightUniforms.Direction;
        LightData.SpotAngles = DeferredLightUniforms.SpotAngles;
        LightData.SpecularScale = 1.0;
        LightData.bInverseSquared = INVERSE_SQUARED_FALLOFF; 
        LightData.bSpotLight = IS_SPOT_LIGHT; 
    }

    FMobileShadingModelContext ShadingModelContext = (FMobileShadingModelContext)0;
    {
        half DielectricSpecular = 0.08 * GBuffer.Specular;
        ShadingModelContext.DiffuseColor = GBuffer.BaseColor - GBuffer.BaseColor * GBuffer.Metallic;    // 1 mad
        ShadingModelContext.SpecularColor = (DielectricSpecular - DielectricSpecular * GBuffer.Metallic) + GBuffer.BaseColor * GBuffer.Metallic;    // 2 mad
        // Compute the environment BRDF
        ShadingModelContext.SpecularColor = GetEnvBRDF(ShadingModelContext.SpecularColor, GBuffer.Roughness, NoV);
    }
    
    // Calculate direct light
    half3 Color = GetDirectLighting(LightData, ShadingModelContext, GBuffer, WorldPosition, CameraVector);
    
    // IES profile and light function
    half LightAttenuation = ComputeLightProfileMultiplier(WorldPosition, DeferredLightUniforms.Position, -DeferredLightUniforms.Direction, DeferredLightUniforms.Tangent);
    LightAttenuation*= ComputeLightFunctionMultiplier(WorldPosition);

#if USE_PREEXPOSURE
    // MobileHDR applies PreExposure in tonemapper
    LightAttenuation*= View.PreExposure;    
#endif

    OutColor.rgb = Color * LightAttenuation;
    OutColor.a = 1;
}
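The world-position reconstruction in both shaders hinges on the projection type: with a perspective projection, clip-space xy equals NDC times clip.w, and clip.w equals the view-space depth, so the NDC coordinates must be scaled by SceneDepth before the ScreenToWorld transform; with an orthographic projection (where ViewToClip[3][3] is 1) clip space equals NDC and no scale is needed. A minimal Python sketch of that branch (hypothetical helper name):

```python
def clip_space_xy(ndc_xy, scene_depth, view_to_clip_33):
    # Matches: NDC * (View.ViewToClip[3][3] < 1.0f ? SceneDepth : 1.0f)
    # Perspective projections have 0 in ViewToClip[3][3]; orthographic have 1.
    scale = scene_depth if view_to_clip_33 < 1.0 else 1.0
    return (ndc_xy[0] * scale, ndc_xy[1] * scale)
```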

As can be seen above, directional lights and local lights use separate pixel shader entry points because their workloads differ considerably. The directional-light shader computes its lighting directly in the main entry and additionally handles reflections (IBL, reflection captures) and sky-light diffuse; the local-light shader assembles an FMobileLightData structure, passes it to the shared direct-lighting function, and finally applies the IES profile and light function that only local lights use.

In addition, the GBuffer is read through a subpass-fetch (framebuffer fetch) mechanism whose implementation differs across shading platforms:

// Vulkan
[[vk::input_attachment_index(1)]]
SubpassInput<float4> GENERATED_SubpassFetchAttachment0;
#define VulkanSubpassFetch0() GENERATED_SubpassFetchAttachment0.SubpassLoad()

// Metal
Texture2D<float4> gl_LastFragDataRGBA_1;
#define SubpassFetchRGBA_1() gl_LastFragDataRGBA_1.Load(uint3(0, 0, 0), 0)

// D3D / OpenGL fallback: a plain texture sample
Texture2DSampleLevel(GBufferATexture, GBufferATextureSampler, UV, 0);

Team recruitment

The blogger's team is developing a new immersive-experience product with UE4 and is urgently hiring for the following positions:

  • UE logic development.
  • UE engine program.
  • UE graphics rendering.
  • TA (Technical Artist).

Requirements: enthusiasm for technology, a solid technical foundation, and good communication and collaboration skills; UE or mobile development experience is preferred.

If you are interested or want to know more, please add the blogger's WeChat: 81079389 (mention this job post), or send your resume to the blogger's email: 81079389#qq.com (replace # with @).

Looking forward to meeting you.


Special note

  • Part 1 ends here; Part 2 will cover:
    • Mobile rendering technology
    • Mobile-side optimization techniques
  • Thanks to the authors of all references; some pictures come from the references or the Internet and will be removed upon request in case of infringement.
  • This series is original work by the author and is published only on cnblogs. You are welcome to share a link to this article, but reprinting without consent is not allowed!
  • The series is ongoing; see the complete directory in the Content outline.
