CefSharp IRequestHandler: request resource interception and custom processing

Posted by feckless on Mon, 07 Mar 2022 20:19:33 +0100

 

Preface

 

With CefSharp we not only get the Chromium browser engine, we can also implement our own resource request processing through the various handlers that CEF exposes.

 

What is a resource request? In short, it is a request for one of the text resources (JS, CSS and HTML) that a front-end page needs while it loads. In any Chromium-based browser, the built-in developer tools show the requests issued each time a page is loaded.

 

Preparation

 

Since this article focuses on resource interception in CefSharp, we will not discuss front-end development or the details of embedding the CefSharp control in a client. We start from a basic WinForms program that embeds CefSharp. Its interface has an address input field and a Panel that displays web pages:

 

 
   
 

We also write a very simple page that requests one JS resource and one CSS resource:

 
 
   
    demo:
       - index.html
       - test1.js
       - test1.css
 
 

The code of these files is very simple:

 
 
   
    test1.css:

    body
    {
       background-color: aqua
    }

    test1.js:

    function myFunc() {
       return 'test1 js file';
    }

    index.html:

    <!DOCTYPE html>
    <html lang="en" xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta charset="utf-8" />
        <title>Home</title>
        <!-- The following references the js and css resources -->
        <script type="text/javascript" src="test1.js"></script>
        <link type="text/css" rel="stylesheet" href="test1.css"/>
    </head>
    <body>
    <h1>Resource Intercept Example</h1>
    <h2 id="result"></h2>
    <script>
        // Call myFunc from test1.js
        document.getElementById('result').innerHTML = myFunc();
    </script>
    </body>
    </html>
 
 

The effect is easy to predict. After the page loads, its background color is aqua and the text "test1 js file" is displayed. If we refresh the page with the developer tools open, we can see the corresponding resources being loaded:

 

 
   
 

CefSharp resource interception and custom processing

 

With the preparation done, we get to the main topic: resource interception and custom processing. First we need to agree on what these terms mean. Resource interception means that we can detect the HTML, JS and CSS requests shown in the figure above. In the example below, because this is our own client program, a prompt pops up while the request is being processed. Custom processing means that, after the interception prompt, we can also replace these resources. Here we replace the JS and CSS with other files of our choice, test2.js and test2.css:

 
 
   
    test2.js:

    function myFunc() {
        return 'test2 js file';
    }

    test2.css:

    body
    {
       background-color: beige
    }
 
 

That is, after interception and replacement we expect the page to show "test2 js file" instead of the previous text, and the background color to be beige.

 

IRequestHandler

 

In CefSharp, the core handler for intercepting requests is IRequestHandler. The official source code defines several methods on it. Reading the official summaries, we can focus on the following two (comments abridged):

 
 
   
    /// <summary>
    /// Called before browser navigation.
    /// If the navigation is allowed <see cref="E:CefSharp.IWebBrowser.FrameLoadStart" /> and <see cref="E:CefSharp.IWebBrowser.FrameLoadEnd" />
    /// will be called. If the navigation is canceled <see cref="E:CefSharp.IWebBrowser.LoadError" /> will be called with an ErrorCode
    /// value of <see cref="F:CefSharp.CefErrorCode.Aborted" />.
    /// </summary>
    bool OnBeforeBrowse(
        IWebBrowser chromiumWebBrowser,
        IBrowser browser,
        IFrame frame,
        IRequest request,
        bool userGesture,
        bool isRedirect);

    /// <summary>
    /// Called on the CEF IO thread before a resource request is initiated.
    /// </summary>
    IResourceRequestHandler GetResourceRequestHandler(
        IWebBrowser chromiumWebBrowser,
        IBrowser browser,
        IFrame frame,
        IRequest request,
        bool isNavigation,
        bool isDownload,
        string requestInitiator,
        ref bool disableDefaultHandling);
 
 

We therefore derive from the default implementation class RequestHandler (not to be confused with DefaultRequestHandler) and override only these two methods.

 
 
   
    public class MyRequestHandler : RequestHandler
    {
        protected override bool OnBeforeBrowse(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, bool userGesture,
            bool isRedirect)
        {
            // For now just call the base implementation; set a breakpoint here to debug
            return base.OnBeforeBrowse(chromiumWebBrowser, browser, frame, request, userGesture, isRedirect);
        }

        protected override IResourceRequestHandler GetResourceRequestHandler(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame,
            IRequest request, bool isNavigation, bool isDownload, string requestInitiator, ref bool disableDefaultHandling)
        {
            // For now just call the base implementation; set a breakpoint here to debug
            return base.GetResourceRequestHandler(
                chromiumWebBrowser, browser, frame, request, isNavigation,
                isDownload, requestInitiator, ref disableDefaultHandling);
        }
    }
 
 

Then complete the registration of the Handler:

 
 
   
    this._webBrowser = new ChromiumWebBrowser(string.Empty)
    {
        RequestHandler = new MyRequestHandler()
    };
 
 

Set breakpoints and navigate to index.html from our example. OnBeforeBrowse is called once, while GetResourceRequestHandler is called three times. The request parameter in OnBeforeBrowse is the request for the home page; the three calls to GetResourceRequestHandler are for the home page HTML, test1.js and test1.css.

 

 
   
 

 
   
 

Combining the official comments with the debugging results, we can conclude: to intercept navigation, override OnBeforeBrowse; to intercept resources, implement our own ResourceRequestHandler and return it from GetResourceRequestHandler.

 

IResourceRequestHandler

 

Looking at the definition of IResourceRequestHandler, let's focus on a function definition again:

 
 
   
    /// <summary>
    /// Called on the CEF IO thread before a resource is loaded. To specify a handler for the resource return a <see cref="T:CefSharp.IResourceHandler" /> object
    /// </summary>
    /// <returns>To allow the resource to load using the default network loader return null otherwise return an instance of <see cref="T:CefSharp.IResourceHandler" /> with a valid stream</returns>
    IResourceHandler GetResourceHandler(
        IWebBrowser chromiumWebBrowser,
        IBrowser browser,
        IFrame frame,
        IRequest request);
 
 

The comments show that if the implementation returns null, CEF uses the default network loader to issue the request; alternatively, we can return a custom resource handler that supplies a valid data stream. In other words, for custom processing (as opposed to mere interception, which the two handlers above already allow), we also need to implement IResourceHandler and return an instance from GetResourceHandler; CEF will then use our handler to load the resource. In the implementation below, requests for test1.js and test1.css trigger a prompt and a file dialog in which we choose the replacement file. If no file is chosen, or the request is for any other resource, we fall back to the default handling:

 
 
   
    public class MyResourceRequestHandler : ResourceRequestHandler
    {
        protected override IResourceHandler GetResourceHandler(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request)
        {
            if (request.Url.EndsWith("test1.js") || request.Url.EndsWith("test1.css"))
            {
                // Demo only: showing dialogs here blocks the thread CEF calls us on
                MessageBox.Show($@"Resource interception:{request.Url}");
                // A simple js/css check; real code needs a more robust test
                string type = request.Url.EndsWith(".js") ? "js" : "css";
                string fileName = null;
                using (OpenFileDialog openFileDialog = new OpenFileDialog())
                {
                    openFileDialog.Filter = $@"{type} file|*.{type}"; // filter
                    openFileDialog.Multiselect = false; // exactly one replacement file is needed
                    if (openFileDialog.ShowDialog() == DialogResult.OK)
                    {
                        fileName = openFileDialog.FileName;
                    }
                }
                if (string.IsNullOrWhiteSpace(fileName))
                {
                    // No file selected: fall back to the default handler
                    return base.GetResourceHandler(chromiumWebBrowser, browser, frame, request);
                }
                // Otherwise, serve the selected file instead
                return new MyResourceHandler(fileName);
            }
            return base.GetResourceHandler(chromiumWebBrowser, browser, frame, request);
        }
    }
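
The EndsWith checks above are enough for this demo, but they would miss a URL that carries a query string or fragment (for example test1.js?v=2). As a minimal sketch of a more robust check, the matching could be moved into a small helper. ResourceMatcher and TryGetReplaceableType are hypothetical names introduced here for illustration; they are not part of CefSharp or of the original demo.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of CefSharp): decides whether a request URL
// points at one of the demo resources we want to replace, and of which type.
public static class ResourceMatcher
{
    public static bool TryGetReplaceableType(string url, out string type)
    {
        type = null;
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri))
        {
            return false; // not an absolute URL, leave it to default handling
        }

        // AbsolutePath excludes the query string and the fragment.
        string fileName = Path.GetFileName(uri.AbsolutePath);
        if (fileName.Equals("test1.js", StringComparison.OrdinalIgnoreCase))
        {
            type = "js";
            return true;
        }
        if (fileName.Equals("test1.css", StringComparison.OrdinalIgnoreCase))
        {
            type = "css";
            return true;
        }
        return false;
    }
}
```

Inside GetResourceHandler, the two EndsWith tests could then be replaced by one call such as ResourceMatcher.TryGetReplaceableType(request.Url, out string type).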
 
 

IResourceHandler

 

Following on from the above, we now look at IResourceHandler. CefSharp ships a default implementation of this interface, ResourceHandler, and reading its source code shows how it serves a response. To implement our own interception strategy, however, it is clearer to implement IResourceHandler ourselves. The interface carries the following comments:

 
 
   
    /// <summary>
    /// Class used to implement a custom resource handler. The methods of this class will always be called on the CEF IO thread.
    /// Blocking the CEF IO thread will adversely affect browser performance. We suggest you execute your code in a Task (or similar).
    /// To implement async handling, spawn a new Task (or similar), keep a reference to the callback. When you have a
    /// fully populated stream, execute the callback. Once the callback Executes, GetResponseHeaders will be called where you
    /// can modify the response including headers, or even redirect to a new Url. Set your responseLength and headers
    /// Populate the dataOut stream in ReadResponse. For those looking for a sample implementation or upgrading from
    /// a previous version <see cref="T:CefSharp.ResourceHandler" />. For those upgrading, inherit from ResourceHandler instead of IResourceHandler
    /// add the override keywoard to existing methods e.g. ProcessRequestAsync.
    /// </summary>
    public interface IResourceHandler : IDisposable
    { ... }
 
 

These comments say roughly the following. We implement this interface to handle resources ourselves. Its methods are always invoked on the CEF IO thread, and blocking that thread hurts browser performance, so the official advice is to run the processing code asynchronously in a Task (or similar), keep a reference to the callback, and invoke the appropriate callback method (continue or cancel) when the work completes or fails. Once you have a fully populated stream, execute the callback (this step corresponds to the Open method). After the callback executes, GetResponseHeaders is called; there you can modify the response, including its headers, or even redirect to a new URL, and set responseLength and the headers. Next, fill the dataOut stream in ReadResponse (which is deprecated; Read is its replacement). Finally, CEF reads the resource data from that stream.
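
The "spawn a Task, keep the callback, execute it when the data is ready" advice can be sketched without any CefSharp types. The block below is only an illustration under stated assumptions: IContinueOrCancel is a hypothetical stand-in for the real callback (modelling just a Continue and a Cancel call), AsyncOpenSketch and RecordingCallback are made-up names, and the handleRequest = false / return true combination follows the CEF comment for Open as I understand it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the CefSharp callback; only the two calls used here.
public interface IContinueOrCancel
{
    void Continue();
    void Cancel();
}

// Hypothetical sketch: loads the data off the calling thread, then signals.
public class AsyncOpenSketch
{
    private readonly Func<byte[]> _load;
    public byte[] Data { get; private set; }

    public AsyncOpenSketch(Func<byte[]> load) { _load = load; }

    public bool Open(out bool handleRequest, IContinueOrCancel callback)
    {
        handleRequest = false; // decide later, from the Task below
        Task.Run(() =>
        {
            try
            {
                Data = _load();      // potentially slow work, off the calling thread
                callback.Continue(); // data is ready, let the request proceed
            }
            catch (Exception)
            {
                callback.Cancel();   // loading failed, cancel the request
            }
        });
        return true; // return at once so the calling thread is not blocked
    }
}

// Small test double that records which callback method was invoked.
public class RecordingCallback : IContinueOrCancel
{
    private readonly ManualResetEventSlim _done = new ManualResetEventSlim(false);
    public bool Continued { get; private set; }
    public bool Cancelled { get; private set; }
    public void Continue() { Continued = true; _done.Set(); }
    public void Cancel() { Cancelled = true; _done.Set(); }
    public bool Wait(int milliseconds) => _done.Wait(milliseconds);
}
```

The point of the design is that Open returns immediately, while the slow part (here the load delegate) runs on a thread-pool thread and reports back through the callback.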

 

In fact, there are many ways to implement this Handler. Here we implement the simplest one.

 

Dispose

 

Dispose normally releases resources. Since this is only a demo, we leave it empty for now.

 

Open(ProcessRequest)

 

The official comments indicate that ProcessRequest is deprecated in favor of Open, so our ProcessRequest simply returns true. For the Open method, the comments tell us:

 
  • To handle the request immediately (synchronously), set handleRequest to true and return true.
  • To decide later (asynchronously), set handleRequest to false, return true, and call the callback's Continue or Cancel method to continue or cancel the request.
  • To cancel the request immediately, set handleRequest to true and return false.
 

That is, handleRequest determines whether the request is handled synchronously or asynchronously. In the synchronous case, CEF decides at once whether to continue or cancel based on the return value of Open. In the asynchronous case, CEF waits for the callback to be invoked (typically from a Task that we start ourselves). Here we choose the synchronous approach (the asynchronous one would work just as well) and write the following code:

 
 
   
    public bool Open(IRequest request, out bool handleRequest, ICallback callback)
    {
        handleRequest = true;
        return true;
    }
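
To keep the combinations of handleRequest and the return value straight, they can be written down as a small lookup. This is a hypothetical model, not CefSharp API: OpenOutcome and OpenSemantics are names made up for this sketch. The fourth case (handleRequest false, return false) reflects my reading of the backwards-compatibility note in the CEF comments, under which the legacy ProcessRequest method is called instead.

```csharp
using System;

// Hypothetical model (not CefSharp API) of what each combination tells CEF.
public enum OpenOutcome
{
    HandleImmediately,        // handleRequest = true,  return true
    DecideLaterViaCallback,   // handleRequest = false, return true
    CancelImmediately,        // handleRequest = true,  return false
    FallBackToProcessRequest  // handleRequest = false, return false (assumption, see lead-in)
}

public static class OpenSemantics
{
    public static OpenOutcome Classify(bool handleRequest, bool returnValue)
    {
        if (handleRequest)
        {
            // Synchronous decision: the return value alone says continue or cancel.
            return returnValue ? OpenOutcome.HandleImmediately : OpenOutcome.CancelImmediately;
        }
        // Decision deferred: either wait for the callback or use the legacy path.
        return returnValue ? OpenOutcome.DecideLaterViaCallback : OpenOutcome.FallBackToProcessRequest;
    }
}
```

The Open implementation in this article corresponds to OpenSemantics.Classify(true, true), that is, HandleImmediately.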
 
 

GetResponseHeaders

 

The previous section covered the entry point for resource data (Open). Having told CEF that we are ready to handle the request, the next step is to actually process the resource. According to the summary comments above, the second step of resource processing is GetResponseHeaders, so we implement it next. Its comments are as follows:

 
 
   
    /// <summary>
    /// Retrieve response header information. If the response length is not known
    /// set <paramref name="responseLength" /> to -1 and ReadResponse() will be called until it
    /// returns false. If the response length is known set <paramref name="responseLength" />
    /// to a positive value and ReadResponse() will be called until it returns
    /// false or the specified number of bytes have been read.
    ///
    /// It is also possible to set <paramref name="response" /> to a redirect http status code
    /// and pass the new URL via a Location header. Likewise with <paramref name="redirectUrl" /> it
    /// is valid to set a relative or fully qualified URL as the Location header
    /// value. If an error occured while setting up the request you can call
    /// <see cref="P:CefSharp.IResponse.ErrorCode" /> on <paramref name="response" /> to indicate the error condition.
    /// </summary>
    void GetResponseHeaders(IResponse response, out long responseLength, out string redirectUrl);
 
 

The summary says the following: retrieve the response header information. If the length of the response data is unknown, set responseLength to -1, and CEF will call ReadResponse (deprecated; in practice the Read method) until it returns false. If the length is known, set responseLength to a positive number, and ReadResponse (Read) will be called until it returns false or the number of bytes read reaches responseLength. You can also set response.StatusCode to a redirect status (30x) and redirectUrl to the target URL to redirect the resource.

 

In this article we take the simple route: report the length of the resource here and leave the actual data transfer to Read. In this step we read the bytes of the local file, which is how the JS and CSS files are loaded locally, and keep them in a private field of the MyResourceHandler instance.

 
 
   
    public void GetResponseHeaders(IResponse response, out long responseLength, out string redirectUrl)
    {
        // Read the whole file into the private byte array.
        // File.ReadAllBytes (System.IO) returns the complete content,
        // whereas a single Read call may return fewer bytes than requested.
        this._localResourceData = File.ReadAllBytes(this._localResourceFileName);
        responseLength = this._localResourceData.LongLength;
        redirectUrl = null;
    }
 
 

Read

 

The definition and notes of this method are as follows:

 
 
   
    /// <summary>
    /// Read response data. If data is available immediately copy up to
    /// dataOut.Length bytes into dataOut, set bytesRead to the number of
    /// bytes copied, and return true. To read the data at a later time keep a
    /// pointer to dataOut, set bytesRead to 0, return true and execute
    /// callback when the data is available (dataOut will remain valid until
    /// the callback is executed). To indicate response completion set bytesRead
    /// to 0 and return false. To indicate failure set bytesRead to &lt; 0 (e.g. -2
    /// for ERR_FAILED) and return false. This method will be called in sequence
    /// but not from a dedicated thread.
    ///
    /// For backwards compatibility set bytesRead to -1 and return false and the ReadResponse method will be called.
    /// </summary>
    bool Read(Stream dataOut, out int bytesRead, IResourceReadCallback callback);
 
 

The summary says roughly the following: read the response data. If the data is available immediately, copy up to dataOut.Length bytes into the dataOut stream, set bytesRead to the number of bytes copied, and return true. If you want to keep a reference to dataOut (the comment says "pointer", but "reference" fits better here) and fill the stream later, set bytesRead to 0, return true immediately, and execute the callback once the data is ready; the dataOut stream stays valid until the callback is executed. To tell CEF that the response is complete, set bytesRead to 0 and return false. To tell CEF that the response failed, set bytesRead to a value less than zero (for example -2 for ERR_FAILED) and return false. The method is called in sequence, but not from a dedicated thread.

 

According to the above notes, it is summarized as follows:

 
  • bytesRead > 0, return true: data was written, and Read will be called again
  • bytesRead = 0, return false: all data has been written; this is the last call
  • bytesRead < 0, return false: an error occurred; this is the last call
  • bytesRead = 0, return true: CEF keeps the dataOut stream alive, and the callback is invoked asynchronously once the data is ready
 

For this example we add a private field, _dataReadCount, which tracks how many bytes of the resource have been read so far; it is initialized to 0 in the constructor.

 

On every call to Read, first compute the number of bytes still to be read: this._localResourceData.LongLength - this._dataReadCount. If it is zero, all the data has already been handed to CEF through dataOut, so set bytesRead to 0 and return false. If it is greater than zero, the copy must continue. Note, however, that dataOut is not an unbounded stream but a fixed-size buffer; in this example its Length is 2^16 = 65536. We therefore have to set bytesRead so that the caller knows how many bytes we actually put into the stream, and pass the correct offset and count to Stream.Write.

 

Finally, the implementation of Read is as follows:

 
 
   
    public bool Read(Stream dataOut, out int bytesRead, IResourceReadCallback callback)
    {
        int leftToRead = this._localResourceData.Length - this._dataReadCount;
        if (leftToRead == 0)
        {
            bytesRead = 0;
            return false;
        }
        int needRead = Math.Min((int)dataOut.Length, leftToRead); // at most dataOut.Length bytes per call
        dataOut.Write(this._localResourceData, this._dataReadCount, needRead);
        this._dataReadCount += needRead;
        bytesRead = needRead;
        return true;
    }
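
To see how this chunked copy behaves over several calls, the same logic can be exercised without CefSharp. The sketch below is a simulation under assumptions: ChunkedSource mirrors the Read method above (minus the callback parameter), ReadLoopDemo.Drain plays the role of CEF by offering a fixed-size buffer until Read returns false, and a MemoryStream over a fixed array stands in for the real dataOut stream. All three names are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the handler's Read logic, without any CefSharp types.
public class ChunkedSource
{
    private readonly byte[] _data;
    private int _dataReadCount;

    public ChunkedSource(byte[] data) { _data = data; }

    // Same contract as the article's Read: copy at most dataOut.Length bytes,
    // report the count, and return false once nothing is left.
    public bool Read(Stream dataOut, out int bytesRead)
    {
        int leftToRead = _data.Length - _dataReadCount;
        if (leftToRead == 0)
        {
            bytesRead = 0;
            return false;
        }
        int needRead = Math.Min((int)dataOut.Length, leftToRead);
        dataOut.Write(_data, _dataReadCount, needRead);
        _dataReadCount += needRead;
        bytesRead = needRead;
        return true;
    }
}

public static class ReadLoopDemo
{
    // Plays the role of the caller: offers a fixed-size buffer until Read returns false.
    // Returns the bytes received and the number of Read calls that were made.
    public static (byte[] Received, int Calls) Drain(ChunkedSource source, int bufferSize)
    {
        var received = new MemoryStream();
        var buffer = new byte[bufferSize];
        int calls = 0;
        while (true)
        {
            // A MemoryStream over an existing array has a fixed Length and cannot grow.
            using (var dataOut = new MemoryStream(buffer))
            {
                calls++;
                bool more = source.Read(dataOut, out int bytesRead);
                if (!more)
                {
                    break;
                }
                received.Write(buffer, 0, bytesRead);
            }
        }
        return (received.ToArray(), calls);
    }
}
```

With a 150000-byte resource and a 65536-byte buffer, Drain makes four calls: three that carry data (65536, 65536 and 18928 bytes) and a final one that returns false.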
 
 

Several other methods

 

The Cancel and Skip methods are not called in this example, so we keep trivial implementations and do not discuss them. Interested readers can explore them on their own.

 

Final effect

 

With the code above we have completed a simple example of resource interception and custom processing. First, we load our web page without resource interception:

 

 
   
 

You can see the text "test1 js file" in the interface, and the background color is aqua. Next, we turn on resource interception and load the page again. During loading, a prompt pops up for each intercepted resource, and we then choose our replacement file:

 

 
   
  JS intercepted  

 
   
  Manually change the loaded JS  

 
   
  CSS intercepted  

 
   
  Manually change the loaded CSS  

After processing, the following display page is obtained:

 

 
   
 

Source code

 

The complete demo source code is fairly simple; it is the code verified in this article:

CefSharpResourceInterceptExample (github.com)
