Comparing WebKit, Geckofx, and CefSharp for C# WinForms, with a CefSharp implementation

Posted by recurzion on Tue, 28 Dec 2021 15:18:59 +0100


A WinForms project can embed a browser-like component to load pages, read their HTML, and operate on them. The following components can be used:

Browser component        Kernel            Compatibility   Cookie access
VS built-in WebBrowser   IE                Worst           Incomplete
WebKit                   WebKit (Safari)   Average         Incomplete
Geckofx                  Firefox           Good            Incomplete
CefSharp                 Chrome (Google)   Good            Complete

Whenever a project needs an embedded browser, CefSharp is strongly recommended. Its JavaScript support is also good: you can execute JS code directly.

1, Install CefSharp using the NuGet package manager that comes with Visual Studio

Recent versions of Visual Studio come with the NuGet tool. The NuGet package manager entry is under Tools on the menu bar. Installing and using NuGet is described below.
1. Tools → NuGet Package Manager → Package Manager Console
Here you can enter commands in the console, for example Install-Package CefSharp.WinForms (as shown below):

2. If VS does not have the NuGet tool, go to "Tools → Extensions and Updates" and search for NuGet

If you open the Tools menu and don't see NuGet, note that in the Chinese edition it appears under the name Library Package Manager (N).

3. Using NuGet

Once the NuGet extension is installed in VS, right-click References in the project to open the context menu.
Click Manage NuGet Packages, search for CefSharp, and install all CefSharp packages or just the three packages in the red box below.
After installation, CefSharp is referenced in the project, as shown in the following figure.

The CefSharp component then also appears in the Toolbox and can be dragged onto a Form.

2, Exposing methods of a C# class to pages loaded in CefSharp

Methods of a class written in C# can be called by the JS code loaded in the ChromiumWebBrowser component, once the class is registered with that component.

1. Write the C# class CefCustomJSObject for the front end to call

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using CefSharp.WinForms;
using System.Windows.Forms;
namespace WindowsFormsApp1.bll
{
    public class CefCustomJSObject
    {
        // The ChromiumWebBrowser instance passed in from the main form
        private static ChromiumWebBrowser _instanceBrowser = null;
        public CefCustomJSObject(ChromiumWebBrowser originalBrowser)
        {
            _instanceBrowser = originalBrowser;
        }
        /// <summary>
        /// Shows a message box; called by the front-end JS
        /// </summary>
        /// <param name="msg">The message passed in from JS</param>
        public void showAlertMsg(string msg)
        {
            MessageBox.Show("The message passed in by the front-end JS call is [" + msg + "]");
        }
    }
}

2. Register the exposed CefCustomJSObject class with the ChromiumWebBrowser component

            // This line is required, otherwise the Register call below reports an error.
            // Lower versions set CefSharpSettings.LegacyJavascriptBindingEnabled = true; instead.
            // Higher versions set it this way:
            chromiumWebBrowser1.JavascriptObjectRepository.Settings.LegacyBindingEnabled = true;
            // The following line can stay commented out. Setting WcfEnabled only affects the internal communication mode
            //CefSharpSettings.WcfEnabled = true;
            // "bound" is the name of the object the front end calls, for example bound.showAlertMsg();
            // CefCustomJSObject is the exposed C# class, so the method called from JS is bound.showAlertMsg
            chromiumWebBrowser1.JavascriptObjectRepository.Register("bound", new CefCustomJSObject(chromiumWebBrowser1), isAsync: false, options: BindingOptions.DefaultBinder);

3. Call the method of the exposed class from the front end

 var frame = chromiumWebBrowser1.GetMainFrame();
 var task = frame.EvaluateScriptAsync("(function() {   bound.showAlertMsg(\"Test content\"); })();", null);

The code above runs a JavaScript function in the main frame of the page; the function calls the showAlertMsg method of the bound object registered in step 2.
The result is as follows:

3, CefSharp request resource interception and custom processing

By implementing the IRequestHandler interface, you can customize how each request of the ChromiumWebBrowser component is intercepted and processed, for example to download all image and css resources of a page.
The IRequestHandler interface is defined in the CefSharp namespace. Implement it to handle events related to browser requests (each method is called on the thread indicated in its documentation).
You do not need to implement every member of IRequestHandler yourself, because the CefSharp.Handler namespace contains a default implementation class, RequestHandler, which serves as a convenient base class for custom processing.

1. Customize the IRequestHandler implementation

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using CefSharp.Handler;
using CefSharp;
namespace WindowsFormsApp1.bll
{
    /// <summary>
    /// Custom MyRequestHandler class, derived from CefSharp.Handler.RequestHandler.
    /// Only two methods, GetResourceRequestHandler and OnBeforeBrowse, are overridden
    /// </summary>
    public class MyRequestHandler : RequestHandler
    {
        /// <summary>
        ///Called on the CEF IO thread before a resource request is initialized
        /// </summary>
        /// <param name="chromiumWebBrowser"></param>
        /// <param name="browser"></param>
        /// <param name="frame"></param>
        /// <param name="request"></param>
        /// <param name="isNavigation"></param>
        /// <param name="isDownload"></param>
        /// <param name="requestInitiator"></param>
        /// <param name="disableDefaultHandling"></param>
        /// <returns></returns>
        protected override IResourceRequestHandler GetResourceRequestHandler(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, bool isNavigation, bool isDownload, string requestInitiator, ref bool disableDefaultHandling)
        {
            Console.WriteLine("GetResourceRequestHandler:"+request.Url);
            return base.GetResourceRequestHandler(chromiumWebBrowser, browser, frame, request, isNavigation, isDownload, requestInitiator, ref disableDefaultHandling);
        }
        /// <summary>
        /// Called before browser navigation
        /// </summary>
        /// <param name="chromiumWebBrowser"></param>
        /// <param name="browser"></param>
        /// <param name="frame"></param>
        /// <param name="request"></param>
        /// <param name="userGesture"></param>
        /// <param name="isRedirect"></param>
        /// <returns></returns>
        protected override bool OnBeforeBrowse(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, bool userGesture, bool isRedirect)
        {
            Console.WriteLine("OnBeforeBrowse" + request.Url);
            return base.OnBeforeBrowse(chromiumWebBrowser, browser, frame, request, userGesture, isRedirect);
        }
    }
}

As the code above shows, I only overrode two methods, OnBeforeBrowse and GetResourceRequestHandler. OnBeforeBrowse is called before the browser navigates, so it runs once each time a page url is loaded. GetResourceRequestHandler is called on the CEF IO thread before a resource request is initialized. A page usually requests several resources, such as multiple css links and multiple image links, and GetResourceRequestHandler runs once for each of those requests. Note that the handler only takes effect once it is assigned to the browser, for example chromiumWebBrowser1.RequestHandler = new MyRequestHandler();
If I want to automatically download every css, js and image resource, I can intercept the requests in the GetResourceRequestHandler method.

// Determine the kind of resource. This if block goes before the return statement in GetResourceRequestHandler
if (request.Url.EndsWith("test1.js") || request.Url.EndsWith("test1.css"))
{
    MessageBox.Show($@"Resource interception:{request.Url}");

    // A simple check for js or css; a real program would cover more types
    string type = request.Url.EndsWith(".js") ? "js" : "css";
    // Here you can write the code that saves the resource automatically
}
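
The EndsWith check above misses urls that carry a query string, such as test1.js?v=2. The following is a minimal sketch of one way to classify a url and choose a local file path for the saved copy. The names ResourceSaver, GetResourceType and GetLocalPath are hypothetical helpers, not part of CefSharp, and the list of extensions is only an assumption about which resources are of interest.

```csharp
using System;
using System.IO;

public static class ResourceSaver
{
    // Hypothetical helper: classifies a url by the extension of its path,
    // ignoring any query string or fragment. Returns null for other resources.
    public static string GetResourceType(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri))
        {
            return null;
        }
        switch (Path.GetExtension(uri.AbsolutePath).ToLowerInvariant())
        {
            case ".js": return "js";
            case ".css": return "css";
            case ".png":
            case ".jpg":
            case ".jpeg":
            case ".gif": return "image";
            default: return null;
        }
    }

    // Hypothetical helper: builds a local path such as <rootFolder>/css/test1.css.
    // Returns null when the url is not one of the resource types above.
    public static string GetLocalPath(string rootFolder, string url)
    {
        string type = GetResourceType(url);
        if (type == null)
        {
            return null;
        }
        string fileName = Path.GetFileName(new Uri(url).AbsolutePath);
        return Path.Combine(rootFolder, type, fileName);
    }
}
```

Inside the if block, ResourceSaver.GetLocalPath("download", request.Url) would give the target path. The sketch only picks the path: the bytes to write are not available yet at this point, because the request has not been sent.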

The GetResourceRequestHandler method returns an IResourceRequestHandler. This interface is also in the CefSharp namespace and handles events related to resource requests. Unless otherwise specified, its methods are called on the CEF IO thread.

IResourceRequestHandler also has a default implementation class, ResourceRequestHandler, in the CefSharp.Handler namespace, which implements the methods of the interface. We can derive from this class in custom development.

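using System;
using System.Collections.Generic;
using System.Text;
using CefSharp;
using CefSharp.Handler;
using Newtonsoft.Json.Linq; // JObject and JArray come from the Newtonsoft.Json package

public class MyResourceRequestHandler : ResourceRequestHandler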
    {
        private readonly System.IO.MemoryStream memoryStream = new System.IO.MemoryStream();
        protected override IResourceHandler GetResourceHandler(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request)
        {
            //Get the parameter data in the request body of the post request
          /*  if (request.Url.ToLower().Contains("dologin".ToLower()))
            {
                using (var postData = request.PostData)
                {
                    if (postData != null)
                    {
                        var elements = postData.Elements;
                        //Gets the charset property of the character encoding in the request
                        var charSet = request.GetCharSet();

                        foreach (var element in elements)
                        {
                            //PostDataElementType There are three types of enumeration: Empty type, Bytes byte type and File file type
                            if (element.Type == PostDataElementType.Bytes)
                            {
                                //Gets a string in the specified character set format from IPostDataElement
                                var body = element.GetBody(charSet);
                                Console.WriteLine(body);
                            }
                        }
                    }
                }
            }*/
            // return new MyResourceHandler("");
            return null;
        }

        protected override IResponseFilter GetResourceResponseFilter(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response)
        {
            return new CefSharp.ResponseFilter.StreamResponseFilter(memoryStream);
        }

        protected override void OnResourceLoadComplete(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response, UrlRequestStatus status, long receivedContentLength)
        {
            var bytes = memoryStream.ToArray();
            memoryStream.Close();
            memoryStream.Dispose();
            var charSet = request.GetCharSet();
            // Pick the encoding for decoding the response, falling back to UTF-8
            Encoding encoding = Encoding.UTF8;
            if (charSet != null)
            {
                try
                {
                    encoding = Encoding.GetEncoding(charSet);
                }
                catch (ArgumentException)
                {
                }
            }
            string json = encoding.GetString(bytes);
            Console.WriteLine(json);
            JObject jObj = JObject.Parse(json);
            Console.WriteLine(jObj["total"].Value<int>());
            Info info=bll.JsonUtil.DeserializeObject<Info>(json);
            JArray jArray = JArray.Parse(jObj["data"].ToString());
            base.OnResourceLoadComplete(chromiumWebBrowser, browser, frame, request, response, status, receivedContentLength);
        }
    }

    public class Info
    {
        public long total { get; set; }
        public int page { get; set; }
        public List<Data> data{ get; set; }
    }
    public class Data
    {
        public long id { get; set; }
        public string year { get; set; }
        public string season { get; set; }
    }
As you can see, the class overrides three methods: GetResourceHandler, GetResourceResponseFilter, and OnResourceLoadComplete.

If GetResourceHandler returns null, Cef uses the default network loader to issue the request. Alternatively, we can return a custom resource handler that supplies a valid stream. In other words, to process a resource ourselves (rather than merely intercept it, which the two handlers above already allow), we must implement the IResourceHandler interface and return an instance from GetResourceHandler before Cef can use our handler.

GetResourceResponseFilter can filter and modify the response data. Newer versions of CefSharp provide a default implementation class, CefSharp.ResponseFilter.StreamResponseFilter, which takes a stream in its constructor; in the OnResourceLoadComplete method, that stream holds the data returned in the response. If this filter class is not enough, we can implement the CefSharp.IResponseFilter interface ourselves. For specific code, refer to the source of the StreamResponseFilter class.

OnResourceLoadComplete is called when the resource has finished loading. By then, the stream passed to the filter in GetResourceResponseFilter has been filled with the response data, which can be read in this method.

OK, now let's summarize how these interfaces and methods relate to each other.

The IRequestHandler interface has a default implementation, CefSharp.Handler.RequestHandler.
A class derived from it mainly overrides two methods:

  • GetResourceRequestHandler method
    If this method returns null, the request is loaded by the default handler. Inside the method, the request parameter gives access to request information such as the url, the postData of a post request, and the request headers.
    To intercept the data in the response, implement IResourceRequestHandler and return an instance.
    Here is a small example that reads the parameters of a post request in the GetResourceRequestHandler method:
//Intercept the address containing dologin in the url
if (request.Url.ToLower().Contains("dologin".ToLower()))
             {
                 using (var postData = request.PostData)
                 {
                     if (postData != null)
                     {
                         var elements = postData.Elements;
                         //Gets the charset property of the character encoding in the request
                         var charSet = request.GetCharSet();

                         foreach (var element in elements)
                         {
                             //The PostDataElementType enumeration has three types: Empty type, Bytes type, and File file type
                             if (element.Type == PostDataElementType.Bytes)
                             {
                                 //Gets a string in the specified character set format from IPostDataElement
                                 var body = element.GetBody(charSet);
                                 //Modify form content here
                                 Encoding encoding = Encoding.Default;
                                 if (charSet != null)
                                 {
                                     try
                                     {
                                         encoding = Encoding.GetEncoding(charSet);
                                     }
                                     catch (ArgumentException)
                                     {
                                     }
                                 }
                                 //Convert back to a byte array. Here the intercepted parameters are written back with ss=1 appended
                                 element.Bytes = encoding.GetBytes(body + "&ss=1");
                             }
                         }
                     }
                 }
             }
           return null;

Or, to capture the ajax result of a request, the method can be written as follows:

//Intercept the specified request (the url is compared case-insensitively)
if (string.Equals(request.Url, "http://192.168.1.221/vue/conditionDataHandler", StringComparison.OrdinalIgnoreCase))
{
    //MyResourceRequestHandler captures and processes the data of the response
    return new MyResourceRequestHandler();
}

MyResourceRequestHandler is a class derived from CefSharp.Handler.ResourceRequestHandler.

  • OnBeforeBrowse method
    Called before the browser navigates.
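
Returning to the post example under GetResourceRequestHandler: concatenating "&ss=1" yields a body that starts with "&" when the original body is empty, and it does not escape the added value. A minimal sketch of a more careful version follows, assuming a hypothetical helper named FormBodyEditor.AppendField that is not part of CefSharp.

```csharp
using System;
using System.Text;

public static class FormBodyEditor
{
    // Hypothetical helper: appends one percent-encoded name=value pair to a
    // form body and returns the bytes in the given encoding.
    public static byte[] AppendField(string body, string name, string value, Encoding encoding)
    {
        string pair = Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value);
        // Only add the separator when there is already something in the body
        string result = string.IsNullOrEmpty(body) ? pair : body + "&" + pair;
        return encoding.GetBytes(result);
    }
}
```

In the example above, the assignment would then read element.Bytes = FormBodyEditor.AppendField(body, "ss", "1", encoding);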

The IResourceRequestHandler interface has a default implementation, CefSharp.Handler.ResourceRequestHandler.
A class derived from it mainly overrides three methods:

  • GetResourceHandler method
    Returns an IResourceHandler instance, in which the response can be accessed and the response stream data intercepted and modified.
  • GetResourceResponseFilter method
    Works together with the next method. Returns an IResponseFilter instance; CefSharp.ResponseFilter.StreamResponseFilter is the implementation provided by default, which captures the response data in a stream passed to it. When implementing IResponseFilter yourself, you can refer to the source of the StreamResponseFilter class.
  • OnResourceLoadComplete method
    Called when the resource has loaded. Here you can read the response data from the stream used in the previous method.

Here is an example that captures ajax response data

 private readonly System.IO.MemoryStream memoryStream = new System.IO.MemoryStream();
 protected override IResponseFilter GetResourceResponseFilter(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response)
 {
   return new CefSharp.ResponseFilter.StreamResponseFilter(memoryStream);
 }

protected override void OnResourceLoadComplete(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response, UrlRequestStatus status, long receivedContentLength)
 {
            var bytes = memoryStream.ToArray();
            memoryStream.Close();
            memoryStream.Dispose();
            var charSet = request.GetCharSet();
            // Pick the encoding for decoding the response, falling back to UTF-8
            Encoding encoding = Encoding.UTF8;
            if (charSet != null)
            {
                try
                {
                    encoding = Encoding.GetEncoding(charSet);
                }
                catch (ArgumentException)
                {
                }
            }
            //Get the response result of ajax. The specific url is not distinguished in the method
            string json = encoding.GetString(bytes);
            Console.WriteLine(json);
            base.OnResourceLoadComplete(chromiumWebBrowser, browser, frame, request, response, status, receivedContentLength);
}
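
The charset fallback appears in several of the listings above, so it may be worth isolating. The following is a minimal sketch, assuming hypothetical helper names BodyDecoder, ResolveEncoding and Decode; it relies only on the standard Encoding.GetEncoding behavior of throwing ArgumentException for an unknown name.

```csharp
using System;
using System.Text;

public static class BodyDecoder
{
    // Hypothetical helper: maps a charset name to an Encoding,
    // falling back to UTF-8 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charSet)
    {
        if (string.IsNullOrWhiteSpace(charSet))
        {
            return Encoding.UTF8;
        }
        try
        {
            return Encoding.GetEncoding(charSet);
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8;
        }
    }

    // Hypothetical helper: decodes the captured bytes; null is treated as empty.
    public static string Decode(byte[] bytes, string charSet)
    {
        return ResolveEncoding(charSet).GetString(bytes ?? Array.Empty<byte>());
    }
}
```

With this in place, the body of OnResourceLoadComplete could shrink to string json = BodyDecoder.Decode(memoryStream.ToArray(), request.GetCharSet());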

Implementation of IResourceHandler interface

using System;
using System.IO;
using CefSharp;

public class MyResourceHandler : IResourceHandler
    {
        string _localResourceFileName;
        public MyResourceHandler(string localResourceFileName)
        {
            this._localResourceFileName = localResourceFileName;
        }

        public void Cancel()
        {
            // Nothing to cancel in this demo
        }
        // Dispose usually releases resources; because this is only a demo, it is left empty for now.
        public void Dispose()
        {
        }
        /// <summary>
        /// Gets the response header information. If the length of the response data is unknown, set responseLength to -1,
        /// and CEF will call Read (ReadResponse is deprecated) until Read returns false.
        /// If the length of the response data is known, set responseLength to that positive number,
        /// and Read will be called until it returns false or until
        /// the number of bytes read reaches responseLength. You can also
        /// set response.StatusCode to a redirect status (30x) and redirectUrl to the target Url to redirect the resource.
        /// </summary>
        /// <param name="response"></param>
        /// <param name="responseLength"></param>
        /// <param name="redirectUrl"></param>
        /// <exception cref="NotImplementedException"></exception>
        public void GetResponseHeaders(IResponse response, out long responseLength, out string redirectUrl)
        {
            // Redirect
            responseLength = -1;
            response.StatusCode = 301;
            redirectUrl ="http://www.baidu.com";
        }
        /// <summary>
        /// The official notes say ProcessRequest is deprecated in favor of Open, so ProcessRequest simply returns true. The comments on Open tell us:
        /// To process the resource immediately (synchronously), set handleRequest to true and return true.
        /// To decide later (asynchronously), set handleRequest to false, return true, and call Continue or Cancel on the callback to continue or cancel the request.
        /// To cancel the request immediately, set handleRequest to true and return false.
        /// That is, handleRequest decides between synchronous and asynchronous handling. When synchronous, Cef uses the return value of Open to continue (true) or cancel (false). When asynchronous, Cef waits for the callback to be invoked (typically from a Task).
        /// </summary>
        /// <param name="request"></param>
        /// <param name="handleRequest"></param>
        /// <param name="callback"></param>
        /// <returns></returns>
        public bool Open(IRequest request, out bool handleRequest, ICallback callback)
        {
            handleRequest = true;
            return true;
        }

        public bool ProcessRequest(IRequest request, ICallback callback)
        {
            // Deprecated in favor of Open, so simply return true
            return true;
        }
        /// <summary>
        ///Read response data. If the data is available immediately, you can directly put dataout Length
        ///Copy the byte data of length to the dataOut stream, and then set the value of bytesRead as the copied data
        ///Byte length value, and finally return true. If developers want to continue to hold references to dataOut
        ///(the comment is a pointer, but I think it's better to write it here as a reference to the dataOut.)
        ///Then fill in the data stream later. You can set bytesRead to 0 and start the data stream asynchronously
        ///When the data is ready, execute the callback operation function, and then return true immediately.
        ///(the dataOut stream will not be released until the callback is called).
        ///To let CEF know that the current response data has been filled, set bytesRead to 0 and return false.
        ///To let the CEF know that the response fails, you need to set bytesRead to a number less than zero (for example, ERR_FAILED: -2),
        ///Then return false. This method will be called in turn, but not in a proprietary thread.
        ///According to the above notes, it is summarized as follows:
        ///Bytesread > 0, return true: data is filled, but Read will be called
        ///bytesRead = 0, return false: data filling is completed, and the current is the last call
        ///Bytesread < 0, return false: error, current is the last call
        /// bytesRead = 0, return true:CEF does not release dataOut flow, and callback is invoked when data is ready in asynchronous calls.
        /// </summary>
        /// <param name="dataOut"></param>
        /// <param name="bytesRead"></param>
        /// <param name="callback"></param>
        /// <returns></returns>
        /// <exception cref="NotImplementedException"></exception>
        public bool Read(Stream dataOut, out int bytesRead, IResourceReadCallback callback)
        {
            throw new NotImplementedException();
        }

        public bool ReadResponse(Stream dataOut, out int bytesRead, ICallback callback)
        {
            throw new NotImplementedException();
        }

        public bool Skip(long bytesToSkip, out long bytesSkipped, IResourceSkipCallback callback)
        {
            throw new NotImplementedException();
        }
    }

Here we mainly look at the comments on the methods; there is no concrete implementation. One scenario deserves emphasis: to tamper with the response data of a request, you can do so in the Open method of the IResourceHandler interface or in the Filter method of the IResponseFilter interface. For a concrete implementation, refer to the CefSharp.ResponseFilter.StreamResponseFilter source code.
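
To make the return-value rules in the Read comments concrete, here is a minimal sketch that mimics only the synchronous part of that contract over an in-memory buffer. It does not implement IResourceHandler: the class BufferReader is hypothetical, and its maxChunk parameter is an assumption standing in for the capacity of the dataOut stream that CEF provides.

```csharp
using System;
using System.IO;

public sealed class BufferReader
{
    private readonly byte[] _data;
    private int _position;

    public BufferReader(byte[] data)
    {
        _data = data ?? Array.Empty<byte>();
    }

    // Copies at most maxChunk bytes into dataOut per call.
    // bytesRead > 0 with true: data was written and another call is expected.
    // bytesRead == 0 with false: everything has been written, last call.
    public bool Read(Stream dataOut, int maxChunk, out int bytesRead)
    {
        if (maxChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxChunk));
        }
        int remaining = _data.Length - _position;
        if (remaining <= 0)
        {
            bytesRead = 0;
            return false;
        }
        bytesRead = Math.Min(remaining, maxChunk);
        dataOut.Write(_data, _position, bytesRead);
        _position += bytesRead;
        return true;
    }
}
```

A caller drains the buffer with a loop such as while (reader.Read(sink, 4096, out int n)) { }, which is roughly the role CEF plays for a real handler. The asynchronous case (bytesRead = 0 with true) and the error case (bytesRead below zero) are left out of this sketch.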

For how to write the Open method so that it serves local html, css, js and other files, refer to the following article: Portal
