1. Problem background
1.1 What is front-end recording and playback?
As the name suggests, it means recording the user's operations on a web page so that they can be played back at any time.
1.2 Why do we need it?
To explain the need, consider a classic scenario. Front-end teams usually do exception monitoring and error reporting: a self-developed or third-party SDK collects JavaScript errors and other relevant data while users interact with the site and reports them. This is commonly called event tracking.
With a traditional tracking scheme, SourceMap lets us locate the exact source file, line, and column of a reported error. That is enough for most problems, but some errors are hard to reproduce. These are the cases behind one of the classic lines programmers use when arguing with testers: "It doesn't throw an error on my machine, is something wrong with your computer?"
If only we could record the operations that led to the error, it would be easy to reproduce the scene and keep the evidence, even if it feels like digging a hole for ourselves.
Note from the reposter: our requirement is to replay a user's entire insurance application process (our company is an insurance company).
1.3 How can it be achieved?
Can the front end record video? My first reaction was doubt, but after a round of Googling I found that feasible solutions exist.
Before searching, I thought of setting a timer to take screenshots of the viewport. Screenshots can be produced by drawing the HTML onto a canvas (for example with html2canvas), but this approach would clearly cause performance problems, so I rejected it immediately.
Here is what I learned from my search. If anything is wrong, please correct me.
2. Initial ideas
A web page is essentially a tree of DOM nodes rendered by the browser. Could we save the DOM in some form, keep recording its state at different points in time, and then restore that data into DOM nodes and render it to achieve playback?
2.1 Recording operations
document.documentElement.cloneNode(true) produces a deep clone of the DOM as an object. This object cannot be sent directly to the back end through an API. It first needs to be preprocessed into a format that is convenient to transmit and store. The simplest way is to serialize it, that is, convert it to JSON.
let docJSON = {
  "type": "Document",
  "childNodes": [
    {
      "type": "Element",
      "tagName": "html",
      "attributes": {},
      "childNodes": [
        {
          "type": "Element",
          "tagName": "head",
          "attributes": {},
          "childNodes": []
        }
      ]
    }
  ]
}
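The serialization step can be sketched as a small recursive function. This is only a minimal sketch under my own assumptions, not rrweb's implementation: the name domToJson is hypothetical, the "Text" type is my addition to the JSON shape shown above, and the function relies only on the standard nodeType, tagName, attributes, childNodes, and textContent properties, so it accepts real DOM nodes as well as plain objects with the same shape.

```javascript
// Numeric values of the standard Node.nodeType constants.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const DOCUMENT_NODE = 9;

// Hypothetical helper: turn a DOM node (or a look-alike object) into JSON-safe data.
function domToJson(node) {
  // Serialize children first; unsupported children come back as null and are dropped.
  const childNodes = Array.from(node.childNodes || [])
    .map(domToJson)
    .filter(Boolean);

  switch (node.nodeType) {
    case DOCUMENT_NODE:
      return { type: 'Document', childNodes };
    case ELEMENT_NODE: {
      const attributes = {};
      for (const { name, value } of Array.from(node.attributes || [])) {
        attributes[name] = value;
      }
      return {
        type: 'Element',
        tagName: node.tagName.toLowerCase(),
        attributes,
        childNodes,
      };
    }
    case TEXT_NODE:
      // "Text" is an assumed type name, not part of the sample above.
      return { type: 'Text', textContent: node.textContent };
    default:
      return null; // comments, doctype, etc. are skipped in this sketch
  }
}
```

In a browser, JSON.stringify(domToJson(document)) would then give a string that can be sent to the back end.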
Once we have the complete DOM data, we also need to listen for DOM changes and record the node information of each change. For this we can use MutationObserver, an API for observing changes to the DOM.
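const observer = new MutationObserver((mutationsList) => {
  console.log(mutationsList); // the changed data
});
// Start observing the target node. At least one of childList, attributes,
// or characterData must be true, otherwise observe() throws a TypeError.
observer.observe(document, {
  childList: true,
  attributes: true,
  characterData: true,
  subtree: true,
});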
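The records passed to the callback reference live DOM nodes, so they cannot be stored as they are. Below is a minimal sketch of flattening them into plain data. The function name summarizeMutations, the output field names, and the getId parameter (a function that maps a node to a stable id, which the recorder would have to maintain) are all my assumptions. Only the MutationRecord properties type, target, addedNodes, removedNodes, and attributeName are standard.

```javascript
// Hypothetical helper: flatten MutationRecord objects into plain, storable data.
// getId is an assumed function that maps a node to a stable id.
function summarizeMutations(records, getId) {
  const summary = { adds: [], removes: [], attributes: [], texts: [] };
  for (const record of records) {
    const targetId = getId(record.target);
    if (record.type === 'childList') {
      for (const node of Array.from(record.addedNodes)) {
        summary.adds.push({ parentId: targetId, id: getId(node) });
      }
      for (const node of Array.from(record.removedNodes)) {
        summary.removes.push({ parentId: targetId, id: getId(node) });
      }
    } else if (record.type === 'attributes') {
      summary.attributes.push({
        id: targetId,
        name: record.attributeName,
        value: record.target.getAttribute(record.attributeName),
      });
    } else if (record.type === 'characterData') {
      summary.texts.push({ id: targetId, value: record.target.textContent });
    }
  }
  return summary;
}
```

A real recorder would also need to store the serialized content of each added node. This sketch keeps only ids to stay short.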
Besides DOM changes, we also need to listen for events. Users mostly interact with web pages through input devices such as the mouse and keyboard, and behind these interactions are JavaScript events. We can capture them by binding listeners to the system events, and they need to be recorded as well. Take mouse movement as an example:
// Mouse movement (pseudocode: collect the pointer position on every move)
const positions = [];
const timeBaseline = Date.now();
document.addEventListener('mousemove', (e) => {
  positions.push({
    x: e.clientX,
    y: e.clientY,
    timeOffset: Date.now() - timeBaseline,
  });
});
Note: addEventListener can register multiple listeners for the same event, so recording does not interfere with the developer's own event bindings.
2.2 Playback
Once we have the data, the next step is playback. Playback essentially restores the JSON data into DOM nodes and renders them, but restoring the data is not that simple.
2.3 Rendering environment
First, to keep the replayed content isolated during playback, a sandbox environment is needed. This can be done with an iframe element, which provides a sandbox attribute for configuring the restrictions. The sandbox ensures that the replayed content runs safely and that it and the host page do not interfere with each other.
<iframe sandbox srcdoc></iframe>
The sandbox attribute applies sandbox restrictions to the content of the iframe.
The srcdoc attribute sets the iframe content directly from a string of HTML.
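The same iframe can also be created from script. The sketch below is only an illustration: createSandboxedFrame is a hypothetical name, and the doc parameter is assumed to behave like window.document.

```javascript
// Hypothetical helper: build a sandboxed iframe that shows a string of HTML.
// doc is expected to behave like window.document.
function createSandboxedFrame(doc, html) {
  const iframe = doc.createElement('iframe');
  // An empty sandbox value applies every restriction (no scripts, forms, popups, ...).
  iframe.setAttribute('sandbox', '');
  // srcdoc supplies the document content inline instead of loading a URL.
  iframe.setAttribute('srcdoc', html);
  return iframe;
}
```

In a browser this would be used as document.body.appendChild(createSandboxedFrame(document, '<p>hello</p>')). Individual permissions can be granted by listing tokens such as allow-same-origin in the sandbox value.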
2.4 Data restoration
Rebuilding a snapshot mainly means rebuilding DOM nodes, which is a bit like turning a virtual DOM into real document nodes. Event-type snapshots do not need to be rebuilt.
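As a rough illustration of the rebuilding step, here is a minimal sketch that turns the JSON shape from section 2.1 back into nodes. The name jsonToDom is hypothetical, the "Text" type is my own assumption (the sample JSON only shows "Document" and "Element"), and the doc parameter only needs createElement and createTextNode, as window.document provides.

```javascript
// Hypothetical helper: rebuild nodes from the serialized JSON shape.
// doc must offer createElement and createTextNode, as window.document does.
function jsonToDom(data, doc) {
  if (data.type === 'Text') {
    return doc.createTextNode(data.textContent);
  }
  if (data.type === 'Element') {
    const el = doc.createElement(data.tagName);
    for (const [name, value] of Object.entries(data.attributes || {})) {
      el.setAttribute(name, value);
    }
    // Rebuild children recursively and attach them in their original order.
    for (const child of data.childNodes || []) {
      el.appendChild(jsonToDom(child, doc));
    }
    return el;
  }
  throw new Error(`Unsupported node type: ${data.type}`);
}
```

The recursion mirrors the serialization: each serialized node becomes one real node, and children are appended in order.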
2.5 Timer
With the data and the environment ready, we still need a timer. Rendering the DOM continuously on a timer gives essentially the same effect as playing a video, and requestAnimationFrame is the most suitable tool.
requestAnimationFrame runs its callback before each repaint, in step with the display's refresh rate, which avoids stutter and makes it a better choice than setTimeout.
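A frame-driven playback loop could look like the sketch below. It is an illustration under my own assumptions rather than rrweb's Timer: the name playActions and the action shape { timeOffset, run } are hypothetical, and the raf and now parameters are injectable so that the loop can also be driven outside a browser.

```javascript
// Hypothetical helper: replay recorded actions in time order.
// actions: [{ timeOffset: milliseconds since recording start, run: function }]
// raf and now are injectable so the loop can also run outside a browser.
function playActions(actions, raf = requestAnimationFrame, now = () => performance.now()) {
  const queue = [...actions].sort((a, b) => a.timeOffset - b.timeOffset);
  const startedAt = now();

  function tick() {
    const elapsed = now() - startedAt;
    // Run every action whose recorded time has already been reached.
    while (queue.length > 0 && queue[0].timeOffset <= elapsed) {
      queue.shift().run();
    }
    if (queue.length > 0) {
      raf(tick); // wait for the next frame before checking again
    }
  }

  raf(tick);
}
```

Each frame compares the elapsed playback time with the recorded offsets, so actions fire at roughly the same pace at which they were recorded.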
So far this is only a rough idea, still some distance from a working implementation. Thanks to open source, we can look on GitHub for an existing wheel to copy (learn from), and there happens to be a ready-made framework, rrweb, so let's take a look at it together.
3. rrweb framework
rrweb is a front-end recording and playback framework. Its full name is "record and replay the web". As the name suggests, it records and replays operations in a web interface, and its core principle is the scheme described above.
3.1 Composition of rrweb
rrweb consists of three parts:
- rrweb-snapshot: handles serialization and rebuilding of the DOM structure;
- rrweb: provides the main recording and playback functions;
- rrweb-player: a video-player-style UI control for playback.
3.2 Using rrweb
Install it with npm and load it with import or require as usual.
3.2.1 Recording
Use the rrweb.record method to record the page. The emit callback receives the recorded data.
// 1. Recording
let events = [];
// Record snapshots
rrweb.record({
  emit(event) {
    // Store each event in the events array
    events.push(event);
  },
});
3.2.2 Playback
Use rrweb.Replayer to play back the recording. It needs to be given the recorded data.
// 2. Playback
const replayer = new rrweb.Replayer(events);
replayer.play();
4. rrweb source code
Following the ideas above, I will walk through some of the key code. This is only an analysis based on my personal understanding, and the actual rrweb source contains far more than this.
The core part consists of three blocks: record, replay and snapshot.
4.1 Record
After the DOM is loaded, record serializes the complete DOM. This is called a full snapshot, and it captures the entire HTML structure.
In record.ts, find the definition of the key entry function init. It calls takeFullSnapshot and observe(document) once the document has finished loading (readyState is interactive or complete).
// init is declared first so that it exists before either branch calls it
const init = () => {
  takeFullSnapshot(); // generate the full snapshot
  handlers.push(observe(document)); // start listening
};
if (
  document.readyState === 'interactive' ||
  document.readyState === 'complete'
) {
  init();
} else {
  //...
  on('load', () => {
    init();
  });
}
document.readyState has three possible values, listed here in the order a page goes through them:
1. loading;
2. interactive;
3. complete.
As its name implies, takeFullSnapshot generates a "complete" snapshot: it serializes the document into one complete piece of data, which is called the full snapshot.
All serialization is done through snapshot, which accepts a DOM object and a configuration object. Passing document serializes the whole page and returns the complete snapshot data.
// Generate full snapshot
takeFullSnapshot = (isCheckout = false) => {
  //...
  const [node, idNodeMap] = snapshot(document, {
    //... some configuration items
  });
  //...
};
idNodeMap is a key-value object whose keys are ids and whose values are DOM nodes.
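To make the idea of an id-to-node map concrete, here is a minimal sketch of how such a map could be filled during serialization. It reflects my own assumptions rather than rrweb's code: createSerializer and serializeWithId are hypothetical names, and the serialized shape is reduced to a few fields.

```javascript
// Hypothetical sketch: assign ids during serialization and remember id -> node.
function createSerializer() {
  let nextId = 1;
  const idNodeMap = {};

  function serializeWithId(node) {
    const id = nextId++;
    idNodeMap[id] = node; // lets later events refer to this node by id
    const serialized = { id, type: node.nodeType, childNodes: [] };
    if (node.tagName) {
      serialized.tagName = node.tagName.toLowerCase();
    }
    for (const child of Array.from(node.childNodes || [])) {
      serialized.childNodes.push(serializeWithId(child));
    }
    return serialized;
  }

  return { serializeWithId, idNodeMap };
}
```

Because every serialized node carries an id, an incremental event only needs to say "node 42 changed" instead of describing where the node sits in the tree.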
observe(document) sets up the listeners. It takes the whole document object as the target and initializes the listeners by calling initObservers.
const observe = (doc: Document) => {
  return initObservers(/* ... */);
};
The initObservers function is defined in observer.ts. It initializes 11 listeners, which fall into three categories: DOM, event, and media:
export function initObservers(/* ... */) {
  // DOM
  const mutationObserver = initMutationObserver();
  const mousemoveHandler = initMoveObserver();
  const mouseInteractionHandler = initMouseInteractionObserver();
  const scrollHandler = initScrollObserver();
  const viewportResizeHandler = initViewportResizeObserver();
  // ...
}
The DOM change listener covers DOM changes (additions, removals, and modifications) and style changes. Its core is implemented with MutationObserver.
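let mutationObserverCtor = window.MutationObserver;
const observer = new mutationObserverCtor(
  // Handle the changed data
  mutationBuffer.processMutations.bind(mutationBuffer),
);
// observe() requires at least one of childList, attributes, or characterData
observer.observe(doc, {
  attributes: true,
  characterData: true,
  childList: true,
  subtree: true,
});
return observer;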
Interaction listeners: take mouse movement (initMoveObserver) as an example.
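// Record mouse movement
function initMoveObserver() {
  const updatePosition = throttle<MouseEvent | TouchEvent>((evt) => {
    // Touch events carry their coordinates in changedTouches
    const { clientX, clientY } =
      'changedTouches' in evt ? evt.changedTouches[0] : evt;
    positions.push({
      x: clientX,
      y: clientY,
    });
  });
  const handlers = [
    on('mousemove', updatePosition, doc),
    on('touchmove', updatePosition, doc),
  ];
}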
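Mouse movement fires very frequently, which is why the handler is wrapped in throttle. The sketch below shows the general idea of throttling and is not rrweb's own utility, which may behave differently. The wait parameter and the injectable now parameter are my assumptions.

```javascript
// Hypothetical helper: call fn at most once every `wait` milliseconds.
// Calls that arrive inside the window are dropped (leading-edge throttle).
function throttle(fn, wait, now = Date.now) {
  let last = -Infinity;
  return function throttled(...args) {
    const current = now();
    if (current - last >= wait) {
      last = current;
      fn.apply(this, args);
    }
  };
}
```

With a wait of 50, for example, at most about 20 positions per second would be recorded no matter how fast the pointer moves.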
The media listeners cover canvas, video, and audio. Taking video as an example, the listener essentially records the play and pause state and reports it through the mediaInteractionCb callback.
function initMediaInteractionObserver(): listenerHandler {
  // ... inside the play / pause event handler, where type is the event
  // name and target is the media element that fired it:
  mediaInteractionCb({
    type: type === 'play' ? MediaInteractions.Play : MediaInteractions.Pause,
    id: mirror.getId(target as INode),
  });
  // ...
}
5. Snapshot
snapshot is responsible for serialization and rebuilding. It serializes the DOM with serializeNodeWithId and rebuilds it with the rebuildWithSN function.
The serializeNodeWithId function handles serialization and does three main things:
- Calls serializeNode to serialize the node;
- Generates a unique ID with genId() and binds it to the node;
- Recursively serializes the child nodes and finally returns an object with an ID.
// Serialize a DOM node and attach an ID
export function serializeNodeWithId(n) {
  // 1. serializeNode, the core serialization function
  const _serializedNode = serializeNode(n);
  // 2. Generate a unique ID
  let id = genId();
  // Bind the ID
  const serializedNode = Object.assign(_serializedNode, { id });
  // 3. Serialize child nodes recursively
  for (const childN of Array.from(n.childNodes)) {
    // bypassOptions: the options forwarded to child calls (elided here)
    const serializedChildNode = serializeNodeWithId(childN, bypassOptions);
    if (serializedChildNode) {
      serializedNode.childNodes.push(serializedChildNode);
    }
  }
  return serializedNode;
}
The core of serializeNodeWithId is serializing the DOM with serializeNode, which applies special handling to different kinds of nodes.
Handling node attributes:
for (const { name, value } of Array.from((n as HTMLElement).attributes)) {
  attributes[name] = transformAttribute(doc, tagName, name, value);
}
Handling external CSS: the actual style rules are read with getCssRulesString and stored in attributes.
const cssText = getCssRulesString(stylesheet as CSSStyleSheet);
if (cssText) {
  attributes._cssText = absoluteToStylesheet(
    cssText,
    stylesheet!.href!,
  );
}
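Reading the rules of a stylesheet can be sketched as follows. This is my own minimal version, not the rrweb function itself: cssRulesToString is a hypothetical name, and the sheet parameter is assumed to look like a CSSStyleSheet, whose cssRules entries each expose a cssText string.

```javascript
// Hypothetical helper: concatenate the text of every rule in a stylesheet.
// Returns null when the rules cannot be read, for example because the browser
// blocks access to the rules of a cross-origin stylesheet by throwing.
function cssRulesToString(sheet) {
  try {
    const rules = sheet.cssRules || sheet.rules;
    return rules ? Array.from(rules).map((rule) => rule.cssText).join('') : null;
  } catch (error) {
    return null;
  }
}
```

Inlining the rule text this way means playback does not depend on the original stylesheet URL still being reachable.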
Handling forms: the logic is to save the checked state and apply some security processing, such as replacing the content of a password field with * characters.
if (
  attributes.type !== 'radio' &&
  attributes.type !== 'checkbox' &&
  // ...
) {
  attributes.value = maskInputOptions[tagName]
    ? '*'.repeat(value.length)
    : value;
} else if (n.checked) {
  attributes.checked = n.checked;
}
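The masking rule can be isolated into a small pure function. The sketch below is an illustration under my own assumptions: serializeFormState is a hypothetical name, and maskInputOptions is assumed here to be an object keyed by input type, with password masked by default.

```javascript
// Hypothetical helper: decide what to store for a form control.
// maskInputOptions maps an input type to true when its value must be masked.
function serializeFormState(input, maskInputOptions = { password: true }) {
  const type = (input.type || '').toLowerCase();
  if (type === 'radio' || type === 'checkbox') {
    // For these controls the checked state matters, not the value.
    return { checked: Boolean(input.checked) };
  }
  const value = input.value || '';
  const masked = Boolean(maskInputOptions[type]);
  // Keep the length so the replay looks natural, but hide the characters.
  return { value: masked ? '*'.repeat(value.length) : value };
}
```

Masking at recording time means the sensitive text never leaves the user's browser.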
Saving canvas state: the canvas content is saved with toDataURL.
attributes.rr_dataURL = (n as HTMLCanvasElement).toDataURL();
rebuild is responsible for rebuilding the DOM:
- Rebuilds each node with the buildNodeWithSN function;
- Calls it recursively to rebuild the child nodes.
export function buildNodeWithSN(n) {
  // buildNode is the core DOM rebuilding function
  let node = buildNode(n, { doc, hackCss });
  // Rebuild the child nodes and append them
  for (const childN of n.childNodes) {
    const childNode = buildNodeWithSN(childN);
    node.appendChild(childNode);
    if (afterAppend) {
      afterAppend(childNode);
    }
  }
  return node;
}
6. Replay
The playback part lives in the replay.ts file. It first creates the sandbox environment and then rebuilds the full snapshot of the document. The incremental snapshots are then played by a timer driven by requestAnimationFrame.
The Replayer constructor receives two parameters: the snapshot data events and the configuration object config.
export class Replayer {
  constructor(events, config) {
    // 1. Create the sandbox environment
    this.setupDom();
    // 2. Timer
    const timer = new Timer();
    // 3. Playback service
    this.service = createPlayerService(events, timer);
    this.service.start();
  }
}
The three core steps in the constructor are creating the sandbox environment, creating the timer, and initializing and starting the player service. The player service depends on events and timer, so in essence playback is still driven by the timer.
6.1 Sandbox environment
First, in the constructor in replay.ts we find this.setupDom. Its core is to create a sandbox environment with an iframe.
private setupDom() {
  // Create the iframe
  this.iframe = document.createElement('iframe');
  this.iframe.style.display = 'none';
  // attributes: the list of sandbox permissions (its definition is elided here)
  this.iframe.setAttribute('sandbox', attributes.join(' '));
}
6.2 Playback service
Also in the constructor in replay.ts, the createPlayerService function is called to create the player service. It is defined in machine.ts in the same directory. The core idea is to add the snapshot actions to the timer and call timer.start() to begin playing them.
export function createPlayerService(/* ... */) {
  //...
  play(ctx) {
    // Collect the doAction function for each event
    for (const event of needEvents) {
      //...
      const castFn = getCastFn(event);
      actions.push({
        doAction: () => {
          castFn();
        },
      });
      //...
    }
    // Add to the timer queue
    timer.addActions(actions);
    // Start the timer to play the recording
    timer.start();
  },
  //...
}
The playback service uses the third-party state machine library @xstate/fsm to control the player states (playing, paused, live).
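To show what a state machine buys us here, below is a hand-rolled sketch of a transition table for the three states. It does not use the @xstate/fsm API, and the event names (PLAY, PAUSE, END, TO_LIVE) and the exact transitions are my own assumptions for illustration.

```javascript
// Hypothetical sketch: a transition table for the three player states.
const transitions = {
  paused: { PLAY: 'playing', TO_LIVE: 'live' },
  playing: { PAUSE: 'paused', END: 'paused', TO_LIVE: 'live' },
  live: { PLAY: 'playing' },
};

function createPlayerMachine(initial = 'paused') {
  let state = initial;
  return {
    get state() {
      return state;
    },
    // Events that are not valid in the current state are ignored.
    send(event) {
      const next = transitions[state][event];
      if (next) {
        state = next;
      }
      return state;
    },
  };
}
```

Keeping the allowed transitions in one table avoids scattered flags such as isPlaying or isLive and makes invalid state changes impossible.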
timer.ts is in the same directory. Its core is a timer built on requestAnimationFrame that plays back the snapshots. The snapshot actions to be played are stored in a queue, and start schedules a check function that keeps rescheduling itself and calls action.doAction to restore the snapshot for the corresponding point in time.
export class Timer {
  // Add to the queue
  public addActions(actions: actionWithDelay[]) {
    this.actions = this.actions.concat(actions);
  }
  // Play the queue
  public start() {
    const { actions } = this;
    const self = this;
    function check() {
      // ...
      // Call doAction (that is, castFn) for the queued actions
      while (actions.length) {
        const action = actions[0];
        actions.shift();
        // doAction replays the snapshot; different snapshot types do different things
        action.doAction();
      }
      if (actions.length > 0 || self.liveMode) {
        self.raf = requestAnimationFrame(check);
      }
    }
    this.raf = requestAnimationFrame(check);
  }
}
doAction performs different actions for different types of snapshots. In the playback service, doAction ends up calling the function returned by getCastFn, which switches on the event type:
private getCastFn(event: eventWithTime, isSync = false) {
  let castFn;
  switch (event.type) {
    case EventType.DomContentLoaded: // DOM loaded and parsed
      //...
      break;
    case EventType.FullSnapshot: // full snapshot
      //...
      break;
    case EventType.IncrementalSnapshot: // incremental snapshot
      castFn = () => {
        this.applyIncremental(event, isSync);
      };
      break;
  }
  return castFn;
}
The applyIncremental function handles each kind of incremental snapshot differently, including DOM changes, mouse interaction, page scrolling, and so on. Taking a DOM incremental snapshot as an example, it ends up in applyMutation:
private applyIncremental(/* ... */) {
  switch (d.source) {
    case IncrementalSource.Mutation: {
      this.applyMutation(d, isSync); // DOM change
      break;
    }
    case IncrementalSource.MouseMove: // mouse movement
    case IncrementalSource.MouseInteraction: // mouse click events
    //...
  }
}
applyMutation is the place where the DOM restore operation is finally performed, including the addition, deletion and modification steps of DOM:
private applyMutation(d: mutationData, useVirtualParent: boolean) {
  d.removes.forEach((mutation) => {
    //... remove DOM nodes
  });
  const appendNode = (mutation: addedNodeMutation) => {
    // Add the DOM node to its target parent
  };
  d.adds.forEach((mutation) => {
    // add
    appendNode(mutation);
  });
  d.texts.forEach((mutation) => {
    //... text processing
  });
  d.attributes.forEach((mutation) => {
    //... attribute processing
  });
}
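The four steps can be tried out on a tree of plain objects. The sketch below is a simplified illustration, not the rrweb implementation: applyMutationToTree is a hypothetical name, mirror is assumed to be a Map from id to node, and the shapes of the removes, adds, texts, and attributes entries are my assumptions modeled on the field names above.

```javascript
// Hypothetical sketch: apply one incremental mutation to a tree of plain objects.
// mirror maps id -> node; each node is { id, attributes, textContent, childNodes }.
function applyMutationToTree(mirror, mutation) {
  for (const { parentId, id } of mutation.removes || []) {
    const parent = mirror.get(parentId);
    parent.childNodes = parent.childNodes.filter((child) => child.id !== id);
    mirror.delete(id); // descendants are not cleaned up in this sketch
  }
  for (const { parentId, node } of mutation.adds || []) {
    mirror.get(parentId).childNodes.push(node);
    mirror.set(node.id, node);
  }
  for (const { id, value } of mutation.texts || []) {
    mirror.get(id).textContent = value;
  }
  for (const { id, attributes } of mutation.attributes || []) {
    const target = mirror.get(id);
    for (const [name, value] of Object.entries(attributes)) {
      if (value === null) {
        delete target.attributes[name]; // null is taken to mean "attribute removed"
      } else {
        target.attributes[name] = value;
      }
    }
  }
}
```

Every entry addresses its node by id, which is why the ids assigned during serialization have to stay stable between recording and playback.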
The above is the key code path for playback. rrweb does more than this: it also covers data compression, mobile handling, privacy, and other details. If you are interested, you can read the source code yourself.
Final thoughts
This approach to recording and playback is really worth learning from, and reading the rrweb source code taught me a lot. The data structures used in the source, such as doubly linked lists, queues, and trees, are also worth studying.
Reference articles
- rrweb-io/rrweb
- rrweb: open the black box of web page recording and playback