This article was first published on the blog of the Zhengcaiyun front-end team (ZooTeam): Exploration of Node.js Module System Source Code
Node.js lets front-end engineers work on the server side as well. A new runtime environment naturally brings new modules, features, and ideas with it. This article walks through the design of the Node.js (hereinafter Node) module system and analyzes some of its core source code.
CommonJS specification
Node initially built its module system on the CommonJS specification, with some customizations that depart from it. CommonJS is a module format defined to solve JavaScript's scoping problem: it lets each module execute in its own namespace.
The specification requires that modules export variables or functions through module.exports and import the output of other modules into the current module scope through require(), following these conventions:
- In a module, there must be a free variable require, which is a function that accepts a module identifier and returns the exported API of the requested module. require must throw an error if the requested module cannot be returned.
- In a module, there must be a free variable exports, which is an object that the module can add properties to as it executes. The module must use the exports object as its only means of exporting.
- In a module, there must be a free variable module, which is also an object. The module object must have an id property that is the top-level id of the module. The id property must be such that require(module.id) returns the exports object of the module from which module.id originated (that is, module.id can be passed to another module, and requiring it must return the original module).
Node's implementation of the CommonJS specification
- Node defines a module.require function on each module, together with a require function that is available in every module's scope, to load modules.
- In the Node module system, each file is treated as a separate module. When a module is loaded, it is initialized as an instance of the Module object, whose basic implementation and properties are as follows:
function Module(id = "", parent) {
  // Module id, usually the absolute path of the module
  this.id = id;
  this.path = path.dirname(id);
  this.exports = {};
  // The caller of the current module
  this.parent = parent;
  updateChildren(parent, this, false);
  this.filename = null;
  // Whether the module has finished loading
  this.loaded = false;
  // Modules referenced by the current module
  this.children = [];
}
- Each module exposes its own exports property as a usage interface.
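The properties listed above can be observed from user code. The following is a minimal sketch, assuming a throwaway file named inspect-me.js (a hypothetical name) written to a temporary directory. The file records what it sees on its own module object while it is executing, and the script then compares that with the cached Module instance after require returns.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch directory and file name, used only for this demo.
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'module-props-')));
const file = path.join(dir, 'inspect-me.js');

// The module reports fields of its own Module instance while it is still executing.
fs.writeFileSync(
  file,
  'module.exports = { id: module.id, dir: module.path, loadedWhileRunning: module.loaded };'
);

const snapshot = require(file);
const instance = require.cache[file]; // the Module instance created for the file

console.log(snapshot.id === file);          // the id is the absolute path of the file
console.log(snapshot.dir === dir);          // path is the directory of the file
console.log(snapshot.loadedWhileRunning);   // loaded is still false while the body runs
console.log(instance.loaded);               // loaded is true once require has returned
console.log(instance.exports === snapshot); // the instance owns the exported object

// Remove the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

The loaded flag flips only after the file body has finished, which is why the module itself sees false.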
Module Export and Reference
In Node, you can export a variable or function as a whole using the module.exports object, or you can mount the variable or function you want to export onto the properties of the exports object, as shown in the following code:
// 1. Using exports: typically used to export utility functions or constants
exports.name = 'xiaoxiang';
exports.add = (a, b) => a + b;

// 2. Using module.exports: export an entire object or a single function
...
module.exports = { add, minus }
To reference a module, call require (available in every module's scope) with a module name, a relative path, or an absolute path. When the module file's suffix is .js, .json, or .node, the suffix can be omitted, as shown in the following code:
// Reference modules
const { add, minus } = require('./module');
const a = require('/usr/app/module');
const http = require('http');
Points to note:
- The exports variable is available within the file-level scope of the module and is assigned the value of module.exports before the module executes.
exports.name = 'test';
console.log(module.exports.name); // test
module.exports.name = 'test';
console.log(exports.name); // test
- If a new value is assigned to exports, it is no longer bound to module.exports, and vice versa:
exports = { name: 'test' };
console.log(module.exports.name, exports.name); // undefined, test
- When module.exports is completely replaced by a new object, exports usually needs to be reassigned as well:
module.exports = exports = { name: 'test' };
console.log(module.exports.name, exports.name) // test, test
Module System Implementation Analysis
Module Location
The following is the code implementation of the require function:
// Entry point of require
Module.prototype.require = function(id) {
  //...
  requireDepth++;
  try {
    return Module._load(id, this, /* isMain */ false); // Load the module
  } finally {
    requireDepth--;
  }
};
The code above receives a module identifier; requireDepth records the depth of nested module loading. The static method Module._load implements the main logic of module loading in Node. Let's walk through the source of Module._load. For ease of understanding, comments have been added to the code.
Module._load = function(request, parent, isMain) {
  // Step 1: Resolve the full path of the module
  const filename = Module._resolveFilename(request, parent, isMain);

  // Step 2: Load the module, which is handled in three scenarios
  // Scenario 1: The module is cached; return its exports property directly
  const cachedModule = Module._cache[filename];
  if (cachedModule !== undefined) return cachedModule.exports;

  // Scenario 2: The module is a built-in module
  const mod = loadNativeModule(filename, request);
  if (mod && mod.canBeRequiredByUsers) return mod.exports;

  // Scenario 3: Create a new module instance for a file
  const module = new Module(filename, parent);
  // Cache the module instance (note that this happens before its file is loaded)
  Module._cache[filename] = module;

  // Step 3: Load the module file
  module.load(filename);

  // Step 4: Return the exported object
  return module.exports;
};
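The caching behavior of Module._load can be observed from user code through require.cache. This is a small sketch, assuming a scratch file named counter.js (a hypothetical name) and a global counter introduced purely for the demo. It requires the same file twice, evicts the cache entry, and requires it again.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file; it counts how many times its body has executed.
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'module-cache-')));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(
  file,
  'global.__demoLoadCount = (global.__demoLoadCount || 0) + 1;\n' +
  'module.exports = { loads: global.__demoLoadCount };'
);

const first = require(file);   // cache miss: the file is read and executed
const second = require(file);  // cache hit: the same exports object comes back
delete require.cache[file];    // evict the cached Module instance
const third = require(file);   // cache miss again: the file body runs a second time

console.log(first === second); // true
console.log(third.loads);      // 2

// Clean up the demo state.
delete global.__demoLoadCount;
fs.rmSync(dir, { recursive: true, force: true });
```

Because the cache is keyed by the resolved file name, two different request strings that resolve to the same file share one module instance.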
Load Policy
There is a lot of information in the code above, so let's focus on the following issues:
- What is the module's caching strategy?
Analyzing the code above, we can see that the _load function applies a different loading strategy in each of three scenarios:
- Scenario 1: Cache hit. The cached module's exports are returned directly.
- Scenario 2: Built-in module. Its exposed exports property, which is the alias of module.exports, is returned.
- Scenario 3: A module from a file or third-party code. A module instance is created, cached, and loaded, so that the next request for the same module is served from the cache instead of reloading it.
- How does Module._resolveFilename(request, parent, isMain) resolve the file name?
Let's look at the static method defined below:
Module._resolveFilename = function(request, parent, isMain, options) {
  if (NativeModule.canBeRequiredByUsers(request)) {
    // Built-in modules take priority
    return request;
  }
  let paths;
  // options.paths, passed through require.resolve, specifies the lookup paths
  if (typeof options === "object" && options !== null) {
    if (ArrayIsArray(options.paths)) {
      const isRelative =
        request.startsWith("./") ||
        request.startsWith("../") ||
        (isWindows && request.startsWith(".\\")) ||
        request.startsWith("..\\");
      if (isRelative) {
        paths = options.paths;
      } else {
        const fakeParent = new Module("", null);
        paths = [];
        for (let i = 0; i < options.paths.length; i++) {
          const path = options.paths[i];
          fakeParent.paths = Module._nodeModulePaths(path);
          const lookupPaths = Module._resolveLookupPaths(request, fakeParent);
          for (let j = 0; j < lookupPaths.length; j++) {
            if (!paths.includes(lookupPaths[j])) paths.push(lookupPaths[j]);
          }
        }
      }
    } else if (options.paths === undefined) {
      paths = Module._resolveLookupPaths(request, parent);
    } else {
      //...
    }
  } else {
    // Find the directories in which the module may exist
    paths = Module._resolveLookupPaths(request, parent);
  }
  // Find the module path from the request, the lookup paths, and whether it is the entry module
  const filename = Module._findPath(request, paths, isMain);
  if (!filename) {
    const requireStack = [];
    for (let cursor = parent; cursor; cursor = cursor.parent) {
      requireStack.push(cursor.filename || cursor.id);
    }
    // Module not found: throw an exception (a familiar error, isn't it?)
    let message = `Cannot find module '${request}'`;
    if (requireStack.length > 0) {
      message = message + "\nRequire stack:\n- " + requireStack.join("\n- ");
    }
    const err = new Error(message);
    err.code = "MODULE_NOT_FOUND";
    err.requireStack = requireStack;
    throw err;
  }
  // Finally, return the full path including the file name
  return filename;
};
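Both exits of _resolveFilename are visible through the public require.resolve function. The sketch below assumes that no module named ./no-such-module-for-this-demo exists next to the script (the name is made up for the example). It shows the built-in short circuit and the error thrown when resolution fails.

```javascript
// A built-in module short-circuits resolution: the request is returned unchanged.
const builtin = require.resolve('fs');

// A request that cannot be found ends in the MODULE_NOT_FOUND error built above.
let failure;
try {
  // Hypothetical module name, assumed not to exist.
  require.resolve('./no-such-module-for-this-demo');
} catch (err) {
  failure = err;
}

console.log(builtin);                        // fs
console.log(failure.code);                   // MODULE_NOT_FOUND
console.log(failure.message.split('\n')[0]); // first line of the message
```

Checking err.code rather than matching on the message text is the more robust way to detect a missing optional dependency.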
The key pieces of the code above are the _resolveLookupPaths and _findPath methods.
- _resolveLookupPaths: Takes the module name and the calling module, and returns the array of directories that _findPath will search.
// Builds the array of directories used to locate a module file
Module._resolveLookupPaths = function(request, parent) {
  if (NativeModule.canBeRequiredByUsers(request)) {
    debug("looking for %j in []", request);
    return null;
  }
  // If it is not a relative path
  if (
    request.charAt(0) !== "." ||
    (request.length > 1 &&
      request.charAt(1) !== "." &&
      request.charAt(1) !== "/" &&
      (!isWindows || request.charAt(1) !== "\\"))
  ) {
    /**
     * Check the node_modules folders.
     * modulePaths holds the user directories, the directories given by the
     * NODE_PATH environment variable, and the global Node installation directory
     */
    let paths = modulePaths;
    if (parent != null && parent.paths && parent.paths.length) {
      // The parent module's paths are searched first, followed by modulePaths
      paths = parent.paths.concat(paths);
    }
    return paths.length > 0 ? paths : null;
  }
  // In the REPL, look in ./, ./node_modules, and modulePaths in turn
  if (!parent || !parent.id || !parent.filename) {
    const mainPaths = ["."].concat(Module._nodeModulePaths("."), modulePaths);
    return mainPaths;
  }
  // For a relative request, search the directory of the parent module
  const parentDir = [path.dirname(parent.filename)];
  return parentDir;
};
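The code above relies on Module._nodeModulePaths, whose source is not shown in this article. As a rough illustration of the idea, here is a simplified, POSIX-only sketch written for this example (nodeModulePathsSketch is a hypothetical name, and this is not Node's actual implementation). It walks from a directory up to the root and collects a node_modules candidate at each level, skipping directories that are themselves named node_modules.

```javascript
const path = require('path');

// Simplified sketch: list node_modules candidates from `from` up to the root.
function nodeModulePathsSketch(from) {
  const paths = [];
  let dir = path.posix.resolve('/', from);
  for (;;) {
    // Never produce ".../node_modules/node_modules".
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return paths;
}

const fromProject = nodeModulePathsSketch('/home/user/project/src');
const fromPackage = nodeModulePathsSketch('/app/node_modules/pkg');

console.log(fromProject);
console.log(fromPackage);
```

The nearest directory comes first in the result, which is why a dependency installed closer to the requiring file shadows one installed higher up the tree.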
- _findPath: Finds and returns the matching file name for the requested module within the directories returned by the function above.
// Find the real path of the module from the request and the lookup paths, taking into account whether it is the entry module
Module._findPath = function(request, paths, isMain) {
  const absoluteRequest = path.isAbsolute(request);
  if (absoluteRequest) {
    // Absolute path: go directly to that specific module
    paths = [""];
  } else if (!paths || paths.length === 0) {
    return false;
  }
  const cacheKey =
    request + "\x00" + (paths.length === 1 ? paths[0] : paths.join("\x00"));
  // Check the path cache
  const entry = Module._pathCache[cacheKey];
  if (entry) return entry;
  let exts;
  let trailingSlash =
    request.length > 0 &&
    request.charCodeAt(request.length - 1) === CHAR_FORWARD_SLASH; // '/'
  if (!trailingSlash) {
    trailingSlash = /(?:^|\/)\.?\.$/.test(request);
  }
  // For each path
  for (let i = 0; i < paths.length; i++) {
    const curPath = paths[i];
    if (curPath && stat(curPath) < 1) continue;
    const basePath = resolveExports(curPath, request, absoluteRequest);
    let filename;
    const rc = stat(basePath);
    if (!trailingSlash) {
      if (rc === 0) {
        // stat returned 0: basePath is a file
        if (!isMain) {
          if (preserveSymlinks) {
            // Instructs the module loader to preserve symbolic links when resolving and caching modules
            filename = path.resolve(basePath);
          } else {
            // Do not preserve symbolic links
            filename = toRealPath(basePath);
          }
        } else if (preserveSymlinksMain) {
          filename = path.resolve(basePath);
        } else {
          filename = toRealPath(basePath);
        }
      }
      if (!filename) {
        if (exts === undefined) exts = ObjectKeys(Module._extensions);
        // Try the registered suffixes
        filename = tryExtensions(basePath, exts, isMain);
      }
    }
    if (!filename && rc === 1) {
      /**
       * stat returned 1 and no file name was found: basePath is a folder.
       * Try the file named by the main field of package.json in that directory;
       * if that does not exist, try index[.js, .json, .node]
       */
      if (exts === undefined) exts = ObjectKeys(Module._extensions);
      filename = tryPackage(basePath, exts, isMain, request);
    }
    if (filename) {
      // The file exists: add its name to the path cache
      Module._pathCache[cacheKey] = filename;
      return filename;
    }
  }
  const selfFilename = trySelf(paths, exts, isMain, trailingSlash, request);
  if (selfFilename) {
    // Cache the resolved path
    Module._pathCache[cacheKey] = selfFilename;
    return selfFilename;
  }
  return false;
};
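The file-then-suffix probing in _findPath can be approximated in a few lines. The sketch below is a simplified stand-in written for this example (probeSketch is a hypothetical helper; symlink handling, caching, and folders are left out). It is compared against the real require.resolve on a scratch directory that contains both config.js and config.json.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Simplified sketch: the exact file name wins, then each suffix is tried in order.
function probeSketch(basePath, exts) {
  const isFile = (p) => fs.existsSync(p) && fs.statSync(p).isFile();
  if (isFile(basePath)) return basePath;
  for (const ext of exts) {
    if (isFile(basePath + ext)) return basePath + ext;
  }
  return false;
}

// Hypothetical scratch directory with two files that differ only by suffix.
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'find-path-')));
fs.writeFileSync(path.join(dir, 'config.json'), '{"port": 8080}');
fs.writeFileSync(path.join(dir, 'config.js'), 'module.exports = { port: 3000 };');

const request = path.join(dir, 'config'); // suffix omitted on purpose
const sketchResult = probeSketch(request, ['.js', '.json', '.node']);
const nodeResult = require.resolve(request);

console.log(path.basename(sketchResult)); // config.js
console.log(sketchResult === nodeResult); // true

fs.rmSync(dir, { recursive: true, force: true });
```

Because .js is tried before .json, a request without a suffix never reaches config.json while config.js exists.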
Module Loading
Standard Module Processing
After reading the code above, we find that when the request resolves to a folder, the tryPackage function is executed. Here is a brief analysis of its implementation.
// Attempt to load a package (a module that is a folder)
function tryPackage(requestPath, exts, isMain, originalPath) {
  const pkg = readPackageMain(requestPath);
  if (!pkg) {
    // Without a main field in package.json, index is used as the default entry file
    return tryExtensions(path.resolve(requestPath, "index"), exts, isMain);
  }
  const filename = path.resolve(requestPath, pkg);
  let actual =
    tryFile(filename, isMain) ||
    tryExtensions(filename, exts, isMain) ||
    tryExtensions(path.resolve(filename, "index"), exts, isMain);
  //...
  return actual;
}

// Read the main field in package.json
function readPackageMain(requestPath) {
  const pkg = readPackage(requestPath);
  return pkg ? pkg.main : undefined;
}
The readPackage function is responsible for reading and parsing the contents of the package.json file, as described below:
function readPackage(requestPath) {
  const jsonPath = path.resolve(requestPath, "package.json");
  const existing = packageJsonCache.get(jsonPath);
  if (existing !== undefined) return existing;
  // Calls into libuv's uv_fs_open logic to read the package.json file
  const json = internalModuleReadJSON(path.toNamespacedPath(jsonPath));
  if (json === undefined) {
    // No package.json: cache the miss
    packageJsonCache.set(jsonPath, false);
    return false;
  }
  //...
  try {
    const parsed = JSONParse(json);
    const filtered = {
      name: parsed.name,
      main: parsed.main,
      exports: parsed.exports,
      type: parsed.type
    };
    packageJsonCache.set(jsonPath, filtered);
    return filtered;
  } catch (e) {
    //...
  }
}
Together, the two sections of code above explain the role of the package.json file: its main field configures the module's entry point, and index is the default entry file when no main field is given (the original post illustrates this with a diagram).
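Both branches of tryPackage can be exercised with two scratch folders. In this sketch the folder names with-main and index-only, and the entry path lib/entry.js, are made up for the example: one folder declares its entry through the main field, the other has no package.json and falls back to index.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'try-package-')));

// Folder 1 (hypothetical): package.json points "main" at lib/entry.js.
fs.mkdirSync(path.join(root, 'with-main', 'lib'), { recursive: true });
fs.writeFileSync(
  path.join(root, 'with-main', 'package.json'),
  JSON.stringify({ main: 'lib/entry.js' })
);
fs.writeFileSync(
  path.join(root, 'with-main', 'lib', 'entry.js'),
  "module.exports = 'from main';"
);

// Folder 2 (hypothetical): no package.json, so index.js is the entry file.
fs.mkdirSync(path.join(root, 'index-only'));
fs.writeFileSync(
  path.join(root, 'index-only', 'index.js'),
  "module.exports = 'from index';"
);

// Requiring a folder triggers the package lookup described above.
const fromMain = require(path.join(root, 'with-main'));
const fromIndex = require(path.join(root, 'index-only'));

console.log(fromMain);  // from main
console.log(fromIndex); // from index

fs.rmSync(root, { recursive: true, force: true });
```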
Module File Processing
Once the module has been located, how is it loaded and parsed? The following code shows the process:
Module.prototype.load = function(filename) {
  // Make sure the module has not already been loaded
  assert(!this.loaded);
  this.filename = filename;
  // Compute the node_modules lookup paths, starting from the module's own folder
  this.paths = Module._nodeModulePaths(path.dirname(filename));
  const extension = findLongestRegisteredExtension(filename);
  //...
  // Run the handler registered for the file suffix, such as js / json / node
  Module._extensions[extension](this, filename);
  // Mark the module as loaded
  this.loaded = true;
  // ...ESM support omitted
};
Suffix handling
As you can see, Node handles each file suffix differently. Here is a brief analysis of .js, .json, and .node.
- .js: a .js file is read mainly through Node's built-in API fs.readFileSync.
Module._extensions[".js"] = function(module, filename) {
  // Read the file content
  const content = fs.readFileSync(filename, "utf8");
  // Compile and execute the code
  module._compile(content, filename);
};
- .json: the handling of a .json file is relatively simple. After the file's contents are read, the result is obtained by calling JSONParse.
Module._extensions[".json"] = function(module, filename) {
  // Read the file directly as utf-8
  const content = fs.readFileSync(filename, "utf8");
  //...
  try {
    // Export the file contents as a parsed JSON object
    module.exports = JSONParse(stripBOM(content));
  } catch (err) {
    //...
  }
};
- .node: a .node file is a native module implemented in C/C++. It is read by the process.dlopen function, which calls the DLOpen function in the C++ code; DLOpen in turn calls uv_dlopen, which loads the .node file much as the operating system loads a system library.
Module._extensions[".node"] = function(module, filename) {
  //...
  return process.dlopen(module, path.toNamespacedPath(filename));
};
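The dispatch by suffix can be seen from user code as well. The sketch below assumes two scratch files, data.json and code.js (hypothetical names), and reads the registered handlers through require.extensions, which Node documents as deprecated but which still exposes the same table as Module._extensions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch files, one per handler that can be shown without native code.
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'extensions-')));
fs.writeFileSync(path.join(dir, 'data.json'), '{"name": "xiaoxiang"}');
fs.writeFileSync(path.join(dir, 'code.js'), 'exports.add = (a, b) => a + b;');

const data = require(path.join(dir, 'data.json')); // parsed by the .json handler
const code = require(path.join(dir, 'code.js'));   // compiled and run by the .js handler

// The suffixes that currently have a handler registered.
const registered = Object.keys(require.extensions);

console.log(data.name);       // xiaoxiang
console.log(code.add(1, 2));  // 3
console.log(registered);

fs.rmSync(dir, { recursive: true, force: true });
```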
From the three handlers above, we can see that only the .js handler ends by calling the instance method _compile. Let's strip out some experimental features and debugging-related logic and briefly analyze this code.
Compile Execution
When the module is loaded, Node builds the running sandbox using the methods provided by the V8 engine and executes the function code as follows:
Module.prototype._compile = function(content, filename) {
  let moduleURL;
  let redirects;
  // Inject the module-level variables __dirname / __filename / module / exports / require and compile the function
  const compiledWrapper = wrapSafe(filename, content, this);
  const dirname = path.dirname(filename);
  const require = makeRequireFunction(this, redirects);
  let result;
  const exports = this.exports;
  const thisValue = exports;
  const module = this;
  if (requireDepth === 0) statCache = new Map();
  //...
  // Execute the module's function
  result = compiledWrapper.call(
    thisValue,
    exports,
    require,
    module,
    filename,
    dirname
  );
  hasLoadedAnyUserCJSModule = true;
  if (requireDepth === 0) statCache = null;
  return result;
};

// Core logic for injecting variables
function wrapSafe(filename, content, cjsModuleInstance) {
  if (patched) {
    const wrapper = Module.wrap(content);
    // Run in the vm sandbox and return the result directly
    // (C++ binding: env->SetProtoMethod(script_tmpl, 'runInThisContext', RunInThisContext);)
    return vm.runInThisContext(wrapper, {
      filename,
      lineOffset: 0,
      displayErrors: true,
      // Dynamic import
      importModuleDynamically: async specifier => {
        const loader = asyncESM.ESMLoader;
        return loader.import(specifier, normalizeReferrerURL(filename));
      }
    });
  }
  let compiled;
  try {
    compiled = compileFunction(
      content,
      filename,
      0,
      0,
      undefined,
      false,
      undefined,
      [],
      ["exports", "require", "module", "__filename", "__dirname"]
    );
  } catch (err) {
    //...
  }
  const { callbackMap } = internalBinding("module_wrap");
  callbackMap.set(compiled.cacheKey, {
    importModuleDynamically: async specifier => {
      const loader = asyncESM.ESMLoader;
      return loader.import(specifier, normalizeReferrerURL(filename));
    }
  });
  return compiled.function;
}
In the code above, the _compile function calls wrapSafe, which injects the module-level variables __dirname / __filename / module / exports / require and returns the compiled wrapper function. When the wrapper has been patched, this goes through the runInThisContext method (implemented in C++ in the src/node_contextify.cc file), which builds a sandbox environment for the module code to run in; otherwise it goes through compileFunction. Finally, the module is run through the compiledWrapper.call method.
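The patched branch of wrapSafe can be reproduced with public APIs: Module.wrap and vm.runInThisContext. The following is a minimal sketch of that path only, not of the compileFunction path. The source string, the fakeModule object, and the /virtual paths are made up for the example.

```javascript
const vm = require('vm');
const Module = require('module');

// Hypothetical module source; it uses the injected variables like any CommonJS file.
const source =
  'exports.answer = 6 * 7;' +
  'exports.dir = __dirname;' +
  'exports.thisIsExports = this === exports;';

// Module.wrap surrounds the source with
// (function (exports, require, module, __filename, __dirname) { ... })
const wrapped = Module.wrap(source);

// Evaluating the wrapper text yields the function itself; nothing has run yet.
const compiledWrapper = vm.runInThisContext(wrapped, { filename: 'demo.js' });

// Stand-in for the Module instance that _compile would pass in.
const fakeModule = { exports: {} };
compiledWrapper.call(
  fakeModule.exports,  // thisValue
  fakeModule.exports,  // exports
  require,             // require
  fakeModule,          // module
  '/virtual/demo.js',  // __filename
  '/virtual'           // __dirname
);

console.log(typeof compiledWrapper); // function
console.log(fakeModule.exports);
```

Since the five names are plain function parameters, they are local to each module rather than true globals, which is what keeps one module's exports separate from another's.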
Epilogue
At this point, our analysis of the Node.js module system has come to an end. The world of Node.js is rich and fascinating, and there is always more to learn.