Node.js Handbook - How `require()` Actually Works
Recently I’ve noticed a lack of resources on advanced Node.js topics. There are plenty of guides and tutorials for getting started, but very little is written on maintainable design or scalable architecture. This post is a part of The Node.js Handbook, a series created to address this gap by sharing tribal knowledge and best practices. You can read more here.
Almost any Node.js developer can tell you what the require() function does, but how many of us actually know how it works? We use it every day to load libraries and modules, but its behavior otherwise is a mystery.
Curious, I dug into Node core to find out what was happening under the hood. But instead of finding a single function, I ended up at the heart of Node’s module system: module.js. The file contains a surprisingly powerful yet relatively unknown core module that controls the loading, compiling, and caching of every file used. require(), it turned out, was just the tip of the iceberg.
module.js
The Module type found in module.js has two main roles inside of Node.js. First, it provides a foundation for all Node.js modules to build on. Each file is given a new instance of this base module on load, which persists even after the file has run. This is why we are able to attach properties to module.exports and return them later as needed.
The module’s second big job is to handle Node’s module loading mechanism. The stand-alone require function that we use is actually an abstraction over module.require, which is itself just a simple wrapper around Module._load. This load method handles the actual loading of each file, and is where we’ll begin our journey.
Module._load
Module._load is responsible for loading new modules and managing the module cache. Caching each module on load reduces the number of redundant file reads and can speed up your application significantly. In addition, sharing module instances allows for singleton-like modules that can keep state across a project.
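The singleton-like behavior is easy to observe from userland. The sketch below writes a throwaway counter module to a temporary directory (the directory prefix and file name are hypothetical) and requires it twice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module to a temp directory (names are hypothetical).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-cache-'));
const counterPath = path.join(dir, 'counter.js');
fs.writeFileSync(
  counterPath,
  'let count = 0;\nmodule.exports = { next: () => ++count };\n'
);

const first = require(counterPath);
const second = require(counterPath);

// The second require() is served from the cache, so both names
// point at the same exports object and share its state.
const sameInstance = first === second;
first.next();                      // count is now 1
const sharedCount = second.next(); // count is now 2

// The cache is keyed by the resolved filename.
const isCached = require.resolve(counterPath) in require.cache;

console.log(sameInstance, sharedCount, isCached);

// Clean up the temp directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Because the file body runs only once, count lives in a single closure that every caller shares.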
If a module doesn’t exist in the cache, Module._load will create a new base module for that file. It will then tell the module to read in the new file’s contents before sending them to module._compile.[1]
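The flow described above can be approximated in a few lines. This is a toy sketch, not Node's implementation: toyLoad, toyCache, and the two-argument wrapper are all invented for illustration, and dropping a module from the cache when it throws is a choice made for this sketch.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const vm = require('vm');

// Toy cache keyed by filename (a stand-in for the real module cache).
const toyCache = new Map();

function toyLoad(filename) {
  // Cache hit: hand back the exports of the existing module object.
  const cached = toyCache.get(filename);
  if (cached) return cached.exports;

  // Cache miss: create a bare module object and cache it before running it.
  const toyModule = { id: filename, exports: {}, loaded: false };
  toyCache.set(filename, toyModule);

  try {
    // Read the file, wrap it in a function, and run it (the "compile" step).
    const source = fs.readFileSync(filename, 'utf8');
    const wrapper = vm.runInThisContext(
      '(function (exports, module) {\n' + source + '\n})',
      { filename }
    );
    wrapper.call(toyModule.exports, toyModule.exports, toyModule);
    toyModule.loaded = true;
  } catch (err) {
    // Sketch-level choice: do not keep a half-initialised module around.
    toyCache.delete(filename);
    throw err;
  }
  return toyModule.exports;
}

// Usage: load a throwaway file twice (names are hypothetical).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'toy-load-'));
const file = path.join(dir, 'config.js');
fs.writeFileSync(file, 'module.exports = { loadedAt: Date.now() };\n');

const firstLoad = toyLoad(file);
const secondLoad = toyLoad(file);
console.log(firstLoad === secondLoad, toyCache.size);

fs.rmSync(dir, { recursive: true, force: true });
```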
Notice that the last thing Module._load does is return module.exports to the caller. This is why you use exports and module.exports when defining your public interface: that object is exactly what Module._load, and in turn require, will return. I was surprised that there wasn’t more magic going on here, but if anything that’s for the better.
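A practical consequence is that whatever module.exports points to when the file finishes is what the caller receives. The sketch below uses two throwaway files (a.js and b.js, both hypothetical) to show the difference between adding to exports and replacing module.exports.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

// Module A only adds properties to the `exports` alias.
const aPath = path.join(dir, 'a.js');
fs.writeFileSync(aPath, "exports.name = 'a';\n");

// Module B also replaces module.exports with a new object,
// so the property set on the old `exports` object is never returned.
const bPath = path.join(dir, 'b.js');
fs.writeFileSync(
  bPath,
  "exports.name = 'ignored';\nmodule.exports = { name: 'b' };\n"
);

const a = require(aPath);
const b = require(bPath);

console.log(a.name, b.name);

fs.rmSync(dir, { recursive: true, force: true });
```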
module._compile
This is where the real magic happens. First, a special standalone require function is created for that module. THIS is the require function that we are all familiar with. While the function itself is just a wrapper around module.require, it also contains some lesser-known helper properties and methods for us to use:
require(): Loads an external module
require.resolve(): Resolves a module name to its absolute path
require.main: The main module
require.cache: All cached modules
require.extensions: Available compilation methods for each valid file type, based on its extension
Once require is ready, the entire loaded source code is wrapped in a new function, which takes in require, module, exports, and all other exposed variables as arguments. This creates a new function scope just for that module, so its top-level variables do not pollute the rest of the Node.js environment.
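To make the wrapping step concrete, here is a minimal sketch that does by hand what the paragraph above describes, using the public vm module. The source string, the toyModule object, and the /virtual paths are invented for the example; the parameter order follows the module wrapper shown in Node's documentation.

```javascript
const vm = require('vm');

// Hypothetical module source, as it might be read from disk.
const source = 'var secret = 42; module.exports = { double: (n) => n * 2 };';

// Wrap the source in a function so its top-level variables stay local.
const wrapped =
  '(function (exports, require, module, __filename, __dirname) {\n' +
  source +
  '\n})';

// Compile the wrapper; nothing from the module body has run yet.
const compiled = vm.runInThisContext(wrapped, { filename: 'toy-module.js' });

// Run it with a stand-in module object and made-up path values.
const toyModule = { exports: {} };
compiled.call(
  toyModule.exports,
  toyModule.exports,
  require,
  toyModule,
  '/virtual/toy-module.js',
  '/virtual'
);

const doubled = toyModule.exports.double(21);

// `secret` was declared inside the wrapper, so it never reached this scope.
const leaked = typeof secret;

console.log(doubled, leaked);
```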
Finally, the function wrapping the module is run. The entire module._compile method is executed synchronously, so the original call to Module._load just waits for this code to run before finishing up and returning module.exports back to the user.
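The synchronous behavior shows up in the order of side effects. In this sketch a throwaway module pushes onto a shared array while it loads; the global name __loadOrder and the file name are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sync-require-'));
const file = path.join(dir, 'records-itself.js');
fs.writeFileSync(
  file,
  "global.__loadOrder.push('inside module');\nmodule.exports = 'done';\n"
);

// Shared log that the module body appends to while it runs.
global.__loadOrder = ['before require'];

// require() does not return until the whole module body has executed.
const result = require(file);

global.__loadOrder.push('after require');

console.log(global.__loadOrder.join(' -> '), result);

fs.rmSync(dir, { recursive: true, force: true });
```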
Conclusion
And so we’ve reached the end of the require code path, and in doing so have come full circle by creating the very require function that we had begun investigating in the first place.
If you’ve made it all this way, then you’re ready for the final secret: require('module'). That’s right, the module system itself can be loaded via the module system. INCEPTION. This may sound strange, but it lets userland modules interact with the loading system without digging into Node.js core. Popular modules like mockery and rewire are built on top of this.[2]
If you want to learn more, check out the module.js source code for yourself. There is plenty more there to keep you busy and blow your mind. Bonus points for the first person who can tell me what ‘NODE_MODULE_CONTEXTS’ is and why it was added.
[1] The module._compile method is only used for running JavaScript files. JSON files are simply parsed and returned via JSON.parse().
[2] However, both of these modules are built on private Module methods, like Module._resolveLookupPaths and Module._findPath. You could argue that this isn’t much better…