Node.js inside-out - Modules API rediscovery...
Since its first release in 2011, Node.js has greatly changed, or should I say revolutionized, JavaScript development and its use-cases. Being able to write code in their favorite language and run it on the server-side, many web developers quickly noticed the huge potential of the project. Fast forward to 2019 and Node.js is one of the most beloved and used runtimes in the entire programming market. It’s fast (thanks to V8), reliable and easy to use. And, with its own NPM package manager 📦, it has the biggest collection of open source libraries and tools in the world! These facts alone say something about how popular Node.js has become. 🚀
For the reasons above, in this series titled “Node.js inside-out” we’re going to explore everything about the project. We’ll talk about what Node.js actually is and take an in-depth look at the APIs it provides. With the rise of NPM and the number of Node.js frameworks, many developers prefer those over the lower-level stuff that Node.js itself provides. 👏 Don’t get me wrong - it’s fine to use various tools that make your development more enjoyable. It’s just that sometimes, when you need to squeeze out some additional performance or want to know what’s going on under the hood, it’s good to get back to the basics. Sadly, many people omit this step when starting with Node.js (unlike in web development, where JS, HTML and CSS are standard milestones for beginners), going straight to using different frameworks without really understanding them or their true potential.
With this intro finally behind us, as I said, I’d like this series to provide an alternative, more beginner-friendly, super-in-depth introduction to Node.js in a slightly more… acceptable way. 😉 So, I hope you’ll enjoy it and learn something new!
What exactly is Node.js?
For a start - a bit of theory… but not really a boring one. 😃 Node.js itself is a runtime environment for JavaScript. It’s open-source and cross-platform. Its development started in 2009, with the first official release in 2011. The idea behind it was simple - to allow JS to run in environments other than the browser. It’s nothing that hadn’t been done before, just never with such big success. Today its development is overseen by the Node.js Foundation with additional help from a large number of contributors. It’s used by many big names in the industry and doesn’t seem to stop evolving and improving with time.
As a runtime environment, Node.js is powered by V8 🔋 - an open-source JS engine (one of the fastest on the market), originally created by Google. Thus, it executes JS much like any Chromium-based browser does. The code is run in an event loop, on a single thread. The asynchronous I/O allows it to take care of multiple concurrent operations. ⚡ This approach has its downsides, but they’re related to JS in general.
Node.js also provides a lot of APIs for developers to use. They allow accessing features that aren’t possible through everyday browsers. They’re provided in the form of modules, as a standard way to handle core functionalities. Their features vary greatly - from file system access and cryptography to C++ add-ons, child processes, and V8 access. We’ll explore each of these later on in the series. 👍
With the rapid development of Node.js, more and more interesting tools have appeared. With its robust architecture, you can create server-side code, CLI tools and real-time applications, which in turn means the likes of games, social media and others! Of course, it’s all possible in combination with the client-side, which can be written, utilizing your current knowledge, in JS too! But I don’t only mean browsers! Based on, inspired by or built with Node.js, tools like Electron or NativeScript came to exist. Basically, what they do is allow you to create fully native desktop or mobile applications… with JavaScript! And, IMHO, that was the key to the success of Node.js - one single language to rule ‘em all! ✊
Node.js APIs
I’d like to dedicate the rest of this series to exploring different APIs of Node.js. We’ll start with the (ECMAScript) Modules API. I think that’s a good-enough choice for the beginning. 😃 Modules are used almost everywhere, but you could be surprised by how many (possibly) unknown features they possess in Node.js. But, if you think this is too easy, then fear not! We’ll explore more advanced APIs in future posts! And, as a side-note - I’m using Node.js v10.15.3, the latest LTS version at the time of writing, for the rest of this tutorial. Now, let’s get started! 🎉
Modules
Probably almost all of today’s web developers use some kind of module system to better organize their code. The most popular options are ES modules (the newest standard) and the CommonJS format (the one used in Node.js). But there’s a little more to the Node.js module system than just importing and exporting stuff. 😅 And that’s what we’re going to explore!
CommonJS
Let’s first recall the rules of the CommonJS (CJS) module format - the main one in Node.js. In Node.js, unlike in a typical front-end TypeScript or Babel-based workflow, modules are real things. Your imports and exports are resolved at runtime - not at any kind of transpilation step. You’re basically getting a real module system. 😮 This, naturally, has its pros as well as cons. But transpilation is still a nice option to have (especially when, e.g., doing micro-optimizations and not wanting to resolve modules at runtime) - you can easily use Babel or any other tool you want, anytime, anyhow! 😉
I guess many people refer to CJS as the one with the require() syntax. That’s because this particular function is probably the most recognizable symbol of this format.
Import / export
For exporting, you can assign your value to the special module.exports property, or to its respective properties when dealing with objects. For the second purpose, you can also use the exports object - a quick shortcut. Just don’t mix the two up when assigning single values - exports won’t work with stuff like that! That’s because exports is, in fact, just a reference to module.exports, which defaults to an empty object.
// module1.js
exports.numberValue = 10;
exports.stringValue = 'str';
// module2.js
module.exports = () => {
// code
}
Notice that the arrow function ➡ syntax (and many other ES-Next features) is natively supported by Node.js (just like in V8-based browsers).
Imports can be done with the well-known require() syntax:
// the leading ./ matters - without it, Node.js would look in node_modules
const module1 = require('./module1.js');
const module2 = require('./module2.js');
module1.numberValue; // 10
module1.stringValue; // 'str'
module2();
I think it’s a well-known fact that the syntax above can be freely used to import core Node.js modules (like fs or path), modules located at relative paths (./), modules from the node_modules directory, and also the global ones. You can also feel free to omit the .js, .json or .node (for native add-ons) file extensions, or use index.js files as folders’ main files, etc. Just the usual stuff related to JS modules. Most of the time, it goes unnoticed… ⚡
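Here’s a small, hedged sketch of those resolution rules in action. The file names (config.json, lib/index.js) and the temporary directory are made up just for this demo:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway directory with a JSON file and a folder with index.js.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cjs-demo-'));
fs.writeFileSync(path.join(dir, 'config.json'), JSON.stringify({ port: 8080 }));
fs.mkdirSync(path.join(dir, 'lib'));
fs.writeFileSync(
  path.join(dir, 'lib', 'index.js'),
  "module.exports = 'from lib/index.js';"
);

// Extension omitted: Node.js tries .js, then .json, then .node.
const config = require(path.join(dir, 'config'));

// Folder given: Node.js falls back to its index.js file.
const lib = require(path.join(dir, 'lib'));

console.log(config.port); // 8080
console.log(lib); // 'from lib/index.js'
```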
Wrappers & globals
Everything above is just pure basics. You can easily go and use that without any further understanding. But, in this series, we’re digging deep! And so, we want to know what require(), module and exports really are.
Before execution, code from each imported module is put inside a wrapper function 🌯, looking something like this:
(function (exports, require, module, __filename, __dirname) {
  // module code
});
This is a very important concept to understand, and that’s for 2 main reasons:
- All what-seems-like global variables and other user-defined variables at the top scope of different modules are kept in a limited, module-only scope. You have to use module.exports/exports to actually output something to the outer world. 📤
- This perfectly shows us where our require() function and module object actually come from. It also hides the function wrapper from developers, presenting its parameters in the nice form of what-seems-like globals. 👍
With that said, I think it’s a perfect time to explore what parameters of our top wrapper really do:
- exports - just a reference to module.exports (as said before);
- require() - function used to import modules. It has some additional properties of its own:
  - cache - object where all loaded modules are cached (more on that later);
  - main - reference to a Module object representing the entry module;
  - resolve() - returns the exact filename (complete path) of the file that the module would be imported from when using require() with the same argument:
    - paths() - returns an array of paths searched through when locating the provided module;
- module - a reference to the object (Module instance) representing the current module:
  - children - an array of modules first imported in the given module;
  - exports - an object used to export values from the given module;
  - filename - absolute path to the given module;
  - id - identifier for the given module. Usually equal to the filename (a notable exception being the entry module, whose id is '.');
  - loaded - indicates whether the module has finished loading. Especially important with cycles, where a module can be required from several places before it has fully executed;
  - parent - reference to the module that loaded the given module first;
  - paths - an array of paths searched through when locating the given module;
  - require() - provides a way to call require as if it was called from the given module;
- __filename - an absolute path of the module;
- __dirname - directory name of the module;
Feels a bit like docs, doesn’t it? 😅 Hopefully it’s not bad. I tried to provide it in a form that’s more understandable, shorter and simpler than the official documentation. The main point is just to understand where these seeming globals come from and what they do. You’ll most likely hardly ever use any of the properties above. Notable exceptions (beyond the import/export syntax) include __dirname and __filename - many beginners may not know where they come from and what they represent. Well, now you know. 😉
The Module (capital letter on purpose) is the structure that all module instances mentioned above inherit from. Node.js allows you to access this as well, in the form of the core module named module 😂 (require('module')). It has even fewer use-cases than the API above, as it provides only two additional properties:
- builtinModules - an array of Node.js built-in modules’ names;
- createRequireFromPath() - allows creating a require function that resolves relative paths starting from the provided path, e.g. a folder. Useful when doing multiple imports from the same directory without hurting readability;
As you can see, the properties above have their really, really specific use-cases. As such, I’d consider them more as internal properties rather than general-use ones. 😅 Although, if you’re developing a Node.js framework… who knows? 🤔
Caching
The last thing to note about modules is that they’re cached. This has a huge impact on how they work and on the performance of actually loading them. Once loaded, your module won’t have to be reloaded a second time. Instead, its cached version will be used (stored in the object referenced by require.cache). This results in improved performance, but also has some additional, sometimes taken-for-granted, side-effects. You see, when a module is first loaded, all of its code is executed once and the result is cached. From then on, every file that imports the module receives the very same exports. (That’s also why the children and parent properties of module exist, BTW - they describe the relations between modules, i.e. where a module was first loaded and thus cached.) This allows for some cunning tricks, like a dedicated module for semi-globals (values that can be imported anywhere and changed, affecting other modules). 🛸
Of course, you can force a reload of a module by messing with the require.cache object and removing the given module (by its id). But it’s not really a recommended practice - unless you’re sure that this is exactly what you want.
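Both behaviors can be observed with a small experiment. It’s only a sketch: the counter.js file and the cacheDemoLoadCount global are invented for this demo:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A throwaway module that counts how many times its code gets executed.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(
  file,
  'global.cacheDemoLoadCount = (global.cacheDemoLoadCount || 0) + 1;\n' +
  'module.exports = { value: 0 };\n'
);

const first = require(file);
const second = require(file); // served from require.cache

first.value = 5;
console.log(second.value); // 5 - both names point at the same object
console.log(global.cacheDemoLoadCount); // 1 - the code ran only once

// Force a reload by removing the cache entry (keyed by the resolved filename).
delete require.cache[require.resolve(file)];
const fresh = require(file);

console.log(global.cacheDemoLoadCount); // 2 - executed again
console.log(fresh.value); // 0 - a brand new exports object
```

Note how the semi-global trick and the forced reload are two sides of the same coin: after the reload, first and fresh no longer share state.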
ECMAScript Modules
Up to this point, we were talking only about CJS modules. But, as many web developers should know, a new standard was introduced in 2015 with ES6 (not that new any longer, huh? 😅), which is referred to as ECMAScript Modules (ESM for short). They’re the ones that brought us this fine import/export syntax and, finally, an industry-grade standard! Sadly, as we’ve already seen with Node.js itself, the old standards (CJS, AMD, etc.) still prevail in some places, even ones as actively developed as Node. But this finally changed with the release of Node v8.x, where support for ESM was introduced, although behind an experimental flag ☢ (which has stayed up to the current v11.x so far). But that’s not something that would stop us from taking a closer look at ESM in Node.js, is it? 👍
Enable
As much as the experimental status may not bother you (apart from some features that still need to be implemented or improved), it comes with some additional requirements. ESM (at the time of writing) isn’t supported out-of-the-box. You have to use the --experimental-modules flag to enable it whenever running Node. Also, you have to use the .mjs extension for your files to be properly loaded through the ESM system. 👉
Quite frankly, ESM is mostly backward-compatible with CJS (with some API differences), meaning that you can freely import CJS modules through ESM without much hassle. What you cannot do, on the other hand, is import ESM modules with the CJS syntax. This is not allowed, as CJS uses a different resolving method and timing (not forward-compatible 😅). Of course, JSON files and C++ modules/native add-ons can freely be used with the ESM syntax.
Differences
Beyond cross-compatibility, there are a couple more differences between the Node.js CJS and ESM implementations. ESM has a completely different resolving system, based on URLs and the file: protocol. This means that you can, e.g., pass additional query parameters to indicate that the given module should be loaded again (instead of using its cached version). 💾
import module from './module1.js?id=1';
import moduleClone from './module1.js?id=2';
For now, external URLs cannot be used. Although, with the URL-based scheme above, it may become possible in the near future.
The URL format is also used to identify modules inside the cache (that’s why the example above works). But, as we don’t have access to the same values that are available to us in CJS (require(), module, etc.), the cache object is stored separately. Also, unlike CJS, ESM doesn’t resolve NODE_PATH, which further means there’s no way of importing globally-installed modules.
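The query-parameter trick is easier to understand once you see how such specifiers turn into URLs. The sketch below uses only the standard URL class; the parent URL is hypothetical and the Map is just a stand-in for the internal ESM cache (which, as said, we can’t access directly):

```javascript
// Hypothetical URL of the importing module.
const parentURL = 'file:///home/user/app/index.mjs';

// Relative specifiers are resolved against the parent module's URL.
const first = new URL('./module1.js?id=1', parentURL).href;
const second = new URL('./module1.js?id=2', parentURL).href;

console.log(first); // 'file:///home/user/app/module1.js?id=1'
console.log(second); // 'file:///home/user/app/module1.js?id=2'

// Stand-in for the module cache: keyed by the full URL, query included.
const fakeCache = new Map();
fakeCache.set(first, { instance: 1 });
fakeCache.set(second, { instance: 2 });

// Same file on disk, but two distinct cache entries.
console.log(fakeCache.size); // 2
```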
And finally, in its current state, import provides one property of its own. It’s an object called import.meta which, again, has one property called import.meta.url, indicating the absolute URL of the current module.
import.meta.url
Hooks
The last new feature of Node.js ESM is called loader hooks. ⚡ As the name suggests, these hooks allow you to intercept the loading process of ESM modules with your own, custom code. 👏
There are 2 possible hooks for you to use - resolve() and dynamicInstantiate(). You can provide one or both of them in the form of asynchronous functions, in a single, separate JS file. You can later load and use them with a simple CLI argument:
node --experimental-modules --loader ./loader.mjs ./index.mjs
The resolve() hook takes 3 parameters:
- specifier - the specifier of the module being imported, i.e. the string passed to import (a relative path, an absolute path or a bare name);
- parentModuleURL - URL of the parent module (the one that loaded given module first). It follows file: protocol and defaults to undefined when used on the entry module (there’s no parent);
- defaultResolve() - default resolve function;
After appropriate processing, your resolve hook should return an object with two properties: url and format. The first indicates the URL resolved for the handled module (file:) and the second - the module’s format. 📦 While url is a no-brainer, format takes the form of a string with 6 possible values:
- “esm” - indicates ESM module;
- “cjs” - indicates CJS module;
- “builtin” - indicates Node.js built-in modules, e.g. http or path;
- “json” - indicates JSON file;
- “addon” - indicates a C++ native addon;
- “dynamic” - indicates the use of dynamicInstantiate hook;
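Here’s a minimal sketch of what a resolve() hook could look like. The @app/ alias and its target directory are made-up names for illustration. In a real loader.mjs file, you’d put export in front of the function; here it’s a plain function, called by hand with a fake defaultResolve(), so that the snippet can run on its own:

```javascript
const ALIAS_PREFIX = '@app/'; // hypothetical import alias
const ALIAS_TARGET = 'file:///home/user/app/src/'; // hypothetical directory

// In loader.mjs this would be: export async function resolve(...)
async function resolve(specifier, parentModuleURL, defaultResolve) {
  if (specifier.startsWith(ALIAS_PREFIX)) {
    // Map '@app/utils.mjs' to a file: URL inside the target directory.
    return {
      url: new URL(specifier.slice(ALIAS_PREFIX.length), ALIAS_TARGET).href,
      format: 'esm'
    };
  }
  // Everything else goes through the default resolution.
  return defaultResolve(specifier, parentModuleURL);
}

// Stand-in for the default resolver, only for this demo.
const fakeDefaultResolve = async (specifier) => ({
  url: specifier,
  format: 'builtin'
});

resolve('@app/utils.mjs', undefined, fakeDefaultResolve).then((result) => {
  console.log(result.url); // 'file:///home/user/app/src/utils.mjs'
  console.log(result.format); // 'esm'
});
```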
The dynamicInstantiate() hook allows you to properly handle modules with the "dynamic" format. The hook itself is an async function taking a single url argument (the URL of the handled module), which should return an object with 2 properties:
- exports - an array of names for exported properties;
- execute() - a function taking the above exports as an argument. It should access the previously defined property names on the exports object and interact with them using the .get() and .set() methods accordingly. It will be executed later, at the time of module evaluation;
In general, this hook gives you an option to provide a somewhat alternative form for modules that require it (e.g. different file extensions). Just keep in mind that it doesn’t have to be limited to setting completely different properties - you can use the provided URL to load and evaluate the file the way you want. As always in programming - options are almost endless! 😉
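And a matching sketch for dynamicInstantiate(). Again, in loader.mjs the function would be exported. The export names, the sample URL and the makeFakeExports() helper are all hypothetical - the helper only imitates the .get()/.set() slots that Node.js would normally prepare for us:

```javascript
// In loader.mjs this would be: export async function dynamicInstantiate(url)
async function dynamicInstantiate(url) {
  return {
    // Names of the exports the dynamic module is going to provide.
    exports: ['sourceURL', 'answer'],
    // Called at evaluation time with pre-allocated slots for those names.
    execute: (exports) => {
      exports.sourceURL.set(url);
      exports.answer.set(42);
    }
  };
}

// Demo-only stand-in for the slots Node.js would create.
function makeFakeExports(names) {
  const slots = {};
  for (const name of names) {
    let value;
    slots[name] = {
      get: () => value,
      set: (next) => { value = next; }
    };
  }
  return slots;
}

dynamicInstantiate('file:///home/user/app/data.custom').then((result) => {
  const slots = makeFakeExports(result.exports);
  result.execute(slots);
  console.log(slots.answer.get()); // 42
  console.log(slots.sourceURL.get()); // 'file:///home/user/app/data.custom'
});
```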
We’re just getting started!
Yup, it’s been a while and we’ve only managed to cover the Modules API - just bare modules! Seemingly such a simple thing, and yet it has so much depth to it! 🤔 Again, don’t worry, there’s some even more interesting stuff in store! I’m planning on covering the File System API next (that’s the big one!), but maybe you’d like to see something else? I’m very much open to different options! And remember that I plan on covering all Node.js APIs eventually!
So, let me know down in the comments what you think about this article and what you would like to see next! Also, share this post with others to help it reach more people! 😃 As always, follow me on Twitter, on my Facebook page and sign up for the newsletter (coming soon!) below to keep up-to-date with the latest content about this series and other awesome JS stuff! 🛸 Thanks for reading and I’ll see you in the next post! ✌