Extended CommonJS

This document gives more advanced background on what a general module system does and, finally, on what distinguishes modul8 from a plain CommonJS bundler.

CommonJS Parsing

Or, how a module system works.

JavaScript Modules

JavaScript has no module system.

We're off to a great start.

On the other hand, it has functions. Functions with closures.

(function(){
  var private = 5;
  window.publicFn = function(){
    console.log(private);
  }
})();

This is the commonly employed method of encapsulating and exposing objects and functions that can reference private variables through a closure. This works; private is inaccessible outside this anonymous function.

Unfortunately, this just exposes publicFn to the global window object. This is not ideal, as anything, anywhere can just reference it, leaving us not much wiser. True modularity is clearly impossible when things are just lying around freely like this for everyone. It is fragile and error prone: when exports conflict, the last script to execute wins, because scripts simply run top to bottom, attaching their exports to window as they go. Clearly we need something better than this.

CommonJS Idea

There is a way to fix this, but it assumes that all modules support a standardised format for exporting. CommonJS is such a standardisation. It has very large traction at the moment, particularly driven by server-side environments such as NodeJS.

Its ideas are simple. Each module avoids the above safety-wrapper, must assume it has a working require(), and instead of attaching its exports to a global object, it attaches them to an opaque exports object. Alternatively, it can replace the module.exports object to define all of its exports at once.

By making sure each module is written this way, CommonJS parsers can implement clever trickery on top of it to make this behaviour work, i.e. storing each module's exports object somewhere for require() and allocating a singleton for each module.

CommonJS Basics

From the above rationale, it is clear that a CommonJS parser must turn this:

var private = 5;
var b = require('b');
exports.publicFn = function(){
  console.log(private);
};

into something like this:

var module = {};
stash[location] = {}; // the exports object handed to the module
(function(require, module, exports){
  var private = 5;
  var b = require('b');
  exports.publicFn = function(){
    console.log(private);
  };
})(makeRequire(location), module, stash[location]);
if (module.exports) {
  // the module replaced its exports wholesale
  stash[location] = module.exports;
}

where location is a unique identifier passed down from the compiler to indicate where the module lives, so that require() can later retrieve it. The makeRequire() factory must be able to construct specifically crafted require() functions for given locations. Finally, stash will be a pre-defined object on which all modules are exported.

Wrapping up this behaviour inside a function, we can write something like this.

define(location, function(require, module, exports) {
  var private = 5;
  var b = require('b');
  exports.publicFn = function(){
    console.log(private);
  }
});

The makeRequire() and define() functions can cleverly be defined inside a closure with access to stash. This way only these functions can access your modules.

If the module system simply created a global namespace where your modules resided, say, stash = window.ModuleSystem, then this would be bad. You could still bypass the system and end up requiring stuff implicitly again.

modul8 encapsulates such a stash inside a closure for require() and define(), so that only these functions, plus a few carefully constructed functions for debugging export information and require strings, can access it.

Code Order

Now, a final problem we have glossed over is the order in which the modules must be included. The module above requires the module b. What happens if this module has not yet been placed in the document? The require('b') call fails. The independent modules must be included first.

To solve this problem, you can either give a safe ordering yourself - which will become increasingly difficult as your application grows in size - or you can resolve require() calls recursively to create a dependency tree.

modul8 in particular does so via the excellently simple detective module, which constructs a full Abstract Syntax Tree before it safely scans for require() calls. Using detective's data, a tree structure representing the dependencies can be created. modul8 allows printing of a prettified form of this tree.

app::main
├───app::forms
├──┬app::controllers/user
│  └──┬app::models/user
│     └───app::forms
├──┬app::controllers/entries
│  └───app::models/entry
└──┬shared::validation
   └───shared::defs

It is clear that the modules on the edges of this tree must get required first, because they do not depend on anything. Similarly, the previous level should be safe once the outermost level has been included. Note here that app::forms is needed both by app::models/user and app::main, so it must be included before both. Thus, we only care about a module's outermost level.

To order our modules correctly, we must therefore reduce the tree into a unique array of modules and their (maximum) level numbers, and simply sort this by level number, descending.
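The reduction can be sketched roughly as follows. The tree shape ({ name, deps }) and the orderModules function are assumptions made for illustration, not modul8's internal representation, and the sketch assumes there are no circular dependencies (a cycle would recurse forever).

```javascript
// Hypothetical tree shape: { name: string, deps: [subtrees] }.
// This mirrors the printed tree above.
var tree = {
  name: 'app::main',
  deps: [
    { name: 'app::forms', deps: [] },
    { name: 'app::controllers/user', deps: [
      { name: 'app::models/user', deps: [
        { name: 'app::forms', deps: [] }
      ] }
    ] },
    { name: 'app::controllers/entries', deps: [
      { name: 'app::models/entry', deps: [] }
    ] },
    { name: 'shared::validation', deps: [
      { name: 'shared::defs', deps: [] }
    ] }
  ]
};

function orderModules(root) {
  // Map each module name to the deepest level at which it appears.
  var levels = Object.create(null);

  (function visit(node, level) {
    if (!(node.name in levels) || levels[node.name] < level) {
      levels[node.name] = level;
    }
    node.deps.forEach(function (dep) {
      visit(dep, level + 1);
    });
  })(root, 0);

  // Deepest modules first, so every dependency precedes its dependents.
  return Object.keys(levels).sort(function (a, b) {
    return levels[b] - levels[a];
  });
}

var order = orderModules(tree);
console.log(order[0]);                // app::forms (level 3)
console.log(order[order.length - 1]); // app::main (level 0)
```

Modules that share a level do not depend on each other, so their relative order does not matter.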

modul8's CommonJS Extensions

Require Path Problem

Whilst maintaining compatibility with the basic CommonJS spec, we have extended require() to ameliorate one common problem.

We wanted to be able to share code between the server and the client by essentially having multiple require paths. But require paths force you to scan all of them, with no way of specifying what path to do your look-up on. It also would make it very difficult to whitelist injected data from the server resolver - as it could simply find files with the same names as your data somewhere else.

The relation between the paths is also lost on the browser, so there is no sense in maintaining any illusions about this by using traditional require paths.

Domains

In the end, namespacing each path became the accepted solution. To distinguish them from typical require paths, we refer to them as domains or require domains.

This also simplifies implementation, as we can create one object container directly on stash for each domain, with a key equal to its name.

Additionally, we can make require() functions that know which domains to look in by passing this extra parameter from the compiler down to define.

The result is that with modul8, we can require() files relatively as if we were in a 100% CommonJS environment, but we can also do cross-domain require() by using C++ style namespacing. I.e. calls like require('shared::helper.js') will give access to code on a 'shared' domain.
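To make the namespacing concrete, here is a small sketch of how a domain-aware require() could split such strings. parseRequire and makeDomainRequire are hypothetical helpers, not modul8's resolver: they ignore relative path segments and file extensions, and the stash is passed in openly for brevity rather than hidden in a closure.

```javascript
// Split a require string into domain and path (hypothetical helper).
// Strings without a 'domain::' prefix stay on the caller's own domain.
function parseRequire(reqStr, currentDomain) {
  var idx = reqStr.indexOf('::');
  if (idx === -1) {
    return { domain: currentDomain, path: reqStr };
  }
  return { domain: reqStr.slice(0, idx), path: reqStr.slice(idx + 2) };
}

// Build a require() for code living on `currentDomain`.
// stash holds one container object per domain, keyed by domain name.
function makeDomainRequire(stash, currentDomain) {
  return function (reqStr) {
    var target = parseRequire(reqStr, currentDomain);
    var container = stash[target.domain];
    if (!container ||
        !Object.prototype.hasOwnProperty.call(container, target.path)) {
      throw new Error("Cannot resolve '" + reqStr + "' from domain '" +
        currentDomain + "'");
    }
    return container[target.path];
  };
}

// Two domains that both contain a helper.js.
var domainStash = {
  app: { 'helper.js': { from: 'app' } },
  shared: { 'helper.js': { from: 'shared' } }
};

var requireFromApp = makeDomainRequire(domainStash, 'app');
console.log(requireFromApp('helper.js').from);         // app
console.log(requireFromApp('shared::helper.js').from); // shared
```

Because the domain is explicit, a same-named file on another domain can never be picked up by accident.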

To get the most out of this deal, it is necessary for certain domains to be completely server and client agnostic: code on reusable domains must not reference anything outside its base directory (including npm modules) if it is to work on the client, and it should not reference DOM/client specific elements if it is to work on the server.

Domains also provide 3 more areas of use that each get their own reserved domain:

Arbiters

modul8 hates globals. They ruin otherwise solid modularity. Thus, it desperately tries to integrate globally exported libraries into its require system. With your permission, it removes the global shortcut(s) from your application code and inserts them onto the reserved M8 domain. Why we (can and sometimes) want to do this is explained in the modularity doc, whilst the feature itself is fully documented in the API doc.

node modules

The npm domain is really a domain with your local node_modules folder as its root. It is, however, heavily special cased to deal with absolute requires internal to that domain. This means you can use a lot of npm installable modules right out of the box, with full control (via the logged dependency tree) over what is included. Usage is documented in the npm doc.

Live Extensions

Because we have a require() function available in all the application code, and because this is synchronous (in the sense that it has been resolved on the server already), we might want to extend our requirable data with results from third-party asynchronous script loaders. There's an external domain for that, and a client API for it. It's documented in the API doc.

Direct Extension

Finally, modul8 allows exporting of data that exists on the server, without having to add separate script tags for them. The data domain contains all such data, and like all the above, it can be gotten with require(). The API doc contains the how-to.