Node over Express - Autoload

Preface

This is the second post in the Node over Express series (the previous one is Configuration). In this post, I’d like to discuss a well-known pain point in Node.js.

Pain Point

There is a well-known Lisp joke:

A top hacker successfully stole the last 100 lines of a top secret program from the Pentagon. Because the program was written in Lisp, the stolen code was nothing but closing brackets.

The joke is that Lisp has too many brackets. Node.js has a similar issue: too many require calls. Open any Node.js file and you will usually find several lines of require at the top.

Because of Node’s module sandbox model, the developer has to require the same resources again and again in every file. Writing or reading lines of boilerplate require is not exciting. Worse, it becomes a nightmare once a developer wants to replace one library with another.

Rails Approaches

“Require hell” isn’t unique to Node.js; Ruby apps suffer from it too. Rails has solved it gracefully, and the developer barely needs to require anything manually in Rails.

There are two kinds of dependencies in a Rails app: external resources and internal resources.

External Resources

External resources are classes packaged in Ruby gems. In a Ruby application, the developer declares the dependencies in a Gemfile and loads them with Bundler. Some frameworks, such as Rails, are already integrated with Bundler. When using them, the developer doesn’t need to do anything manually; all the dependencies are required automatically. For other programs, use bundle exec to run them in a Ruby runtime where the gems from the Gemfile are available.

Internal Resources

Internal resources are the classes declared in the app itself: the models, the services, or the controllers. Rails requires them automatically through the autoloading support in ActiveSupport. A resource is loaded the first time it is used, so the requiring process is “lazy”. (In fact, this description isn’t entirely precise, because Rails behaves differently in the production environment. There it loads all the classes at launch for performance reasons.)

Autoload in Node.js

Rails avoids “require hell” with these two “autoload” mechanisms. There are still debates about whether autoload is good or not, but at least it frees the developer from dull dependency management and increases productivity. Developers love autoload in most cases.

So to avoid “require hell” in Node.js, I prefer an autoload mechanism. But because there are significant differences between the type systems of Node.js and Ruby, we cannot copy the mechanism from Ruby to Node as is. Before diving into the solution, we need to understand the differences first.

Node.js Module System

There are a number of similarities between Node.js and Ruby; things in Node.js usually have an equivalent in Ruby. For example, a package in Node is similar to a gem in Ruby, npm plays the role of RubyGems and Bundler, and package.json takes on the responsibility of Gemfile and Gemfile.lock. These similarities make it possible to port autoload from Ruby to Node.

There are also significant differences. One of the major ones is the type system and module sandbox in Node.js, which work quite differently from the Ruby type system.

JavaScript isn’t a class-based OO language, so it has no runtime-wide registry of types. The “types” in JavaScript are actually constructor functions, which are stored in ordinary variables instead of in a type system. Node.js loads each file into its own sandbox, so the local variables of one file are invisible to the others. This avoids “global leak”, a well-known and deep-seated bad part of JavaScript. As a result, a Node.js developer needs to require the types in use again and again in every file.

In Ruby, the situation is a lot better. Thanks to its well-designed type system, types are shared across the whole runtime, so a developer only needs to require the types that are not yet loaded.

So Node.js programs contain many more require statements than Ruby programs. And because of the design of Node.js and JavaScript, the issue is harder to resolve.
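The isolation can be observed directly. The following is a minimal sketch, assuming a CommonJS Node.js runtime; the file names definer.js and consumer.js and the type name User are made up for illustration. It writes two tiny modules to a temporary directory and shows that a top-level variable declared in one is not visible in the other.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Temporary directory for the two throwaway modules (hypothetical names).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'isolation-'));

// definer.js declares a top-level "type" but does not export it.
fs.writeFileSync(path.join(dir, 'definer.js'),
  'var User = function User() {};\n');

// consumer.js reports whether it can see User without requiring anything.
fs.writeFileSync(path.join(dir, 'consumer.js'),
  'module.exports = typeof User;\n');

require(path.join(dir, 'definer.js'));
const seenByConsumer = require(path.join(dir, 'consumer.js'));

console.log(seenByConsumer); // 'undefined': User stayed inside definer.js
```

This is why every file that needs User has to require it again.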

Global Variable

In the browser, the other major JavaScript runtime, global variables are very common. They are easily abused, which brings global leaks to badly written JavaScript programs and drives thousands of developers up the wall. JavaScript developers are so scared of global leaks that they designed a strict isolation model for Node.js. In my view, the isolation does prevent global leaks effectively, but it also brings tens of require statements to every file, which is not acceptable either.

In fact, with the help of JSLint, CoffeeScript, and some other tools, developers can avoid global leaks easily. Global sharing is not the root of all evil. As long as abuse is avoided, I believe a reasonable level of global sharing is useful and helpful. In fact, Node.js has a built-in global sharing mechanism.

To share values across files, Node provides a special variable named global. It can be accessed in every file, and the values stored on it are shared across files.

Besides sharing values, global has another important feature: Node treats it as the default context, so its properties can be referenced by bare name without the global. prefix. In other words, SomeType === global.SomeType.

With the help of global, we have a natural way to share types across files.
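As an illustration, here is a small sketch of sharing a type through global, assuming a CommonJS Node.js runtime. The file names bootstrap.js and consumer.js and the Badge constructor are hypothetical and exist only for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'global-share-'));

// bootstrap.js publishes a constructor on `global` (Badge is a made-up name).
fs.writeFileSync(path.join(dir, 'bootstrap.js'),
  'global.Badge = function Badge(title) { this.title = title; };\n');

// consumer.js uses the bare name and never requires bootstrap.js itself.
fs.writeFileSync(path.join(dir, 'consumer.js'),
  'module.exports = new Badge("early bird");\n');

require(path.join(dir, 'bootstrap.js'));
const badge = require(path.join(dir, 'consumer.js'));

console.log(badge.title);            // early bird
console.log(Badge === global.Badge); // true
```

The consumer only works if the bootstrap file has run first, which is why the sharing has to be set up once, when the app starts.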

JS Property

Rails’ autoload mechanism loads classes lazily: a class is only loaded when it is used for the first time. It is a neat feature, and Rails achieves it by intercepting the “uninitialized constant” error that Ruby raises when an unknown constant is referenced. Intercepting a comparable error is hardly feasible in Node.js, so I chose a different approach: properties.

A property (an attribute in Ruby) lets a function be invoked when a field of an object is accessed. Properties are a common feature among OO languages, but a relatively “new” one in JavaScript. They were introduced in the ECMAScript 5 standard, which lets developers declare a property on an object with the Object.defineProperty API. With a property, we can hook a getter onto each type variable and require the type when the variable is accessed, so the module won’t be required until it is used. In addition, Node’s require function has a built-in cache; it won’t load the same file twice, and instead returns the value from its cache.

With properties, we make the autoload lazy!
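The combination of a getter and the require cache can be tried in isolation. The sketch below is not the author's implementation; it assumes a CommonJS Node.js runtime, and the host object, the User.js file, and the global.__userLoads counter are invented purely to make the load count observable.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lazy-'));
const file = path.join(dir, 'User.js');

// The module bumps a counter every time its body is actually executed.
fs.writeFileSync(file,
  'global.__userLoads = (global.__userLoads || 0) + 1;\n' +
  'module.exports = function User() {};\n');

// Hypothetical host object: reading host.User triggers the require.
const host = {};
Object.defineProperty(host, 'User', {
  enumerable: true,
  get: function () { return require(file); }
});

console.log(global.__userLoads); // undefined: nothing has been loaded yet
const first = host.User;         // first access runs the module
const second = host.User;        // second access is served from require's cache
console.log(global.__userLoads); // 1
console.log(first === second);   // true
```

Defining the property costs nothing at startup; the file system is only touched when the property is read for the first time.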

My Implementation

To make autoload work, we need to create a magic host object to hold the type variables. In my implementation, I call the magic object AutoLoader. We also need to require a bootstrap script when the app starts, which describes which types should be loaded and how.

Bootstrap Script: initEnvironment.coffee
    # Publish the two factories first, so the lines below can use them by bare name.
    global.createAutoLoader = require('./services/AutoLoader')
    global.createPathHelper = require('./services/PathHelper')

    # Consolidated path helper: each sub-directory becomes a method, e.g. rootPath.services()
    global.rootPath = createPathHelper(__dirname, true)
    global.Configuration = require(rootPath.config('configuration'))

    # One lazy autoload host per kind of resource.
    global.Services = createAutoLoader rootPath.services()
    global.Routes = createAutoLoader rootPath.routes()
    global.Records = createAutoLoader rootPath.records()
    global.Models = createAutoLoader rootPath.models()

    global.assets = {} # initialize this context for connect-assets helpers

The script sets up the autoload hosts for all the services, routes, records, and models in my app. After that, we can reference the types as follows:

Sample Usage
    Records.User.findById uid, (err, user) ->
      badge = new Models.Badget(badgeInfo)
      user.addBadge badge
      user.save()

In the initEnvironment.coffee script, there are 2 very important classes that are used:

  • AutoLoader: The class that works as the host of the type variables. All the magic happens here.
  • PathHelper: The helper that takes care of joining paths.

The detailed implementation is here:

Autoload
    _ = require('lodash')
    path = require('path')
    fs = require('fs')
    createPathHelper = require('./PathHelper')

    # Define a getter on the host that requires the file on first access.
    createLoaderMethod = (host, name, fullName) ->
      host.__names.push name
      Object.defineProperty host, name,
        get: ->
          require(fullName)

    class AutoLoader
      # source maps a type name to the full path of its file.
      constructor: (source) ->
        @__names = []
        for name, fullName of source
          extName = path.extname fullName
          # Only register files that require can load, plus extension-less entries.
          createLoaderMethod(this, name, fullName) if require.extensions[extName]? or extName == ''

    # Turn a directory path into a { name: fullPath } object.
    expandPath = (rootPath) ->
      createPathHelper(rootPath).toPathObject()

    # Turn a list of file paths into a { name: path } object.
    buildSource = (items) ->
      result = {}
      for item in items
        extName = path.extname(item)
        name = path.basename(item, extName)
        result[name] = item
      result

    # Accepts a directory path, an array of file paths, or a ready-made path object.
    createAutoLoader = (option) ->
      pathObj = switch typeof(option)
        when 'string'
          expandPath(option)
        when 'object'
          if option instanceof Array
            buildSource(option)
          else
            option
      new AutoLoader(pathObj)

    createAutoLoader.AutoLoader = AutoLoader

    exports = module.exports = createAutoLoader
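For readers who do not use CoffeeScript, the core idea of the listing above can be approximated in plain JavaScript. This is only a rough sketch under simplifying assumptions, not a translation of the original: it handles a single directory, registers only .js files, and the Badge.js and notes.txt files are hypothetical fixtures created for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a host whose properties require the files in `dir` on first access.
function createAutoLoader(dir) {
  const host = {};
  fs.readdirSync(dir).forEach(function (file) {
    const ext = path.extname(file);
    if (ext !== '.js') return; // simplification: skip anything that is not .js
    const name = path.basename(file, ext);
    Object.defineProperty(host, name, {
      enumerable: true,
      get: function () { return require(path.join(dir, file)); }
    });
  });
  return host;
}

// Hypothetical fixture directory with one module and one unrelated file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'models-'));
fs.writeFileSync(path.join(dir, 'Badge.js'),
  'module.exports = function Badge(title) { this.title = title; };\n');
fs.writeFileSync(path.join(dir, 'notes.txt'), 'not a module\n');

const Models = createAutoLoader(dir);
const badge = new Models.Badge('early bird');

console.log(Object.keys(Models)); // [ 'Badge' ]
console.log(badge.title);         // early bird
```

The original goes further by accepting arrays and path objects as input and by checking require.extensions instead of hard-coding one extension.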

PathHelper
    _ = require('lodash')
    fs = require('fs')
    path = require('path')

    createPathHelper = (rootPath, isConsolidated) ->
      rootPath = path.normalize rootPath

      # The helper itself is a function: result() returns the root,
      # result('a', 'b') joins the parts onto the root.
      result = (args...) ->
        return rootPath if args.length == 0
        parts = _.flatten [rootPath, args]
        path.join.apply(this, parts)

      # List the directory as a { name: fullPath } object (extensions stripped from names).
      result.toPathObject = ->
        self = result()
        files = fs.readdirSync(self)
        pathObj = {}
        for file in files
          fullName = path.join(self, file)
          extName = path.extname(file)
          name = path.basename(file, extName)
          pathObj[name] = fullName
        pathObj

      # Attach a child path helper for every sub-directory, e.g. result.services.
      result.consolidate = ->
        pathObj = result.toPathObject()
        for name, fullName of pathObj
          stats = fs.statSync(fullName)
          result[name] = createPathHelper(fullName) if stats.isDirectory()
        result

      if isConsolidated
        result.consolidate()
      else
        result

    exports = module.exports = createPathHelper

The code above is part of Node over Express. To access the complete codebase, please check out the repo on GitHub.


Besides the content, I want to say thank you to my English teacher Marina Sarg, who helped me a lot with this series of blog posts. Without her, this series would not exist. Marina, thank you very much.