Node over Express - Configuration

Preface

I have been working on Node.js projects for quite a while, and have built apps with Node both for clients and as personal projects, such as LiveHall and CiMonitor. I promised someone that I would share my experience with Node, and today I am starting to do so. This is the first post of the series.

Background

In this post, I would like to talk about configuration in Node, which is a common problem we need to solve in our apps.

Configuration problems aren’t new, and there are dozens of mature solutions, but for Node.js apps there is still something worth discussing.

Configuration can be treated as a special kind of data, so developers usually prefer to describe it in a data language. Here are some examples:

  • .NET and Java developers usually use Xml to describe their configuration
  • Ruby developers prefer Yaml as the configuration language
  • JavaScript developers tend to use Json

Data languages are convenient, because developers can easily build a DSL on top of them and then describe the configuration with that DSL. But is a data language the best option available? Is it really suitable for all scenarios?

Before answering these questions, I would like to say something about the problem we’re facing. There is one requirement common to all kinds of configuration solutions: default values and overriding.

For example, a web app uses port 80 by default, but in the development environment we prefer a port number over 1024, and 3000 is a popular choice. That means we need to provide 80 as the default value of the port, and override it with 3000 in the development environment.

Of the languages mentioned above, only Yaml provides native support for inheritance and overriding; Xml and Json do not. This means we need to implement the mechanism on our own. Taking Json as an example, we might write the configuration this way:

Sample Json configuration
{
  "default": {
    "port": 80,
    "serveAssets": true
  },
  "development": {
    "port": 3000,
    "database": "mongodb://localhost/development"
  },
  "test": {
    "database": "mongodb://localhost/test"
  },
  "production": {
    "serveAssets": false,
    "database": "mongodb://ds0123456.mongolab.com:43487/my_sample_app"
  }
}

The Json snippet above is a typical example of web app configuration: it has a default section that provides the default values for all environments, plus three sections for specific environments. To apply it correctly to our app, we need to load and parse the Json file to get all the data, then take the values from the default section, then override them with the values from the specific environment. In addition, we might want validation that raises an error when the requested environment doesn’t exist.
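
The loading steps just described could be sketched roughly as follows. This is only an illustration: loadConfig is a hypothetical helper, the Json text is inlined instead of being read from a file so that the snippet runs on its own, and the production database URL is a placeholder.

```javascript
// Hypothetical loader for a config shaped like the Json sample above.
// `loadConfig` is an illustrative name, not an existing API.
const rawConfig = JSON.stringify({
  default: { port: 80, serveAssets: true },
  development: { port: 3000, database: 'mongodb://localhost/development' },
  test: { database: 'mongodb://localhost/test' },
  production: { serveAssets: false, database: 'mongodb://db.example.invalid/my_sample_app' }
});

function loadConfig(jsonText, envName) {
  const all = JSON.parse(jsonText);
  // Validation: fail loudly when the requested environment has no section.
  if (envName === 'default' || !Object.prototype.hasOwnProperty.call(all, envName)) {
    throw new Error('Unknown environment: ' + envName);
  }
  // Start from the defaults, then let the environment section override them.
  return Object.assign({}, all.default, all[envName]);
}

const config = loadConfig(rawConfig, 'development');
console.log(config.port);        // 3000, overridden by "development"
console.log(config.serveAssets); // true, inherited from "default"
```

Even this small version already needs a merge rule and a validation rule that the data format itself knows nothing about.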

This solution looks simple and seems to work, but when you apply it to a real-life app, you need to watch out for some pitfalls.

Issue 1: Confidential Values

In the real world, some configuration values are sensitive and need to be kept confidential. The configuration could contain the credentials to access your database, or the key to decrypt cookies. It may also contain a private certificate that identifies and authenticates the app to other services. In these scenarios, you need to protect your configuration in order to avoid big trouble!

To solve this issue, you might think about adding a new feature that lets you encrypt confidential values or load them from a different, safe source. To achieve that, you might need to add another layer of DSL, which adds more complexity to your app and makes your code harder to debug and maintain.

Issue 2: Dynamic Data

As a solution to the first issue, one could store the environment-related but sensitive data in environment variables. The solution is simple and works well, so I highly recommend it. However, it means you need the capability to load values not only from Json directly but also from environment variables.

Sometimes, such as when deploying your app to Heroku or Nodejitsu, the case becomes trickier. Once the app is deployed, the default values are provided in Json directly, and some of them need to be overridden with values from environment variables, or vice versa. These tricky requirements can easily blow your mind and your code away. They lead to a complicated DSL design and hundreds of lines of implementation, just to load your configuration properly. Obviously that is not a good idea.
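
To make the extra work concrete, here is a minimal sketch of overriding Json defaults with environment variables. The helper applyEnvOverrides and the mapping table are hypothetical, and the variable names PORT and MONGOLAB_URI are only assumed examples of what a platform might provide.

```javascript
// Hypothetical helper: `applyEnvOverrides` and the mapping table are
// illustrative and not part of any library.
const fileDefaults = { port: 80, database: 'mongodb://localhost/development' };

// Maps a config key to the environment variable that may override it.
const envMapping = { port: 'PORT', database: 'MONGOLAB_URI' };

function applyEnvOverrides(defaults, mapping, env) {
  const result = Object.assign({}, defaults);
  for (const key of Object.keys(mapping)) {
    const value = env[mapping[key]];
    // Only override when the variable is actually set.
    if (value !== undefined && value !== '') {
      result[key] = value;
    }
  }
  return result;
}

// A plain object stands in for process.env so the sketch is repeatable.
const merged = applyEnvOverrides(fileDefaults, envMapping, { PORT: '5000' });
// Environment variables are always strings, so numbers need converting.
merged.port = Number(merged.port);
console.log(merged.port);     // 5000, taken from the environment
console.log(merged.database); // still the default from the file
```

Note that the mapping table and the type conversion are both extra conventions that have to be designed and maintained next to the Json file.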

Issue 3: Complicated Inheritance Relationship

Scared by the cases above? No? Then how about complicated inheritance relationships between environments?

In some big and complicated web apps, there might be more than the 3 basic environments, such as:

  • Development: for developers to develop the app locally
  • Test: for developers to run unit or functional tests locally, such as mocha tests
  • Regression: for developers or QAs to run regression tests, such as cucumber tests
  • Integration: for QAs or Ops to test the integration with other apps
  • Staging: for Ops and QAs to test the app in a production-like environment before it really goes live
  • Production: the environment that serves your real users

When writing configurations for these environments, one might find that there are only a few differences between them. To make life easier and avoid redundancy, introducing inheritance between configurations might be a good idea.

As a consequence, the whole configuration becomes a set of environments with complex inheritance relationships. And to support this kind of configuration inheritance, a more complex DSL and hundreds of lines of code are needed.
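
As a rough illustration of what such a DSL involves, the sketch below resolves a chain of environments. The "extends" key is a made-up convention for this example, and resolve is a hypothetical helper, not something Json or Node provides.

```javascript
// Hypothetical convention: each section may name its parent in "extends".
const sections = {
  default: { port: 80, serveAssets: true },
  development: { extends: 'default', port: 3000 },
  test: { extends: 'development', database: 'mongodb://localhost/test' },
  staging: { extends: 'production' },
  production: { extends: 'default', serveAssets: false }
};

function resolve(all, name, seen = []) {
  if (!Object.prototype.hasOwnProperty.call(all, name)) {
    throw new Error('Unknown environment: ' + name);
  }
  if (seen.includes(name)) {
    throw new Error('Circular inheritance: ' + seen.concat(name).join(' -> '));
  }
  // Split the bookkeeping key from the real configuration values.
  const { extends: parent, ...own } = all[name];
  // Resolve the parent chain first so that the section's own values win.
  const inherited = parent ? resolve(all, parent, seen.concat(name)) : {};
  return Object.assign({}, inherited, own);
}

console.log(resolve(sections, 'test').port);           // 3000, from "development"
console.log(resolve(sections, 'staging').serveAssets); // false, from "production"
```

Unknown parents and circular chains already need explicit handling here, and nested values or environment variables would add more.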

Some Comments

My assumptions above may seem a little too complex. To some people, this might be the “WORST CASE SCENARIO” and hard to come by. But in my experience, it is very common when building real web apps with Node. So if solving it isn’t too hard, it is better to consider it seriously and solve it gracefully.

Ruby developers might think they’re lucky because Yaml supports inheritance natively. But confidential data and dynamic data still cause trouble.

My Solution

After learning a number of painful lessons, I figured out a simple but working solution: Configuration as Code - describe the configuration in the same language that describes the business logic!

Configuration as code isn’t a new concept, but it is extremely handy in Node applications! Let me explain why and how it works:

To protect confidential configuration values, one should store them in environment variables, which are only accessible on the specific server.
Then one can load these values from the environment variables as dynamic values.

Doing this in a data language such as Xml, Json or Yaml can be hard, but it becomes as easy as taking candy from a baby when done in the programming language the application is written in, such as Ruby or JavaScript.

As for configuration inheritance, OO languages already provide a very handy inheritance mechanism. Why would we need to invent one? Why not just use it? As for value overriding, OO programming calls it polymorphism. The only difference from the typical scenario is that we override values instead of behaviors. But that isn’t an issue, because a value can be the result of a behavior, right?

I assume everyone now has a pretty good idea of what I am saying. If so, the code below should be easy to understand. It is a standard Node.js file written in CoffeeScript:

Configuration as Code Example
process.env.NODE_ENV = process.env.NODE_ENV?.toLowerCase() ? 'development'

class Config
  port: 80
  cookieSecret: '!J@IOH$!BFBEI#KLjfelajf792fjdksi23989HKHD&&#^@'

class Config.development extends Config
  port: 3009
  redis:
    uri: 'redis://localhost:6379'
  mongo:
    uri: 'mongodb://localhost'

class Config.test extends Config.development

class Config.heroku extends Config
  cookieSecret: process.env.COOKIE_SECRET
  redis:
    uri: process.env.REDISCLOUD_URL
  mongo:
    uri: process.env.MONGOLAB_URI

module.exports = new Config[process.env.NODE_ENV]()

See, with this approach, one can describe the configuration easily and clearly in a few lines of code, with built-in support for loading dynamic values and for configuration inheritance and overriding.
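
For readers who don’t use CoffeeScript, roughly the same idea can be sketched in plain JavaScript. This is an approximation rather than a translation: getters are used so that values live on the prototype and are read lazily, the class names, the environments map and createConfig are hypothetical, and the cookie secret is a placeholder.

```javascript
// A JavaScript take on the CoffeeScript example; class and helper names
// are illustrative only.
class Config {
  get port() { return 80; }
  get cookieSecret() { return 'placeholder-secret-for-local-use'; }
}

class Development extends Config {
  get port() { return 3009; }
  get redis() { return { uri: 'redis://localhost:6379' }; }
  get mongo() { return { uri: 'mongodb://localhost' }; }
}

class Test extends Development {}

class Heroku extends Config {
  // Read from the environment at access time, never stored in the code.
  get cookieSecret() { return process.env.COOKIE_SECRET; }
  get redis() { return { uri: process.env.REDISCLOUD_URL }; }
  get mongo() { return { uri: process.env.MONGOLAB_URI }; }
}

const environments = new Map([
  ['development', Development],
  ['test', Test],
  ['heroku', Heroku]
]);

function createConfig(envName) {
  const name = (envName || 'development').toLowerCase();
  const Selected = environments.get(name);
  if (!Selected) {
    throw new Error('No configuration for environment: ' + name);
  }
  return new Selected();
}

const testConfig = createConfig('test');
console.log(testConfig.port);         // 3009, inherited from Development
console.log(testConfig.cookieSecret); // the placeholder inherited from Config
```

In an app, the last step would typically be something like exporting createConfig(process.env.NODE_ENV) from the configuration module.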

In fact, this approach might work better than expected! Here are some additional free benefits:

  1. Only one configuration is needed when the app is deployed to the cloud, because all the host-specific settings are usually provided via environment variables in a PaaS.
  2. You can have some simple and straightforward logic in the configuration, which can be very useful, especially if the configuration follows some naming convention. But complicated or tricky logic should be strictly avoided, because it hurts readability and maintainability.
  3. It is easy to write tests for configurations, to ensure the values are properly set. This can be very handy when there are complicated inheritance relationships between configurations, or when the configuration contains some simple logic.
  4. Code that isn’t related to the current environment is never instantiated or executed, which helps avoid the overhead of instantiating unused expensive resources and errors caused by incompatibility between environments.
  5. You get a runtime error when the configuration for the environment doesn’t exist.
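
As an illustration of the third point, a configuration test can be as small as the sketch below. It uses Node’s built-in assert module and small stand-in classes written in JavaScript; in a real project the classes would come from the configuration module, and the checks would more likely live in a mocha test.

```javascript
const { strictEqual, ok } = require('assert');

// Small stand-in configuration classes; in a real project these would be
// imported from the configuration module instead.
class Config {
  get port() { return 80; }
  get serveAssets() { return true; }
}
class Development extends Config {
  get port() { return 3000; }
}
class Test extends Development {}
class Production extends Config {
  get serveAssets() { return false; }
}

// Each assertion pins down one default, override or inheritance rule.
strictEqual(new Config().port, 80);
strictEqual(new Development().port, 3000);
strictEqual(new Test().port, 3000);
strictEqual(new Test().serveAssets, true);
strictEqual(new Production().port, 80);
strictEqual(new Production().serveAssets, false);
ok(new Test() instanceof Development);

console.log('configuration checks passed');
```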

Apart from the content, I want to say thank you to my English teacher Marina Sarg, who helped me a lot with this series of posts. Without her, this series would not exist. Marina, thank you very much.

Manage configuration the Rails way on Node.js by using inheritance

An application is usually required to run in different environments. To manage the differences between environments, we usually introduce the concept of Environment Specific Configuration.
By default, Rails provides 3 different environments: the well-known development, test and production.
We can use the environment variable RAILS_ENV to tell Rails which environment to load; if RAILS_ENV is not provided, Rails loads the app in the development environment by default.

This approach is very convenient, so we want to apply it everywhere. But in Node.js, Express doesn’t provide any configuration management, so we need to build the feature ourselves.

Environment management usually provides the following functionality:

  • It allows us to provide some configuration values as defaults, which are loaded in all environments. We usually call this section common.
  • A specific configuration is loaded according to the environment variable, and overrides some values in common if necessary.

Rails uses YAML to hold these configurations, which is concise but powerful enough for this purpose. YAML also provides an inheritance mechanism out of the box, so you can reduce duplication by using inheritance.

Inheritance in Rails YAML Configuration
development: &defaults
  adapter: mysql
  encoding: utf8
  database: sample_app_development
  username: root

test:
  <<: *defaults
  database: sample_app_test

cucumber:
  <<: *defaults
  database: sample_app_cucumber

production:
  <<: *defaults
  database: sample_app_production
  username: sample_app
  password: secret_word
  host: ec2-10-18-1-115.us-west-2.compute.amazonaws.com

In Express and Node.js, if we follow the same approach, we would prefer JSON over YAML, because JavaScript supports JSON natively.
But to me, JSON isn’t the best option. It has some disadvantages:

  • JSON syntax is not concise enough
  • Matching the brackets and appending commas to line ends are distractions
  • Lack of flexibility

As an answer to these issues, I chose CoffeeScript instead of JSON.
CoffeeScript is concise and, similar to YAML, uses indentation to indicate the nesting level. It is also executable, which gives the configuration a lot of flexibility, so we can implement a Domain Specific Language for configuration.

To do so, we need to solve 4 problems:

  1. Allow developers to declare a default configuration.
  2. Load a specific configuration in addition to the default one.
  3. Let the specific configuration override values in the default one.
  4. Keep the code concise, clean and easy to read.

Inspired by the YAML solution, I worked out my first solution:

Configuration in coffee script
_ = require('underscore')

config = {}

config['common'] =
  adapter: "mysql"
  encoding: "utf8"
  database: "sample_app_development"
  username: "root"

config['development'] = {}

config['test'] =
  database: "sample_app_test"

config['cucumber'] =
  database: "sample_app_cucumber"

config['production'] =
  database: "sample_app_production"
  username: "sample_app"
  password: "secret_word"
  host: "ec2-10-18-1-115.us-west-2.compute.amazonaws.com"

# Export the common values first, then let the environment override them
_.extend exports, config.common

specificConfig = config[process.env.NODE_ENV ? 'development']

if specificConfig?
  _.extend exports, specificConfig

YAML is a data-centric language, so its inheritance is more like mixing in another piece of data. So I use underscore to mix the specific configuration into the default one, which overrides the overlapping values.
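
One thing to keep in mind with this kind of mixin is that it is shallow. The JavaScript sketch below uses the built-in Object.assign, which copies own properties one level deep just as _.extend does. The pool key is a hypothetical nested setting added only to show the effect.

```javascript
// `pool` is a made-up nested setting used to demonstrate shallow merging.
const common = {
  adapter: 'mysql',
  encoding: 'utf8',
  database: 'sample_app_development',
  username: 'root',
  pool: { min: 1, max: 5 }
};

const production = {
  database: 'sample_app_production',
  username: 'sample_app',
  pool: { max: 20 }
};

// Later sources win, but only at the top level.
const merged = Object.assign({}, common, production);

console.log(merged.adapter);  // mysql, kept from common
console.log(merged.database); // sample_app_production, overridden
console.log(merged.pool.max); // 20
console.log(merged.pool.min); // undefined, the nested object was replaced whole
```

With flat configurations like the one above this makes no difference, but nested sections need either a deep merge or full redefinition in each environment.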

But if we step outside the YAML box and think about JavaScript itself: JavaScript is a prototype-based language, which means it already provides an overriding mechanism natively. Each object inherits values from its prototype and can override them.
So I worked out the 2nd solution:

Prototype based Configuration
config = {}

config['common'] =
  adapter: "mysql"
  encoding: "utf8"
  database: "sample_app_development"
  username: "root"

config['development'] = {}
config['development'].__proto__ = config['common']

config['test'] =
  __proto__: config['common']
  database: "sample_app_test"

config['cucumber'] =
  __proto__: config['test']
  database: "sample_app_cucumber"

config['production'] =
  __proto__: config['common']
  database: "sample_app_production"
  username: "sample_app"
  password: "secret_word"
  host: "ec2-10-18-1-115.us-west-2.compute.amazonaws.com"

process.env.NODE_ENV = process.env.NODE_ENV?.toLowerCase() ? 'development'

module.exports = config[process.env.NODE_ENV]
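
The same prototype chain can also be built without touching __proto__ directly. The JavaScript sketch below uses the standard Object.create for that, as an alternative way of writing the listing above, and it also shows one consequence of the prototype approach: inherited values are not own properties.

```javascript
const common = {
  adapter: 'mysql',
  encoding: 'utf8',
  database: 'sample_app_development',
  username: 'root'
};

// Object.create(parent) makes a new object whose prototype is `parent`;
// Object.assign then adds the overriding values as own properties.
const test = Object.assign(Object.create(common), {
  database: 'sample_app_test'
});
const cucumber = Object.assign(Object.create(test), {
  database: 'sample_app_cucumber'
});

console.log(cucumber.adapter);         // mysql, found on common via test
console.log(cucumber.database);        // sample_app_cucumber, own value
console.log(Object.keys(cucumber));    // [ 'database' ]
console.log(JSON.stringify(cucumber)); // {"database":"sample_app_cucumber"}
```

Reading a value walks the chain, but anything that enumerates own properties, such as Object.keys or JSON.stringify, only sees the overrides.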

This approach works, but looks kind of ugly. Since we’re using CoffeeScript, which provides syntactic sugar for classes and class inheritance, we can do better.
So we have the 3rd version:

Class based configuration
process.env.NODE_ENV = process.env.NODE_ENV?.toLowerCase() ? 'development'

class Config
  adapter: "mysql"
  encoding: "utf8"
  database: "sample_app_development"
  username: "root"

class Config.development extends Config

class Config.test extends Config
  database: "sample_app_test"

class Config.cucumber extends Config
  database: "sample_app_cucumber"

class Config.production extends Config
  database: "sample_app_production"
  username: "sample_app"
  password: "secret_word"
  host: "ec2-10-18-1-115.us-west-2.compute.amazonaws.com"

module.exports = new Config[process.env.NODE_ENV]()

Now the code looks clean, and we can take it a step further if necessary. We can separate the configurations into files and require them by file name:

Class based configuration
# config/config.coffee
configName = process.env.NODE_ENV = process.env.NODE_ENV?.toLowerCase() ? 'development'
SpecificConfig = require("./envs/#{configName}")
module.exports = new SpecificConfig()

# config/envs/common.coffee
class Common
  adapter: "mysql"
  encoding: "utf8"
  database: "sample_app_development"
  username: "root"
module.exports = Common

# config/envs/development.coffee
Common = require('./common')
class Development extends Common
module.exports = Development

# config/envs/test.coffee
Common = require('./common')
class Test extends Common
  database: "sample_app_test"
module.exports = Test

# config/envs/cucumber.coffee
Test = require('./test')
class Cucumber extends Test
  database: "sample_app_cucumber"
module.exports = Cucumber

# config/envs/production.coffee
Common = require('./common')
class Production extends Common
  database: "sample_app_production"
  username: "sample_app"
  password: "secret_word"
  host: "ec2-10-18-1-115.us-west-2.compute.amazonaws.com"
module.exports = Production
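
One detail worth considering when requiring by file name is that any value of NODE_ENV is turned into a path, including common, which is meant to be a base class rather than an environment. A possible guard, sketched in JavaScript with a hypothetical resolveConfigPath helper and an assumed list of known environments, could look like this:

```javascript
const path = require('path');

// Assumed list of loadable environments; "common" is deliberately absent
// because it is only a base class.
const KNOWN_ENVIRONMENTS = ['development', 'test', 'cucumber', 'production'];

// Hypothetical helper: returns the module path for an environment name,
// or throws before anything is required.
function resolveConfigPath(envName, baseDir) {
  const name = (envName || 'development').toLowerCase();
  if (!KNOWN_ENVIRONMENTS.includes(name)) {
    throw new Error('No configuration for environment: ' + name);
  }
  return path.join(baseDir, 'envs', name);
}

console.log(resolveConfigPath('Test', 'config')); // path to config/envs/test
```

The loader would then require the returned path and instantiate the exported class, exactly as config.coffee does above.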

How to launch Mac OS Terminal as Interactive Shell rather than Log-in Shell

As described in the previous post, Mac OS launches its Terminal as a log-in shell rather than an interactive shell, which differs from the default behavior of Unix and Linux. As a result, Terminal loads “.bash_profile” as its profile rather than the usual “.bashrc”.

This unique behavior might cause problems when you try to port a CLI tool from Unix or Linux.
Basically, the ported app assumes that .bash_profile is loaded only once, right after the user logs in. But on Mac OS this assumption is wrong, which can cause some weird problems.

This default behavior is sometimes annoying, but in fact this “unique” behavior of Mac OS Terminal can be configured. What’s more, you can use another shell program, such as ksh, rather than the default bash.

Mac users can customize this behavior in the Preferences dialog of the Terminal app.
Terminal Preferences Dialog

If you choose to launch bash with a command, the launched shell becomes an interactive shell, which loads the .bashrc file rather than the .bash_profile file.

Bash Profile on Mac OS X

In the Linux and Unix world, there are 2 commonly used shell profiles: ~/.bashrc and ~/.bash_profile. Both are usually used to initialize the user’s bash environment, but there are some slight differences between the two.
According to the bash manual, .bashrc is the “interactive-shell startup file”, and .bash_profile is the “login-shell startup file”.

What’s the difference between interactive-shell and login-shell

Basically, a login-shell is the shell opened when a user logs in via the console. It could be the shell opened on the local computer after you enter the correct user name and password, or the shell opened when you ssh to a remote host.
Accordingly, .bash_profile is loaded only once, right after you log into a computer, either locally or remotely.

The interactive-shell, on the other hand, is used more widely and seen more often. It is the shell opened after you have logged in, such as the shell opened from KDE or Gnome.

Mac Terminal’s Pitfall

According to the manual, the Terminal app on Mac is a typical “interactive-shell”, so theoretically Terminal should load “.bashrc” to initialize the shell environment. But in fact Terminal doesn’t load “.bashrc”; it loads “.bash_profile” instead.
So in a word, Mac’s Terminal doesn’t follow the routine strictly. We need to be aware of it.

And not all shells are interactive! If the shell is not interactive, the profile file isn’t loaded to initialize the environment.
A typical non-interactive shell is the shell that TextMate uses to run command scripts, which means that the environment variables, paths and even aliases you use in your daily life might not be available to TextMate’s commands.
And the most painful part: the rvm function won’t be available in TextMate’s command shell either, which means that if you call rake or rails in a TextMate command script, you will very possibly get an error because it cannot find the proper gem or other resources.
So you should always remember to source the “.bash_profile” file, or set up these values once again.

Powershell script to serialize and deserialize hash-object to and from ini-like text

Powershell and the .NET Framework provide a dozen approaches to manipulate hash-objects. It is really easy and convenient to initialize a hash-object with values from environment variables, the registry or CLI arguments.
Hash-objects can also be accessed and built into hierarchies easily, so using a Powershell hash-object as deployment configuration is really powerful and convenient.

But in our system, the application uses ini-like key-value plain text as its initial configuration file. So our deploy script needs the ability to serialize and deserialize hash-objects to and from ini-like config.

So I composed this piece of script.
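
The round trip itself is small. Purely as an illustration, and in JavaScript rather than Powershell, a minimal sketch might look like the following. It is not the Powershell script mentioned above: the function names, the single level of [section] headers and the comment prefixes are all assumptions made for this sketch.

```javascript
// Writes top-level scalars first, then one "[section]" block per nested object.
function serialize(obj) {
  const lines = [];
  const sections = [];
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object') {
      sections.push([key, value]);
    } else {
      lines.push(key + '=' + value);
    }
  }
  for (const [name, body] of sections) {
    lines.push('[' + name + ']');
    for (const [key, value] of Object.entries(body)) {
      lines.push(key + '=' + value);
    }
  }
  return lines.join('\n');
}

// Parses the text back; every value comes back as a string.
function deserialize(text) {
  const result = {};
  let current = result;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === '' || line.startsWith(';') || line.startsWith('#')) continue;
    const header = /^\[(.+)\]$/.exec(line);
    if (header) {
      current = result[header[1]] = {};
      continue;
    }
    const separator = line.indexOf('=');
    if (separator === -1) continue; // ignore malformed lines
    current[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return result;
}

const text = serialize({ host: 'db01', app: { name: 'sample', port: 8080 } });
console.log(text);
console.log(deserialize(text).app.port); // the string "8080", not the number
```

Because ini-like text carries no type information, numbers and booleans come back as strings and have to be converted by the consumer.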