This is essentially a follow-on from my previous post on Node, mainly for the purposes of helping testers like myself figure out if it’s worth spending time on.
Let’s cover a few facts here.
- Node uses JavaScript.
- JavaScript (and thus Node) is single threaded.
- That means JavaScript (and thus Node applications) can do only one thing at a time.
- JavaScript does have the concept of an event loop.
- Node does have the concept of an event-driven programming model.
- The event loop of JavaScript is used to schedule tasks in Node’s event-driven model.
- This event loop and event-driven model are what allow multiple tasks to appear to execute in parallel.
I state all that just to make it somewhat clear how JavaScript and Node benefit each other. The above is a basic bullet list of what I wish I had been told. Or, rather, told in that way.
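To make the "one thing at a time" point concrete, here is a minimal sketch in plain JavaScript. The variable name order is my own, and setTimeout() simply stands in for any task handed to the event loop. The callback is scheduled first but only runs once the currently executing code has finished.

```javascript
// Record the order in which things actually run.
var order = [];

// Schedule a task on the event loop. Even with a delay of 0 it cannot
// run until the code that is currently executing has finished.
setTimeout(function () {
  order.push('timer callback');
}, 0);

order.push('first statement');
order.push('second statement');

// Print the recorded order a little later, after the first timer has fired.
setTimeout(function () {
  console.log(order.join(' -> '));
  // prints "first statement -> second statement -> timer callback"
}, 10);
```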
Here are a few other facts:
- Node operates its logic asynchronously.
- Specifically, Node utilizes continuation-passing style (CPS) programming.
- What this means is that any asynchronous function takes an extra argument.
- That extra argument is a function that is called after the asynchronous code has finished executing.
- This additional argument is referred to as a continuation or, more commonly, a callback function.
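A small sketch may help show the difference between the usual "return a value" style and continuation-passing style. The functions addDirect() and addLater() are hypothetical names made up for this illustration, and setTimeout() stands in for real asynchronous work.

```javascript
// Direct style: the result comes back as the return value.
function addDirect(a, b) {
  return a + b;
}

// Continuation-passing style: the function takes one extra argument,
// the callback, and hands the result to it once the work is done.
function addLater(a, b, callback) {
  setTimeout(function () {
    callback(a + b);
  }, 0);
}

console.log(addDirect(2, 3)); // prints 5

addLater(2, 3, function (sum) {
  console.log(sum); // prints 5, but only after the current code has finished
});
```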
You actually saw this in my examples from the previous post. I had a callback for the createServer() function and a callback for checking for path existence. In fact, the fs.readFile() call from my previous examples was an asynchronous function call. It basically looked like this:
    fs.readFile(localPath, function(err, contents) {
      ...
    });
The readFile() function works asynchronously, reading in the file stored in localPath. Once the file is read, the anonymous callback function is invoked. The callback function takes two parameters, err and contents, which represent any error conditions and the contents of the file, respectively.
Incidentally, some of you may come from Rails or other frameworks that are opinionated and have conventions. Node has some of those as well. For example, one convention is that if a method takes a callback function as an argument, it should be the final argument. Another convention is that if a method takes an error as an argument, it should be the first argument.
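Here is a rough sketch of a function of my own that follows both conventions. The name divide() and its behavior are made up for illustration, and setTimeout() stands in for real asynchronous work. The callback is the final argument, and the error is the first thing the callback receives.

```javascript
// Hypothetical function that follows both conventions:
// the callback comes last, and the callback receives the error first.
function divide(a, b, callback) {
  setTimeout(function () {
    if (b === 0) {
      // Failure: pass an Error as the first argument and nothing else.
      callback(new Error('Cannot divide by zero'));
    } else {
      // Success: pass null for the error, then the result.
      callback(null, a / b);
    }
  }, 0);
}

function report(err, result) {
  if (err) {
    console.log('Failed: ' + err.message);
  } else {
    console.log('Result: ' + result);
  }
}

divide(10, 2, report); // prints "Result: 5"
divide(1, 0, report);  // prints "Failed: Cannot divide by zero"
```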
Something that’s really important to understand is that when readFile() was invoked in my example, it made a nonblocking I/O call to the file system. The fact that the I/O is nonblocking means that Node did not just sit around waiting for the file system to return the data. Instead, Node continued to the next statement. In my case, what that meant is that my server script could keep responding to requests for resources.
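The nonblocking behavior is easy to observe. In this sketch the scratch file name and the events array are assumptions of mine, added only so there is something to read and somewhere to record what happened. The statement after readFile() runs before the callback does.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch file, created only so there is something to read.
var localPath = path.join(os.tmpdir(), 'nonblocking-demo.txt');
fs.writeFileSync(localPath, 'file contents');

var events = [];

fs.readFile(localPath, function (err, contents) {
  events.push(err ? 'read failed' : 'callback got ' + contents.length + ' bytes');
  console.log(events.join(' | '));
  // prints "statement after readFile() | callback got 13 bytes"
});

// readFile() returns immediately, so this runs before the callback above.
events.push('statement after readFile()');
```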
There’s a term in the JavaScript community called “callback hell” and that occurs when callbacks are nested within other callbacks several levels deep. This can lead to code that is confusing, difficult to read and even more difficult to maintain. Consider my code from the simple server example:
    http.createServer(function (req, res) {
      fs.exists(localPath, function(exists) {
        fs.readFile(localPath, function(err, contents) {
This nesting isn’t too terrible, I suppose, but you can see how it could get a whole lot worse.
One way to handle this is to use modular functions as you would in most any programming language. Here’s an example of a server.js file that is a rewrite of the last code example from my previous post:
    var http = require('http');
    var path = require('path');
    var fs = require('fs');

    var resources = {
      ".html": "text/html",
      ".css": "text/css",
      ".js": "application/javascript",
      ".png": "image/png"
    };

    http.createServer(function (req, res) {
      var filename = path.basename(req.url) || 'index.html';
      var ext = path.extname(filename);
      var dir = path.dirname(req.url).substring(1);
      var localPath = __dirname + '/public/';
      if (resources[ext]) {
        localPath += (dir ? dir + "/" : "") + filename;
        fs.exists(localPath, function(exists) {
          if (exists) {
            loadFile(localPath, res, resources[ext]);
          } else {
            res.writeHead(404);
            res.end();
          }
        });
      }
    }).listen(9292, '127.0.0.1');

    console.log('Server running at http://127.0.0.1:9292');

    function loadFile(localPath, res, resource) {
      fs.readFile(localPath, function(err, contents) {
        if (!err) {
          res.writeHead(200, {
            "Content-Type": resource,
            "Content-Length": contents.length
          });
          res.end(contents);
        } else {
          res.writeHead(500);
          res.end();
        }
      });
    }
The main change is on line 21, which now calls a named function, loadFile(), instead of nesting another anonymous callback. Another approach that the JavaScript and Node community seems to favor is to use named functions for the callbacks themselves rather than nesting anonymous functions, as I did. In the case of my example, though, I need to pass extra information to the callback functions, and a callback only receives the arguments that the asynchronous function passes to it (exists for fs.exists(), err and contents for fs.readFile()). Given that, I’m not entirely sure how to go about doing something like that with my example without using global variables. Here’s an example of what I did, however.
    var http = require('http');
    var path = require('path');
    var fs = require('fs');

    var resources = {
      ".html": "text/html",
      ".css": "text/css",
      ".js": "application/javascript",
      ".png": "image/png"
    };

    var response;
    var currentResource;

    http.createServer(function (req, res) {
      var filename = path.basename(req.url) || 'index.html';
      var ext = path.extname(filename);
      var dir = path.dirname(req.url).substring(1);
      var localPath = __dirname + '/public/';

      if (resources[ext]) {
        localPath += (dir ? dir + "/" : "") + filename;
        response = res;
        currentResource = resources[ext];

        fs.exists(localPath, resourceExists);
        fs.readFile(localPath, resourceLoad);
      }
    }).listen(9292, '127.0.0.1');

    console.log('Server running at http://127.0.0.1:9292');

    function resourceExists(resource) {
      if (!resource) {
        response.writeHead(404);
        response.end();
      }
    }

    function resourceLoad(err, contents) {
      if (!err) {
        response.writeHead(200, {
          "Content-Type": currentResource,
          "Content-Length": contents.length
        });
        response.end(contents);
      } else {
        response.writeHead(500);
        response.end();
      }
    }
Lines 23 and 24 now make the response object and the resource’s content type available globally. This allows me to use lines 26 and 27, which now rely on named callback functions. Those callback functions need to reference the response object as well as the resource, which is why I had to make them available globally. Please note that I’m not sure the above is a good way to do this at all. One clear risk is that both variables are shared by every request, so two overlapping requests can overwrite each other’s response object. Also, because fs.readFile() no longer waits for the fs.exists() check, a missing file triggers both the 404 path and the 500 path. Still, I guess it does show the flexibility that I’m allowed. Read that as: I was able to cobble something together, barely knowing what I was doing, and it managed to work. Whether that’s a good thing or not is a matter of opinion, I suppose.
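On the question of how to hand extra information to a named callback without globals, one common JavaScript answer is a closure: a function that builds the callback and remembers the extra values. This is only a sketch under my own assumptions. The helper name makeResourceLoad() and the stand-in fakeRes object are hypothetical, and I have not wired this into the server above.

```javascript
// Hypothetical helper: returns a callback for fs.readFile() that remembers
// the response object and content type through a closure, so neither has
// to live in a global variable.
function makeResourceLoad(res, contentType) {
  return function resourceLoad(err, contents) {
    if (err) {
      res.writeHead(500);
      res.end();
      return;
    }
    res.writeHead(200, {
      'Content-Type': contentType,
      'Content-Length': contents.length
    });
    res.end(contents);
  };
}

// Stand-in for a real response object so the sketch runs without a server.
var fakeRes = {
  status: null,
  body: null,
  writeHead: function (status) { this.status = status; },
  end: function (body) { this.body = body; }
};

// Call the generated callback the way fs.readFile() would on success.
makeResourceLoad(fakeRes, 'text/html')(null, Buffer.from('<p>hi</p>'));
console.log(fakeRes.status); // prints 200
```

Inside the request handler, the call would then look something like fs.readFile(localPath, makeResourceLoad(res, resources[ext])), with each request getting its own callback. Function.prototype.bind() can do a similar job by pre-filling the leading arguments of a named function.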
In my previous post, I said: “I’m poised to enter into the wider Node ecosystem, in which there are thousands of modules and numerous frameworks (like Connect and Express) that make a lot of this easier.”
To provide an example of that, I’ll show you Connect. Connect is a very popular Node middleware framework. Connect was, in fact, inspired by Ruby’s Rack web server interface and was built on top of Node’s web server API. Essentially Connect allows small re-usable programs to be plugged into it to handle HTTP-specific functionalities. To see how this works, install Connect:
npm install connect
Note that this presumes you have npm (the Node Package Manager) installed as part of your Node distribution. Also note that I’m not doing a global install of Connect here, but rather a local one. How would you notice that? Any install that does not use the “-g” switch is a local install. That means the Connect resources will be stored in a directory called node_modules that exists within the directory where I make the above npm call.
Now I’m going to show how to replace some of my code from above, if I were to use Connect. Here’s the same application with a bit less code:
    var connect = require("connect");
    connect(connect.static(__dirname + "/public")).listen(9292);
Did I say a bit less code? Sorry, I meant to say a metric crap-ton less code.
This certainly simplifies things. Sort of. Much like Rails, what this does is bury a lot of the implementation in a framework that has already provided that implementation for you. So while the above seems incredibly nice and tidy, there is a bit of the “magic happens here” effect. As such, it never hurts to actually understand what is going on behind the scenes, which is why I started down in the guts with my original examples. Still, the fact that you can reduce that utter mess of code I had to just two lines has to be counted as somewhat impressive, doesn’t it?
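To take a little of the magic out of it, here is a stripped-down sketch of the middleware idea: a list of small functions, each receiving the request, the response, and a next() function that passes control along. This is my own illustration and not Connect’s actual source. createApp(), logger, and responder are hypothetical names, and the req and res objects are plain stand-ins.

```javascript
// Hypothetical, stripped-down middleware runner (not Connect's real code).
function createApp() {
  var stack = [];

  // The app itself is a request handler that walks the stack in order.
  function app(req, res) {
    var index = 0;
    function next() {
      var layer = stack[index++];
      if (layer) {
        layer(req, res, next);
      }
    }
    next();
  }

  // Plug in one more small, reusable piece.
  app.use = function (middleware) {
    stack.push(middleware);
    return app;
  };

  return app;
}

var app = createApp();

app.use(function logger(req, res, next) {
  req.log = ['saw ' + req.url];
  next(); // hand control to the next middleware
});

app.use(function responder(req, res, next) {
  res.body = 'Hello from ' + req.url; // ends the chain by not calling next()
});

// Plain stand-in objects instead of a real HTTP request and response.
var req = { url: '/index.html' };
var res = {};
app(req, res);
console.log(res.body); // prints "Hello from /index.html"
```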
One more thing I want to do here is talk about another framework called Express. Express was inspired by Ruby’s Sinatra and it, like Connect, is built atop the Node web server API. Here’s the interesting thing: Express grew up on top of Connect. Versions through 3.x depend on Connect directly, while the 4.x line that I install below dropped that dependency but kept the same middleware style. There’s a lot of history to Connect and Express and how they came to be and why they ended up together. That’s covered adequately elsewhere so here I’ll just count myself a beneficiary of the past and use both. I won’t cover a ton of detail here because I just want to show my initial foray.
Express basically lets you create web applications as opposed to server applications, which I’ve been toying around with so far. To test this out, I created a directory called simple_express.
Since I’m going to make an application and since, to Node, applications seem to be modules, it turns out I need a manifest file. A manifest file is a file that contains metadata about your module or app. The content of the file can be used by your module to customize some aspect of itself. Node modules come with a manifest file named package.json. By the way, to ease cognitive friction, I should note that Node modules that come with a package.json file and can be installed using npm, like Express and Connect, are formally called Node packages. Most of the time people seem to use the terms module and package interchangeably.
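As a rough sketch of a module reading its own manifest, the following writes a throwaway package.json to a temporary directory and loads it back. The directory prefix and the manifest values (demo-module and so on) are hypothetical and chosen only for the demonstration. In a real module you would typically just require the package.json that sits next to your code.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch directory so the sketch does not touch a real project.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'manifest-demo-'));
var manifestPath = path.join(dir, 'package.json');

fs.writeFileSync(manifestPath, JSON.stringify({
  name: 'demo-module',
  version: '0.0.1',
  scripts: { start: 'node app' }
}, null, 2));

// require() parses .json files, so the manifest comes back as a plain object.
var manifest = require(manifestPath);

console.log(manifest.name + ' v' + manifest.version);     // prints "demo-module v0.0.1"
console.log('npm start runs: ' + manifest.scripts.start); // prints "npm start runs: node app"
```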
I now created a file called package.json (in my simple_express directory) and put the following in it:
    {
      "name": "simple-app",
      "version": "0.0.1",
      "private": true,
      "scripts": {
        "start": "node app"
      },
      "dependencies": {
        "express": "4.2.0"
      }
    }
Here I specify the name of my module (simple-app) and its version (0.0.1). Specifying private is optional, but I did that because setting private to true means npm will refuse to publish the module to the registry. The scripts key defines commands that npm can run for the module. I’m only defining one, start, which lets me start the app with the command npm start. That will execute the command node app, which means “app” should be the name of my starting file. Finally, the dependencies key is important because it says what other modules my own module depends on. Here I specified an exact version of Express, but I could have just said “grab the latest”, in which case that part would look like this:
    "dependencies": {
      "express": "*"
    }
With that file in place, I’m now able to run npm install, which will install any dependencies I have listed into a node_modules directory in my simple_express directory. Now I’ll start creating the application. In the simple_express directory, I create an app.js file and put the following in it:
    var http = require('http');
    var express = require('express');
    var app = express();

    http.createServer(app).listen(9292, function() {
      console.log('Express App running on port 9292.');
    });

    app.get('/', function(req, res) {
      res.send('Test Node with Express');
    });
To get this running, I could do:
node app.js
or
node app
or
npm start
This corresponds fairly well with my simple apps that I started with in my last post.
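For comparison, here is roughly what that single route looks like with nothing but Node’s http module. This is a sketch under my own assumptions. The handler name, the content type, and the 404 fallback are choices I made for illustration, and the script listens on port 0 (any free port) and makes one request to itself so that it can run and exit on its own.

```javascript
var http = require('http');

// Plain-Node stand-in for the single Express route above.
function handler(req, res) {
  if (req.method === 'GET' && req.url === '/') {
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end('Test Node with Express');
  } else {
    res.writeHead(404);
    res.end();
  }
}

var server = http.createServer(handler);

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', function () {
  var options = {
    host: '127.0.0.1',
    port: server.address().port,
    path: '/',
    agent: false // one-off connection, so the server can shut down cleanly
  };

  http.get(options, function (res) {
    var body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      console.log(res.statusCode + ' ' + body); // prints "200 Test Node with Express"
      server.close();
    });
  });
});
```

The routing check that I wrote by hand here is the part that app.get() takes care of in the Express version.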
There is a lot more that can be done with Express and that is a subject in and of itself, beyond just Node. What I hope I did here is provide enough to get my fellow testers interested in investigating Node and its frameworks.