Learning Node.js, Part 5

Here I’ll take a step back from previous posts and cover the HTTP API of Node.js in a focused way.

At Node’s core is a relatively simple streaming HTTP parser. When I say relatively simple what I mean is that it contains a shade under two thousand lines of optimized C code. This parser, in combination with a low-level TCP API that Node exposes to JavaScript, provides you with a very minimal, but very flexible, HTTP server.

All of this is provided to you by a module called http. Like most modules in Node’s core, the http module favors simplicity. High-level — so-called “sugar” — APIs are left for third-party frameworks which provide simplified mechanisms for building web applications. Two examples of this would be Connect and Express.

In Node, the low-level APIs remain at the core, while abstractions and implementations are built on top of the fundamental building blocks the core provides. What this means is that Node’s core APIs are always lightweight. Because of this design, high-level concepts like sessions or even fundamentals like HTTP cookies are not provided within Node’s core.

This leaves “opinionated” framework-style elements, syntactic sugar, and specific details up to modules that are written by the wider Node.js community. Community members take the core APIs and create modules that allow you to get tasks done more easily. They do this by building infrastructure or functionality for you.

All of this should give you some insight into Node’s philosophy of construction, which is to provide small but robust networking APIs that can be built upon. This means Node makes no pretense to competing with high-level frameworks like Rails, Django, or Zend. Instead, Node serves as a platform on which such frameworks can be built.

The HTTP Module

Node.js is a modular framework built with a modern module system that is based on the CommonJS standard. Everything in Node.js is built as modules running in the V8 JavaScript engine. Since all the functionality in Node.js is built as modules, you need to import it in your code using a require statement. Node.js has several modules compiled in a binary form, called core modules. The HTTP module is one of those core modules.

The Basic HTTP API

To create an HTTP server, you call a createServer() function on an http object that holds a reference to the http module. Let’s try out a simple example. Create a file called server.js with the following:
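A minimal sketch of what such a server.js might look like, assuming port 3000 and the response text ‘testing’ that this post refers to further down:

```javascript
// server.js
const http = require('http');

// The request listener is called once for every incoming HTTP request.
const server = http.createServer(function (req, res) {
  res.write('testing'); // write response data to the socket
  res.end();            // signal that the response is complete
});

// Bind the server to a port so it can listen for incoming requests.
server.listen(3000);
```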

With the first statement, Node will load the core HTTP module, and that module will be available in a variable called http.

The createServer() function accepts a single argument, which is a request listener. The request listener is a function that handles the incoming requests. In this case, the function is passed inline but you could also call out to a separate function that you define. To be completely accurate, the request listener here is a callback function. This callback function will be called on each HTTP request received by the server. This request callback function receives, as arguments, the request and response objects that are part of the HTTP communication.

Understand that every request that comes in to the server triggers an event, which is then handled by the event handler asynchronously. The event in this case is receiving a request to serve. The request object will have all the information about the request, such as the URL, the HTTP method, any relevant headers, and data, if any, that is being sent along with the request. The response object contains any information that gets sent back to the client and it exposes various methods to write the response to the client.

So, with this code, when the callback function has been triggered, the write() method, on the response object, writes response data to the socket. Then the end() method, also called on the response object, ends the response.

The final line of code has you bind the server to a port so that you can listen for incoming requests. To test this out, simply run it:

node server.js

If you visit http://localhost:3000 you will see the text ‘testing’ returned to the browser. This admittedly simple execution demonstrates the bare minimum required for a proper HTTP response.

For every HTTP request received by the server, the request callback function will be invoked with new req and res objects. Prior to the callback being triggered, Node will parse the request up through the HTTP headers and provide those headers as part of the request object.

You can check the request headers if you want to see what gets sent in:
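One way to do that, sketched here on the assumption that logging to the console is good enough, is to print req.headers from inside the request listener:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  // req.headers is a plain object; Node lower-cases the header names.
  console.log(req.headers);
  res.write('testing');
  res.end();
});

server.listen(3000);
```

Each request from the browser prints an object with entries such as host and user-agent to the terminal where the server is running.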

Incidentally, for any change you make you have to stop the server and restart it — unless you are using something like nodemon, node-supervisor, or something similar. See the first post in this series, under the section “Monitoring for Changes” if you want to save yourself the stop-start process.

Just to be clear regarding the above example: it shows you the request headers (that is, the headers sent to the server), but headers are also sent as part of the response. If you say nothing to the contrary in your code, Node uses the default status code of 200 (indicating success) and a small set of default response headers, such as Date and Connection. Note that Node does not add a Content-Type header on its own. So the above code is executing much as if an explicit writeHead() call like the following were in place:
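A sketch of such a line in context, assuming the same ‘testing’ server on port 3000. The Content-Type value is an assumption made for illustration; Node’s own defaults cover the status code and headers such as Date and Connection, but not Content-Type:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  // Status code first, then an object of header names and values.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.write('testing');
  res.end();
});

server.listen(3000);
```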

The writeHead() method does just what it sounds like: it writes out a header to the response stream. In this case, you pass a value for the status code and a hash that holds a specific header key-value pair.

Now let’s set some of the headers for the response, just so you can see how to do it.
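Here is a minimal sketch using setHeader(), assuming the same ‘testing’ body as before. The choice of headers (Content-Length and Content-Type) is mine, for illustration:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  const body = 'testing';
  // Content-Length is a byte count, so measure bytes rather than characters.
  res.setHeader('Content-Length', Buffer.byteLength(body));
  res.setHeader('Content-Type', 'text/plain');
  res.write(body);
  res.end();
});

server.listen(3000);
```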

Regarding that Content-Length entry, the idea there is that to speed up responses, the Content-Length field should be sent with your response when possible. Setting the Content-Length header implicitly disables what’s called “chunked encoding.” This provides a performance boost because less data needs to be transferred.

It’s very important to note that you can set headers, and remove them, as well as change status codes, only up to the first write() or end() call. Once the first write() call occurs, Node flushes the HTTP headers.

To show why the headers matter, let’s do a slight variation on the above.
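As a sketch of that variation, assume the body is now a small piece of HTML while the Content-Type header still says plain text. The markup itself is a made-up example:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  const body = '<h1>Testing</h1>';
  res.setHeader('Content-Length', Buffer.byteLength(body));
  // The header still claims plain text even though the body is HTML.
  res.setHeader('Content-Type', 'text/plain');
  res.end(body);
});

server.listen(3000);
```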

You will find that the output, though in HTML format, is rendered by the browser as text because the Content-Type header states that plain text is being returned. A simple change will take care of this problem:
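The change amounts to declaring the content as HTML. A sketch, assuming the same made-up markup as the body:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  const body = '<h1>Testing</h1>';
  res.setHeader('Content-Length', Buffer.byteLength(body));
  // Tell the browser the body is HTML so that it parses the markup.
  res.setHeader('Content-Type', 'text/html');
  res.end(body);
});

server.listen(3000);
```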

Now the response header indicates that the content being returned is in HTML format and the browser will respond accordingly, parsing the HTML rather than simply displaying it.

Now let’s try setting the status code. To do this, I’ll need to make a few modifications.
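A sketch of what those modifications could look like. The redirect target http://example.com/ is a hypothetical placeholder, and setting res.statusCode directly is one of a few ways to change the status:

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
  const target = 'http://example.com/'; // hypothetical redirect target
  const body = '<p>Redirecting to <a href="' + target + '">' + target + '</a></p>';
  res.setHeader('Location', target);
  res.setHeader('Content-Length', Buffer.byteLength(body));
  res.setHeader('Content-Type', 'text/html');
  // Must be set before the first write() or end(), while headers are unsent.
  res.statusCode = 302;
  res.end(body);
});

server.listen(3000);
```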

Here I set a header saying that the location of this response is a particular URL. Further, while the rendering of the page would be a status code of 200, since the page successfully gets returned as part of the request, I have specifically set the status code to 302 to indicate that there was a redirect.

So the key things to note with these examples are that you can control entirely what gets sent back as a response and, in fact, Node’s HTTP module is assuming very little about how you want your server to operate.

Routing

All examples shown so far worked by passing in the request listener — the callback — inline. I also mentioned it is possible to call a function directly. So I’ll do that while showing an example of how to perform routing.

Just about any web application will serve more than one resource, where “resource” can mean an HTML page, an image, a stylesheet, and so on. At this point you know how to serve content using an HTTP server, but how do you handle multiple resources? That’s what routing is for. Specifically, your server needs to understand the incoming request and map it to the appropriate request handler, which means the request is routed.

To demonstrate the routing of requests, we’ll build a slight modification of the above. This will still be a very simple application that serves two resources at /hello and /goodbye, displaying some specific text for each.
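A minimal sketch of such an application follows. The response texts are placeholders, and the pathname is read with the URL class exported by the url module; the original listing may well have used a different call, such as url.parse():

```javascript
const http = require('http');
const { URL } = require('url');

function onRequest(req, res) {
  // req.url holds only the path and query, so a base is needed to build a URL.
  const pathname = new URL(req.url, 'http://localhost').pathname;

  if (pathname === '/hello') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello');
  } else if (pathname === '/goodbye') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Goodbye');
  } else {
    // No route matched.
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}

const server = http.createServer(onRequest);
server.listen(3000);
```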

If you run this, you should be able to go to http://localhost:3000/goodbye and http://localhost:3000/hello and see the appropriate text.

Looking at the code, you’ll see I’m requiring another module of Node, called url. The reason is that the first thing you need in order to route a request is to parse the URL that comes in as part of the request. Parsing a URL string with the url module returns a URL object whose properties hold the individual parts of the URL. In this case, I’m interested in the pathname. So in the request handler I pass the url string from the request to the url module and read the pathname from the result.

What this means is that when I type in the url http://localhost:3000/hello in my browser, the pathname variable will hold “/hello”.

Notice how my call to createServer now calls the onRequest function, which serves as my named callback function. Prior to this the callback function was defined inline and thus was anonymous.

The next step is to send an appropriate response, based on the path being accessed. In this code, I simply check the value of pathname and then re-use very similar logic from the previous examples to write out a response. Do note that if no route can be matched, I generate a 404 response.

I trust it’s clear that this is an incredibly inefficient way to go about routing since I would have to add many more if/else statements. That said, if my server needs were simple enough, this might be the most effective solution. But let’s go on the assumption that it’s not scalable enough and make this functionality a little more robust.

Instead of the above logic, I can put the handlers in a route object. The handlers will be mapped by their paths. This will allow me to provide a small API to extend the functionality provided by the route object. Here’s a modified example:
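What follows is a sketch of that idea rather than a definitive implementation. The property name routes and the response texts are assumptions, and the URL class stands in for whatever parsing call the original listing used:

```javascript
const http = require('http');
const { URL } = require('url');

const route = {
  routes: {}, // maps a path to its handler function
  // Register a handler for a path.
  for: function (path, handler) {
    this.routes[path] = handler;
  }
};

route.for('/hello', function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello');
});

route.for('/goodbye', function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Goodbye');
});

function onRequest(req, res) {
  const pathname = new URL(req.url, 'http://localhost').pathname;
  const handler = route.routes[pathname];

  // Run the handler only if one is registered and it is a function.
  if (typeof handler === 'function') {
    handler(req, res);
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}

const server = http.createServer(onRequest);
server.listen(3000);
```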

The functionality of the application is exactly the same as before but in this approach I don’t need to continually add “else if” blocks. Instead, I can add new routes by calling the route.for method from any module. Note that for is a method provided by the route object. The route object holds a map of each path to its handler function, and the for method is what adds new routes to that map. So in this case, I added two handlers for the paths /hello and /goodbye.

Take some time to study the above code and make sure you can see what it’s doing. Do note that the signature for the handlers is similar to the main request handler in that I expect the handlers to get the request (req) and response (res) so that the handler has everything it needs to process the request and send the response.

In the if condition of the request handler, I check whether the route for the pathname is present, and whether it’s a function. If I find a handler for the requested path, I simply execute the handler function, passing the request and response to it. If a handler is not found, I respond with the HTTP 404 response as I did in the previous example when no route could be found.

The nice thing about this approach is that to add a new path, I can just call the route.for method with the path and its handler to register it.

However, you can be forgiven for thinking something like this: “Okay, this started out nice and simple. But it’s getting a little … messy.” Well, the trick with Node.js is that you are building your own server. However, given Node’s modular system, the nice thing for you is that someone else has probably already done what you need. And, in fact, that’s the case, such as with the Express framework as well as others.

An HTTP-Compliant App Using the API

Everything I dealt with above was utilizing a GET request. It helps to understand the HTTP verbs when building your own web server, as you can probably imagine. GET is one of those verbs and, so far, it’s the only one I’m handling. There was no need to worry about handling, say, a POST request. So first let’s enhance our above app to handle one different HTTP method: POST.

As the first step towards this, we will add the ability to add different handlers for different methods. We will add the methods in the mapping.
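One way to sketch this is to make the HTTP method part of the lookup key. The three-argument form of route.for and the key format (method followed by path) are assumptions made for illustration:

```javascript
const http = require('http');
const { URL } = require('url');

const route = {
  routes: {}, // maps "METHOD/path" to a handler function
  for: function (method, path, handler) {
    this.routes[method + path] = handler;
  }
};

route.for('GET', '/hello', function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello');
});

function onRequest(req, res) {
  const pathname = new URL(req.url, 'http://localhost').pathname;
  // req.method is the HTTP verb of the request, for example 'GET' or 'POST'.
  const handler = route.routes[req.method + pathname];

  if (typeof handler === 'function') {
    handler(req, res);
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}

const server = http.createServer(onRequest);
server.listen(3000);
```

With this mapping, a GET to /hello finds its handler, while a POST to the same path falls through to the 404 branch until a POST handler is registered.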

To test a different HTTP method, like a POST call, we have to have some way to post a request to a handler. There are various ways to do this but let’s actually create a simple HTML form that will allow us to post data to the server. To do this we’ll add an event handler for a /form path.
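A sketch of such a handler, built on an assumed method-aware route object. The form markup, including the field name message, is a made-up example:

```javascript
const http = require('http');
const { URL } = require('url');

const route = {
  routes: {},
  for: function (method, path, handler) {
    this.routes[method + path] = handler;
  }
};

// Serve a small form that posts back to the same path.
route.for('GET', '/form', function (req, res) {
  const body = '<form method="POST" action="/form">' +
    '<input type="text" name="message">' +
    '<input type="submit" value="echo">' +
    '</form>';
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end(body);
});

function onRequest(req, res) {
  const pathname = new URL(req.url, 'http://localhost').pathname;
  const handler = route.routes[req.method + pathname];

  if (typeof handler === 'function') {
    handler(req, res);
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}

const server = http.createServer(onRequest);
server.listen(3000);
```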

If you go to http://localhost:3000/form, you will see a text field along with an echo button which you can click. Clicking the button will generate a POST request and, since the current app doesn’t handle POST requests (there is no route for one), a 404 will be returned. So let’s add the route to handle a POST request.
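A sketch of that route, again on top of an assumed method-aware route object and the made-up form from before. The setEncoding('utf8') call is my addition, so that chunks arrive as strings rather than Buffers:

```javascript
const http = require('http');
const { URL } = require('url');

const route = {
  routes: {},
  for: function (method, path, handler) {
    this.routes[method + path] = handler;
  }
};

route.for('GET', '/form', function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end('<form method="POST" action="/form">' +
    '<input type="text" name="message">' +
    '<input type="submit" value="echo"></form>');
});

route.for('POST', '/form', function (req, res) {
  let message = '';
  req.setEncoding('utf8');

  // Each 'data' event delivers one chunk of the request body.
  req.on('data', function (chunk) {
    message += chunk;
  });

  // 'end' fires once the whole body has been received.
  req.on('end', function () {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(message); // echo the accumulated data back
  });
});

function onRequest(req, res) {
  const pathname = new URL(req.url, 'http://localhost').pathname;
  const handler = route.routes[req.method + pathname];

  if (typeof handler === 'function') {
    handler(req, res);
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}

const server = http.createServer(onRequest);
server.listen(3000);
```

Because the form is submitted URL-encoded, the echoed text is the raw body (for example message=hello) rather than just the typed value.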

Here I’m adding a new handler for the POST request on the /form path. Since req is an event emitter, I’m able to attach an event handler to it for each task: for handling chunks of incoming data and for completing the request processing once all the data is received. It’s important to understand that when Node’s HTTP parser reads in and parses request data, it makes that data available in the form of data events that contain chunks of parsed data ready to be handled by the program.

So the first thing the POST handler does is add a listener on the request to handle chunks of incoming data. Specifically, I’m saying that I’m looking to handle the ‘data’ event on the request object. In this case, all I do is accumulate the incoming data into a message string.

The event handler for the ‘end’ event will be invoked once all the data sent in POST has been received. This is the time at which the server finishes receiving all the data. To echo all this back, I just send back all the accumulated data from inside that ‘end’ handler.

Wrapping Up

This post covered a lot of detail. As you can tell, the goal was not to end up with the best looking server in the world. In fact, there are two aspects to this immediate code that could use improvement:

  1. Make it serve static files.
  2. Modularize it so the server and router are not packed with the handlers.

You also no doubt see that you are doing all of the server building work yourself whereas normally you would let a framework like Express handle those details for you. Some of these topics, like serving static files and using Express, were covered in other posts in this series. The reason I went back to basics with this post is that I feel it’s really helpful to do that after you’ve explored beyond those basics.

About Jeff Nyman

Anything I put here is an approximation of the truth. You're getting a particular view of myself ... and it's the view I'm choosing to present to you. If you've never met me before in person, please realize I'm not the same in person as I am in writing. That's because I can only put part of myself down into words. If you have met me before in person then I'd ask you to consider that the view you've formed that way and the view you come to by reading what I say here may, in fact, both be true. I'd advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.