<?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"><head><title>Chapter 4. Core APIs</title><link rel="stylesheet" href="core.css" type="text/css"/><meta name="generator" content="DocBook XSL Stylesheets V1.74.0"/></head><body><div class="chapter" title="Chapter 4. Core APIs"><div class="titlepage"><div><div><h1 class="title"><a id="chapter_6"/>Chapter 4. Core APIs</h1></div></div></div><p>There are a lot of APIs in Node, <a id="no4.0" class="indexterm"/><a id="ap4.0" class="indexterm"/><a id="I_indexterm1_d1e3866" class="indexterm"/>but some of them are more important than others. These core APIs will form the backbone of any Node app, and you’ll find yourself using them again and again.</p><div class="sect1" title="Events"><div class="titlepage"><div><div><h1 class="title"><a id="chap6_id35941616"/>Events</h1></div></div></div><p><a id="chap6_id35941797"/>The first API we are going to look at is <a id="I_indexterm1_d1e3877" class="indexterm"/>the <code class="literal">Events</code> API. This is because, while abstract, it is a fundamental piece of making every other API work. By having a good grip on this API, you’ll be able to use all the other APIs effectively.</p><p><a id="chap6_id35941805"/>If you’ve ever programmed JavaScript in the browser, you’ll have used events before. However, the event model used in the browser comes from the DOM rather than JavaScript itself, and a lot of the concepts in the DOM don’t necessarily make sense out of that context. Let’s look at the DOM model of events and compare it to the implementation in Node.</p><p><a id="chap6_id35941811"/>The DOM has a user-driven event model based on user interaction, with a set of interface elements arranged in a tree structure (HTML, XML, etc.). 
This means that when a user interacts with a particular part of the interface, there is an event and a context, which is the HTML/XML element on which the click or other activity took place. That context has a parent and potentially children. Because the context is within a tree, the model includes the concepts of bubbling and capturing, which allow elements up or down the tree to receive the event that was fired.</p><p><a id="chap6_id35951007"/>For example, in an HTML list, a click event on an <code class="literal">&lt;li&gt;</code> bubbles up to a listener on the <code class="literal">&lt;ul&gt;</code> that is its parent. Conversely, a capturing listener on the <code class="literal">&lt;ul&gt;</code> sees that click on its way down the tree, before it reaches the <code class="literal">&lt;li&gt;</code>. Because JavaScript objects don’t have this kind of tree structure, the model in Node is much simpler.</p><div class="sect2" title="EventEmitter"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35941832"/>EventEmitter</h2></div></div></div><p><a id="chap6_id35941838"/>Because the event model is tied to the DOM in browsers, Node created the<a id="I_indexterm1_d1e3910" class="indexterm"/><a id="I_indexterm1_d1e3915" class="indexterm"/> <code class="literal">Event</code><code class="literal">Emitter</code> class to provide some basic event functionality. All event functionality in Node revolves around <code class="literal">EventEmitter</code> because it is also designed to be an interface class for other classes to extend. It would be unusual to use a bare <code class="literal">EventEmitter</code> instance directly.</p><p><a id="chap6_id35951058"/><code class="literal">EventEmitter</code> has a handful of methods, the main two <a id="I_indexterm1_d1e3936" class="indexterm"/><a id="I_indexterm1_d1e3941" class="indexterm"/>being <code class="literal">on</code> and <code class="literal">emit</code>. 
The class provides these methods for use by <a id="I_indexterm1_d1e3953" class="indexterm"/>other classes. The <code class="literal">on</code> method creates an <a id="I_indexterm1_d1e3963" class="indexterm"/><a id="I_indexterm1_d1e3968" class="indexterm"/><a id="I_indexterm1_d1e3973" class="indexterm"/>event listener for an event, as shown in <a class="xref" href="ch04.html#chap6_id35941862" title="Example 4-1. Listening for an event with the on method">Example 4-1</a>.</p><div class="example"><a id="chap6_id35941862"/><p class="title">Example 4-1. Listening for an event with the on method</p><div class="example-contents"><a id="chap6_id35941868"/><pre class="programlisting">server.on('event', function(a, b, c) {
  //do things
});</pre></div></div><p><a id="chap6_id35941878"/>The <code class="literal">on</code> method takes two parameters: the name of the event to listen for and the function to call when that event is emitted. Because <code class="literal">EventEmitter</code> is an interface pseudoclass, the class that inherits from <code class="literal">EventEmitter</code> is expected to be invoked <a id="I_indexterm1_d1e3997" class="indexterm"/>with the <code class="literal">new</code> keyword. Let’s look at <a class="xref" href="ch04.html#chap6_id35941905" title="Example 4-2. Creating a new class that supports events with EventEmitter">Example 4-2</a> to see how we create a new class as a listener.</p><div class="example"><a id="chap6_id35941905"/><p class="title">Example 4-2. 
Creating a new class that supports events with EventEmitter</p><div class="example-contents"><a id="chap6_id35941911"/><pre class="programlisting">var util = require('util'),
    EventEmitter = require('events').EventEmitter;

var Server = function() {
  console.log('init');
};

util.inherits(Server, EventEmitter);

var s = new Server();

s.on('abc', function() {
  console.log('abc');
});</pre></div></div><p><a id="chap6_id35941922"/>We begin this example by including the <code class="literal">util</code> <a id="I_indexterm1_d1e4017" class="indexterm"/>module so we can use the<a id="I_indexterm1_d1e4023" class="indexterm"/> <code class="literal">inherits</code> method. <code class="literal">inherits</code> provides a way for the <code class="literal">EventEmitter</code> class to add its methods to the <code class="literal">Server</code> class we created. This means all new instances of <code class="literal">Server</code> can be used as <code class="literal">Event</code><code class="literal">Emitter</code>s.</p><p><a id="chap6_id35951154"/>We then include the <code class="literal">events</code> module. However, we want to access just the specific <code class="literal">EventEmitter</code> class inside that module. Note how <code class="literal">EventEmitter</code> is capitalized to show it is a class. We didn’t use a <code class="literal">createEventEmitter</code> method, because we aren’t planning to use an <code class="literal">EventEmitter</code> directly. We simply want to attach its methods to the <code class="literal">Server</code> class we are going to make.</p><p><a id="chap6_id35941951"/>Once we have included the modules we need, the next step is to create our basic <code class="literal">Server</code> class. This offers just one simple function, which logs a message when it is initialized. In a real implementation, we would decorate the <code class="literal">Server</code> class prototype with the functions that the class would use. 
For the sake of simplicity, we’ve skipped that. The important step is to use <code class="literal">util.inherits</code> to add <code class="literal">EventEmitter</code> as a superclass of our <code class="literal">Server</code> class.</p><p><a id="chap6_id35941967"/>When we want to use the <code class="literal">Server</code> class, we instantiate it with <code class="literal">new Server()</code>. This instance of <code class="literal">Server</code> will have access to the methods in the superclass (<code class="literal">EventEmitter</code>), which means we can add a listener to our instance using the <code class="literal">on</code> method.</p><p><a id="chap6_id35941987"/>Right now, however, the event listener we added will never be called, because the <code class="literal">abc</code> event isn’t fired. We can fix this by adding the code in <a class="xref" href="ch04.html#chap6_id35941998" title="Example 4-3. Emitting an event">Example 4-3</a> to <code class="literal">emit</code> the <a id="I_indexterm1_d1e4116" class="indexterm"/>event.</p><div class="example"><a id="chap6_id35941998"/><p class="title">Example 4-3. Emitting an event</p><div class="example-contents"><a id="chap6_id35942004"/><pre class="programlisting">s.emit('abc');</pre></div></div><p><a id="chap6_id35942014"/>Firing the <a id="I_indexterm1_d1e4130" class="indexterm"/>event listener is as simple as calling the <code class="literal">emit</code> method that the <code class="literal">Server</code> instance inherited from <code class="literal">EventEmitter</code>. It’s important to note that these events are instance-based. There are no <span class="emphasis"><em>global</em></span> events. When you call the <code class="literal">on</code> method, you attach to a specific <code class="literal">EventEmitter</code>-based object. Even the various instances of the <code class="literal">Server</code> class don’t share events. 
<code class="literal">s</code> from the code in <a class="xref" href="ch04.html#chap6_id35941998" title="Example 4-3. Emitting an event">Example 4-3</a> will not share the same events as another <code class="literal">Server</code> instance, such as one created by <code class="literal">var z = new Server();</code>.</p></div><div class="sect2" title="Callback Syntax"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35942050"/>Callback Syntax</h2></div></div></div><p><a id="chap6_id35942056"/>An important part of using <a id="ev4.1.2" class="indexterm"/><a id="ca4.1.2" class="indexterm"/><a id="eva4.1.2" class="indexterm"/>events is dealing with callbacks. Chapter 3 looks at best practices in much more depth, but we’ll look here at the mechanics of callbacks in Node. They use a few standard <a id="I_indexterm1_d1e4191" class="indexterm"/>patterns, but first let’s discuss what is possible.</p><p><a id="chap6_id35942081"/>When <a id="I_indexterm1_d1e4199" class="indexterm"/><a id="I_indexterm1_d1e4204" class="indexterm"/>calling <code class="literal">emit</code>, in addition to the event name, you can also pass an arbitrary list of parameters. <a class="xref" href="ch04.html#chap6_id35942065" title="Example 4-4. Passing parameters when emitting an event">Example 4-4</a> includes three such parameters. These will be passed to the function listening to the event. When you receive <a id="I_indexterm1_d1e4215" class="indexterm"/>a <code class="literal">request</code> event from the <code class="literal">http</code> server, for example, you receive two parameters: <code class="literal">req</code> and <code class="literal">res</code>. When the<a id="I_indexterm1_d1e4232" class="indexterm"/> <code class="literal">request</code> event was emitted, those parameters were passed as the second and third arguments to the <code class="literal">emit</code>.</p><div class="example"><a id="chap6_id35942065"/><p class="title">Example 4-4. 
Passing parameters when emitting an event</p><div class="example-contents"><a id="chap6_id35942071"/><pre class="programlisting">s.emit('abc', a, b, c);</pre></div></div><p><a id="chap6_id35942109"/>It is important to understand how Node calls the event listeners because it will affect your programming style. When <code class="literal">emit()</code> is called with arguments, the code in <a class="xref" href="ch04.html#chap6_id35942117" title="Example 4-5. Calling event listeners from emit">Example 4-5</a> is used to call each <a id="I_indexterm1_d1e4254" class="indexterm"/>event listener.</p><div class="example"><a id="chap6_id35942117"/><p class="title">Example 4-5. Calling event listeners from emit</p><div class="example-contents"><a id="chap6_id35942123"/><pre class="programlisting">if (arguments.length &lt;= 3) {
  // fast case
  handler.call(this, arguments[1], arguments[2]);
} else {
  // slower
  var args = Array.prototype.slice.call(arguments, 1);
  handler.apply(this, args);
}</pre></div></div><p><a id="chap6_id35942133"/>This code uses both of the JavaScript methods for calling a function from code. If <code class="literal">emit()</code> is called with three or fewer arguments (the event name plus up to two parameters), the method takes a shortcut and uses <code class="literal">call</code>. Otherwise, it uses the slower <code class="literal">apply</code> to pass all the arguments as an <code class="literal">array</code>. The important thing to recognize here, though, is that Node makes both of these calls using the <code class="literal">this</code> argument directly. This means that the context in which the event listeners are called is the context of <code class="literal">EventEmitter</code>, <span class="emphasis"><em>not</em></span> their original context. Using the Node REPL, you can see what is happening when things get called by <code class="literal">EventEmitter</code> (<a class="xref" href="ch04.html#chap6_id35942160" title="Example 4-6. 
The changes in context caused by EventEmitter">Example 4-6</a>).</p><div class="example"><a id="chap6_id35942160"/><p class="title">Example 4-6. The changes in context caused by EventEmitter</p><div class="example-contents"><a id="chap6_id35942165"/><pre class="programlisting">&gt; var EventEmitter = require('events').EventEmitter,
... util = require('util');
&gt;
&gt; var Server = function() {};
&gt; util.inherits(Server, EventEmitter);
&gt; Server.prototype.outputThis= function(output) {
... console.log(this);
... console.log(output);
... };
[Function]
&gt;
&gt; Server.prototype.emitOutput = function(input) {
... this.emit('output', input);
... };
[Function]
&gt;
&gt; Server.prototype.callEmitOutput = function() {
... this.emitOutput('innerEmitOutput');
... };
[Function]
&gt;
&gt; var s = new Server();
&gt; s.on('output', s.outputThis);
{ _events: { output: [Function] } }
&gt; s.emitOutput('outerEmitOutput');
{ _events: { output: [Function] } }
outerEmitOutput
&gt; s.callEmitOutput();
{ _events: { output: [Function] } }
innerEmitOutput
&gt; s.emit('output', 'Direct');
{ _events: { output: [Function] } }
Direct
true
&gt;</pre></div></div><p><a id="chap6_id35942176"/>The sample output first sets up a <code class="literal">Server</code> class. It includes functions to <code class="literal">emit</code> the <code class="literal">output</code> event. The <code class="literal">outputThis</code> method is attached to the <code class="literal">output</code> event as an event listener. When we <code class="literal">emit</code> the <code class="literal">output</code> event from various contexts, we stay within the scope of the <code class="literal">EventEmitter</code> object, so the value of <code class="literal">this</code> that <code class="literal">s.outputThis</code> has access to is the one belonging to the <code class="literal">EventEmitter</code>. 
Consequently, the <code class="literal">this</code> variable must be passed in as a parameter and assigned to a variable if we wish to make use of it in event callback functions<a id="I_indexterm1_d1e4339" class="indexterm"/><a id="I_indexterm1_d1e4341" class="indexterm"/><a id="I_indexterm1_d1e4343" class="indexterm"/>.</p></div></div><div class="sect1" title="HTTP"><div class="titlepage"><div><div><h1 class="title"><a id="chap6_id35942219"/>HTTP</h1></div></div></div><p><a id="chap6_id35942224"/>One of the core tasks of Node.js is to act as a <a id="I_indexterm1_d1e4351" class="indexterm"/>web server. This is such a key part of the system that when <a id="I_indexterm1_d1e4357" class="indexterm"/>Ryan Dahl started the project, he rewrote the HTTP stack for V8 to make it nonblocking. Although both the API and the internals for the original HTTP <a id="I_indexterm1_d1e4361" class="indexterm"/>implementation have morphed a lot since it was created, the core activities are still the same. The Node implementation of HTTP is nonblocking and fast. Much of the code has moved from C into JavaScript.</p><p><a id="chap6_id35942231"/>HTTP uses a pattern that is common in Node. Pseudoclass <a id="I_indexterm1_d1e4369" class="indexterm"/>factories provide an easy way to create a new <a id="I_indexterm1_d1e4373" class="indexterm"/>server.<sup>[<a id="id828643" href="#ftn.id828643" class="footnote">7</a>]</sup> The <code class="literal">http.createServer()</code> <a id="I_indexterm1_d1e4391" class="indexterm"/>method provides us with a new instance of the HTTP <code class="literal">Server</code> class, which is the class we use to define the actions taken when Node receives incoming HTTP requests. There are a few other main pieces of the HTTP module and other Node modules in general. These are the events the <code class="literal">Server</code> class fires and the data structures that are passed to the callbacks. 
Knowing about these three types of class allows you to use the HTTP module well.</p><div class="sect2" title="HTTP Servers"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35942268"/>HTTP Servers</h2></div></div></div><p><a id="chap6_id35942274"/>Acting as an<a id="hts4.2.1" class="indexterm"/><a id="htm4.2.1" class="indexterm"/> HTTP server is probably the most common current use case for Node. In <a class="xref" href="ch01.html" title="Chapter 1. A Very Brief Introduction to Node.js">Chapter 1</a>, we set up an HTTP server and used it to serve a very simple request. However, HTTP is a lot more multifaceted than that. The server component of the HTTP module provides the raw tools to build complex and comprehensive web servers. In this chapter, we are going to explore the mechanics of dealing with requests and issuing responses. Even if you end up using a higher-level server such as Express, many of the concepts it uses are extensions of those defined here.</p><p><a id="chap6_id35942281"/>As we’ve already seen, the first step in using HTTP servers is to create a new server using the <code class="literal">http.createServer()</code> <a id="I_indexterm1_d1e4427" class="indexterm"/>method. This returns a new instance of <a id="I_indexterm1_d1e4433" class="indexterm"/><a id="I_indexterm1_d1e4438" class="indexterm"/>the <code class="literal">Server</code> class, which has only a few methods because most of the functionality is going to be provided through using events. The <code class="literal">http</code> server class has six events and three methods. The other thing to notice is how most of the methods are used to initialize the server, whereas events are used during its operation.</p><p><a id="chap6_id35942295"/>Let’s start by creating the smallest basic HTTP server code we can in <a class="xref" href="ch04.html#chap6_id35942299" title="Example 4-7. 
A simple, and very short, HTTP server">Example 4-7</a>.</p><div class="example"><a id="chap6_id35942299"/><p class="title">Example 4-7. A simple, and very short, HTTP server</p><div class="example-contents"><a id="chap6_id35942304"/><pre class="programlisting">require('http').createServer(function(req,res){res.writeHead(200, {}); res.end('hello world');}).listen(8125);</pre></div></div><p><a id="chap6_id35942315"/>This example is <span class="emphasis"><em>not</em></span> good code. However, it illustrates some important points. We’ll fix the style shortly. The first thing we do is<a id="I_indexterm1_d1e4464" class="indexterm"/> <code class="literal">require</code> the <code class="literal">http</code> module. Notice how we can chain methods to access the module without first assigning it to a variable. Many things in Node return a function,<sup>[<a id="id828838" href="#ftn.id828838" class="footnote">8</a>]</sup> which allows us to invoke those functions immediately. From the included <code class="literal">http</code> module, we call <code class="literal">createServer</code>. This doesn’t have to take any arguments, but we pass it a function to attach to the <code class="literal">request</code> event. Finally, we tell the server created with <code class="literal">createServer</code> to <code class="literal">listen</code> on <a id="I_indexterm1_d1e4494" class="indexterm"/><a id="I_indexterm1_d1e4500" class="indexterm"/>port 8125.</p><p><a id="chap6_id35942351"/>We hope you never write code like this in real situations, but it does show the flexibility of the syntax and the potential brevity of the language. Let’s be a lot more explicit about our code. The rewrite in <a class="xref" href="ch04.html#chap6_id35942357" title="Example 4-8. A simple, but more descriptive, HTTP server">Example 4-8</a> should make it a lot easier to understand and maintain.</p><div class="example"><a id="chap6_id35942357"/><p class="title">Example 4-8. 
A simple, but more descriptive, HTTP server</p><div class="example-contents"><a id="chap6_id35942362"/><pre class="programlisting">var http = require('http');
var server = http.createServer();

var handleReq = function(req,res){
  res.writeHead(200, {});
  res.end('hello world');
};

server.on('request', handleReq);
server.listen(8125);</pre></div></div><p><a id="chap6_id35942373"/>This example implements the minimal web server again. However, we’ve started assigning things to named variables. This not only makes the code easier to read than when it’s chained, but also means you can reuse it. For example, it’s not uncommon to use <code class="literal">http</code> more than once in a file. You might want both an HTTP server and an HTTP client, so reusing the module object is really helpful. Even though JavaScript doesn’t force you to think about memory, that doesn’t mean you should thoughtlessly litter unnecessary objects everywhere. So rather than use an anonymous callback, we’ve named the function that handles the <code class="literal">request</code> event. This is less about memory usage and more about readability. We’re not saying you shouldn’t use anonymous functions, but if you can lay out your code so it’s easy to find, that helps a lot when maintaining it.</p><div class="note" title="Note"><h3 class="title">Note</h3><p><a id="chap6_id35942392"/>Remember to look at <a class="xref" href="pt01.html" title="Part I. Up and Running">Part I</a> of the book for more help with programming style. Chapters <a class="xref" href="ch01.html" title="Chapter 1. A Very Brief Introduction to Node.js">1</a> and <a class="xref" href="ch02.html" title="Chapter 2. Doing Interesting Things">2</a> deal with programming style in particular.</p></div><p><a id="chap6_id35942397"/>Because we didn’t pass the <code class="literal">request</code> event listener as part of the factory method for the <code class="literal">http Server</code> object, we need to add an event listener explicitly. 
Calling the <code class="literal">on</code> method from <code class="literal">EventEmitter</code> does this. Finally, as with the previous example, we call the <code class="literal">listen</code> method with the port we want to listen on. The <code class="literal">http</code> class provides other functions, but this example illustrates the most important ones.</p><p><a id="chap6_id35942423"/>The <code class="literal">http</code> server supports a number of events, which are associated with either the TCP or HTTP connection to the client. The <code class="literal">connection</code> and <code class="literal">close</code> events <a id="I_indexterm1_d1e4565" class="indexterm"/><a id="I_indexterm1_d1e4570" class="indexterm"/><a id="I_indexterm1_d1e4575" class="indexterm"/><a id="I_indexterm1_d1e4578" class="indexterm"/>indicate the buildup or teardown of a TCP connection to a client. It’s important to remember that some clients will be using HTTP 1.1, which supports keepalive. This means that their TCP connections may remain open across multiple HTTP requests.</p><p><a id="chap6_id35942452"/>The <code class="literal">request</code>, <code class="literal">checkContinue</code>, <code class="literal">upgrade</code>, and <code class="literal">clientError</code> events <a id="I_indexterm1_d1e4596" class="indexterm"/><a id="I_indexterm1_d1e4601" class="indexterm"/><a id="I_indexterm1_d1e4607" class="indexterm"/><a id="I_indexterm1_d1e4612" class="indexterm"/><a id="I_indexterm1_d1e4615" class="indexterm"/><a id="I_indexterm1_d1e4618" class="indexterm"/>are associated with HTTP requests. We’ve already used the <code class="literal">request</code> event, which signals a new HTTP request.</p><p><a id="chap6_id35949883"/>The <code class="literal">checkContinue</code> event indicates a special event. It allows you to take more direct control of an HTTP request in which the client streams chunks of data to the server. 
Before sending the body, such a client checks whether it can continue (by sending an <code class="literal">Expect: 100-continue</code> header), at which point this event will fire. If an event handler is created for this event, the <code class="literal">request</code> event will <span class="emphasis"><em>not</em></span> be emitted.</p><p><a id="chap6_id35949899"/>The <code class="literal">upgrade</code> event is emitted when a client asks for a protocol upgrade. The <code class="literal">http</code> server will deny HTTP upgrade requests unless there is an event handler for this event.</p><p><a id="chap6_id35949910"/>Finally, the <code class="literal">clientError</code> event passes on any error events sent by the client.</p><p><a id="chap6_id35960502"/>The HTTP server can emit a few events. The most common one is <code class="literal">request</code>, but you can also get events associated with the <code class="literal">TCP</code> connection for the request as well as other parts of the request life cycle.</p><p><a id="chap6_id35960513"/>When a new TCP stream is created for a request, a <code class="literal">connection</code> event is emitted. This event passes the TCP stream for the request as a parameter. The stream is also available as the <code class="literal">request.connection</code> property for each request that happens through it. However, only one <code class="literal">connection</code> event will be emitted for each stream. This means that many <code class="literal">request</code>s can happen from a client with only one<a id="I_indexterm1_d1e4671" class="indexterm"/> <a id="I_indexterm1_d1e4675" class="indexterm"/><code class="literal">connection</code> event.</p></div><div class="sect2" title="HTTP Clients"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35942512"/>HTTP Clients</h2></div></div></div><p><a id="chap6_id35942517"/>Node is also great when you want to make outgoing HTTP connections. 
This is useful in many contexts, such as using web services, connecting to document store databases, or just scraping websites. You use the same <code class="literal">http</code> module when making HTTP requests, but work with the <code class="literal">http.ClientRequest</code> class. <a id="htc4.2.2" class="indexterm"/><a id="htmh4.2.2" class="indexterm"/><a id="htmc4.2.2" class="indexterm"/><a id="cl4.2.2" class="indexterm"/>There are two factory methods for this class: a general-purpose one and a convenience method. Let’s take a look at the general-purpose case in <a class="xref" href="ch04.html#chap6_id35942537" title="Example 4-9. Creating an HTTP request">Example 4-9</a>.</p><div class="example"><a id="chap6_id35942537"/><p class="title">Example 4-9. Creating an HTTP request</p><div class="example-contents"><a id="chap6_id35942542"/><pre class="programlisting">var http = require('http');

var opts = {
  host: 'www.google.com',
  port: 80,
  path: '/',
  method: 'GET'
};

var req = http.request(opts, function(res) {
  console.log(res);
  res.on('data', function(data) {
    console.log(data);
  });
});

req.end();</pre></div></div><p><a id="chap6_id35942553"/>The first thing you can see is that an <code class="literal">options</code> object defines a lot of the functionality of the request. We must provide the <code class="literal">host</code> name (although an IP address is also acceptable), the <code class="literal">port</code>, and the <code class="literal">path</code>. The <code class="literal">method</code> is optional and defaults to a value of <code class="literal">GET</code> if none is specified. 
In essence, the example is specifying that the request should be an <code class="literal">HTTP GET</code> request to <code class="literal">http://www.google.com/</code> on port <code class="literal">80</code>.</p><p><a id="chap6_id35950047"/>The next thing we do is use the <code class="literal">options</code> object to construct an instance of <code class="literal">http.</code><code class="literal">Client</code><code class="literal">Request</code> using the factory method <code class="literal">http.request()</code>. This <a id="I_indexterm1_d1e4764" class="indexterm"/>method takes an <code class="literal">options</code> object and an optional callback argument. The passed callback listens to the <code class="literal">response</code> event<a id="I_indexterm1_d1e4777" class="indexterm"/>, and when a <code class="literal">response</code> event is received, we can process the results of the request. In the previous example, we simply output the response object to the console. However, it’s important to notice that the body of the HTTP response is actually received via a stream in the <code class="literal">response</code> object. Thus, you can subscribe to the <code class="literal">data</code> event of the <code class="literal">response</code> object to get the data as it becomes available (see the section <a class="xref" href="ch04.html#chap6_id35817238" title="Readable streams">Readable streams</a> for more information).</p><p><a id="chap6_id35950088"/>The final important point to notice is that we had <a id="I_indexterm1_d1e4797" class="indexterm"/>to <code class="literal">end()</code> the <code class="literal">request</code>. Because this was a <code class="literal">GET</code> request, we didn’t write any data to the server, but for other <code class="literal">HTTP</code> methods, such as <code class="literal">PUT</code> or <code class="literal">POST</code>, you may need to. 
Until we call the <code class="literal">end()</code> method, <code class="literal">request</code> won’t initiate the <code class="literal">HTTP</code> request, because it doesn’t know whether it should still be waiting for us to send data.</p><div class="sect3" title="Making HTTP GET requests"><div class="titlepage"><div><div><h3 class="title"><a id="chap6_id35942651"/>Making HTTP GET requests</h3></div></div></div><p><a id="chap6_id35942657"/>Since <code class="literal">GET</code> is such a common HTTP use <a id="ht4.2.2.1" class="indexterm"/><a id="ge4.2.2.1" class="indexterm"/>case, there is a special factory method to support it in a more convenient way, as shown in <a class="xref" href="ch04.html#chap6_id35942667" title="Example 4-10. Simple HTTP GET requests">Example 4-10</a>.</p><div class="example"><a id="chap6_id35942667"/><p class="title">Example 4-10. Simple HTTP GET requests</p><div class="example-contents"><a id="chap6_id35942673"/><pre class="programlisting">var http = require('http');

var opts = {
  host: 'www.google.com',
  port: 80,
  path: '/'
};

var req = http.get(opts, function(res) {
  console.log(res);
  res.on('data', function(data) {
    console.log(data);
  });
});</pre></div></div><p><a id="chap6_id35942684"/>This example of <code class="literal">http.get()</code> does exactly the same thing as the previous example, but it’s slightly more concise. We’ve dropped the <code class="literal">method</code> property of the config object, and left out the call to <code class="literal">request.end()</code> because it’s implied.</p><p><a id="chap6_id35941321"/>If you run the previous two examples, you are going to get back raw <code class="literal">Buffer</code> objects. As described later in this chapter, <a id="I_indexterm1_d1e4872" class="indexterm"/>a <code class="literal">Buffer</code> is a special class defined in Node to support the storage of arbitrary, binary data. 
Although it’s certainly possible to work with these, you often want a specific encoding, such as UTF-8 (an encoding for Unicode characters). You can specify this with the <code class="literal">response.setEncoding()</code> method (see <a class="xref" href="ch04.html#chap6_id35941336" title="Example 4-11. Comparing raw Buffer output to output with a specified encoding">Example 4-11</a>).</p><div class="example"><a id="chap6_id35941336"/><p class="title">Example 4-11. Comparing raw Buffer output to output with a specified encoding</p><div class="example-contents"><a id="chap6_id35941151"/><pre class="programlisting">&gt; var http = require('http');
&gt; var req = http.get({host:'www.google.com', port:80, path:'/'}, function(res) {
... console.log(res);
... res.on('data', function(c) { console.log(c); });
... });
&gt; &lt;Buffer 3c 21 64 6f 63 74 79 70 ... 65 2e 73 74&gt;
&lt;Buffer 61 72 74 54 69 ... 69 70 74 3e&gt;
&gt;
&gt; var req = http.get({host:'www.google.com', port:80, path:'/'}, function(res) {
... res.setEncoding('utf8');
... res.on('data', function(c) { console.log(c); });
... });
&gt; &lt;!doctype html&gt;&lt;html&gt;&lt;head&gt;&lt;meta http-equiv="content-type ... load.t.prt=(f=(new Date).getTime());
})();
&lt;/script&gt;
&gt;</pre></div></div><p><a id="chap6_id35941161"/>In the first case, we do not <a id="I_indexterm1_d1e4893" class="indexterm"/><a id="I_indexterm1_d1e4898" class="indexterm"/>call <code class="literal">ClientResponse.setEncoding()</code>, and we get chunks of data in <code class="literal">Buffer</code>s. Although the output is abridged in the printout, you can see that it isn’t just a single <code class="literal">Buffer</code>, but that several <code class="literal">Buffer</code>s have been returned with data. In the second example, the data is returned as UTF-8 because we specified <code class="literal">res.setEncoding('utf8')</code>. 
The chunks of data returned from the server are still the same, but are given to the program as <code class="literal">string</code>s in the correct encoding rather than as raw <code class="literal">Buffer</code>s. Although the printout may not make this clear, there is one <code class="literal">string</code> for each of the <a id="I_indexterm1_d1e4929" class="indexterm"/><a id="I_indexterm1_d1e4931" class="indexterm"/>original <code class="literal">Buffer</code>s.</p></div><div class="sect3" title="Uploading data for HTTP POST and PUT"><div class="titlepage"><div><div><h3 class="title"><a id="chap6_id35941199"/>Uploading data for HTTP POST and PUT</h3></div></div></div><p><a id="chap6_id35941205"/>Not every HTTP request is a <code class="literal">GET</code>. You might also <a id="I_indexterm1_d1e4946" class="indexterm"/><a id="I_indexterm1_d1e4949" class="indexterm"/>need to use <code class="literal">POST</code>, <code class="literal">PUT</code>, and other <code class="literal">HTTP</code> methods that alter data on the other end. This is functionally the same as making a <code class="literal">GET</code> request, except you are going to write some data <span class="emphasis"><em>upstream</em></span>, as shown in <a class="xref" href="ch04.html#chap6_id35941232" title="Example 4-12. Writing data to an upstream service">Example 4-12</a>.</p><div class="example"><a id="chap6_id35941232"/><p class="title">Example 4-12. Writing data to an upstream service</p><div class="example-contents"><a id="chap6_id35941238"/><pre class="programlisting">var http = require('http');

var options = {
  host: 'www.example.com',
  port: 80,
  path: '/submit',
  method: 'POST'
};

var req = http.request(options, function(res) {
  res.setEncoding('utf8');
  res.on('data', function (chunk) {
    console.log('BODY: ' + chunk);
  });
});

// Write the request body, then signal that we are done sending
req.write("my data");
req.write("more of my data");
req.end();</pre></div></div><p><a id="chap6_id35941248"/>This example is very similar to <a class="xref" href="ch04.html#chap6_id35942667" title="Example 4-10. 
Simple HTTP GET requests">Example 4-10</a>, but uses <a id="I_indexterm1_d1e4981" class="indexterm"/>the <code class="literal">http.ClientRequest</code><code class="literal">.write()</code> method. This method allows you to send data upstream, and as explained earlier, it requires you to explicitly call <code class="literal">http.ClientRequest.end()</code> to indicate <a id="I_indexterm1_d1e4995" class="indexterm"/>you’re finished sending data. Whenever <code class="literal">ClientRequest.write()</code> is called, the data is sent upstream (it isn’t buffered), but the server will not respond until <code class="literal">ClientRequest.end()</code> is called.</p><p><a id="chap6_id35941268"/>You can stream data to a server using <code class="literal">ClientRequest.write()</code> by coupling the writes to the <code class="literal">data</code> event of a <code class="literal">Stream</code>. This is ideal if you need to, for example, send a file from disk to a remote server over <a id="I_indexterm1_d1e5018" class="indexterm"/><a id="I_indexterm1_d1e5020" class="indexterm"/>HTTP.</p></div><div class="sect3" title="The ClientResponse object"><div class="titlepage"><div><div><h3 class="title"><a id="chap6_id35941284"/>The ClientResponse object</h3></div></div></div><p><a id="chap6_id35941290"/>The <code class="literal">ClientResponse</code> object stores <a id="I_indexterm1_d1e5031" class="indexterm"/><a id="I_indexterm1_d1e5036" class="indexterm"/>a variety of information about the response. In general, it is pretty intuitive. Some of its obvious properties that are often useful <a id="I_indexterm1_d1e5042" class="indexterm"/>include <code class="literal">statusCode</code> (which contains the HTTP status code) <a id="I_indexterm1_d1e5051" class="indexterm"/>and <code class="literal">headers</code> (which is the response header object). 
Also attached to <code class="literal">ClientResponse</code> are various streams and properties that you may or may not want to interact with <a id="I_indexterm1_d1e5064" class="indexterm"/><a id="I_indexterm1_d1e5066" class="indexterm"/>directly.</p></div></div><div class="sect2" title="URL"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35940796"/>URL</h2></div></div></div><p><a id="chap6_id35940802"/>The <code class="literal">URL</code> module <a id="I_indexterm1_d1e5077" class="indexterm"/>provides tools for parsing and manipulating URL strings. The module offers three methods: <code class="literal">parse</code>, <code class="literal">format</code>, and <code class="literal">resolve</code>. Let’s <a id="I_indexterm1_d1e5093" class="indexterm"/><a id="I_indexterm1_d1e5098" class="indexterm"/>start by looking at <a class="xref" href="ch04.html#chap6_id35940825" title="Example 4-13. Parsing a URL using the URL module">Example 4-13</a>, which <a id="ur4.2.3" class="indexterm"/>demonstrates <code class="literal">parse</code> using the Node REPL.</p><div class="example"><a id="chap6_id35940825"/><p class="title">Example 4-13. 
Parsing a URL using the URL module</p><div class="example-contents"><a id="chap6_id35940831"/><pre class="programlisting">> var URL = require('url');
> var myUrl = "http://www.nodejs.org/some/url/?with=query&param=that&are=awesome#alsoahash";
> myUrl
'http://www.nodejs.org/some/url/?with=query&param=that&are=awesome#alsoahash'
> parsedUrl = URL.parse(myUrl);
{ href: 'http://www.nodejs.org/some/url/?with=query&param=that&are=awesome#alsoahash'
, protocol: 'http:'
, slashes: true
, host: 'www.nodejs.org'
, hostname: 'www.nodejs.org'
, hash: '#alsoahash'
, search: '?with=query&param=that&are=awesome'
, query: 'with=query&param=that&are=awesome'
, pathname: '/some/url/'
}
> parsedUrl = URL.parse(myUrl, true);
{ href: 'http://www.nodejs.org/some/url/?with=query&param=that&are=awesome#alsoahash'
, protocol: 'http:'
, slashes: true
, host: 'www.nodejs.org'
, hostname: 'www.nodejs.org'
, hash: '#alsoahash'
, search: '?with=query&param=that&are=awesome'
, query:
   { with: 'query'
   , param: 'that'
   , are: 'awesome'
   }
, pathname: '/some/url/'
}
></pre></div></div><p><a id="chap6_id35940842"/>The first thing we do, of course, is require the <code class="literal">URL</code> module. Note that the names of modules are always lowercase. We’ve created <code class="literal">myUrl</code>, a string containing all the parts that will be parsed out. Parsing is really easy: we just call the <code class="literal">parse</code> method from the <code class="literal">URL</code> module on the string. It returns a data structure representing the parts of the parsed URL. 
The components it produces are:</p><div class="itemizedlist"><a id="chap6_id35940858"/><ul class="itemizedlist"><li class="listitem"><p><a id="chap6_id35940860"/><a id="chap6_id35940861"/><code class="literal">href</code></p></li><li class="listitem"><p><a id="chap6_id35940865"/><a id="chap6_id35940866"/><code class="literal">protocol</code></p></li><li class="listitem"><p><a id="chap6_id35940869"/><a id="chap6_id35940870"/><code class="literal">host</code></p></li><li class="listitem"><p><a id="chap6_id35940873"/><a id="chap6_id35940874"/><code class="literal">auth</code></p></li><li class="listitem"><p><a id="chap6_id35940877"/><a id="chap6_id35940878"/><code class="literal">hostname</code></p></li><li class="listitem"><p><a id="chap6_id35940882"/><a id="chap6_id35940883"/><code class="literal">port</code></p></li><li class="listitem"><p><a id="chap6_id35940886"/><a id="chap6_id35940887"/><code class="literal">pathname</code></p></li><li class="listitem"><p><a id="chap6_id35940890"/><a id="chap6_id35940891"/><code class="literal">search</code></p></li><li class="listitem"><p><a id="chap6_id35940894"/><a id="chap6_id35940896"/><code class="literal">query</code></p></li><li class="listitem"><p><a id="chap6_id35940899"/><a id="chap6_id35940900"/><code class="literal">hash</code></p></li></ul></div><p><a id="chap6_id35940904"/>The <code class="literal">href</code> is the full <code class="literal">URL</code> that was originally passed to <code class="literal">parse</code>. The <code class="literal">protocol</code> is the scheme used in the <code class="literal">URL</code>, including the trailing colon (e.g., <code class="literal">http:</code>, <code class="literal">https:</code>, or <code class="literal">ftp:</code>). <code class="literal">host</code> is the fully qualified hostname of the <code class="literal">URL</code>. 
This could be as simple as the hostname for a local server, such as <code class="literal">printserver</code>, or a fully qualified domain name such as <code class="literal">www.google.com</code>. It might also include a port number, such as <code class="literal">8080</code>, or username and password credentials like <code class="literal">un:pw@ftpserver.com</code>. The various parts of the hostname are broken down further into <code class="literal">auth</code>, containing just the user credentials; <code class="literal">port</code>, containing just the port; and <code class="literal">hostname</code>, containing the hostname portion of the <code class="literal">URL</code>. An important thing to know about <code class="literal">hostname</code> is that it is still the full hostname, including the top-level domain (TLD; e.g., <code class="literal">.com</code> or <code class="literal">.net</code>) and the specific server. If the <code class="literal">URL</code> were <code class="literal">http://sports.yahoo.com/nhl</code>, <code class="literal">hostname</code> would not give you just the domain (<code class="literal">yahoo.com</code>) or just the host (<code class="literal">sports</code>), but the entire hostname (<code class="literal">sports.yahoo.com</code>). The <code class="literal">URL</code> module doesn’t have the capability to split the hostname into its components, such as domain or TLD.</p><p><a id="chap6_id35941009"/>The next set of components of the URL relates to everything after the <code class="literal">host</code>. The <code class="literal">pathname</code> is the entire path after the <code class="literal">host</code>. In <code class="literal">http://sports.yahoo.com/nhl</code>, it is <code class="literal">/nhl</code>. The next component is the <code class="literal">search</code> component, which stores the <code class="literal">HTTP GET</code> parameters in the URL. 
For example, if the URL were <code class="literal">http://mydomain.com/?foo=bar&baz=qux</code>, the <code class="literal">search</code> component would be <code class="literal">?foo=bar&baz=qux</code>. Note the inclusion of the <code class="literal">?</code>. The <code class="literal">query</code> component is similar to the <code class="literal">search</code> component. It contains one of two things, depending on how <code class="literal">parse</code> was called.</p><p><a id="chap6_id35949031"/><code class="literal">parse</code> takes two arguments: the <code class="literal">url</code> string and an optional Boolean that determines whether the query string should be parsed using the <code class="literal">querystring</code> module, discussed in the next section. If the second argument is false, <code class="literal">query</code> will just contain a string similar to that of <code class="literal">search</code> but without the leading <code class="literal">?</code>. If it is true, <code class="literal">query</code> will instead be an object with one property per parameter, as in the second call in Example 4-13. If you don’t pass anything for the second argument, it defaults to <code class="literal">false</code>.</p><p><a id="chap6_id35949062"/>The final component, <code class="literal">hash</code>, is the <code class="literal">fragment</code> portion of the URL. This is the part of the URL after the <code class="literal">#</code>. Commonly, this is used to refer to named anchors in <code class="literal">HTML</code> pages. For instance, <code class="literal">http://abook.com/#chapter2</code> might refer to the second chapter on a web page hosting a whole book. The <code class="literal">hash</code> component in this case would contain <code class="literal">#chapter2</code>. Again, note the included <code class="literal">#</code> in the string. Some sites, such as <code class="literal">http://twitter.com</code>, use more complex fragments for AJAX applications, but the same rules apply. 
So the URL for the Twitter <span class="emphasis"><em>mentions</em></span> page, <code class="literal">http://twitter.com/#!/mentions</code>, would have a <code class="literal">pathname</code> of <code class="literal">/</code> but a <a id="I_indexterm1_d1e5376" class="indexterm"/>hash of <code class="literal">#!/mentions</code>.</p></div><div class="sect2" title="querystring"><div class="titlepage"><div><div><h2 class="title"><a id="chap6_id35816873"/>querystring</h2></div></div></div><p><a id="chap6_id35816878"/>The <code class="literal">querystring</code> module <a id="qu4.2.4" class="indexterm"/>is a very simple helper module for dealing with query strings. As discussed in the previous section, query strings are the parameters encoded at the end of a URL. However, when reported back as just a JavaScript string, the parameters are fiddly to deal with. The <code class="literal">querystring</code> module provides an easy way to create objects from the query strings. The main methods it <a id="qup4.2.4" class="indexterm"/><a id="I_indexterm1_d1e5404" class="indexterm"/>offers are <code class="literal">parse</code> and <code class="literal">decode</code>, but some internal helper functions, <a id="I_indexterm1_d1e5417" class="indexterm"/><a id="I_indexterm1_d1e5422" class="indexterm"/><a id="I_indexterm1_d1e5427" class="indexterm"/><a id="I_indexterm1_d1e5433" class="indexterm"/><a id="I_indexterm1_d1e5438" class="indexterm"/>such as <code class="literal">escape</code>, <code class="literal">unescape</code>, <code class="literal">unescapeBuffer</code>, <code class="literal">encode</code>, and <code class="literal">stringify</code>, are al