Introduction
In this article I’m going to demonstrate how to use the Twitter Streaming API to send a stream of status updates to a web browser for real time display using Web Sockets. We’ll implement a backend server module that streams the events over a web socket using the node.websocket.js framework, which provides a simple realtime HTTP server that implements the web socket protocol. Node.websocket.js is built on top of Node.js, which provides an asynchronous API based on callbacks. Node.js is a server side JavaScript environment built around an event driven style of programming, which makes it easy to write non-blocking code. This in turn allows you to write simple servers that use CPU and memory efficiently, because you don’t have multiple threads competing for shared resources. Node.js runs on Google’s V8 JavaScript engine and implements parts of the CommonJS specification, an effort to define a standard library for server side JavaScript.
Getting Started
First things first: you need Node.js, which you can get from GitHub. Install it in the usual way
$ git clone git://github.com/ry/node.git
$ cd node
$ ./configure
$ make
$ sudo make install
Then we’ll need node.websocket.js, which you can also get from GitHub
$ git clone git://github.com/Guille/node.websocket.js.git
This is basically an experimental implementation of the web socket API. You create simple server side modules, and a client side implementation then opens a web socket to the server, over which the two can exchange data in both directions. There are a few examples included that you can take a look at, including a simple echo server and a chat server. Just fire up the server like so
$ node runserver.js
and then open up one of the html files in the test/websocket/ directory in your browser. One catch though: you’ll need a browser that supports web sockets, such as a current version of Google Chrome. I’d recommend it anyway, as browsing the web with it does run a bit smoother.
You’ll also need curl, which is pretty common on any Linux box these days and can be installed through your package manager.
Setting Up the Server
The way we’re going to implement the server is to use the curl command to pull from the Twitter stream into a file. Twitter gives you back a stream of JSON objects, one per line, which you can then parse and display. We’ll use this file in a moment to send the data over the web socket to the browser. There is some documentation on what you can do with the API, but we’ll keep it simple and track any tweets with ‘nyc’ in them
$ curl -dtrack=nyc http://stream.twitter.com/1/statuses/filter.json -uUSERNAME:PASSWORD > sample.json
Now it’s time to write some server side JavaScript. In the node.websocket.js checkout there is a modules/ directory. We’ll create a new module called gardenhose.js, so the full path is node.websocket.js/modules/gardenhose.js
In a nutshell, this waits for the client to establish a connection and then creates a child process that tails the file written by the curl command above. Any time the file is written to, the “output” listener is invoked, which runs our callback to parse the JSON into objects. We then use those objects to send a string back to the client with some readable information from the stream.
Let’s break it down just a bit more in case you are not familiar with Node. First we are requiring the system and filesystem modules from Node. Now, node.websocket.js basically just uses the Node API to implement a server in the websocket.js file. It looks at the request header to see which module to instantiate and then invokes your onData() method when a client sends over data.
Therefore the onData() method is the one we need to implement in our module above. We’ll use the process object in Node to create a child process that emits an event called “output” each time the child sends data to stdout. So the addListener call sets up a callback that will be invoked when our file receives more data from the Twitter stream. That data comes in the form of JSON objects, one per line. So we split the output on newlines to get an array of JSON strings and loop through them. Each time through the loop we parse one and send the result back to the client, which is the web browser.
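The split-and-parse loop described above can be sketched roughly as follows. This is a minimal sketch and not the original module: it assumes a current Node.js, so it uses the child_process module and the stdout “data” event instead of the 2010-era process API and “output” event, and the names formatTweets, monitorFile and send are made up for illustration. It also assumes each tweet carries user.screen_name and text fields.

```javascript
const { spawn } = require('child_process');

// Turn one chunk of tail output into display strings, one per parseable tweet.
// (formatTweets is a hypothetical helper name, not part of node.websocket.js.)
function formatTweets(chunk) {
  const messages = [];
  for (const line of String(chunk).split('\n')) {
    if (line.trim() === '') continue; // ignore blank lines
    try {
      const tweet = JSON.parse(line);
      if (tweet && tweet.user && tweet.text) {
        messages.push(tweet.user.screen_name + ': ' + tweet.text);
      }
    } catch (err) {
      // Most likely a partial line at the end of the chunk; skip it
      // to keep the sketch simple.
    }
  }
  return messages;
}

// Tail the file that curl is writing to and pass each formatted tweet to
// `send`, a hypothetical callback that would write to the web socket.
function monitorFile(filename, send) {
  const tail = spawn('tail', ['-f', filename]);
  tail.stdout.on('data', (chunk) => {
    formatTweets(chunk).forEach(send);
  });
  return tail;
}
```

Skipping lines that fail to parse keeps the code short, but it does mean a tweet that is split across two chunks is dropped rather than reassembled.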
Just make sure the path you pass to the monitor_file() function is the file that the curl command is writing to. Then to run the node.websocket.js server you can just invoke it like so
$ node runserver.js
However, if you are connecting to the server from another host, you may want to listen on more than just the default of localhost
$ node runserver.js --host=0.0.0.0
Setting Up the Client
The goal here is to get the realtime tweets pumping through our web browser, so our client will just be a web page with a little bit of web socket JavaScript.
This is probably a bit more straightforward. We’re just implementing a few of the functions from the WebSocket interface. First we’re instantiating the WebSocket class with the hostname and port that we’re running the server on. The gardenhose in the path tells our server that we want to run the gardenhose.js module that we wrote above. The onopen() function is invoked when the socket is opened, and we send over the word “start”. If you recall from above, that tells the server the client has connected and that it should start the child process that runs tail on our file. The onmessage() function is invoked any time the server sends data over the web socket, which is the information we want to show on the page, so we append it to the HTML of our hose div. If the server closes the socket then onclose() is invoked, and we display that on the page.
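Here is a rough sketch of the three handlers just described. It is an illustration only: the function name connectGardenhose, the append callback and the “socket closed” text are made up, and the host and port in the usage comment are placeholders for wherever your server is listening. The WebSocket constructor is passed in as an argument so the wiring can be exercised outside a browser.

```javascript
// Wire a web socket to the page.
// `WebSocketImpl` is the browser's WebSocket constructor and `append` is a
// hypothetical callback that adds one line of text to the page.
function connectGardenhose(url, WebSocketImpl, append) {
  const ws = new WebSocketImpl(url);
  // Tell the server we are connected so it starts tailing the file.
  ws.onopen = () => ws.send('start');
  // Each message from the server is one readable tweet string.
  ws.onmessage = (event) => append(event.data);
  // Let the user know when the server closes the socket.
  ws.onclose = () => append('socket closed');
  return ws;
}

// In a browser (host and port are placeholders):
// const hose = document.getElementById('hose');
// connectGardenhose('ws://localhost:8080/gardenhose', WebSocket, (text) => {
//   const p = document.createElement('p');
//   p.textContent = text;
//   hose.appendChild(p);
// });
```

The usage comment sets textContent rather than appending raw HTML, since tweet text comes from strangers and could contain markup.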
Conclusion
We’ve written a server side module using Node.js and the node.websocket.js framework that will send tweets over a web socket connection. We have also looked at the WebSocket API and learned how to implement some of the functions defined by its interface. Of course, you could use jQuery or your favorite JavaScript libraries to enhance how this looks to the user, but the basics of how a real time display can communicate over web sockets are all here.
References
Async I/O – http://en.wikipedia.org/wiki/Asynchronous_I/O
CommonJS – http://www.commonjs.org/ and http://wiki.commonjs.org/wiki/CommonJS
V8 – http://code.google.com/p/v8
Node.js – http://nodejs.org
node.websocket.js – http://github.com/guille/node.websocket.js
Twitter Streaming API (Gardenhose) – http://apiwiki.twitter.com/Streaming-API-Documentation
Websocket API – http://dev.w3.org/html5/websockets


March 16th, 2010 at 10:15 AM
Any difference between doing something like:
http://gist.github.com/334069
tail -f /file/to/watch | node process_stdio.js
vs creating a child process and monitoring its output as you’ve done?
Also, have you tested your method when watching rapidly changing files? I found two issues when consuming rapidly changing streams (at least when using something like process_stdio.js): First, nodejs will “drop” whole lines from a tail’d stream, I believe it has something to do with the same callback being executed while another copy of that callback is still executing. Second, partial results (results before EOL) may be delivered. I solved both these issues by making the “output” callback as fast as possible and by detecting EOL and storing partial chunks for next callback.
March 16th, 2010 at 10:33 AM
Jake, that method looks fine. The main reason I implemented it the way it’s done above is that the example is part of the node.websocket.js framework, which doesn’t generally lend itself to piping in a file like that.
For the second question, it’s definitely possible the callback is invoked with a partial line (which is most of the reason for the try/catch) and the way you are appending data looks like the right way to handle that. Again this was only a proof of concept so I tried to keep the code simple.
March 17th, 2010 at 1:11 PM
Really great post. I traced your steps and found a few gotchas along the way. Node.js has moved to v0.1.31 and it’s not compatible with websocket.js so I had to check out v0.1.29.
I changed the routine that is tailing the file with twitter events to simply send back the entire JSON object. This means the javascript in the browser gets a JSON object which is super simple to deal with, and provides all the data in the tweet in addition to just the screen name and tweet. It’s really powerful to send JSON straight back to the browser.
In changing to the above I found that curl tends to write incomplete buffers to the file, so the last tweet is often incomplete. It usually didn’t matter with your version because the full tweet text usually came out and you parsed it away, but when sending the whole thing you have to account for an incomplete read on the last entry, and then patch that entry up with the first read in the next data push from the tail process.
Otherwise, this is a really well done blog and a great demo of the outstanding potential of websockets. Thanks!
Btw, I’d be willing to contribute back any changes I made.
March 17th, 2010 at 1:19 PM
TGautier
The examples were written with the GitHub versions of both node and node.websocket.js. They worked for me, so that’s odd that you had to roll back node.
As for partial tweets, yes, see the comment above. I probably should have shown how to handle that but tried to keep it simple. And yes, parsing the JSON on either the server or the client would work fine; it just depends how much data you want to send over the socket.
March 18th, 2010 at 12:47 AM
I think node was updated after you published your blog but before I tried it – at least that’s what it looked like to me. I posted a bug to the github issues for node.websocket.js.