Mar 15 2010

Node.js, Websockets, and the Twitter Gardenhose

Category: Real-time Web | jgoulah @ 3:08 PM

Introduction

In this article I’m going to demonstrate how to use the Twitter Streaming API to send a stream of status updates to a web browser for real-time display using Web Sockets. We’ll implement a backend server module that streams the events over a web socket using the node.websocket.js framework, which provides a simple real-time HTTP server that implements the web socket protocol. Node.websocket.js is built on top of Node.js, which provides an asynchronous API based on callbacks. Node.js is a server-side JavaScript environment that encourages an event-driven style of programming, which makes it easy to write non-blocking code. This in turn lets you write simple servers that are CPU- and memory-efficient, because you don’t have a thread per connection taking up memory and other shared resources. Node.js is built on Google’s V8 JavaScript engine and follows parts of the CommonJS specification, which aims to define a standard library for server-side JavaScript.
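To make the event-driven style a little more concrete, here is a minimal sketch using the events module that ships with current Node.js releases. The `source` object and the `tweet` event name are made up for illustration; in a real server the events would be driven by I/O rather than emitted by hand.

```javascript
const { EventEmitter } = require('events');

// Hypothetical event source; stands in for anything that produces data over time.
const source = new EventEmitter();
const received = [];

// Register a callback instead of blocking while waiting for data.
// addListener is an alias of on() in current Node.js.
source.addListener('tweet', (text) => {
  received.push(text);
});

// Each emit invokes the registered listeners with the given arguments.
source.emit('tweet', 'hello from nyc');
source.emit('tweet', 'another update');

console.log(received);
```

The point is only that the program registers what should happen when data arrives and then carries on, rather than sitting in a loop waiting for it.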

Getting Started

First things first: you need Node.js, which you can get from GitHub. Install it in the usual way:

$ git clone git://github.com/ry/node.git
$ cd node
$ ./configure
$ make
$ sudo make install

Then we’ll need node.websocket.js, which you can also get from GitHub:

$ git clone git://github.com/Guille/node.websocket.js.git

This is basically an experimental implementation of the web socket API. You create simple server-side modules, plus a client-side page that opens a web socket to the server, and the two can then exchange data in both directions. A few examples are included that you can take a look at, among them a simple echo server and a chat server. Just fire up the server like so:

$ node runserver.js

and then open one of the HTML files in the test/websocket/ directory in your browser. One catch, though: you’ll need a current browser with web socket support, such as Google Chrome. I’d recommend it anyway, as browsing the web with it does run a bit smoother.

You’ll also need curl, which is already present on most Linux boxes these days and can otherwise be installed through your package manager.

Setting Up the Server

The way we’re going to implement the server is to use the curl command to pull from the Twitter stream into a file. Twitter sends back a series of JSON objects, which you can then parse and display. We’ll use this file in a moment to send the data over the web socket to the browser. There is some documentation on what you can do with the API, but we’ll keep it simple and track any tweets with ‘nyc’ in them:

$ curl -dtrack=nyc http://stream.twitter.com/1/statuses/filter.json  -uUSERNAME:PASSWORD > sample.json

Now it’s time to write some server-side JavaScript. In the node.websocket.js checkout there is a modules/ directory. We’ll create a new module there called gardenhose.js, so the full path is node.websocket.js/modules/gardenhose.js.

In a nutshell, the module waits for the client to establish a connection and then creates a child process that tails the file written by the curl command above. Any time the file is written to, the “output” listener is invoked, which runs our callback to parse the JSON into objects. From those we build a string of readable information from the stream and send it back to the client.
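The tailing step can be sketched with today’s Node.js, which no longer has the 2010-era “output” event on child processes. The version below is an approximation under that assumption, using child_process.spawn and the stdout “data” event instead. The names `monitorFile` and `onChunk` are illustrative only.

```javascript
const { spawn } = require('child_process');

// Start `tail -f` on the given file and pass each chunk of new output to onChunk.
// `monitorFile` and `onChunk` are illustrative names, not part of any framework.
function monitorFile(path, onChunk) {
  const tail = spawn('tail', ['-f', path]);
  tail.stdout.setEncoding('utf8');
  // 'data' fires whenever tail writes to stdout, i.e. when the file grows.
  tail.stdout.on('data', onChunk);
  tail.on('error', (err) => console.error('could not start tail:', err.message));
  return tail; // the caller can call tail.kill() when the client disconnects
}

// Example usage (assumes sample.json is being written by the curl command above):
// const child = monitorFile('sample.json', (chunk) => process.stdout.write(chunk));
```

Returning the child process lets the caller stop the tail once the socket closes, so abandoned connections don’t leave stray processes behind.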

Let’s break it down a bit more in case you are not familiar with Node. First, the module requires the system and filesystem modules from Node. node.websocket.js itself just uses the Node API to implement a server in the websocket.js file. It looks at the request header to see which module to instantiate and then invokes that module’s onData() method when a client sends over data.
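The general shape of that dispatch might look roughly like the sketch below. This is not node.websocket.js’s actual code: the `modules` registry, the `moduleForPath` helper, and the `send` callback passed to onData() are all hypothetical and exist only to illustrate the idea of choosing a module from the request and handing it the client’s data.

```javascript
// Hypothetical registry of modules, keyed by name; each exposes onData(data, send).
const modules = {
  echo: {
    onData(data, send) {
      send(data); // echo the client's message straight back
    },
  },
  gardenhose: {
    onData(data, send) {
      if (data === 'start') send('streaming started');
    },
  },
};

// Pick a module from the request path, e.g. "/gardenhose" -> modules.gardenhose.
function moduleForPath(path) {
  const name = path.replace(/^\/+/, '').split('/')[0];
  return Object.prototype.hasOwnProperty.call(modules, name) ? modules[name] : null;
}

// Simulate a client connecting to /gardenhose and sending "start".
const sent = [];
const handler = moduleForPath('/gardenhose');
handler.onData('start', (msg) => sent.push(msg));
console.log(sent);
```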

Therefore onData() is the method we need to implement in our module. We use the process object in Node to create a child process, which emits an event called “output” each time the child writes data to stdout. The addListener call sets up a callback that is invoked whenever our file receives more data from the Twitter stream. That data comes in the form of JSON objects, one per line, so we split on newlines to get an array of JSON strings and loop through them. Each time through the loop we send the resulting data back to the client, which is the web browser.
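A minimal sketch of the split-and-parse loop follows. It assumes each status object carries `text` and `user.screen_name` fields, and it adds one thing the description above does not mention: the fragment after the last newline is held back and prepended to the next chunk, so a tweet that arrives in two pieces is not lost. The function names are illustrative.

```javascript
// Turn one line of the stream into a short display string, or null if the
// line is not a complete tweet (blank line, partial write, non-status message).
function formatTweet(line) {
  try {
    const tweet = JSON.parse(line);
    if (!tweet.user || typeof tweet.text !== 'string') return null;
    return tweet.user.screen_name + ': ' + tweet.text;
  } catch (err) {
    return null; // malformed or truncated JSON
  }
}

// Returns a function that accepts raw chunks and calls send() once per tweet.
// The trailing fragment after the last newline is kept for the next chunk.
function createChunkHandler(send) {
  let pending = '';
  return function onChunk(chunk) {
    const lines = (pending + chunk).split('\n');
    pending = lines.pop(); // incomplete last line, or '' if the chunk ended on \n
    for (const line of lines) {
      const message = formatTweet(line.trim());
      if (message !== null) send(message);
    }
  };
}

// Example: a tweet split across two chunks is still delivered exactly once.
const out = [];
const onChunk = createChunkHandler((msg) => out.push(msg));
onChunk('{"text":"hi from nyc","user":{"screen_name":"al');
onChunk('ice"}}\n{"text":"second","user":{"screen_name":"bob"}}\n');
console.log(out);
```

The try/catch alone is enough to keep a truncated line from crashing the server; the buffering is what keeps that line from being silently dropped.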

Just make sure the path you pass to the monitor_file() function is the path of the file that the curl command is writing to. Then, to run the node.websocket.js server, invoke it like so:

$ node runserver.js

However, if the server runs on a different host than your browser, you may want it to listen on more than just the default of localhost:

$ node runserver.js --host=0.0.0.0

Setting Up the Client

The goal here is to get the real-time tweets pumping through our web browser, so our client will just be a web page with a little bit of web socket JavaScript.

The client side is probably a bit more straightforward. We’re just implementing a few of the handlers from the WebSocket interface. First we instantiate the WebSocket class with the hostname and port that the server is running on. The gardenhose in the path tells the server that we want to run the gardenhose.js module that we wrote above. The onopen() function is invoked when the socket is opened, and in it we send over the word “start”, which, if you recall from above, tells the server that the client has connected and that it should start the child process that runs tail on our file. The onmessage() function is invoked any time the server sends data over the web socket. This is the information we want to show on the page, so we append it to the HTML of our hose div. If the server closes the socket, onclose() is invoked, and we display that on the page.
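The three handlers can be sketched as follows. This is an illustration rather than the page’s original script: `connectHose`, `WebSocketImpl`, and `show` are hypothetical names, and the host and port in the usage comment are placeholders. The constructor is passed in as an argument so the same function works with the browser’s WebSocket or with a stand-in object when testing.

```javascript
// Wire up the three handlers described above. `WebSocketImpl` is injected so the
// same function works with the browser's WebSocket or a stand-in; `show` receives
// each line of text to display (for example, by appending it to the hose div).
function connectHose(WebSocketImpl, url, show) {
  const socket = new WebSocketImpl(url);
  socket.onopen = () => socket.send('start');      // ask the server to begin
  socket.onmessage = (event) => show(event.data);  // one message per tweet
  socket.onclose = () => show('connection closed');
  return socket;
}

// In a browser this might be called as (host and port are placeholders):
// connectHose(WebSocket, 'ws://localhost:8080/gardenhose', (text) => {
//   const p = document.createElement('p');
//   p.textContent = text; // textContent avoids treating tweet text as markup
//   document.getElementById('hose').appendChild(p);
// });
```

Appending via textContent instead of concatenating onto innerHTML is a small deviation from the description above, chosen so that tweet text containing angle brackets is displayed literally.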

Conclusion

We’ve written a server-side module using Node.js and the node.websocket.js framework that sends tweets over a web socket connection. We have also looked at the WebSocket API and learned how to implement some of the handlers defined by its interface. Of course, you could use jQuery or your favorite JavaScript libraries to enhance how this looks to the user, but the basics of how a real-time display can communicate over web sockets are all here.

References

Async I/O – http://en.wikipedia.org/wiki/Asynchronous_I/O

CommonJS – http://www.commonjs.org/ and http://wiki.commonjs.org/wiki/CommonJS

V8 – http://code.google.com/p/v8

Node.JS – http://nodejs.org

Node.Websocket.JS – http://github.com/guille/node.websocket.js

Twitter Streaming API (Gardenhose) – http://apiwiki.twitter.com/Streaming-API-Documentation

Websocket API – http://dev.w3.org/html5/websockets


10 Responses to “Node.js, Websockets, and the Twitter Gardenhose”

  1. uberVU - social comments says:

    Social comments and analytics for this post…

    This post was mentioned on Reddit by nazbot: So basically it’s a webserver that posts twitter content to a webpage through a socket? That the gist of it?…

  2. jakemcgraw says:

    Any difference between doing something like:

    http://gist.github.com/334069
    tail -f /file/to/watch | node process_stdio.js

    vs creating a child process and monitoring its output as you’ve done?

    Also, have you tested your method when watching rapidly changing files? I found two issues when consuming rapidly changing streams (at least when using something like process_stdio.js): First, nodejs will “drop” whole lines from a tail’d stream, I believe it has something to do with the same callback being executed while another copy of that callback is still executing. Second, partial results (results before EOL) may be delivered. I solved both these issues by making the “output” callback as fast as possible and by detecting EOL and storing partial chunks for next callback.

  3. jgoulah says:

    Jake, that method looks fine. The main reason I implemented it the way it’s done above is that the example is part of the node.websocket.js framework, which doesn’t generally lend itself to piping in a file like that.

    For the second question, it’s definitely possible the callback is invoked with a partial line (which is most of the reason for the try/catch), and the way you are appending data looks like the right way to handle that. Again, this was only a proof of concept, so I tried to keep the code simple.

  4. InVisible Blog » links for 2010-03-16 says:

    [...] 16th, 2010 How I develop Clojure with Vim : :wq – blog (tags: clojure editor vim repl) John Goulah » Node.js, Websockets, and the Twitter Gardenhose (tags: javascript node.js programming websocket nodejs html5 [...]

  5. » links for 2010-03-16 (Dhananjay Nene) says:

    [...] John Goulah » Node.js, Websockets, and the Twitter Gardenhose Node.js, Websockets, and the Twitter Gardenhose http://ff.im/-hAtLK (tags: via:packrati.us) [...]

  6. tgautier says:

    Really great post. I traced your steps and found a few gotchas along the way. Node.js has moved to v0.1.31 and it’s not compatible with websocket.js so I had to check out v0.1.29.

    I changed the routine that is tailing the file with twitter events to simply send back the entire JSON object. This means the javascript in the browser gets a JSON object which is super simple to deal with, and provides all the data in the tweet in addition to just the screen name and tweet. It’s really powerful to send JSON straight back to the browser.

    In changing to the above I found that curl tends to write incomplete buffers to the file, so the last tweet is often incomplete. It usually didn’t matter with your version because the full tweet text usually came out and you parsed it away, but when sending the whole thing you have to account for an incomplete read on the last entry, and then patch that entry up with the first read in the next data push from the tail process.

    Otherwise, this is a really well done blog and a great demo of the outstanding potential of websockets. Thanks!

    Btw, I’d be willing to contribute back any changes I made.

  7. jgoulah says:

    TGautier
    The examples were written with the GitHub versions of both node and node.websocket.js. They worked for me, so that’s odd that you had to roll back node.

    As for partial tweets, yes, see the comment above. I probably should have shown how to handle that but tried to keep it simple. And yes, parsing the JSON on either the server or the client would work fine; it just depends how much data you want to send over the socket.

  8. tgautier says:

    I think node was updated after you published your blog but before I tried it – at least that’s what it looked like to me. I posted a bug to the github issues for node.websocket.js.
