However, despite its amazing performance, node.js is also known for not being a strong competitor when it comes to serving static files ... is there anything we can do?
Avoiding Multiple Reads
Every time we use fs.readFile we schedule a read operation. This is non-blocking and async, but what happens if a second user asks for the same file while the read scheduled for the first one is still in progress?
// simulating disk I/O
disk.read.push("file.txt");
  // .. while reading
disk.read.push("file.txt");
<< disk.read.shift();
  // .. while reading
<< disk.read.shift();
Wouldn't it be better to have this kind of flow instead?
// simulating disk I/O
disk.read["file.txt"].readIfNotAlready();
  // .. while reading
disk.read["file.txt"].readIfNotAlready();
<< disk.read.shift();
We basically satisfied two read requests for the same file at once.
Handled fs.readFile
During the refactoring of a personal project of mine, I've benchmarked the following piece of code:
//!@author Andrea Giammarchi
// @module read-file.js
// Concurrent I/O Handling
var
  // still needed, of course
  fs = require('fs'),
  // single shared module cache
  cache = Object.create(null)
;
// same fs.readFile API
this.readFile = function read(name, then) {
  // already requested, let I/O complete
  if (name in cache) {
    // and notify this callback after anyway
    cache[name].push(then);
  } else {
    // first time somebody asked for this
    // create the collection of callbacks
    cache[name] = [then];
    // perform the I/O operation 
    fs.readFile(name, function readFile(err, data) {
      var
        // grab all callbacks waiting for a result
        list = cache[name],
        // loop over such list
        i = 0,
        length = list.length
      ;
      // after erasing it
      delete cache[name];
      while (i < length) {
        // ditch a boring/slow try/catch
        // and notify all callbacks ^_^
        list[i++](err, data);
      }
    });
  }
};
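To see the effect without touching the disk, here is a minimal sketch of the same idea applied to a fake reader. It is not part of the module: `coalesce`, `fakeRead` and `flush` are hypothetical names made up for this illustration, and the fake reader simply holds its callbacks until they are flushed by hand.

```javascript
// Hypothetical helper: wraps any (name, callback) reader so that
// concurrent requests for the same name share one underlying call.
function coalesce(reader) {
  var pending = Object.create(null);
  return function read(name, then) {
    if (name in pending) {
      // a read is already in flight, just wait for it
      pending[name].push(then);
      return;
    }
    pending[name] = [then];
    reader(name, function (err, data) {
      var list = pending[name];
      // forget the list before notifying anyone
      delete pending[name];
      for (var i = 0; i < list.length; i++) {
        list[i](err, data);
      }
    });
  };
}

// Fake disk, for illustration only: counts how many reads were
// scheduled and keeps them waiting until flush() is called.
var reads = 0;
var waiting = [];
function fakeRead(name, callback) {
  reads++;
  waiting.push(function () {
    callback(null, 'content of ' + name);
  });
}
function flush() {
  while (waiting.length) waiting.shift()();
}

var readFile = coalesce(fakeRead);
var results = [];
function collect(err, data) {
  results.push(data);
}

readFile('file.txt', collect); // schedules the only read
readFile('file.txt', collect); // joins the read in flight
flush();

console.log(reads, results.length); // prints "1 2"
```

Both callbacks are served by a single read. Since the list is deleted before the callbacks run, the next request after completion schedules a fresh read, so nothing stale is kept around. The trade-off of skipping try/catch is that a callback that throws prevents the ones queued after it from being notified.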
The code used to benchmark is basically this one:
// bench.js
for (var
  // require another file or 'fs' by default
  fs = require(process.argv[2] || 'fs'),
  i = 0,
  // start time
  t = Date.now();
  // 25000 simultaneous requests
  i < 25000; i++
) {
  fs.readFile(__filename, function () {
    // end time if all requests have been satisfied
    // will be async anyway ^_^
    if (!--i) console.log(Date.now() - t);
  });
}
Here are the results on my MacBook Pro (lower is better):
- node bench.js fs: 505ms
- node bench.js ./read-file.js: 19ms
About 25X Faster On Concurrent Requests!
Which I believe is a good achievement when it comes to serving static files: these files usually don't change often, so there's actually no need at all to read them more than once per group of concurrent users ...
The best part of it? Creating an array costs almost nothing for V8 compared to I/O operations, so even if the file is requested only once at a time and no concurrent connections are in place, the overall performance will be the same, at least on my Mac: around 9ms per request.
enjoy
 
2 comments:
this actually looks great.
did you put this on production somewhere?
I've written a module that simplifies this concept for any sort of similar need: Boosting I/O Holding Requests