Database Disruption From The Couch

Web development typically follows a specific set of steps. There are different workflows for different people, and I’m not trying to establish a best-practices workflow here (just the opposite, as you’ll see), but the differences are usually in the order in which the steps are performed, not the actual steps themselves.

The first step (again, conceptually the first, not necessarily chronologically) is to design a database for the application’s needs – the database tier. That means breaking down the various components of your application into tables. For example, if you are building a blog engine, you would need at least a table for articles and another for comments.

The second step is to write a “middleware” tier – that just means software that sits in between the client and the database. You don’t want people to connect to the database directly or they’d be able to wreak havoc. The middleware tends to be where you do things like keeping track of who’s allowed on what page.

The last step would be to design a “view” or “presentation” tier to let your clients interact with your database in a (hopefully) safe manner. The view tier, among other things, takes the data that is broken up into tables and puts it back together again.

Ruby on Rails, which I’ve loved for years, excels at connecting these dots. ActiveRecord, which I consider the crown jewel of Rails, does a fantastic job of simplifying the complex work of creating and managing the database, for example. Rails also handles the middleware tier quite well with ActionController. The view tier (ActionView) doesn’t add all that much, but server-side view tiers rarely do when it comes to the web.

This is all going to be obsolete with CouchDB*.

What makes CouchDB so damn disruptive is not the fact that it beats frameworks like Rails – it’s that it completely side-steps them.  CouchDB dispenses with structured tables altogether, relying on “documents” instead. Documents are arbitrarily complex data structures that contain as much or as little data as you need. The (partially completed) free book on the official site uses the example of the business card.  Some business cards have fax numbers, and some don’t.  Some have website addresses, and some don’t.  In a classic, structured database, you would need to have a column in the table for fax numbers, even if only one business card in your database has one, because the structure of the database has to be in place before the data is added.  If the structure doesn’t accommodate the data, then the structure has to be altered, which is always a potentially dangerous thing (Rails handles this with migrations).  CouchDB lets you put whatever data you want in whatever document (in this case, business card) you want.  That means that instead of adding a fax number column that every business card in the table then has to carry, you simply add a fax number to the new document and only the new document.
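
To make the business card example concrete, here is a minimal sketch in Python. CouchDB stores documents as JSON, so plain dictionaries are a reasonable stand-in. The field names (`name`, `phone`, `fax`, `website`) and the sample values are invented for illustration; they are not taken from the CouchDB book.

```python
import json

# Two hypothetical business-card documents. Field names and values are
# invented for illustration; CouchDB does not prescribe any of them.
card_without_fax = {
    "_id": "card-alice",
    "name": "Alice Example",
    "phone": "555-0100",
}

card_with_fax = {
    "_id": "card-bob",
    "name": "Bob Example",
    "phone": "555-0101",
    "fax": "555-0102",  # only this document carries a fax number
    "website": "http://example.com",
}

cards = [card_without_fax, card_with_fax]

# Each document serializes to JSON with only the fields it actually has.
for card in cards:
    print(json.dumps(card, sort_keys=True))

# The flip side: code that reads the documents must tolerate missing fields.
fax_numbers = [card["fax"] for card in cards if "fax" in card]
```

Nothing had to be altered for the first card when the second one gained a `fax` field, which is the whole point. The cost is that the check for a missing field moves out of the schema and into whatever code reads the documents.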

ActiveRecord is amazing at simplifying all the structured SQL stuff.  But once that complexity starts to go away, the value added by ActiveRecord starts dropping off dramatically.  If the database tier is already super-simple, why bother with a fairly hefty helper library on top of that?  CouchDB itself makes ActiveRecord (and other fancy ORMs, like Hibernate) largely obsolete (and even quaint).

Another cool feature of CouchDB is that it uses HTTP for all interactions.  That means the database is, in effect, its own web server.  Since CouchDB has built-in HTTP features like authentication, cookies, and caching, there’s not much left for the middleware tier to do.  And because the browser can talk to CouchDB directly with AJAX, much more of the logic can live in the browser.
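
As a rough illustration of “HTTP for all interactions”, the sketch below builds the request that would store one document, using only Python’s standard library. Storing a document is a `PUT` to `/<database>/<document id>` with a JSON body. The server address (CouchDB’s default port, 5984), the `blog` database, and the document itself are assumptions made up for this example, and the request is deliberately built without being sent, so no server is needed to run it.

```python
import json
import urllib.request


def build_save_request(base_url, database, doc_id, document):
    """Build (but do not send) the HTTP request that stores one document."""
    url = f"{base_url}/{database}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server address, database name, and document.
request = build_save_request(
    "http://localhost:5984",
    "blog",
    "first-post",
    {"type": "article", "title": "Hello, Couch"},
)

print(request.get_method(), request.full_url)

# Sending it would look like this, assuming a CouchDB server is listening
# at that address and the "blog" database already exists:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

There is no database driver anywhere in that snippet. Anything that can speak HTTP, including JavaScript running in a browser, can make the same request.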

The last thing is the view tier – getting the data back to the user**.  CouchDB uses something called MapReduce, a technique popularized by Google, to query existing data.  It’s a little tricky, and possibly the only thing preventing me from jumping fully on board with CouchDB.  Instead of SQL queries, you write “view functions” for CouchDB.  The end result is pretty much the same: you get back the rows you asked for.
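
The map and reduce steps are easier to see in a toy version. The sketch below imitates the idea in plain Python, in memory. It is not how CouchDB runs views (real view functions are normally written in JavaScript, and the server builds and updates the index for you), and the documents and field names are invented for illustration. It answers the question a SQL user would write as a `GROUP BY` with a `COUNT`: how many articles has each author written?

```python
from collections import defaultdict

# Hypothetical blog documents; the field names are invented for illustration.
documents = [
    {"_id": "a1", "type": "article", "author": "alice", "title": "First"},
    {"_id": "a2", "type": "article", "author": "alice", "title": "Second"},
    {"_id": "a3", "type": "article", "author": "bob", "title": "Guest post"},
    {"_id": "c1", "type": "comment", "author": "bob", "text": "Nice writeup"},
]


def map_articles_by_author(doc):
    """Map step: yield one (key, value) pair per article, skip everything else."""
    if doc.get("type") == "article":
        yield doc["author"], 1


def reduce_count(values):
    """Reduce step: collapse all values that share a key into one number."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply the map step to every document, then reduce the values per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


articles_per_author = run_view(documents, map_articles_by_author, reduce_count)
print(articles_per_author)  # {'alice': 2, 'bob': 1}
```

The map step looks at one document at a time and knows nothing about the others. That restriction is what makes the approach tricky to get used to after SQL.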

As an aside, one of the coolest things about view functions is that they can be stored in CouchDB directly.  You just give them a unique identifier (like “view-blog-articles-by-date”) and store them in CouchDB as part of a special “design document”.  That means application code can be stored in the database, which is great because backing up the database will back up a good chunk of your code as well.
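
In CouchDB, stored views live inside “design documents”: ordinary JSON documents whose ID starts with `_design/`. Here is a sketch of what one might look like, built as a Python dictionary. The design document name (`blog`), the view name (`articles_by_date`), and the `type`, `published_at`, and `title` fields are assumptions for illustration. The view code itself is JavaScript, held as a plain string.

```python
import json

# A JavaScript map function, kept as a string because the view code is
# stored as text inside the document. The "type", "published_at" and
# "title" fields are hypothetical.
map_source = """
function (doc) {
  if (doc.type === "article") {
    emit(doc.published_at, doc.title);
  }
}
""".strip()

# The design document: a regular document with a "_design/" ID and a
# "views" field that maps each view name to its functions.
design_document = {
    "_id": "_design/blog",
    "views": {
        "articles_by_date": {"map": map_source},
    },
}

print(json.dumps(design_document, indent=2))
```

Because this is just another document, it is saved, replicated, and backed up the same way as the data. In current CouchDB releases, a view stored like this would be queried with a `GET` to `/<database>/_design/blog/_view/articles_by_date`.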

So in conclusion, a lot of what makes Rails great, and it is great, is made obsolete by this humble open source application.  I didn’t even get to the built-in scalability that CouchDB has going for it.  CouchDB is a Big Deal, and I’m keeping my eye on it.

–Daniel Tsadok

* To qualify slightly, the concepts introduced by CouchDB are going to be disruptive, whatever form they take (i.e. even if CouchDB itself doesn’t take off).

** I’m fudging the presentation tier and the view tier a bit, in case you’re nitpicky ;-)  This post is not MVC-compliant.


5 Comments on “Database Disruption From The Couch”

  1. mattly says:

    Nice writeup, but you seem to be confusing the differing parts of middleware with the statement “This is all going to be obsolete”.

    While couchdb has its own authentication system, it’s not the sort of thing that would allow say, user signups. If you’re going to allow anonymous creation of something like blog comments, there’s little besides a thin javascript validation function to prevent spam from getting through; you can’t use third-party APIs for determining spamminess nor do you have access to various aspects of the request which can help determine if the comment is legit.

    You’re still going to want some layer of middleware, but perhaps not as much.

    That said, there are some big wins to be had that you barely covered:
    1) documents aren’t just ‘schemaless’, you can throw any type of json in there. A lot of stuff that is otherwise done by join tables can now be done inline in the document itself with an array or hash
    2) since views are computed functions against the document that return a key and value, you can basically treat them like a big key / array pair, or key/further-computed-value with reduce. Views aren’t just for indexing documents.

  2. danieltsadok says:

    @mattly – I’m working on a new post to respond to your comment, as well as others.

  3. […] mattly pointed out that there is still need for a middleware tier, since client-side Javascript is not a reliable way to do validation or security, since Javascript can simply be turned off.  Andrea Schiavini wrote that Rails is much more than a database tier, and that Rails has plenty to offer beyond ActiveRecord (Rails’ database connector). Both objected to my referring to middleware tiers like Rails as “obsolete”. […]

  4. J Chris A says:

    Daniel, at this point I have to plug the CouchApp framework we’ve been working on. There’s a mailing list here:

    http://groups.google.com/group/couchapp

    CouchApp is designed to structure standalone CouchDB application development for maximum application portability.

  5. […] browsing through planet@CouchDB I have found an excellent post about how CouchDB can make all ORM/GORM stuff obsolete. Since CouchDB does not enforce particular […]

