Clarifications to My CouchDB Post

I got great feedback on my previous post about CouchDB, and I wanted to clarify a few points.

mattly pointed out that a middleware tier is still needed, because client-side Javascript is not a reliable way to do validation or security: Javascript can simply be turned off.  Andrea Schiavini wrote that Rails is much more than a database tier, and that it has plenty to offer beyond ActiveRecord (Rails’ database connector).  Both objected to my referring to middleware tiers like Rails as “obsolete”.

Regarding mattly’s point, I really should have said “most of this is going to be obsolete”, not all.  He gives several examples where a middleware tier is necessary because CouchDB doesn’t have a system in place to deal with certain tasks (like comment spam detection).  However, my understanding is that CouchDB’s upcoming security model really is supposed to replace the middleware tier.  What prompts me to say this is the focus on “CouchApps”, which would run on the user’s computer (CouchDB and all), and synchronize the database with a master (“Eventual Consistency”) when the client was finished.  In essence, it’s a way to cache the database on each client.  Security/validation then becomes a matter of which client databases are allowed to update the master.  As far as I understand, it’s very similar to the git source control security model.  More here.

For a more traditional web-based application, I would put CouchDB behind a firewall, open up GET requests via a proxy (like Apache), and restrict all other methods to go through a very simple middleware tier (this helps with Javascript domain restrictions as well).  The middleware tier is not completely gone, as mattly correctly noted, but it’s down to a handful of scripts, not a full-blown application stack.  This assumes that CouchDB’s upcoming security system doesn’t already provide this functionality.
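To make the proxy idea concrete, here is a minimal sketch of the routing rule in Python. The function name, the backend labels, and the decision to treat HEAD like GET are my own assumptions for illustration; a real setup would express this rule in the proxy's configuration rather than in application code.

```python
# Hypothetical routing rule: read-only requests go straight to CouchDB,
# everything else is handed to the small middleware tier.
# Treating HEAD as read-only is an assumption; the text only mentions GET.
READ_ONLY_METHODS = {"GET", "HEAD"}


def route_request(method: str) -> str:
    """Return the (hypothetical) backend that should handle this HTTP method."""
    if method.upper() in READ_ONLY_METHODS:
        return "couchdb"
    return "middleware"


if __name__ == "__main__":
    for method in ("GET", "PUT", "POST", "DELETE"):
        print(method, "->", route_request(method))
```

The point of the sketch is how little logic is left: one decision per request, based only on the HTTP method.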

As far as Rails goes, I happen to be a full-time Rails developer, so I’m quite familiar with the benefits of Rails!  However, much of what Rails offers beyond ActiveRecord is also deprecated in CouchDB’s model.  For example, routing becomes less important when you’re building an AJAX-driven application with CouchDB.  Simply grab the view or document you need with AJAX, and render your web page client-side based on the result.  CouchDB takes care of the routes to the various views and documents already, so you’re left with a few static HTML documents and Javascript libraries that don’t particularly need routing.  Environments in Rails are also less useful when working with CouchDB.  In CouchDB, each database is its own environment.  As long as the databases have the same name on each server, no code has to change.

Anyway, thanks for the feedback, and I look forward to talking more about CouchDB in the future!

–Daniel Tsadok

Database Disruption From The Couch

Web development typically follows a specific set of steps. There are different workflows for different people, and I’m not trying to establish a best-practices workflow here (just the opposite, as you’ll see), but the differences are usually in the order in which the steps are performed, not the actual steps themselves.

The first step (again, conceptually the first, not necessarily chronologically) is to design a database for the application’s needs – the database tier. That means breaking down the various components of your application into tables. For example, if you are building a blog engine, you would need at least a table for articles and another for comments.

The second step is to write a “middleware” tier – that just means software that sits in between the client and the database. You don’t want people to connect to the database directly or they’d be able to wreak havoc. The middleware tends to be where you do things like keeping track of who’s allowed on what page.

The last step would be to design a “view” or “presentation” tier to let your clients interact with your database in a (hopefully) safe manner. The view tier, among other things, takes the data that is broken up into tables and puts it back together again.

Ruby on Rails, which I’ve loved for years, excels at connecting these dots. ActiveRecord, which I consider the crown jewel of Rails, does a fantastic job at simplifying the complex job of creating and managing the database, for example. Rails also handles the middleware tier quite well with ActionController. The view tier (ActionView) doesn’t add all that much, but server-side view tiers rarely do when it comes to the web.

This is all going to be obsolete with CouchDB*.

What makes CouchDB so damn disruptive is not the fact that it beats frameworks like Rails – it’s that it completely side-steps them.  CouchDB dispenses with structured tables altogether, relying on “documents” instead. Documents are arbitrarily complex data structures that contain as much or as little data as you need. The (partially completed) free book on the official site uses the example of the business card.  Some business cards have fax numbers, and some don’t.  Some have website addresses, and some don’t.  In a classic, structured database, you would need to have a column in the database for fax numbers, even if only one business card in your database has one, because the structure of the database has to be in place before the data is added.  If the structure doesn’t accommodate the data, then the structure has to be altered, which is always a potentially dangerous thing (Rails handles this with migrations).  CouchDB lets you put whatever data you want in whatever document (in this case, business card) you want.  That means that instead of having to add a fax number column to every business card in the database, you could simply add a fax number to the new document and only the new document.
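The business card example can be sketched with plain Python dictionaries standing in for CouchDB's JSON documents. The names, numbers, and `_id` values below are made up for illustration, and the helper function is hypothetical; nothing here talks to a real database.

```python
import json

# Two "business card" documents. Only the second one has a fax number,
# and the first one is not forced to carry an empty fax field.
cards = [
    {"_id": "card-alice", "name": "Alice Example", "phone": "555-0100"},
    {"_id": "card-bob", "name": "Bob Example", "phone": "555-0101",
     "fax": "555-0102"},
]


def fax_numbers(docs):
    """Collect fax numbers from the documents that happen to have one."""
    return {doc["name"]: doc["fax"] for doc in docs if "fax" in doc}


if __name__ == "__main__":
    # Each document serializes to JSON independently of the others.
    print(json.dumps(cards[1], indent=2))
    print(fax_numbers(cards))
```

Code that reads the documents has to cope with missing fields (the `if "fax" in doc` check), which is the price of not having a fixed schema.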

ActiveRecord is amazing at simplifying all the structured SQL stuff.  But once that complexity starts to go away, the value added by ActiveRecord starts dropping off dramatically.  If the database tier is already super-simple, why bother with a fairly hefty helper library on top of that?  CouchDB itself makes ActiveRecord (and other fancy ORMs, like Hibernate) largely obsolete (and even quaint).

Another cool feature of CouchDB is that it uses HTTP for all interactions.  In other words, the database is itself a web server.  Since CouchDB has built-in HTTP features like authentication, cookies, and caching, there’s not much left for the middleware tier to do, especially given that CouchDB works directly with AJAX, so much more logic can be put in the browser.
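Here is a rough sketch of what "HTTP for all interactions" looks like from a client, using only Python's standard library. It builds the requests without sending them, so it runs without a CouchDB server. The base URL assumes a local server on CouchDB's default port 5984, and the database name, document id, and helper functions are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: a CouchDB server on the local machine, default port.
BASE_URL = "http://localhost:5984"


def document_url(db: str, doc_id: str) -> str:
    """Every document lives at its own URL: /<database>/<document id>."""
    return f"{BASE_URL}/{quote(db, safe='')}/{quote(doc_id, safe='')}"


def read_request(db: str, doc_id: str) -> Request:
    """Reading a document is a plain GET."""
    return Request(document_url(db, doc_id), method="GET")


def write_request(db: str, doc_id: str, doc: dict) -> Request:
    """Creating or updating a document is a PUT with a JSON body."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        document_url(db, doc_id),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = write_request("blog", "first-post", {"title": "Hello"})
    print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. The same URLs can be fetched from browser Javascript, which is why so little server-side glue is needed.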

The last thing is the view tier – getting the data back to the user**.  CouchDB uses something called MapReduce, invented at Google, to deal with searching through existing data.  It’s a little tricky, and possibly the only thing preventing me from jumping fully on board with CouchDB.  Instead of SQL queries, you write “view functions” for CouchDB.  The end result is pretty much the same.
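To give a feel for the map/reduce idea, here is a small in-memory simulation in Python. Real CouchDB view functions are written in Javascript and run inside the database, and the real reduce function takes more arguments than shown here. The documents, field names, and the `run_view` helper are all made up for illustration.

```python
from collections import defaultdict

# Sample documents; "type" and "author" are hypothetical field names.
docs = [
    {"type": "article", "title": "Hello", "author": "daniel"},
    {"type": "article", "title": "CouchDB", "author": "daniel"},
    {"type": "comment", "body": "Nice post", "author": "mattly"},
    {"type": "article", "title": "Guest post", "author": "andrea"},
]


def map_article(doc):
    """Map step: look at one document and emit zero or more (key, value) pairs."""
    if doc.get("type") == "article":
        yield doc["author"], 1


def reduce_count(values):
    """Reduce step: collapse all values emitted for one key into one result."""
    return sum(values)


def run_view(documents, map_fn, reduce_fn):
    """Simulate a view: map every document, group by key, reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


if __name__ == "__main__":
    print(run_view(docs, map_article, reduce_count))
    # {'andrea': 1, 'daniel': 2}
```

This plays roughly the same role as a `SELECT author, COUNT(*) ... GROUP BY author` query in SQL, which is why the end result feels so similar.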

As an aside, one of the coolest things about view functions is that they can be stored in CouchDB directly.  You just give them a unique identifier (like “view-blog-articles-by-date”) and store them as documents in CouchDB.  That means application code can be stored in the database, which is great because backing up the database will back up a good chunk of your code as well.
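As a sketch of "code stored as a document": in current CouchDB releases, as far as I understand, views live in so-called design documents whose `_id` starts with `_design/`, with the Javascript source stored as a string. The design document name, view name, field names, and the `view_path` helper below are hypothetical, and the exact layout may differ between CouchDB versions.

```python
import json

# A design document is an ordinary JSON document; the view's map function
# is Javascript source code stored as a string inside it.
design_doc = {
    "_id": "_design/blog",
    "views": {
        "articles_by_date": {
            "map": (
                "function(doc) {"
                " if (doc.type === 'article') {"
                " emit(doc.published_at, doc.title); } }"
            )
        }
    },
}


def view_path(db: str, ddoc: dict, view_name: str) -> str:
    """Build the URL path under which a stored view would be queried."""
    if view_name not in ddoc["views"]:
        raise KeyError(f"no view named {view_name!r} in {ddoc['_id']}")
    design_name = ddoc["_id"].split("/", 1)[1]
    return f"/{db}/_design/{design_name}/_view/{view_name}"


if __name__ == "__main__":
    # Because the design document is plain JSON, it is backed up and
    # replicated like any other document in the database.
    print(json.dumps(design_doc, indent=2))
    print(view_path("blog", design_doc, "articles_by_date"))
```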

So in conclusion, a lot of what makes Rails great, and it is great, is deprecated by this humble open source application.  I didn’t even get to the built-in scalability that CouchDB has going for it.  CouchDB is a Big Deal, and I’m keeping my eye on it.

–Daniel Tsadok

* To qualify slightly, the concepts introduced by CouchDB are going to be disruptive, whatever form they take (i.e. even if CouchDB itself doesn’t take off).

** I’m fudging the presentation tier and the view tier a bit, in case you’re nitpicky ;-)  This post is not MVC-compliant.