Monday, May 24, 2010

quick blurb on NoSQL

I've spent about as much time thinking about this as NoSQL developers spend thinking about their schema, but here it is anyway.

At SourceForge I'm presently developing and maintaining a few different systems using all kinds of web tech's and languages - PHP, python, solr, Postgres, MySQL, and mongo. One thing I'm noticing is that the mongo systems are something of a breeze to write, and then a real challenge to maintain - especially debugging. Our mongo experts mostly say that the tooling for mongo is just 'immature.' I'm sure they're right, but that also points toward what I think might be a fundamental difference in the two modes of development.

AFAIK, there aren't any "old" NoSQL systems around? Mongo is only out since 2007, and Cassandra since 2008? We started using mongo early 2009, and even just one year out it feels so much more painful to maintain than our Postgres or MySQL systems that have been around since 1999! My theory is that NoSQL sacrifices maintenance and future development effort for the sake of startup development. I even made a neat drawing:



Initially mongo seems to save on effort until the first valley - initial launch. At this point, the system launches and typically starts interacting with other systems and with users - data requirements change towards reality, which means code - i.e., function and logic - changes, not just model. In our environment, all other systems that use the data must also change their code which seems harder than the originating code. The code and the data are so intermixed that seemingly any and every change in either domain makes knock-on effects that have to be addressed.

In a typical schema data system, we front-load a bit of data modeling effort. After launch, when we get new and changing data requirements we typically address the schema changes that might be involved, and may have to write a data migration/transformation script. But beyond that, it seems we don't have to worry about data integrity or any other knock-on effects. We can change some data-access or model classes and be on our way.

So, am I just an old crusty developer shouting at these NoSQL kids to "get off my lawn!" ? Or has anyone else noticed this too? Maybe it's just the heterogeneous mix of NoSQL + schema that's killing me. Just seems like such a pain for not enough benefit?