Friday, April 15, 2011
Monday, May 24, 2010
quick blurb on NoSQL
I've spent about as much time thinking about this as NoSQL developers spend thinking about their schema, but here it is anyway.
At SourceForge I'm presently developing and maintaining a few different systems using all kinds of web technologies and languages - PHP, Python, Solr, Postgres, MySQL, and Mongo. One thing I'm noticing is that the Mongo systems are something of a breeze to write, and then a real challenge to maintain - especially to debug. Our Mongo experts mostly say that the tooling for Mongo is just 'immature.' I'm sure they're right, but that also points toward what I think might be a fundamental difference between the two modes of development.
AFAIK, there aren't any "old" NoSQL systems around? Mongo has only been out since 2007, and Cassandra since 2008? We started using Mongo in early 2009, and even just one year out it feels so much more painful to maintain than our Postgres or MySQL systems that have been around since 1999! My theory is that NoSQL sacrifices maintenance and future development effort for the sake of startup development. I even made a neat drawing:
Initially Mongo seems to save on effort until the first valley - initial launch. At this point, the system launches and typically starts interacting with other systems and with users. Data requirements change towards reality, which means the code - i.e., function and logic - changes, not just the model. In our environment, all other systems that use the data must also change their code, which seems harder than changing the originating code. The code and the data are so intermixed that seemingly any and every change in either domain causes knock-on effects that have to be addressed.
In a typical schema-based data system, we front-load a bit of data modeling effort. After launch, when we get new and changing data requirements, we typically address the schema changes that might be involved, and may have to write a data migration/transformation script. But beyond that, it seems we don't have to worry about data integrity or any other knock-on effects. We can change some data-access or model classes and be on our way.
So, am I just an old crusty developer shouting at these NoSQL kids to "get off my lawn!"? Or has anyone else noticed this too? Maybe it's just the heterogeneous mix of NoSQL + schema that's killing me. It just seems like such a pain for not enough benefit.
Tuesday, January 26, 2010
a rant about ranting
Disclaimer: this post is totally my own opinion and does not reflect anything from SourceForge at all. That's why it's here on this blog.
I'm angry and want to shoot my mouth off - perfect opportunity for a long-lost blog.
We - i.e., SourceForge - are getting some crap for blocking sanctioned countries from our site. That's fine - I'm actually ticked off about it too. And many people out there are making sound and solid comments about the action - not just the ones defending SourceForge; there are some good solid critical comments too.
But then you have people who say something like this:
Sourceforge, you suck! You suck so badly, I’ll hereby guarantee you that I’ll not only recommend *anybody* stay the heck away from you scumbags, I’ll actively let everybody know that you’re the scum of the earth. Shame on you! Shame!
With love from pyalot. Well pyalot, since we're all good to judge and criticize each other, let's get started ...
So you are Florian Bösch. Okay Florian, let's see here ... you've worked at Systor(?), Accenture, and DWS. Systor doesn't seem too keen on open-source?, nor does DWS. Ah, looks like Accenture has some good open source work; but what's this?! It's right alongside Microsoft and Oracle solutions?! OMFG! You are the scum of the earth for working with them! GRARRR!
Or, if I take an extra minute, I find you're actually a stand-up guy and developer and a good contributor to open-source!
Couple lessons here:
- we're not anonymous on the internet anymore; I found all of this info on Florian starting from his sf.net user page
- when we only look at a single facet of any news story or party, we get a very distorted view
I actually sympathize with Florian's sentiments - blocking access from countries goes against the FLOSS ideal. But at the end of the day, SourceForge is a US company under US law. And if we're not law experts we should probably speak our opinion quietly or not at all.
Tuesday, July 21, 2009
OSCON quotes - day 1
I want to share quotes I overheard at OSCON 2009. Most of these are from fellow SourceForgers ...
- I'm a fan of the minimalist beauty of the electronic device.
- Your API is not a beautiful fucking snowflake.
- I am as asymptotically close to clean as possible.
- You're going to be happy about not being happy.
- I'm German, we know how to deal with crowds.
- It doesn't matter, you eat it with rice and bread.
- I fucked the grower to get this shit.
- It's amazing what you can fit up your ass with a little practice.
- I don't like my balls soaked in sugar syrup.
- People shouldn't call each other tar pit.
- There's nothing you can think of with an olive that I haven't already video'd and sold on the internet.
- Is this the placenta thing?
- All eating human flesh stories start with, "I was going to med school."
Friday, February 06, 2009
Test-Driven [Design|Development]
Today I learned to appreciate Test-Driven Design a little bit more. Here's the story.
I'm writing some RSS feeds that will contain extensions and other non-RSS elements using XML Namespaces. I'm using Zend_View and Zend_Feed and I thought the best place to put the namespaces would of course be at the top of my default.rss.phtml template file - that way I can register all the namespaces at once at the top of the feed. Instead of writing the test first, I wrote the code first. Took maybe 10-20 minutes and seemed to work fine:
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:doap="http://usefulinc.com/ns/doap#"
     xmlns:sf="http://sf.net/api/sfelements.rdf#"
     xmlns:foaf="http://xmlns.com/foaf/0.1/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     version="2.0">
....
</rss>
Then I go to write the test. Lo and behold - it's a big pain in the ass to consume the feed using SimpleXML.
It's easy enough to create a SimpleXML element out of the feed, but I can't create SimpleXML elements from the content:encoded XML data:
<content:encoded>
<![CDATA[<doap:Version>
  <doap:name>Project 1.1 - Foobaj</doap:name>
  <doap:created>1202221896</doap:created>
  <doap:helper>
    <foaf:Person>
      <foaf:name>admin1</foaf:name>
      <foaf:homepage rdf:resource="http://lcrouch-703.sb.sf.net/users/admin1"/>
      <foaf:mbox_sha1sum>6dd817a0f71590a68131a5e83b1bd73944654e8d</foaf:mbox_sha1sum>
    </foaf:Person>
  </doap:helper>
  <doap:file-release>proj1.file1.tgz</doap:file-release>
  <sf:download-count>0</sf:download-count>
</doap:Version>]]>
</content:encoded>
That fails because none of the namespaces used in the DOAP class are declared in the content. Argh! My first thought is to screw SimpleXML and do a raw string search/parse in the test. But then I had my epiphany: "If I were an actual client of this feed, I would want to be able to parse it easily with SimpleXML or with any other XML library."
I ended up pushing the xml namespace declarations right down into the appropriate elements - where I now think they are *supposed* to be:
<content:encoded>
<![CDATA[<doap:Version
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:sf="http://lcrouch-703.sb.sf.net/api/sfelements.rdf#">
  <doap:name>Project 1.1 - Foobaj</doap:name>
  <doap:created>1202221896</doap:created>
  <doap:helper>
    <foaf:Person
        xmlns:foaf="http://xmlns.com/foaf/0.1/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <foaf:name>admin1</foaf:name>
      <foaf:homepage rdf:resource="http://lcrouch-703.sb.sf.net/users/admin1"/>
      <foaf:mbox_sha1sum>6dd817a0f71590a68131a5e83b1bd73944654e8d</foaf:mbox_sha1sum>
    </foaf:Person>
  </doap:helper>
  <doap:file-release>proj1.file1.tgz</doap:file-release>
  <sf:download-count>0</sf:download-count>
</doap:Version>]]>
</content:encoded>
Voila - SimpleXML starts parsing everything very easily.
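The same thing should show up in any namespace-aware parser, not just SimpleXML. Here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is not my actual PHP test, and the trimmed-down fragments plus the helper names try_parse and release_name are made up for illustration. The fragment with undeclared prefixes is rejected as not well-formed, while the one that carries its own declarations parses on its own:

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down copies of the CDATA payload from the feed.
# First version: the doap: and sf: prefixes are used but never declared.
UNDECLARED = (
    "<doap:Version>"
    "<doap:name>Project 1.1 - Foobaj</doap:name>"
    "<sf:download-count>0</sf:download-count>"
    "</doap:Version>"
)

# Second version: the declarations are pushed down onto the element itself.
DECLARED = (
    '<doap:Version xmlns:doap="http://usefulinc.com/ns/doap#" '
    'xmlns:sf="http://sf.net/api/sfelements.rdf#">'
    "<doap:name>Project 1.1 - Foobaj</doap:name>"
    "<sf:download-count>0</sf:download-count>"
    "</doap:Version>"
)

# Prefix-to-URI map used only for lookups on the client side.
NS = {
    "doap": "http://usefulinc.com/ns/doap#",
    "sf": "http://sf.net/api/sfelements.rdf#",
}


def try_parse(fragment):
    """Return the root element, or None if the fragment is not well-formed."""
    try:
        return ET.fromstring(fragment)
    except ET.ParseError:
        # An unbound namespace prefix ends up here.
        return None


def release_name(fragment):
    """Return the text of doap:name, or None if the fragment cannot be parsed."""
    root = try_parse(fragment)
    if root is None:
        return None
    return root.findtext("doap:name", namespaces=NS)
```

A client that pulls the content:encoded text out of the feed and hands it to a second parse never sees the declarations on the outer rss element, which is why they have to travel with the fragment.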
This is one of the biggest boons for Test-Driven Development - the effects it has on the way you design your code. If I had not tested my code as an actual client would use it, I would have produced some pretty shoddy feeds with useless XML namespacing.
Friday, January 23, 2009
Leave the editor open
I've been trying to adopt some practices from The Productive Programmer. Mostly by using more keyboard shortcuts and productivity tools like Quicksilver, Jumpcut, etc.
Yesterday and today I realized a productivity tactic that isn't in the book - just leave your work open when you "go home" for the night. Don't close the program. In fact, don't even close any files, tabs, or any background programs either. Just save everything and walk away.
The effectiveness of this trick is related to something Joel wrote about a while back ...
For me, just getting started is the only hard thing. An object at rest tends to remain at rest. There's something incredible heavy in my brain that is extremely hard to get up to speed, but once it's rolling at full speed, it takes no effort to keep it going. Like a bicycle decked out for a cross-country, self-supported bike trip -- when you first start riding a bike with all that gear, it's hard to believe how much work it takes to get rolling, but once you are rolling, it feels just as easy as riding a bike without any gear.
Maybe this is the key to productivity: just getting started. Maybe when pair programming works it works because when you schedule a pair programming session with your buddy, you force each other to get started.
In the bike metaphor, leaving all your work open is like leaving the bike poised on a down-hill slope. All you have to do is get back to it and hop on. If I sit down at a blank desktop, I'm more likely to open my email, read my RSS feeds, open work email, and THEN, finally, open my code editors. If I sit down in front of a code editor, I'm likely to start editing code immediately.
Friday, January 09, 2009
Seven things that probably you may not know about me
Anderson tagged me, so I'll give this a try, though I'm going to have a tough time finding 7 other people who haven't been tagged already.
- I have a black belt in the hodge-podge kick-boxing-jujutsu-taekwondo-karate style of fighting they teach at Apollo's Karate.
- I have an identical twin brother, and 2 older brothers, one of whom is also a PHP developer.
- I am emerging Catholic.
- I brew my own beer.
- I love soccer. I try to play every weekend. Also, GO REDS!
- I can speak conversational Russian. I also speak a little French, a tiny bit of German and Portuguese, and I'm starting to learn Spanish. I'm only fluent in English though. :(
- I landed my job at SourceForge after I made an OSS project there. So go make one yourself! :)
I'll tag ...
- Matt Crouch - above-mentioned brother.
- Travis West - SourceForge colleague and long-time friend.
- Steven Osborn - developer from Vidoop who helped me run the Tulsa PHP User Group until he ditched us for Portland. ;)
- Noah Everett - Tulsa PHP developer I met thru TPUG; he created and maintains twitpic.
- Brad Vernon - Tulsa PHP + Ruby developer I met thru TPUG.
- Vance Lucas - Another PHP developer here in Oklahoma. Met him at Tulsa Tech Fest 2008.
- Rafael Dohms - PHP developer in Brazil; he saved my ass at PHP Conference Brasil '08 when he found a DVI-VGA adapter for me to present my keynote.
So tagging it back to Brazil, from whence I was tagged. :)