SPIN meeting next week
12 Apr
The Cape Town Software Process Improvement Network is meeting next Wednesday (ie, 2007/04/18) at the Bandwidth Barn in the CBD. I won't be there (prior commitments), but it might be a good place to meet people who care about the theory behind developing software, not just putting together glorified web sites like I do...
So far, I've been looking at modifying existing pages in Gibe (my still as-yet-unreleased TurboGears blog application) - adding widgets to post pages, dynamically adding the comment field and handling it for different comment formats, and adding additional fields at blog entry create/edit-time and handling these fields to add tagging (or whatever). Adding new pages (or replacing the default ones) is pretty much necessary - for example, to add a page where there is a list of all pages with a particular tag.
I use Routes for dispatching incoming URLs to functions in Gibe. It's not the default dispatcher in TurboGears, but it's pretty easy to set up (there's a TurboGears/Routes integration recipe on the TurboGears wiki).
Why go through the bother?
It makes adding new pages easy - no matter how complicated the URL structure is and where the dynamic portions are. It also makes it easy to pass through the dynamic portions, and to pass through defaults if the dynamic portions don't exist. The killer feature is named routes, which allows me to look up where something is (ie, generate the URL for it), and not hard-code the link to where the page is. That means that I can totally change the URL structure of the site without changing any code.
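To make the named routes idea concrete, here is a minimal sketch in plain Python. It is not the Routes API: the `NamedRoutes` class, its `connect`, `url_for` and `match` methods, and the `tag` route are all made up for illustration, and defaults here only fill in values that the caller or the URL doesn't supply.

```python
import re


class NamedRoutes:
    """Toy named-route table (hypothetical, not the Routes library)."""

    def __init__(self):
        # name -> (template, defaults, compiled regex)
        self._routes = {}

    def connect(self, name, template, **defaults):
        # Turn "/tags/{tag}" into a regex with one named group per placeholder.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
        self._routes[name] = (template, defaults, re.compile("^" + pattern + "$"))

    def url_for(self, name, **params):
        # Look the URL up by name instead of hard-coding it in a template.
        template, defaults, _ = self._routes[name]
        values = dict(defaults)
        values.update(params)
        return template.format(**values)

    def match(self, url):
        # Return (route name, parameters) for an incoming URL, or None.
        for name, (_, defaults, regex) in self._routes.items():
            found = regex.match(url)
            if found:
                result = dict(defaults)
                result.update(found.groupdict())
                return name, result
        return None


routes = NamedRoutes()
routes.connect("tag", "/tags/{tag}", action="list_by_tag")
print(routes.url_for("tag", tag="python"))  # /tags/python
print(routes.match("/tags/python"))
```

If the route is later reconnected as `/blog/tagged/{tag}`, every caller of `url_for("tag", ...)` picks up the new URL without being touched, which is the whole point.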
I'm growing a little tired of how the "industry" complains about a lack of skills. Now, they're saying software graduates are lazy.
I think the "industry" has a few problems. Frankly, it's boring. And, well, it's wrong-headed. Why does "the industry" always look for a quantity of software graduates? I've seen job adverts for "10 Junior PHP programmers", for example. It's not unusual for a large number of people with similar, low-end skills to be asked for. At the same time, there's a massive gap between that low experience level and the early-career experience level.
My first job was great. I was well-paid, well-treated, and was surrounded by intelligent people. Three rather decent jobs after that, I wasn't even earning the inflation-adjusted amount I was earning in my first job. And, yet, somehow, a few months after that meant a difference of 40% or so in the salaries of the types of positions I was being asked to apply for.
I can only imagine it's as irritating for other people, working their way up between the fifteen billion other graduates that compete for the sorts of jobs you're able to apply to with your experience. You're at least 20 times more effective than the average of those fifteen billion people, and somehow you're being extortionate for asking for 10-30% more salary. And you need that salary, since, being interested in the field, you want to buy books, be online, and so forth.
The reason there are fifteen billion other graduates is that the companies are asking for quantity of staff, not quality. People see lots of jobs open, and so decide to go "into IT". Those people who go "into IT" because of the available jobs are just not worth as much as those actually interested in the particular subject. 10 low-experience programmers earning R5k a month (take home) aren't nearly as valuable as 2 higher-experience programmers earning R25k a month (take home), even though both teams come to the same R50k a month in salaries, and the 10 low-experience programmers also cost more because they use more desk space, more parking bays, and so forth.
That's bad enough, of course, but now they're calling IT graduates lazy, because the average IT graduate probably is lazy, because the average IT graduate is in the wrong field. If you want someone who isn't lazy, don't ask for 10 graduates - ask for 2 higher-experience people.
The worst reason I see for hiring more junior people is that more senior people - people with more love for the field - tend to move jobs, which is a lot worse than if you only have one or two of the ten people churning at a time. But, frankly, look at the way you treat the more senior people, and you'll quickly find why they leave - because they're not given the things they need to perform.
Unless you're grossly underpaying your senior staff, they're likely to stay if they're treated well. Sure, that may mean forking out more so that the development area is properly lit. Or that the environment has enough air flow. That the temperature is managed. That there are enough plug points. That there are two LCD monitors on their desks. Heck, offices for every, or every two or three, developers, so that they're not constantly surrounded by noisy colleagues who sing along to music playing on their earphones, or are making sales calls, or who just talk to themselves loudly or discuss the cricket or how to make money fast with property with other members of the staff. But, you'll find, they're worth it. They're worth more than five other people, and don't cost five times as much. And they're there and if you treat them well, they'll stay there. And they know what they're doing already! Less time wasted on training!
Now, people will say that I'm being elitist - that I'm not thinking about ways for junior people to join the industry. Well, firstly, boo hoo! Why should we care about all of the artificially high number of people who go into an industry for the wrong reason and into an industry that doesn't actually need them? We should care about those that are in it for the right reason, and those that would be in it if given the opportunity.
With less chaff, there will be less competition for those that are in it for the jobs available. Those that would be in it if given the opportunity are not a problem that is solved by having tons of low-experience jobs. That requires work before they even decide what job they want to go into - they need to know that they're interested in it and/or suitable for it by then.
Of course, this does leave a lot of people who've been through all these courses and so forth without something to do. Maybe we can buy all of them a series of books by W. Richard Stevens, Frederick Brooks, and Donald Knuth, and see who floats. It'll be cheaper than the fly-by-night or utterly useless "programming course" they've been on and will go on again when they're conned into thinking it'll get them a well-paying job.
Over the weekend, I did the first PHP programming I've done this year, on a whim project to write an Amazon S3 storage manager for KnowledgeTree.
In KnowledgeTree, a storage manager describes where to and how to store and otherwise manage the files attached to documents in KT. S3 is a remote storage web service from Amazon, allowing reliable, scalable, and net-speed-quick storage for data. Most of the code ended up being around configuration - particularly in terms of making it easy to check whether the S3 storage manager is properly configured.
This starts with a status dashlet (a dashboard "portlet" - a little bundle of information on the "front page"), which makes it hard to miss if you haven't configured things yet:

(It also tells you if you haven't told KnowledgeTree to use the S3 storage manager.)
The S3 storage manager adds an administration page, allowing through-the-web configuration of the plugin. One of the things I'm sorry I never got around to was getting configuration in KT pushed more into the database - the constant push for "cool" features meant the basics that affect actual users never got any time. Before configuration, the page looks like this:

The "No" next to testing reminds administrators whether they've done a test of the settings they've entered. The "Test now" makes it easy to see if the settings are correct without trying to add a document.
The Amazon Web Services settings are quite boring, since they can't be detected. But, for the S3 configuration, we can query the available storage buckets and present the information to the user:

Once you've set it up, it's actually quite sad how it "just works". There's no difference in terms of using KnowledgeTree to add, bulk add, or delete documents (except that it's a bit slower for those poor South Africans with not-so-wonderful connectivity). Downloading is fast, since the files are cached locally.
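For the curious, the general shape of a storage manager that sits in front of a remote store and caches locally can be sketched in a few lines. This is Python rather than the plugin's PHP, and none of it is KnowledgeTree's or Amazon's actual API: `InMemoryRemote` stands in for the remote service, `CachingStorageManager` and its method names are invented, and writing the cache on upload is an assumption of this sketch.

```python
import os
import tempfile


class InMemoryRemote:
    """Stand-in for a remote object store (hypothetical, NOT the S3 API)."""

    def __init__(self):
        self.objects = {}
        self.downloads = 0  # counts how often the "network" is hit

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        self.downloads += 1
        return self.objects[key]

    def delete(self, key):
        self.objects.pop(key, None)


class CachingStorageManager:
    """Stores documents remotely and keeps a local copy for fast downloads."""

    def __init__(self, remote, cache_dir):
        self.remote = remote
        self.cache_dir = cache_dir

    def _cache_path(self, key):
        # Flatten the key so it is a single file name inside cache_dir.
        return os.path.join(self.cache_dir, key.replace("/", "_"))

    def upload(self, key, data):
        self.remote.put(key, data)
        with open(self._cache_path(key), "wb") as handle:
            handle.write(data)

    def download(self, key):
        path = self._cache_path(key)
        if not os.path.exists(path):
            # Cache miss: fetch from the remote store once and keep a copy.
            with open(path, "wb") as handle:
                handle.write(self.remote.get(key))
        with open(path, "rb") as handle:
            return handle.read()

    def delete(self, key):
        self.remote.delete(key)
        path = self._cache_path(key)
        if os.path.exists(path):
            os.remove(path)


manager = CachingStorageManager(InMemoryRemote(), tempfile.mkdtemp())
manager.upload("documents/1", b"hello")
print(manager.download("documents/1"))
```

The rest of the application only ever calls `upload`, `download` and `delete`, which is why swapping the storage backend doesn't change how documents are added or removed.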
The entire storage manager is only 250 lines of code, most of which is caused by the immaturity of the storage manager framework, because pretty much the only non-standard storage manager that we wrote was an in-database storage manager (which was quite slow, and thus was forgotten). The admin page, status dashlet, and so forth come to 500 lines of code. You can download the plugin in ZIP and tar.gz formats.
Contemplating success
22 Feb
Getting out of the never-ending mad dash that ruled my life for much of last year is starting to give me time to think about and hopefully learn from the events from that time.
Lately this thinking has revolved around how to do technology right, and how technology should be treated by business. There's always lip service about what technology should be to a business, and how the engagements should work, and so forth. But I think we so often find ourselves working from a flawed understanding of what success is to the business that is initiating the project.
A common position I end up in is taking over what pretty much anyone would call a failed project - a project where a lot of money has been spent and the resultant product is not even of sufficient quality to expose to others, let alone started to make money. Aspects of the project that have already been paid for have been found not to be of sufficient quality (or even delivered at all), and need to have more money spent on them.
But I hack on the existing project for a few months to get it to the point where I wouldn't put a bag over my head to avoid being associated with it. The project launches, and after a few rough patches and late nights, I get the last major kinks out, and the project starts making money. Some might even call it a "success".
But is it? If I'd been on the project from the beginning, my arrogance forces me to insist that the project would have been delivered properly the first time, with less back and forth to QA, with fewer publicly visible problems post-launch, and thus would be cheaper in terms of direct costs and indirect costs to regain the confidence lost during this whole process. So, while the second part of the project (which some may call another project entirely) in itself is successful, the entire process of developing the technology was more expensive than it needed to be.
And that's not even the most damning difference. If I'd been on the project from the beginning, my arrogance forces me to believe that the end result would be better in the way that most matters about technology in the long run - how it facilitates or hinders changes in future. Or, put simply, a solution I'm involved in would give the company agility.
Many companies have a great idea, and get to market by the skin of the teeth of the people they could find and afford to do their initial technology. And, because of the way the technology was rushed and was cobbled together, changing it is often scarily hard. Which isn't a problem at first (when they have the existing developers who remember all the ins and outs of the system), and so the company's great idea and timing and efforts make them profitable and renowned. Fast forward a few years, and they're cursing their main source of income and often the thing they're most famous for (if you're talking a web site, for example).
Why?
The developers are usually gone by now. Even if they aren't, they no longer have the ability to keep in their heads all the hacks they've been forced to put into the original system to keep it ticking over "until we rebuild it all". There are entire pages of code dedicated to special cases for particular user names, groups, and so forth. When changes do happen, all effort is put into reducing the intrusion on the original system, because the company has been bitten by the hard-to-detect consequences of previous changes.
Success or failure?
At the beginning, a great idea was all that was needed to get into the market, and delivery could be made timeously. Now, there are more people available, and there's more money available. Sounds good... But, despite the additional people and money, even small ideas and small changes take a long time to make, and the next great idea is near impossible to implement. These are the small and big ideas and changes necessary to continue being relevant - to not become an also-ran of the very area they once owned.
Unless your business usage of the technology is once-off - a gimmick web site that has a specified shelf life - then you have got to think about the cost of change built into that technology. (By the way, this is also why open source and/or open standards are such a no-brainer in the long term - you never end up having your data and processes tied up in some proprietary system that makes the cost of change too high when you need to change something about your business.)
I don't claim to have the answer to how one can define success, but this is the simplest way to describe what I'm feeling now:
Success can't be measured only by what you have achieved - the resources, the accolades, and the good will. More important than those, it should be measured also by what you can achieve from this point on.
(This also applies to any development project independently of the business that initiates it. Taking the laziness as a virtue approach, you can measure your success by how much effort you save yourself by building something that's easy to maintain and extend in future.)
For me, ToscaWidgets is one of the most exciting things I'm watching grow at the moment.
From the first Hello World test with TurboGears, I realised there was just something special about its widgets system (and the videos certainly didn't hurt either). It felt much like when I started using WebWare and FunFormKit, and subsequently got to know FunFormKit a little better, coming to appreciate simple things like the validators/converters and then auto-generating admin pages for objects using adapters and SQLObject, but amplified.
If you haven't used either, the winning idea is bundling together not only the visual/behavioural/content aspects of a component in terms of Javascript, form fields, or other HTML, but also the validation and conversion of whatever is entered into the browser into something useful to you as a programmer. Reusably - just steal it from someone else - and extensibly. And with standard reactions to invalid input - to the point that you don't have to worry about 99% of the problem cases.
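A rough sketch of that bundling, in plain Python rather than the TurboGears or FunFormKit APIs: the `IntField` widget, the `Invalid` exception and `handle_submission` are all hypothetical names, chosen only to show rendering, conversion and the standard reaction to bad input living in one reusable object.

```python
import html


class Invalid(Exception):
    """Raised when submitted text cannot be converted."""


class IntField:
    """Toy widget: renders an <input> and converts what comes back to an int."""

    def __init__(self, name, label, minimum=None):
        self.name = name
        self.label = label
        self.minimum = minimum

    def to_python(self, raw):
        # Validation and conversion: browser string in, useful value out.
        try:
            value = int(raw.strip())
        except (ValueError, AttributeError):
            raise Invalid("%s must be a whole number" % self.label)
        if self.minimum is not None and value < self.minimum:
            raise Invalid("%s must be at least %d" % (self.label, self.minimum))
        return value

    def render(self, raw="", error=None):
        # Display: the same object knows how to draw itself, error included.
        field = '<label>%s <input type="text" name="%s" value="%s"></label>' % (
            html.escape(self.label), html.escape(self.name), html.escape(raw))
        if error:
            field += '<span class="error">%s</span>' % html.escape(error)
        return field


def handle_submission(widget, raw):
    """Return (value, None) on success, or (None, redisplayed form) on bad input."""
    try:
        return widget.to_python(raw), None
    except Invalid as exc:
        return None, widget.render(raw, str(exc))


age = IntField("age", "Age", minimum=0)
print(handle_submission(age, " 42 "))  # (42, None)
print(handle_submission(age, "abc")[1])
```

The calling code never checks the input itself; it either gets a proper `int` or a form ready to show again with the error next to the field.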
But as much as I like TurboGears, I'm not going to be using it exclusively. I'm using Django at work, and I've been looking admiringly at Pylons recently for a little personal project. The biggest advantage I see in TurboGears is the re-use of good default existing components and being able to use alternates from the existing components out there. Which means that when you're not using the standard dispatching mechanism (by using Routes, for example), the standard ORM (by using SQLAlchemy before TG 1.1, for example), or the standard templating engine (by using Genshi before TG 1.1, or using Brevé), it's sometimes hard to motivate for using TurboGears.
Anyway, despite often not using the components chosen by TurboGears, it's that the components I do use are generally available that appeals to me. For example, Django's ORM doesn't interest me until I can use it in a TurboGears or Pylons project (and until it doesn't have those hideous __exact things in the parameters). But I liked what I saw in TurboGears's widgets, and FunFormKit wasn't a realistically active competitor.
So, when ToscaWidgets was announced - the TurboGears widgets system rearchitected to not rely on TurboGears or CherryPy - I was really happy. While it would probably be a hard sell to make it the One True Way to other mega-framework folks, here's to hoping that it'll be easy to use whether you're using TurboGears, Django, Pylons, web.py, or any old WSGI or even CGI application. (And don't forget Zope!)
(and here's to hoping that the new layout I'm using hasn't introduced ugly problems preventing Python luminaries from posting again...)
Simpleblog series continues
21 Nov
One can get quite used to life with setuptools. While developing and deploying gibe, the install_requires setting in setup.py has come to be my friend, ensuring that everything I need is installed in the environment I'm working in. But when investigating anti-spam options after the twenty or so spam messages overnight, I suddenly realised that there is a scary world without eggs.
Two options showed up in the Cheese Shop - spambayes and akismet Python API.
I'd used spambayes before, adding it to my vellum install to reduce spam. So, I just popped it into install_requires and reran python setup.py develop to get it installed. But there was no egg package available for setuptools. It felt a lot like culture shock. Anyway, I didn't give up immediately, and found out I could store the spam information in the RDBMS. But then I decided to see what else there was.
I'd heard of akismet before - I saw it on the KnowledgeTree People blogs that use WordPress, but I went with moderation on those instead. But since akismet keeps itself automatically up to date and I just felt like trying something different, I figured it was worth a shot. Again, no egg file. Thankfully, it is just a single file, though, and that means I can just bundle it, and things will work on a from-scratch Python environment (since I've been using virtual-python extensively lately).
Anyway, akismet has so far prevented 10 or so spam messages in the past few hours. I know, because I've also integrated TurboMail, which now notifies me on all comments, whether they pass the spam test or not. Besides a typo that prevents message delivery (who cares about that aspect of mail?), deployment with setuptools was flawless - just a simple easy_install of my updated gibe package.
Of course, now I need to figure out how I can send patches to the akismet and spambayes people to get them egged-up.
Introducing gibe
07 Oct
On and off over the last two weeks I've been developing gibe, to replace vellum as my web log software. Gibe is written in Python (of course), and uses the TurboGears web-based development mega-framework. Well, with an alternate set of tools - Routes for dispatching, Genshi for templating, and SQLAlchemy for database connectivity and ORM, to facilitate my learning of these tools.
How has it been going?
Well, Genshi really helps to ensure valid HTML everywhere. Vellum's templating system, unfortunately, was one of those build-it-with-strings and occasional embedded Python code. Genshi's XML-based templating is spot on for almost all uses - separating a list with some character is not one of those, although I found a nice solution for that. It does silently swallow certain types of errors, which is quite confusing, and also quite surprising in a Python module. But the HTML sanitiser is really great, and I can see myself writing a few filters for it, and maybe writing some code to make applying filters to particular streams in a larger template easier (to make a comma-separated list relatively trivial).
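The separator problem itself is easy to state outside any templating engine. As a minimal sketch, assuming nothing about Genshi's filter API and working on plain Python iterables rather than markup streams, a hypothetical `intersperse` helper puts a separator between items but never after the last one:

```python
def intersperse(items, separator):
    """Yield items with separator between each pair, never after the last."""
    iterator = iter(items)
    try:
        first = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield, not even a separator
    yield first
    for item in iterator:
        yield separator
        yield item


tags = ["python", "turbogears", "genshi"]
print("".join(intersperse(tags, ", ")))  # python, turbogears, genshi
```

A stream filter for comma-separated lists would presumably do the same thing, only with chunks of markup instead of strings.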
I've become a total Routes convert, especially as I have been contemplating the plugin architecture I want to add. Currently, I have a couple of routes added to provide backwards-compatibility for Vellum URLs, and these could trivially be done by passing the routes mapper to plugins to add their own paths. Which means that adding new admin pages, new user pages, or entire content management systems wouldn't require any changes to the core code.
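The plugin idea can be sketched without any framework at all. Everything below is hypothetical: `ToyMapper` is a stand-in for a real routes mapper (exact-match paths only), and the `add_routes` hook and the two plugin classes are just one way the architecture might look, not what gibe does.

```python
class ToyMapper:
    """Toy route table standing in for a real routes mapper (exact paths only)."""

    def __init__(self):
        self.routes = []

    def connect(self, path, handler):
        self.routes.append((path, handler))

    def dispatch(self, path):
        for route_path, handler in self.routes:
            if route_path == path:
                return handler()
        return "404"


class LegacyUrlPlugin:
    """Adds old-style URLs without the core knowing about them."""

    def add_routes(self, mapper):
        mapper.connect("/old/archive", lambda: "redirect to /archive")


class AdminStatsPlugin:
    """Adds a new admin page."""

    def add_routes(self, mapper):
        mapper.connect("/admin/stats", lambda: "stats page")


def build_mapper(plugins):
    mapper = ToyMapper()
    mapper.connect("/", lambda: "front page")  # core route
    for plugin in plugins:
        # Each plugin gets the mapper and registers its own paths.
        plugin.add_routes(mapper)
    return mapper


mapper = build_mapper([LegacyUrlPlugin(), AdminStatsPlugin()])
print(mapper.dispatch("/admin/stats"))  # stats page
print(mapper.dispatch("/old/archive"))  # redirect to /archive
```

The core only knows about the `add_routes` hook; dropping a plugin from the list removes its pages without touching anything else.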
SQLAlchemy is taking a while to get used to. I like the declarative ActiveMapper style, but it too silently swallowed some errors, causing relationships between tables/objects to be lost. But, I'm warming to it.
TurboGears, despite all these replacements, continues to function and be useful - the automatic application of templates, the automatic validation of forms, and automatic error handling are a potent combination. That it doesn't tie you into a particular templating engine or modeling system is comforting, but the opinionated defaults are welcome too. And the TurboGears widget system continues to impress me.
Still much for me to do - automatic excerpt generation, theme support, plugin architecture, anti-spam support, and so forth. And tagging, so that I don't have to edit the database to show entries to those subscribed to particular topic feeds. But it's probably the most enjoyable programming I've done in years - simple specification, tools of my own choosing, and no deadlines make a great change...