Monday, February 28, 2005

Great XSLT article and update on stuff...

First off, if you ever have to write XSLT, or more importantly teach someone how to write good XSLT, then you should bookmark this article. I especially liked that part of his solution to the XSLT problem was to restructure the XML he was transforming. While not always an option, it's something I've done many times and it really can help.

You have probably noticed that I've not been posting much of late. Lots going on with the new boss and ongoing recruitment. Over the last few days at work I've had much fun writing some useful code. Of course, as one of my colleagues pointed out, all of our code is useful, but my point is that over the last three months I've mostly been writing one-off Java and XSLT transformation code, probably a couple of thousand lines of XSLT alone.

The last few days I have been rewriting our database creation and default seeding code. Previously we've been using SQL Server-specific scripts, but it was finally time to bite the bullet and make all of this database independent. Not as big a job as you might think. We have the DataStoreInterface API, which should abstract the SQL into objects. So all I had to do was come up with some XML grammar to represent the table definitions and then to describe the default data to go in them. The only slightly hard part is managing and resolving all the cross-references because of foreign keys. Well, it's been a nice diversion from the other work.
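
For anyone wondering what "resolving the cross-references" boils down to, here is a minimal sketch of one common way to do it: sort the tables so that each one is created after the tables its foreign keys point at. This is not our actual code and has nothing to do with the DataStoreInterface API itself; the TableOrder class, the creationOrder method and the table names are all made up for illustration, and it assumes the table definitions have already been read from the XML into a simple map of table name to referenced tables.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
class TableOrder {

    /**
     * Orders tables so that every table comes after the tables its foreign
     * keys reference. The map goes from table name to referenced table names.
     */
    static List<String> creationOrder(Map<String, List<String>> references) {
        // pending: how many referenced tables still have to be created first.
        Map<String, Integer> pending = new LinkedHashMap<>();
        // dependents: which tables are waiting on a given table.
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (String table : references.keySet()) {
            pending.put(table, 0);
            dependents.put(table, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : references.entrySet()) {
            String table = entry.getKey();
            for (String target : entry.getValue()) {
                if (!references.containsKey(target)) {
                    throw new IllegalArgumentException("Unknown table: " + target);
                }
                if (target.equals(table)) {
                    continue; // a self-reference does not block creation
                }
                pending.merge(table, 1, Integer::sum);
                dependents.get(target).add(table);
            }
        }

        // Start with the tables that reference nothing else.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String table = ready.remove();
            order.add(table);
            // Creating this table may unblock the tables that reference it.
            for (String dependent : dependents.get(table)) {
                if (pending.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != references.size()) {
            throw new IllegalStateException("Circular foreign key references");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new LinkedHashMap<>();
        refs.put("orders", Arrays.asList("customers", "products"));
        refs.put("customers", Collections.<String>emptyList());
        refs.put("products", Collections.<String>emptyList());
        System.out.println(creationOrder(refs));
    }
}
```

The same ordering works for seeding the default data, since a row can only be inserted once the rows it points at exist. Genuinely circular references need a different approach (for example adding the constraint after both tables exist), which this sketch simply rejects.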

Wednesday, February 16, 2005

Just how sacred is your data?...

The project I am working on at the moment involves putting a lot of information that our clients publish into our system for them to apply metadata to. All of this published information is structured hierarchically, although it isn't obvious from the printed versions what these hierarchies are. So we have been spending the last few weeks moving this information into an XML format that we can load into the CMS.

Some of this work has been done before, so that information was easily exported from another copy of our CMS; however, that is only a small part. What we have been finding is that the only machine-readable versions of much of this data are held in Quark files, and they can only offer us Word document exports from these. There is no semantic markup, and the WordML structure involves lots of tables which are used for layout purposes. In the end most of this information has had to be cut and pasted by hand.

I have to admit that I am a little shocked by all of this. The information in question is the lifeblood of this organisation; in fact it is pretty much the only reason that the department we are dealing with exists. Having this data locked into a format that is strictly for layout and publishing purposes seems absolutely crazy. From what I can see, all editing is done directly in this format. I think the reason this has happened is that our clients have always seen the end product, i.e. the printed versions, as sacred, without ever thinking that somewhere in there is pure data, which is actually what they should be concerned about.

The true extent of this problem came to light on a related project with the same client and data set. The printed versions have marginal notes which provide cross-references between parts of the information and we needed to know if there were any reciprocal links in there. Our client couldn't tell us this without checking through all the relevant parts of a printed copy!
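
To show why this matters, here is a rough sketch of how small the reciprocal link question becomes once the cross-references exist as data rather than as marginal notes on a printed page. Everything in it is hypothetical: the ReciprocalLinks class, the section identifiers and the idea of holding each link as a source/target pair are my own assumptions for illustration, not the client's data or our real code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only.
class ReciprocalLinks {

    /**
     * Each link is a two-element array: {source id, target id}.
     * Returns every pair of ids that link to each other, reported once.
     */
    static Set<String> findReciprocal(List<String[]> links) {
        // Store each link as a (source, target) list so it can be looked up.
        Set<List<String>> seen = new HashSet<>();
        for (String[] link : links) {
            seen.add(Arrays.asList(link[0], link[1]));
        }

        Set<String> reciprocal = new TreeSet<>();
        for (List<String> link : seen) {
            String from = link.get(0);
            String to = link.get(1);
            // Only report from the smaller id so each pair appears once.
            if (from.compareTo(to) < 0 && seen.contains(Arrays.asList(to, from))) {
                reciprocal.add(from + " <-> " + to);
            }
        }
        return reciprocal;
    }

    public static void main(String[] args) {
        List<String[]> links = Arrays.asList(
                new String[] {"section-1.2", "section-3.4"},
                new String[] {"section-3.4", "section-1.2"},
                new String[] {"section-1.2", "section-5.6"});
        System.out.println(findReciprocal(links));
    }
}
```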

It is easy for us developers to forget that a client's perspective on something can be very different to our own. We had assumed that there would be a way to get some of this information in an electronic format that we could at least begin to use and transform. I think this has been something of a learning curve for our client, and we are helping them to understand the implications. Of course the great outcome of this project is that they will have these pure data versions in XML. They are now seeing all the possible benefits of this change in perspective, from thinking that the Quark files are their only precious commodity to thinking of the underlying data as more valuable.


We have a position open for a Java/XML/XSLT developer at our offices in London (£26-£30K + benefits). If you are interested there is more information available on our website.

Thursday, February 10, 2005

Java/XML/XSLT developer position, London...

Sorry that I haven't posted in a little while, but there has been a lot going on. As you will no doubt have guessed from the title, we have a position open at Simulacra for a new developer. All the details are on our website. You will need to fill in an application form, which can be downloaded from that site, and send it in to us with your CV. The closing date is the 23rd Feb.

There are many things that you cannot truly convey with a job advert, and still more that cannot even be posted on the website. This job will not be a free ride for anyone, but I can say that if you get it you will learn stuff, constantly. You will get to work on interesting projects, alongside some very talented people. If you are someone that is willing to take responsibility for your work, and understands how that makes the job better for yourself, then this is definitely for you.

We are a small company and there are pros and cons to that depending on the type of person you are. I think the benefits far outweigh the drawbacks. The social side of the company alone would make it worth it.

It is true that I often moan on this blog, but that is mostly because I am writing late at night and need to vent. You don't get the good side as much, partly because the good side is often covered by NDAs and things, but to give you an idea of some of this...

1) Check out our previous work. Design Council, National Theatre Stagework, QCA National Curriculum Online.

2) Other past and present clients, the work for whom is not as publicly visible, include The British Museum, BBC, Department for Education and Skills, Jane's Information Group, Channel 4, Pearson, Countryside Agency, Museums, Libraries and Archives Council.

3) Check out our open source technology Open Harmonise. See the system you will help continue developing.

4) The quality of our partners, including Convera and Illumina.

I look forward to seeing your CVs and application forms arriving soon.

Wednesday, February 02, 2005

Hands up if you use one?...

It would appear that the concentrated effort we put into sales has paid off and we now have lots of wonderful work, both for old and new clients. This is, of course, wonderful news but also a case of "be careful what you wish for!". I think that it will be a couple more weeks yet before things settle back down into a routine; it probably won't be until then that the new projects have been scoped enough to have a complete plan. I like having a plan (no laughing from the people in my office); it makes me much more relaxed.

Got the team out of the office for a little while last week to a J2SE 5 presentation at the Sun offices here in London. There wasn't anything in the talk that was a surprise, but I haven't had much of a chance to look in detail at the new language features, so this was a welcome walkthrough.

The highlight of the whole thing though must have been during the introduction where the "Technical Evangelist" from Sun told us exactly what topics he would be covering. One of these topics was Netbeans and, being the interactive kind of a guy that he was, he asked for a quick show of hands from people who used Netbeans.

Thank God Sun's cleaners are good, otherwise there would have been tumbleweed rolling through that room. Not a single hand went up. His response: "Well that'll be a tough sell then..."