Tuesday, October 19, 2004

Open-Harmonise details: Publishing engine

In my previous post (More details on our open source initiative...) I promised to write a series of posts providing more details about Open-Harmonise. I decided to start this series with a description of the publishing engine.

This article covers only some very simple examples. What it does not show is the true power of Open-Harmonise, which is building pages through complex metadata matching. I'll show you this another time, once we have covered the basics.

End-To-End XML

One of the main features of Open-Harmonise is its end-to-end XML publishing. Many publishing frameworks and engines talk about this, and some actually do it. Our approach is perhaps a little different.

The core of the system pulls in XML from several different sources, then publishes the data pointed to by this XML into a final XML output document. After this, it is down to the developer to assign an XSLT/FO translation to this XML output to get the final published output (HTML, WML, PDF etc).

There are three main XML inputs to the Open-Harmonise publishing framework:

1) State XML
2) Page template XML
3) Object template XML

These contain the rules that describe the contents of the output XML. The rules either point to specific data to be published or describe searches to be run over the repository, the results of which will be published into the output.

The data fed into the output XML, based on these rules, can come from any Publishable object: either one of the built-in objects or other objects that developers add to their Open-Harmonise server.

This article is going to focus on these XML rules files and the XML output, leaving details of the data sources till later.

State XML

One of the most powerful features of Open-Harmonise is that it is abstracted away from the name/value pairs of HTTP requests. Instead, the publishing engine expects an XML request to operate on, for example:




<state>
  <page id="1001"/>
  <referer>
    <page id="1000"/>
  </referer>
  <session id="D_TagLSNNxiLWK1U5BasjO"/>
</state>


The above is a very simple example of an XML state in Open-Harmonise. The referer information is generated automatically if it is available; the rest must be passed in with the request. Obviously, while XML can be posted in an HTTP request, this is not something that can be done with a simple HTML link.

As I said before, the Open-Harmonise publishing engine is abstracted from the name/value pairs of HTTP requests; in fact, it is abstracted from any type of request. It can just as easily be placed behind a Web Service. All of this is dealt with by protocol handlers. Included with Open-Harmonise are handlers for dealing with:

1) HTTP name/value pair
2) HTTP post

The second of these simply grabs an XML document from the HTTP post request, so the above State could be sent as a complete XML document. This is typically how we handle requests from Flash applications.
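A minimal sketch of what a client of the HTTP post handler might send, written in Python. The servlet URL is borrowed from the link example later in this article, and the Content-Type header is my assumption; neither is confirmed as what the handler actually requires.

```python
import urllib.request

# The State from the example above, sent as a complete XML document.
STATE_XML = """<state>
  <page id="1001"/>
  <session id="D_TagLSNNxiLWK1U5BasjO"/>
</state>"""


def build_state_post(url: str, state_xml: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the State as its body."""
    return urllib.request.Request(
        url,
        data=state_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},  # assumed content type
        method="POST",
    )


request = build_state_post("http://webdav/servlet/XRM", STATE_XML)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually send it; left out so the sketch runs offline.
```

The referer element is left out of the body because, as described above, it is generated by the server when available.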

The first of these is how HTML links are handled. It requires that the State XML is encoded into name/value pairs before the request is made, i.e. in the link on a web page. This handler then decodes these back into XML.

The encoding scheme is very similar to XPath, so it will be familiar to anyone who has worked on XSL coding. The State example given before would look like this:

http://webdav/servlet/XRM?page/@id=1001&session/@id=D_TagLSNNxiLWK1U5BasjO

Although the referer element in the example XML will have been generated automatically by Open-Harmonise, you can probably see how you would create such an element as a name/value pair.

referer/page/@id="1000"
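To make the decoding step concrete, here is a rough Python sketch of how name/value pairs in this style could be turned back into State XML. It is not the Open-Harmonise handler itself: the function name `decode_state` is hypothetical, and treating a final path step without "@" as element text is my assumption.

```python
from urllib.parse import parse_qsl
import xml.etree.ElementTree as ET


def decode_state(query: str) -> ET.Element:
    """Rebuild a <state> element from XPath-like name/value pairs."""
    state = ET.Element("state")
    for name, value in parse_qsl(query):
        *element_names, last = name.split("/")
        node = state
        for element_name in element_names:
            # Reuse an existing child so several pairs can share one element.
            child = node.find(element_name)
            if child is None:
                child = ET.SubElement(node, element_name)
            node = child
        if last.startswith("@"):
            # "@id" style steps become attributes on the current element.
            node.set(last[1:], value)
        else:
            # Assumption: a final step without "@" carries element text.
            leaf = node.find(last)
            if leaf is None:
                leaf = ET.SubElement(node, last)
            leaf.text = value
    return state


query = "page/@id=1001&session/@id=D_TagLSNNxiLWK1U5BasjO&referer/page/@id=1000"
state = decode_state(query)
print(ET.tostring(state, encoding="unicode"))
```

This simple version cannot represent two sibling elements with the same name, since the second pair would land on the first element.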

When creating the XSLT for translating output XML into HTML there are several utility xsl:templates to assist in creating such complex links.
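The link-building direction can be sketched outside XSLT as well. The Python below flattens a State document into the name/value form used in the link above; `encode_state` is a hypothetical helper for illustration, not one of the utility templates shipped with Open-Harmonise.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET


def encode_state(state: ET.Element) -> str:
    """Flatten every attribute under <state> into path/@name=value pairs."""
    pairs = []

    def walk(element: ET.Element, path: str) -> None:
        for attr_name, attr_value in element.attrib.items():
            # Percent-encode the value so it is safe inside a query string.
            pairs.append(f"{path}/@{attr_name}={quote(attr_value, safe='')}")
        for child in element:
            walk(child, f"{path}/{child.tag}")

    # Paths are relative to <state>, so start from its children.
    for child in state:
        walk(child, child.tag)
    return "&".join(pairs)


state = ET.fromstring(
    '<state><page id="1001"/><session id="D_TagLSNNxiLWK1U5BasjO"/></state>'
)
link = "http://webdav/servlet/XRM?" + encode_state(state)
print(link)
```

As with any path-only scheme, repeated sibling elements with the same name would produce identical paths, so this sketch assumes each path occurs once.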

Page Template XML

In the example State XML from before there are references to Pages. These are definitions of logical pages within Open-Harmonise. A logical page is made up of an XML file for the Page Template (the rules for the page contents that were mentioned before) and an XSLT file for translating the output XML into the final desired format. In this section we will look at the Page Template XML and its role in the publishing engine.


<HarmonisePage>
  <PageTitle>Test page</PageTitle>
  <Navigation name="mainnav">
    <Template id="10">
      <Section/>
    </Template>
  </Navigation>
</HarmonisePage>



This example of a Page Template XML file includes a static title for the page and a navigation group called "mainnav". Inside the navigation group a Section object will be published. Sections are the grouping objects for Documents and Assets; they normally form part of the administrable structure of a website built on Open-Harmonise.

The Section element in this example has no identifier (id attribute). It could have one, or it could instead have a Path element under it as an identifier. Without any identifier, the publishing framework first checks the State XML to see whether it contains a Section element that does have an identifier. If there is one, this will be the Section that is published. If there is not, a new Section will be created and published.
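The lookup order just described can be sketched in a few lines of Python. This is only an illustration of the precedence, not the engine's code: `resolve_section`, the source labels it returns, and the sample identifiers are all hypothetical, and for brevity it only recognises a Section in the State by its id attribute.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def resolve_section(
    template_section: ET.Element, state: ET.Element
) -> Tuple[str, Optional[str]]:
    """Return (source, identifier) describing which Section would be published."""
    # 1. An id attribute on the template element identifies the Section directly.
    section_id = template_section.get("id")
    if section_id is not None:
        return ("template-id", section_id)
    # 2. Otherwise a Path child element can act as the identifier.
    path = template_section.findtext("Path")
    if path:
        return ("template-path", path)
    # 3. With no identifier, look for an identified Section in the State XML.
    for candidate in state.iter("Section"):
        if candidate.get("id") is not None:
            return ("state", candidate.get("id"))
    # 4. Nothing found: a new, empty Section is created and published.
    return ("new", None)


bare = ET.fromstring("<Section/>")
state_with = ET.fromstring('<state><page id="1001"/><Section id="42"/></state>')
state_without = ET.fromstring('<state><page id="1001"/></state>')

print(resolve_section(bare, state_with))
print(resolve_section(bare, state_without))
```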

This last case may at first seem like a last-ditch attempt to publish something instead of throwing an error, but it is not. It is the basis for publishing web forms. For example, if you wanted someone to be able to register themselves on your site, you would publish a new User object. When this is submitted to another page, encoded as XML, it will appear in the State XML for that page and can then be saved as a new User.

These are just a couple of examples of what you can do with the Harmonise Page Template XML; the schema has many more elements in it. Here is a list of some of the more useful ones:

* Search - publishing a search form
* List - publishing a list, which could be the results of a Search or a match using information from another object from the State or the currently logged in User
* include - XInclude for building pages from fragments

Most elements in the Page Template XML correspond to Objects within an Open-Harmonise server, so you can easily add to the power of Open-Harmonise by developing your own Publishable Objects.

Object Template XML

In the Page Template XML example shown in the last section there was a Template element surrounding the Section element that we wanted to publish. This points to an Object Template XML file, which tells the Object which elements of itself to publish. While we could put these XML instructions directly into the Page Template XML, these tend to be very small reusable parts, which is why we split them out into Object Templates.


<Template>
  <Section>
    <Name/>
    <Summary/>
  </Section>
</Template>

The above example shows how we can tell the Section object to publish its Name and Summary information. These elements are generic to almost all Open-Harmonise objects, along with other elements such as Profile, which tells an object to publish its metadata information.
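As a rough illustration of the idea, the Python below treats an Object Template as a filter over an object's fields. The `publish` function, the dictionary standing in for a Section, and the shape of the output are all assumptions made for the sketch; the real output XML produced by Open-Harmonise may look different.

```python
import xml.etree.ElementTree as ET


def publish(template: ET.Element, obj: dict) -> ET.Element:
    """Copy into the output only the fields the Object Template names."""
    object_rule = template[0]  # the <Section> element inside <Template>
    output = ET.Element(object_rule.tag)
    for field in object_rule:
        value = obj.get(field.tag)
        if value is not None:
            ET.SubElement(output, field.tag).text = value
    return output


template = ET.fromstring(
    "<Template><Section><Name/><Summary/></Section></Template>"
)
# Hypothetical Section data; "Owner" is not named in the template, so it is skipped.
section = {"Name": "News", "Summary": "Latest announcements", "Owner": "editor"}
output = publish(template, section)
print(ET.tostring(output, encoding="unicode"))
```

Swapping in a different template, say one that names only Name, changes what is published without touching the object itself, which is the reuse the Object Templates are meant to give.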

Objects within Open-Harmonise can also have elements specific to themselves, for example Document (which contains an XML document) has a Contents element.

Because Templates are so small and reusable, you can quickly build up a library of often-used ones. Doing this enables you to build sites more and more quickly.

Conclusion

In this article I have given you a basic introduction to the Open-Harmonise publishing engine. As you can see, it is designed to be very simple to build with. However, by using these simple techniques and XSLT/FO to translate the output XML, you can create complex information solutions, as we have for http://www.nc.uk.net and http://www.designcouncil.org.uk.