Friday, June 03, 2005

The Business Nervous System

Middleware - The business nervous system

What is it?

Well, the term 'middleware' has different meanings depending on who you talk to within the IT industry, and can mean anything from a DCOM/CORBA RPC interface to a transaction processing monitor, but there are certain key features that almost everyone agrees upon.

Middleware is software which sits in the middle between two or more applications and allows them to talk to one another. I'm going to concentrate on a type of middleware called 'message oriented middleware'.

Never heard of it. How important can middleware be?

Well, middleware is the difference between automating business operations and having to do it all manually. For the individual, it isn't really important but for a business, the cost of hiring people just to perform repetitive data entry tasks for information that they already have can be crippling. Middleware automates business application communication and is arguably the single most important 'application' which a business can make use of.

The interesting thing is that of all the companies and IT departments which I have worked for, only one has implemented a middleware system properly and that company literally spent six figure sums on some extremely expensive software.

Doing it wrong

Perhaps you're sceptical. I'll give you a simple example of how things are usually done.

A company making and selling 'Widgets' has a financial accounts package which handles their invoicing, taxes etc. They also have an order placement package which allows them to take orders over the phone and a dispatch application which keeps a record of which orders have been dispatched and which have not.

Orders are taken and stored in a database.

  • Every evening the orders in the database are exported to a file and copied across the network to the system the accounting package is running on and loaded into the accounting system.
  • A second job copies a slightly different set of data from the orders database to the system the dispatch application runs on.

The batch job which unloads the data from the order placement application and FTPs it across the network to the accounts system was written in house by someone in the IT department. It doesn't do anything else. A separate batch job unloads slightly different data and FTPs it across to the dispatch system.

The situation is now approximately as follows:

That's fine, you say: fairly simple, and it gets the job done. Well, I'm just getting started. The dispatch department in Widgets Inc keeps a stock of Widgets so that they can handle most orders immediately. John in the storeroom knows how many widgets they have and phones Fred in Manufacturing when they need more. When there is a big order for widgets, the first thing Manufacturing know about it is when John phones up and requests 200,000 widgets by this afternoon.

The manufacturing department of course has its own stock of parts to make widgets, which it records in a database, and a manufacture request system which tells the production line to make N widgets. Manufacturing receives invoices from their parts suppliers which are authorised and passed on to the finance department.

Fred in Manufacturing manually enters requests for widgets into the manufacturing request system, and Joe keeps track of parts and orders more parts from suppliers using the parts database and the supplier database.

The Widgets Inc service department also have their own customer database, parts database, supplier database and a database of broken widgets recording what's wrong with them. The broken widget database sends data to the accounts system using FTP when a widget is outside warranty. When the widget is repaired, it is sent to the dispatch department and a separate job, written by another nice person in IT, immediately runs a remote shell command on the dispatch system to enter the customer details automatically. Unfortunately the customer information in the service department is manually entered and is slightly different from the information in the dispatch system, so it has to be checked manually.

You see the complexity starting to build up now? It gets worse. Some bright spark in Marketing decides that Widgets Inc needs to have an e-commerce web site where customers can enter orders over the web. The only problem is that the order placement system was only ever designed to accept orders from a terminal and cannot take submissions from elsewhere. Therefore the web interface needs to talk to the financial and dispatch systems directly and, because the web page tells the customer when the order has been processed, the dispatch system talks to the web system as well.

I think you might begin to see the pain and expense involved in keeping the systems running, never mind getting useful and sane information out of them. Almost all of the companies I have worked for have an IT infrastructure which looks approximately like the above diagram, and most of them are far worse.

Doing it right

The problem with doing things right is that it tends to look expensive. In fact commercial middleware systems are extremely expensive pieces of software. You can expect to pay several tens of thousands of pounds for a single middleware server, hundreds of thousands of pounds for a network of them and several thousand pounds for each individual client connection. So you see why things are still done badly.

Middleware architecture

The basic architecture of a middleware system turns a many-to-many configuration into multiple many-to-one configurations. All applications talk to the middleware, not to each other. The middleware software itself is often called a message broker as it allows the transfer of 'messages' from one application to another.

Notice from the diagram that a single transaction is required from the order placement system and it's the message broker which handles the transfer of the information to both the finance system and the dispatch systems.
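
To make the many-to-one idea concrete, here is a minimal Python sketch. The Broker class, its method names and the topic and application names are all made up for illustration and do not come from any real product. It shows two things: how the number of links grows in each configuration (counting each pair of applications once), and how a single publish from the order system reaches both finance and dispatch.

```python
from collections import defaultdict


def point_to_point_links(n_apps: int) -> int:
    """Worst case: every application has its own link to every other one."""
    return n_apps * (n_apps - 1) // 2


def broker_links(n_apps: int) -> int:
    """With a broker, each application needs exactly one link (to the broker)."""
    return n_apps


class Broker:
    """Toy message broker: senders publish once, the broker fans out."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> application names
        self.inboxes = defaultdict(list)      # application name -> messages

    def subscribe(self, topic, app):
        self.subscribers[topic].append(app)

    def publish(self, topic, message):
        # One send from the publisher; delivery to each subscriber
        # is the broker's job.
        for app in self.subscribers[topic]:
            self.inboxes[app].append(message)


broker = Broker()
broker.subscribe("orders", "finance")
broker.subscribe("orders", "dispatch")
broker.publish("orders", {"order_id": 1, "widgets": 200})
```

With six applications the worst case is 15 point-to-point links against 6 broker links, and the gap widens quickly as applications are added.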

The Widgets Inc IT department installs the Whizzy Widget Broker software on a new server and installs Whizzy clients on each of their existing systems. The client software is set to pick up messages every minute and give them to the broker for delivery to the destination application. As well as sending data, the client picks up and loads information from the application's queue every minute.

Now, within one minute of an order being placed, the information has been sent to the message broker. A further minute later, the dispatch and accounting systems have been updated. Another minute and the dispatch system generates a manufacturing request and sends it to the manufacturing request system. Two minutes later, the suppliers receive orders for the replacement parts required to keep the stock levels constant.

The web site, when added, places orders in exactly the same way and retrieves the order status from the broker. Now, most of the commercial systems are much faster than one minute for message pickup and delivery, typically a second or two on a Local Area Network, but do most businesses really need that kind of performance, or the cost that is associated with it?

It's HOW much?

I sympathise and if you need the benefits of a middleware system without the huge cost then maybe it's worth rolling your own and seeing if something can be put together out of off the shelf components. There are a number of features that the middleware systems generally exhibit.

These features would need to be included in the home grown system:

  1. They are 'event driven'. This means that when an 'event' arrives, a message can be fired off to the relevant application. An event can be something like a database update (an order placement, for example) or a file being dropped into a directory. The middleware software generally doesn't care what the event is.

    It's perhaps worth noting that this can also be represented by the OSI 7 layer stack. I'm not going to bother here though.
  2. They are 'Message Oriented' (MOM). Typically an 'event' as defined above will carry some useful information that needs to be shared with different applications. This information is packaged up in some way and sent off to the applications which need it.
  3. They tend to be asynchronous. This means that the sending application doesn't have to wait for all of the other applications to finish with its message before it can move on to the next message. Typically commercial broker systems are very fast, being able to deliver a message in a fraction of a second and capable of handling hundreds or thousands of messages per second.
  4. They have message queues. Due to the asynchronous nature of the system, the broker manages queues of messages destined for the various applications. This means that work can still continue even if an application is unavailable, with jobs being queued up until the application is available.
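
The four features above can be sketched in a few lines of Python. This is only a toy under my own assumptions: the QueueingBroker class, the on_order_placed handler and the application names are invented for the example. An event is packaged as a message, the sender returns as soon as the message is queued, and a destination which was unavailable catches up later from its own queue.

```python
from collections import deque


class QueueingBroker:
    """Toy broker that holds one FIFO queue per destination application."""

    def __init__(self):
        self.queues = {}

    def send(self, destination, message):
        # Asynchronous: the sender returns as soon as the message is queued,
        # whether or not the destination is currently running.
        self.queues.setdefault(destination, deque()).append(message)

    def receive_all(self, destination):
        # Called by the destination whenever it is available again.
        queue = self.queues.setdefault(destination, deque())
        messages = list(queue)
        queue.clear()
        return messages


def on_order_placed(broker, order):
    """Event handler: an 'order placed' event is packaged as a message."""
    message = {"event": "order_placed", "payload": order}
    for destination in ("accounts", "dispatch"):
        broker.send(destination, message)


broker = QueueingBroker()
on_order_placed(broker, {"order_id": 1})
on_order_placed(broker, {"order_id": 2})  # dispatch is "down"; work continues
pending = broker.receive_all("dispatch")  # dispatch comes back and catches up
```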

Rolling your own - Middleware on the cheap

There are a few systems which exhibit similar (though not identical) characteristics to the commercial message brokers:

  • Internet Relay Chat
  • Usenet News
  • SMTP email

The above systems are very common existing systems which might fit the bill. I'm going to use the Usenet News server 'INN' for a number of reasons:

  1. The newsgroups structure which Usenet news servers use can also be used for message queues.
  2. A News server automatically keeps a history of all the messages posted and automatically expires messages after a defined period.
  3. News servers are also designed to replicate articles between servers.
  4. The hierarchical nature of newsgroups makes it very easy to add groups which can be used for logging, acknowledgements etc.
  5. Groups can be protected with username and password requirements.
  6. There are lots of command line and GUI clients available, as well as free programming interfaces for C, Perl, Python, TCL, Visual Basic, Pascal/Delphi, Java and other languages for virtually every platform and operating system.
  7. Existing GUI clients can be used to monitor the queues.
  8. Messages can be automatically 'cross posted' to multiple queues simultaneously while only occupying space required for one queue.
  9. The News system has proven itself to be an extremely scalable system with powerful management features for creating and propagating new groups automatically which would make the management of a network of servers much easier.
  10. Groups can be 'moderated' by humans. Articles posted to the groups are mailed to the group moderator who can approve or deny the post.
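
The cross-posting point (item 8) is worth a small illustration. The sketch below is a toy model in Python, not a description of how INN actually stores articles: the ArticleStore class and the group names are my own inventions. The idea is simply that the article body is kept once and each group holds only a reference to it.

```python
class ArticleStore:
    """Toy model of cross-posting: one stored copy, many group indexes."""

    def __init__(self):
        self.articles = {}   # message id -> body (stored once)
        self.groups = {}     # group name -> list of message ids
        self.next_id = 1

    def post(self, body, groups):
        message_id = self.next_id
        self.next_id += 1
        self.articles[message_id] = body
        for group in groups:
            # Each group only records a reference to the stored article.
            self.groups.setdefault(group, []).append(message_id)
        return message_id

    def read(self, group):
        return [self.articles[i] for i in self.groups.get(group, [])]


store = ArticleStore()
store.post(
    "order 1: 200 widgets",
    ["widgets.london.accounts.messages", "widgets.london.dispatch.messages"],
)
```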


Collecting the bits

On to getting the bits together that'll be needed to make this work. I'm going to start by using shell scripts and command line utilities. It's much easier to prototype an idea this way. You don't have to wade into mountains of C or Perl code to get things working.

The INN news server software

  1. A command line news client which can read and post news articles. The 'suck' news client seems to be a utility which can do this.
  2. A utility to encode/decode binary file formats to plain text. This is needed because News servers transfer data as plain text but the messages themselves may in fact have binary content. There is a utility called 'mimencode' which can do this.
  3. A utility which can compress the messages, as some might be quite large. GNU zip will do this.
  4. A utility which can encrypt and digitally sign each message before it is sent. It's very important that the receiving application can be sure that a message really was sent by the apparent sender, that the message was transferred without being corrupted, has not been tampered with in any way and cannot be viewed by an unauthorised person. The GNU Privacy Guard software (GPG) shall provide that functionality.
  5. Some glue to stick it all together. The GNU 'Bourne Again Shell' (bash) is what I'll be using to glue all these components together.
  6. Finally, a scheduler is needed: a piece of software which can run other software at specific times and dates, and at specific intervals. Once per minute is needed, and the standard Unix 'cron' can perform this task.

Architecture nitty gritty

Well, we have the software. Now it must be configured. The basic architecture of the system is defined by the use of a news server and the fact that it's being used as middleware, but there are specific configurations that need to be made.

Newsgroup (queue) configuration

Each application needs one queue for incoming messages. Thinking big, I'm going to structure the newsgroups as follows:

company.sitename.applicationname.queue

This way, if Widgets Inc expands to multiple locations, it's easy to include the extra sites. The company name is added so that Widgets Inc can communicate with suppliers' and customers' applications directly. The Usenet news system can very easily distribute messages to other servers. This makes it easy to add more brokers without having to worry about distributing the messages.

Notice the complexity building up again, this time with inter-business data connections between suppliers and customers. This suggests that there is a case for a centralised brokering organisation which provides authentication and message transfer between companies.

The newsgroups are configured to expire messages after one month. I'm going to assume that any problems with a particular message will have been dealt with by then. In addition to the message queue itself, each application will have a logging queue where information about message transfers can be posted by the applications, an error queue where errors are posted and an acknowledgement queue where acknowledgement of receipt and successful processing can be posted for the sending application.

company.sitename.application.messages
company.sitename.application.log
company.sitename.application.errors
company.sitename.application.acknowledgement

All of the queues need passwords to access the information in them.
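
Since every application gets the same four groups, the names are easy to generate rather than type by hand. Here is a small Python sketch. The function names and the 'widgets', 'london' and 'dispatch' values are my own assumptions for the example; only the naming pattern itself comes from the scheme above.

```python
QUEUE_KINDS = ("messages", "log", "errors", "acknowledgement")


def queue_names(company, site, application):
    """Return the four newsgroup names for one application."""
    parts = (company, site, application)
    for part in parts:
        # A dot or whitespace inside a component would change the hierarchy.
        if not part or "." in part or any(c.isspace() for c in part):
            raise ValueError(f"invalid name component: {part!r}")
    prefix = ".".join(p.lower() for p in parts)
    return {kind: f"{prefix}.{kind}" for kind in QUEUE_KINDS}


def parse_queue_name(name):
    """Split a newsgroup name back into its four components."""
    company, site, application, kind = name.split(".")
    if kind not in QUEUE_KINDS:
        raise ValueError(f"unknown queue kind: {kind!r}")
    return company, site, application, kind


names = queue_names("widgets", "london", "dispatch")
```

Rejecting dots inside a component matters because the dot is what separates levels of the newsgroup hierarchy.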

Agent setup

The agent is the part of the middleware which interfaces with both the application and the broker. The part which interfaces with the news server is easy to put together and will be fairly standard. The bit which interfaces with the application, however, will have to be customised for each and every application, but that's true of commercial middleware as well. The amount of customisation required to talk to the application could be anything from writing some C code against the application's API to a shell or Perl script to load and unload data.

I'm going to assume that the data is stored in a SQL database of some sort and that data can easily be loaded and unloaded using 'dbload' and 'dbunload' utilities. The pseudocode for each run of the agent is approximately as follows.

Receiving messages

suck new messages from newsgroup
check digital signatures
mimedecode messages
decrypt messages
uncompress messages
dbload messagefile

Publishing messages

dbunload outgoing data to file
compress file
encrypt file
mimencode file
digitally sign file
post file to destination newsgroup

The actual shell script would need to check and handle the error conditions from each of the utilities before allowing the data on to the next utility for further processing. The news server handles message transport from here.
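
To show the ordering of the stages and the 'stop on first error' rule without needing any of the real utilities installed, here is a rough Python stand-in. Be aware of the assumptions: an HMAC with a shared key takes the place of the GPG digital signature, base64 takes the place of mimencode, and the encryption step is left out entirely because the Python standard library has no equivalent. The key, the function names and the sample payload are all made up.

```python
import base64
import gzip
import hashlib
import hmac

# Hypothetical shared key; the real agent would use GPG key pairs instead.
SHARED_KEY = b"demo-key-not-for-production"


def publish(payload: bytes) -> str:
    """compress -> (encrypt omitted) -> encode as text -> sign."""
    compressed = gzip.compress(payload)
    encoded = base64.b64encode(compressed).decode("ascii")
    signature = hmac.new(
        SHARED_KEY, encoded.encode("ascii"), hashlib.sha256
    ).hexdigest()
    return f"{signature}\n{encoded}"


def receive(article: str) -> bytes:
    """check signature -> decode -> (decrypt omitted) -> uncompress."""
    signature, _, encoded = article.partition("\n")
    expected = hmac.new(
        SHARED_KEY, encoded.encode("ascii"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        # Stop here: nothing unverified is passed on to the next stage.
        raise ValueError("signature check failed")
    return gzip.decompress(base64.b64decode(encoded))


article = publish(b"order_id,widgets\n1,200\n")
original = receive(article)
```

The receiving side undoes the publishing steps in reverse order, and a message which fails the signature check never reaches the decode or load stages.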

Message format

One very important area I haven't touched upon is the actual format of the message itself. Any consistent format can fairly easily be used, but all of the applications which talk together must speak the same language. To start with, Widgets Inc have a fairly limited set of applications which need to talk to each other, so a simple delimited format like comma-separated values (CSV) may be enough. As more applications are added, it will become important to use a common file format that defines the data. This is where XML comes in. When Widgets Inc expands the middleware system to include suppliers, it becomes extremely important to use standardised XML document formats like those provided by RosettaNet.
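
The difference between the two styles is easy to see side by side. In this Python sketch the order record and its field names are invented for illustration, and the XML layout is a home-made one, not a RosettaNet document. The CSV form is compact but only makes sense if the reader already knows the column order, while the XML form labels every value.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical order record; the field names are assumptions.
order = {"order_id": "1001", "customer": "ACME Ltd", "quantity": "200"}


def to_csv(record):
    """Compact, but the reader must already know the column order."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(record.values())
    return buffer.getvalue()


def to_xml(record):
    """Self-describing: every value is labelled with its field name."""
    root = ET.Element("order")
    for field, value in record.items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the record without any prior knowledge of field order."""
    return {child.tag: child.text for child in ET.fromstring(text)}


csv_message = to_csv(order)
xml_message = to_xml(order)
```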

Transactions

A transaction in the IT world is an all or nothing situation. Transferring money from one bank account to another is a good example. Either the money is fully transferred or it is not. If this were not the case then what happens if one account is credited and then the system fails? The originating account is not debited and there's extra money floating around.

The reason that I mention this is that the middleware system I have just described does not support transactions. Most commercial message oriented middleware systems do not support end to end transactions either, but it's something to be aware of.
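
For anyone who hasn't seen a transaction in action, here is a small Python sketch using an in-memory SQLite database. The table, the account names and the amounts are made up. The credit is deliberately applied first, so that the failure case matches the bank example: the transfer fails part way through and the credit is rolled back instead of being left behind.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move money between accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?",
                (source,)).fetchone()
            if balance < amount:
                # Simulated failure after the credit has been applied.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source))
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # credit is rolled back
```

The news-server middleware has nothing like the rollback here across applications, which is exactly the limitation described above.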

Glossary

DCOM: Distributed Component Object Model
GNU: GNU's Not Unix
MOM: Message Oriented Middleware
RPC: Remote Procedure Call
SQL: Structured Query Language
CORBA: Common Object Request Broker Architecture
FTP: File Transfer Protocol
INN: InterNetNews
SMTP: Simple Mail Transfer Protocol
XML: Extensible Markup Language

References

Excellent middleware tutorial: http://www.sei.cmu.edu/str/descriptions/momt.html
RosettaNet: http://www.rosettanet.org/
General info: http://www.middleware.org/

INN: http://www.isc.org/products/INN/
GNU Privacy Guard: http://www.gnupg.org/
Suck news client: http://home.att.net/~bobyetman/
Mimencode (part of Metamail): http://bmrc.berkeley.edu/~trey/emacs/metamail.html
GNU zip: http://www.gzip.org/

Some free (java based) MOM broker systems:
Bond: http://bond.cs.purdue.edu/
XMLBlaster: http://www.xmlBlaster.org/
Nirvana: http://www.pcbsys.com/
