
Software and Systems Architecture

About this Blog...

This blog collects my ideas on different topics in the software world. Anybody who wants to participate is welcome – if you are interested, please contact me so I can set up an account for you here.

Central Data Management – or just a persistent canonical data structure…?

Service Oriented Architecture Posted on Fri, July 16, 2010 12:34:44

Hardly any integration project fails due to technical issues. When analyzing the reasons for a failed project, it is mostly the discussions and the communication between the different parties involved that cause the project to be terminated prematurely.

One of the most fiercely fought battles surrounds the ownership of core data. This discussion is less about the responsibility for important but very static reference information such as postal codes or calendar information – and more about information that is rapidly changing and also core to the business: client information, contracts,… Very often the fight is not about the right place to store the information – it follows the rule that ownership brings power.

If I hold the ownership of the customer data, I own the customer – and the other departments have to come to me if they want their share of the information. I have participated in a number of discussions and design meetings which did not try to solve a technical problem, but which were focussed on company management issues, such as the right way to handle information and who owns the business processes. If the design meetings run in this direction you are only a second away from the worst possible catastrophe of an integration project: political design paralysis and imminent project termination. Bad news: before the project is terminated a lot of time and money will be spent and a lot of people will be working for the bin. And that is not even the worst outcome: after the termination, the general search for a guilty party will always point to the integration team – as the weakest link in the organization. The underlying cause of the problem – the failed discussion on data and functional ownership – will not be addressed.

I am not the only architect who has seen this. Every integration architect with a number of years of experience can tell these war stories – you will not hear them at the conferences and product meetings with the vendors, but in the evenings when they meet their friends and former co-workers.

Coming home from these meetings and knowing these discussions, every architect starts thinking, trying to find a technical solution for an organizational problem. And in this case – there might even be one.

If the best place for the data is not in the applications, why not keep it in a place which is neutral to all the parties – in the integration infrastructure? As I have done my homework I have a canonical representation of the data structure. I even have it down to the level that attributes are typed (which is a very good practice if I want to avoid integration problems), so why should I not keep the data in its canonical form in the integration layer?

Technically the solution sounds pretty straightforward. For the core entities of the operation/company/firm you develop a number of core services. These services can be used by the different business applications to coordinate (create, update, retrieve…) their internal representations of the business entities with each other. And as these entities change, the central storage of the data is updated along with them.
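To give an idea of what such a core service looks like from the consuming application's side, here is a minimal sketch. The ESB host, the service paths and the payload file are invented for illustration – the real canonical customer structure and endpoints will of course look different in every organization:

    # Hypothetical sketch – host, service paths and customer-4711.xml
    # (a canonical customer document) are invented for illustration.
    ESB=http://esb.example.local

    # A CRM or billing application pushes its change of the customer entity
    # to the central store via the "update" core service ...
    curl -s -X POST -H "Content-Type: text/xml" \
         --data-binary @customer-4711.xml \
         "$ESB/services/customer/update"

    # ... and any other application retrieves the canonical copy on demand.
    curl -s "$ESB/services/customer/retrieve?id=4711"

The point is not the HTTP plumbing, but that every application talks to the same neutral, canonical copy of the entity.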

Similar approaches for a different kind of data are already in use. Many organizations use a central repository for their master data, such as reference information, ZIP codes, etc. These storages became important as a technical answer to a technical issue: how to coordinate base information on a technical level. The most common example of a central storage of information is the well known and widely adopted handling of user accounts in an LDAP server or Active Directory structure.

So – technically the storage of the data in the infrastructure is not a problem. The necessary frameworks are either provided by the vendors of the integration frameworks (like Tibco) or can be built on a basic level by the integration team itself. Some service buses do not offer the level of guaranteed delivery the integration architect or the business requires, and as a result the integration team has often already built a layer of storage into the bus to ensure that the data is correctly delivered. These solutions can be updated and expanded to also provide the services missing for the integration approach.

So – the need for a solution might be there, and the technology certainly is… – so where are the pitfalls of the solution, and why is it not used everywhere yet?

The first reason is – as usual – cost. By building the central data repository another replica of data is generated – of data that already exists in the environment. This additional data storage does produce costs, both in building the solution and in operating and modifying the structures afterwards. These costs can be quantified, whereas the savings are more hidden.

The first level of saving is the increased level of data consistency in the IT environment. This is an indirect saving for the operations manager, as he can reduce the number of staff needed for the maintenance of the existing data and receives fewer service calls to his departments.

A second level of savings lies in the application-independent storage of corporate information. This becomes important if the decision is made to either introduce new functionality and solutions, or to replace existing solutions. In the first case the centrally stored information is a good basis for the population of the new systems; some enrichment processes will be required, but they can be planned in. For the replacement of existing solutions, the central data storage is a perfect master copy of the core data and a good basis for data cleansing along the way.

A second reason for the missing popularity of the solution is missing sponsorship. What caused the discussion in the first place – the fight over the ownership of the information – is also a problem for the solution. The owners of the established solutions in the corporate world have to become sponsors for a piece of functionality that is outside of their realm. This obstacle can only be passed if the lead of the integration has enough substance at hand to build the solution outside of the direct project planning. Therefore the described solution is very successful in organizations in which the integration of applications is seen as a core function and equipped with a project-independent budget. As part of a single project, this solution might have a less successful reception. I can usually determine the ability of an integration team to implement this kind of solution by looking at the organization and the sponsoring of the canonical data model. I have found that organizations that handle their canonical model as a permanent central function of their integration are much better suited to build this solution.

The final obstacle for the implementation of the solution is the timeline of many integration projects. If the conception of the central storage becomes part of the critical path of the project, it is very likely to be rushed and built as a pure data cache. It then loses a lot of the features which make it a benefit for all involved parties:
– knowledge of the structure of information is available during the integration design process: the connectivity and extension of the central storage remain the challenges for the work in the project
– additional synergies – e.g. the use of the central data storage for business intelligence processes – can only be introduced as part of independent projects, as only these projects sponsor the maintenance and modification of the register.

To summarize: the central data repository can be a most useful tool in an organization that has very strong and independent systems which need to be integrated. It removes one of the main obstacles to the successful completion of integration projects – the discussion of data ownership – by introducing a central and independent place to handle the data. It is technically available and can be introduced into a service architecture. But it also establishes permanent costs which need to be justified by benefits exceeding its use as a neutral broker.

A final word of wisdom to all the architects who read this: do not rely on a technical solution for a management problem. The solution described here might work in many cases, but more often you need to talk to the sponsoring manager to address the underlying problems.



Make me a Business Case for SOA!

Service Oriented Architecture Posted on Sat, June 05, 2010 09:13:13

This is a scenario every technical person dreads: coming out of a meeting or the project manager's office with this one task: "Make me a business case for the use of SOA in our project – and then we decide whether we want to use the technology…" – and leaving the room is a technologist who really does not know where to start.

Writing business cases is difficult for a technical person anyway – the arguments for a technology are mostly technical, and the benefits which come out of the use of a particular technology might only materialize after the project is completed. But when it comes to SOA the case is even more complex.

Service Oriented Architecture is – contrary to common belief – not a fashion statement which was created for a particular sales pitch and to impress the market with a new trend. It did turn into a hype, but like quite a number of developments in the integration and software development area it is more a stage of thinking.

Service Oriented Architecture is an evolution, not an invention, growing out of the first days of EAI. It appeared when the architects of the first implementations of integrated environments reviewed their work and identified where they could have done better. Like with all new technologies, EAI implementations in the beginning lived from trial and error in the implementation process. After some time the implementors reviewed what they did and found that quite a few things could have been done better, more stably, and – most important – less painfully on the project management side.

So they reviewed their different approaches, looked around in the IT world and collected what they found good: things like canonical models, service definitions and catalogues, and others. And they gave it a new name to promote the improved methodology: Service Oriented Architecture.

Coming back to the poor technologist standing in front of the office, thinking about the business case. He knows that he has been tasked to write a business case for a methodology, that he has been asked to write down why to use a best practice. And a best practice should be used, because the alternative is using something sub-optimal.

Where did he go wrong?

First – the choice management wants to make is not about SOA or something else. Management is interested in how to implement an application most cost-effectively. Should they use a central application which contains all the functionality – or a distributed implementation which uses already existing capabilities in the environment? The technologist who presented SOA might have just made the mistake of presenting the "how" together with the "what": SOA as the best practice together with the integration as an environment decision.

Secondly – introducing new technologies (and the related buzzwords) is exciting for the technologist, but it is a horror trip for a manager. New technologies are notoriously unreliable, risky, and they always cost more. They are only a win for the manager if they exceed expectations and the manager can present himself as a successful visionary, looking good to the board and being invited to conferences to talk about his success. Therefore it is very important for the technologist not to sell SOA as a great technology thing. He should sell it as a requirement for success: if you do not use the best practice of the integration world, it will be much more difficult to be successful in the project.

Thirdly – there is a good chance that he got tasked with writing the business case because the manager got overwhelmed with new ideas and either needs to understand the concept – or needs to find out about this SOA thing before making a decision. Either way – the technologist is now competing against his own idea.
A much better approach is to introduce the different components one by one (such as the data model) and explain why this particular tool is useful. This allows the manager to step away from the big acronym SOA towards something that has a real impact on his business and his needs. And it helps to implement the tools the technologist needs to successfully complete his project.

Finally – the introduction of a methodology is often used to also spend a lot of money on tooling. When introducing SOA it is not necessary to replace the integration platform with a new and fancy service bus (which is barely used as long as only a few services are implemented). If I need a new service bus, then for very realistic reasons: the performance of the existing infrastructure is bad, the currently used platform is out of support, interfaces do not exist and I can buy them cheaper than building them, or I do not have an integration platform at all. But these are investments which are independent from the concept of SOA and they have their own justification.
Using a concept or methodology to justify investments is very risky, as the stakeholders at some point want to see their ROI. If I suggest investing in a technical tool, then for a very specific reason which I can use to demonstrate the business benefit. That this tool can be used for many other purposes is synergy and only an additional plus.

In the end – best practices sell themselves, whether they are given a particular name or not. A business case for SOA does not exist – but there is a business case for doing things right.



SOA and Security – The Second Great Wall of China

Service Oriented Architecture Posted on Sat, February 13, 2010 23:16:33

Between 220 and 206 BCE, the first Emperor of China, Qin Shi Huang, decided to protect his lands by building a wall. This wall would become world famous and (after being expanded by the Ming Dynasty) would become the "Great Wall of China". The idea was genius: if the intruders run against a big, heavy wall, they will not threaten the country. The idea did work – until the Manchus attacked. While the army was protecting the wall at the Shanhaiguan pass, the Chinese general Wu Sangui (who was opposing the royal house) opened the gates to the foe. Once the gate was open there was no stopping the flood, and the capital fell in just a few weeks. Relying on the fortification of the wall, the emperors had not provided enough defensive capabilities in the land behind it to stop the Manchu army.

Why do I use this example from ancient history when writing about Service Oriented Architecture? Because we again see integrations which build great walls to secure the land – by providing firewalls, de-militarized zones, etc. – but leave the applications – the cities – wide open. And once the intruder gets into the realm, the production environment, all gates are wide open. Let us have a look at today's walls and how to close the gates.

The first open gate is the missing gate altogether. As web services are developed at a rapid pace and on short notice, the implementation of login mechanisms and transaction security is seen as secondary. As a consequence even these basic means of security are not implemented; a web service in the production environment is deemed secure because the generation of a service call is seen as too complex to be done manually. Additionally, an external intruder needs to know the service catalogue, the internal data structures, the integration steps,… – how could an external intruder get all this information? Remember the Chinese empire – an internal informant provided all the information needed to penetrate the system. As soon as the information is available, all that is needed is an XML editor, a submission tool and access to the production environment. Some of these tools are standard packages used for the development of the applications: e.g. the free tool SoapUI is one of the most important tools for developers of SOAP web services.
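Just to illustrate how low the hurdle really is, here is a rough sketch of such a hand-made call. The endpoint and the payload file are invented – but nothing in it is beyond anyone with basic IT knowledge and network access to the environment:

    # Invented endpoint and payload – the point is that a hand-written XML
    # file and a standard HTTP client are all an insider needs.
    curl -s -X POST \
         -H "Content-Type: text/xml; charset=utf-8" \
         -H "SOAPAction: \"urn:example:UpdateCustomer\"" \
         --data-binary @handcrafted-request.xml \
         http://prod-esb.internal.example:8080/services/CustomerService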

And gaining access to the environment does not even require passing the fortified perimeter. Security experts have been publishing for years that the main threat to the IT infrastructure comes from within the organization. A disgruntled developer, a playful operator…

The second open gate in the protection of the walls is often based on carelessness: even if the gates are built and stable, closing and guarding them is expensive and time consuming. In many cases those gates remain open; the user identification uses a standard user which might even be identical for all applications. This scenario is even more dangerous than not having any security at all – as it provides a false sense of security. In some bad cases, the identification of the user is hard-wired into the messages and the intruder only needs to copy it. Even worse, those user credentials might even work on the human user interfaces, opening the application even further. A special variant are developer backdoors: debugging and coding interfaces which are developed and used during the development and testing process but not removed before the application is moved into production. An open door into the application – like an open back gate into the town. The development of web and web-service enabled applications does not normally include a security scan – so those doors are forgotten, overlooked, or even left on purpose, as they are useful in case something goes wrong and the application needs to be supported by an expert. But what happens if the expert becomes a spy?

So – now the gates are guarded and the backdoor is closed. The city is safe! Is it? Imagine a guest coming to the town, and the guards allow him entrance. And as soon as the guest/user is in, he is allowed to use all facilities, all tools, see all archives and rewrite all laws. What is the equivalent in our IT world? Very often the web service client receives a high level of permissions. This happens for two reasons:

1. To actually reduce transaction errors: if the integration has to handle security issues in addition to the technical and functional issues, the actual integration becomes much more difficult and needs much more effort. So – give the web service broad privileges and avoid these issues. A good idea?

2. Generating user accounts and permissions consistently over many different systems is difficult and expensive. Using a component which provides single sign-on (SSO) means that additional funding for development and maintenance needs to be provided. It is much easier and cheaper to generate a single user which can access all applications and is trusted as soon as it is authenticated. In real life – would you give anybody unlimited access to all your information after just one authentication? Or would you feel happy if your bank allowed someone to execute transactions for you – just because your caretaker has identified him?

This list does not even discuss security issues caused by the actual coding and development, or the security issues in the different pieces of software used to expose the environment to the world. The issues described so far are far easier to exploit and can be used by almost anybody with an IT background. If the coding issues are a small hole in the wall – the holes described here are big enough to let the Nimitz pass.

What is the consequence of all this? There are a few steps which everybody who wants to expose an application to the net via web services should take:

– Plan your walls: enable all means of security the protocol allows. This implies that user accounts and passwords must not be transferred unencrypted (readable). Best is to encrypt the authorization information with an algorithm that produces different information with every request. One possibility is to use date and time as part of the encryption mechanism (a small sketch of the idea follows after this list).

– Plan your authentication and authorization: when building the business process which underlies your integration, also define the users and their permissions in the applications that participate in the process. Make sure that the user really is allowed to do what he wants to do…

– Make sure that in every application you can identify the real user behind the request. Just storing the web service as a user allows information to be changed without a proper audit trail, by users who can hardly be identified and are almost anonymous.

– And finally: execute frequent security checks. Ensure that the development and deployment processes take security into account. Log what the web services do in your system and make sure that you can identify the actual user behind the service request.
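As an illustration of the "different information with every request" idea from the first point: a client could send the user name, a timestamp and a keyed hash over both instead of the password itself, so that a copied message quickly becomes stale. This is only a sketch with invented names – in a real project the standard WS-Security mechanisms (UsernameToken with nonce and timestamp) should be used:

    # Sketch only – SVC_USER, SECRET and the token format are invented.
    SVC_USER=esb_client
    SECRET=shared-secret-known-to-both-sides
    STAMP=$(date -u '+%Y-%m-%dT%H:%M:%SZ')

    # Keyed hash over user name and timestamp: the token differs for
    # every request, so a replayed message is easy to reject.
    TOKEN=$(printf '%s%s' "$SVC_USER" "$STAMP" | \
            openssl dgst -sha1 -hmac "$SECRET" | awk '{print $NF}')

    echo "User:  $SVC_USER"
    echo "Stamp: $STAMP"
    echo "Token: $TOKEN"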

A great wall is a great tool to secure your world. But if there is anything we can learn from Lenin in the world of information technology, then it is the proverb attributed to him: "trust is good, but control is better."



Service Directories and UDDI server

Service Oriented Architecture Posted on Thu, January 22, 2009 10:28:52

One of the core concepts of services was the introduction of UDDI (Universal Description, Discovery and Integration) services. For those readers not familiar with this concept, let me explain what UDDI is all about.

UDDI is a combination of a directory and an availability list. A service using the UDDI protocol will start up, connect to the UDDI registry and announce its existence.

<not complete>



About the Fun of Installing a Sun Box

Linux Posted on Thu, January 22, 2009 10:28:04

Actually this category is called Linux, but so far the only text I wrote was about the use of a Windows 2003 server as an NFS file server. This text gets a bit closer to the main topic, even though I am going to talk about Solaris. Actually I write about yesterday night (or was it this morning?) and my attempt to install a Sun Blade box with a new operating system.

I got this machine about 3 months ago to replace an older system (an E450). Both boxes share a lot of the architecture, especially their CD-ROM drives. The one in the E450 is broken, and as I discovered, so is the one in the Blade. Additionally the installation of Solaris (8) on the Blade was defective and I did not have the root password.

No problem, I hear the system admins calling – why do you not install via the network, you do have an old Ultra 10 which you can use for this. Right – this is exactly what I did – and the following description is a protocol of the different steps I executed to get there. Just to make it clear: this is how it worked in my environment, you will have to look the details up on the Sun pages yourself.

Step 1: Prepare the Boot Server

I had worked with boot servers before, having diskless and fanless Sun workstations. Nevertheless, the actual process of installing a boot server remained a mystery to me. So I followed the official Sun documentation, which defines a number of files and directories which need to be installed or modified. The things I actually modified were (a rough sketch of the commands follows the list):

– Generating the /tftpboot directory

– Modifying both the /etc/ethers and /etc/hosts files

– And finally: ensuring that the tftp, rarp and NFS servers are running
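Roughly, on the Ultra 10 this came down to something like the following. The MAC address, IP address and host name are just examples standing in for my own machines, so treat this as a sketch rather than a recipe:

    # Boot support directory served via tftp
    mkdir /tftpboot

    # Tell the boot server who the client is (example MAC and IP).
    echo "8:0:20:ab:cd:ef   blade2000" >> /etc/ethers
    echo "192.168.1.20      blade2000" >> /etc/hosts

    # Make sure tftp is enabled (uncomment the tftp line in /etc/inetd.conf
    # on older Solaris releases), then restart inetd and check that
    # in.rarpd and the NFS services are running.
    pkill -HUP inetd
    ps -ef | egrep 'rarpd|tftpd|nfsd'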

Nevertheless, this did not cause the target machine to boot from the network. So I started to consult the documentation and found a number of different ways to configure the environment. After an hour I finally found the piece of documentation which saved the night… and which leads straight to the next step…

Step 2: Install an Installation / Boot Server

Sometimes it is just a section in the documentation which gives the clue about the parameters which should be used. The script setup_install_server has such a parameter: "-b", which does not only install the files for the installation server, but also prepares the boot server for the environment. This step takes most of the time, but ran in my environment without any problems.
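For reference, this is roughly what the call looked like on my Ultra 10 – the media path and the target directory depend on the Solaris release and on where you mount the image, so take them as examples only and check the semantics of "-b" in the documentation for your release:

    # Solaris media mounted on the install server; paths are examples.
    cd /cdrom/cdrom0/s0/Solaris_8/Tools
    ./setup_install_server -b /export/install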

Step 3: Add Target System to Installation Server

The Sun documentation tells me to use the add_install_client command in the Tools directory of the installation server. But before this I had to add the MAC address of the target system to /etc/ethers and the target system name to the /etc/hosts file. It is not strictly necessary, but a reboot after this step sounded like a really good idea. The add_install_client command requires the name of the system (which we entered before) and the platform identification. For a Blade 2000 this is "sun4u". The script then executed without any issues, leaving a ready-to-use installation server behind.
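In my case that was something like the following – again with my example host name and path; the MAC address of the target can be read from the "banner" output at its ok prompt:

    # The Tools directory lives under the directory given to setup_install_server.
    cd /export/install/Solaris_8/Tools
    ./add_install_client blade2000 sun4u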

Step 4: Install the Operating System

From the OpenBoot level (the OK prompt) the installation is simply started with the command "boot net - install". The reboot worked without problems and the installation process started. Nicely, the installation only requires the usual entries in the beginning. As all the software is already on the server, no disc changes or media selections are required. Much easier, and if you use a 100 MBit line, almost as fast as loading directly from the CD.

For the Future…

The installation of the boot and installation server did cost me some hours of sleep, but it will make the installation of Solaris on my network much easier. An investment I do not regret. And one which I will repeat for Solaris x86 soon.



Windows 2003 NFS Server

Linux Posted on Sat, December 06, 2008 10:31:19

I have to confess… big time. I have been really bad – and installed an NFS server on my Windows 2003 server. And after a rainy weekend day and lots of tea I got it to work: all files (Windows, Mac OS, Linux, and Solaris) are now served from a central disk. I can actually shut down two of my three file servers and use a single one.

But let me start at the beginning. Since I started to run my little private network at home – which I use to test things out and to find out why certain things are not working the way they should (clients can be a bit tricky about giving you "root" access to their boxes) – I have had a Windows domain as part of this network. This Windows server – until lately just a 6 year old Pentium I with some 8 GByte of disk space – is the mail server, the Active Directory, the DNS server, and the DHCP server. You really do not need a big box for this in a small environment like mine. Until now it ran Windows 2000 Server.

A couple of months ago I got my hands on a (Dutch) copy of Windows 2003. (On the side: it is an original version – I take IP extremely seriously, as I make my living from it.) Knowing that Windows 2003 has two additional features I am quite interested in, I started to replace the old file server. So – what do I want from the server?

– Active Directory being able to support LDAP: I had a dedicated Linux machine for this task. Yes, that is the right way to do it – in a company. In my little environment it is a bit over the top.

– NFS file server: I used Linux for a long time as the central file repository for all the machines, using Samba and the Linux NFS server. But in today's corporate environments you find a lot of organizations that use Windows as the central file location. Does Windows NFS work as stably and reliably as Linux does? In the last three years I rebooted my Linux file server – once (!).

So – I got a new PC (Pentium 4 w/ 2 GByte memory and 500 GByte disk), I got Windows Server 2003 – let's try it out. The first thing you realize: it all looks very fine and easy to install. Well – until you realize that you forgot an essential tick in one of the wizards. You also need to plan the NFS server in when you set the domain up, otherwise you do not get it to work. But after a couple of hours – there it is – a working NFS server – I can see it, I can mount the disks from Solaris.
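Seen from the Solaris side the check is trivial – the server and share names below are just placeholders for my own machines:

    # List what the Windows NFS server exports, then mount a share.
    showmount -e win2003srv
    mkdir -p /mnt/winshare
    mount -F nfs win2003srv:/data /mnt/winshare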

All fun? Oh no – because then comes the problem with the anonymous login. Yes – I can see the disk, I can mount it, it gets the right privileges in the Solaris world, but can I actually access it? Oh no – permission denied.

You can call it a security feature, but I would rather call it an ooops. If you want to access the disk from Unix, you first have to import the users and groups into Windows, have Windows access the central LDAP server, or make Windows the central LDAP server and use Active Directory. Yes – this is not anonymous – I agree.

The security officer smiles: anonymous access is his or her no. 1 nightmare. But the guy who frequently changes the configuration has an issue – and is not happy.

Well – I will keep the 2003 NFS server. And I will continue using it for storing files and software. But for my experiments and my work spaces I will continue using the Linux environment – I am just not ready for an operating system that tries to be more intelligent than me.



Criteria For Re-Usability

Service Oriented Architecture Posted on Wed, November 26, 2008 12:15:04

Some of the entries here are driven by discussions I had with friends and colleagues. This one is an example of such a start. It is based on a discussion about what makes a service re-usable. In general, re-usability is one of the main sales stories for service oriented architecture. But it is also the one criterion where most implementations really fall short.

The selection of the interface technology does not make the main difference when it comes to the degree of re-usability. Even if all services are built using the same technology, that alone does not make the underlying functionality re-usable at all. The secret of a re-usable service lies in the design and the coverage of the business process.

In order to build services which can be used in more than one application, they have to be defined at an abstract level, in a way that they make sense in the business process. Just to avoid discussions – I am not talking about trivial services here, such as "get time", but about services within corporate business processes. In order to provide re-usable services the business itself must be able to define its business services in a way that similar steps in the process are modelled identically.

The idea is to build a fixed number of basic services which are optimized for performance and flexibility (if possible – sometimes we need to decide what is more important, performance or flexibility). These services form the basic building blocks for complex composite services.

Good candidates for these re-usable basic services are those that are based on data replication and modification, which handle the synchronization of entities of the canonical data model. Those services, commonly modelled following the CRUDS (Create, Read, Update, Delete, and Search) paradigm, allow the composition of services into composite services on the ESB or other servers. They can be seen as the building blocks of a more complex architecture.

A composite service uses the basic CRUDS services within a complete transaction, thereby conserving the atomic nature of the basic services while exposing a complete business operation as a generic service.
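As a very rough illustration of the call sequence – in a real environment the composition would live in the ESB, not in a script, and the host, service paths and customer id here are invented – a composite "change customer address" could be stitched together from two basic CRUDS calls:

    # Invented ESB host, service paths and customer id – a sketch only.
    ESB=http://esb.example.local
    ID=4711

    # "R" – read the canonical customer record via the basic read service.
    curl -s "$ESB/services/customer/read?id=$ID" > customer.xml

    # ... apply the address change to customer.xml with an XML tool ...

    # "U" – write the changed record back through the basic update service.
    curl -s -X POST -H "Content-Type: text/xml" \
         --data-binary @customer.xml \
         "$ESB/services/customer/update"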

I like to compare the system with the Danish building blocks for children. By providing the bricks for the different functions, I can nicely build a number of different solutions for all the different consumers in the environment.


The second concept of re-usable services is more top-down. This approach uses re-usable business processes. As an example, the process of purchasing a good in a shop uses many identical steps, independent of the sales channel used. To define these identical transactions and design them for re-use, it is important that the person working on them knows all the different related business processes. The designer can then define the process steps in a way that the re-use of the service is optimized.

If you look at the two approaches you will see the different qualities of the re-usability. The first is a totally technical approach; the business side comes in through (and it requires the existence of) a well defined corporate data model. The second one defines the re-usability on a business service level. The second approach asks for a strong corporate data model, too – but the kind of technical services implemented are less restrictive.

What you did not find above is any mention of "how" the service is implemented (whether a WebService is used, etc.). Yes, the service can use JDBC adapters, specialized application connectors, etc. If you follow my argumentation it is actually irrelevant how I implement the service; the technology used is dictated more by other requirements, like whether the service is exposed to a certain user group (externally) or to a certain service bus.



Critical Success Factor: Application and Data Ownership

Integration: Lessons learned.. Posted on Wed, May 14, 2008 10:48:05

This is a very short entry, and it is less about technology or methodology within an architecture, but about something that is even more influential in integration projects: politics.

Within the integration world we know a number of reasons why a project does not deliver the results expected. Some of them are related to expectation setting or to the complexity of the integration. Also the operational readiness of the environment plays a significant part in the integration being successful. But one reason is most prominent and causes more problems than anything else: insufficiently designed and discussed business entity ownership.

The first decision which needs to be taken when applications are integrated is the decision on the ownership of the different business entities. Every entity in the integration has to have a clear “home”, a place which handles and owns the core information of the entity and which the connected applications can use to refresh and synchronise their information.

The ownership discussion is highly political, as the location of the data says a lot about the priorities of the business. Let's – for example – take the most common business entity: the customer. This entity can live in different locations, e.g. the CRM system, the billing environment, or the general ledger system. By placing the customer in one or the other location, the centre of gravity of the integration is defined.

It also tells a lot about the importance of the departments or teams related to the application. For an integrator or any consultant it is easy to identify the importance of different business areas within the company's philosophy by reviewing which group owns the important (valuable) business entities, such as the customer or the product catalogue.

For the project it is most important to clarify this topic as soon as possible. It does not only influence the design of the integration and the required workforce, it also sets the most important decision making processes. By defining the owner of the entities, decisions about the nature and the shared features of the entity are in a single hand and the owner can apply them to the designs as required.

Not all decisions taken in this phase of the project will be optimal, but my experience has shown me that it is easier to correct the ownership during the project than to remain in ownership limbo or – even worse – in power warfare.

Of course, there will be discussions about the location of functionality affecting the entities – but with the responsibilities clear it is possible to resolve the issues as part of the development process and with limited impact on the project.


