IDG Contributor Network: Life cycle management: Let the sunshine in

Enterprise architecture is about the landscape of your organization: a landscape of people and IT, and of the behavior of both. These landscapes have become very complex. That complexity, and the problems it brings, have led to many efforts at standardization and rationalization. Not too many brands of an operating system, for instance. Not six different operating systems, but two. Not five different workflow engines, but one. Not too many different networking or computing technologies. Less is better.

Even when you are standardized, an additional complexity comes from the life cycles of IT. Putting IT in (any IT, even standardized IT) doesn’t mean you are done. For software this is most obvious: new versions are released on a regular basis. There are many reasons for new versions. A very important one is bug fixes, both for the primary function and, for instance, for security. Platforms that are used to run applications get updated all the time with security and bug fixes. And then there is a constant stream of new functionality, which in itself can bring new bugs.

Having old IT in your landscape is generally not a good thing. Old stuff is one of the most important sources of security problems. If your landscape still contains Flash, very old Java or .NET versions, or outdated operating systems, it is very likely not that secure.

Companies are aware that they have to manage their systems and make sure that they are up to date. This is life cycle management (LCM). LCM also applies to IT hardware. Hardware gets old too. It wears. It can fail. Hardware has a lifetime of a few to maybe ten years. And then, you generally cannot replace it with exactly the same model, as it is no longer for sale. Technology has moved on. New versions and products are forced into your landscape, even if you do not have new demands (but generally, you do).

All these products often depend on each other. A patched operating system or application framework may suddenly result in an application not working properly anymore. A new version of a platform may offer new possibilities, but also lose old ones. You may want to update your Windows server from Windows Server 2012 to Windows Server 2016, or from Red Hat Enterprise Linux 6 to 7, because the old version is going out of support, but will the application that runs on it still work?

Or you might want or need to update an application, but the new version requires a different platform version, one that you do not support yet. And most of your landscape is not built but bought (or rented), and the suppliers are in control of their own LCM and of what they support (and what not). Will you be using that old Windows Server 2003 that is no longer updated with security patches? Is there an old application that forces you to keep using it?

The result is a frothing sea of components, all at a certain level of ‘up-to-dateness’ and all in constant change. This is a major reason why it is difficult to change your IT landscape and, with that, your enterprise landscape.

It is hard enough to plan a change as it is, but while you are making the change, a lot of other things change at the same time. The landscape is volatile. In “Chess and the Art of Enterprise Architecture,” this is likened to playing a chess game where, while you are making a move, hundreds of other players are making moves as well.

It is no wonder that life cycle management is one of the trickiest parts of maintaining a healthy architecture or landscape. Strangely enough, architecture frameworks seem to pay little attention to what life cycle management does to your landscape. They talk about the life cycle of your architecture artefacts, not about what life cycle events do to your landscape and how to manage that.

So, how can you get a grip on that? I’ve seen in multiple organizations that architects were managing roadmaps (now we are using Windows 7, next year we will move to Windows 10). But it was difficult to get a grip on all the items, and the selection of what is part of the roadmap and what is not was pretty arbitrary (“Why do you mention JBoss versions but not Tomcat or Hibernate versions?” “I don’t know, should I?”).

I often saw more generic rules, such as “We use two versions of anything: if the latest version is N, we will be using versions N-1 and N-2.” Or: “We will only be using software that has ‘standard support’ and only exceptionally use software in ‘extended support’.”

Those generic principles generally did not solve the issue: the real world of dependencies was too complex to be governed by them. I should update because a new version N+1 has arrived, but I can’t because…

A proposal to get a grip on life cycle management

First, it is important to understand that what is in your landscape is, in the end, your choice. There are some legal limits (e.g. if you do not have the license to go on using something, you are not allowed to). But for the rest: your landscape (your architecture), your choice. Do you want to keep running that unsupported old piece of soft- or hardware? Nobody stops you.

Then, for every product or system or platform or application or service you use, you define a set of your own periods when it can be used. For managing the use of systems and their full LCM the periods are:

  1. Sunrise
  2. Sunshine
  3. Sunset

And there is an aspect, called Quarantine, which is independent of these periods but generally plays a role during Sunset.


Sunrise

Before Sunrise, the product (defined as a specific version or version range) is not to be found in your landscape. Not anywhere. Not in development. Not in testing. It is not there. If it is, it is not under LCM control. The only place you’ll find it is in the life cycle administration (see below).

At Start of Sunrise, the product/version may be in your landscape, but it is not yet in production. This is the period where people develop what is needed to put the product in production. Think of security baselines that need to be defined for a new platform, or of logging, backups, etc. that are being set up for a new product or version. People are being trained or recruited to use or support the new product. Nobody in the business can use it for real production work yet: we are in preparation mode, agile or not.

Sunshine

At Start of Sunshine (which is End of Sunrise, but we ignore that name) the product/version is fully in production in your landscape. During that time everyone in the organization can use the product, as it is supported by your organization. Whether a vendor offers you support is secondary. You decide if you want to have it in your landscape. That it is in (extended) support by a vendor or service provider is just one of the many reasons for allowing it in your landscape.

Sunset

You plan when you want to retire the product/version. Before that, there needs to be a period where everyone using the product/version must move away from it. This period is the Sunset period. During that period, it is still supported and allowed in your organization, but new uses are not allowed unless an exception is given. So, after Start of Sunset, you (repeatedly) warn current users of the product/version that it will be (forcefully) removed from the landscape at End of Sunset. And this cutoff is hard (but see below): after End of Sunset any instance of the product/version will be shut down.
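The rules for Sunrise, Sunshine and Sunset can be summarized in a few lines of code. The following is only a minimal Python sketch: the function name `new_use_allowed`, the period labels and the `production` flag are assumptions made for illustration, not part of any existing tool.

```python
def new_use_allowed(period: str, production: bool, has_exception: bool = False) -> bool:
    """Decide whether a NEW use of a product/version may be started.

    `period` is one of the illustrative labels "before sunrise", "sunrise",
    "sunshine", "sunset" or "retired" (after End of Sunset).
    """
    if period == "sunrise":
        # In the landscape, but not yet in production.
        return not production
    if period == "sunshine":
        # Fully supported by the organization: everyone may use it.
        return True
    if period == "sunset":
        # Existing uses continue; new uses need an explicit exception.
        return has_exception
    # Before Sunrise the product is not in the landscape at all,
    # and after End of Sunset every instance is shut down.
    return False
```

Note that the exception here only concerns new uses during Sunset. It says nothing about staying on after End of Sunset, which is a different matter.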

Now, it is customary to have these sorts of dates in organizations, but what generally happens is that reality forces you to keep things operational beyond what you would like. For instance, you want to scuttle Windows 2008, but there is that one important application that only runs on that platform and the replacement has been delayed. Or, you really have planned to update all the Oracle databases to a new version, but some important legally required change gets absolute priority in the organization and there is a limit to what you can handle in terms of changes. This is reality. Watch out. It bites.

So, what happens when your plan to scuttle a product/version gets bitten by reality? What organizations generally do is consider their standards (which are updated once in a while) and use a ‘comply or explain’ approach. And if the explanation is valid, an exception is given, for a certain period. The idea is to try to get to the point that in the end nobody uses the product/version anymore and it is gone from your landscape.

In some organizations there will be—at least if LCM actually is working—so many exceptions that it is hard to maintain that you have a standard.

You might argue that the exception amounts to a lie. You say you are ‘standardized’, but the exceptions prove that you are not. Your policy document, roadmap or “landing mode of operation” may show that Windows Server 2008 is gone, but all the exceptions are not part of that idealized reality. Think, by the way, of what happens when your policy documents say that Windows 2008 is no longer allowed in your organization at some point, but there are many exceptions and some auditor comes looking. It becomes messy.

That is why I argue that you should not give exceptions to End of Sunset in your organization. Instead, if there is a pressing reason to keep something around beyond the End of Sunset date, you move the End of Sunset date itself. You do not give an exception to the standard; you change the standard. There is a good reason for that: it is closer to reality.

By doing that, you also make clear that what remains in that landscape is still there because you want it. Because it is policy. It still gets backed up. It still is monitored. Everything needed to keep it running, safely and according to your organizational requirements, still needs to remain in place. You cannot get rid of your mainframe support until the mainframe is really gone, can you? You can (should) even put in a pricing mechanism. If some business owner makes you keep up a group of mainframe specialists because he or she has been unable to move away from an old application on the mainframe (technical debt), that business owner is the source of all that extra cost.

Why are you as a platform provider so expensive for John? Well, that is because of the cost that comes from Suzan’s technical debt. It is not for nothing that in the world of outsourcing and cloud providers you have even less choice. If they drop it, they drop it. And you must move. It is your landscape, but it is their decision. When the landscape and the decisions are still fully your own, you have the advantage of making more flexible decisions. The cloud has many advantages but is not disadvantage-free.

A small exception to a standard (as mostly happens these days) is easily granted, but moving the End of Sunset is a lot harder. And that is as it should be. We should stop lying about our landscape. The ‘exception’ is still part of your landscape. Acting as if it isn’t is not a good thing.
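To make the idea of changing the standard instead of granting exceptions concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `SunsetPlan` and `SunsetMove`, the field names, and the dates and owner in the usage lines. The point it tries to show is that every move of End of Sunset is recorded, with a reason and with the business owner who carries the extra cost, so the move stays visible.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class SunsetMove:
    """One recorded change of the End of Sunset date (hypothetical record)."""
    old_date: date
    new_date: date
    reason: str
    paid_by: str  # business owner who causes, and pays for, the extension


@dataclass
class SunsetPlan:
    """The End of Sunset of one product/version, with its visible history."""
    product: str
    end_of_sunset: date
    history: List[SunsetMove] = field(default_factory=list)

    def move_end_of_sunset(self, new_date: date, reason: str, paid_by: str) -> None:
        """Change the standard itself instead of granting a hidden exception."""
        if new_date <= self.end_of_sunset:
            raise ValueError("new End of Sunset must be later than the current one")
        self.history.append(SunsetMove(self.end_of_sunset, new_date, reason, paid_by))
        self.end_of_sunset = new_date


# Usage with made-up dates: the replacement application has been delayed.
plan = SunsetPlan("Windows Server 2008", end_of_sunset=date(2020, 1, 1))
plan.move_end_of_sunset(
    date(2020, 7, 1),
    reason="replacement of a dependent application delayed",
    paid_by="owner of the delayed application",
)
```

A real administration would probably add an approval step here; the sketch only shows that the date and its history live in one place.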

As an aside, the philosophy behind this is also illustrated by an old AI adage: “The best model of the world is the world itself.” As argued in “Mastering ArchiMate,” any administration you have about (parts of) your landscape is in fact (part of) a ‘model’ of (part of) that reality. An important hygiene factor for your organization is to have ‘models’ not contradict each other, another problem that ‘linking to that single reality’ solves.

When you rely on the model and the model lies, you often pay a hefty price. Things get harder than they need to be. In AI during the ’80s and ’90s (at the height of the Winter of AI, when it had become clear that almost all the symbolic modelling of the first 30 years had failed), AI researchers decided that instead of trying to make ever better symbolic models, they should rely more on direct observation. Only relying on a model is like walking in your home with your eyes closed and relying on memory.

There are reasons we don’t do that much. So, if you are forced to use a model, and in IT this is often the case, make sure your model at the least doesn’t lie. And if possible, do not use a model (an administration) when it is feasible to do direct observation. That discovery tool for licensed software is a lot more reliable than a spreadsheet with a human-maintained administration of servers and what has been installed on them.

Anyway, will End of Sunset get extended all the time? No. First, we are still in Sunset, and during Sunset new instances of the product/version still require an exception. And second, by making it visible that the end date moves, you increase the weight of the decision. Those small exceptions largely remain hidden. The move of an End of Sunset date far beyond the end of vendor support for a standard will get noticed. That someone has gotten permission to keep using it while on paper we have ended it is much harder to spot.

Quarantine

Now, it will happen (it does in many organizations) that some product/version is kept in operation while it has gone beyond (affordable) support. Apart from the cost of keeping the product/version running, we need to manage security, as we will not be getting patches anymore. One way to do that is to put the not-vendor-supported product in Quarantine. Quarantine is a special separate period. After Start of Quarantine, the product must be isolated. Maybe behind a firewall, in a separate network segment, or in extreme cases physically. The business owner has no choice. He or she must migrate to a Sunshine solution.

For a product/version, it could be like this:

  • Start of Sunrise: 1 Jan 2016
  • Start of Sunshine: 1 Jul 2017
  • Start of Sunset: 1 Jan 2021
  • End of Sunset: 1 Jan 2022
  • Vendor End of Support (optional, informational): 1 Jan 2023
  • Vendor End of Extended Support (optional, informational): 1 Jan 2024
  • Start of Quarantine: 1 Jan 2023
  • End of Quarantine: never (see below)
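The dates above can be captured as one entry in a life cycle administration. The following Python sketch is only an illustration under stated assumptions: the class `LifeCycleEntry`, its field names and the period labels are invented here, and the vendor support dates are left out because they are informational only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LifeCycleEntry:
    """One product/version in the life cycle administration (illustrative)."""
    name: str
    start_of_sunrise: date
    start_of_sunshine: date
    start_of_sunset: date
    end_of_sunset: date
    start_of_quarantine: Optional[date] = None  # None: no quarantine planned
    end_of_quarantine: Optional[date] = None    # None: quarantine never ends

    def period_on(self, day: date) -> str:
        """Return the period the product/version is in on the given day."""
        if day < self.start_of_sunrise:
            return "before sunrise"
        if day < self.start_of_sunshine:
            return "sunrise"
        if day < self.start_of_sunset:
            return "sunshine"
        if day < self.end_of_sunset:
            return "sunset"
        return "retired"

    def in_quarantine(self, day: date) -> bool:
        """Quarantine is tracked separately from the three periods."""
        if self.start_of_quarantine is None or day < self.start_of_quarantine:
            return False
        return self.end_of_quarantine is None or day < self.end_of_quarantine


# The example dates from the list above.
example = LifeCycleEntry(
    name="example product/version",
    start_of_sunrise=date(2016, 1, 1),
    start_of_sunshine=date(2017, 7, 1),
    start_of_sunset=date(2021, 1, 1),
    end_of_sunset=date(2022, 1, 1),
    start_of_quarantine=date(2023, 1, 1),
)
```

With these dates the product is retired a year before Quarantine would start, so Quarantine only comes into play if End of Sunset is moved.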

Again, the vendor support is just informational. It is we who decide if we ‘support’ it being in our landscape, based on vendor support among other things. And frankly, all these vendors have wildly different support schemes. Many are now moving to a regular short cycle with the occasional ‘long term support’ version; some, especially the more modern tools, just keep moving. By the way, we need to make the same choices for open source products, where there may be no vendor support at all (just a community forum).

Now in the above example, if End of Sunset has to move, the business owners that are the cause of this have to pay the extra cost. And if it is unavoidable that the End of Sunset goes beyond Start of Quarantine, the product goes into Quarantine.

For example, if you have an old application (technical debt) that depends on Oracle 8, then you will find the database and/or the application behind a firewall. The performance may suffer. But you have choices. In the example, Quarantine starts when normal Support ends. We have set it that way because extended support is too costly. But if we have to move End of Sunset beyond Vendor End of Support, we have a choice. Do we put the system in Quarantine, or do we buy Extended Support and move our Start of Quarantine date?
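The choice described here can be sketched as a small calculation. This is a hedged Python illustration only: the function name `quarantine_window` is hypothetical, and the moved End of Sunset date of 1 Jul 2023 is invented; the other dates come from the example list.

```python
from datetime import date
from typing import Optional, Tuple


def quarantine_window(end_of_sunset: date,
                      start_of_quarantine: Optional[date]) -> Optional[Tuple[date, date]]:
    """Return the span during which a still-running product must be isolated.

    The product runs until End of Sunset. If that lies beyond Start of
    Quarantine, it must be isolated from Start of Quarantine until it is gone.
    Returns None when no isolation is needed.
    """
    if start_of_quarantine is None or end_of_sunset <= start_of_quarantine:
        return None
    return (start_of_quarantine, end_of_sunset)


# Original plan: End of Sunset well before Start of Quarantine.
as_planned = quarantine_window(date(2022, 1, 1), date(2023, 1, 1))

# Option 1: End of Sunset moves to 1 Jul 2023 (invented date), Quarantine stays put.
isolate = quarantine_window(date(2023, 7, 1), date(2023, 1, 1))

# Option 2: buy Extended Support and move Start of Quarantine to 1 Jan 2024.
extended = quarantine_window(date(2023, 7, 1), date(2024, 1, 1))
```

In the first option the system spends six months behind the firewall; in the second it does not, at the price of Extended Support.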

It’s important to note that Start of Quarantine may even be Start of Sunshine. For instance, you may add some appliance to your landscape that is managed by another party: say, the people you have hired to do some facilities work, including climate control of your building. They use IT that they manage themselves, they do their own security patching (or not…), etc.

Such a new addition to your networks should be separated from your other systems. That already means Start of Quarantine. Technically, you might even have End of Quarantine, for instance for a product that goes into production with security flaws but for which improvements have been announced. As soon as the improvements have been installed, Quarantine may end.

Roadmap, LCM, LMO—it is all the same thing

Architects have often been tasked with maintaining “roadmaps” and more recently “landing modes of operation” (LMO). An LMO is a description of “what is supported in terms of how we can run applications.” Where the applications end up may be called a “landing zone” (LZ). The LZ is where an application “lands.” LMO is just another name for the set of products/versions that make up your standards for that LZ.

A roadmap describes how an LMO develops over time. But in such roadmaps, I often only find a few major choices, like operating system brands and versions, database brands and versions, etc. It is kind of arbitrary what ends up in the (often strategic) roadmap and what does not.

The LMOs are often somewhat more precise but lack the dimension of time. This is a disadvantage, as it pays to know that while Windows 2012 is still part of the LMO now, it soon will no longer be. Maybe it is better, then, for the application owner to choose Windows 2016 now, even if 2012 is technically still part of Sunshine.

The Sunshine approach puts it all together. Your planning of all the product deployment choices of everything that you want to control (which also is your choice as an organization) in terms of life cycle is in fact LCM and LMO and roadmap in one. The life cycle administration tells you what the LMO is now, but also what it will be in a few years (as far as planned). Is it too much? No, because the poor enterprise architects are not the ones who have to do all the work.
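As a small illustration of how the life cycle administration doubles as LMO and roadmap, the following Python sketch derives the LMO for any given day from the Sunshine dates. The function `lmo_on`, the `ADMINISTRATION` table and all dates in it are invented for this example and are not real planning data.

```python
from datetime import date
from typing import Dict, List, Tuple

# product/version -> (Start of Sunshine, Start of Sunset); dates are made up.
ADMINISTRATION: Dict[str, Tuple[date, date]] = {
    "Windows Server 2012": (date(2013, 1, 1), date(2019, 1, 1)),
    "Windows Server 2016": (date(2017, 1, 1), date(2024, 1, 1)),
}


def lmo_on(day: date, administration: Dict[str, Tuple[date, date]]) -> List[str]:
    """List the products/versions that are in Sunshine on the given day."""
    return sorted(
        name
        for name, (start_of_sunshine, start_of_sunset) in administration.items()
        if start_of_sunshine <= day < start_of_sunset
    )


lmo_now = lmo_on(date(2018, 6, 1), ADMINISTRATION)    # both versions
lmo_later = lmo_on(date(2020, 1, 1), ADMINISTRATION)  # only Windows Server 2016
```

Asking the same question for a date in the future is what gives an application owner the time dimension that a static LMO lacks.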

Each product in the Sunshine administration has a product owner. If your IT department (owner) offers Tomcat 8.1-4 (minor upgrades, so when 8.4 replaces 8.3, all migrate, as with a security patch), Tomcat 8.5 (major functional changes with respect to 8.1-4), and Tomcat 9 application servers for applications to land on, you have three entries in the life cycle administration, each with its own Start of Sunrise, Start of Sunshine, Start of Sunset, End of Sunset and Start of Quarantine. If you’re smart, you implement it in tooling with discovery and a workflow engine. You manage all the dependencies. You’re in control as much as you want to be.

Owners may be users of the products of other owners. Think of business owners whose applications run on the Tomcat platform that IT offers them. Every product (application, platform, etc.) ends up in that administration with its dependencies in place. If server X contains platform Y, then the Sunshine events on platform Y have to be managed by the owner of server X. If application A uses platform Y, the same is true for the owner of application A. Etc.
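How a Sunshine event on one product reaches the owners of everything that depends on it can be sketched as a simple graph walk. This Python example is an illustration only: the function `affected_owners`, the two lookup tables and the owner names are hypothetical, and following dependencies transitively is an assumption about what the “Etc.” implies.

```python
from collections import deque
from typing import Dict, List, Set

# item -> products it contains or runs on (illustrative data).
DEPENDS_ON: Dict[str, List[str]] = {
    "server X": ["platform Y"],
    "application A": ["platform Y"],
    "platform Y": [],
}

OWNER_OF: Dict[str, str] = {
    "server X": "owner of server X",
    "application A": "owner of application A",
    "platform Y": "owner of platform Y",
}


def affected_owners(product: str,
                    depends_on: Dict[str, List[str]],
                    owner_of: Dict[str, str]) -> Set[str]:
    """Return the owners who must act on a Sunshine event of `product`.

    Walks the dependency graph breadth-first, so owners of items that
    depend on a dependent item are included as well.
    """
    affected: Set[str] = set()
    seen = {product}
    queue = deque([product])
    while queue:
        current = queue.popleft()
        for item, dependencies in depends_on.items():
            if current in dependencies and item not in seen:
                seen.add(item)
                affected.add(owner_of[item])
                queue.append(item)
    return affected


to_notify = affected_owners("platform Y", DEPENDS_ON, OWNER_OF)
```

In practice the dependency table would be filled by discovery tooling rather than by hand, in line with keeping the model close to reality.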

It is a lot of work to set this up, but it will not be very hard to maintain if you make sure that your model is closely linked to reality, as in “the best model of the world is the world itself.”

This article is published as part of the IDG Contributor Network.
