The rate of change in IT today is unmanageable... Discuss
A few years back I wrote an article about how I thought the enterprise would begin to struggle to keep up with the way IT is evolving and the rate of change that is required. A few years later I think that challenge is bigger than ever, but to a large degree it seems to go under the radar as a topic.
What do you mean?
Ok so this is a pretty bold statement but I think we need to define the problem and discuss some examples.
At its heart the problem is that in IT everything is changing and the pace of change is very rapid. In a way IT has become a paradox for many businesses but they don't yet know it. For years the business functions within an organisation have complained about how their IT is inflexible and changes too slowly to keep up with the rate of change required in the business world for the business to stay competitive.
As organisations have adopted the cloud and other modern software development platforms they are able to work with vendors who push new features and services to the IT department at a speed which is much quicker than the IT department is able to effectively consume them. The rate of change of services from vendors often means a service can be previewed, go live, become mature and then be decommissioned before the IT department ever gets around to evaluating whether the service was something they could benefit from. You could argue the rate of change is bordering on chaos and there is going to need to be a recognition in IT departments that this is a problem space we will need to manage.
Let's look at a few examples of this problem, from smaller instances to bigger ones.
Example 1 - NuGet Refresh
Recently I asked a friend who develops a software product whether he felt this was a problem. He replied that he was bordering on scared every time he needed to refresh the NuGet packages used by his product. He tried to get the team to do this on a regular basis to minimise disruption, but he knew that if it had been a few months since the packages were last updated he could face the following two scenarios:
- A significant number of the packages will need to be updated (can be hundreds for some code bases)
- One or more packages might be several versions behind
My friend knew that every time they did this there was a risk that a breaking change could be introduced. The outcome could be anything from 5 minutes of work with everything fine to a couple of days before the code would compile again. Hopefully you have good tests to catch any issues, but sometimes you may not find a new bug until much later.
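One way to take some of the fear out of a refresh is to triage the pending updates before touching anything. The sketch below is only an illustration of that idea, not my friend's actual process: it assumes the packages follow semantic versioning (where a major version bump signals possible breaking changes), and the package names, version numbers and the `classify_bump` and `triage` helpers are all made up for the example.

```python
from collections import Counter


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', ignoring any pre-release or build suffix."""
    core = version.split("-")[0].split("+")[0]
    parts = [int(p) for p in core.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as "2.1"
    return (parts[0], parts[1], parts[2])


def classify_bump(current: str, latest: str) -> str:
    """Label an update by the most significant version part that changes."""
    cur, new = parse_version(current), parse_version(latest)
    if new <= cur:
        return "up-to-date"
    if new[0] != cur[0]:
        return "major"  # assumption: semver, so breaking changes are possible
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def triage(updates: dict[str, tuple[str, str]]) -> dict[str, str]:
    """Map each package name to the kind of bump it is waiting for."""
    return {name: classify_bump(cur, new) for name, (cur, new) in updates.items()}


# Hypothetical packages: name -> (installed version, latest version)
pending = {
    "Example.Logging": ("2.1.0", "2.1.4"),
    "Example.Http": ("3.4.1", "3.9.0"),
    "Example.Serialization": ("1.8.2", "3.0.0"),
}

report = triage(pending)
summary = Counter(report.values())  # how many major, minor and patch bumps

for name, bump in sorted(report.items()):
    print(f"{name}: {bump}")
```

A team could then apply the patch and minor updates in one batch and give each major bump its own change, so that a breaking change is easier to trace. Not every package honours semantic versioning, so treat the labels as a hint rather than a guarantee.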
Example 2 - How many different ways are there to build an API?
Recently I finished a project where we had been doing what we considered to be quite cutting edge stuff and, from speaking to people at conferences, we were using technologies in anger that most other companies were just starting to play with. Then I met some other companies and realised that they were solving the same problems we were but using completely different approaches. With so many vendors out there today, and with the Microsoft eco-system now including much more from the open-source world and other vendors, it seems we are dealing with many more options for implementing the latest buzzword solution.
If you think of a scenario as simple as building an API then with today's tech on Azure it would not be unreasonable to consider the following options to build it:
- An API component hosted on a VM
- An API component hosted on Azure App Service
- An API component developed and deployed using Azure Functions
  - Which may be deployed on Windows
  - Or deployed on Linux
  - Or deployed with Docker
- An API component hosted on Service Fabric
- An API component hosted on Azure Container Service
- An API component hosted on Azure Kubernetes Service
On all of the above you also may or may not use API Management, some of them may use Docker and some may not, and we aren't even considering yet which language you might want to write the code in.
You could argue that we have gone from a situation 10 years ago where we were significantly constrained in our architectures by the lack of choice of tools and platforms to build solutions with, to a situation today where we have too many choices of tools and platforms with too many overlaps. In fact I would go so far as to say you could argue concepts like "Best Practice" and "Best of Breed" can no longer exist in this eco-system, because there are so many choices and so many scenarios that even if one option was the best now, in 6 months it would be likely to be considered out of date.
What is the hidden problem?
I believe that the examples above are two of many things that play out in enterprise IT today. They lead to implementations which are delivered with the best intentions, but once the solutions are live they quickly develop architectural baggage and technical debt because they don't necessarily keep up with the times.
Some of the behaviours in IT that I believe can be seen with this are things like:
- Every agile team chooses their own tools with no consistency across the organisation
- Every new project or product starts with the latest new shiny tools
- Every new project has to reinvent solutions to common delivery problems which may not exist in the immature new tools they are using
- No one wants to work on the older products/projects because they are using "legacy" technology
I guess one side point to note here is that I think some of the above points may differ between a product company and a typical enterprise. Product companies tend to have long-lived teams who focus on one product and its entire life-cycle. In these cases different tech between products is less of an issue. In the enterprise it is less common for a team to be dedicated to one product, and team members will likely work across a multitude of applications and technologies over time.
Why is this a problem?
When thinking about the problem from an organisational perspective let's think about the CTO. Do you think the CTO would understand this problem? Do you think the CTO understands the risk that they carry because of this? I would argue that it's highly unlikely.
I think a CTO would fully understand investments in off-the-shelf or SaaS applications: they typically view these as long-term investments where they want a stable platform to help them reliably deliver a solution. The problem we are talking about here, however, is really a custom software development problem. The CTO is typically less aware of the details there and perhaps doesn't realise some of the problems that exist.
To explain this let's use an analogy and compare the situation I describe above to a car manufacturing plant. In my car plant I would like to develop an assembly line to build a certain type of car. I would optimise the assembly line to be able to build lots of the same type of car consistently and repeatably. The same could be true of components in your architecture. An architecture where you could build lots of APIs or microservices which are very similar and repeatable is surely a good thing? The problem, however, is that if you change the underlying tools and technology all of the time, it is the same as building your assembly line, building one car, then throwing the assembly line away and building a new one for the 2nd car. That car looks exactly the same to the outside world, but your team is very proud because they used version 2.0 of a cool new spanner to build it.
If your CTO began to recognise the problem they would immediately see it as an architecture and governance problem. The difficulty, however, is that we have ended up in a world of agility where many organisations were convinced that architecture was just a blocker to getting things done, and for many of them architecture is no longer something they actively manage.
I think the path this is likely to take is that organisations start to realise they don't actually have a clue which technologies they are using, where, and to what extent. In some cases this will result in a clamp down as an over-reaction, followed later by the realisation that this is actually an architecture problem and that there are architecture tools and practices to manage it.
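Even a very simple inventory would answer the "what, where and to what extent" question. The sketch below is just an illustration under assumptions of my own: the component names, the `deployments` records and the two helper functions are hypothetical, and a real organisation would more likely pull this data from its deployment pipelines or a CMDB than from a hand-written list.

```python
from collections import defaultdict

# Hypothetical records: (component, hosting technology)
deployments = [
    ("orders-api", "Azure Functions"),
    ("billing-api", "Azure App Service"),
    ("stock-api", "Azure Functions"),
    ("pricing-api", "Service Fabric"),
]


def build_inventory(records):
    """Group components by the technology they are built on."""
    inventory = defaultdict(list)
    for component, technology in records:
        inventory[technology].append(component)
    return dict(inventory)


def one_off_technologies(inventory):
    """Technologies used by exactly one component: candidates for review."""
    return sorted(tech for tech, comps in inventory.items() if len(comps) == 1)


inventory = build_inventory(deployments)
one_offs = one_off_technologies(inventory)

for technology, components in sorted(inventory.items()):
    print(f"{technology}: {len(components)} component(s)")
```

The technologies that appear only once are a reasonable place to start the conversation about whether an approach is carrying its weight.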
I think we have seen a resurgence in solution architecture from the perspective of "how should we solve this problem" over the last few years, but I think the next wave will be when IT departments cotton on to this problem and start recognising the need to govern their architecture again.
Bringing back Architectural Governance
The first step in architectural governance is the realisation that the silver bullet technology this week will not necessarily be the silver bullet technology in a few months' time. We therefore need to apply some rules to the organisation to make sure we are developing solutions with an awareness of the approaches we are taking and an acknowledgement of the pros and cons of each approach used.
I think for many organisations the answer will eventually be a set of principles, practices and blueprints that guide their teams on which approaches are allowed and which should be avoided. Refer back to our earlier example of the many different ways to build an API: for most organisations there is unlikely to be a need for more than one way. Let's say that in the interests of being adaptable the architecture team supports two approved ways. Then let's at least have an understanding of the overlaps and benefits of each and why we would use one approach in a certain situation versus another. I would also expect that the "default" approach should be the one that is the best combination of simplicity and value for money. For most organisations the KISS (Keep it simple stupid) principle is completely valid, so why over-engineer a solution?
Once you have created some principles to guide your decisions and some blueprints to show how to implement the preferred approaches, the challenge is that over time new approaches will evolve to challenge your decisions. Perhaps one way to handle this could be to add some criteria which must be met before an approved approach can be challenged, e.g. a new technology must have been out for a year post general availability and also support certain levels of automation before it will be considered. In some cases, by the time the challenger meets these criteria people will be bored with that tech and there will be another new kid on the block.
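Criteria like these are simple enough to write down as a rule that everyone can see. The sketch below is one possible shape for that rule rather than a recommended standard: the one-year threshold comes from the example above, while the `Technology` class, the `REQUIRED_AUTOMATION` capability names and the technology names are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Threshold from the example above: a year post general availability (GA)
MIN_TIME_SINCE_GA = timedelta(days=365)

# Hypothetical automation capabilities the architecture team insists on
REQUIRED_AUTOMATION = frozenset({"infrastructure-as-code", "ci-cd-deployment"})


@dataclass(frozen=True)
class Technology:
    name: str
    ga_date: Optional[date]  # None while the technology is still in preview
    automation: frozenset    # automation capabilities it supports


def can_challenge(tech: Technology, today: date) -> bool:
    """Return True if the technology may challenge an approved approach."""
    if tech.ga_date is None:
        return False  # preview technologies are never considered
    mature = today - tech.ga_date >= MIN_TIME_SINCE_GA
    automated = REQUIRED_AUTOMATION <= tech.automation  # subset check
    return mature and automated


# Hypothetical candidates evaluated on a fixed review date
review_date = date(2024, 1, 1)
established = Technology("Example Platform A", date(2022, 6, 1), REQUIRED_AUTOMATION)
too_new = Technology("Example Platform B", date(2023, 6, 1), REQUIRED_AUTOMATION)

print(established.name, can_challenge(established, review_date))
print(too_new.name, can_challenge(too_new, review_date))
```

Writing the rule down this way does not make the decision for you, but it does mean a team proposing the next new thing knows in advance what bar it has to clear.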
What should good look like?
Above I guess we have painted a slightly bleak picture but the scenario should be manageable for many organisations. First off, Architecture Governance needs to be recognised as the way of managing this. The idea is to find the balance of having enough approaches to allow the development of solutions which are effective for the business and provide some flexibility in how things are done, but without having a free for all.
In a good organisation the architecture team will be looking ahead at the technologies which are out there and which could be beneficial to the organisation. At present in many organisations agile teams can implement stealth approaches to use the tools, platforms and technologies that suit their objectives. The architecture team needs to turn this around and be ahead of the game so they can recommend tools and platforms to the agile teams which may help them. Doing this requires a balance of time spent looking outside at technology trends and inside at requirements, and then using this info to update practices and blueprints to follow the technology changes they want. At this point the architecture team have mitigated the risk of too many different options and overlaps, because their set of supported tools is aligned to the architecture principles for the organisation.
I think one of the most common challenges to restricting the approaches which can be used, and to implementing some checks and balances, is the claim that staff will be difficult to recruit and difficult to retain if we aren't using the latest technologies. There is some truth to this, but at the same time many organisations do not have good training plans aligned to what their developers do. This could be managed with architecture principles which combine technology choices with requirements to train staff on those technologies. Again referring back to the KISS principle, it is also easier to train a wider range of staff on a more mature platform than on something which is very new and cutting edge and may not even have much training available. Good principles can guide our choices based around the ability to train the team.
Another good practice for the architecture team would be to use their outward looking focus to help them plan to evolve their architecture incrementally in small steps over time. This is a much better approach than having to rewrite significant parts of the architecture at longer intervals. One of the bigger challenges in this space relates to components which need to be updated but no longer form part of active project based plans, so there is no commercial vehicle to fund the update. The common case is that the component gets left in the corner to go rusty until it breaks, at which point the business becomes unhappy and some funding is found to fix the problem. A better approach would be to recognise that all IT has an ongoing maintenance cost and have a budget line which covers upgrades to developed software, in the same way that IT departments usually have budgets to cover upgrades to Windows, server patches, etc.
One thing I do wonder about in the IT space that may mitigate some of the challenges in this area is the boom around citizen developer platforms such as Microsoft Flow and Power Apps. Because these platforms remove the need for lots of custom code, the ongoing maintenance of these solutions is tiny by comparison. I do wonder if this is one of the areas where organisations will begin developing more solutions, to move away from some of the custom code solutions around which much of the problem space revolves.
At this point I will try to wrap up this discussion. I hope I have articulated reasonably what I think is one of the biggest challenges in IT today, even though it is not fully visible to many organisations that they are up to their neck in it.
I think the answer to this problem will come in the resurgence of architecture as a discipline. I think architecture has found itself lost and no longer knowing its place in recent years but the need for management of the IT eco-system is more important than ever.
In terms of the theme of this "Discuss", I will share this with some people and invite anyone to share their experiences, and to challenge it if they disagree or have different ideas. It's all about creating a healthy discussion where we can all learn from each other's insights.