How the UK Home Office halved TCO over 3 years while increasing resilience.
The Challenge
enCircle received a call from a long-serving member of the previous Home Office Digital team with a seemingly insurmountable problem: a collection of insecure workloads, hosted on unsupported operating systems and developed by a company that no longer existed. This complex situation led to high maintenance costs and unnecessary risks for the organisation and for the members of the public who were consuming the digital services.
The cost of refactoring the code and migrating business logic to a modern framework was prohibitive. The incumbent Cloud Service Provider made it clear that it had no interest in continuing to support the workloads and recommended that they be retired immediately.
enCircle were asked to take a quick look and see if we could bring the services under our wing within a tight budget and timeframe.
What We Did
We grabbed this hot potato with both hands and immediately began a comprehensive review of the infrastructure. Within the first couple of weeks, we had quantified and understood the full stack before moving to identify and recommend potential solutions. By deconstructing the often-obfuscated logic and breaking it down into its key components, we revealed that the underlying ‘proprietary’ code was, in fact, using an open source framework which was no longer actively developed by its community and suffered from a number of critical vulnerabilities.
A WAF (Web Application Firewall) was implemented immediately, consisting of a simple Nginx proxy with a ModSecurity rule set to mitigate the XSS (Cross-Site Scripting) and other critical vulnerabilities. All usernames and passwords were changed to high-entropy strings. Vulnerability monitoring and intrusion detection were deployed to ensure no exploits or data breaches were occurring.
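As an illustration of the credential rotation step, the sketch below shows one way to generate high-entropy strings with Python's standard `secrets` module. The case study does not say which tool was actually used, so the function names, the alphabet and the default length of 32 characters are all assumptions made for this example.

```python
import math
import secrets
import string

# Assumed alphabet: letters and digits only, to avoid quoting problems in
# config files. The case study does not specify one.
ALPHABET = string.ascii_letters + string.digits


def generate_credential(length: int = 32) -> str:
    """Return a random string drawn uniformly from ALPHABET via the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Entropy, in bits, of a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(generate_credential())
    print(f"{entropy_bits(32):.0f} bits of entropy")
```

Using `secrets` rather than `random` matters here: `random` is not designed for security-sensitive values, whereas `secrets` draws from the operating system's cryptographic random source.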
The next step was to create a dev pipeline and migrate the application layer components to a new host VM on a supported operating system. This revealed a number of issues with the underlying code which required manual patching and refactoring before passing sanity tests.
Once we had the application up and running on a secure stack, we moved to UAT to make sure the functionality was intact. Improving the UX and identifying any quick wins was our focus at this stage.
Through a series of agile sprints, we were able to refine the code and tune the backend databases to ensure they were performing at their best with minimal resource usage. Performance testing showed the new stack outperformed the legacy one by almost a third.
Finally, as there were no analytics for the services, we analysed the past 12 months of raw access logs to determine the usage patterns. This allowed us to craft real-life performance testing scripts. Running these scenarios showed that although the current stack was performing well, it was in fact over-specified, with almost 50% of resources remaining idle. Bottlenecks in the code and architecture were the main contributors to this poor resource utilisation.
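The log analysis described above can be sketched roughly as follows. This is a minimal example that assumes the access logs are in the common Apache/Nginx "combined" format; the case study does not state the actual format or tooling, and the names `LOG_PATTERN`, `summarise` and `traffic_mix` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed log layout (combined format), for example:
# 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /apply HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def summarise(lines):
    """Count requests per path (query string stripped) and per hour of day."""
    by_path = Counter()
    by_hour = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        by_path[match.group("path").split("?", 1)[0]] += 1
        by_hour[ts.hour] += 1
    return by_path, by_hour


def traffic_mix(by_path):
    """Convert per-path counts into each path's share of total traffic."""
    total = sum(by_path.values())
    if total == 0:
        return {}
    return {path: count / total for path, count in by_path.items()}
```

The per-path shares give the weighting for each scenario in a load-testing script, and the per-hour counts indicate when peak load occurs and how large it is relative to the average.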
Outcome
These high-risk and costly services are now running on highly available, scalable, maintainable and secure infrastructure provided by hyperscale CSPs (Cloud Service Providers). IAM credentials are shared with the Home Office DDaT teams to ensure continuity. Resource utilisation is now within appropriate thresholds and efficiency has improved by over 30%.
Through continuous improvements to the services during the life of the contract, TCO (Total Cost of Ownership) has been reduced by over 50%, while quality and user experience have improved significantly.
Taking a holistic approach to the management of digital services, enCircle has ensured both the longevity and sustainability of the workloads moving forward.
With regular performance review meetings, the service owners and end users are pretty happy too.
Key Lessons
If your organisation is facing a similar challenge, enCircle can help answer the following key questions:
- Is the service still needed by its intended audience?
  - Check the analytics or access logs - Are the functions you expect being used?
  - Are the data integrity and value high enough to justify the service?
- Is the codebase secure and maintainable?
  - Deconstruct/decompile the components - Are they proprietary or open?
  - Document dependencies - Are they all needed?
  - Scan repos for vulnerabilities - Mitigate with updates, WAF or monkey patch?
- Can the service be rebuilt using a modern open source ‘low code’ framework, e.g. a CMS such as Drupal?
  - Map the IAM Roles and Permissions - Do these translate into config?
  - Document the Data Architecture - Is it structured / unstructured / time-based? Is it suitable for the chosen platform?
  - Define the Taxonomy and Workflow requirements - Do these fit?
  - Understand the gaps - Will you need to write endless custom code?
Sometimes it’s better to bite the bullet and retire a digital service that’s reached end-of-life. Change in policy and/or legislation is the only constant, so be sure to create a roadmap that takes this into account.
Get ready to meet enCircle at DigiGov Expo on 24th - 25th September 2024.
enCircle
Providing cloud managed services, agile design and development, along with obsessive support, to public, private and non-profit organisations for over 20 years. Our highly capable and conscientious team delight our customers, all of the time.