With software and service failures making headlines, the most recent BCS CIO Network meeting in London discussed the importance of professionalism in helping to ensure critical IT services remain dependable and resilient.

With the IT industry still under the shadow of the recent CrowdStrike global outage, the BCS CIO Network met to discuss service resilience. Speakers and guests used the incident, which brought millions of Windows-based PCs to their knees, as a starting point to explore why IT systems fail. They also looked at the changes in technology, culture, leadership and team diversity that could help make systems more robust and better able to withstand cyber attacks and incidents.

Following the event, key speakers appeared in a special edition of the Gem of All Mechanisms podcast discussing architecting resilience.

Architecting resilience

If there was one key takeaway from the event, it was this: the days when the consequences of cyber incidents were limited to the digital domain are long gone. Software is now woven into the warp and weft of everyday life, and if that software fails, or is slow to recover from a failure, the consequences can be devastating.

Speaking at the event and opening the first fireside chat, Alastair Revell FBCS, BCS President, said: ‘The public is starting to wake up to the need for trust in the systems we depend on. Having confidence in critical national infrastructure is incredibly important… We’re also thinking about accountability — what accountability do people have at the boardroom level?’

Alastair looked broadly at the points and places where IT has hit the news headlines for all the wrong reasons. Beyond the CrowdStrike outage — which closed airports, stifled hospital admissions and choked e-commerce — he pointed to the Post Office Horizon Scandal as a prime example of how failures in critical software can lead to awful real-world consequences.

Alastair said that when critical systems fail and lives are impacted, there’s often a rush to blame IT teams. He observed: ‘A critical part of the solution, I feel, is making boards culpable, responsible and accountable for what their organisations put into software.’

Against this backdrop, the event’s invited speakers — active CIOs and board members — explored how successful organisations can build strategies to prevent, survive and recover from critical IT failures. Much of the focus was on practical ways boards can work productively with CIOs and how CIOs and IT departments can support their governance committees.  

The CIO Network meeting was busy, broad and very interactive. It also offered many opinions, ideas and insights. These fell into three main categories:

  • The case for professionalism
  • Actionable advice for boards
  • Actionable advice for CIOs

In summary, the CIO Network focused on appreciating digital risk and the roles of the board and CIOs in managing these issues.

The case for professionalism

Given IT’s ever-growing criticality in business and the public’s everyday life, the panel discussed the case for professionalising IT.

‘It’s something I’m very passionate about,’ said Alastair, pointing to digital health and social care as evidence of technology’s importance in keeping the public safe.

Another speaker, backing up this point, discussed the impact of the June 2024 ransomware attack against a pathology lab responsible for processing blood on behalf of NHS organisations in London.

The attack’s consequences rippled across the NHS and eventually led to hospitals in the capital cancelling operations due to a lack of blood. This shortage eventually triggered a national call for blood donors across the UK to come forward. A Russian group of cyber criminals was thought to be behind the attack.

Against this backdrop, panellists and speakers from the floor questioned why IT systems can be designed and built by people with few specific IT qualifications, when hiring a surveyor, surgeon, solicitor or accountant would be unthinkable without first checking that they are properly qualified and certified.

Panellists felt that a Chartered IT Professional should manage the most critical IT projects, where public safety, trust and taxpayers’ money are at the most significant risk.

There was a hope that one day, parents would be proud to say that their children are Chartered IT Professionals in the same way as they may be if their child became a certified accountant or, indeed, a bricks-and-mortar architect.

Insights for CIOs 

It was felt that CIOs should prioritise understanding and mitigating risks associated with their digital infrastructure. Specifically, they should consider scrutinising dependencies between connected systems and ensuring fail-safes are in place. A significant theme was integrating cybersecurity considerations into every aspect of IT operations, particularly around potential vulnerabilities in third-party software.

Modern applications place a great deal of emphasis on mapping and understanding user journeys and user experiences. However, panellists felt that this emphasis on understanding critical steps in software's operation needs to extend well beyond the presentation and interaction layers.

Instead, they felt a 'golden process' needed to be understood, appreciated, drawn and documented. Organisations need to understand clearly the connections between systems and architecture that combine to enable a service. Without this understanding, making a clear connection between technology and business processes is challenging.
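
As a minimal sketch of what documenting such a 'golden process' could look like in practice, the Python below records which systems depend on which and then answers two questions: what does a service rely on, and which services are affected if one system fails? The system names and the DEPENDENCIES map are hypothetical and purely illustrative; they were not discussed at the event.

```python
from typing import Dict, List, Set

# Hypothetical service map: each system lists the systems it depends on directly.
DEPENDENCIES: Dict[str, List[str]] = {
    "online_booking": ["web_frontend", "payments_api"],
    "web_frontend": ["identity_service"],
    "payments_api": ["identity_service", "card_gateway"],
    "identity_service": ["directory_db"],
    "card_gateway": [],
    "directory_db": [],
}


def transitive_dependencies(service: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every system the service relies on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(graph.get(service, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against circular dependencies
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


def impacted_services(failed: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every system that stops working if `failed` becomes unavailable."""
    return {name for name in graph if failed in transitive_dependencies(name, graph)}


if __name__ == "__main__":
    print(sorted(transitive_dependencies("online_booking", DEPENDENCIES)))
    print(sorted(impacted_services("directory_db", DEPENDENCIES)))
```

Even a map this small makes hidden dependencies visible: in the hypothetical example, a failure in one back-end directory takes out every customer-facing service above it.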

CIOs were encouraged to adopt rigorous governance frameworks, including processes like technical and business design authorities, to ensure all technological decisions are well-structured and documented. Another vital recommendation was for CIOs to proactively encourage root-cause analysis when IT failures occur, taking ownership of the situation, identifying risks early and tracking these risks with metrics to guide decision making.

The importance of clear, structured communication with non-technical leaders was also highlighted. CIOs should be able to translate complex IT risks and strategies into language that executives and other decision makers can easily understand. This includes presenting risks in a way that shows how they will be mitigated and how these mitigations align with overall business objectives.

A recurring theme was how technical risk is communicated to boards, whose members are generally non-technical. Boards naturally speak and understand the language of profit and loss statements. As such, it was felt there’s a fillable niche for IT team members who can present cyber risk to boards in ways they will naturally grasp, appreciate and then prioritise.
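
One common way of putting a technical risk into financial terms is annualised loss expectancy: the estimated cost of a single incident multiplied by how often it is expected to happen in a year. The sketch below is illustrative only. The approach was not specifically named at the event, and the function names and every figure in the usage example are assumptions made for the sake of the example.

```python
def annualised_loss_expectancy(single_loss: float, annual_rate: float) -> float:
    """Expected yearly loss: cost of one incident times expected incidents per year."""
    if single_loss < 0 or annual_rate < 0:
        raise ValueError("single_loss and annual_rate must be non-negative")
    return single_loss * annual_rate


def net_mitigation_benefit(ale_before: float, ale_after: float, annual_cost: float) -> float:
    """Yearly saving from a mitigation once its own running cost is deducted."""
    return ale_before - ale_after - annual_cost


if __name__ == "__main__":
    # Hypothetical figures: a 2,000,000 outage expected once every ten years,
    # reduced to once every fifty years by a control costing 50,000 a year.
    before = annualised_loss_expectancy(2_000_000, 0.10)
    after = annualised_loss_expectancy(2_000_000, 0.02)
    benefit = net_mitigation_benefit(before, after, 50_000)
    print(f"Expected yearly loss before: {before:,.0f}")
    print(f"Expected yearly loss after: {after:,.0f}")
    print(f"Net yearly benefit of the mitigation: {benefit:,.0f}")
```

The estimates going in are inevitably rough, but the output is a line a board can weigh against any other item of expenditure.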

Other vital suggestions for CIOs included:

  • Focus on component uptime: move beyond measuring overall server uptime and focus on gathering performance data about critical modules and components that impact user experience
  • Balance agility with security: integrate security and resilience into agile development processes to avoid compromising long-term stability
  • Address technical debt: prioritise resolving technical debt proactively to prevent future system vulnerabilities and inefficiencies
  • Encourage transparent risk sharing: create an environment where teams feel safe escalating risks and challenges early, fostering collaborative problem solving
  • Invest in continuous training: ensure ongoing training and upskilling for teams to keep pace with technological advancements and cybersecurity needs. Here SFIAplus and RoleModelplus may prove useful
  • Implement metrics beyond hardware: shift away from traditional hardware-based metrics to user-centred measures that accurately reflect service availability and quality
  • Adopt proactive risk management: continuously identify and mitigate emerging risks, ensuring resilience in IT systems
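
To illustrate why component-level and user-centred measures can tell a different story from headline server uptime, here is a minimal sketch. It makes a simplifying assumption: the components sit in series (the service needs all of them) and fail independently, so the overall availability is the product of the individual figures. The function names and numbers are hypothetical.

```python
from math import prod
from typing import Dict


def serial_availability(component_availability: Dict[str, float]) -> float:
    """Availability of a service that needs every listed component to be up.

    Assumes components fail independently, so the individual availabilities
    (each between 0 and 1) are multiplied together.
    """
    for name, value in component_availability.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"availability for {name} must be between 0 and 1")
    return prod(component_availability.values())


def journey_success_rate(attempted: int, completed: int) -> float:
    """User-centred measure: share of attempted user journeys that completed."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    if not 0 <= completed <= attempted:
        raise ValueError("completed must be between 0 and attempted")
    return completed / attempted


if __name__ == "__main__":
    # Hypothetical figures: three components, each reporting 99.9% uptime.
    components = {"web_frontend": 0.999, "payments_api": 0.999, "identity_service": 0.999}
    print(f"Service availability: {serial_availability(components):.4%}")
    print(f"Journey success rate: {journey_success_rate(1000, 950):.1%}")
```

Under these assumptions, every component can meet its own target while the service as a whole falls short of it, and the journey success rate can be lower still if users are failing for reasons no uptime monitor records.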

Findings for boards

The key recommendations and recurring themes for boards and directors emphasise the need for enhanced accountability, deeper engagement with IT systems and more robust governance over technological decisions. Boards must accept greater responsibility for IT-related risks, moving beyond traditional financial oversight to ensure the resilience and security of digital infrastructure. This includes fostering a better understanding of the risks that come with IT systems, especially in terms of service continuity and potential disruptions.

Panellists felt that board members should ensure they are well informed about IT risks and require clear communication from CIOs and technical leaders to help them grasp the implications of new technologies or system changes.

A recurring theme was the need for boards to engage more actively with their organisations' technical architecture. This involves asking the right questions about systems' resilience and security and not relying solely on the IT department for these insights.

Training and certification are also critical. Boards should be encouraged to ensure that IT professionals, including system architects, are appropriately certified, similarly to how other professions require formal qualifications.

This also applies to board members, who should receive training to better understand the technical dimensions of their oversight.

The recurring message is that boards must take a proactive, informed and strategic role in managing IT risks to ensure organisational resilience.

Summing up

The CIO Network event ended with expert guests breaking into smaller groups and discussing their key takeaways.

Looking specifically at the CrowdStrike incident, experts felt that it was unlikely that this specific software failure would change attitudes and approaches to risk. That’s because organisations expect critical software systems to fail or to be unavailable — it’s a fact of life.

Guests recommended running cyber attack and cyber incident simulations to engage boards and help members of governance committees understand software-based risks. The NCSC’s Exercise in a Box was recommended as a good starting point. Such exercises underline that avoiding and recovering from cyber incidents is a company-wide responsibility, not one that sits with the IT department alone. Indeed, while CIOs lead these initiatives, sponsorship must be sought at the most senior level of the organisation. The understanding and ownership of risk cannot reside with IT alone; accountability must be reinforced at all levels.

Finally, the discussion addressed the balance between digital transformation and maintaining traditional processes, emphasising the need for careful evaluation to prevent operational gaps.

What is the CIO Network?

Vibrant, animated and practical, the CIO Network aims to provide leaders with a space to form ideas and ultimately shape tomorrow’s technology agenda.

Want to connect with like-minded CIOs and discuss your challenges and triumphs? Find out how you can join the next event by contacting us. Please note the CIO Network is non-commercial.

AI was used to transcribe conversations and summarise themes. The article was written and edited by BCS editors.

Image credits: BCS