AI Triage Cuts Vulnerability Volume by 20% and Removes 91 Critical Flaws
Industry: Defence, National Security, Cyber Security
Client
Defence Digital, Ministry of Defence (UK)
Goal
To design and implement a human-machine teaming model for vulnerability triage that would accelerate remediation across the client's estate while preserving full human accountability, decision traceability, and operational control in a high-risk national security environment.
Challenges
- No escalation architecture existed for low-confidence outputs or high-impact decisions, creating a risk of AI substitution rather than AI assistance.
- A fast-growing volume of vulnerabilities, complex MOD systems, and an unsustainable ‘patch everything’ approach made prioritisation slow, inconsistent, and operationally risky.
- Decision traceability was absent, leaving no audit trail for prioritisation, exception handling, approvals, or oversight in a regulated operational environment.
- AI-assisted triage introduced a governance risk: automation could accelerate recommendations faster than the organisation could define accountability for critical decisions.
Solution
- Designed the Vulnerability Management Support Team (VMST) and a human-machine teaming model that used AI to rank, cluster, summarise, and flag anomalies, while preserving human authority for prioritisation, exceptions, trade-offs, and sign-off.
- Applied a ‘Reliable, Resilient, Responsible’ AI governance framework so the system delivered faster, more consistent results under pressure while preserving control, assurance, and leadership confidence.
- Introduced an adaptable risk-scoring methodology, V-Score, supported by peer-reviewed research and benchmarked to outperform the CVSS industry standard. The approach was published in the Journal of Cyber Security (July 2025) as ‘An Open and Adaptable Approach to Vulnerability Risk Scoring’, proposing a structured, evidence-based method that can be adapted to organisational context.
- Redesigned the triage and exception workflow to produce a complete decision trail covering rationale, inputs, approvals, and timing, creating an audit-ready assurance record.
- Implemented confidence cues, escalation triggers, and mandatory human-in-the-loop checkpoints so that low-confidence or high-risk cases required human review before action.
- Established a formal division of labour between AI and human analysts, making the AI recommendation-only by design, with humans remaining the accountable decision-makers for every consequential action.
Impact
- Achieved a 20% reduction in total vulnerability volume, with more remediations completed in three months than in the previous two years combined.
- Removed 91 critical vulnerabilities from designated MOD systems while maintaining full human accountability for every decision.
- Cleared 100% of historical vulnerabilities across 13 MOD systems without losing operational control or decision traceability.
- Improved consistency across teams and shifts by replacing fragmented manual judgement with a governed, repeatable triage model.
- The methodology was peer-reviewed and published in the Journal of Cyber Security, providing both scientific and operational validation.
Context
A UK national defence digital organisation within the government’s defence department faced an urgent cyber risk problem across its complex operational systems. Operating in a defence, national security and cyber security context, the organisation required a solution to accelerate vulnerability remediation across distributed estates while preserving full human accountability, decision traceability, and operational control in a high-risk environment. The objective was to design and implement a human-machine teaming model for vulnerability triage that would act as assistance rather than substitution, enabling faster, safer remediation across critical systems.
Challenges
The organisation was contending with a fast-growing volume of vulnerabilities across complex Ministry systems and an unsustainable “patch everything” posture that made prioritisation slow, inconsistent and operationally risky. Triage relied on fragmented manual judgement with no reliable escalation architecture for low-confidence AI outputs or high-impact decisions, creating a material risk that automation could supplant human oversight. Introducing AI-assisted triage added governance risk: recommendations could be produced faster than accountability for consequential decisions had been defined, and decision traceability was effectively absent — leaving no audit trail for prioritisation, exceptions, approvals or oversight in a regulated operational environment.
Implementation
A Vulnerability Management Support Team (VMST) and a human-machine teaming model were designed and deployed to address these gaps. The solution used AI to rank, cluster, summarise and flag anomalies, but was intentionally recommendation-only: the formal division of labour gave AI the role of insight generation, while humans retained authority for prioritisation, trade-offs, exceptions and sign-off. Confidence cues, escalation triggers and mandatory human-in-the-loop checkpoints were implemented so that low-confidence or high-risk cases required human review before any action. The triage and exception workflow was redesigned to produce a complete decision trail capturing rationale, inputs, approvals and timing, creating an audit-ready assurance record that withstands regulatory and operational scrutiny.
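The checkpoint logic described above can be sketched in outline. The following Python is illustrative only: the threshold values, field names, and the `TriageRecommendation` and `DecisionRecord` types are assumptions for this sketch, not the VMST's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative thresholds -- not the programme's actual values.
CONFIDENCE_FLOOR = 0.80   # below this, the AI output is flagged low-confidence
HIGH_RISK_SCORE = 7.0     # at or above this, human review is mandatory

@dataclass
class TriageRecommendation:
    """AI output: recommendation-only by design, never an action."""
    vuln_id: str
    suggested_priority: str   # e.g. "P1".."P4"
    confidence: float         # model confidence in [0, 1]
    risk_score: float         # e.g. a V-Score-style value on a 0-10 scale

@dataclass
class DecisionRecord:
    """Audit-ready decision trail: inputs, rationale, approval, timing."""
    vuln_id: str
    ai_recommendation: TriageRecommendation
    decision: str
    rationale: str
    approved_by: str          # a named human is accountable for every action
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def requires_human_review(rec: TriageRecommendation) -> bool:
    """Escalation trigger: low confidence OR high impact forces a checkpoint."""
    return rec.confidence < CONFIDENCE_FLOOR or rec.risk_score >= HIGH_RISK_SCORE

def sign_off(rec: TriageRecommendation, decision: str,
             rationale: str, analyst: str) -> DecisionRecord:
    """Human sign-off is the only path from recommendation to action,
    and it always leaves a complete record behind."""
    return DecisionRecord(rec.vuln_id, rec, decision, rationale, analyst)
```

In this sketch the AI can never act directly: `sign_off` is the only way a `DecisionRecord` is produced, so every consequential action carries a named approver, a rationale, the AI inputs that informed it, and a timestamp.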
An adaptable risk-scoring methodology, V-Score, was introduced to replace static scoring models. V-Score is a structured, evidence-based approach that can be calibrated to organisational context; its design and benchmarking showed measurable performance improvements over the CVSS industry standard, and the methodology was peer-reviewed and published in the Journal of Cyber Security (July 2025). The implementation was governed under a Reliable, Resilient, Responsible AI model so that the system improved speed and consistency under operational pressure without compromising control, assurance or leadership confidence. The fractional Head of AI led the integration with existing workflows, training for analyst teams, and the governance frameworks that ensured humans remained the accountable decision-makers for every consequential action.
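The published V-Score formula is not reproduced here; the sketch below shows only the general shape of an adaptable score, in which a static CVSS base value is blended with organisation-specific context under tunable weights. Every factor name and default weight is a hypothetical choice for illustration, not a value from the paper.

```python
# Hypothetical adaptable scoring sketch -- not the published V-Score formula.
DEFAULT_WEIGHTS = {
    "severity": 0.40,      # technical severity (from the CVSS base score)
    "exploit": 0.20,       # known exploit availability
    "criticality": 0.25,   # importance of the affected asset to operations
    "exposure": 0.15,      # reachability of the asset (internet-facing, etc.)
}

def adaptable_risk_score(cvss_base, exploit_available,
                         asset_criticality, exposure, weights=None):
    """Blend CVSS severity with organisational context; the weights are
    tunable so the score can be calibrated to a given estate."""
    w = weights or DEFAULT_WEIGHTS
    factors = {
        "severity": cvss_base / 10.0,                # normalise CVSS 0-10 to 0-1
        "exploit": 1.0 if exploit_available else 0.0,
        "criticality": asset_criticality,            # assumed already in [0, 1]
        "exposure": exposure,                        # assumed already in [0, 1]
    }
    score = sum(w[k] * factors[k] for k in w)
    return round(10.0 * score, 1)                    # back onto a 0-10 scale
```

The value of adaptability shows up in the weights: under a scheme like this, the same CVSS 9.8 finding scores very differently on a mission-critical, internet-facing system than on an isolated low-value host, which is exactly the kind of context a fixed industry-standard score cannot express.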
Results
The human-machine teaming model delivered rapid, measurable impact. Total vulnerability volume fell by 20%, and remediation throughput accelerated dramatically: more remediations were completed in the three months after deployment than in the preceding two years combined. Ninety-one critical vulnerabilities were removed from designated systems while preserving full human accountability for every decision. Historical technical debt was cleared: 100% of historical vulnerabilities across 13 designated systems were remediated without loss of operational control or decision traceability.

Consistency across teams and shifts improved as governed, repeatable triage replaced fragmented manual judgement, reducing operational risk and improving audit readiness. The V-Score methodology achieved validated performance gains against CVSS and, through peer review and publication, provided both scientific and operational validation of an adaptable, evidence-based approach to vulnerability risk scoring. Overall, the programme demonstrated that AI can materially accelerate cyber remediation in national security environments when paired with robust governance, a clear division of labour and mandatory human oversight.
*Case studies reflect work undertaken by our Heads of AI either during their tenure with Head of AI or in prior roles before they were part of the Head of AI network; they are provided for illustrative purposes only and are based on conversations with our Heads of AI.