Annual Report 2025

Virgin Money's GenAI testing transformation

Our role

GenAI testing and assurance

Our deliverable

Innovative testing technology and framework developed

The outcome

Banking productivity enhanced, risks mitigated

Virgin Money and PwC partnered to pioneer GenAI testing technology and frameworks - building trust and boosting productivity in banking through innovative technology solutions.

Context and challenge

In the banking sector, adopting nascent technology such as GenAI is challenging because of regulatory concerns and risk management requirements.

Virgin Money aimed to introduce a GenAI productivity tool, embedding it directly across the organisation, enhancing efficiency by enabling staff to focus on tasks requiring human judgement instead of repetitive activities.

Despite initial scepticism about testing GenAI, PwC and Virgin Money believed it was possible to test the once untestable.

Generative AI at Virgin Money evolves by learning our data, sharpening decision accuracy and outcomes, while our human-in-the-loop remains essential for creative, value-add tasks.

Innovative testing solutions

We worked with Virgin Money in a strategic partnership, developing pioneering testing techniques to build a picture of how the GenAI productivity tool behaved in Virgin Money’s environment, so that its benefits could be harnessed while its risks were understood and managed.

Together, we developed a comprehensive testing framework and harnessed 'LLM As-a-Judge' technology, which effectively used AI to evaluate AI. This was complemented by statistical methods and human expertise.
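As an illustration only, the idea of using one model to judge another's outputs, with human review as a cross-check, can be sketched as below. This is a minimal sketch under stated assumptions, not the framework built for Virgin Money: the `Case` structure, the `keyword_judge` stand-in (a simple text check where a real set-up would call a judge model) and the `evaluate` helper are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    """One test case: a prompt, the assistant's output, and a trusted reference."""
    prompt: str
    output: str
    reference: str


# A judge maps a case to a pass/fail verdict. In practice this would wrap a
# call to a judge model; here it is a plain function so the sketch runs offline.
Judge = Callable[[Case], bool]


def keyword_judge(case: Case) -> bool:
    """Hypothetical stand-in judge: pass if the output contains the reference text."""
    return case.reference.lower() in case.output.lower()


def evaluate(cases: List[Case], judge: Judge,
             human_labels: Optional[List[bool]] = None) -> dict:
    """Run the judge over all cases and, if human labels are supplied,
    report how often the judge agrees with the human reviewers."""
    if not cases:
        raise ValueError("at least one case is required")
    verdicts = [judge(c) for c in cases]
    pass_rate = sum(verdicts) / len(verdicts)
    agreement = None
    if human_labels is not None:
        if len(human_labels) != len(cases):
            raise ValueError("one human label is required per case")
        matches = sum(v == h for v, h in zip(verdicts, human_labels))
        agreement = matches / len(verdicts)
    return {"verdicts": verdicts, "pass_rate": pass_rate, "agreement": agreement}
```

The agreement figure is the useful design point: a judge is only trusted to score outputs at scale once its verdicts have been compared against a sample labelled by human experts.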

Over five weeks, the robust testing framework addressed errors, hallucinations, and biases, ensuring the tool met regulatory standards and Virgin Money's risk tolerance.

Data scientists, technologists, AI experts, and prompt, model risk and model validation experts ran live tests to build a full picture of errors, hallucinations, bias and toxicity.

Impact and empowerment

Working with Virgin Money’s model validation team, we rigorously tested 1,464 AI assistant outputs across various productivity tools and identified risks in AI accuracy, robustness, and bias. Implementing best-practice prompting boosted the AI assistant's accuracy from 40% to 85%.
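One way statistical methods can support an accuracy comparison like this is to put a confidence interval around each measured rate. The sketch below uses the standard Wilson score interval. It is an assumption-laden illustration: the report does not say how many of the 1,464 outputs were scored under each prompting approach, so the sample size of 100 per condition is hypothetical, and the `wilson_interval` helper is a name introduced here.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts chosen to match the reported 40% and 85% accuracy rates;
# the real per-condition sample sizes are not given in the report.
before = wilson_interval(40, 100)
after = wilson_interval(85, 100)

print(f"before: {before[0]:.3f} to {before[1]:.3f}")
print(f"after:  {after[0]:.3f} to {after[1]:.3f}")
```

If the two intervals do not overlap, as with these hypothetical counts, the improvement is unlikely to be an artefact of sampling noise.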

The results, shared with Virgin Money's AI Council, set the stage for a confidently expanded GenAI deployment. This pioneering effort democratised technology use, highlighted productivity improvements, and strengthened organisational confidence in GenAI, paving the way for an AI-enabled future.

Video

Virgin Money interview

Virgin Money CMRO Nidhi Agarwal and PwC Partner Chris Heys discuss the project approach.

© 2015 - 2025 PwC. All rights reserved. PwC refers to the PwC network and/or one or more of its member firms, each of which is a separate legal entity. Please see how we are structured for further details.