$25 million and a video call: what the Arup deepfake scam changed

In January 2024, a finance worker at the Hong Kong office of Arup, the British multinational engineering consultancy responsible for the Sydney Opera House, Crossrail, and the Beijing National Stadium, received an email that appeared to come from the firm’s UK-based Chief Financial Officer, asking for an urgent and confidential transaction. The employee rightly suspected phishing. The senders followed up by inviting them to a video conference.

On the call were the CFO and several other senior Arup colleagues. They looked and sounded exactly as the finance worker expected. The CFO authorised the transfer, the other executives confirmed it, and the finance worker, reassured by the live presence of multiple recognised figures, completed fifteen separate transfers totalling HK$200 million (approximately US$25.6 million) across five Hong Kong bank accounts. The fraud was only discovered when the employee later followed up directly with Arup headquarters.

Every participant on that video call, other than the employee themselves, was a deepfake.

Hong Kong police reported the incident in February 2024. Arup confirmed its identity as the victim in May 2024, in a statement that included a phrase worth pausing on. The firm’s Chief Information Officer described the attack as “technology-enhanced social engineering”. No systems were compromised. No data was breached. The control that failed was not technical. It was procedural and psychological.

For organisations approaching ISO 27001 certification, NIS2 obligations, or simply trying to harden their authorisation processes, the Arup case is the clearest single example yet of how the threat environment has changed. It is worth understanding in detail, because the controls that would have prevented it are not exotic. They are unfashionable.

What is actually new about this attack

Business email compromise (BEC) and CEO fraud have been with us for years. The FBI’s Internet Crime Complaint Center has tracked them as a multi-billion-dollar category since at least 2016. The textbook version is straightforward. An email purporting to be from a senior executive instructs a finance employee to make an urgent, confidential transfer. Detection relies on the employee being trained to recognise the patterns, and on procedural controls that require independent verification before payment.

The Arup case is the same attack type with a critical upgrade. Three things have changed in ways that materially weaken existing defences.

First, the verification channel has been compromised. The classic BEC control is simple: if you receive a payment instruction by email, verify it by another channel. Most guidance advises calling the requester back, verifying in person, or checking with an independent senior approver. In Arup’s case the other channel was the trick. A video conference was the verification, and it was fake. Every visual and auditory signal the finance worker relied on to confirm authenticity was synthesised in real time.

Second, the production cost has collapsed. Deepfake video of a single person used to require days of compute and significant skill. Real-time, multi-participant deepfake video calls are now achievable on commodity hardware with off-the-shelf software. The market for “deepfake-as-a-service” has matured to the point where the technical barrier to this kind of attack sits in the same range as running a basic phishing campaign in 2018. The Arup attack required no breakthrough capability. Someone took the existing capability and applied it to a high-value target.

Third, the social engineering quality is higher. The classic phishing email has telltales: stilted language, mismatched email addresses, unusual urgency, requests for unusual procedures. A live video call with multiple senior figures collapses those signals in front of the target. The employee in the Arup case was not credulous. They were sceptical of the initial email and demanded a stronger signal. The attackers provided exactly that signal, and the signal turned out to be worthless.

Why “I saw and heard them” no longer works

The deeper implication of the Arup case is that one of the most fundamental human controls in financial authorisation, live verification of a counterparty by sight and sound, has lost its evidential value in any channel that can be remotely fabricated.

For most of human history, the assumption that a person you can see and hear is the person they appear to be has been a reliable default. The exceptions (theatre, impersonators, telephone fraud against the elderly) have been narrow enough that the default has continued to work in most professional contexts. Over the past three to four years, that default has stopped being safe in any digital channel.

The implications cascade quickly. Many financial authorisation processes, particularly those designed for distributed organisations operating across time zones, rely on senior approvers visually or aurally confirming instructions during a brief call. Many vendor onboarding processes include a video verification step intended to confirm the supplier exists and is who they claim to be. Many incident response procedures begin with leadership convening on a call to coordinate response. Each is potentially vulnerable to the same vector that compromised Arup.

This does not mean video calls should be abandoned, or that visual recognition is worthless. But no remote audio-visual channel, taken alone, can support a high-value authorisation. The control architecture has to assume the channel may be fake.

What controls actually work

The controls that would have prevented the Arup attack are not new. They are the same controls that have prevented BEC fraud for years. What is new is that they need to be enforced more strictly than they used to be, and re-evaluated against a threat model that includes live deepfake video.

Channel separation in payment authorisation. Any instruction received in channel A must be verified through channel B, where channel B cannot be controlled by whoever controls channel A. Acceptable second channels include a phone call initiated by the verifier to a known fixed number (not an inbound call from the requester, and not a number provided during the conversation), a face-to-face meeting where the requester is physically present, or a SWIFT or banking system confirmation that requires a separately issued credential. In the post-Arup environment, a video conference is not a sufficient secondary channel. It can be both ends of the attack.
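The channel rule can be written down as a small check. The following is a minimal Python sketch, not a reference implementation. The channel labels, the ACCEPTED_SECOND_CHANNELS set and the is_independent function are all hypothetical names invented here to illustrate the rule described above.

```python
# Hypothetical channel labels for this sketch. Only channels the verifier
# controls end to end are accepted as a second channel.
ACCEPTED_SECOND_CHANNELS = {
    "outbound_call_to_directory_number",
    "in_person",
    "separate_bank_credential",
}


def is_independent(
    request_channel: str,
    verification_channel: str,
    details_supplied_by_requester: bool,
) -> bool:
    """Return True only if the requester could not have steered the check."""
    # Verifying in the same channel as the request proves nothing.
    if verification_channel == request_channel:
        return False
    # A number, link or invite handed over by the requester is attacker-controlled.
    if details_supplied_by_requester:
        return False
    # Video calls, inbound calls and chat are deliberately absent from the set.
    return verification_channel in ACCEPTED_SECOND_CHANNELS
```

Under these assumptions, an email request followed by a video conference fails the check, while an email request followed by a call dialled out to a directory number passes.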

Mandatory dual authorisation for transactions above defined thresholds. Two named people, each independently verifying, each with the authority to refuse, and no single person able to compress the process under “urgency”. The threshold should be low enough that the canonical case (executive impersonation for a high-value transfer) is structurally caught.
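As an illustration only, the dual authorisation rule might be modelled as below. The threshold value, the Payment class and the function names are assumptions made for this sketch and do not describe any particular payment system.

```python
from dataclasses import dataclass, field

# Hypothetical threshold, in the payment currency. Set it per your own policy.
DUAL_AUTH_THRESHOLD = 10_000


@dataclass
class Payment:
    amount: float
    requester: str
    approvals: set[str] = field(default_factory=set)
    refusals: set[str] = field(default_factory=set)


def approve(payment: Payment, approver: str) -> None:
    # The requester can never count as one of the approvers.
    if approver == payment.requester:
        raise ValueError("requester cannot approve their own payment")
    payment.approvals.add(approver)


def refuse(payment: Payment, approver: str) -> None:
    payment.refusals.add(approver)


def may_release(payment: Payment, urgent: bool = False) -> bool:
    # The urgent flag is accepted but deliberately ignored:
    # urgency never reduces the number of approvers required.
    if payment.refusals:
        return False  # any named approver can block the payment
    required = 2 if payment.amount >= DUAL_AUTH_THRESHOLD else 1
    return len(payment.approvals) >= required
```

Storing approvals as a set means the same person approving twice still counts once, so a single pressured employee cannot satisfy the rule alone.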

Bank-side authorisation controls. Most commercial banks now offer payment system controls: payee account whitelists, dual authorisation at the banking layer, mandatory cooling-off periods for new payees, and call-back verification by the bank’s own fraud team for unusual transactions. These controls catch fraud outside the organisation’s own process and add a meaningful layer of defence. Many organisations have these features available but do not enable them.
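To make two of these controls concrete, here is a hedged Python sketch of a payee allowlist combined with a cooling-off period. It models the logic only. The 24-hour window, the screen_payee function and its return values are assumptions, and real banking platforms expose these features through their own configuration rather than through code like this.

```python
from datetime import datetime, timedelta

# Hypothetical cooling-off window for newly added payees.
COOLING_OFF = timedelta(hours=24)


def screen_payee(account: str, allowlist: dict[str, datetime], now: datetime) -> str:
    """Decide whether a payment to `account` may proceed.

    `allowlist` maps each approved payee account to the time it was added.
    """
    added_at = allowlist.get(account)
    if added_at is None:
        return "hold_unknown_payee"
    if now - added_at < COOLING_OFF:
        return "hold_cooling_off"
    return "release"
```

Under this model, a payment to an account that is not on the list, or that was added within the cooling-off window, is held for review rather than released.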

Culture: “pause and escalate” rather than “act quickly”. Almost every BEC attack relies on creating urgency that defeats normal authorisation discipline. A finance culture that explicitly rewards employees who pause suspicious instructions, and that is led from the top in declining “emergency” payment requests, is one of the strongest defences against social engineering. The Arup employee was initially suspicious, then was persuaded out of that suspicion. A culture that supports staying suspicious is the meta-control.

Out-of-band identity verification for high-trust interactions. For the smaller number of interactions where remote verification is unavoidable, stronger identity assurance is starting to be achievable. Options include cryptographic signing of meeting invites, pre-shared verification phrases known only to legitimate participants, and MFA on the call platform itself rather than just the calendar invite. None of these is a panacea, but they raise the cost of impersonation.
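One possible way to harden a pre-shared phrase is to never speak it aloud and to use it as the key in a challenge and response instead. The sketch below uses Python’s standard hmac, hashlib and secrets modules. The challenge-response design is an extension assumed here, not something described in the Arup reporting, and the function names are hypothetical. It also assumes the secret was exchanged beforehand through a trusted channel, such as in person.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> str:
    """Verifier side: generate a fresh random challenge for each call."""
    return secrets.token_hex(16)


def respond(shared_secret: bytes, challenge: str) -> str:
    """Participant side: prove knowledge of the secret without revealing it."""
    return hmac.new(shared_secret, challenge.encode(), hashlib.sha256).hexdigest()


def verify(shared_secret: bytes, challenge: str, response: str) -> bool:
    """Verifier side: recompute the expected answer and compare in constant time."""
    expected = respond(shared_secret, challenge)
    return hmac.compare_digest(expected, response)
```

Because each challenge is random, a response recorded on one call should not be reusable on the next. This raises the cost of impersonation but does not remove the need for the channel separation and dual authorisation controls above.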

Visible attention to deepfake risk in awareness programmes. Most security awareness training still treats phishing as a text-and-link problem. Deepfake-aware training, covering live video manipulation, audio cloning, and the new shape of impersonation, is increasingly necessary. The Arup case is a study almost designed for inclusion in such material.

What standards and frameworks expect

The Arup case has accelerated a shift that was already visible in compliance frameworks.

ISO 27001 does not address deepfakes by name, but multiple Annex A controls bear directly on the attack pattern. A.5.36 (Compliance with policies, rules and standards) and A.5.37 (Documented operating procedures) cover the payment authorisation process. A.6.3 (Information security awareness, education and training) covers the human factor control. A.8.5 (Secure authentication) covers identity assurance. An audit conducted now is increasingly likely to probe whether payment authorisation procedures account for the inability to trust remote video.

NIS2’s Article 21 lists “human resources security” and “basic cyber hygiene practices and cybersecurity training” among its ten minimum measures. Competent authorities are starting to expect training programmes that include modern attack vectors, deepfakes among them, rather than treating awareness as a once-a-year poster campaign.

AI governance as a discipline now needs to account for AI as an attacker capability, not just as an internal tool to govern. A serious AI governance programme considers both how we use AI safely and how we defend against AI being used against us. The Arup attack is canonical case material for the second question.

Financial sector regulations in Ireland and across the EU are tightening controls around payment authorisation independently of any cyber framework. The European Banking Authority and national supervisors have issued repeated guidance on authentication and authorisation strength following the rise of authorised push payment fraud. Boards in regulated financial services should assume this trajectory continues.

A practical exercise

For anyone who has not done this recently, here is a useful exercise. Pick the next three payment instructions over a defined threshold in your organisation, say anything above €100,000. For each, work backwards through the actual control chain that would have stopped an Arup-style attack.

Were the instructions authenticated through a channel that could not have been controlled by the same attacker who controlled the request channel? Was the verification call dialled out from a known directory number, or did it arrive as an inbound call or a video invitation provided during the conversation? Would the bank’s own fraud system have flagged the destination as unusual? Is the threshold for dual authorisation set low enough that this transaction could not have been processed by a single person under social engineering pressure?
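If it helps to record the answers consistently, the four questions can be captured in a small structure. This is a minimal sketch. The PaymentReview class, its field names and the weak_links function are hypothetical and simply mirror the questions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentReview:
    reference: str
    independent_channel: bool   # verified outside the attacker's reach?
    callback_dialled_out: bool  # call placed to a known directory number?
    bank_would_flag: bool       # bank fraud system would flag the destination?
    dual_auth_applied: bool     # threshold low enough to require two approvers?


def weak_links(review: PaymentReview) -> list[str]:
    """Return the names of the controls that did not hold for this payment."""
    checks = {
        "independent_channel": review.independent_channel,
        "callback_dialled_out": review.callback_dialled_out,
        "bank_would_flag": review.bank_would_flag,
        "dual_auth_applied": review.dual_auth_applied,
    }
    return [name for name, passed in checks.items() if not passed]
```

An empty list means every control in the chain held for that payment. Anything else names the links to fix.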

If the chain has any weak link, fix it before something tests it for you. US$25.6 million across fifteen transactions to five accounts in one short window is not as exceptional as it sounds. It is a model the threat market has industrialised. The next round of organisations to learn this lesson will be the ones whose controls did not catch up in time.

Closing thoughts

The most important phrase in Arup’s response is the one their CIO chose carefully: “technology-enhanced social engineering”. The phrasing locates the attack correctly. This was not a failure of network security, identity infrastructure, or data protection. It was a failure of authorisation process under a stimulus that traditional process design did not anticipate.

That is exactly the kind of failure that compliance frameworks, security programmes, and resilience work are designed to address. The technical capability behind deepfakes will continue improving. The cost will continue falling. The number of incidents will continue increasing. None of that is preventable at the level of any individual organisation.

What is preventable, and what every serious organisation should be addressing now, is the gap between the controls that were designed for a world where you could trust a video call, and the controls required for a world where you cannot.