Facial vs Voice Biometrics – What’s best?

Nov 22

Written By Auraya

AI Voice Spoofing: A Global Concern

It was only in February this year that journalists in Australia and the USA used AI-generated voice spoofing techniques to access accounts, notably at Services Australia (the largest Australian government department), Lloyds Bank in the UK, and several financial institutions in the USA. The journalists’ successful spoof attacks attracted a lot of media attention at the time, although little actual fraud was occurring.

Voice Spoofing and Facial Recognition Debates

Fast forward to now, and AI-enabled voice spoofing has become the “cause célèbre” for many in the facial recognition industry. It could, of course, be a simple deflection from the negative publicity focussed on facial biometrics: the availability of fake documents combined with image manipulation, deepfake video (who hasn’t seen an AI version of Elon Musk selling cryptocurrencies or, for those in the UK, consumer advocate Martin Lewis promoting scam trading platforms?), and of course the ongoing debate on data privacy.

So, is customer data protected by voice biometric systems vulnerable to attack by fraudsters?

The first thing we note is that all security systems, including voice biometrics, can be defeated if sufficient effort is made by the attackers. Good security practice adopts a layered approach so that consumer convenience is balanced against the risk involved.

EVA’s Comprehensive Approach to Counter Voice Spoofing

In well-architected systems like Auraya’s EVA Voice Biometric solution, the threat of voice spoofing is considered and countermeasures are in place to reduce the risk of spoofed voices accessing an account. EVA’s layered security process includes the ability to check whether the device being used by the person seeking access is trusted. When the voice of the person seeking access is sampled by the voice biometric system, the system checks whether it matches the authorised user to a sufficient level of probability. EVA’s voice AI also checks whether the voice shows any indications that it has been created by a synthetic voice generator. A user can also be asked to say a one-time passcode that is sent to the trusted device, adding a knowledge-based element to the identity verification process and making it even more difficult for fraudsters who attempt to use a recording or a synthetic voice.
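
The layered checks described above can be sketched in a few lines of Python. This is a minimal illustration only, not Auraya’s implementation or the EVA API: the class, the field names, the score ranges and the thresholds are all assumptions made for the example.

```python
import hmac
from dataclasses import dataclass


@dataclass
class VerificationAttempt:
    """Signals gathered for one access attempt (all fields are hypothetical)."""
    device_trusted: bool      # is the device being used already trusted?
    match_score: float        # 0..1 similarity to the enrolled voiceprint
    synthetic_score: float    # 0..1 likelihood the audio is machine-generated
    spoken_passcode: str      # digits the caller said aloud
    expected_passcode: str    # one-time passcode sent to the trusted device


# Illustrative thresholds only; real values depend on the risk being protected.
MATCH_THRESHOLD = 0.90
SYNTHETIC_THRESHOLD = 0.50


def verify(attempt: VerificationAttempt) -> tuple[bool, str]:
    """Run each layer in turn and stop at the first one that fails."""
    if not attempt.device_trusted:
        return False, "untrusted device"
    if attempt.match_score < MATCH_THRESHOLD:
        return False, "voice does not match enrolled user"
    if attempt.synthetic_score >= SYNTHETIC_THRESHOLD:
        return False, "voice appears synthetic"
    # Constant-time comparison of the spoken and expected passcodes.
    if not hmac.compare_digest(attempt.spoken_passcode, attempt.expected_passcode):
        return False, "passcode mismatch"
    return True, "verified"


# A cloned voice that matches well but is flagged as likely synthetic.
cloned = VerificationAttempt(True, 0.95, 0.80, "493021", "493021")
print(verify(cloned))
```

In this sketch the cloned voice passes the match check but is rejected at the synthetic-voice layer, which is the point of layering: no single signal has to be perfect.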

The EVA voice biometric system can be configured to continuously sample the person’s voice throughout the entire conversation, making sure that the authorised person is present throughout the entire process. This continuous monitoring is designed to thwart even sophisticated spoofing attempts.

It should be noted that spoofing a victim’s voice requires an original voice to copy from. Although the amount of audio required to create a fake is dropping from the 5+ minutes of continuous recording once needed, a “good” fake still needs up to 60 seconds of high-quality genuine voice.

However, the voice vs face debate is only a small part of the war on fraud. Devices and SIM cards are regularly cloned, AI models are cracking passwords, ChatGPT is enabling higher quality phishing emails (all those Nigerian princes who want us to receive a suitcase of money are much better at grammar now), and you can buy just about any personal data on the dark web for less than a chai latte.

Beyond Voice vs. Face: Multi-modal

The best answer is layering identity characteristics. Call it second factor, multimodal or multi-factor authentication, you choose, but the old mantra still applies in proving identity: provide a mix of something you know (such as a one-time passcode), something you have (like your smartphone), and something you are (a biometric or combination of biometrics).

No identity characteristic is perfect, but we do know that knowledge-based factors like passwords and answers to secret questions are too open to data theft to be useful security signals. We also know that device identities can be cloned. Biometrics are more complicated to fake but, as we’ve discussed here, they are not foolproof.

The answer is to combine the best of these characteristics with a clear eye on user experience. Here it is worth considering voice as part of your identity and fraud journey. It is independent of a device, it’s easy to use, adaptable to context, conversational to reduce friction, and as someone at BT once said, it’s good to talk…

To know more, visit aurayasystems.com.