Which method of attacking a voice biometric security system is the most dangerous?
Successful attacks usually rely on recording the passphrase with a device while the user logs in successfully. The equivalent attack on a standard PIN would be filming the user as they type, with an external or built-in camera, or with a camera that captures reflections of the digits from the screen or from the user's eyes. The solution to this problem is a playback detection mechanism.
What is the protection against a playback attack?
For a standard PIN, such an attack is almost impossible to prevent. VoicePIN adds another protection layer: PAD (Playback Activity Detector). PAD can detect whether a verification attempt is being made with a previously captured recording.
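The layering described above can be pictured as two independent checks that must both pass. The following is a minimal sketch only, not VoicePIN's actual implementation: the function name, the score inputs, and both threshold values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class VerificationResult:
    speaker_match: bool        # did the voice match the enrolled speaker?
    playback_suspected: bool   # did the playback check flag the sample?

    @property
    def accepted(self) -> bool:
        # Access is granted only when the voice matches AND no playback is suspected.
        return self.speaker_match and not self.playback_suspected


def verify(speaker_score: float, playback_score: float,
           speaker_threshold: float = 0.8,
           playback_threshold: float = 0.5) -> VerificationResult:
    """Combine a speaker-match score with a playback score.

    Both scores and both thresholds are hypothetical placeholders.
    """
    return VerificationResult(
        speaker_match=speaker_score >= speaker_threshold,
        playback_suspected=playback_score >= playback_threshold,
    )
```

In this sketch a high-quality recording of the right voice would pass the first check but still be rejected if the second check flags it.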
How efficient is this additional protection layer?
The estimated effectiveness of PAD against playback attacks reaches 90%. For PINs the figure is 0%, as there is no mechanism that could detect a replayed PIN. For long passphrases, behavioral analysis can be used, but its accuracy is low. For short PINs and passwords, no such procedure exists.
If PAD malfunctions (or is not used), does voice biometrics still ensure sufficient security?
Voice biometrics alone, without PAD, is sufficient protection against playback attacks that use recordings captured under a table, behind a solid surface, or from more than 60-80 cm away from the speaker. PAD protects against high-quality recordings, including recordings captured directly from a mobile phone or GSM channel and digitally mastered recordings. It does so by detecting the statistical anomalies that appear in a sample when a human voice is manually tuned.
The system is designed to cope with a moderate sore throat, but in severe cases we recommend using another authentication method. Just as a fingerprint reader fails when a fingertip is damaged or deformed, voice biometrics will not work if the voice is abnormally altered. Note that the system registers a much wider range of alteration than the human ear perceives, because its statistical analysis takes many more aspects of the voice into account.
In the first case, PAD (Playback Activity Detector) will be triggered. Both samples, the incoming one and the previously used one, are registered and compared by the system. Even digital or manual tuning of such a recording will not help, because PAD detects the unique acoustic traits of the original sample.
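The idea of comparing an incoming sample with previously used ones can be illustrated with a short sketch. It swaps in a deliberately simple technique, cosine similarity between feature vectors, and is not the method PAD actually uses. The class name, the threshold, and the assumption that each sample has already been reduced to a numeric feature vector are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ReplayCheck:
    """Flags an incoming sample that is nearly identical to an earlier one.

    Assumption for illustration: two live utterances differ slightly,
    so a near-perfect match suggests a replayed recording.
    """

    def __init__(self, threshold=0.999):  # hypothetical threshold
        self.threshold = threshold
        self.history = []  # feature vectors of previously used samples

    def is_replay(self, features):
        # Compare the incoming sample against every stored sample.
        return any(
            cosine_similarity(features, old) >= self.threshold
            for old in self.history
        )

    def register(self, features):
        # Store a copy so later changes by the caller do not affect history.
        self.history.append(list(features))
```

A caller would check `is_replay` on each incoming sample and `register` it after a successful login, so that the same recording cannot be reused later.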
Why would we allow anyone to record us while logging in? It is much easier to steal someone’s PIN than to approach them with a recorder.
Furthermore, a loudspeaker distorts the recorded sound in a way that the system treats as erroneous, so the verification will not succeed.
Similar voice – kinship
Tests conducted on twins and triplets showed that the system is well tuned to detect the individual characteristics of a voice. The risk of breaking the system with a sibling’s voice is very low and should not be considered a factor that lowers the system's accuracy.