According to CrowdStrike, voice-based attacks increased 442% in 2024. By 2027, DeepStrike predicts that losses from AI-generated fraud, such as voice cloning and deepfakes, will total $40 billion.
Organizations have long treated voice as a necessary exception to their security posture. Even as other digital channels developed layered defenses, voice remained optimized for customer convenience, putting speed and accessibility ahead of security features that introduce friction. That trade-off worked while voice was low-risk, but advances in scamming technology have quickly and quietly raised the risk profile of calls, which are now the preferred channel for high-privilege interactions (75% of consumers prefer speaking with a real human for customer support).
Voice remains both a high-privilege attack surface and a preferred engagement channel, so protecting it will require models designed to handle its scale and ambiguity. Voice-native ELMs like Velma mark that shift: a move away from identity-verification protocols and reactive controls toward real-time understanding, elevating voice security from an exception to a top-tier capability. When such a system detects social engineering, for instance, it may also surface three lines of dialogue showing attempts to circumvent policy, an 83% likelihood that the speaker is using a deepfake, and feigned expressions of urgency.
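To make that concrete, here is a minimal, hypothetical sketch in Python of how corroborating call signals like those above might be represented and combined into an escalation decision. The type names, fields, and thresholds are illustrative assumptions for this article, not Modulate's or Velma's actual interface.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: structured risk signals a voice-native model
# might emit for a single call. All names and thresholds are assumed
# for illustration; they do not reflect any vendor's real API.

@dataclass
class CallRiskSignals:
    deepfake_likelihood: float  # e.g., 0.83 -> 83% chance the voice is synthetic
    policy_evasion_lines: list[str] = field(default_factory=list)  # dialogue flagged as circumventing policy
    feigned_urgency: bool = False  # pressure tactics detected in tone or wording

def should_escalate(signals: CallRiskSignals) -> bool:
    """Escalate to a human reviewer or step-up verification when
    at least two independent signals corroborate one another."""
    corroborating = [
        signals.deepfake_likelihood >= 0.8,
        len(signals.policy_evasion_lines) >= 3,
        signals.feigned_urgency,
    ]
    return sum(corroborating) >= 2

# Example mirroring the article's scenario: three policy-evasion lines,
# an 83% deepfake likelihood, and feigned urgency -> escalate.
call = CallRiskSignals(
    deepfake_likelihood=0.83,
    policy_evasion_lines=[
        "Can we skip the verification step just this once?",
        "My manager said you can override that check.",
        "There's no time, just push the transfer through.",
    ],
    feigned_urgency=True,
)
assert should_escalate(call)
```

The design choice worth noting is corroboration: no single signal (even a high deepfake score) triggers escalation on its own, which matches the article's framing of real-time understanding across multiple cues rather than a single identity check.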
By Kirsten Aebersold, with input from Modulate's CEO, Mike Pappas, and Director of Market and Behavioral Research, Ken Morino, covering threat intelligence, voice security, and applied artificial intelligence.











