Real-time audio processing & AI

Live Voice Call Fraud Detection PoC

A proof of concept that listens to live phone calls, transcribes them in real time, and warns users when fraud is suspected.

Problem

The client wanted to explore whether real-time analysis of live phone calls could reduce fraud incidents for vulnerable users. They needed a PoC that proved technical feasibility under strict latency and reliability requirements, without committing to a full product yet.

Solution

We designed a small, event-driven system that receives audio from Twilio, streams it to a transcription provider, and feeds the text into an AI model tuned for fraud-related patterns. The system surfaces non-intrusive alerts to the user while collecting structured data for later analysis.
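
As a rough illustration of how transcript text could turn into an alert, the sketch below scores transcript segments against a few keyword rules and raises a single alert once a call's running score crosses a threshold. The rules, weights, threshold, and all names (`RULES`, `CallScorer`, `ALERT_THRESHOLD`) are hypothetical and only stand in for the tuned model; they do not reflect the rule set used in the PoC.

```typescript
// Hypothetical rule: a pattern that adds `weight` points once per call.
interface FraudRule {
  id: string;
  pattern: RegExp;
  weight: number;
}

// Illustrative rules only; a real rule set would be derived from call data.
const RULES: FraudRule[] = [
  { id: "gift-card", pattern: /\bgift cards?\b/i, weight: 50 },
  { id: "urgency", pattern: /\b(right now|immediately|urgently)\b/i, weight: 20 },
  { id: "secrecy", pattern: /\b(don't|do not) tell (anyone|anybody)\b/i, weight: 40 },
  { id: "remote-access", pattern: /\bremote access\b/i, weight: 40 },
];

// Assumed threshold on a 0-100 scale; it would need tuning on real calls.
const ALERT_THRESHOLD = 70;

interface ScoreUpdate {
  score: number;          // running score for the call, capped at 100
  matchedRules: string[]; // rules first matched by this segment
  alert: boolean;         // true only on the segment that crosses the threshold
}

class CallScorer {
  private seen = new Set<string>();
  private score = 0;
  private alerted = false;

  // Scores one transcript segment and updates the per-call state.
  addSegment(text: string): ScoreUpdate {
    const matchedRules: string[] = [];
    for (const rule of RULES) {
      if (!this.seen.has(rule.id) && rule.pattern.test(text)) {
        this.seen.add(rule.id);
        this.score = Math.min(100, this.score + rule.weight);
        matchedRules.push(rule.id);
      }
    }
    // Alert at most once per call so the warning stays non-intrusive.
    const alert = !this.alerted && this.score >= ALERT_THRESHOLD;
    if (alert) this.alerted = true;
    return { score: this.score, matchedRules, alert };
  }
}

// Usage: feed segments in arrival order and act on the `alert` flag.
const demoScorer = new CallScorer();
demoScorer.addSegment("Hello, this is your bank calling.");
const demoUpdate = demoScorer.addSegment("You need to buy gift cards right now.");
console.log(demoUpdate.alert, demoUpdate.matchedRules);
```

Counting each rule once per call keeps a single repeated phrase from inflating the score, and alerting only on the crossing segment avoids interrupting the user repeatedly.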

Architecture / Technologies

Audio is captured via Twilio media streams and sent over WebSockets to a Node-based edge service. This service forwards chunks to a managed transcription API and publishes transcripts to Redis Pub/Sub channels. Downstream workers enrich the text with fraud signals and store results in blob storage and a relational database for analysis.
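
The flow above can be sketched roughly as follows: decode a Twilio Media Streams `media` message into raw audio bytes, then publish a transcript event on a per-stream channel. The `media` message fields (`event`, `streamSid`, `media.track`, `media.timestamp`, `media.payload`) follow Twilio's documented format; everything else is an assumption for illustration, including the `TranscriptEvent` envelope, the `transcripts:` channel naming, and the `InMemoryBus` class, which stands in for Redis Pub/Sub so the example runs without a server.

```typescript
// Subset of a Twilio Media Streams "media" message used by this sketch.
interface TwilioMediaMessage {
  event: "media";
  streamSid: string;
  media: { track: string; timestamp: string; payload: string };
}

interface AudioChunk {
  streamSid: string;
  track: string;
  timestampMs: number; // offset from the start of the stream
  audio: Buffer;       // raw audio bytes decoded from the base64 payload
}

// Returns an AudioChunk for "media" messages and null for any other event
// (for example "connected", "start", or "stop").
function parseTwilioMessage(raw: string): AudioChunk | null {
  const msg = JSON.parse(raw) as Partial<TwilioMediaMessage>;
  if (msg.event !== "media" || !msg.media || !msg.streamSid) return null;
  return {
    streamSid: msg.streamSid,
    track: msg.media.track,
    timestampMs: Number(msg.media.timestamp),
    audio: Buffer.from(msg.media.payload, "base64"),
  };
}

// Hypothetical event envelope and channel naming.
interface TranscriptEvent {
  streamSid: string;
  text: string;
  isFinal: boolean;
}
const channelFor = (streamSid: string): string => `transcripts:${streamSid}`;

// Minimal publisher shape; a Redis client would sit behind this in practice.
interface Publisher {
  publish(channel: string, message: string): Promise<number>;
}

// In-memory stand-in for Redis Pub/Sub, for local runs and tests only.
class InMemoryBus implements Publisher {
  private handlers = new Map<string, Array<(message: string) => void>>();

  subscribe(channel: string, handler: (message: string) => void): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  // Delivers to current subscribers and resolves with how many received it.
  async publish(channel: string, message: string): Promise<number> {
    const list = this.handlers.get(channel) ?? [];
    for (const handler of list) handler(message);
    return list.length;
  }
}

function publishTranscript(bus: Publisher, event: TranscriptEvent): Promise<number> {
  return bus.publish(channelFor(event.streamSid), JSON.stringify(event));
}
```

Keeping the edge service behind a small `Publisher` interface means downstream workers depend only on the channel name and the JSON envelope, not on the transport.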

Node.js, TypeScript, Twilio Media Streams, Redis Pub/Sub, Azure Blob Storage, WebSockets

Highlights

  • WebSocket audio streaming directly from Twilio media streams
  • Low-latency transcription pipeline with backpressure handling
  • Real-time enrichment and rule-based fraud scoring
  • Event-driven microservice layout using Redis Pub/Sub
  • Blob storage archive of raw audio and transcripts for later model improvement
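
To make the backpressure point more concrete, here is a minimal sketch of one possible strategy: a bounded buffer between the WebSocket receiver and the transcription client that drops the oldest audio chunk when full, so latency stays bounded when the provider falls behind. The drop-oldest policy and the `BoundedChunkQueue` name are assumptions for illustration; the strategy actually used in the PoC is not described here.

```typescript
// Fixed-capacity FIFO buffer that discards the oldest item when full.
class BoundedChunkQueue<T> {
  private items: T[] = [];
  private dropped = 0;
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Adds a chunk; if the queue is full, the oldest chunk is dropped first.
  push(item: T): void {
    if (this.items.length >= this.capacity) {
      this.items.shift();
      this.dropped++;
    }
    this.items.push(item);
  }

  // Removes and returns the oldest chunk, or undefined when empty.
  shift(): T | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }

  // Number of chunks discarded so far; useful as a health metric.
  get droppedCount(): number {
    return this.dropped;
  }
}

// Usage: the receiver pushes chunks, the sender drains them at its own pace.
const demoQueue = new BoundedChunkQueue<string>(2);
demoQueue.push("chunk-1");
demoQueue.push("chunk-2");
demoQueue.push("chunk-3"); // queue is full, so "chunk-1" is dropped
console.log(demoQueue.shift(), demoQueue.droppedCount);
```

Dropping stale audio trades transcript completeness for freshness, which suits live alerts; an archive path that must keep every chunk would need a different policy, such as pausing the reader or spilling to storage.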

Outcome

The PoC demonstrated that real-time detection was technically feasible within the client’s latency budget. The data it collected and the architecture became the foundation for a follow-up project and helped the client secure internal buy-in.
