AI models: Speech Enhancement

Introduction

This document lists Sanas' Speech Enhancement (SE) model releases, newest first. Each entry summarizes what changed and who it affects.

Speech Enhancement builds on all Noise Cancellation (NC) capabilities and adds high-fidelity audio processing at a higher sampling rate, producing richer, more natural-sounding output. Like NC, SE uses independently trained AI models for each audio direction: outbound (agent-side) and inbound (customer-side).

For desktop application releases that bundle these models, see Product updates. For detailed benchmarking methodology and performance data, see Sanas science blogs.

SE2.1

Field	Detail
Model version	2.1
Speech capability	Speech Enhancement
Audio direction	Outbound and Inbound
Sampling rate	8kHz
Available modes	Outbound: Voice Isolation (Near field and Far field); Inbound: Standard with ringtone passthrough
Release status	GA

What changed

Eliminates reverb and room acoustics from speech.
Restores audio lost to packet loss on poor network connections.
Corrects audio degradation caused by codec compression.
Improves audio fidelity on legacy telephony systems (8kHz).
Improves removal of background environmental noise and distant speech.
Enhances overall speech clarity, articulation, energy, and vocal presence.

SE1.0.1 (Outbound)

Field	Detail
Model version	1.0.1
Speech capability	Speech Enhancement
Audio direction	Outbound (Sanas app user's side)
Sampling rate	24kHz
Release status	GA

What changed

Includes all outbound Noise Cancellation capabilities: background noise suppression up to 80–90 dBA, background speech cancellation, and support for 8,000+ noise profiles.
High-fidelity 24kHz audio output delivers richer, more natural-sounding speech compared to the 8kHz telephony-grade output of NC models.
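
As a rough illustration of why the sampling rate matters, the sketch below computes the Nyquist limit (half the sampling rate), which is the highest audio frequency a given rate can represent. This is general signal-processing arithmetic, not a description of how the Sanas models work internally; the function name and labels are hypothetical and used only for this example.

```python
def nyquist_bandwidth_hz(sampling_rate_hz: float) -> float:
    """Return the highest representable frequency (Nyquist limit) in Hz."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return sampling_rate_hz / 2.0


# Sampling rates quoted in this document; the labels are illustrative only.
RATES_HZ = {
    "8kHz telephony-grade output": 8_000,
    "24kHz high-fidelity output": 24_000,
}

for label, rate in RATES_HZ.items():
    limit = nyquist_bandwidth_hz(rate)
    print(f"{label}: frequencies up to {limit:.0f} Hz can be represented")
```

Under this arithmetic, 24kHz output can carry frequencies up to 12,000 Hz, three times the 4,000 Hz ceiling of 8kHz output, which is consistent with the richer sound described above.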

SE1.0.1 (Inbound)

Field	Detail
Model version	1.0.1
Speech capability	Speech Enhancement
Audio direction	Inbound (call recipient's side)
Sampling rate	24kHz
Release status	GA

What changed

Includes all inbound Noise Cancellation capabilities: far-field speech preservation, handling of unpredictable ambient environments, and support for variable-quality audio sources.
High-fidelity 24kHz audio output enhances the clarity and naturalness of the customer's voice beyond what NC inbound models provide.

Support

Need help? Get in touch with our Support Team for assistance.
