Overall Quality MOS Scale

The Overall Quality MOS estimated via NISQA-v2 (Non-Intrusive Speech Quality Assessment, version 2) expresses the listener's perceived overall quality of a speech signal transmitted through a communication channel or processing pipeline. It follows the ITU-T P.800 Absolute Category Rating (ACR) 1–5 scale used in classical telephony and VoIP quality evaluations.

Score and category	Description
5 – Excellent	Imperceptible degradation; speech is perfectly clear and natural.
4 – Good	Slight, non-annoying impairments; easily understandable.
3 – Fair	Noticeable but acceptable distortions; intelligibility preserved.
2 – Poor	Annoying distortions; intelligibility partially reduced.
1 – Bad	Very annoying distortion; speech hardly intelligible.
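
As a rough illustration of how a continuous model output relates to the five categories above, the sketch below maps a predicted MOS to the nearest ACR label. The function name and the round-to-nearest rule are assumptions made for this example. Neither ITU-T P.800 nor NISQA prescribes such a mapping, so treat it as a reading aid only.

```python
# Illustrative sketch: map a continuous MOS to the nearest ACR category label.
# The rounding rule is an assumption for this example, not part of P.800 or NISQA.

ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def nearest_acr_label(mos: float) -> str:
    """Return the ACR label closest to a MOS value on the 1-5 scale."""
    if not 1.0 <= mos <= 5.0:
        raise ValueError(f"MOS must lie in [1.0, 5.0], got {mos}")
    category = int(mos + 0.5)  # round half up, so 4.5 maps to category 5
    return ACR_LABELS[category]


if __name__ == "__main__":
    for value in (4.5, 3.9, 2.2):
        print(value, nearest_acr_label(value))
```

A score of 3.9, for instance, would read as "Good" under this rule, although the score itself carries more information than the label.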

Practical Range and Expected Scores

Although the scale nominally runs from 1 (“Bad”) to 5 (“Excellent”), neither subjective tests nor model predictions tend to span the full range:

Clean human speech recordings under ideal conditions typically yield ≈ 4.3 – 4.6 MOS.
High-quality telephony or codec pipelines (e.g., AMR-WB, Opus wideband) usually score ≈ 3.8 – 4.4.
Moderately degraded or noisy speech lies around 2.5 – 3.5.
Heavily distorted, packet-loss or low-bitrate conditions often fall below 2.0.
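
The ranges listed above can be turned into a small lookup, sketched below. The function name and band labels are hypothetical. The numeric limits are copied from the list. Because those ranges overlap (4.3 to 4.4) and leave gaps (for example 3.5 to 3.8), the function returns every band that matches, which may be none.

```python
# Illustrative sketch: report which of the typical score ranges a MOS falls into.
# Band limits come from the list above; names and the function are hypothetical.
from typing import List


def matching_bands(mos: float) -> List[str]:
    """Return all typical-condition bands whose range contains the given MOS."""
    bands = []
    if 4.3 <= mos <= 4.6:
        bands.append("clean speech, ideal conditions")
    if 3.8 <= mos <= 4.4:
        bands.append("high-quality telephony or codec")
    if 2.5 <= mos <= 3.5:
        bands.append("moderately degraded or noisy")
    if mos < 2.0:
        bands.append("heavily distorted")
    return bands  # empty when the score sits in a gap between ranges


if __name__ == "__main__":
    for value in (4.35, 3.0, 3.6, 1.5):
        print(value, matching_bands(value))
```

Returning a list rather than a single label keeps the overlap visible: a score of 4.35 is consistent with both a clean recording and a good wideband codec, and the score alone cannot tell them apart.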

The NISQA-v2 model was trained on large crowdsourced English speech-quality datasets covering these conditions. Consequently, its predicted “Overall Quality MOS” values follow the same empirical distribution: even pristine signals seldom exceed 4.6 – 4.7, while most practical system outputs cluster between 3.0 and 4.2.

Why the Effective Ceiling Is Below 5.0

This saturation below 5.0 arises primarily from human rating behavior rather than from any model limitation. Listeners in ACR tests display a well-known central-tendency bias: a reluctance to choose the extreme categories of a Likert-type scale. Because a MOS is the mean of all listeners' votes, a single rating of 4 is enough to pull the average below 5, so even “perfect” reference material receives an average below the nominal maximum. NISQA-v2, trained to reproduce human judgments, inherits this calibration.
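
A short worked example makes the arithmetic concrete. A MOS is the mean of the individual listeners' category votes, so any share of votes below 5 keeps the average under the maximum. The vote counts below are invented for illustration and do not come from any real listening test. The function name is also hypothetical.

```python
# Illustrative sketch: a MOS is the arithmetic mean of listeners' ACR votes.
# The vote counts are hypothetical, chosen only to show the ceiling effect.
from typing import Mapping


def mean_opinion_score(votes: Mapping[int, int]) -> float:
    """Compute the MOS from a mapping of ACR score (1-5) to number of votes."""
    if any(score not in (1, 2, 3, 4, 5) for score in votes):
        raise ValueError("ACR scores must be integers from 1 to 5")
    total = sum(votes.values())
    if total == 0:
        raise ValueError("at least one vote is required")
    return sum(score * count for score, count in votes.items()) / total


if __name__ == "__main__":
    # Hypothetical panel of 24 listeners rating a clean reference recording.
    hypothetical_votes = {5: 14, 4: 9, 3: 1}
    print(f"{mean_opinion_score(hypothetical_votes):.2f}")  # prints 4.54
```

Even with more than half of this hypothetical panel choosing "Excellent", the mean lands near 4.5, which is the kind of value a model trained on such ratings learns to predict for clean speech.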

Summary

The Overall Quality MOS predicted by NISQA-v2 estimates the perceived overall quality of speech on the ITU-T P.800 1–5 scale. In practice, because listeners tend to avoid the extreme categories and because of the statistical properties of real datasets, the usable range is approximately 1.0 – 4.7. Scores near 4.5 correspond to clean, transparent audio, scores of 3–4 indicate acceptable communication quality, and values below 3 reflect perceptibly degraded or unpleasant speech.
