Overall Quality MOS Scale

The Overall Quality MOS (Mean Opinion Score) estimated by NISQA-v2 (Non-Intrusive Speech Quality Assessment, version 2) expresses a listener's perceived overall quality of a speech signal after transmission through a communication channel or processing pipeline. It follows the ITU-T P.800 Absolute Category Rating (ACR) 1–5 scale used in classical telephony and VoIP quality evaluations.

Score           Descriptor
5 – Excellent   Imperceptible degradation; speech is perfectly clear and natural.
4 – Good        Slight, non-annoying impairments; easily understandable.
3 – Fair        Noticeable but acceptable distortions; intelligibility preserved.
2 – Poor        Annoying distortions; intelligibility partially reduced.
1 – Bad         Very annoying distortion; speech hardly intelligible.
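Model predictions are fractional, while the table above lists only the five integer categories. As a minimal sketch, a predicted score can be mapped to its nearest ACR label as shown below. The function name `nearest_acr_label` and the choice to reject values outside 1–5 are illustrative assumptions, not part of NISQA or ITU-T P.800.

```python
import math

# ACR category labels taken from the table above.
ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def nearest_acr_label(mos: float) -> str:
    """Return the ACR label closest to a (possibly fractional) MOS value.

    Hypothetical helper: rejecting out-of-range input is an assumption.
    """
    if not 1.0 <= mos <= 5.0:
        raise ValueError("MOS must lie in [1.0, 5.0]")
    # Round half up explicitly; Python's built-in round() rounds halves to even.
    category = min(5, math.floor(mos + 0.5))
    return ACR_LABELS[category]


print(nearest_acr_label(4.4))  # Good
```

Note that the label is only a coarse reading aid; the fractional score carries more information than the category name.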

Practical Range and Expected Scores

Although the theoretical limits are 1 → “Bad” and 5 → “Excellent,” real subjective tests and model predictions rarely use the full range:

  • Clean human speech recordings under ideal conditions typically yield ≈ 4.3 – 4.6 MOS.
  • High-quality telephony or codec pipelines (e.g., AMR-WB, Opus wideband) usually score ≈ 3.8 – 4.4.
  • Moderately degraded or noisy speech lies around 2.5 – 3.5.
  • Heavily distorted speech, for example under packet loss or at low bitrates, often falls below 2.0.
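The bands above overlap in places and leave gaps in others, so a score cannot always be assigned to exactly one condition. The sketch below looks up every band that contains a given score. The names `REFERENCE_BANDS` and `matching_bands` are hypothetical, the bounds are treated as inclusive, and the "below 2.0" band is given a lower bound of 1.0 (the scale minimum); all three choices are assumptions made for illustration.

```python
from __future__ import annotations

# Approximate reference bands from the list above: (low, high, description).
# Assumption: bounds are inclusive, and "below 2.0" starts at the scale minimum 1.0.
REFERENCE_BANDS = [
    (4.3, 4.6, "clean speech under ideal conditions"),
    (3.8, 4.4, "high-quality telephony or codec pipeline"),
    (2.5, 3.5, "moderately degraded or noisy speech"),
    (1.0, 2.0, "heavily distorted, packet-loss or low-bitrate condition"),
]


def matching_bands(mos: float) -> list[str]:
    """Return the descriptions of all reference bands that contain `mos`."""
    return [desc for low, high, desc in REFERENCE_BANDS if low <= mos <= high]


# 4.35 sits in the overlap of the two upper bands; 3.6 falls in a gap.
print(matching_bands(4.35))
print(matching_bands(3.6))
```

Returning a list rather than a single label keeps the overlap and the gaps visible instead of hiding them behind arbitrary cut-offs.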

The NISQA-v2 model was trained on large crowdsourced English speech-quality datasets covering these conditions. Consequently, its predicted “Overall Quality MOS” values follow the same empirical distribution: even pristine signals seldom exceed 4.6–4.7, while most practical system outputs cluster between 3.0 and 4.2.

Why the Effective Ceiling Is Below 5.0

This saturation below 5.0 arises primarily from human rating behavior rather than from any model limitation. Listeners in ACR tests display a well-known central-tendency bias: a reluctance to choose the extreme categories of a Likert-type scale. Because a MOS is the mean of many individual ratings, a few listeners choosing 4 instead of 5 is enough to pull the average for even “perfect” reference material below the nominal maximum. NISQA-v2, trained to reproduce human judgments, inherits this calibration.
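The averaging effect can be illustrated with a small calculation. The vote counts below are invented for illustration only (24 hypothetical listeners rating a clean reference signal) and are not taken from any real listening test; `mean_opinion_score` is likewise a hypothetical helper name.

```python
from __future__ import annotations


def mean_opinion_score(votes: dict[int, int]) -> float:
    """Compute the MOS as the mean of ACR ratings.

    `votes` maps each category (1-5) to the number of listeners who chose it.
    """
    total = sum(votes.values())
    if total == 0:
        raise ValueError("at least one vote is required")
    return sum(category * count for category, count in votes.items()) / total


# Hypothetical panel: most listeners pick 5, but a sizeable minority avoids
# the extreme category and picks 4 instead.
votes = {5: 14, 4: 10, 3: 0, 2: 0, 1: 0}
print(f"{mean_opinion_score(votes):.2f}")  # 4.58
```

Even with no rating below "Good", the mean lands well under 5.0, which is consistent with the ceiling described above.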

Summary

The Overall Quality MOS predicted by NISQA-v2 quantifies the subjective transmission quality of speech on the ITU-T P.800 1–5 scale. In practice, because of the human tendency to avoid extremes and the statistical properties of real datasets, the usable range is approximately 1.0–4.7. Scores near 4.5 correspond to clean, transparent audio; scores of 3–4 indicate acceptable communication quality; and values below 3 reflect perceptibly degraded or unpleasant speech.
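The interpretation above can be written as a simple threshold rule. This is a rough sketch, not an official mapping: the text only says that scores near 4.5 are clean, so treating everything above 4.0 as the top tier is an assumption, and the function name `interpret_mos` and tier labels are hypothetical.

```python
def interpret_mos(mos: float) -> str:
    """Map a predicted Overall Quality MOS to a coarse quality tier.

    Assumption: the 4.0 boundary for the top tier is chosen for illustration.
    """
    if mos < 3.0:
        return "degraded"      # perceptibly degraded or unpleasant speech
    if mos <= 4.0:
        return "acceptable"    # acceptable communication quality
    return "high quality"      # approaching clean, transparent audio (~4.5)


for score in (2.4, 3.5, 4.5):
    print(score, interpret_mos(score))
```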

Copyright © 2022-2023 Altered. All rights reserved.