Bertin IT Introduces MediaSpeech v6, Its Latest Multilingual Speech Recognition Solution

MediaSpeech® offers the industry’s best capabilities for the
operation and in-depth analysis of media and telecommunications
databases.

PARIS–(BUSINESS WIRE)–#SpeechAnalytics–Bertin IT (CNIM Group) announces the release of the new version of MediaSpeech®,
its multilingual speech recognition solution that converts audio
tracks to searchable text transcripts, enabling audio and video sources
to be indexed, searched and analysed. MediaSpeech® now also comes in a
live version for real-time audio streams, paving the way for new
interactive and augmented communications applications.

Using deep neural networks of the kind commonly used in artificial
intelligence systems, MediaSpeech® builds a very fine-grained model of
the acoustic space that is robust across different speakers and acoustic
conditions, making transcription faster and more accurate.

Features:

Speech recognition, with each word timestamped to the millisecond and assigned a recognition confidence score.
Automatic detection of the spoken language (LID).
Automatic segmentation of speech turns and speakers, with gender recognition.
Identification of the speaker from a biometric database.
Automatic and semi-automatic adaptation of vocabularies and domains.

And all this in 17 different languages.

MediaSpeech® comes in several editions. MediaSpeech® Factory, deployed
on site or in SaaS mode hosted on Bertin IT’s cloud, can handle large
volumes of files with guaranteed performance levels. The new
MediaSpeech® Live transcribes audio streams on the fly, opening the
door to innovative real-time applications such as voice chatbots,
call-bots and enhanced call centres, in which the adviser receives
assistance during the call, streamlining the dialogue and improving its
quality.

Among the main improvements in the new version of MediaSpeech®:

MediaSpeech® Live version for processing audio streams in real time.
New neural models make transcription two to three times faster and more accurate.
All speech processing modules are now fully neural, including voice activity detection (VAD) and speaker segmentation (diarization), for even greater accuracy.
Easy installation process, stronger security and new interfaces.
A fully neural language identification module (LID) with increased accuracy, even for relatively short sections of speech.

Version 6 of MediaSpeech® is already being used by several customers,
including a major French investment and finance bank. The MediaSpeech
Live version has just been delivered to another major banking group for
use at its contact centres.

Contacts

Nathalie Sablon
[email protected]
