Leading results on industry benchmarks.
Deep Neural Network (DNN) architecture.
Domain-specific keywords, phrases and jargon added “on the fly”.
Easily integrates with your systems, solutions and workflows.
Recorded audio processed in parallel.
Real-time audio processing that scales horizontally.
Market-leading channel density (channels per CPU).
Flexible licensing options that work for your business.
Mod9 Technologies helps us deliver an industry-leading ediscovery solution for investigations and litigations. We selected the Mod9 ASR engine for its accuracy and speed in the transcription and analysis of recorded audio content, as well as the product’s ability to be deployed within our own environment, which reduces the complexity, operational cost, and risk of managing our customers’ sensitive data offsite.
CTO and Founder, Everlaw
In delivering a world-class AI coaching and training platform for contact center agents, VoiceOps needed an accurate, customizable, real-time transcription engine, ideally compatible with the IBM Watson API. Mod9 was able to deliver against these requirements, offering highly accurate transcription and analysis of spoken conversations in real time at a reduced cost, delivering valuable, actionable and data-driven feedback for coaches, and raising the performance of the contact center in weeks rather than months.
As a Global Leader in AI Training Data Services and Software, BasicAI is committed to providing high-quality data annotation services and software. BasicAI selected Mod9 as a partner because of the highly customizable nature of their ASR solution and deep expertise in the space. Our customers can now utilize our labeled speech-to-text datasets to build custom solutions for languages, dialects, and industry-specific terminology to satisfy their use cases.
A core capability of the Mod9 ASR Engine is the ability to automatically transcribe conversations, either in real time or from recorded audio content. The Engine can be configured with either large or small vocabularies, and it can also be customized “on the fly” with additional keywords and phrases, which is ideal for domain-specific applications.
Real-time and batch transcript generation for live conversations or recorded audio.
Continuously training, improving and introducing new language and acoustic models.
For 8kHz (telephony) and 16kHz (audio/video) applications and requirements.
For conversational (real-time or recorded) and directed dialogue applications.
Enables speech recognition and transcript generation in parallel and at massive scale.
Include domain-specific vocabularies “on the fly”. Tune for application-specific use cases.
The Mod9 ASR Engine is asynchronous: it returns results while it is still receiving audio. While processing audio data, the Engine responds with one or more JSON-formatted messages representing the ASR result. In addition to the “1-best” hypothesis, and depending on how the configuration options have been set, additional metadata may be returned and further processing may also take place.
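As a rough illustration of consuming such asynchronous replies, the sketch below assumes that each message arrives as one JSON object per line. That framing, and the field names `transcript` and `final`, are assumptions made for illustration and are not taken from the Engine's documentation.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_messages(chunks: Iterable[bytes]) -> Iterator[Dict]:
    """Yield one dict per complete JSON line found in a stream of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may hold several messages, or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line.decode("utf-8"))


# Hypothetical replies, split at arbitrary points as a socket might deliver them.
received = [
    b'{"final": false, "transcript": "hello"}\n{"final": tr',
    b'ue, "transcript": "hello world"}\n',
]
messages = list(iter_messages(received))
final_transcripts = [m["transcript"] for m in messages if m.get("final")]
```

Buffering raw bytes and decoding only complete lines means a message that is split across two network reads is still parsed correctly once its terminating newline arrives.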
To gauge how certain the Engine is that a returned word matches what was actually said.
To maintain a chronological order and help recreate events.
For improved recognition speed and naturalness in the response.
Automatic transcript formatting, including punctuation, capitalization, disfluencies and more.
So that specific speakers can be easily identified in the transcript.
For scenarios where multi-channel recordings are unavailable.
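To make the metadata described above more concrete, here is a small sketch of how a client might use per-word confidence scores and timestamps to flag words for human review. The message layout (a `words` list whose entries carry `word`, `confidence` and `interval` fields) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical result message; the field names are illustrative assumptions.
result = {
    "transcript": "please review the deposition",
    "words": [
        {"word": "please", "confidence": 0.98, "interval": [0.10, 0.45]},
        {"word": "review", "confidence": 0.95, "interval": [0.45, 0.90]},
        {"word": "the", "confidence": 0.99, "interval": [0.90, 1.00]},
        {"word": "deposition", "confidence": 0.62, "interval": [1.00, 1.80]},
    ],
}


def low_confidence_words(message: Dict, threshold: float) -> List[Tuple[str, float]]:
    """Return (word, start_time) pairs whose confidence is below the threshold."""
    return [
        (w["word"], w["interval"][0])
        for w in message.get("words", [])
        if w["confidence"] < threshold
    ]


flagged = low_confidence_words(result, threshold=0.8)
```

Pairing each flagged word with its start time lets a reviewer jump straight to the relevant point in the recording.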
The Mod9 ASR Engine is a multi-threaded TCP server, built on a client/server architecture and deployed in your data center or private cloud. To help you get up and running quickly, a generic Python client application with a sophisticated command line interface is provided, and custom TCP clients can also be developed.
A multi-threaded TCP server.
Packaged as a Docker® container.
Native Linux support (CentOS, Ubuntu).
Full duplex communication, custom protocol.
JSON format (commands, options and results).
Kaldi for Deep Neural Network (DNN) capabilities.
On-premises (in your data center) or private cloud.
Edge device support options.
Python application, command line interface (CLI).
Support for custom client development (TCP socket).
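For readers considering a custom client, the following is a minimal sketch of full-duplex communication over a TCP socket: one thread sends audio while another reads JSON replies. The wire format shown here (a single JSON line of options, followed by raw audio bytes, with newline-delimited JSON replies) is an assumption for illustration only, and the function name `transcribe` is hypothetical. The Engine's actual protocol should be taken from its documentation.

```python
import json
import socket
import threading
from typing import Dict, Iterable, List


def transcribe(host: str, port: int, options: Dict,
               audio_chunks: Iterable[bytes]) -> List[Dict]:
    """Send options and audio over one TCP connection while reading JSON replies."""
    replies: List[Dict] = []
    with socket.create_connection((host, port)) as sock:

        def send_audio() -> None:
            # Assumed framing: one JSON line of options, then raw audio bytes.
            sock.sendall(json.dumps(options).encode("utf-8") + b"\n")
            for chunk in audio_chunks:
                sock.sendall(chunk)
            # Half-close the connection to signal the end of the audio.
            sock.shutdown(socket.SHUT_WR)

        sender = threading.Thread(target=send_audio)
        sender.start()
        # Read replies on this thread while the sender is still writing.
        with sock.makefile("rb") as reader:
            for line in reader:
                if line.strip():
                    replies.append(json.loads(line))
        sender.join()
    return replies
```

A call might look like `transcribe("asr.example.internal", 9000, {"rate": 8000}, chunks)`, where the host name, port and option key are placeholders. Sending on a separate thread is what allows results to be received before the last audio chunk has been written.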