Gnani.ai Unveils Vachana STT to Tackle India’s Speech AI Gap
Trained on more than one million hours of voice data, the model targets India’s multilingual and low-resource speech landscape.
Indian voice AI startup Gnani.ai has launched Vachana STT, an enterprise-grade, foundational speech-to-text model trained on more than one million hours of real-world voice data, positioning it as one of the most ambitious attempts yet to build sovereign speech AI from India.
Vachana STT forms the first critical layer of Gnani.ai’s upcoming VoiceOS, a unified voice intelligence stack that will integrate speech recognition, understanding, and orchestration. Unlike modular systems assembled from multiple APIs, VoiceOS is being built as a single architecture, with Vachana STT serving as its base layer.
Gnani.ai positions the model’s main strength as its performance in India’s complex, noisy, and multilingual speech environment. The company claims Vachana STT delivers 30-40% lower word error rates (WER) for low-resource Indic languages, and 10-20% lower WER across India’s top eight spoken languages, than leading global and sovereign speech AI providers.
The model has been benchmarked across Hindi, Bengali, Gujarati, Marathi, Punjabi, Tamil, Telugu, Kannada, Malayalam, Odia, Assamese, and other Indian languages.
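For readers unfamiliar with the metric, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition and how a "percent lower WER" figure can be derived from two scores. It is not Gnani.ai's benchmarking code. The function names and sample sentences are made up for illustration, and it assumes the quoted percentages are relative reductions, which the announcement does not specify.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one substitution ("sat" -> "sit") and one
# dropped word ("the") against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Hypothetical scores: a baseline at 20% WER versus a model at 13% WER.
reduction = relative_wer_reduction(0.20, 0.13)
```

Under this reading, a model scoring 13% WER against a competitor's 20% would count as a 35% reduction, which would fall inside the 30-40% range the company cites for low-resource languages.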
Trained on proprietary multilingual datasets spanning 1,056 domains, Vachana STT is designed to work out of the box, without additional fine-tuning. Gnani.ai says the model maintains accuracy across real-world omnichannel audio, making it suitable for sectors where transcription quality directly affects automation outcomes, regulatory compliance, and customer experience.
Already deployed at scale, the platform processes nearly 10 million calls daily across BFSI, telecom, and customer support systems, with a reported p95 latency of 200 milliseconds.
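A p95 latency of 200 milliseconds means that 95% of measured requests completed in 200 milliseconds or less. The Python sketch below shows one common way to compute such a figure, the nearest-rank percentile. The announcement does not say which percentile method or measurement window Gnani.ai uses, so the function name and the sample values here are purely illustrative.

```python
import math


def percentile_nearest_rank(samples: list[float], p: int) -> float:
    """Return the p-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 19 fast requests and one slow outlier.
latencies_ms = [120.0] * 19 + [900.0]

p95 = percentile_nearest_rank(latencies_ms, 95)    # the outlier sits above p95
p100 = percentile_nearest_rank(latencies_ms, 100)  # the maximum
```

Percentile figures like this describe the tail of the distribution rather than the average, which is why they are commonly quoted for real-time systems such as live call transcription.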
Vachana STT supports both real-time and batch transcription, handles compressed audio ranging from 8 kbps to 64 kbps, and maintains stable performance even under high concurrency and fluctuating network conditions.
The launch also aligns with national priorities. Gnani.ai was selected under the IndiaAI Mission, through which the government has identified a small group of startups to build sovereign foundational AI models, and Vachana STT is part of that effort.
“Speech recognition in India is not a localization problem, it’s a foundational systems problem,” said Ganesh Gopalan, Co-Founder and CEO of Gnani.ai, underscoring the company’s focus on core AI infrastructure built for India’s realities.
Vachana STT is available immediately via enterprise APIs, with early adopters offered 100,000 free minutes of usage.