India Has 121 Languages. Most Voice AI Only Serves a Handful.

Building an Open and Responsible Voice Technology Ecosystem: Policy Recommendations for Digital Inclusion in India

India’s remarkable linguistic diversity presents both opportunities and challenges for the development of inclusive voice technologies, shaping how millions can participate in the digital economy.

This report examines key barriers to building open and responsible speech systems in India, from data collection and model development to compute infrastructure and responsible practices. It recommends four policy interventions to support an innovative and equitable voice-technology ecosystem:

Make foundational datasets available as digital public goods to address market failures in voice-technology ecosystems and promote local economic innovation.
Encourage benchmarking around open evaluation datasets and transparent leaderboards, which provide common baselines for developers, improve procurement standards, and help assess performance across diverse languages and speaker groups.
Sustain open, equitable development by treating dataset hosting as durable public digital infrastructure rather than grant-based, project-specific assets.
Support value-sharing mechanisms that counter extractive data practices through attribution norms, community benefit-sharing, and the use of copyleft licenses for publicly funded datasets.

Indic Voice Technologies for an Inclusive India: Toolkit for Developers

The development of speech and language technologies in the Indian context is constrained not by a lack of innovation, but by persistent structural gaps in data representation, quality assurance, evaluation practices, and governance. Models trained on narrow or homogenised datasets risk underperforming for large segments of the population, while post-hoc ethical safeguards and deployment fixes are insufficient to address foundational exclusions embedded early in the development lifecycle.

This toolkit sets out a layered, lifecycle-oriented approach to building inclusive and robust speech artificial intelligence (AI) systems. It brings together strategies for diverse and representative data collection, linguistically informed model training, rigorous quality control, and deployment optimisation under real-world constraints, alongside embedded Responsible AI practices.