
Open Source AI – Between Promise, Power, and Public Interest
Open source has long been celebrated for enabling collaboration, transparency, and decentralised innovation. In this age of rapidly expanding AI, these ideals are being revisited and contested. As governments, researchers, and builders seek alternatives to Big Tech-dominated ecosystems, “open source AI” has emerged as a pivotal strategy. But, as the team at Digital Futures Lab shares, what counts as ‘open’ in AI remains deeply contested and far from settled.
From licensing grey zones to questions of community ownership, Open Source AI (OSAI) reveals critical tensions between access and equity, openness and extraction, and public value and private interest.
Below, we outline some key questions and themes emerging from our work exploring OSAI in India:
1. Is Open Source AI a technical category or a composite claim?
Unlike open source software, an AI system is made up of multiple components: source code, model weights, training data, evaluation metrics, and fine-tuning instructions. Rarely are all of these open: one model may release its weights, another its methodology, a third its data.
What, then, counts as “open”? Is it transparency, replicability, affordability? Further, if openness is a composite, how do we guard against open-washing, where the ambiguity of fragmentation becomes a branding exercise rather than a meaningful practice?
2. The value proposition: transparency, innovation, and reduced duplication.
Open ecosystems reduce duplication. When datasets, APIs, or model architectures are available to all, they lower barriers to entry for smaller actors, civic tech groups, and public sector developers, and enable innovation, cost savings, and greater scrutiny.
However, in multilingual and culturally diverse contexts like India, does openness lead to more diverse innovation in practice? In a context where building homegrown, language-aware models requires local data, not one-size-fits-all systems, can open data unlock localised solutions or does it risk being co-opted by better-resourced actors?
3. Voice tech in India shows the stakes of open data, as well as its vulnerabilities.
Our ongoing work on speech datasets in Indian languages highlights what openness looks like on the ground. Through this project, we are examining how responsible, ethical datasets for low-resource Indian languages are built and sustained to support downstream voice tech applications.
As part of this work, we are tackling some critical questions around voice data equity: What does consent look like in low-resource settings? How can we ensure responsible reuse and prevent communities from losing control over the data they help build?
4. How do we think about risk?
Open models can be misused or repurposed to bypass safety features and generate harmful content. But is the answer restriction or regulation?
These risks underscore the need for clear, enforceable governance frameworks that build trust and enable accountability. What kinds of frameworks can enable openness without sacrificing accountability?
5. What sustains open ecosystems?
Openness has real costs: hosting, maintaining, documenting, translating, governing. It remains to be seen which financing models can ensure the sustainability and independence of emerging open-source AI communities.
For deeper insights, keep an eye out for publications from our projects on Open Source AI:
- Open Source AI: Policy Options for India: Making AI systems open source could democratise the AI ecosystem, but open-source AI could also contribute to new forms of data extractivism. Through this project, we explore these tensions and develop recommendations for Indian policymakers and AI developers.
- Voice Technologies for Indian Languages: Best Practices & Recommendations for Responsible & Open AI: In partnership with ARTPARK and Trilegal, and supported by Bhashini and GIZ, we are working to identify barriers and enablers for open-source voice technologies in India and to develop best practices and recommendations for their responsible development and use.
To receive these monthly reflections and insights directly in your inbox, subscribe to our newsletter.