Open Source in Large Language Models: How ‘open’ is ‘open’ really?
The terms ‘open source’ and ‘open’ are used liberally in the LLM world – yet behind the marketing claims lie massive differences. In this article, I aim to categorise the spectrum of openness in these systems and show which relevant models fall into which category. Full transparency is crucial, particularly for trustworthy AI applications – yet, as we shall see, it is rarely achieved.
The spectrum of openness: 5 levels
In 2024, the Open Source Initiative (OSI) established a standard for the first time with the Open Source AI Definition (OSAID 1.0). In addition, the Linux Foundation’s Model Openness Framework (MOF) offers a graded approach to determining how open – and therefore traceable – an LLM actually is. From these frameworks and practical experience, we can derive five levels – ranging from completely closed to completely open.
| Category | Description |
| Weights | The trained model parameters – the ‘brain’ of the model, which can be executed directly. |
| Code | Source code for training – enables traceability and customisation. |
| Training data | The datasets used – crucial for transparency, bias analysis and legal traceability. |
| Training methodology | Procedures, hyperparameters and processes during training – ranging from brief paper descriptions to full reproducibility. |
Overview
| Level | Description | Weights | Code | Training data | Training methodology | Licence |
| 5 | Closed / Proprietary | ❌ | ❌ | ❌ | ❌ | API access only |
| 4 | Restricted Weights | ✅ | ⚠️ Partially | ❌ | ⚠️ Paper | Restrictive (usage limits) |
| 3 | Open Weights | ✅ | ⚠️ Partially | ❌ | ⚠️ Paper | Free to restrictive |
| 2 | Open Weights + Open Methodology | ✅ | ✅ | ⚠️ Partially | ✅ | Free |
| 1 | Fully Open Source | ✅ | ✅ | ✅ | ✅ | Free (Apache 2.0, MIT) |
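The overview can also be read as a small decision procedure. The following is a minimal sketch, not an official OSAID or MOF implementation: the names `Availability`, `ModelRelease` and `openness_level` are hypothetical, and the rules are a simplified reading of the table (for instance, any release with fully open code and methodology but incomplete data is treated as Level 2).

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    """How much of a component is published (illustrative scale)."""
    NONE = 0      # corresponds to ❌ in the table
    PARTIAL = 1   # corresponds to ⚠️ (partial code, paper-only methodology)
    FULL = 2      # corresponds to ✅


@dataclass(frozen=True)
class ModelRelease:
    """Hypothetical description of what a model release makes available."""
    weights: bool
    code: Availability
    training_data: Availability
    methodology: Availability
    restrictive_licence: bool  # True for usage limits, MAU caps, non-commercial terms


def openness_level(release: ModelRelease) -> int:
    """Map a release to the article's levels: 5 (closed) to 1 (fully open)."""
    if not release.weights:
        return 5  # API access only
    if release.restrictive_licence:
        return 4  # weights downloadable, but usage is tied to conditions
    code_and_method_open = (
        release.code is Availability.FULL
        and release.methodology is Availability.FULL
    )
    if code_and_method_open and release.training_data is Availability.FULL:
        return 1  # reproducible from scratch
    if code_and_method_open:
        return 2  # open methodology, training data incomplete
    return 3      # open weights only


# Example: open weights, partial code, paper-only methodology, free licence
example = ModelRelease(
    weights=True,
    code=Availability.PARTIAL,
    training_data=Availability.NONE,
    methodology=Availability.PARTIAL,
    restrictive_licence=False,
)
print(openness_level(example))
```

Real releases rarely fit such flags cleanly, so the function is best seen as a way to make the criteria explicit rather than as a replacement for reading the licence.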
Level 5: Closed / Proprietary ⚫
No access to weights, code or data. Use is restricted to APIs or licensed integrations.
These models are under the full control of the developer companies. You can use them, but not inspect, modify or host them yourself. The internal architecture, training data and code remain trade secrets.
Relevant models
| Model | Organisation | Key features |
| GPT-4o / GPT-5 | OpenAI | Flagship models. Multimodal. API-only. |
| Claude 4 / 4.5 | Anthropic | Focus on safety and long contexts. API-only. |
| Gemini 3.1 / 3.1 Pro | Google | Natively multimodal. Deeply integrated into Google products. |
| Grok-3 / 4 | xAI | Successors to Grok-1 and 2 (which were still open). Closed. |
Classification: These models often offer the highest performance ‘out of the box’, but they give users no control over the data and no way to reproduce results, and they leave users completely dependent on the provider. Often, it is not even known how large the model is or how much training data was used.
Level 4: Restricted Weights (Restricted Open) 🔴
Weights are downloadable, but the licence contains significant restrictions, e.g. limits on commercial use, usage regulations or attribution requirements above certain thresholds.
These models are often marketed as “open source”, but are not open source according to the OSI definition. They provide access to the weights, but tie usage to conditions.
Relevant models
| Model | Organisation | Licence | Restrictions |
| Llama 4 (Scout/Maverick) | Meta | Llama License | Commercial use up to 700 million monthly active users (MAU). Above this: separate licence required. Using it to train other LLMs is prohibited. |
| Kimi K2.5 | Moonshot AI | Modified MIT | From 100 million MAU or $20 million in revenue: ‘Kimi K2.5’ branding mandatory. |
| Command R+ | Cohere | CC-BY-NC-4.0 | No commercial use without a separate licence agreement with Cohere. |
Classification: Meta’s Llama models are the most prominent example of this category – they are undoubtedly useful and powerful, but the licence excludes key open-source freedoms.
Level 3: Open Weights 🟠
Model weights are freely available and can be used (including for commercial purposes), but the training data and often the training code as well remain proprietary.
This is the most common category among ‘open’ models. You can download them, run them locally and fine-tune them – but you cannot reproduce them from scratch, as the training data is missing.
Relevant models
| Model | Organisation | Licence | Key features |
| Gemma 3 / 4 | Google | Gemma License | Multimodal. Efficient on consumer hardware. 256K context. |
| GLM-5 | Zhipu AI | MIT | 744B MoE (40B active). Strong at coding and agentic tasks. No usage restrictions. |
| gpt-oss-120b | OpenAI | Apache 2.0 | First open OpenAI model since GPT-2. 117B (MoE, 5.1B active). Strong in knowledge (MMLU-Pro approx. 80.8%). |
Classification: For most companies and developers, this category is the sweet spot – you get powerful models with extensive freedom of use, without the complexity of full reproducibility.
Level 2: Open Weights + Open Methodology 
Weights and code are open and licensed without usage restrictions; training data is partially documented or referenced, but not fully available.
These models go far beyond ‘just weights’: they publish detailed technical reports, training recipes and often the training code as well – but the exact training data is not fully available, for example due to copyright reasons or the sheer volume of data.
| Model | Organisation | Parameters | Licence | Special features |
| DeepSeek V3 / V3.2 | DeepSeek | 671B (37B active, MoE) | MIT (Code) / DeepSeek Model License (Weights) | Full training code open-source. Detailed paper. Training data not open-source, but methodology excellently documented. Weights commercially usable. |
| DeepSeek R1 | DeepSeek | 671B (37B active, MoE) | MIT (Code) / DeepSeek Model License (Weights) | Reasoning model with RL. Distilled variants: Qwen-based under Apache 2.0, Llama-based under Llama Licence. |
| Qwen 3 / 3.5 | Alibaba | up to 397B (MoE) | Apache 2.0 | Widest range of models (0.6B–235B). 200+ languages. Training methodology documented in papers. |
| Mixtral 8x22B / Mistral Small 3 | Mistral AI | 141B (MoE, 39B active) / 24B | Apache 2.0 | European-based company. Freely usable (unlike Mistral Large 2, which is licensed under the Mistral Research Licence and would therefore be classified as Level 4). |
Classification: This level includes many of the most powerful open models currently available. DeepSeek and Qwen set the standard for industry-ready openness under the MIT and Apache 2.0 licences respectively – without disclosing the full training data.
Level 1: Fully Open Source 
Everything is open: weights, code, training data, methodology and documentation. The model can be reproduced from scratch.
This is the strictest category – and the rarest. According to the OSI definition (OSAID 1.0), all components must be available without restrictions on use (for example, under Apache 2.0 or MIT): model weights, complete training code, the training data (or sufficiently detailed documentation), and the entire training methodology.
Why is this important?
Only with complete openness can one audit bias in training data, verify results, and actually reproduce the model from scratch. This is the foundation for genuine verifiability.
Relevant models
| Model | Organisation | Key features |
| OLMo 3 / 3.1 | AI2 (Allen Institute) | All checkpoints, Dolma-3 training data, logs and evaluation code are open-source. Apache 2.0. Includes OLMoTrace for tracing back to source data. |
| Amber-7B / Crystal-7B / K2-65B | LLM360 | Project with radical transparency (“360°”): all checkpoints, training data, metrics and W&B logs open. K2-65B outperforms Llama 2 70B. |
| Pythia | EleutherAI | Research model suite with 8 sizes (70M–12B), 154 checkpoints each. Pile training data open. Apache 2.0. |
| BLOOM (176B) | BigScience / HuggingFace | Pioneering project (July 2022): ROOTS corpus (1.6 TB, 46 languages) open. BigScience BLOOM RAIL License v1.0. |
| MAP-Neo (7B) | M-A-P | Bilingual (EN/ZH). 4.5T tokens. Training data (MatrixPile), cleaning pipeline and checkpoints open. |
Classification: These models are not the most powerful – but they are invaluable to the scientific community and the open-source community. OLMo from AI2 is currently the flagship model in this field.
The key differences in detail
What exactly is available?
| Component | Level 5 | Level 4 | Level 3 | Level 2 | Level 1 |
| Weights | ❌ | ✅ | ✅ | ✅ | ✅ |
| Architectural details | ❌ | ⚠️ | ✅ | ✅ | ✅ |
| Training code | ❌ | ❌ | ⚠️ | ✅ | ✅ |
| Training data | ❌ | ❌ | ❌ | ⚠️ | ✅ |
| Training methodology | ❌ | ⚠️ | ⚠️ | ✅ | ✅ |
| Open licence | ❌ | ❌ | ✅ | ✅ | ✅ |
| Reproducibility | ❌ | ❌ | ❌ | ⚠️ | ✅ |
Licence map
| Licence | Type | Commercial use | Examples |
| Proprietary | Closed | ❌ API only | GPT-4, Claude, Gemini |
| CC-BY-NC | Restrictive | ❌ Non-commercial only | Command R+ |
| Llama License | Restrictive | ⚠️ Up to 700M MAU | Llama 3, Llama 4 |
| RAIL | Restrictive | ⚠️ With usage restrictions | BLOOM |
| Gemma License | Semi-open | ✅ With usage guidelines | Gemma 3, Gemma 4 |
| MIT | Open (no restrictions) | ✅ Unrestricted | DeepSeek (Code), GLM-5, Phi-4 |
| Apache 2.0 | Open (no restrictions) | ✅ Unrestricted | Qwen, Mixtral, OLMo, Falcon 7B/40B |
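The licence map lends itself to a simple lookup. The sketch below is illustrative only: `CommercialUse`, `LICENCE_MAP` and `commercial_use` are hypothetical names, the statuses mirror the ❌ / ⚠️ / ✅ markers of the table, and the only threshold modelled is the 700 million MAU limit quoted for the Llama License. It assumes that exceeding the limit means a separate licence is required, as stated in the Level 4 table.

```python
from enum import Enum


class CommercialUse(Enum):
    """Mirrors the markers in the licence map (illustrative)."""
    NO = "❌"
    LIMITED = "⚠️"
    YES = "✅"


# Threshold taken from the table above: 700 million monthly active users
LLAMA_MAU_LIMIT = 700_000_000

# Licence names and statuses copied from the licence map
LICENCE_MAP = {
    "Proprietary": CommercialUse.NO,         # API only
    "CC-BY-NC": CommercialUse.NO,            # non-commercial only
    "Llama License": CommercialUse.LIMITED,  # up to 700M MAU
    "RAIL": CommercialUse.LIMITED,           # with usage restrictions
    "Gemma License": CommercialUse.YES,      # with usage guidelines
    "MIT": CommercialUse.YES,
    "Apache 2.0": CommercialUse.YES,
}


def commercial_use(licence: str, monthly_active_users: int = 0) -> CommercialUse:
    """Look up the commercial-use status for a licence name from the table."""
    status = LICENCE_MAP[licence]  # raises KeyError for licences not in the map
    if licence == "Llama License" and monthly_active_users > LLAMA_MAU_LIMIT:
        # Above the threshold a separate licence is required
        return CommercialUse.NO
    return status


print(commercial_use("Apache 2.0").value)
print(commercial_use("Llama License", monthly_active_users=800_000_000).value)
```

A lookup like this only captures the headline terms; attribution clauses, acceptable-use policies and similar conditions still have to be checked in the licence text itself.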
Conclusion: What does this mean in practice?
1. ‘Open source’ ≠ ‘open source’ – The term is used loosely. Only Level 1 models fully meet the OSI definition. Most popular ‘open’ models fall into Levels 2–3.
2. The sweet spot lies in Levels 2–3 – Models such as DeepSeek V3.2, Qwen 3.5 or Gemma 4 offer an excellent balance of performance, freedom of use and accessibility.
3. Caution with Level 4 – Llama models are fantastic for prototyping and research, but the licence terms can become a problem in commercial use.
4. Level 1 is crucial for science – projects such as OLMo and Pythia enable genuine research into the behaviour of LLMs, bias analysis and algorithmic transparency.
5. The gap is closing – in 2025/2026, open models (Levels 1–3) reach, on many benchmarks, the level that proprietary models held only a few months earlier. The case for committing entirely to closed providers is growing weaker.
As of April 2026. The LLM landscape is evolving rapidly – new models and licences can quickly alter the classification.
Sources & further links
Open Source AI Definition (OSAID 1.0) – OSI

