The terms ‘open source’ and ‘open’ are used liberally in the LLM world – yet behind the marketing claims lie massive differences. In this article, I aim to categorise the spectrum of openness in these systems and show which relevant models fall into which category. Full transparency is crucial, particularly for trustworthy AI applications – yet, as we shall see, it is rarely achieved.

The spectrum of openness: 5 levels

In 2024, the Open Source Initiative (OSI) established a standard for the first time with the Open Source AI Definition (OSAID 1.0). In addition, the Linux Foundation’s Model Openness Framework (MOF) offers a graded approach to determining how open – and therefore traceable – an LLM actually is. From these frameworks and practical experience, we can derive five levels – ranging from completely closed to completely open.

Category	Description
Weights	The trained model parameters – the ‘brain’ of the model, which can be executed directly.
Code	Source code for training – enables traceability and customisation.
Training data	The datasets used – crucial for transparency, bias analysis and legal traceability.
Training methodology	Procedures, hyperparameters and processes during training – ranging from brief paper descriptions to full reproducibility.

Overview

Level	Description	Weights	Code	Training data	Training methodology	Licence
5	Closed/ Proprietary	❌	❌	❌	❌	API access only
4	Restricted Weights	✅	⚠️ Partially	❌	⚠️ Paper	Restrictive (usage limits)
3	Open Weights	✅	⚠️ Partially	❌	⚠️ Paper	Free to restrictive
2	Open Weights + Open Methodology	✅	✅	⚠️ Partially	✅	Free
1	Fully Open Source	✅	✅	✅	✅	Free (Apache 2.0, MIT)
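The overview table can be read as a small decision procedure. The following is a minimal sketch of that reading, not an official OSI or MOF algorithm. The names `Availability`, `ModelRelease`, `restrictive_licence` and `openness_level` are hypothetical, and the single licence flag is a deliberate simplification of the ‘Licence’ column. As a further assumption, the sketch treats full availability of the training data as the only difference between Level 2 and Level 1.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    NO = 0       # marked with a cross in the table
    PARTIAL = 1  # marked with a warning sign in the table
    YES = 2      # marked with a tick in the table


@dataclass(frozen=True)
class ModelRelease:
    weights: Availability
    code: Availability
    training_data: Availability
    methodology: Availability
    # Hypothetical flag: True if the licence imposes significant restrictions
    # (e.g. commercial-use limits or thresholds), as described for Level 4.
    restrictive_licence: bool


def openness_level(release: ModelRelease) -> int:
    """Return 5 (closed) to 1 (fully open), following the overview table."""
    if release.weights is not Availability.YES:
        return 5  # no downloadable weights: API access only
    if release.restrictive_licence:
        return 4  # weights available, but usage is tied to conditions
    if (release.code is not Availability.YES
            or release.methodology is not Availability.YES):
        return 3  # open weights only
    if release.training_data is not Availability.YES:
        return 2  # open weights and methodology, data not fully available
    return 1      # everything is open


example = ModelRelease(
    weights=Availability.YES,
    code=Availability.YES,
    training_data=Availability.PARTIAL,
    methodology=Availability.YES,
    restrictive_licence=False,
)
print(openness_level(example))  # 2
```

Real releases rarely fit a checklist this cleanly, so the result is best treated as a starting point for reading the licence and the technical report, not as a verdict.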

Level 5: Closed / Proprietary ⚫

No access to weights, code or data. Use is restricted to APIs or licensed integrations.

These models are under the full control of the developer companies. You can use them, but not inspect, modify or host them yourself. The internal architecture, training data and code remain trade secrets.

Relevant models

Model	Organisation	Key features
GPT-4o / GPT-5	OpenAI	Flagship models. Multimodal. API-only.
Claude 4 / 4.5	Anthropic	Focus on safety and long contexts. API-only.
Gemini 3.1 / 3.1 Pro	Google	Natively multimodal. Deeply integrated into Google products.
Grok-3 / 4	xAI	Successors to Grok-1 and 2 (which were still open). Closed.

Classification: These models often offer the highest performance ‘out of the box’, but they give you no control over the data and no reproducibility of results, which leaves you completely dependent on the provider. Often, it is not even known how large the model is or how much training data was used.

Level 4: Restricted Weights (Restricted Open) 🔴

Weights are downloadable, but the licence contains significant restrictions, e.g. limits on commercial use, usage regulations or attribution requirements above certain thresholds.

These models are often marketed as “open source”, but are not open source according to the OSI definition. They provide access to the weights, but tie usage to conditions.

Relevant models

Model	Organisation	Licence	Restrictions
Llama 4 (Scout/Maverick)	Meta	Llama License	Commercial use up to 700 million monthly active users (MAU). Above this: separate licence required. Prohibited from training other LLMs with it.
Kimi K2.5	Moonshot AI	Modified MIT	From 100 million MAU or $20 million in revenue: ‘Kimi K2.5’ branding mandatory.
Command R+	Cohere	CC-BY-NC-4.0	No commercial use without a separate licence agreement with Cohere.

Classification: Meta’s Llama models are the most prominent example of this category – they are undoubtedly useful and powerful, but the licence excludes key open-source freedoms.
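To make the threshold logic of these licences concrete, here is a minimal sketch that checks a planned deployment against the conditions listed in the table. The numbers are taken from the table as stated there; the actual licence texts are authoritative and may define the thresholds differently. `Deployment`, `licence_flags` and the model keys are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    monthly_active_users: int
    revenue_usd: float
    commercial: bool


def licence_flags(model: str, d: Deployment) -> list[str]:
    """Return the licence conditions from the table that this deployment triggers."""
    flags = []
    # Llama 4: commercial use is covered up to 700 million MAU.
    if model == "llama-4" and d.commercial and d.monthly_active_users > 700_000_000:
        flags.append("separate licence from Meta required")
    # Kimi K2.5: branding duty from 100 million MAU or $20 million in revenue.
    if model == "kimi-k2.5" and (
        d.monthly_active_users >= 100_000_000 or d.revenue_usd >= 20_000_000
    ):
        flags.append("'Kimi K2.5' branding mandatory")
    # Command R+: CC-BY-NC, so any commercial use needs an agreement.
    if model == "command-r-plus" and d.commercial:
        flags.append("separate licence agreement with Cohere required")
    return flags


startup = Deployment(monthly_active_users=50_000, revenue_usd=200_000.0, commercial=True)
print(licence_flags("llama-4", startup))         # []
print(licence_flags("command-r-plus", startup))  # ['separate licence agreement with Cohere required']
```

For a small product, the Llama and Kimi thresholds are far away, while the non-commercial clause of Command R+ applies from the first paying customer.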

Level 3: Open Weights 🟠

Model weights are freely available and can be used (including for commercial purposes), but the training data and often the training code as well remain proprietary.

This is the most common category among ‘open’ models. You can download them, run them locally and fine-tune them – but you cannot reproduce them from scratch, as the training data is missing.

Relevant models

Model	Organisation	Licence	Key features
Gemma 3 / 4	Google	Gemma License	Multimodal. Efficient on consumer hardware. 256K context.
GLM-5	Zhipu AI	MIT	744B MoE (40B active). Strong at coding and agentic tasks. No usage restrictions.
gpt-oss-120b	OpenAI	Apache 2.0	First open OpenAI model since GPT-2. 117B (MoE, 5.1B active). Strong in knowledge (MMLU-Pro approx. 80.8%).

Classification: For most companies and developers, this category is the sweet spot – you get powerful models with extensive freedom of use, without the complexity of full reproducibility.

Level 2: Open Weights + Open Methodology

Weights and code are open and licensed without usage restrictions; training data is partially documented or referenced, but not fully available.

These models go far beyond ‘just weights’: they publish detailed technical reports, training recipes and often the training code as well – but the exact training data is not fully available, for example due to copyright reasons or the sheer volume of data.

Relevant models

Model	Organisation	Parameters	Licence	Special features
DeepSeek V3 / V3.2	DeepSeek	671B (37B active, MoE)	MIT (Code) / DeepSeek Model License (Weights)	Full training code open-source. Detailed paper. Training data not open-source, but methodology excellently documented. Weights commercially usable.
DeepSeek R1	DeepSeek	671B (37B active, MoE)	MIT (Code) / DeepSeek Model License (Weights)	Reasoning model with RL. Distilled variants: Qwen-based under Apache 2.0, Llama-based under Llama Licence.
Qwen 3 / 3.5	Alibaba	up to 397B (MoE)	Apache 2.0	Widest range of models (0.6B–235B). 200+ languages. Training methodology documented in papers.
Mixtral 8x22B / Mistral Small 3	Mistral AI	141B (MoE, 39B active) / 24B	Apache 2.0	European-based company. Freely usable (unlike Mistral Large 2, which is licensed under the Mistral Research Licence and would therefore be classified as Level 4).

Classification: This level includes many of the most powerful open models currently available. DeepSeek and Qwen set the standard for industry-ready openness under the MIT and Apache 2.0 licences respectively – without disclosing the full training data.

Level 1: Fully Open Source

Everything is open: weights, code, training data, methodology and documentation. The model can be reproduced from scratch.

This is the strictest category – and the rarest. According to the OSI definition (OSAID 1.0), all components must be available without restrictions on use (for example, under Apache 2.0 or MIT): model weights, complete training code, the training data (or sufficiently detailed documentation), and the entire training methodology.

Why is this important?

Only with complete openness can one audit bias in training data, verify results, and actually reproduce the model from scratch. This is the foundation for genuine verifiability.

Relevant models

Model	Organisation	Key features
OLMo 3 / 3.1	AI2 (Allen Institute)	All checkpoints, Dolma-3 training data, logs and evaluation code are open-source. Apache 2.0. Includes OLMoTrace for tracing back to source data.
Amber-7B / Crystal-7B / K2-65B	LLM360	Project with radical transparency (“360°”): all checkpoints, training data, metrics and W&B logs open. K2-65B outperforms Llama 2 70B.
Pythia	EleutherAI	Research model suite with 8 sizes (70M–12B), 154 checkpoints each. Pile training data open. Apache 2.0.
BLOOM (176B)	BigScience / HuggingFace	Pioneering project (July 2022): ROOTS corpus (1.6 TB, 46 languages) open. BigScience BLOOM RAIL License v1.0.
MAP-Neo (7B)	M-A-P	Bilingual (EN/ZH). 4.5T tokens. Training data (MatrixPile), cleaning pipeline and checkpoints open.

Classification: These models are not the most powerful – but they are invaluable to the scientific and open-source communities. OLMo from AI2 is currently the flagship model in this field.

The key differences in detail

What exactly is available?

	Level 5	Level 4	Level 3	Level 2	Level 1
Weights	❌	✅	✅	✅	✅
Architectural details	❌	⚠️	✅	✅	✅
Training code	❌	❌	⚠️	✅	✅
Training data	❌	❌	❌	⚠️	✅
Training methodology	❌	⚠️	⚠️	✅	✅
Open licence	❌	❌	✅	✅	✅
Reproducibility	❌	❌	❌	⚠️	✅

Licence map

Licence	Type	Commercial use	Examples
Proprietary	Closed	❌ API only	GPT-4, Claude, Gemini
CC-BY-NC	Restrictive	❌ Non-commercial only	Command R+
Llama License	Restrictive	⚠️ Up to 700M MAU	Llama 3, Llama 4
RAIL	Restrictive	⚠️ With usage restrictions	BLOOM
Gemma License	Semi-open	✅ With usage guidelines	Gemma 3, Gemma 4
MIT	Open (no restrictions)	✅ Unrestricted	DeepSeek (Code), GLM-5, Phi-4
Apache 2.0	Open (no restrictions)	✅ Unrestricted	Qwen, Mixtral, OLMo, Falcon 7B/40B
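When screening candidate models, the licence map can be kept as a simple lookup. The sketch below mirrors the ‘Commercial use’ column of the table one to one. `CommercialUse`, `LICENCE_MAP` and `licences_with` are hypothetical names, and the three-way status is only as precise as the marks in the table.

```python
from enum import Enum


class CommercialUse(Enum):
    NO = "no"                    # cross in the table
    CONDITIONAL = "conditional"  # warning sign in the table
    YES = "yes"                  # tick in the table


# Licence -> (status, note), copied from the licence map above.
LICENCE_MAP = {
    "Proprietary": (CommercialUse.NO, "API only"),
    "CC-BY-NC": (CommercialUse.NO, "Non-commercial only"),
    "Llama License": (CommercialUse.CONDITIONAL, "Up to 700M MAU"),
    "RAIL": (CommercialUse.CONDITIONAL, "With usage restrictions"),
    "Gemma License": (CommercialUse.YES, "With usage guidelines"),
    "MIT": (CommercialUse.YES, "Unrestricted"),
    "Apache 2.0": (CommercialUse.YES, "Unrestricted"),
}


def licences_with(status: CommercialUse) -> list[str]:
    """Return all licences in the map with the given status, sorted by name."""
    return sorted(name for name, (s, _) in LICENCE_MAP.items() if s is status)


print(licences_with(CommercialUse.YES))  # ['Apache 2.0', 'Gemma License', 'MIT']
```

Keeping the note next to the status matters: the Gemma License counts as commercially usable in the table, but only within its usage guidelines.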

Conclusion: What does this mean in practice?

1. ‘Open source’ ≠ ‘open source’ – The term is used loosely. Only Level 1 models fully meet the OSI definition. Most popular ‘open’ models fall into Levels 2–3.

2. The sweet spot lies in Levels 2–3 – Models such as DeepSeek V3.2, Qwen 3.5 or Gemma 4 offer an excellent balance of performance, freedom of use and accessibility.

3. Caution with Level 4 – Llama models are fantastic for prototyping and research, but the licence terms can become a problem in commercial use.

4. Level 1 is crucial for science – projects such as OLMo and Pythia enable genuine research into the behaviour of LLMs, bias analysis and algorithmic transparency.

5. The gap is closing – in 2025/2026, open models (Levels 1–3) now reach, on many benchmarks, the level that proprietary models held only a few months earlier. The case for committing entirely to closed providers is weakening.

As of April 2026. The LLM landscape is evolving rapidly – new models and licences can quickly alter the classification.

Sources & further links

Open Source AI Definition (OSAID 1.0) – OSI

Model Openness Framework – Linux Foundation

OLMo – AI2

Open Source LLM Leaderboard – whatllm.org