Open-weight · Local hosting possible · Apache 2.0 License

GPT-OSS

OpenAI — Published August 2025

At a glance: GPT-OSS is the first reasoning language model family published by OpenAI under an open license (Apache 2.0). Unlike ChatGPT, these models can be downloaded and run on your own machines — opening the possibility of use in a controlled environment, where no data leaves your infrastructure. For clinicians, this is a paradigm shift: the question is no longer just “is the AI reliable?” but also “can I run it on my own premises?”.

Identity

Publisher: OpenAI (San Francisco, USA)

Published: August 2025

Variants: gpt-oss-120b (117B params) / gpt-oss-20b (21B params)

Type: Large language model (LLM) with reasoning

License: Apache 2.0 (commercial use authorized)

Architecture: Mixture-of-Experts (MoE)

Context: 128,000 tokens

Local execution: gpt-oss-20b → 16 GB VRAM (consumer GPU)

What GPT-OSS does (in plain terms)

GPT-OSS works on the same principle as ChatGPT: it is a language model that generates text by statistically predicting the next word. The fundamental difference is its distribution model: the model weights are publicly downloadable, meaning any organization can install and run it on its own servers.

Two variants


gpt-oss-120b

117 billion parameters (5.1 billion active per token thanks to the MoE architecture). Requires a single H100-class GPU (80 GB). Performance close to top proprietary models.


gpt-oss-20b

21 billion parameters (3.6 billion active per token). Runs within 16 GB of VRAM, i.e. a consumer-grade GPU such as the RTX 4080 or equivalent. This is the most accessible variant for local installation.
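The 16 GB figure can be sanity-checked with back-of-the-envelope arithmetic on parameter count and numeric precision. A minimal sketch (the function name is ours; the ~4-bit quantization of the released weights is an assumption based on OpenAI's release notes):

```python
# Back-of-the-envelope VRAM estimate for holding model weights in memory.
# This ignores activations and KV cache, which add further overhead.

def weight_memory_gb(n_params_billion: float, bits_per_param: float) -> float:
    """Gigabytes needed just to store the weights."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# 21B parameters at full 16-bit precision: well beyond a 16 GB consumer GPU.
fp16 = weight_memory_gb(21, 16)   # 42.0 GB
# The same weights at ~4-bit quantization fit with room to spare.
q4 = weight_memory_gb(21, 4)      # 10.5 GB
```

The gap between 42 GB and 10.5 GB is why quantization, not raw parameter count alone, determines whether a model is "consumer-GPU sized".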

GPT-OSS's distinctive technical feature is its adjustable reasoning: the model can run an internal “chain of thought” at three effort levels (low, medium, high), adapting the depth of its analysis to the complexity of the question.
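The effort level is typically selected through the system prompt. A hedged sketch, assuming the `Reasoning: <level>` convention from OpenAI's published examples (the helper name is ours, and exact prompt wording may vary across inference stacks):

```python
# Illustrative: selecting gpt-oss's reasoning depth via the system prompt.
# "Reasoning: <level>" follows OpenAI's published examples; treat the exact
# string as an assumption, not a guaranteed API.

VALID_LEVELS = ("low", "medium", "high")

def build_system_prompt(base: str, reasoning: str = "medium") -> str:
    """Append a reasoning-effort directive to a base system prompt."""
    if reasoning not in VALID_LEVELS:
        raise ValueError(f"reasoning must be one of {VALID_LEVELS}")
    return f"{base}\nReasoning: {reasoning}"

prompt = build_system_prompt(
    "You are a careful clinical writing assistant.", "high"
)
```

Higher effort levels produce longer internal chains of thought, trading latency and cost for analysis depth.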

Open-weight ≠ open-source: a distinction worth knowing

The term “open-source” is frequently misused in the AI field. It is important to understand what GPT-OSS actually offers — and what it does not make accessible.

What is open (open-weight)

  • The model weights (trained parameters)
  • The inference code (to run the model)
  • The Apache 2.0 license (free commercial use)
  • The technical documentation (model card)

What remains closed

  • The training data
  • The training code
  • The alignment procedures (RLHF)
  • The full internal evaluations

Why this matters for clinicians: An open-weight model can be partially audited (its outputs, its observable biases), but not fully (we don't know exactly what it was trained on). You can control where it runs and who accesses the data, but you cannot verify everything it has learned. See our glossary entry on Open Source / Open Weights.

The case for a local model

When you use ChatGPT, your queries transit through OpenAI's servers in the United States. With an open-weight model like GPT-OSS, it becomes technically possible to run the AI on a local server (within a healthcare facility, a certified health data host, or even on an equipped workstation). No data leaves your infrastructure.
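In practice, local inference servers such as vLLM or Ollama expose an OpenAI-compatible HTTP API, so client code simply targets `localhost` instead of OpenAI's cloud. A minimal sketch; the port, endpoint path, and model name are assumptions for illustration:

```python
# Sketch: querying a locally hosted gpt-oss through an OpenAI-compatible
# API. Nothing here leaves the local machine.
import json
import urllib.request

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble the JSON payload for a /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.2,  # low temperature for more deterministic drafts
    }

payload = build_chat_request("gpt-oss-20b", "Summarize this session note: ...")

# Sending it (requires a local server listening on port 8000):
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# response = json.load(urllib.request.urlopen(req))
```

Because the endpoint shape mirrors OpenAI's hosted API, existing client code can often be pointed at a local server with a one-line URL change.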

Concrete mental health scenarios

  • Sovereign clinical writing assistance: drafting reports or session summaries without sending patient data to a foreign cloud.
  • Clinical case exploration: testing diagnostic hypotheses or therapeutic strategies with a local LLM, in full confidentiality.
  • Research and training: training staff or conducting pilot studies on patient-AI interactions without dependency on an external provider.
  • Regulatory compliance: local hosting facilitates GDPR compliance (Art. 9, sensitive data) and health data hosting requirements.

Caution: Local installation does not automatically make use “safe”. A local model can still produce hallucinations, biases, or inappropriate responses. Control applies to data confidentiality, not to response reliability. The same verification requirements apply — the HAS A.V.E.C. framework remains relevant whether the model runs on an American cloud or within your facility.

Limitations and realism

Technical skills required

Installing and running an LLM locally is not trivial. It requires system administration skills, suitable hardware (GPU with sufficient VRAM), and ongoing maintenance. This is not (yet) within reach of individual clinicians — but rather of IT departments or specialized providers.

Lower performance than proprietary models

GPT-OSS is competitive but remains behind the most advanced commercial versions (GPT-5, Claude Opus). For tasks requiring the highest reasoning quality, the confidentiality/performance trade-off must be evaluated on a case-by-case basis.

No built-in guardrails

Unlike ChatGPT, which includes safety mechanisms (crisis detection, redirection to hotlines), a locally executed model only inherits the alignment built in during training. Additional guardrails must be implemented by the organization.
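What an organization-side guardrail might look like, in a deliberately minimal sketch (the keyword list is a toy assumption; a real deployment needs clinically validated tooling, not string matching):

```python
# Toy illustration of a post-generation guardrail: screening model output
# for crisis-related language before it reaches the user interface.

CRISIS_TERMS = ("suicide", "self-harm", "overdose")

def screen_output(text: str) -> tuple[str, bool]:
    """Return the text plus a flag telling the UI to surface crisis resources."""
    flagged = any(term in text.lower() for term in CRISIS_TERMS)
    return text, flagged

_, needs_banner = screen_output("The patient mentioned suicide ideation.")
# When needs_banner is True, the application appends hotline information.
```

The point is architectural: with a local model, this layer does not exist until the hosting organization builds it.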

Residual opacity

Even with open weights, the training data remains unknown. We cannot verify whether the model was exposed to quality clinical sources or problematic content. The openness is partial.

Our analysis

GPT-OSS represents a turning point in the accessibility of advanced language models. For the first time, OpenAI — the publisher of ChatGPT, the most widely used LLM — is releasing a reasoning model that any institution can host on its own servers.

For mental health facilities, this is a strategic opportunity. The main ethical barrier to using LLMs in clinical practice — sending sensitive data to foreign servers — can theoretically be lifted with local or sovereign hosting. GPT-OSS and its equivalents (Meta's LLaMA, Mistral AI's models) are shaping a landscape where clinicians are no longer captive to a single provider.

But we must remain clear-eyed. Data sovereignty does not solve the question of clinical reliability. A local model can hallucinate just as much as a cloud model. It can reinforce biases, validate rumination, or produce inappropriate recommendations. The framework for cautious use — systematic verification, transparency, human oversight — applies identically.

The real promise of GPT-OSS is not technological autonomy. It is the ability to experiment under control: testing use cases, evaluating scenarios, training teams, conducting research — all without dependency on a third party and without risk to patient confidentiality. It is a research and exploration tool before being a clinical tool.

GPT-OSS vs ChatGPT: quick comparison

Criterion        | ChatGPT                          | GPT-OSS
Hosting          | OpenAI servers (USA)             | Local or cloud, your choice
Patient data     | Transits externally              | Stays in-house
Response quality | Superior (GPT-5)                 | Competitive (120B) / decent (20B)
Guardrails       | Built-in (hotlines, disclaimers) | Must be implemented yourself
Accessibility    | Immediate (web browser)          | Technical skills required
Cost             | Free / $20/month                 | Free (model) + hardware/infra cost

Other notable open-weight models

GPT-OSS is not the only open-weight model. The ecosystem is dynamic:

LLaMA 4 (Meta)

Meta's open-weight model family, used as the foundation for numerous third-party implementations. Large community and tool ecosystem.

Mistral (Mistral AI)

French publisher, high-performing multilingual models. Strong European sovereignty argument for healthcare institutions subject to the GDPR.

DeepSeek (DeepSeek AI)

High-performing Chinese open-weight models (DeepSeek-V3, R1). Note: questions about provenance and alignment arise differently depending on the publisher.

References

OpenAI (2025). GPT-OSS Model Card. arXiv:2508.10925.

Isola, P. et al. (2025). GPT-OSS: Open-Weight Reasoning Models from OpenAI. OpenAI Technical Report.

HAS (2025). First guidelines for generative AI use in healthcare — A.V.E.C. Framework.

Model card last updated: February 2026
