Dellecod Software

OpenAI’s Shift From Model to System

It’s been interesting watching OpenAI shift over the last couple of years — not just in terms of how the models function, but in how the company frames its identity and role in the ecosystem it helped create.

Historically, it felt like OpenAI would be a behind-the-scenes provider. A foundational model company that others would build on. But today, they’re both an API provider and a product company, both a platform and a performer. That dual identity — horizontal and vertical — is a key thread in how they’re evolving. You see it in the tension between their developer platform and first-party products like ChatGPT, which now reaches nearly 800 million people weekly.

Maintaining that balance isn’t easy. When a company provides infrastructure to others while also building its own products on top of that infrastructure, prioritization becomes political. Do you withhold platform features to give your own apps an edge? Do you risk alienating your developer base by competing with them directly?

OpenAI seems to be embracing this complexity rather than trying to avoid it. And the way they’re resolving it — anchored to their mission of broadly distributing AI — is shaping many of their strategic moves.

One shift that hits close to home for product and engineering teams like ours is the move away from the idea of “one model to rule them all.” That early vision of a single, monolithic general-purpose model has given way to a more practical understanding: multiple models, each tuned or trained for different tasks, will almost always outperform a catch-all approach. Codex for code, Sora for video, GPT-4-turbo for reasoning-heavy tasks — these are specialized deployments of a shared intelligence core.

This specialization aligns naturally with how products work in the real world. As teams, we rarely need a brain that knows everything about everything — we need a brain that excels at one particular function and integrates well with the system it lives inside.

The fine-tuning API, particularly the reinforcement learning flavor, is especially compelling here. It lets companies not only hyper-optimize a model’s behavior for a task, but do so using their own live product usage data. OpenAI even floated the idea of offering discounted compute in return for that data — a kind of data barter system. That’s a subtle but powerful shift toward platform-level incentives that reach beyond simple pay-per-token pricing.
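To make the "fine-tune from your own usage data" idea concrete, here is a minimal sketch of the data-preparation step. The log record fields (`prompt`, `response`, `accepted`) and the acceptance filter are hypothetical, though the chat-style JSONL output mirrors the format OpenAI's fine-tuning endpoint expects for training files.

```python
import json

def usage_to_training_examples(usage_logs):
    """Convert hypothetical product usage records into chat-format
    training examples. Only responses users actually accepted
    (e.g. after review) are kept, so the model learns from
    demonstrated-good behavior."""
    examples = []
    for record in usage_logs:
        if not record.get("accepted"):
            continue  # skip rejected or unreviewed interactions
        examples.append({
            "messages": [
                {"role": "user", "content": record["prompt"]},
                {"role": "assistant", "content": record["response"]},
            ]
        })
    return examples

def write_jsonl(examples, path):
    # One JSON object per line — the standard fine-tuning upload format.
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

logs = [
    {"prompt": "Summarise this ticket", "response": "User cannot log in.", "accepted": True},
    {"prompt": "Draft a reply", "response": "Hi, thanks for reaching out.", "accepted": False},
]
examples = usage_to_training_examples(logs)
```

The real leverage is in that filter: which interactions count as "good" is a product decision, not a modeling one.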

On the topic of tokens, OpenAI’s pricing remains largely usage-based. That makes sense, especially since workloads that involve deeper reasoning (and are higher value) tend to naturally use more tokens anyway. But it’s clear they’re also experimenting — outcome-based pricing, data-sharing discounts — all signs that monetization is evolving as fast as the tech.
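The "deeper reasoning naturally uses more tokens" point is easy to see with a toy cost model. The per-million-token rates below are illustrative placeholders, not actual OpenAI prices:

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_m, output_price_per_m):
    """Usage-based pricing: cost scales linearly with tokens consumed.
    Prices are per million tokens; the figures used below are
    illustrative, not real rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# A reasoning-heavy call emits far more output tokens (including any
# hidden reasoning tokens), so it costs more at the same per-token rate.
simple = estimate_cost(500, 200, 2.50, 10.00)
reasoning = estimate_cost(500, 8_000, 2.50, 10.00)
```

Under a flat per-token scheme, high-value reasoning workloads already pay more — which is part of why usage-based pricing has held up as long as it has.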

A recurring theme underneath all this is that deployment matters just as much as the model itself. OpenAI doesn’t view agents — AI systems that act autonomously over time — as a new category, but rather as a new interface. Agent behavior depends not just on LLMs, but on surrounding logic, context, and orchestration — essentially, engineering.

This is where so much of the recent action lives: in context engineering, not prompt engineering. We’ve moved well beyond carefully worded questions. Now it’s about structuring the environment around the model — the data it can see, the documents it can retrieve, the way its memory forms or updates. Retrieval-augmented generation (RAG), tool routing, step-wise execution, programmatic escapes — these are the mechanics of turning a model into a system. That’s the real frontier.
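The "structuring the environment around the model" idea can be sketched in a few lines. This toy retrieval step uses naive keyword overlap (real systems use embeddings, rerankers, and chunking), but the shape of the pipeline — retrieve, assemble context, constrain the prompt — is the same:

```python
def retrieve(query, documents, k=2):
    """Naive keyword-overlap retrieval: score each document by how many
    query words it shares, keep the top k. A stand-in for embedding
    search in a real RAG pipeline."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, documents):
    # Context engineering: the model can only use what we put in front of it.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refund requests require an order number.",
]
prompt = build_prompt("How long do refunds take?", docs)
```

Swap the retriever, add tool routing, persist memory between turns — each change alters the system's behavior without touching the model at all.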

OpenAI seems to understand that, especially through tools like Agent Builder: a no/low-code framework that makes it easier to build predictable agents with structured policies. This matters a lot for enterprise use cases, where compliance and repeatability are non-negotiable. In heavily regulated domains, you can’t have a model that improvises — you need one that colors precisely within the lines. Determinism isn’t sexy, but it’s a requirement in places where hallucinations aren’t just annoying — they’re dangerous.
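What "coloring within the lines" looks like in code is a hard policy gate between what the model proposes and what the system executes. This is a generic sketch, not Agent Builder's actual API; the policy table and tool names are hypothetical:

```python
# Hypothetical policy table: each tool the agent may call, with hard limits.
POLICY = {
    "lookup_order": {"allowed": True},
    "issue_refund": {"allowed": True, "max_amount": 100.0},
    "delete_account": {"allowed": False},
}

def route_action(action, args):
    """Gate every model-proposed action through an explicit policy
    before execution. The model may improvise; the system may not."""
    rule = POLICY.get(action)
    if rule is None or not rule["allowed"]:
        return ("rejected", f"{action} is not permitted")
    if action == "issue_refund" and args.get("amount", 0) > rule["max_amount"]:
        return ("escalated", "refund above limit routed to a human")
    return ("executed", action)

print(route_action("issue_refund", {"amount": 250}))  # escalated to a human
print(route_action("delete_account", {}))             # rejected outright
```

The deterministic part isn't the model — it's this layer. Audits and compliance reviews reason about the policy table, not the weights.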

The other underappreciated insight from the conversation is OpenAI’s stance on open source. There's been this assumption that proprietary players will try to suppress or avoid the growth of OSS models, but the reality seems more nuanced. OpenAI supports open source efforts, and claims that they haven’t seen meaningful cannibalization of their paid products. Why? Because scale is hard. Running inference on frontier models – reliably, safely, and affordably – is a tough job. Pre-training may be democratizing, but serving these models into production is still a deep moat.

All up, OpenAI’s trajectory — from monolithic AGI dreams to a modular, multi-interface approach — mirrors how AI is landing in the real world. It doesn’t get deployed as one model. It arrives as endpoints, prompts, tools, agents, callbacks, and workflows. No two setups look alike. That variety demands infrastructure that’s flexible, open where it matters, and opinionated where it helps.

We’re reminded daily that AI is not just a capability — it’s an interface, a collaboration layer, a policy decision, a system behavior. The model is necessary, but not sufficient.

And in navigating that, it helps to see how even the companies building the frontier are still figuring it out in real time.