Build With Athar
June 29, 2026 · 7 min read · #vertex-ai #mlops #google-cloud #gen-ai #machine-learning

Gemini is the engine. Vertex AI is the entire car company.

Everyone can name Google's AI model. Almost nobody can tell you what Vertex AI does — and the boring one is where AI actually ships. Because shipping AI was never about the model. The model is the easy part.

Ask ten engineers to name a Google AI model and ten will say Gemini. Ask the same ten what Vertex AI does and you'll get a pause, a guess, and the word "platform" doing a lot of unpaid work.

That's the tell. Gemini is the engine — the loud, famous, demo-able part. Vertex AI is the factory, the assembly line, the dealership, and the service garage wrapped around it. And here's the thing nobody puts on a slide: the factory is the harder problem — and it's the part that actually decides whether any of this ships.

Because shipping AI was never about the model. Getting a model to say something clever once, in a notebook, on your laptop is the easy part. Getting it to do that for a million users, versioned, monitored, and not silently rotting — that's the job. That's Vertex AI.

1. An "AI platform" isn't a model. It's an assembly line.

When people hear "Vertex AI" they picture a box you talk to Gemini through. It isn't. Vertex AI is Google Cloud's unified ML platform — the place you build, train, tune, deploy, serve, and monitor models. Google announced it at I/O in May 2021, and the operative word in every description of it is unified.

Unified from what? From a graveyard of separate tools. Before, your notebook lived one place, your training jobs another, your feature data somewhere else, your model versions in a spreadsheet, your serving on hand-rolled infra, and your monitoring — if you had any — in a dashboard someone built on a Friday. Vertex AI's pitch is: stop gluing eight tools together with duct tape and a service account. Here's one platform where they're already connected.

That's hard to feel in prose, so picture it as a thing you can break. In the fragmented view, the MLOps surfaces fly apart into the disconnected tools you'd otherwise wire together yourself. In the unified view, they snap onto one platform, with the model layer, Model Garden, at the core.

// one platform, one lifecycle
Model Garden

The model layer: 200+ models you can deploy or tune — Google's own (Gemini, Gemma, Imagen) plus partner models (Anthropic's Claude, Meta's Llama, Mistral). The famous part. Also the smallest tile on the platform.

Fragmented = eight tools you integrate yourself. Unified = one platform, one lifecycle.

Look at where the model sits in that diagram: one tile, in the middle, surrounded by seven others. The model is the part everyone talks about and the part you spend the least time on. Everything orbiting it is the actual work — and the actual product.

// concept
Vertex AI is where you chat with Gemini.
// insight
No — that's the model. Vertex AI is the factory around it: train, tune, version, deploy, serve, monitor. Gemini is one engine in the Model Garden. The platform is the assembly line that turns any engine into a shippable product.

2. You don't train the model. You rent it — and own the pipeline.

Here's the part that breaks the "we built our own AI" pitch. On Vertex AI, you almost never train a foundation model. You open Model Garden (a catalog of 200+ models including Gemini, Gemma, and Imagen, plus partner models like Anthropic's Claude and Meta's Llama) and you start from one. Then you tune it on your data (supervised fine-tuning), ground it on your own documents so it invents things far less often, and deploy it to an endpoint.

So what's actually yours? Not the model. The pipeline. The data plumbing, the tuning recipe, the evaluation gate, the serving setup, the monitoring. That's the engineering, and it's the part that survives the model getting swapped out next quarter.
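To make "the pipeline is what's yours" concrete, here is a minimal sketch of one piece of it, the evaluation gate, written so the model behind it can be swapped. Everything in it is hypothetical (the `Model` type, `evaluation_gate`, the `min_gain` threshold, the toy models and data); it is plain Python, not the Vertex AI SDK.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types for illustration: a "model" is any callable from a
# feature dict to a 0/1 label, so the model behind the gate can be swapped.
Example = Tuple[dict, int]
Model = Callable[[dict], int]


def accuracy(model: Model, examples: Sequence[Example]) -> float:
    """Fraction of held-out examples the model labels correctly."""
    if not examples:
        raise ValueError("need at least one evaluation example")
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def evaluation_gate(candidate: Model, baseline: Model,
                    examples: Sequence[Example], min_gain: float = 0.01) -> bool:
    """Promote the candidate only if it beats the baseline by at least min_gain."""
    return accuracy(candidate, examples) - accuracy(baseline, examples) >= min_gain


# Toy stand-ins for "last quarter's model" and "this quarter's model".
def baseline_model(features: dict) -> int:
    return 1 if features["amount"] > 1000 else 0


def candidate_model(features: dict) -> int:
    return 1 if features["amount"] > 1000 or features["new_device"] else 0


holdout = [
    ({"amount": 1500, "new_device": False}, 1),
    ({"amount": 200, "new_device": True}, 1),
    ({"amount": 50, "new_device": False}, 0),
    ({"amount": 900, "new_device": False}, 0),
]

promote = evaluation_gate(candidate_model, baseline_model, holdout)
print("promote candidate:", promote)
```

The gate never looks inside the model. That's the property that lets the same pipeline outlive whichever engine sits behind it this quarter.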

And that pipeline is the whole gap between a demo and a product. Same model in both snippets: one's a notebook cell, the other is a thing a million people can hit at 3am.

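The demo (notebook)
# Monday's demo: runs in a notebook on your laptop
import pickle

with open("fraud_finetune.pkl", "rb") as f:
    model = pickle.load(f)
pred = model.predict(features)   # features = your input rows; 0.94, ship it!

# ...and it works. On your machine. Once.
# no versioning. no autoscaling. no canary.
# no monitoring when it silently rots in week 3.
# "works on my machine" is not a deployment.

The product (Vertex endpoint)
# The product: same model, on Vertex AI
# (project, model, and endpoint IDs below are placeholders)
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint(endpoint_name="FRAUD_ENDPOINT_ID")       # already serving v7
candidate = aiplatform.Model(model_name="FRAUD_MODEL_ID", version="8")  # versioned
endpoint.deploy(
    model=candidate,
    machine_type="n1-standard-4",
    min_replica_count=2, max_replica_count=40,  # autoscales
    traffic_percentage=10,                      # canary: 10% to v8, 90% stays on v7
)
# + Model Monitoring watches drift & skew
# + rollback = flip the traffic split. that's it.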

The notebook cell is a model. The endpoint is a system. Vertex AI is the machine that gets you from one to the other without you building all of that yourself.
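The "rollback = flip the traffic split" idea can be sketched in a few lines. This is a hypothetical, plain-Python illustration of weighted routing between two model versions (the `route` function and the split dicts are made up for this example); it is not how Vertex AI routes requests internally.

```python
import hashlib


def route(request_id: str, traffic_split: dict) -> str:
    """Pick a model version for a request, given {version: percent} summing to 100.

    Hashing the request ID (rather than calling random) keeps routing
    deterministic, so the same request always lands on the same version.
    """
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic split must sum to 100")
    # Map the request ID to a stable bucket in [0, 100).
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    threshold = 0
    for version, percent in sorted(traffic_split.items()):
        threshold += percent
        if bucket < threshold:
            return version
    raise AssertionError("unreachable: percentages sum to 100")


canary = {"7": 90, "8": 10}    # 10% of requests try version 8
rollback = {"7": 100, "8": 0}  # rollback is just a different split

requests = [f"req-{i}" for i in range(10_000)]
canary_share = sum(route(r, canary) == "8" for r in requests) / len(requests)
print(f"share routed to v8 during canary: {canary_share:.1%}")
print("any v8 traffic after rollback:", any(route(r, rollback) == "8" for r in requests))
```

Nothing gets redeployed on rollback. Both versions stay up and only the numbers in the split change, which is why it can be fast.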


3. The model is ~10% of the work. The other 90% is why projects die.

This isn't a vibe — it's the most quietly influential diagram in machine learning. In Google's 2015 paper Hidden Technical Debt in Machine Learning Systems, the actual ML code is a tiny black box in the center of a sprawling diagram of everything else: data collection, feature extraction, serving infrastructure, configuration, monitoring, process management. The famous line: only a small fraction of a real-world ML system is the ML code.

Vertex AI is Google selling you that surrounding 90% pre-built. Here's where a project's time actually goes.

// tinker

On a real ML project, how much is the model vs. everything around it?

Project size: 12 team-months
On a 12-month project, very roughly 10.8 months go to data, serving, and monitoring: the plumbing. The model itself is the remaining 1.2. That ~90% is exactly the surface Vertex AI ships pre-built, and exactly where unmanaged projects bleed out.
Illustrative, not a measured constant. The anchor is Google's 2015 'Hidden Technical Debt in ML Systems' paper, whose Figure 1 draws the ML code as a small box dwarfed by surrounding infrastructure — the 90% here is that diagram's spirit, not a number from the paper. (And ignore the viral '87% of ML projects never reach production' stat — it traces back to a 2017 column with no source. The real survey figure is Gartner's: only about half of models make it from prototype to production.)
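The arithmetic behind that split is trivial, which is the point: the 90% is an assumption you plug in, not a measurement. A hypothetical helper (the name `split_effort` and the default `plumbing_share` are illustrative only):

```python
def split_effort(team_months: float, plumbing_share: float = 0.9) -> tuple:
    """Split a project's effort into (plumbing, model) team-months.

    plumbing_share is the illustrative ~90% from the text, not a measured constant.
    """
    if not 0 <= plumbing_share <= 1:
        raise ValueError("plumbing_share must be between 0 and 1")
    plumbing = team_months * plumbing_share
    return plumbing, team_months - plumbing


plumbing, model = split_effort(12)
# prints: plumbing: 10.8 team-months, model: 1.2 team-months
print(f"plumbing: {plumbing:.1f} team-months, model: {model:.1f} team-months")
```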

And the way projects die is rarely dramatic. A fraud model launches at 94% and six months later is quietly missing fraud it used to catch. Nobody changed a line of code. The world changed, the live data drifted away from what the model trained on, and nothing was watching. That failure mode has a name, drift, and a dedicated Vertex AI surface (Model Monitoring) that exists for exactly it.
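What "watching" means can be sketched without any cloud service: compare the distribution of a feature at training time against what the model sees in production, and alert when the two diverge. The example below uses Jensen-Shannon divergence over histogram bins. The function names, bin count, synthetic data, and the 0.1 alert threshold are all assumptions for illustration; this is a toy, not Vertex AI Model Monitoring's implementation.

```python
import math
import random


def histogram(values, lo, hi, bins):
    """Bin values into `bins` equal-width buckets over [lo, hi]; return proportions."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp edge values (v == hi) into the last bucket.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = no overlap."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_score(training, serving, bins=10):
    """Divergence between the training and serving histograms of one feature."""
    lo = min(min(training), min(serving))
    hi = max(max(training), max(serving))
    return js_divergence(histogram(training, lo, hi, bins),
                         histogram(serving, lo, hi, bins))


rng = random.Random(0)  # seeded so the run is repeatable
training = [rng.gauss(100, 20) for _ in range(5000)]  # amounts at training time
week_1 = [rng.gauss(100, 20) for _ in range(5000)]    # same world
month_6 = [rng.gauss(160, 30) for _ in range(5000)]   # the world changed

THRESHOLD = 0.1  # hypothetical alert threshold
for name, batch in [("week 1", week_1), ("month 6", month_6)]:
    score = drift_score(training, batch)
    print(f"{name}: JS divergence = {score:.3f}, alert = {score > THRESHOLD}")
```

Note what the check does not need: labels. You can see the inputs shifting long before anyone confirms which transactions were actually fraud, which is what makes this kind of monitoring useful as an early warning.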



So what is Vertex AI, in one sentence?

It's the part of "AI" that isn't the AI. The unified platform that takes a model — yours or a rented one from Model Garden — and does the unglamorous, decisive work of turning it into something that runs in production and keeps running:

  • Model Garden — rent an engine: Gemini, Gemma, Imagen, Claude, Llama. Don't build it from scratch.
  • Workbench / Feature Store / Training — build and feed the thing.
  • Registry / Pipelines / Evaluation — version it, automate it, prove it's better.
  • Endpoints / Monitoring — serve it at scale and notice when it starts to lie.

Gemini gets the headlines because an engine is exciting. But you can't drive an engine. Vertex AI is the boring, expensive, load-bearing everything-else that turns it into a car — and that's the part most companies underestimate right up until their model quietly stops working in production.


Next time someone shows you "our AI," ask the question that separates a demo from a product: what happens to it in week three? The answer tells you whether they built a model, or built the factory around it.
