Experiment 003: Vision-Model Creative Features in the MMM

The standard line on creative measurement inside a Marketing Mix Model (MMM) is that creative is unmeasurable. The MMM tells you the channel effect; the creative inside that channel is treated as a black box, absorbed into the channel coefficient. The CMO defending the budget gets the channel answer. The creative team gets nothing useful out of the same model. That assumption made sense when extracting structured features from a video or static asset required a team of researchers and a coding rubric. It does not make sense now.

This experiment puts that assumption on a clock. If we extract structured creative features with a vision-language model — hook archetype, pacing, claim density, brand-asset prominence — and feed them in as covariates alongside spend, can we drop the model's residual variance enough to claim creative is doing measurable work? If the answer is yes, "creative is unmeasurable" goes from received wisdom to a 2022 belief. If the answer is no, that is also useful: it tells us where the boundary actually sits.

Hypothesis

Vision-extracted creative features explain a measurable share of MMM residual variance.

The hypothesis is narrow on purpose. Take a 12-month YouTube and Search dataset where the MMM has already been fit cleanly. Score every creative asset on roughly 20 structured features via a vision-language model — hook archetype, pacing, claim density, brand-asset prominence, on-screen-talent presence, scene cuts per second, copy density, and so on. Refit the same MMM with the new features included as covariates alongside the existing spend variables. Compare the refit's R² and channel-level effect sizes against the baseline model.

The prediction is that R² rises by a non-trivial amount (5–15 percentage points), and that some channel coefficients shrink — meaning a piece of what we were calling "the YouTube channel effect" was actually a creative-quality effect that happened to live inside YouTube. That is the result that matters. Not the headline R² lift, but the redistribution between channel and creative coefficients. That redistribution is what makes the model usable for creative decisions, not just channel decisions.
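The redistribution described above is the familiar omitted-variable effect, and a small simulation makes it concrete. What follows is a minimal sketch under loud assumptions: the data are simulated, not from any partner; plain least squares stands in for a real MMM (no adstock, saturation, or priors); and the names `ols`, `spend`, `creative`, and `sales` are hypothetical. Creative quality is deliberately generated to correlate with spend, so the spend-only fit credits the channel with work the creative is doing.

```python
import random

def ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in X]            # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):                             # Gauss-Jordan, partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

rng = random.Random(7)                             # fixed seed for repeatability
spend, creative, sales = [], [], []
for _ in range(104):                               # two simulated years of weeks
    s = rng.uniform(50, 150)                       # weekly channel spend (simulated)
    c = 0.02 * s + rng.gauss(0, 0.5)               # creative score, correlated with spend
    spend.append(s)
    creative.append(c)
    sales.append(1.0 * s + 40.0 * c + rng.gauss(0, 10))

base_beta, base_r2 = ols([[s] for s in spend], sales)         # spend only
full_beta, full_r2 = ols(list(zip(spend, creative)), sales)   # spend + creative

print(f"spend coefficient: {base_beta[1]:.2f} -> {full_beta[1]:.2f}")
print(f"R^2: {base_r2:.3f} -> {full_r2:.3f}")
```

One caution when reading the comparison: in-sample R² cannot fall when covariates are added to a least-squares fit, so with roughly 20 new features on a year of weekly data, an adjusted or held-out R² is probably the fairer yardstick for the lift.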

Method

Score the assets. Refit the model. Compare effect sizes.

The method has three steps. First, score every creative asset in a 12-month window on ~20 structured features via a vision-language model, once per asset and against a fixed rubric, so that the same asset always receives the same scores. Second, refit the existing MMM with those features as covariates, holding the rest of the spec constant. Third, compare R² and the channel-level effect sizes against the baseline. The publishable artifact is the side-by-side: R² before and after, channel coefficients before and after, and the new creative coefficients with their confidence intervals.
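For the scoring step, one way to keep scores stable across runs is to validate every model response against the rubric before it reaches the dataset. The sketch below is illustrative only: the actual rubric is not published here, so the allowed hook archetypes, the numeric ranges, the field names, and the `parse_score` helper are all hypothetical, and the call to the vision-language model itself is left out.

```python
import json
from dataclasses import dataclass

# Hypothetical rubric values; the real rubric is not given in the text.
HOOK_ARCHETYPES = {"problem", "question", "demo", "testimonial", "other"}

@dataclass(frozen=True)
class CreativeScore:
    asset_id: str
    hook_archetype: str
    scene_cuts_per_second: float
    claim_density: float
    brand_asset_prominence: float   # assumed scale: 0 (absent) to 1 (always visible)
    talent_present: bool

def _number(raw, key, low, high):
    """Return raw[key] as a float, rejecting missing, non-numeric, or out-of-range values."""
    value = raw.get(key)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{key} must be numeric, got {value!r}")
    if not low <= value <= high:
        raise ValueError(f"{key}={value} is outside [{low}, {high}]")
    return float(value)

def parse_score(asset_id, response_text):
    """Validate one model response (a JSON object) against the rubric."""
    raw = json.loads(response_text)
    if not isinstance(raw, dict):
        raise ValueError("response must be a JSON object")
    hook = raw.get("hook_archetype")
    if not isinstance(hook, str) or hook not in HOOK_ARCHETYPES:
        raise ValueError(f"unknown hook_archetype {hook!r}")
    if not isinstance(raw.get("talent_present"), bool):
        raise ValueError("talent_present must be true or false")
    return CreativeScore(
        asset_id=asset_id,
        hook_archetype=hook,
        scene_cuts_per_second=_number(raw, "scene_cuts_per_second", 0.0, 10.0),
        claim_density=_number(raw, "claim_density", 0.0, 100.0),
        brand_asset_prominence=_number(raw, "brand_asset_prominence", 0.0, 1.0),
        talent_present=raw["talent_present"],
    )

example = parse_score(
    "vid_a",
    '{"hook_archetype": "demo", "scene_cuts_per_second": 0.5, '
    '"claim_density": 3, "brand_asset_prominence": 0.4, "talent_present": true}',
)
print(example)
```

Rejecting out-of-rubric responses, instead of coercing them, means a drifting prompt or model version surfaces as an error at scoring time, not as noise in the refit.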

Required dataset

12+ months of clean spend and outcome data, with creative-asset-level metadata linking each impression to an identifiable creative.

What the vision model produces

~20 structured features per asset: hook archetype, pacing, claim density, brand-asset prominence, talent presence, scene cuts, copy density, and similar.

What gets refit

The same MMM spec the partner is already running, with the creative-feature covariates added alongside spend. No other spec changes.

What gets published

R² before/after, channel coefficients before/after, new creative coefficients with confidence intervals. Partner-anonymized.
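Asset-level scores still have to become model covariates, and the impression-to-creative link in the required dataset is what makes that possible. One plausible approach, sketched here as an assumption and not as the method the partner's MMM uses, is an impression-weighted average of each numeric feature per week and channel. The function name, the tuple layout, and the sample rows are hypothetical; categorical features such as hook archetype would need to be one-hot encoded before averaging.

```python
from collections import defaultdict

def weekly_creative_covariates(impressions, asset_scores):
    """Impression-weighted mean of each numeric feature per (week, channel).

    impressions:  iterable of (week, channel, asset_id, impression_count)
    asset_scores: dict mapping asset_id -> {feature_name: numeric score}
    """
    totals = defaultdict(float)                          # (week, channel) -> impressions
    weighted = defaultdict(lambda: defaultdict(float))   # (week, channel) -> feature -> sum
    for week, channel, asset_id, count in impressions:
        if asset_id not in asset_scores:
            raise KeyError(f"impressions reference unscored asset {asset_id!r}")
        key = (week, channel)
        totals[key] += count
        for feature, value in asset_scores[asset_id].items():
            weighted[key][feature] += count * value
    # Divide each weighted sum by the impressions served in that week and channel.
    return {
        key: {feature: total / totals[key] for feature, total in feats.items()}
        for key, feats in weighted.items()
        if totals[key] > 0
    }

# Illustrative rows only; not real campaign data.
scores = {
    "vid_a": {"claim_density": 2.0, "brand_asset_prominence": 0.8},
    "vid_b": {"claim_density": 4.0, "brand_asset_prominence": 0.2},
}
log = [
    ("2024-W01", "youtube", "vid_a", 3000),
    ("2024-W01", "youtube", "vid_b", 1000),
    ("2024-W02", "youtube", "vid_b", 500),
]
covariates = weekly_creative_covariates(log, scores)
print(covariates[("2024-W01", "youtube")])
```

Raising on an unscored asset is a deliberate choice in this sketch: a silent skip would bias the weekly averages toward whichever creatives happened to be scored.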

"'Creative is unmeasurable' is a 2022 belief. The same MMM that proves channel value should be proving creative value too, and the CMO who runs that model first gets a real conversation with the CFO."

Why it matters

The CMO who proves creative value inside the MMM gets a real seat at the CFO conversation.

The reason this experiment is worth running has very little to do with the model and almost everything to do with the organizational dynamic the model creates. Today, the CMO defending creative spend in a CFO conversation is using language the CFO does not trust: awards, brand-tracker scores, qualitative panels. The CMO defending channel spend in the same conversation is using language the CFO does trust: MMM coefficients, incrementality tests, ROAS. That asymmetry is what keeps creative budgets first on the chopping block when the cycle tightens.

If the same MMM that proves channel value can also prove creative value, even partially, the conversation shifts. The CMO who runs that model first has a defensible answer to "what is the creative actually doing for us?" The CMO who waits is still arguing with the wrong vocabulary in 2027. The question is not whether the model is perfect. It is whether the model is good enough to change the conversation. That is a low bar relative to what is technically possible right now.

What I need to run this

A partner with 12+ months of clean media + creative metadata.

Looking for a brand or agency with at least twelve months of stable media spend, an existing MMM (Meridian, Robyn, or commercial), and creative-level metadata that lets every impression be linked back to a specific asset. I'll do the vision-model scoring, the refit, and the analysis. The partner keeps the data; the published writeup is fully anonymized with figures rescaled. Happy to share the rubric and full method ahead of time. Reply to any newsletter issue to start the conversation, or reach me directly if we already know each other.

Status: queued, partner wanted. The hypothesis, method, and rubric are ready; the next step is a dataset to run them against. Subscribers to Stay Sharp get the full writeup before it's indexed.

Re-running last quarter's MMM with vision-model creative features as covariates.
