Every AI program eventually hits the same executive question: should we build, buy, or partner. The wrong answer is usually the one that gets decided in a single staff meeting without a scorecard. I force the decision into criteria the business can defend later.

I score four dimensions for each option: capability fit, data and compliance fit, total cost of ownership, and operational control. Capability fit is the easy part. The other three decide whether the program survives contact with legal, finance, and production.

For vendor and model selection I run a short proof, not a bake-off theater. Same three use cases, same evaluation set, same latency and cost constraints, documented side by side. I publish the rubric before anyone sees results so the conversation stays about trade-offs, not preferences.

Build vs buy is rarely binary. The pattern I see work is buy the foundation, build the workflow and evaluation layer. Trying to own the entire stack without a multi-year platform mandate creates a program that looks innovative and runs permanently understaffed.

On the Rovo and Gemini alignment work at Atlassian, the program win was not picking a logo. It was defining the interface contract, the escalation path when the partner roadmap moved, and the rollback plan when a model behavior changed upstream. That is vendor evaluation for TPMs. Not features on a slide. Operating boundaries you can run.