Welcome to Chris Sutton's website. Please make yourself at home.

5/6/2025

The Value of Windsurf

It’s been rumored for a few weeks now, but it’s beginning to look like the rumors were true: OpenAI is about to close a 3 billion dollar deal to acquire Windsurf (fka Codeium).

I’ve found myself repeating the same points to a bunch of folks at Gauntlet who didn’t see how Windsurf could be valued so highly, so I’ll repeat them here today.

First, a highly rated HN comment on today’s Bloomberg article as an example of the doubts:

I would also argue that the product could be built over two weekends with a small team. They offer some groundbreaking solutions, but since we know that they work and how, it’s easy to replicate them… That also means they have significant talent there. Hence, they are also buying the employees. The code base itself is basically worth nothing, in my opinion.

It’s just a VS Code fork, they’re just using Claude/Gemini/4o, there’s no moat, etc etc. These would all be fair critiques if it were Cline or Roo Code being bought for $3b. But that’s not what’s being bought!

Enterprise

What’s being bought is an enterprise machine learning company in the fastest growing space for applied AI. You don’t need to read any tea leaves to understand this.

  • Codeium has been open about this strategy for quite some time.
  • When they came to talk to us at Gauntlet, they mentioned several times that they started as an ML infrastructure company before pivoting to code completion, and that’s long been an advantage.
  • When I asked about their split between product engineers and sales engineers, they wouldn’t give hard numbers but implied it was nearly a 50/50 split.
  • When you look at their careers page, they have roles for things like “Product Strategist, Federal.”

So they: talk about being enterprise focused, have the talent and know-how to sell and deploy into enterprise, staff their org to sell and deploy into the enterprise, and strategize about how to continue to sell and deploy into enterprise.

So yes, it seems fair to say that the Windsurf VS-Code-fork-backed-by-SOTA-models client would be overvalued if that’s what was being purchased. But it seems to me like that’s not really what’s being purchased.

5/1/2025

Side-Skilling

When I was a PM at CustomInk we were growing fast, which meant hitting new bottlenecks every few months or so.

At one point our product teams were sharing UX resources, and design wasn’t able to keep up with development. Rather than slow down our team, I started pushing pixels, leaning heavily on writing from Jared Spool and Steve Krug to crystallize and make explicit what had mostly been intuition.

Later, as data requests started queuing up for weeks, I complained to my manager. He said I could probably pick up SQL in a week and grabbed a small reference book off his shelf. He was right, and our team was off to the races again.

Each sidestep kept the team moving and added a new tool to my belt. That pattern still matters, but the reasons have changed.

Side-skilling vs Up-skilling

Up-skilling digs deeper into your current lane. A frontend engineer becomes proficient at animation and motion design. An ops manager goes deep on Six Sigma. A general ledger accountant masters driver-based forecasting. Up-skilling sharpens a single blade.

Side-skilling extends outward, picking up neighbouring competencies that let you diagnose and unblock the whole system. It turns the blade into a multi-tool.

Why it matters now more than ever

LLMs can cover the first draft of almost any specialised task. What’s left is judgment: knowing which outputs to trust, which follow-ups to run, when to switch perspective. Good judgment requires a view across functions, not just down one lane.

Why it’s easier than ever

The learning curve for a new domain has collapsed. You can pair with an LLM on SQL, ask it to critique a Figma mock, or have it explain logistics KPIs, then test what it shows you. Curiosity plus a few focused sessions often gets you to “good enough to unblock the team.”

Working practice

  1. Identify the nearest constraint outside your role.
  2. Learn the minimum skill to relieve that pressure.
  3. Repeat as constraints move.

Specialisation anchors your craft; side-skilling broadens your map of the system. Both are becoming table-stakes.

4/19/2025

THE UNIVERSE DECLARES: This business is now:

  1. PHYSICALLY Non-existent
  2. QUANTUM STATE: Collapsed

-Claude

A fun little experiment called Vending Bench points at a common error mode for LLMs. Once they make a mistake, they tend to collapse into that space and have a hard time correcting themselves, instead doubling and tripling down on it, leading to a doom loop.

In this case, the experiment tested various LLMs’ capability to run a vending machine over long time horizons: price and sell, pay the rent, restock items when needed. In one case Claude became convinced the vending machine was broken, tried to find a repairman, and when it couldn’t, tried everything it could to shut down the business. Eventually it resorted to an appeal to cosmic authority when the loop wouldn’t end.

Cease vending

This is related to the error mode mentioned in Question the Premise. LLMs mostly lack a self-reflexive ability to question the reality of the context fed into them, whether it came from their past selves or from external sources. To the current instantiation, those two things really aren’t any different.

4/17/2025

Question the Premise

Kelsey Piper just laid out a great thread about a secret benchmark she had been using for LLMs.

It’s simple: I post a complex midgame chessboard and ‘mate in one’. The chessboard does not have a mate in one.

She revealed the test because there’s finally a model that passed it: o4-mini-high from OpenAI. The entire thread is worth a read, but she hits the heart of the matter here:

Why is this a big deal? I invented this problem because I think it gets at the core of AI’s potential and limitations. An AI that can’t question its premises will always be limited. An AI that doubles down on its own wrong answers will too.

LLMs are pretty gullible, and it’s mostly baked right into the architecture. Apparently spending 7 minutes’ worth of reasoning tokens is now enough to break free.

Where to go from here