The issue with prompting is that English (or any other human language) is nowhere near as rigid or strict as a programming language. An idea can almost always be expressed more succinctly in code than in prose.
Combine that with the fact that it’s often much easier to develop a prototype solution as you read the code, and prompting ends up feeling like using four men to carry a wheelbarrow instead of having one push it.
I think we are going to end up with a common design/code specification language that we use for prompting and testing. There's always going to be a need to convey the exact semantics of what we want — if not for AI then for the humans who have to grapple with what is made.
That was my response to a comment about using prompts to generate and manipulate code. I’ve been thinking about this a lot lately.
LLMs give us a lot of leeway. They can interpret and act on any prompt that remotely seems like it means something. This is fine, but what do we really want?
I look at AI workflows as probability sandwiches. One slice of bread is the input side and the other is the output side. Between them we have a lot of uncertainty.
Ideally, we’d like our sandwich to behave like a compiler, giving us deterministic generation of our target code, but LLMs are anything but deterministic. That’s both a plus and a minus: problem-solving requires that sort of slight randomness, but then we have the problem of knowing whether we’ve really made the right stuff.
On the input side, it would be great to start standardizing on a specification language with checkable syntax and defined semantics. That would reduce the number of misfires, and even if it didn’t, it would at least stand as a record of our intention: what we really expect the code to do.
If we’re clever about it, we can use these same specs for testing: the bread of our sandwich becomes much sturdier, and the probability filling is kept under control.
I think this is a good way forward. An ideal spec language would let us specify tight constraints and areas where we don’t need as much rigor.
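To make that concrete, here is a minimal sketch (in Python, purely as illustration) of how a spec might serve as both a statement of intent and a test harness. The Spec class, its check method, and the sorting example are all hypothetical; this isn’t an existing library or a proposal for the actual syntax. The precondition marks the area we care about, the postcondition is the tight constraint, and anything outside the precondition is a region where we don’t demand rigor.

```python
import random


class Spec:
    """Hypothetical spec object: tight constraints where we need them,
    a 'don't care' region (inputs failing the precondition) where we don't."""

    def __init__(self, name, precondition, postcondition):
        self.name = name
        self.precondition = precondition    # must hold on inputs we care about
        self.postcondition = postcondition  # must hold on the output for those inputs

    def check(self, candidate, inputs):
        """Run a candidate (e.g. LLM-generated) function against the spec."""
        failures = []
        for x in inputs:
            if not self.precondition(x):
                continue  # outside the spec's domain: no rigor required here
            if not self.postcondition(x, candidate(x)):
                failures.append(x)
        return failures


# Example spec for a sort routine: given a list of ints, the result must
# equal the canonically sorted list.
sort_spec = Spec(
    name="sort",
    precondition=lambda xs: all(isinstance(v, int) for v in xs),
    postcondition=lambda xs, ys: ys == sorted(xs),
)


def generated_sort(xs):
    # Stand-in for code the model produced from the spec.
    return sorted(xs)


random_inputs = [
    [random.randint(-100, 100) for _ in range(random.randint(0, 10))]
    for _ in range(200)
]
print(sort_spec.check(generated_sort, random_inputs))  # [] means no counterexamples found
```

The same spec could, in principle, be handed to the model as part of the prompt and then run against whatever code comes back, which is what keeps the probability filling in check.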
Now, imagine this: we have that spec language, and we start doing reinforcement learning on the specs and the validated code generated from them. At that point, are we close to building a new abstraction layer?
Thoughts?
LLMs are the "mystery meat" in the "probability sandwich."