AI Format Wars: Does the Shape of Your Prompt Matter? (1,080 Evals Later)
We tested 5 frontier models across 1,080 evaluations. JSON won. Short prompts won. Everything you thought you knew about prompt engineering is wrong.
Mar 22, 2026 · 5 min read

Series
We evaluated how different prompts actually affect model performance across many domains and 5 models.

80% start with "Act as an expert." 62% contain contradictory constraints. 14% emotionally threaten the AI. Here's the data.
