After My Own Heart
Went back through my draft backlog and found this gem from 2020: “Unit Testing is Overrated.”
In the age of AI-generated code, it feels even more applicable. When a model writes unit tests, especially with the code under test in view, those tests are at risk of being overfitted to the functions they exercise. They may indeed prove the software executes as it’s written, but that has little to do with proving the software meets requirements (for example, Kiro had created hundreds of perfectly passing tests for this project).
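To make the overfitting risk concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the requirement ("orders of 100.00 or more get 10% off"), the function `apply_discount`, and both test functions are hypothetical and come from neither the article nor any real project.

```python
# Hypothetical requirement: orders of 100.00 or more get a 10% discount.
# Amounts are integer cents to keep the arithmetic exact.

def apply_discount(total_cents: int) -> int:
    # Deliberate bug: the requirement says "100.00 or more",
    # but the comparison is strict, so exactly 100.00 gets no discount.
    if total_cents > 10_000:
        return total_cents * 90 // 100
    return total_cents


def overfitted_test() -> bool:
    # Written by reading the implementation: the expected values simply
    # restate what the code does, so the boundary bug is baked in.
    return apply_discount(10_000) == 10_000 and apply_discount(20_000) == 18_000


def requirement_test() -> bool:
    # Written from the requirement alone: the boundary case should be
    # discounted, so this check disagrees with the buggy code.
    return apply_discount(10_000) == 9_000 and apply_discount(20_000) == 18_000


if __name__ == "__main__":
    print("overfitted test passes:", overfitted_test())
    print("requirement test passes:", requirement_test())
```

The first test stays green while the second fails at the boundary. That is the gap between “executes as written” and “meets requirements.”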
The key takeaways from the article are all worth sticking in your coding agent instructions, because without explicit directives, LLMs are probably biased to do the opposite of these recommendations given the weight of training data pushing so-called “best practices.”
In particular, I can see value in using a separate (perhaps even adversarial) agent/model to write the tests. It’ll be less biased by the context used to write the code, and it can be instructed to “aim at the highest level of integration while maintaining reasonable speed and cost.”




