>> Codex (gpt-5-codex) on the other hand, does exactly what you ask, almost to a fault.
I've seriously tried gpt-5-codex at least two dozen times since it came out, and every single time it was either insufficient or made huge mistakes. Even with the "have another agent write the specs and then give it to codex to implement" approach, it's just not very good. It also stops after trying one thing and then says "I've tried X, tests still failing, next I will try Y" and it's just super annoying. Claude is really good at iterating until it solves the issue.
I've spent quite a bit of time with the normal GPT-5 in Codex (med and high reasoning), so my perspective might be skewed!
Oh, one other tip: Codex by default seems to read partial files (~200 lines at a time), so I make sure to add "Always read files in full" to my AGENTS.md file.
I've seriously tried gpt-5-codex at least two dozen times since it came out, and every single time it was either insufficient or made huge mistakes. Even with the "have another agent write the specs and then give it to codex to implement" approach, it's just not very good. It also stops after trying one thing and then says "I've tried X, tests still failing, next I will try Y" and it's just super annoying. Claude is really good at iterating until it solves the issue.