Isn't SWE-bench based on public GitHub issues? Wouldn't the increase in performance also be explained by continuing to train on newer scraped GitHub data, i.e. training on the test set?
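For what it's worth, this is checkable in principle: a rough contamination sanity check is to compare when each benchmark issue was created against a model's claimed training-data cutoff. Here's a minimal Python sketch, assuming the HuggingFace dataset "princeton-nlp/SWE-bench" with its "created_at" timestamp field; the cutoff date is a made-up placeholder, not any particular model's.

    from datetime import datetime, timezone

    from datasets import load_dataset

    # Hypothetical training-data cutoff for the model under scrutiny.
    MODEL_CUTOFF = datetime(2024, 4, 1, tzinfo=timezone.utc)

    ds = load_dataset("princeton-nlp/SWE-bench", split="test")

    def parse(ts: str) -> datetime:
        # created_at is an ISO-8601 timestamp, e.g. "2023-01-15T04:12:40Z"
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))

    before_cutoff = sum(parse(row["created_at"]) < MODEL_CUTOFF for row in ds)
    print(f"{before_cutoff}/{len(ds)} test issues predate the cutoff "
          "and could plausibly appear in scraped training data.")

Of course this only tells you the issues *could* have been scraped, not that they were, and the fixing PRs (the actual "answers") are public too, which makes the overlap even harder to rule out.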
The pressure on AI companies to release a new SOTA model is real, as the technology rapidly becomes commoditised. I think people have good reason to be skeptical of these benchmark results.