Hacker News

Serious question: If it's an improved 2.5 model, why don't they call it version 2.6? Seems annoying to have to remember if you're using the old 2.5 or the new 2.5. Kind of like when Apple released the third-gen iPad many years ago and simply called it the "new iPad" without a number.


That's why people called the second version of Sonnet v3.5 simply v3.6, and Anthropic acknowledged that by naming the next version v3.7.


Only Anthropic has a slightly understandable version scheme.


It's pretty common to refer to models by the month and year they were released.

For example, the latest Gemini 2.5 Flash is known as "google/gemini-2.5-flash-preview-09-2025" [1].

[1]: https://openrouter.ai/google/gemini-2.5-flash-preview-09-202...


If they're going to include the month and year as part of the version number, they should at least use big endian dates like gemini-2.5-flash-preview-2025-09 instead of 09-2025.
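One reason big-endian dates are preferred: they sort chronologically as plain strings. A small illustration (the model names and release dates below are made up for the example; only the `-09-2025` suffix pattern comes from the article):

```python
# Big-endian (year-first) date suffixes sort chronologically as plain strings;
# month-first suffixes do not.
big_endian = [
    "gemini-2.5-flash-preview-2025-09",
    "gemini-2.5-flash-preview-2024-11",  # hypothetical earlier release
    "gemini-2.5-flash-preview-2025-03",  # hypothetical earlier release
]
little_endian = [
    "gemini-2.5-flash-preview-09-2025",
    "gemini-2.5-flash-preview-11-2024",  # hypothetical earlier release
    "gemini-2.5-flash-preview-03-2025",  # hypothetical earlier release
]

print(sorted(big_endian))     # chronological: ...2024-11, ...2025-03, ...2025-09
print(sorted(little_endian))  # NOT chronological: ...03-2025, ...09-2025, ...11-2024
```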


Or, you know, just Gemini 2.6 Flash. I don't recall the 2.5 version having a date associated with it when it came out, though maybe they are using dates now. In marketing, at least, it's always known as Gemini 2.5 Flash/Pro.


It had a date, but I also agree this is extremely confusing. Even semver 2.5.1 would be clearer IMO.


It always had dates... They release multiple versions and update regularly. Not sure if this is the first 2.5 Flash update, but pretty sure Pro had a few updates as well...

This is also the case with OpenAI and their models. Pretty standard I guess.

They don't change the versioning, because I guess they don't consider it to be "a new model trained from scratch".


>For example, the latest Gemini 2.5 Flash is known as "google/gemini-2.5-flash-preview-09-2025" [1].

That "example" is the name used in the article under discussion. There's no need to link to openrouter.ai to find the name.


I'm pretty sure Google just does that for preview models and they drop the date from the name when it's released.


If only there were some sort of versioning nomenclature they could use. Maybe even one that is … semantic? Oh, how I wish someone would introduce something like this to the software engineering field. /s

In all seriousness though, their version system is awful.


2.5 is not the version number, it's the generation of the underlying model architecture. Think of it like the trim level on a Mazda 3 hatchback. Mazda already has the Mazda 3 Sport in their lineup, then later they release the Mazda 3 Turbo, which is much faster. When they release this new version of the vehicle, it's not called the Mazda 4 - that would be an entirely different vehicle based on a new platform and powertrain etc. (if it existed). The new vehicle is just a new trim level / visual refresh of the existing Mazda 3.

That's why Google names it like this, but I agree it's dumb. Semver would be easier.


I’d say it’s more like naming your Operating System off of the kernel version number.


Gonna steal this to help explain to non tech friends when it comes up again.


Maybe they’re signalling it’s more of a bug fix?


2.5.1 then.

Semantic versioning works for most scenarios.


Would that automatically roll over anyone pinning 2.5 via their API?


If you want rollover, then you could specify ^2.5.0 or 2.5.x; if you want to pin, then it would be 2.5.0.

This was all solved a long time ago; LLM vendors seem to have unlearnt versioning principles.

This is fairly typical - marketing and business want different things from a version number than what versioning systems are good at.


I suspect Google doesn't want to maintain multiple sub-versions. It's easier to serve one model at twice the load than two models with fluctuating load on each, since these things take non-trivial time to load into GPU/TPU memory for serving.


Even if switching quickly were a challenge[1], they use these models in their own products, not just sell them as a service; the first-party applications could quite easily adapt by switching to whichever model is available and freeing up the in-demand one.

This is the entire premise behind the cloud, and the reason Amazon did it first: they had the largest workloads at the time, before Web 2.0 and SaaS were a thing.

Only businesses with large first-party apps succeeded in the cloud provider space; companies like HP and IBM failed, and their time to failure strongly correlated with the number of first-party apps they operated. These apps needed to keep a lot of idle capacity for peak demand anyway, which they could now monetize and co-mingle in the cloud.

LLMs as a service is not any different from S3 launched 20 years ago.

---

[1] It isn't. At the scale they operate these models, it shouldn't matter at all; it is not individual GPUs or machines that make a difference in load handling. Only a few users are going to explicitly pin a specific patch version; for the rest, they can serve whichever one is available immediately or cheaply.


That would be even more confusing because then it is unclear whether 2.6 Flash is better than 2.5 Pro.


Is a 2024 MacBook Pro better than a 2025 MacBook?


Good question



