pornel's comments | Hacker News

In Cloudflare's case, Rust is much more productive.

Rust modules can handle any traffic themselves, instead of a split between native code and Lua that is too slow to do more than config (Lua is relatively fast for a scripting language, but on the critical path it was a "peanut butter" slowdown, smearing latency over everything).

At the size and complexity of a server serving 20% of the Web, Lua's dynamic typing was scary. Modules in Rust can enforce many more requirements.

Rust's solid dependency management also helps share implementations and config logic across modules and even different products, instead of everything literally having to go through the same server.

https://blog.cloudflare.com/20-percent-internet-upgrade/


> instead of a split between native code and Lua that is too slow to do more than config

In case you’re not aware, Cloudflare ran on LuaJIT (not just for config) until not that long ago.

https://news.ycombinator.com/item?id=23856875


Five years ago is a pretty long time, and kornel does (or did) work at Cloudflare.


This is more nuanced in Rust's case.

Rust is trying to systemically improve safety and reliability of programs, so the degree to which it succeeds is Rust's problem.

OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs, and they take any bug in any Rust program as a proof by contradiction that Rust doesn't work.


> Rust is trying to systemically improve safety and reliability of programs, so the degree to which it succeeds is Rust's problem.

GNU coreutils first shipped in what, the 1980s? It's so old that it would be very hard to find the first commit. Whereas uutils is still beta software which didn't ask to be representative of "Rust", at all. Moreover, GNU coreutils are still sometimes not compatible with their UNIX forebears. Even considering this first, more modest standard, it is ridiculous to hold this software to it, in particular.


You would not be able to find the first commit. The repositories for Fileutils, Shellutils, and Textutils do not exist, at least anywhere that I can find. They were merged as Coreutils in 2003 in a CVS repository. A few years later, it was migrated to git.

If anyone has original Fileutils, Shellutils, or Textutils archives (released before the ones currently on GNU's ftp server), I would be interested in looking at them. I looked into this recently for a commit [1].

[1] https://www.mail-archive.com/coreutils@gnu.org/msg12529.html


In this case I agree. Small, short-running programs that don't need to change much are the easy case for C, and they had plenty of time to iron out bugs and handle edge cases. Any difficulties that C may have caused are a sunk cost. Rust's advantages on top of that get reduced to mostly nice-to-haves rather than fixing burning issues.

I don't mean to tell Rust uutils authors not to write a project they wanted, but I don't see why Canonical was so eager to switch, given that there are non-zero switching costs for others.


>OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs, and they take any bug in any Rust program as a proof by contradiction that Rust doesn't work.

Yeah, that's such a tired take. If anything this shows how good Rust's guarantees are. We had a bunch of non-experts rewrite a sizable number of tools that had 40 years of bugfixes applied, Canonical pulled the rewritten versions in all at once, and the result is mostly just a few performance regressions on edge cases.

I find this such a great confirmation of the Rust language design. I've seen a few rewrites in my career, and it rarely goes this smoothly.


It might be a bit of bad publicity for those who want to rewrite as much as possible in Rust. While Rust is not to blame, it shows that just rewriting something in Rust doesn't magically make it better (as some Rust hype might suggest). Maybe Ubuntu was a bit too eager in adopting the Rust Coreutils, caring more about that hype than about stability.


> Rust is not to blame

Isn't that an unfalsifiable statement until the coreutils get written in another language and can be compared?


> Isn't that an unfalsifiable statement

Sounds pretty axiomatic: Rust is not to blame for someone else's choice to ship beta software?


> OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs

That is the narrative that Rust fanboys promote. AFAIK Rust could be useful for a particular kind of bug (memory safety). Rust programs can also have coding errors or other bugs.


>That is the narrative that Rust fanboys promote.

Strawmanning is not a good look.


People in Europe don't have the automatic anti-regulation sentiment that the US has. Regulations, at least from a consumer perspective, seem to be working pretty well in the EU.

- My mobile operator wanted to charge me $6/MB for data roaming, until the anti-business EU regulation killed the golden goose. Roaming is free across EU. The mobile operator is still in business.

- USB-C not just on iPhone, but also all the crappy gadgets that used to be micro-USB. Consumer prices on electronics probably rose by $0.01 per unit.

- Chip & PIN and NFC contactless payments were supported everywhere many years before Apple Pay adopted them. European regulators forced banks to make fraud their problem and cooperate to fix it.

- The card payment system got upgraded despite card interchange fees being legally capped at ~0.3%. The bureaucrats killed an innovative business model of ever-increasing merchant fees given back to card owners as cashback, which made everyone else, paying the same prices in cash, the suckers subsidising the card businesses.

- Apple insinuates they only give 1 year of warranty, but it magically becomes 2 years if you remind them they're in the EU.


> - Apple insinuates they only give 1 year of warranty, but it magically becomes 2 years if you remind them they're in the EU.

3 actually, if bought after 2021


> How do we boost innovation and deliver cutting-edge products to Europe while navigating complex and untested new rules?

Ask your lawyers? They can parkour through the most complex laws when you need European tax loopholes.


What if you just tried making products that don’t seem to violate the rules on their face?

It’s not like the DMA outlaws software. It deals with certain practices, and makes certain business models pretty untenable.

But it doesn’t just ban everything.


It's hard to find good data sources for this, especially now that StackOverflow is in decline[1].

IEEE's methodology[2] is sensible given what's possible, but the data sources are all flawed in some ways (that don't necessarily cancel each other out). The number of search results reported by Google is the most volatile indirect proxy signal. Search results include everything mentioning the query, with no promise of being a fair representation of 2025. People using a language rarely refer to it literally as the "X programming language", and it's a stretch to count all publicity as "top language" popularity.

TIOBE uses this method too, and has the audacity to display the result as popularity with two decimal places, but their historical data shows the "popularity" of C dropping by half over two years, and then doubling the next year. Meanwhile, actual C usage didn't budge at all. This method has a +/- 50% error margin.

[1]: https://redmonk.com/rstephens/2023/12/14/language-rankings-u... [2]: https://spectrum.ieee.org/top-programming-languages-methodol...


By far the most useful and helpful is job ads: they literally define the demand side of the programming language market.

Yes, that does not show us how much code is running out there, and some companies might have huge armies with very low churn, so the COBOL stacks in banks don't show up, but I can't think of a more useful and directly measurable way of understanding a language's real utility.


> the most useful and helpful is job ads

That would certainly be the case, if it were not for the fact that [fake job postings][1] are a thing.

[1]: https://globalnews.ca/news/10636759/fake-job-postings-warnin...


Is there a reason to believe this would skew results?

i.e., are you assuming (insinuating) that jobs for some programming languages are more likely to be fake?


I would assume so. I expect there to be a lot of job postings looking for more "sexy" technologies to create the image that those companies are growing and planning towards the future. And conversely, I wouldn't expect any job postings for old "streets behind" technologies like COBOL to be fake, as they wouldn't help with such signalling.


Yes, to your point: COBOL, which ranks very low here, is still fundamental to the infrastructure of several major industries, with some sources [1] reporting that it is used in:

43% of all banking systems.

95% of all US ATM transactions.

80% of all in-person credit card transactions.

96% of travel bookings.

This may very well dramatically change in the next few years with such an emphasis on enterprise AI tools to rewrite large COBOL repositories. [2]

[1] https://www.pcmag.com/articles/ibms-plan-to-update-cobol-wit...

[2] e.g. Blitzy https://paper.blitzy.com/blitzy_system_2_ai_platform_topping...


I can only speak to the two bigger German banks (i.e., the Sparkasse and VR banks), but if you look at their outsourced development providers (Atruvia and Sparkasse Informatik), they're still offering incentives for their apprentices to learn COBOL, especially in the German dual apprenticeship programs, which they can steer more easily than university courses. My wife has been doing COBOL for one of them since 2012, and the demand has never diminished. If anything, it's increased because experienced developers are retiring. They even pull some of these retired developers back for particularly challenging projects.


Sparkasse and VR aren't the two largest German banks. DB is at least double the size of Commerzbank, which is again 100mn in assets ahead of DZ. I don't find it all that surprising that these small banks are still trying to keep their legacy systems alive, but it's not the case for the bigger boys. (Source: work for several of them)


You are right if we only talk about assets. Should've clarified I meant more in regards to retail customers and branches.


Oh, right, consumer banks. Yes I can imagine they're all extremely legacy bound. They're a very small percentage of banking, though.


Cobol is used in pretty much all enterprise legacy systems.

But "used in" doesn't mean that it's actively being developed by more then a tiny team for maintaining it.

As this graph we're commenting on is mostly talking about popularity/most used, it's never going to rate higher, because for every one Cobol dev there are more than 100 Java devs employed by the same company.


That's a pretty wild claim. What's legacy for you? I'd consider legacy to be e.g. J2EE crap running on Web[Sphere|Logic], and that holds most of the points in that league table vs COBOL.


Legacy software, to me, is whatever the company that employs me says is legacy software.

Pretty much every business I've worked at to date has had such legacy software, which was inevitably still used in some contexts.

It's not always obvious, because - following the previous example's numbers - only 1-2 Java devs will have to interact with the legacy software again, hence from the perspective of the remaining 98, Cobol doesn't exist anymore.


If they're talking about Cobol, it's usually systems originating before the early 90s that haven't been completely rewritten.

J2EE would be late 90s and 2000s.


In retail banking I'm sure that this could be true. Working in investment banking, I never saw a single COBOL application, or had to have my C++/Java/$MODERNLANGUAGE code interact with one.


Corp bank here; everyone has rumours about COBOL systems, but no one I've ever spoken to has seen or interacted with one, or has any other evidence these really exist anymore either.


Me neither.

But I asked for a bank statement from my old savings account a few years ago, and it took two weeks to print out, printed in monospace dot matrix.

Or the betting company I was a customer of, which suspends betting every day at 6:30am for an hour of daily maintenance. Ironically, they would accept bets for football matches played at that time, but the system was shut down.

I suspect both are run on COBOL.


You haven’t seen or heard them because they are abstracted away by APIs, circuit breakers and proxies. Almost ALL banks, credit card companies, travel systems and other high throughput transaction systems run on mainframe that is written in COBOL.


I think the issue here is that people working in fintech don't seem to come across these systems much, if at all - if you know one specifically, please tell us.


It's still there at the accounting/backend level. Automated Financial Systems' Level 3 and its replacement, Vision, are commercial loan systems.

LVL3 is pure COBOL. It has been recently deprecated, but there are many banks that own the code and are still self-hosting it, along with its IBM green screen support.

Vision is a Java front end in front of an updated COBOL backend. When your reputation is based on your reliability and long-term code stability, at what point do you risk making the conversion, versus training new developers to work on your system?

https://www.linkedin.com/jobs/view/business-analyst-afs-visi...


No, we are not afraid of our own systems. The fabled computer system that everyone is too scared to touch doesn't exist (I work in payment processing). There are levels of controls way outside these systems which provide these safety nets (e.g. settlement / reconciliation controls).

If the cobol is still there, it’s not due to risk. If anything, the cobol is a much higher operational risk than replacing it.


Analogously, GDSes like SABRE still ran on mainframes until very recently (c. 2023) [0]. SABRE was written in some combination of assembly and some kind of in-house dialect of PL/I, if I recall.

[0] https://www.theregister.com/2022/04/11/gds_gets_over_histori...


I worked briefly at a company that wrote applications that interacted with bank mainframes. Think endpoint bank teller systems and in-branch customer/account management. They definitely do exist - every major bank has a mainframe written in (usually) COBOL.

But it's very abstracted; part of our main product offering WAS abstracting it. On top of our ready-to-use applications, we offered APIs for higher-level data retrieval and manipulation. Under the hood, that orchestrates mainframe calls.

But even then there could be more levels of abstraction. Not every bank used screen-level mainframe access. Some used off-the-shelf mainframe abstractors like JxChange (yes, there's a market for this).

Fintech would be even more abstracted, I imagine. At that point you can only interact with the mainframe a few levels up, but it's still there. Out of sight.


Yeah when I worked in investment banking it was VBA and Java everywhere, never saw or heard of COBOL.


    > Working in investment banking, I never saw a single COBOL application
What was the back office settlement or wire transfer system written in? There is a good chance that some part of them was written in COBOL. And while Bloomberg terminals are a vendor product, many of their screens had some COBOL in them for a bloody long time.

Also, lots of quantitative software at i-banks uses LINPACK or BLAS, which are written in FORTRAN.


Well, I had a very badly specified project to write a library for our back office systems to do SWIFT payments from our C++ applications, via COM. There was no obvious COBOL involved on either side, but it has to be said that the whole use case for the library was very murky. And it never worked, due to the lack of spec, not the languages.


First-hand knowledge: ERGO and Munich Re both have a lot of COBOL still doing the core business. You will most likely never run into the system, because they just run batch jobs - sometimes configured via a "nice" web UI… you configure your job, submit, and the next morning you have your report… that's why you never actually see COBOL.


1. Not all roles are advertised. I've actually only been interviewed for two of the jobs I've ever had, both at the same place - my current employer because it's a public institution and so it always advertises and interviews for jobs even if it has an internal candidate who is likely to be a good fit. In fact the first of those jobs was basically my shape on purpose, another candidate was an equally good fit and they hired both of us.

Everywhere else people hired me because they knew who I was and what I could do and so in place of an "interview" maybe I grab lunch with some people I know and they explain what they want and I say yeah that sounds like a job I'd take and maybe suggest tweaks or focus changes. No shortlist of candidates, no tech interview, no tailoring a CV to match an advert. Nothing -> Lunch or Drinks -> Job offer.

So that can cause some distortion, especially for the niche languages where there are like six experts and you know them - an advert is futile there.


> measurable way of understanding a language's real utility

It feels like that metric misses "utility" and instead comes from a very American (or maybe capitalistic is the better word) mindset.

What about Max/MSP/Jitter? Huge impact in the music scene, probably a very small number of jobs available, so it'd rank fairly low, while it's probably the top media/music language out there today. There are tons of languages that provide "the most utility for their domain" yet barely have any public job ads about them at all.

I think such a metric would be useful to gauge the "employability of someone who knows that language" if anything, but it's probably more pain than gain to link "# of job ads" with "utility".


Thinking about how to measure this properly: why not just use the moving average of daily downloads over 30 days from each package repository?

… yes CI would be a lot of these downloads, but it’s at least a useful proxy
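
A rough sketch of that metric (hypothetical numbers; a real version would pull the daily counts from each registry's API):

    // Hypothetical daily download counts; a real version would fetch these
    // from a package registry. The windowed average smooths out daily spikes.
    fn moving_average(daily: &[u64], window: usize) -> Vec<f64> {
        daily
            .windows(window)
            .map(|w| w.iter().sum::<u64>() as f64 / window as f64)
            .collect()
    }

    fn main() {
        let downloads = [120, 150, 90, 200, 170, 160, 140];
        // A 3-day window for brevity; the proposal above is 30 days.
        println!("{:?}", moving_average(&downloads, 3));
    }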


Yeah except job adverts have enormous lag behind what's actually popular. For example we used Rust quite a lot at my previous company but we didn't advertise for Rust developers at all.

Also then you're looking at which languages were popular in the past whereas the interesting stat is which languages are being used to start new projects.


Interesting might not be the same as useful.

If I'm trying to figure out which language to learn next, knowing what I can get paid for might be more useful, even if it's not that "interesting".

If lots of projects are starting up in Rust, but I can't get interviews because nobody is advertising, how useful is learning Rust?


Well, we have to define what a language's popularity means. Rust is surely more "hyped" than Java, but Java has at least an order of magnitude more developers, software written, etc.

So in which meaning do you use 'popular'?


Ideally we'd like to know both, as they tell us different things.


    > we used Rust quite a lot at my previous company but we didn't advertise for Rust developers at all.
How did you find Rust developers when you needed to hire?


Find developers. Tell them they get to use Rust. You now have Rust developers.


Find a CS program that teaches Rust and hire their graduates.


Existing C++ developers learned Rust.


Plus TIOBE had Perl enter the top 10 suddenly this year but I do not see any new developers. And Ada too! Where are all those Ada programmers?



It's more like people in golf suits agreeing on corruption schemes rather than actual devs making decisions.


Which nonetheless reveals it is more relevant than others.


https://pkgstats.archlinux.de/packages?compare=ada,gcc,go,ja...

Ada seems pretty popular on Arch

This data is kinda worthless for popularity contests, since they may get picked up by AUR packages, but this gives a solid insight into which languages are foundational

I wish the same was available for other distros

You can do the same with docker images

    curl -s https://hub.docker.com/v2/repositories/library/python/ | jq -r ".pull_count"
    8244552364

    curl -s https://hub.docker.com/v2/repositories/library/golang/ | jq -r ".pull_count"
    2396145586

    curl -s https://hub.docker.com/v2/repositories/library/perl/ | jq -r ".pull_count"
    248786850

    curl -s https://hub.docker.com/v2/repositories/library/rust/ | jq -r ".pull_count"
    102699482


"Top Languages" doesn't mean "better" nor does it mean "best"


That’s a C++ URL parser library; it has nothing to do with the programming language.


You want gcc-ada for the programming language.


>>Perl enter the top 10 suddenly this year but I do not see any new developers.

Perl is almost as active as Javascript. And more useful than Python.

https://metacpan.org/recent

I write Perl to do all sorts of things every week. It's strange it's not in the top 5 list.


You're joking right?


If you look at the programming language list, apart from Python and Java, most are targeted at specific platforms (databases, browsers, embedded systems) or tech (SQL for databases).

The general-purpose programming languages today are still Python, Java, and Perl. Make of this what you will.

Larry Wall at one point said that if you make something very specific to a use case (like awk, sed, PHP, etc.), it sort of naturally starts to fall out of general-purpose use.

It's just that Kotlin, Rust, Go, SQL, Julia, JavaScript, etc. are not general-purpose programming languages.


That was an active debate ... 15 years ago


Yep. And the sources are too often self-reinforcing and self-referential.

Use the "right"/better tool from the toolbox, the tool you know best, and/or the tool that the customer wants and/or makes the most money. This might include Ada[0] or COBOL[1]. Or FORTH[2] or Lua[3]. Popularity isn't a measure of much of anything apart from SEO.

0. https://www2.seas.gwu.edu/~mfeldman/ada-project-summary.html

1. https://theirstack.com/en/technology/cobol

2. https://dl.acm.org/doi/pdf/10.1145/360271.360272

3. https://www.freebsd.org/releases/12.0R/relnotes/#boot-loader


> It's hard to find good data sources for this

I like this:

https://madnight.github.io/githut/#/pull_requests/2024/1

It gives you a count of public repos on GitHub by language used, going back to 2012.


This is much better and aligns with the Stack Overflow survey's "what are you working on in your free time".


Perhaps the best source would now be the statistics of LLM queries, if they were available.

Edit: I see they raise this point at length themselves in TFA.


The sad thing is that Apple seemed more inviting to developers before they got high on the App Store cut.

Every boxed Mac OS X came with a second disc containing the SDK (Xcode has always been an unstable cow, tho). They used to publish tech notes that explained how the OS works, rather than WWDC videos with high-level overviews that feel more like advertisements.

Back then they at least made attempts to use some open standards, and allowed 3rd parties to fill gaps in the OS, instead of acting like a Smaug of APIs.


Because they were coming back from the edge of bankruptcy and needed any help they could get to become profitable again.

My graduation thesis was porting a visualisation framework from NeXTSTEP to Windows, Objective-C => C++, because my supervisor saw no future in keeping the NeXT hardware on our campus. If only he knew what would happen a few years later.


Apple has always resented 3rd-party developers, right back to the Apple II days. They saw them as capturing value that Apple had created.


They said it in the Epic vs Apple litigation, something along the lines of "we create the entire App Store market", as if the 3rd-party developers aren't part of it.


Videos with ELI5-level overviews that feel like an ad - these do sell. A disc with an SDK does not.


Properties of a language shape the tooling and culture that develops around it.

JS exploded in popularity while Internet Explorer was still around, before the ES6 cleanup of the language. JS had lots of gotchas where seemingly obvious code wasn't working correctly, and devs weren't keeping up with all the dumb hacks needed for even basic things. Working around IE6's problems used to be a whole profession (quirksmode.org).

Browsers didn't have support for JS modules yet, and HTTP/1.1 couldn't handle many small files, so devs needed a way to "bundle" their JS anyway. Node.js happened to have a solution that also enabled reusing code between client and server, and the micro-libraries saved developers from having to deal with JS engine differences and memorize all the quirks.


In other languages people build abstraction libraries for that, like the Apache Portable Runtime, which gives you a consistent API for most things you need to build a web server, using just one dependency. That would also save you from needing to memorise all the micro-libraries needed to work around the relevant quirks.

Splitting it into a library per quirk seems like an unforced error in that context.

One could have excused it as a way to keep the code size down, but if you use npm you also usually use a build step, which could drop the unused parts, so it doesn't really hold water.


All of the differences are attributable to the language.

The Apache runtime isn't sent over the network every time it's used, but JS in the browser is.

The JS ecosystem has got several fix-everything-at-once libraries. However, JS is very dynamic, so even when "compiled", it's very hard to remove dead code. JS compiling JS is also much slower than C compiling C. Both of these factors favor tiny libraries.

Abstractions in JS have a higher cost. Even trivial wrappers add overhead before JIT kicks in, and even then very few things can be optimized out, and many abstractions even prevent JIT from working well. It's much cheaper to patch a few gaps only where they're needed than to add a foundational abstraction layer for the whole app.


I have a vague memory of using dead-code-removing JavaScript toolchains pretty long ago, like the Closure Compiler.

I'm not sure the dependency tree madness actually translates to smaller code in the end either, given the bloat in the average web app... but to be fair, it would be perfectly plausible that JavaScript developers opted for micro-libraries motivated by performance, even if that wasn't the effect of the decision.


The press release doesn't give any concrete numbers, but even if it doubles the efficiency of Peltier coolers, they're still 3-5× less efficient than heat pumps.

Thermoelectric cooling is notable for not having any moving parts and for its ability to scale down to small sizes, so it might end up having many specialized applications, but for A/C, heat pumps are already very effective.


And what about service life? I had a mini-fridge that used this technology, and it stopped working after about 2 years. Was that just bad luck or poor quality, or some inherent lifetime of the components?


In principle Peltier elements should be very robust over time, as a solid-state system where the only moving parts are fans (versus traditional refrigeration, which includes a high-pressure pump...).

In practice I strongly suspect most Peltier-based systems are built very cheaply... because their inefficiency means the majority of the market is bordering on a scam. Sophisticated consumers aren't going to be buying very many fridges built with them (of course you might have a niche use case where they actually make sense and you're willing to pay for a quality product, but do most purchasers?).


Thermal cycling is murder on rigid electronic connections; the mechanical connection between the heatsink on each side of the Peltier cell is a prime example.


Rust is over 10 years old now. It has a track record of delivering what it promises, and a very satisfied growing userbase.

OTOH static analyzers for C have been around for longer than Rust, and we're still waiting for them to disprove Rice's theorem.

AI tools so far are famous for generating low-quality code, and generating bogus vulnerability reports. They may eventually get better and end up being used to make C code secure - see DARPA's TRACTOR program.


The applicability of Rice's theorem with respect to static analysis or abstract interpretation is more complex than you implied. First, static analysis tools are largely pattern-oriented. Pattern matching is how they sidestep undecidability. These tools have their place, but they aren't trying to be the tooling you or the parent claim. Instead, they are more useful to enforce coding style. This can be used to help with secure software development practices, but only by enforcing idiomatic style.

Bounded model checkers, on the other hand, are this tooling. They don't have to disprove Rice's theorem to work. In fact, they work directly with this theorem. They transform code into state equations that are run through an SMT solver. They are looking for logic errors, use-after-free, buffer overruns, etc. But, they also fail code for non-terminating execution within the constraints of the simulation. If abstract interpretation through SMT states does not complete in a certain number of steps, then this is also considered a failure. The function or subset of the program only passes if the SMT solver can't find a satisfactory state that triggers one of these issues, through any possible input or external state.

These model checkers also provide the ability for user-defined assertions, making it possible to build and verify function contracts. This allows proof engineers to tie in proofs about higher level properties of code without having to build constructive proofs of all of this code.

Rust has its own issues. For instance, its core library is unsafe, because it has to use unsafe operations to interface with the OS, or to build containers or memory management models that simply can't be described with the borrow checker. This has led to its own CVEs. To strengthen the core library, core Rust developers have started using Kani -- a bounded model checker like those available for C or other languages.
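
For a flavor of what this looks like, here is a minimal Kani-style proof harness (a hypothetical function, not from any real codebase; run with `cargo kani`):

    // Kani explores all possible inputs symbolically rather than testing samples.
    #[kani::proof]
    fn addition_stays_bounded() {
        let a: u8 = kani::any();          // symbolic input: any of the 256 values
        let b: u8 = kani::any();
        kani::assume(a < 100 && b < 100); // user-supplied precondition
        let sum = a + b;                  // proven not to overflow under the assumption
        assert!(sum < 200);               // user-defined contract, checked for all inputs
    }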

Bounded model checking works. This tooling can be used to make either C or Rust safer. It can be used to augment proofs of theorems built in a proof assistant to extend this to implementation. The overhead of model checking is about that of unit testing, once you understand how to use it.

It is significantly less expensive to teach C developers how to model check their software using CBMC than it is to teach them Rust and then have them port code to Rust. Using CBMC properly, one can get better security guarantees than using vanilla Rust. Overall, an Ada + Spark, CBMC + C, Kani + Rust strategy coupled with constructive theory and proofs regarding overall architectural guarantees will yield equivalent safety and security. I'd trust such pairings of process and tooling -- regardless of language choice -- over any LLM derived solutions.


Sure it's possible in theory, but how many C codebases actually use formal verification? I don't think I've seen a single one. Git certainly doesn't do anything like that.

I have occasionally used CBMC for isolated functions, but that must already put me in the top 0.1% of formal verification users.


It's not used more because it is unknown, not because it is difficult to use or impractical.

I've written several libraries and several services now that have 100% coverage via CBMC. I'm quite experienced with C development and with secure development, and reaching this point always finds a handful of potentially exploitable errors I would have missed. The development overhead of reaching this point is about the same as the overhead of getting to 80% unit test coverage using traditional test automation.


You're describing cases in which static analyzers/model checkers give up, and can't provide a definitive answer. To me this isn't side-stepping the undecidability problem, this is hitting the problem.

C's semantics create dead-ends for non-local reasoning about programs, so you get inconclusive/best-effort results propped up by heuristics. This is of course better than nothing, and still very useful for C, but it's weak and limited compared to the guarantees that safe Rust gives.

The bar set for Rust's static analysis and checks is to detect and prevent every UB in safe Rust code. If there's a false positive, people file it as a soundness bug or a CVE. If you can make Rust's libstd crash from safe Rust code, even if it requires deliberately invalid inputs, it's still a CVE for Rust. There is no comparable expectation of having anything reliably checkable in C. You can crash stdlib by feeding it invalid inputs, and it's not a CVE, just don't do that. Static analyzers are allowed to have false negatives, and it's normal.

You can get better guarantees for C if you restrict semantics of the language, add annotations/contracts for gaps in its type system, add assertions for things it can't check, and replace all the C code that the checker fails on with alternative idioms that fit the restricted model. But at that point it's not a silver bullet of "keep your C codebase, and just use a static analyzer", but it starts looking like a rewrite of C in a more restrictive dialect, and the more guarantees you want, the more code you need to annotate and adapt to the checks.

And this is basically Rust's approach. The unsafe Rust is pretty close to the semantics of C (with UB and all), but by default the code is restricted to a subset designed to be easy for static analysis to be able to guarantee it can't cause UB. Rust has a model checker for pointer aliasing and sharing of data across threads. It has a built-in static analyzer for memory management. It makes programmers specify contracts necessary for the analysis, and verifies that the declarations are logically consistent. It injects assertions for things it can't check at compile time, and gives an option to selectively bypass the checkers for code that doesn't fit their model. It also has a bunch of less rigorous static analyzers detecting certain patterns of logic errors, missing error handling, and flagging suspicious and unidiomatic code.

It would be amazing if C had a static analyzer that could reliably assure with a high level of certainty, out of the box, that a heavily multi-threaded complex code doesn't contain any UB, doesn't corrupt memory, and won't have use-after-free, even if the code is full of dynamic memory (de)allocations, callbacks, thread-locals, on-stack data of one thread shared with another, objects moved between threads, while mixing objects and code from multiple 3rd party libraries. Rust does that across millions lines of code, and it's not even a separate static analyzer with specially-written proofs, it's just how it works.
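
As a small illustration (a sketch, not code from any project mentioned above): thread-safety is part of the type system, so sharing data across threads compiles only when the types declare it safe.

    use std::sync::Arc;
    use std::thread;

    fn main() {
        // Arc is atomically reference-counted, so sharing it across threads
        // is sound. The same code with Rc fails to compile: "Rc cannot be
        // sent between threads safely". No separate analyzer involved.
        let shared = Arc::new(vec![1, 2, 3]);
        let worker = thread::spawn({
            let shared = Arc::clone(&shared);
            move || println!("worker sees {:?}", shared)
        });
        worker.join().unwrap();
        println!("main sees {:?}", shared);
    }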

Such analysis requires code with sufficient annotations and restricted to design patterns that obviously conform to the checkable model. Rust had a luxury of having this from the start, and already has a whole ecosystem built around it.

C doesn't have that. You start from a much worse position (with mutable aliasing, const that barely does anything, and a type system without ownership or any thread safety information) and need to add checks and refactor code just to catch up to the baseline. And in the end, with all that effort, you end up with a C dialect peppered with macros, and merely fix one problem in C, without getting additional benefits of a modern language.

CBMC+C has a higher ceiling than vanilla Rust, and SMT solvers are more powerful, but the choice isn't limited to C+analyzers vs only plain Rust. You still can run additional checkers/solvers on top of everything Rust has built-in, and further proofs are easier thanks to being on top of stronger baseline guarantees and a stricter type system.


If we mark any case that might be undecidable as a failure case, and require that code be written that can be verified, then this is very much sidestepping undecidability by definition. Rust's borrow checker does the same exact thing. Write code that the borrow checker can't verify, and you'll get an error, even if it might be perfectly valid. That's by design, and it's absolutely a design meant to sidestep undecidability.

Yes, CBMC + C provides a higher ceiling. Coupling Kani with Rust results in the exact same ceiling as CBMC + C. Not a higher one. Kani compiles Rust to the same goto-C that CBMC compiles C to. Not a better one. The abstract model and theory that Kani provides is far stricter than what Rust provides with its borrow checker and static analysis. It's also more universal, which is why Kani works on both safe and unsafe Rust.

If you like Rust, great. Use it. But, at the point of coupling Kani and Rust, it's reaching safety parity with model checked C, and not surpassing it. That's fine. Similar safety parity can be reached with Ada + Spark, C++ and ESBMC, Java and JBMC, etc. There are many ways of reaching the same goal.

There's no need to pepper C with macros or to require a stronger type system with C to use CBMC and to get similar guarantees. Strong type systems do provide some structure -- and there's nothing wrong with using one -- but unless we are talking about building a dependent type system, such as what is provided with Lean 4, Coq, Agda, etc., it's not enough to add equivalent safety. A dependent type system also adds undecidability, requiring proofs and tactics to verify the types. That's great, but it's also a much more involved proposition than using a model checker. Rust's H-M type system, while certainly nice for what it is, is limited in what safety guarantees it can make. At that point, choosing a language with a stronger type system or not is a style choice. Arguably, it lets you organize software in a better way that would require manual work in other languages. Maybe this makes sense for your team, and maybe it doesn't. Plenty of people write software in Lisp, Python, Ruby, or similar languages with dynamic and duck typing. They can build highly organized and safe software. In fact, such software can be made safe, much as C can be made safe with the appropriate application of process and tooling.

I'm not defending C or attacking Rust here. I'm pointing out that model checking makes both safer than either can be on their own. As with my original reply, model checking is something different than static analysis, and it's something greater than what either vanilla C or vanilla Rust can provide on their own. Does safe vanilla Rust have better memory safety than vanilla C? Of course. Is it automatically safe against the two dozen other classes of attacks by default and without careful software development? No. Is it automatically safe against these attacks with model checking? Also no. However, we can use model checking to demonstrate the absence of entire classes of bugs -- each of these classes of bugs -- whether we model check software written in C or in Rust.

If I had to choose between model checking an existing codebase (git or the Linux kernel), or slowly rewriting it in another language, I'd choose the former every time. It provides, by far, the largest gain for the least amount of work.


People innately admire difficult skills, regardless of their usefulness. Acrobatic skateboarding is impressive, even when it would be faster and safer to go in a straight line or use a different mode of transport.

To me, skill and effort are misplaced and wasted when they're spent on manually checking invariants that a compiler could check better automatically, or on implementing clever workarounds for language warts that no longer provide any value.

Removal of busywork and pointless obstacles won't make smart programmers dumb and lazy. It allows smart programmers to use their brainpower on bigger more ambitious problems.


These types of comments always remind me that we forget where we come from in terms of computation, every time.

It's important to remember Rust's borrow checker was computationally infeasible 15 years ago. C & C++ are much older than that, and they come from an era where variable name length affected compilation time.

It's easy to publicly shame people who do hard things for a long time in the light of newer tools. However, many people who like these languages have been using them since before the languages we champion today were mere ideas.

I personally like Go these days for its stupid simplicity, but when I'm going to do something serious, I'll always use C++. You can fight me, but you'll never pry C++ from my cold, dead hands.

For note, I don't like C & C++ because they are hard. I like them because they provide a more transparent window onto the processor, which is a glorified, hardware implemented PDP-11 emulator.

Last, we shall not forget that all processors are C VMs, anyway.


> It's important to remember Rust's borrow checker was computationally infeasible 15 years ago.

The core of the borrow checker was being formulated in 2012[1], which is 13 years ago. No infeasibility then. And it's based on ideas that are much older, going back to the 90s.

Plus, you are vastly overestimating the expense of borrow checking, it is very fast, and not the reason for Rust's compile times being slow. You absolutely could have done borrow checking much earlier, even with less computing power available.

1: https://smallcultfollowing.com/babysteps/blog/2012/11/18/ima...


> It's important to remember Rust's borrow checker was computationally infeasible 15 years ago.

IIRC borrow checking usually doesn't consume that much compilation time for most crates - maybe a few percent or thereabouts. Monomorphization can be significantly more expensive and that's been much more widely used for much longer.
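
For illustration, a sketch of where that cost comes from (hypothetical function):

    // Each concrete T used below gets its own compiled copy of `max_of`,
    // so codegen and optimization work scale with the number of instantiations.
    fn max_of<T: PartialOrd>(a: T, b: T) -> T {
        if a > b { a } else { b }
    }

    fn main() {
        println!("{}", max_of(1u32, 2u32));     // instantiation 1: u32
        println!("{}", max_of(0.5f64, 1.5f64)); // instantiation 2: f64
        println!("{}", max_of("a", "b"));       // instantiation 3: &str
    }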


> It's important to remember Rust's borrow checker was computationally infeasible 15 years ago. C & C++ are much older than that, and they come from an era where variable name length affected compilation time.

I think you're setting the bar a little too high. Rust's borrow-checking semantics draw on much earlier research (for example, Cyclone had a form of region-checking in 2006); and Turbo Pascal was churning through 127-character identifiers on 8088s in 1983, one year before C++ stream I/O was designed.

EDIT: changed Cyclone's "2002" to "2006".


I remember; I was there coding in the 1980s, hence why I know C and C++ were not the only alternatives, rather the ones that eventually won in the end.


> the processor, which is a glorified, hardware implemented PDP-11 emulator.

This specific claim seems like just gratuitously rewriting history.

I can get how you'd feel C (and certain dialects of C++) are "closer to the metal" in a certain sense: C supports very few abstractions, and with fewer abstractions there are fewer "things" between you and "the metal". But this is as far as it goes. C does not represent - by any stretch of the imagination - an accurate model of the computation or memory of a modern CPU. It does stay close to the PDP-11, but calling modern CPUs "glorified hardware emulators of PDP-11" is just preposterous.

The PDP-11 was an in-order CISC processor with no virtual memory, cache hierarchy, branch prediction, symmetric multiprocessing, or SIMD instructions. Some modern CPUs (namely the x86/x64 family) do emulate a CISC ISA on top of something that is probably more RISC-like, but that's as far as we can say they are trying to behave like a PDP-11 (even though the intention was to behave like a first-gen Intel Pentium).


> we shall not forget that all processors are C VMs

This idea is some 10yrs behind. And no, thinking that C is "closer to the processor" today is incorrect

It makes you think it is close, which in some sense is even worse.


> This idea is some 10yrs behind.

Akshually[1] ...

> And no, thinking that C is "closer to the processor" today is incorrect

THIS thinking is about 5 years out of date.

Sure, this thinking you exhibit gained prominence and got endlessly repeated by every critic of C who once spent a summer doing a C project in undergrad, but it's been more than 5 years since this opinion was essentially nullified by:

    Okay, if C is "not close to the processor", what's closer?
Assembler? After all if everything else is "Just as close as C, but not closer", then just what kind of spectrum are you measuring on, that has a lower bound which none of the data gets close to?

You're repeating something that was fashionable years ago.

===========

[1] There's always one. Today, I am that one :-)


Standard C doesn't have inline assembly, even though many compilers provide it as an extension. Other languages do.

> After all if everything else is "Just as close as C, but not closer", then just what kind of spectrum are you measuring on

The claim about C being "close to the machine" means different things to different people. Some people literally believe that C maps directly to the machine, when it does not. This is just a factual inaccuracy. For the people that believe that there's a spectrum, it's often implied that C is uniquely close to the machine in ways that other languages are not. The pushback here is that C is not uniquely so. "just as close, but not closer" is about that uniqueness statement, and it doesn't mean that the spectrum isn't there.


> Some people literally believe that C maps directly to the machine, when it does not.

Maybe they did, 5 years (or more) ago when that essay came out. It was wrong even then, but repeating it is even more wrong.

> This is just a factual inaccuracy.

No. It's what we call A Strawman Argument, because no one in this thread claimed that C was uniquely close to the hardware.

Jumping in to destroy an argument when no one is making it is almost a textbook example of strawmanning.


Claiming that a processor is a "C VM" implies that it's specifically about C.


Lots of languages at a higher level than C are closer to the processor in that they have interfaces for more instructions that C hasn't standardized yet.


> Lots of languages at a higher level than C are closer to the processor in that they have interfaces for more instructions that C hasn't standardized yet.

Well, you're talking about languages that don't have standards; they have a reference implementation.

IOW, no language has standards for processor intrinsics; they all have implementations that support intrinsics.


> Okay, if C is "not close to the processor", what's closer?

LLVM IR is closer. Still higher level than Assembly

The problem is thus:

    char a, b, c;
    c = a + b;

could not be more different between x86 and ARM.


> LLVM IR is closer. Still higher level than Assembly

So your reasoning for repeating the once-fashionable statement is that "an intermediate representation that no human codes in is closer than the source code"?


To me a compiler's effort is misplaced and wasted when it's spent on checking invariants that could be checked by a linter or a sidecar analysis module.


Checking of whole-program invariants can be accurate and done basically for free if the language has suitable semantics.

For example, if a language has non-nullable types, then you get this information locally for free everywhere, even from 3rd party code. When the language doesn't track it, then you need a linter that can do symbolic execution, construct call graphs, data flows, find every possible assignment, and still end up with a lot of unknowns and waste your time on false positives and false negatives.
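
A minimal sketch of the difference (hypothetical `find_user`, in Rust since that's the kind of language being described):

    // The possibility of absence is in the return type, visible locally.
    fn find_user(id: u64) -> Option<String> {
        if id == 0 { None } else { Some(format!("user{id}")) }
    }

    fn main() {
        // The compiler forces callers to handle None; no call graphs or
        // data-flow analysis are needed to learn that absence is possible.
        match find_user(7) {
            Some(name) => println!("found {name}"),
            None => println!("no such user"),
        }
    }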

Linters can't fix language semantics that create dead-ends for static analysis. It's not a matter of trying harder to make a better linter. If a language doesn't have clear-enough aliasing, immutability, ownership, thread-safety, etc. then a lot of analysis falls apart. Recovering required information from arbitrary code may be literally impossible (Rice's theorem), and getting even approximate results quickly ends up requiring whole-program analysis and prohibitively expensive algorithms.

And it's not even an either-or choice. You can have robust checks for fundamental invariants built into the language/compiler, and still use additional linters for detecting less clear-cut issues.


> Linters can't fix language semantics that create dead-ends for static analysis. It's not a matter of trying harder to make a better linter. If a language doesn't have clear-enough aliasing, immutability, ownership, thread-safety, etc. then a lot of analysis falls apart

This assertion is known to be disproven. seL4 is a fully memory-safe (and with even more safety baked in) major systems programming behemoth that is written in C + annotations, where the analysis is conducted in a sidecar.

To obtain extra safety (but still not as much as seL4) in Rust, you must add a sidecar in the form of Miri. Nobody proposes adding Miri into Rust.

Now, it is true that seL4 is a pain in the ass to write, compile, and check, but there is a lot of unexplored design space in the spectrum between Rust, Rust+Miri, and seL4.


If the compiler is not checking them, then it can't assume them, and that reduces the opportunities for optimizations. If the checks don't run in the compiler, then they're not running every time; if you do want them to run every time, then they may as well live in the compiler instead.
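
A concrete (if small) example of a checked invariant enabling optimization, assuming Rust for illustration: because the compiler enforces NonZeroU32's invariant, it can reuse the zero bit-pattern as Option's None.

    use std::mem::size_of;
    use std::num::NonZeroU32;

    fn main() {
        // The enforced invariant frees up a niche for the discriminant:
        assert_eq!(size_of::<Option<NonZeroU32>>(), 4);
        // Without the invariant, Option needs a separate tag:
        assert_eq!(size_of::<Option<u32>>(), 8);
    }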

