Postgres is a way better fit than Kafka if you want a large number of durable streams. But a flexible OLTP database like PG is bound to require more resources, and polling loops (not even long polling!) are not a great answer for following live updates.
Plug: If you need granular, durable streams in a serverless context, check out s2.dev
> foyer draws inspiration from Facebook/CacheLib, a highly-regarded hybrid cache library written in C++, and ben-manes/caffeine, a popular Java caching library, among other projects.
Foyer is a great open source contribution from RisingWave
We built an S3 read-through cache service for s2.dev so that multiple clients could share a Foyer hybrid cache with key affinity: https://github.com/s2-streamstore/cachey
Yes, currently it has its own /fetch endpoint that then makes S3 GET(s) internally. One potential gotcha, depending on how you are using it: an exact byte "Range" header is always required so that the request can be mapped to page-aligned byte-range requests on the S3 object. But with that constraint, it is feasible to add an S3 shim.
It is also possible to stop requiring the header, but I think it would complicate the design around coalescing reads – the layer above foyer would have to track concurrent requests to the same object.
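To make the mapping concrete, here is a minimal sketch of turning an exact byte range into page-aligned sub-ranges that can be cached and fetched independently; the 16 MiB page size and the function shape are assumptions for illustration, not Cachey's actual constants or API:

```rust
/// Split a client-requested inclusive byte range into fixed-size, page-aligned
/// ranges, so each page can be looked up in the cache or fetched from S3
/// independently. PAGE_SIZE is an assumed value for illustration.
const PAGE_SIZE: u64 = 16 * 1024 * 1024;

fn page_aligned_ranges(start: u64, end: u64) -> Vec<(u64, u64)> {
    assert!(start <= end);
    let first_page = start / PAGE_SIZE;
    let last_page = end / PAGE_SIZE;
    (first_page..=last_page)
        .map(|p| {
            let page_start = p * PAGE_SIZE;
            let page_end = page_start + PAGE_SIZE - 1;
            (page_start, page_end)
        })
        .collect()
}

fn main() {
    // A request for bytes 10 MiB..40 MiB touches pages 0, 1, and 2.
    for (s, e) in page_aligned_ranges(10 << 20, 40 << 20) {
        println!("GET bytes={s}-{e}");
    }
}
```

Requiring an exact range up front keeps this mapping trivial; an open-ended request would presumably need the object size (or a HEAD) before the set of pages could even be determined.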
Auto-scaled Kubernetes deployments, one for each availability zone, currently on m*gd instances, which give us local NVMe. The pods can easily push GiBps with 1-2 CPUs used; network is the bottleneck, so we made it a scaling dimension (thanks, KEDA).
On the client side, each gateway process uses kube.rs to watch ready endpoints in the same zone as itself, and frequently polls /stats exposed by Cachey for recent network throughput as a load signal.
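For the discovery side, here is a minimal sketch of what a zone-local endpoint watch could look like with kube.rs; the service name, namespace, ZONE env var, and the /stats polling (omitted here) are assumptions for illustration, not the actual gateway code:

```rust
use futures::TryStreamExt;
use k8s_openapi::api::discovery::v1::EndpointSlice;
use kube::{runtime::{watcher, WatchStreamExt}, Api, Client};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    // Our own zone, e.g. injected via the downward API (assumed env var name).
    let my_zone = std::env::var("ZONE")?;
    // EndpointSlices belonging to the cache Service ("cachey" is an assumed name).
    let slices: Api<EndpointSlice> = Api::default_namespaced(client);
    let cfg = watcher::Config::default().labels("kubernetes.io/service-name=cachey");

    let stream = watcher(slices, cfg).applied_objects();
    futures::pin_mut!(stream);
    while let Some(slice) = stream.try_next().await? {
        // Keep only endpoints that are ready and in the same zone as this gateway.
        let same_zone: Vec<String> = slice
            .endpoints
            .iter()
            .filter(|ep| {
                ep.conditions.as_ref().and_then(|c| c.ready).unwrap_or(false)
                    && ep.zone.as_deref() == Some(my_zone.as_str())
            })
            .flat_map(|ep| ep.addresses.iter().cloned())
            .collect();
        // Feed these into the routing table; each node's recent throughput
        // (polled from /stats) would be attached separately as the load signal.
        println!("zone-local cache endpoints: {same_zone:?}");
    }
    Ok(())
}
```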
To improve hit rates with key affinity, clients use rendezvous hashing with bounded load (https://arxiv.org/abs/1608.01350) to pick a node: if a node exceeds a predetermined throughput limit, the next choice for the key is picked.
We may move towards consistent hashing – it would be a great problem to have, if we needed so many Cachey pods in a zone that O(n) hashing was meaningful overhead! An advantage of the current approach is that it does not suffer from the cascaded overflow problem (https://arxiv.org/abs/1908.08762).
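Roughly, the bounded-load selection described above could look like this sketch; the node fields, the cap handling, and the fallback when every node is over the limit are assumptions for illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A candidate cache node with its current load signal, e.g. recent network
/// throughput reported via /stats. Field names are illustrative.
struct Node {
    id: String,
    recent_throughput: u64, // bytes/sec
}

/// Rendezvous (highest-random-weight) hashing: score each (node, key) pair and
/// prefer the node with the highest score, so a given key keeps landing on the
/// same node while that node stays under the load bound.
fn score(node_id: &str, key: &str) -> u64 {
    let mut h = DefaultHasher::new();
    node_id.hash(&mut h);
    key.hash(&mut h);
    h.finish()
}

/// Pick the best-scoring node under the throughput cap; if every node is over
/// the cap, fall back to the best-scoring node overall (an assumed policy).
fn pick_node<'a>(nodes: &'a [Node], key: &str, cap: u64) -> Option<&'a Node> {
    let mut ranked: Vec<&Node> = nodes.iter().collect();
    ranked.sort_by_key(|n| std::cmp::Reverse(score(&n.id, key)));
    ranked
        .iter()
        .copied()
        .find(|n| n.recent_throughput < cap)
        .or_else(|| ranked.first().copied())
}

fn main() {
    let nodes = vec![
        Node { id: "cachey-0".into(), recent_throughput: 900 << 20 },
        Node { id: "cachey-1".into(), recent_throughput: 100 << 20 },
    ];
    // With a 500 MiB/s cap, keys that would prefer cachey-0 overflow to cachey-1.
    if let Some(n) = pick_node(&nodes, "my-bucket/object-123#page-0", 500 << 20) {
        println!("routing to {}", n.id);
    }
}
```

Because each key ranks all nodes independently, overflow from a hot node spreads across the rest of the fleet instead of piling onto a single successor, which is the failure mode the cascaded-overflow paper describes for ring-based schemes.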