Hacker News

So this sounds like an application-layer approach, maybe just shy of a Replit or Base44, with the twist that you can own the pipeline. While there's something to that, I think some further questions around differentiation need to be answered. The biggest challenge is going to be the beachhead: which client demographic has the cash to want to own the pipeline rather than use SaaS, but doesn't have the staff on hand to do it?


I think that enterprises and small businesses alike need stuff like this, regardless of whether they're software companies or some other vertical like healthcare or legal. I worked at IBM for over a decade and it was always preferable to start with an open source framework if it fit your problem space, especially for internal stuff. We shipped products with components built on Elastic, Drupal, Express, etc.

You could make the same argument for Kubernetes. If you have the cash and the team, why not build it yourself? Most don't have the expertise or the time to find/train the people who do.

People want AI that works out of the box on day one. Not day 100.


Yeah, that’s a fair framing — it is kind of an “application layer” for AI orchestration, but focused on ownership and portability instead of just convenience.

Yeah, the beachhead will be our biggest issue: where to find the first hard-core users. I was thinking legal (they have a need for AI, but data cannot leave their servers), healthcare (same as legal, but with more regulations), and government (not right now, but they normally have deep pockets).

What do you think is a good starting place?


I like your ideas around legal and healthcare. Both are sectors where you've got interest and money. Another angle might be to partner with a service org that does transformation/modernization work as a way to accelerate delivery. Maybe MSPs? They're always trying to figure out how to lean out.

One idea might be to target a vertical sooner rather than later. The only thing better than an interested lawyer would be a selection of curated templates and prompts designed by people in the industry, for example. Then you get orchestration plus industry-specific vertical offerings, which is a much easier sell than a general-purpose platform. But then you're competing with the other vertically integrated offerings.

Maybe there are other differentiators? If this is like Bedrock for your network, maybe the angle is private models where you want them. Others are doing that, though, so there's pretty active competition there as well.

The more technical and general the audience, the more you're going to have to talk them out of just rolling Open WebUI themselves.


Yeah, totally fair. The “horizontal orchestration” story only goes so far — at some point you need vertical depth.

We’re starting with regulated enterprises (defense, healthcare, legal, fintech) where control and compliance actually matter. The same YAML-defined system can run in AWS, in a hospital, or fully air-gapped — no lock-in, no data leaving.
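To make the "YAML-defined system" idea concrete, here's a rough sketch of what such a recipe could look like. The keys and values below are purely illustrative assumptions on my part, not the actual LlamaFarm schema:

```yaml
# Hypothetical sketch of a portable, YAML-defined AI pipeline
# (illustrative keys only -- not the real LlamaFarm config format)
name: clinical-notes-rag
runtime:
  model: llama-3.1-8b-instruct
  backend: local          # the same file could target AWS, on-prem, or air-gapped
vector_store:
  type: local             # index lives on the host; no data leaves the network
  path: ./data/index
pipeline:
  - retrieve:
      top_k: 5
  - generate:
      max_tokens: 512
```

The point of a declarative definition like this is portability: the environment changes, the recipe doesn't.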

We’re building a few sample recipes to show it in action:

Legal: doc analysis + precedent search with local vector DBs

Healthcare: privacy-preserving RAG over clinical notes

Industrial/Defense: offline sensor or alert agents that sync later
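As a toy illustration of the privacy-preserving retrieval idea in the recipes above — everything stays on the local machine — here is a minimal bag-of-words retriever in plain Python. This is my sketch of the concept, not LlamaFarm code; a real system would use a proper embedding model and vector DB:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, top_k=1):
    """Rank local documents against the query; nothing leaves the host."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

notes = [
    "patient reports chest pain after exercise",
    "follow-up visit for seasonal allergies",
    "MRI scheduled for knee injury",
]
print(retrieve("chest pain on exertion", notes))
```

The retrieval step is the part that matters for compliance: the clinical notes and the index never touch an external service.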

Partnering with MSPs and modernization firms seems like the obvious path — they already have the relationships and budgets, and LlamaFarm gives them something repeatable to deploy.

Still figuring that out though — what’s the best way to actually get into the MSP funnel?


Follow-on, great question (separately, from John): can it run on Vulkan? He was saying he has a hard time finding stuff that doesn't only run on CUDA or ROCm, and sees this as a huge opportunity.


Yes! Vulkan support is in dev (it will be in the next versioned release, probably tomorrow).

You can pull down the repo and run a few easy commands to get inference up and running.

https://docs.llamafarm.dev/docs/models#lemonade-runtime


Yeah! Or maybe the GPU rental and lab market? (At least the part that isn't already fully hyperscaler-captured?) Like we were chatting about, John?


We are adding continuous model fine-tuning soon, and being able to bring extra horsepower into training is an opportunity. There is also a genuine opportunity to do the same thing with shared resources inside an internal server or VPC: scheduling and utilizing resources such as GPUs during off-hours to train, improve, etc. Idle GPU time, once passed, is never recovered, and many enterprises leave it on the table.

Any GPU that is not being used at 80% capacity needs to be put to work; we have a lot of work that can be done. (A lot of industries cannot lease their GPUs to the public due to regulatory issues).
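The off-hours idea can start as simply as gating training jobs on the clock. A minimal sketch of that gate (the window bounds and the job callable are illustrative assumptions, not anything LlamaFarm ships):

```python
from datetime import datetime, time

# Off-peak window when interactive GPU demand is low (illustrative choice)
OFF_PEAK_START = time(22, 0)  # 10 pm
OFF_PEAK_END = time(6, 0)     # 6 am

def in_off_peak(now=None):
    """True when the current time falls inside the overnight training window."""
    t = (now or datetime.now()).time()
    # The window wraps past midnight, so it is a union of two ranges.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END

def maybe_train(run_job, now=None):
    """Run the fine-tuning job only when GPUs would otherwise sit idle."""
    if in_off_peak(now):
        return run_job()
    return None
```

A real scheduler would also check live GPU utilization and preempt cleanly, but the time gate alone already reclaims capacity that would otherwise go unused.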



