Hacker News

So this sounds like an application-layer approach, maybe just shy of a Replit or Base44, with the twist that you can own the pipeline. While there's something to that, I think some further questions around differentiation need to be answered. The biggest challenge is going to be the beachhead: which client demographic has the cash to want to own the pipeline rather than use SaaS, but doesn't have the staff on hand to do it?


I think that enterprises and small businesses alike need stuff like this, regardless of whether they're software companies or some other vertical like healthcare or legal. I worked at IBM for over a decade and it was always preferable to start with an open source framework if it fit your problem space, especially for internal stuff. We shipped products with components built on Elastic, Drupal, Express, etc.

You could make the same argument for Kubernetes. If you have the cash and the team, why not build it yourself? Most don't have the expertise or the time to find/train the people who do.

People want AI that works out of the box on day one. Not day 100.


Yeah, that’s a fair framing — it is kind of an “application layer” for AI orchestration, but focused on ownership and portability instead of just convenience.

Yeah, the beachhead will be our biggest issue: where to find the first hard-core users. I was thinking legal (they have a need for AI, but data cannot leave their servers), healthcare (same as legal, but with more regulations), and government (not right now, but they normally have deep pockets).

What do you think is a good starting place?


I like your ideas around legal and healthcare. Both are sectors where you've got interest and money. Another angle might be to partner with a service org that does transformation/modernization work as a way to accelerate delivery. Maybe MSPs? They're always trying to figure out how to lean out.

One idea might be to target a vertical sooner rather than later. The only thing better than an interested lawyer would be a selection of curated templates and prompts designed by people in the industry, for example. Then you get orchestration plus industry-specific vertical offerings, which is a much easier sell than a general-purpose platform. But then you're competing with the other vertically integrated offerings.

Maybe there are other differentiators? If this is like Bedrock for your network, maybe the angle is private models where you want them. Others are doing that, though, so there's pretty active competition there as well.

The more technical and general the audience, the more you're going to have to talk them out of just rolling Open WebUI themselves.


Yeah, totally fair. The “horizontal orchestration” story only goes so far — at some point you need vertical depth.

We’re starting with regulated enterprises (defense, healthcare, legal, fintech) where control and compliance actually matter. The same YAML-defined system can run in AWS, in a hospital, or fully air-gapped — no lock-in, no data leaving.
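To make the "YAML-defined system" idea concrete, here's a rough sketch of what such a recipe could look like. The keys and values below are purely illustrative assumptions on my part, not the actual LlamaFarm schema:

```yaml
# Hypothetical sketch of a portable, YAML-defined AI pipeline
# (illustrative keys only -- not the real LlamaFarm config format)
name: clinical-notes-rag
runtime:
  model: llama-3.1-8b-instruct
  backend: local          # the same file could target AWS, on-prem, or air-gapped
vector_store:
  type: local             # index lives on the host; no data leaves the network
  path: ./data/index
pipeline:
  - retrieve:
      top_k: 5
  - generate:
      max_tokens: 512
```

The point of a declarative definition like this is portability: the environment changes, the recipe doesn't.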

We’re building a few sample recipes to show it in action:

Legal: doc analysis + precedent search with local vector DBs

Healthcare: privacy-preserving RAG over clinical notes

Industrial/Defense: offline sensor or alert agents that sync later
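As a toy illustration of the privacy-preserving retrieval idea in the recipes above — everything stays on the local machine — here is a minimal bag-of-words retriever in plain Python. This is my sketch of the concept, not LlamaFarm code; a real system would use a proper embedding model and vector DB:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, top_k=1):
    """Rank local documents against the query; nothing leaves the host."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

notes = [
    "patient reports chest pain after exercise",
    "follow-up visit for seasonal allergies",
    "MRI scheduled for knee injury",
]
print(retrieve("chest pain on exertion", notes))
```

The retrieval step is the part that matters for compliance: the clinical notes and the index never touch an external service.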

Partnering with MSPs and modernization firms seems like the obvious path — they already have the relationships and budgets, and LlamaFarm gives them something repeatable to deploy.

Still figuring that out though — what’s the best way to actually get into the MSP funnel?


Follow-on, great question (separately, from John): can it run on Vulkan? He was saying he has a hard time finding stuff that doesn't only run on CUDA or ROCm, and sees this as a huge opportunity.


Yes! Vulkan support is in dev (it will be in the next versioned release, probably tomorrow).

You can pull down the repo and run a few easy commands to get inference up and running.

https://docs.llamafarm.dev/docs/models#lemonade-runtime


Yeah! Or maybe the GPU rental and lab market? (At least the part that isn't already fully hyperscaler-captured?) Like we were chatting about, John?


We are adding continuous model fine-tuning soon, and being able to bring extra horsepower into training is an opportunity. There is also a genuine opportunity to do the same thing with shared resources inside an internal server or VPC: scheduling and utilizing resources such as GPUs during off-hours to train, improve, etc. Idle GPU time, once passed, is never recovered, and many enterprises leave it on the table.

Any GPU that is not being used at 80% capacity needs to be put to work; we have a lot of work that can be done. (A lot of industries cannot lease their GPUs to the public due to regulatory issues).
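The off-hours idea can start as simply as gating training jobs on the clock. A minimal sketch of that gate (the window bounds and the job callable are illustrative assumptions, not anything LlamaFarm ships):

```python
from datetime import datetime, time

# Off-peak window when interactive GPU demand is low (illustrative choice)
OFF_PEAK_START = time(22, 0)  # 10 pm
OFF_PEAK_END = time(6, 0)     # 6 am

def in_off_peak(now=None):
    """True when the current time falls inside the overnight training window."""
    t = (now or datetime.now()).time()
    # The window wraps past midnight, so it is a union of two ranges.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END

def maybe_train(run_job, now=None):
    """Run the fine-tuning job only when GPUs would otherwise sit idle."""
    if in_off_peak(now):
        return run_job()
    return None
```

A real scheduler would also check live GPU utilization and preempt cleanly, but the time gate alone already reclaims capacity that would otherwise go unused.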



