I keep hearing that LLMs are trained on "Internet crap" but is it true? For instance, we know from the Anthropic copyright case that they scanned millions of books to make a training set. They certainly use Internet content for training, but I'm sure it's curated to a large degree. They don't just scrape random pages and feed them into an LLM.
> I keep hearing that LLMs are trained on "Internet crap" but is it true?
Karpathy repeated this in a recent interview [0]: if you looked at random samples in the pretraining set, you'd mostly see a lot of garbage text, and it's very surprising that it works at all.
The labs have focused a lot more on finetuning (posttraining) and RL lately, and from my understanding that's where all the desirable properties of an LLM are trained into it. Pretraining just teaches the LLM the semantic relations it needs as the foundation for finetuning to work.
Pretraining teaches LLMs everything. SFT and RL are about putting that "everything" into useful configurations and gluing it together so that it works better.
It is true. Datasets are somewhat cleaned, but only somewhat. When you have terabytes worth of text, there's only so much cleaning you can do economically.
We are talking about government-curated data here, the bias should be obvious. Popular LLMs still have huge bias problems but it would be way worse with only government-curated data.
I recently had to build a production-ready workflow in N8N. It ended up being such a spaghetti flow of custom code nodes and custom HTTP requests (because none of the provided connectors did exactly what we needed) that I was left wondering whether it wouldn't have been easier to code it up in Cursor.
> we could solve this dead time problem and start doing things on the go
Interesting idea, but how do you address car safety concerns? Studies consistently show that cognitive distraction, even with voice interfaces, can significantly increase crash risk. Wouldn’t managing emails and calendars while driving still fall into that category?
Phone calls while driving are pretty clearly in the Overton window of 'safe things you can do in a car', and it doesn't seem a priori obvious to me that this is worse. Though I do agree with you that in an ideal world people wouldn't even take phone calls and would instead focus 100% of their effort on not killing me.
(cofounder here) Fair concern - cognitive distraction is real. We see it more like taking a phone call while driving (which people already do). We're purposely keeping interactions simple to make sure features aren't too distracting, and are working on a 'safe mode' that limits you to basic read-only operations while driving. We're actively researching attention management to make it simpler. Safety comes first.
There are so many times other than driving that voice is the preferred medium here. It feels like just one example. (And as others pointed out, taking a hands-free phone call during a drive is not at all provocative these days, to the point that it feels like an odd thing to fixate on personally.)
Given that they dented their car while doing the demo, it doesn't seem weird to fixate on. I think the criticism is valid, given the framing presented and the pitch used to investors.
Also, audio interfaces incur different amounts of mental load / distraction. I wouldn't be surprised if this was more distracting than just talking to a person.
I think the car dent story originally came from https://news.ycombinator.com/item?id=45008239. As I clarified there, the dent happened before the April demo, during hackathon stress, while taking the car out of its parking spot.
But the criticism about cognitive load is totally valid. A better example could be dog walking, cooking, or screen-free time.
> We identified a total of 1,862 MCP servers exposed to the internet. From this set, we manually verified a sample of 119. All 119 servers granted access to internal tool listings without authentication.
The tool listings are not necessarily a secret, so not sure how this is "exposed". We have a public MCP, anyone can read our tool listings, but to actually use the tools you need to authenticate.
We have multiple parts of the brain that interact in vastly different ways! Your cerebellum won't be running the role of the pons.
Most parts of the brain cannot take over for others. Self-healing is the exception, not the rule. Yes, we have a degree of neuroplasticity, but there are many limits.
Only if that were a singular system; however, it is not. [0]
For example... The nerve cells in your gut may speak to the brain, and interact with it in complex ways we are only just beginning to understand, but they are separate systems that both have control over the nervous system, and other systems. [1]
General Intelligence, the psychological theory, and General Modelling, whilst sharing words, share little else.
> Modern games provide much stronger feedback. Now, when you hit an enemy, you might see:
> the crosshair briefly changes to confirm the hit, damage numbers pop up above the enemy, sound effects, enemy death animations, a progress bar filling up, a new skill unlocked, random reward and more...
I wonder if we can gamify todo apps in the same way. Most are too boring and too corporate. They should implement all the gaming bells and whistles to make sure you complete your tasks.
It is coming along more and more. But I think the core is being able to handle a lot more tasks, and therefore being able to easily break them down into smaller ones. That is really the heart of the game loop.
> or be in epistemically different worlds where international competition is irrelevant (eg clearance filters based in nationality, government and military, market that has exotic languages, etc).
I like this hedging strategy, can probably also apply this to the risk of AI taking over our jobs (licensed professions won’t be going away any time soon).
According to Dutch law, you lose your Dutch citizenship if you accept another nationality. The Dutch embassies (who are responsible for renewing Dutch passports abroad) are well aware of this law and have processes in place to refuse a passport renewal if you can’t provide proof of temporary residence in the country you reside in. The local institutions however, don’t have these processes in place and are generally not aware of this law because it only happens to a tiny little percentage of the population. And nobody updates the national registry with your new nationality because that’s the responsibility of local municipalities, not the Department of Foreign Affairs. So if you decide to simply renew your passport in the Netherlands instead of abroad, they’ll just give you a new passport because you’re still registered as a Dutch citizen at the local level and they don’t have a process in place to check your foreign nationality.
Don’t ask me how I know :) It is one of the few accountability sinks that doesn’t affect me negatively.
This is interesting but why do you make it so hard to view the actual agents.json file? After clicking around in the registry (https://wild-card.ai/registry) for 10 minutes I still haven't found one example.
But they were not trained on government-sanctioned homegrown EU data.