|  |  | Writing an LLM from scratch, part 25 – instruction fine-tuning (gilesthomas.com) | 
|  | 2 points by gpjt 1 day ago |
|
|  |  | Writing an LLM from scratch, part 24 – the transcript hack (gilesthomas.com) | 
|  | 1 point by gpjt 2 days ago |
|
|  |  | Retro Language Models: Rebuilding Karpathy's RNN in PyTorch (gilesthomas.com) | 
|  | 1 point by ibobev 3 days ago |
|
|  |  | Writing an LLM from scratch, part 23 – fine-tuning for classification (gilesthomas.com) | 
|  | 1 point by ibobev 4 days ago |
|
|  |  | Retro Language Models: Rebuilding Karpathy's RNN in PyTorch (gilesthomas.com) | 
|  | 3 points by gpjt 6 days ago |
|
|  |  | Writing an LLM from scratch, part 23 – fine-tuning for classification (gilesthomas.com) | 
|  | 1 point by gpjt 8 days ago |
|
|  |  | Writing an LLM from scratch, part 22 – training our LLM (gilesthomas.com) | 
|  | 254 points by gpjt 15 days ago | 10 comments |
|
|  |  | Revisiting Karpathy's 'The Unreasonable Effectiveness of RNNs' (gilesthomas.com) | 
|  | 1 point by ibobev 18 days ago |
|
|  |  | Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks' (gilesthomas.com) | 
|  | 2 points by gpjt 20 days ago |
|
|  |  | Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com) | 
|  | 1 point by ibobev 22 days ago |
|
|  |  | Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com) | 
|  | 1 point by gpjt 23 days ago |
|
|  |  | Writing an LLM from scratch, part 20 – starting training, and cross entropy loss (gilesthomas.com) | 
|  | 41 points by gpjt 28 days ago | 3 comments |
|
|  |  | How Do LLMs Work? (gilesthomas.com) | 
|  | 2 points by gpjt 44 days ago | 1 comment |
|
|  |  | How Do LLMs Work? (gilesthomas.com) | 
|  | 1 point by ibobev 45 days ago |
|
|  |  | The maths you need to start understanding LLMs (gilesthomas.com) | 
|  | 616 points by gpjt 58 days ago | 120 comments |
|
|  |  | What AI chatbots are doing under the hood (gilesthomas.com) | 
|  | 2 points by gpjt 62 days ago |
|
|  |  | LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud (gilesthomas.com) | 
|  | 2 points by gpjt 73 days ago |
|
|  |  | The fixed length bottleneck and the feed forward network (gilesthomas.com) | 
|  | 1 point by gpjt 77 days ago |
|
|  |  | Writing an LLM from scratch, part 17 – the feed-forward network (gilesthomas.com) | 
|  | 8 points by gpjt 79 days ago |
|
|  |  | Writing an LLM from scratch, part 16 – layer normalisation (gilesthomas.com) | 
|  | 1 point by gpjt 3 months ago |
|
|  |  | Leaving PythonAnywhere (gilesthomas.com) | 
|  | 3 points by gpjt 4 months ago |
|
|  |  | Writing an LLM from scratch, part 15 – from context vectors to logits (gilesthomas.com) | 
|  | 7 points by gpjt 5 months ago |
|
|  |  | Writing an LLM from scratch, part 14 – the complexity of self-attention at scale (gilesthomas.com) | 
|  | 1 point by gpjt 5 months ago |
|
|  |  | Writing an LLM from scratch, part 13 – attention heads are dumb (gilesthomas.com) | 
|  | 351 points by gpjt 5 months ago | 67 comments |
|
|  |  | Writing an LLM from scratch, part 12 – multi-head attention (gilesthomas.com) | 
|  | 3 points by gpjt 6 months ago |
|
|  |  | Writing an LLM from scratch, part 11 – batches (gilesthomas.com) | 
|  | 2 points by gpjt 6 months ago |
|
|  |  | Writing an LLM from scratch, part 10 – dropout (gilesthomas.com) | 
|  | 90 points by gpjt 7 months ago | 8 comments |
|
|  |  | Adding /Llms.txt (gilesthomas.com) | 
|  | 1 point by gpjt 7 months ago |
|
|  |  | Writing an LLM from scratch, part 9 – causal attention (gilesthomas.com) | 
|  | 4 points by gpjt 7 months ago |
|
|  |  | Writing an LLM from scratch, part 8 – trainable self-attention (gilesthomas.com) | 
|  | 380 points by gpjt 8 months ago | 31 comments |
|