Main takeaways
LLMs don’t read text; they process tokens, numerical representations of text chunks
Tokenisation breaks text into pieces; as a rule of thumb, ~100 tokens ≈ 75 words, which affects both cost and context limits
Embeddings convert tokens to vectors, capturing meaning in thousands of dimensions (e.g. ~4,096 in some models)
The famous king − man + woman ≈ queen example shows that embeddings encode semantic relationships
Parameters (billions of them!) are the numbers learned during training
Hyperparameters like temperature let you control creativity vs. determinism
Attention mechanisms help models understand context and word relationships
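The ~100 tokens ≈ 75 words rule of thumb above can be turned into a quick cost estimate. This is a minimal sketch; `estimate_tokens` is a hypothetical helper based only on that ratio, not a real tokeniser (real token counts vary by model and text):

```python
# Rough token estimate from the ~100 tokens ≈ 75 words rule of thumb.
# estimate_tokens is a hypothetical helper, not part of any library;
# a real tokeniser would give model-specific counts.
def estimate_tokens(text: str) -> int:
    words = len(text.split())
    return round(words * 100 / 75)  # ≈ 4 tokens per 3 words

prompt = "The quick brown fox jumps over the lazy dog"  # 9 words
print(estimate_tokens(prompt))  # → 12
```

Useful for ballpark budgeting against a context limit; for billing-accurate counts you would use the model's own tokeniser.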
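The king − man + woman ≈ queen relationship can be sketched with toy vectors. The 3-D vectors below are invented for illustration (real embeddings have thousands of learned dimensions); the point is only that vector arithmetic plus cosine similarity can surface semantic analogies:

```python
from math import sqrt

# Toy 3-D "embeddings" (dims roughly: royalty, maleness, femaleness).
# These values are hand-made for illustration, not learned vectors.
king  = [0.9, 0.8, 0.1]
man   = [0.1, 0.8, 0.1]
woman = [0.1, 0.1, 0.8]
queen = [0.9, 0.1, 0.8]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

# king - man + woman, component-wise
result = [k - m + w for k, m, w in zip(king, man, woman)]
print(cosine(result, queen))  # close to 1.0 for these hand-made vectors
```

With real embeddings the result is only *near* queen, so you search for the nearest neighbour rather than expecting an exact match.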
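Temperature's creativity-vs-determinism trade-off comes from rescaling logits before the softmax. A minimal sketch with invented logit values (low temperature sharpens the distribution toward the top choice; high temperature flattens it):

```python
from math import exp

# Temperature-scaled softmax over toy logits (values invented for illustration).
def softmax_t(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_t(logits, 0.5))  # low T: sharper, near-deterministic sampling
print(softmax_t(logits, 2.0))  # high T: flatter, more varied ("creative") sampling
```

At temperature → 0 this approaches greedy decoding (always the top token); higher values give lower-probability tokens a real chance of being sampled.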
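The attention idea can also be sketched in a few lines: a query vector is scored against each key, and a softmax turns the scores into weights over the context. The 2-D vectors here are invented examples, not learned values; this shows only the scaled dot-product weighting step, not a full attention layer:

```python
from math import exp, sqrt

# Scaled dot-product attention weights for one query over several keys.
# Vectors are invented 2-D examples for illustration, not learned values.
def attn_weights(query, keys):
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / sqrt(d) for key in keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(attn_weights(query, keys))  # highest weight lands on the matching first key
```

The weights then mix the corresponding value vectors, which is how a token's representation comes to depend on the words it relates to most.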