Main takeaways
Three paradigms: Supervised (labels), unsupervised (no labels), reinforcement (rewards)
Supervised learning is the most common: Learn from labelled examples to predict labels for new inputs
Always keep the bias-variance trade-off in mind: Simple models underfit (high bias), complex models overfit (high variance)
Unsupervised learning discovers hidden structure: Clustering, dimensionality reduction
Reinforcement learning learns from interaction: Exploration vs exploitation
RLHF (reinforcement learning from human feedback) is used to train modern LLMs like ChatGPT: RL to align model outputs with human preferences
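
To make the bias-variance takeaway concrete, here is a minimal sketch using k-nearest-neighbour regression in plain Python. Everything in it is an assumption chosen for illustration and not taken from the material above: the sin(x) target, the noise level, the sample sizes, and the helper names knn_predict, mse and make_data. With k=1 the model memorizes the training set (training error is exactly zero, the overfitting extreme); with k equal to the training set size it predicts the mean everywhere (the underfitting extreme). An intermediate k typically gives the lowest test error, though that is not guaranteed for every dataset.

```python
import math
import random


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def make_data(n, rng):
    """Noisy samples of sin(x) on [0, 2*pi]; the noise level is an arbitrary choice."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
train_x, train_y = make_data(40, rng)
test_x, test_y = make_data(200, rng)

# Small k = flexible model (low bias, high variance);
# large k = rigid model (high bias, low variance).
for k in (1, 5, 40):
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Comparing the train and test columns is the point: a large gap between them signals overfitting, while high error in both signals underfitting.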