A pre-trained LLM is like a toddler: it has soaked up a lot about the world but can’t yet follow directions. Only the Metas and OpenAIs of the world are resourced enough (in time and money) to create these. Pre-trained LLMs (or base models) have a lot of world knowledge baked in, since much of the internet is thrown at them during training, but they aren’t good at following instructions.
Fine-tuning a pre-trained model is where we teach it a specific task. The model builds on its strong foundation of knowledge and learns a narrow task, like decoding legal jargon.
Fine-tuning needs a significant number of training examples showing input-output pairs so the model learns the task effectively. Understandably, creating this dataset is time-consuming.
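To make this concrete, here’s a minimal sketch of what such a dataset might look like in JSONL, a format many fine-tuning pipelines accept. The file name and the legal-jargon examples are made up for illustration.

```python
import json

# Hypothetical input-output pairs for a "decode legal jargon" task.
# A real fine-tuning dataset would contain thousands of such examples.
examples = [
    {
        "input": "The party of the first part shall indemnify the party of the second part.",
        "output": "The first party agrees to cover the second party's losses.",
    },
    {
        "input": "This agreement shall be construed in accordance with the laws of the State of Delaware.",
        "output": "Delaware law governs this contract.",
    },
]

# One JSON object per line, i.e. the JSONL format.
with open("legal_jargon_train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```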
Instruction-tuning is a newer approach where we train the model on examples that pair a natural-language instruction describing a task with the desired response, so it learns to follow instructions in general.
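As a rough sketch, a single instruction-tuning record often looks like the one below, in the instruction/input/output layout popularized by datasets like Alpaca (field names vary between datasets; this example is invented for illustration).

```python
# One instruction-tuning record in the common instruction/input/output
# layout (field names vary between datasets).
example = {
    "instruction": "Rewrite the following contract clause in plain English.",
    "input": "The lessee shall remit payment no later than the fifth day of each calendar month.",
    "output": "The tenant must pay rent by the 5th of every month.",
}

# During training, the instruction and input are combined into a prompt,
# and the model learns to produce the output.
prompt = f"{example['instruction']}\n\n{example['input']}"
print(prompt)
print("Target response:", example["output"])
```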
Often, combining fine-tuning and instruction-tuning results in more powerful and accurate models: we effectively teach the model a new domain and then give it instruction-following capability.
In-context learning is the technical (fancy) term for what we all do with the ChatGPT web app when we task it with doing things for us; it essentially boils down to prompt engineering. It could be zero-shot learning, with just the instructions, or few-shot learning, where we include a few examples to further specify the objective.
It is termed ‘in-context’ because we specify the task and examples within the model’s context, via the prompt. In doing so, the model learns the new task without changing any of its parameters, unlike pre-training or fine-tuning, where the parameters are updated.
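To make the zero-shot vs. few-shot distinction concrete, here is a sketch of the two prompt styles side by side; the sentiment task and the reviews are invented for illustration.

```python
# Zero-shot: the prompt carries only the instruction and the new input.
zero_shot_prompt = """Classify the sentiment of this review as positive or negative.

Review: The battery died within a week.
Sentiment:"""

# Few-shot: the same instruction plus a couple of worked examples
# placed in the context. The model's weights never change; the
# "learning" happens entirely inside the prompt.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: I loved every minute of it.
Sentiment: positive

Review: Total waste of money.
Sentiment: negative

Review: The battery died within a week.
Sentiment:"""

print(zero_shot_prompt)
print("---")
print(few_shot_prompt)
```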