RAG - Retrieval-Augmented Generation
An LLM knows a lot of things, but nothing about recent events, your company's internal help articles, or your personal files like resumes.
A RAG can be broken down into 👇
Retrieval - organize new information as vector embeddings that are easy to search over (semantic search)
Augmented - for any user query, identify the relevant chunks of information in the vector store
Generation - incorporate the retrieved information into the response to the user query
Technically, an LLM is only needed for the generation step; vector embeddings and semantic search do the heavy lifting in a RAG pipeline.
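The three steps above can be sketched in a few lines of Python. This is a toy, not a real system: the `embed` function here is just word counts standing in for a learned embedding model, and the documents and query are made up for illustration.

```python
import math
from collections import Counter

# Toy "embedding": bag-of-words counts. A real RAG system would call a
# learned embedding model here; this stand-in keeps the example runnable.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Retrieval: store each document alongside its vector.
docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
]
store = [(d, embed(d)) for d in docs]

# Augmented: find the chunk most relevant to the user query.
query = "How long do refunds take?"
best_doc, _ = max(store, key=lambda item: cosine(embed(query), item[1]))

# Generation: the retrieved chunk is stuffed into the LLM prompt,
# and the LLM (not shown) writes the final answer from this context.
prompt = f"Context: {best_doc}\n\nQuestion: {query}\nAnswer:"
print(best_doc)
```

Swapping the toy `embed` for a real embedding model and the word-overlap match for a vector database is essentially all that separates this sketch from a production RAG pipeline.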
RLHF - Reinforcement Learning from Human Feedback
A base model by itself is “just” a text-completion model that can generate coherent and creative text, but it doesn’t quite know how to follow instructions… the way a good assistant (like ChatGPT) would.
With some human feedback, we can steer the model toward outputs that directly follow the instruction.
RLHF is a step in the fine-tuning stage where human feedback on the base model’s text-completion outputs is incorporated back into the model, so it learns to generate only high-quality, instruction-following text.
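One piece of RLHF can be made concrete with toy numbers: a reward model is trained so that the completion a human preferred scores higher than the rejected one, typically via a Bradley-Terry style loss. The sketch below uses plain floats as stand-in "rewards"; real RLHF trains a neural reward model and then optimizes the LLM against it with an RL algorithm such as PPO.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry preference loss: -log sigmoid(r_chosen - r_rejected).

    Small when the human-preferred completion's reward is clearly higher,
    large when the model ranks the pair the wrong way."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The loss shrinks as the chosen completion's reward pulls ahead,
# which is the gradient signal that teaches the reward model the
# human's ranking.
close_call = preference_loss(0.1, 0.0)   # rewards nearly tied
clear_win  = preference_loss(2.0, 0.0)   # preferred output scores much higher
print(close_call > clear_win)
```

The trained reward model then replaces the human in the loop: the LLM is fine-tuned with RL to produce completions that this reward model scores highly.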