Machine learning for Large Language Models (LLMs) has two phases: pre-training and fine-tuning.
During pre-training, the computer is “left alone” with large amounts of text data (billions of pages of text) and a single objective: learn to produce human-like text. This machine learning phase results in an AI model, called a foundation model, that possesses an internal map of how human language is structured. This map is a high-dimensional space (with far more dimensions than the three we humans can traverse) that encodes the many ways different elements of human language relate to one another: elements that are similar in meaning or usage sit closer together in this space than elements that are dissimilar. The neural network that builds this map during pre-training can then use it to generate rudimentary new text. Therefore, after pre-training, the AI model can adequately perform the objective it was given at the start, but it is not yet refined enough to be made available to the public.
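To make the idea of this map concrete, here is a minimal sketch in Python. The word vectors below are made-up toy values (real models learn embeddings with hundreds or thousands of dimensions from text data); the only point is that related words end up closer together in the space, measured here with cosine similarity.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings for illustration only;
# real models learn much higher-dimensional vectors from text.
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1]),
    "car": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    # Values near 1.0 mean the vectors point in nearly the same
    # direction (similar words); values near 0 mean unrelated words.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high (~0.99)
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low  (~0.12)
```

In other words, “cat” and “dog” sit close together in the toy space while “cat” and “car” sit far apart, which is the geometric sense in which the model’s map encodes how language elements relate.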
During fine-tuning, the goal is to further optimize the model so that it handles specific tasks better or produces specific kinds of output. In this phase, human workers give the model feedback on its responses. LLMs need this because there are many different language tasks - chatting, summarizing, reporting, joking - each of which calls for a different kind of text. Furthermore, the foundation model can produce false, hurtful, or dangerous statements. What feedback the model receives depends on its intended use and on the intentions of the developers (more on this later). Without the fine-tuning phase, the model's output is less reliable and less useful, and not fit for use by the general public.
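As a rough illustration of what this human feedback can look like, here is a hedged sketch of the kind of preference data used in techniques such as Reinforcement Learning from Human Feedback (RLHF). The prompt, the responses, and the `reward` heuristic below are all hypothetical stand-ins; a real pipeline trains a learned reward model from many such human comparisons and then updates the LLM's weights toward the preferred behavior.

```python
# Hypothetical preference data: human annotators compare two model
# responses to the same prompt and mark which one they prefer.
preference_data = [
    {
        "prompt": "Summarize: The meeting is moved to 3pm on Friday.",
        "chosen": "The meeting now starts at 3pm on Friday.",
        "rejected": "Meetings are generally boring and pointless.",
    },
]

def reward(response: str) -> float:
    # Stand-in for a learned reward model: a trivial heuristic that
    # favors responses actually conveying the rescheduled time.
    return 1.0 if "3pm" in response else 0.0

for pair in preference_data:
    # Fine-tuning pushes the model to score "chosen" responses
    # above "rejected" ones, so preferred behavior becomes more likely.
    assert reward(pair["chosen"]) > reward(pair["rejected"])
    print("preference respected for:", pair["prompt"])
```

The design choice to collect comparisons rather than absolute scores reflects that humans judge “which answer is better” more consistently than “how good is this answer on a scale”, which is why preference pairs are a common format for this feedback.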