Model fine-tuning¶
Fine-tuning in the LLM Mesh specializes a pre-trained model to perform better on a specific task or domain. It requires annotated data: prompts and their expected completions.
Setup¶
You need full outgoing Internet connectivity for downloading the models. Air-gapped setups are not supported.
Your admin must create an LLM connection with fine-tuning enabled. Fine-tuning is supported for OpenAI and Local Hugging Face connections.
Using the Fine-tuning recipe¶
Note
The LLM fine-tuning recipe is available to customers with the Advanced LLM Mesh add-on.
Standard usage¶
Import a dataset with two required columns: a prompt column (the input to the model) and a completion column (the ideal output). These columns must not contain missing values.
Optionally, the input dataset can include a system message column that explains the task for a given row. This column can contain missing values.
Run the recipe to obtain a fine-tuned model, ready for use in your LLM Mesh.
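Before running the recipe, it can help to check that no row is missing a prompt or a completion. The following is a minimal sketch in plain Python, not part of the recipe itself. The column names (`prompt`, `completion`, `system_message`), the helper names, and the choice to treat empty or whitespace-only strings as missing are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# One training example; values may be None when a cell is empty.
Row = Dict[str, Optional[str]]


def is_missing(value: Optional[str]) -> bool:
    """Treat None and empty/whitespace-only strings as missing (an assumption)."""
    return value is None or str(value).strip() == ""


def find_invalid_rows(
    rows: List[Row],
    prompt_col: str = "prompt",          # hypothetical column name
    completion_col: str = "completion",  # hypothetical column name
) -> List[int]:
    """Return the indices of rows whose prompt or completion is missing."""
    invalid = []
    for index, row in enumerate(rows):
        if is_missing(row.get(prompt_col)) or is_missing(row.get(completion_col)):
            invalid.append(index)
    return invalid


rows: List[Row] = [
    {
        "system_message": "Classify the sentiment as positive or negative.",
        "prompt": "I loved this film.",
        "completion": "positive",
    },
    # A missing system message is allowed.
    {"system_message": None, "prompt": "The plot was dull.", "completion": "negative"},
    # A missing completion is not allowed.
    {"system_message": None, "prompt": "Great soundtrack.", "completion": None},
]

invalid_rows = find_invalid_rows(rows)
print(invalid_rows)  # [2]
```

Rows reported here would need to be fixed or dropped before the dataset is used as the recipe input.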
Advanced usage¶
The fine-tuning recipe also supports a validation dataset as input.
When a validation dataset is provided, the loss graph in the model summary shows how the loss evaluated on the validation dataset evolves during fine-tuning.
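If you only have a single annotated dataset, one way to obtain a validation dataset is to hold out a fraction of the rows before fine-tuning. The sketch below illustrates a seeded random split in plain Python. The function name, the 20% default, and the seed are illustrative assumptions, not settings of the recipe.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    rows: Sequence[T],
    validation_fraction: float = 0.2,  # illustrative default, not a recipe setting
    seed: int = 42,
) -> Tuple[List[T], List[T]]:
    """Shuffle rows reproducibly and hold out a fraction for validation."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one row so the validation set is never empty.
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


examples = [
    {"prompt": f"Question {i}", "completion": f"Answer {i}"} for i in range(10)
]
train_rows, validation_rows = split_train_validation(examples)
print(len(train_rows), len(validation_rows))  # 8 2
```

The two resulting sets would then be stored as separate datasets, with the held-out one supplied as the validation input.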
Additional remarks¶
When fine-tuning a Local Hugging Face model, the recipe uses the code environment defined at the connection level. Its container configuration can be set in the recipe settings. Using a GPU is strongly advised when fine-tuning Hugging Face models.
In all cases, the fine-tuning recipe does not apply the guardrails defined in the connection (e.g., PII detection is not performed).
Using Python code¶
Besides the visual fine-tuning recipe above, you can also fine-tune an LLM using Python code.