Instruction Tuning

🟒 Free Lesson

Advertisement

Instruction Tuning

Instruction Tuning Pipeline

Base Model: pre-trained LLM; general language skills; lacks instruction following
Instruction Data: instruction-response pairs; diverse task formats; human-annotated quality
Training Process: SFT (supervised fine-tuning); cross-entropy loss on responses
Result: instruction-following model

What is Instruction Tuning?

Instruction tuning fine-tunes a pre-trained language model on a dataset of instructions paired with their expected responses. This teaches the model to follow an instruction rather than merely continue the text it is given.

Instruction Tuning Data Format

instruction_data = [
    {
        "instruction": "Summarize the following article in 3 sentences.",
        "input": "Artificial intelligence has seen remarkable progress...",
        "output": "AI has advanced significantly in recent years..."
    },
    {
        "instruction": "Translate the following text to Spanish.",
        "input": "The weather is beautiful today.",
        "output": "El clima es hermoso hoy."
    },
    {
        "instruction": "Write a Python function to calculate factorial.",
        "input": "",
        "output": "def factorial(n):\n    if n == 0:\n        return 1\n    return n * factorial(n-1)"
    }
]
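The records above can be rendered into a single training string with a small helper. This is a minimal sketch: the function name format_prompt and the include_response flag are hypothetical, and dropping the Input section when "input" is empty is an assumption (a common convention), not something this lesson prescribes.

```python
def format_prompt(example, include_response=True):
    """Render one record with the '### Instruction / Input / Response' template."""
    parts = [f"### Instruction:\n{example['instruction']}"]
    # Assumption: omit the Input section when the record has no input text
    if example.get("input"):
        parts.append(f"### Input:\n{example['input']}")
    parts.append("### Response:")
    prompt = "\n\n".join(parts)
    # For training, append the target; for inference, stop at the Response header
    if include_response:
        prompt += "\n" + example["output"]
    return prompt


with_input = {
    "instruction": "Translate the following text to Spanish.",
    "input": "The weather is beautiful today.",
    "output": "El clima es hermoso hoy.",
}
no_input = {
    "instruction": "Write a Python function to calculate factorial.",
    "input": "",
    "output": "def factorial(n):\n    if n == 0:\n        return 1\n    return n * factorial(n-1)",
}

print(format_prompt(with_input))
print(format_prompt(no_input, include_response=False))
```

Using one helper for both training and inference keeps the two prompt formats from drifting apart.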

Training Implementation

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def instruction_tune(model_name, dataset, epochs=3, lr=2e-5):
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for epoch in range(epochs):
        model.train()
        total_loss = 0

        for example in dataset:
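            # Format the prompt (everything up to the response header)
            prompt = f"""### Instruction:
{example['instruction']}

### Input:
{example['input']}

### Response:"""

            # Append the target response and an EOS token so the model learns to stop
            full_text = prompt + "\n" + example['output'] + (tokenizer.eos_token or "")

            inputs = tokenizer(full_text, return_tensors="pt",
                             truncation=True, max_length=512)

            # Train only on the response part: positions labeled -100 are
            # ignored by the loss. The prompt length is measured by tokenizing
            # the prompt alone, which is approximate at the token boundary.
            prompt_len = len(tokenizer(prompt)['input_ids'])
            labels = inputs['input_ids'].clone()
            labels[:, :prompt_len] = -100

            if prompt_len >= labels.shape[1]:
                continue  # response was truncated away; nothing to train on

            outputs = model(
                input_ids=inputs['input_ids'],
                attention_mask=inputs['attention_mask'],
                labels=labels
            )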

            loss = outputs.loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

            total_loss += loss.item()

        print(f"Epoch {epoch+1}: Loss = {total_loss/len(dataset):.4f}")

    return model
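To make "cross-entropy loss on responses" concrete, here is a toy sketch in plain Python with no model involved. The token ids and probabilities are made up for illustration, and the helper names build_labels and masked_mean_nll are hypothetical. It relies on one fact: -100 is the default ignore index for PyTorch's cross-entropy loss, so positions carrying that label do not contribute to the loss. The sketch leaves out the one-position shift that causal language models apply between inputs and labels.

```python
import math

IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy by default


def build_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) with every prompt position masked out."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def masked_mean_nll(token_probs, labels):
    """Average negative log-likelihood over positions that are not masked.

    token_probs[i] is the (made-up) probability assigned to the correct
    token at position i.
    """
    kept = [-math.log(p) for p, y in zip(token_probs, labels) if y != IGNORE_INDEX]
    return sum(kept) / len(kept)


# Toy ids, not produced by a real tokenizer
input_ids, labels = build_labels([11, 12, 13], [21, 22])
loss = masked_mean_nll([0.9, 0.9, 0.9, 0.5, 0.25], labels)
print(labels)
print(round(loss, 4))
```

Only the last two probabilities (the response positions) affect the result; changing the first three leaves the loss unchanged.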

Popular Instruction Tuning Datasets

Dataset        Size   Focus
Alpaca         52K    General instructions
Dolly          15K    Human-written by Databricks employees
OpenAssistant  161K   Conversational
FLAN           1.8M   Multi-task
Evaluation

def evaluate_instruction_model(model, tokenizer, test_cases):
    results = []

    for test in test_cases:
        prompt = f"""### Instruction:
{test['instruction']}

### Input:
{test['input']}

### Response:"""

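        inputs = tokenizer(prompt, return_tensors="pt")

        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=256,
                do_sample=True,  # temperature only takes effect when sampling
                temperature=0.7
            )

        # Decode only the newly generated tokens, not the echoed prompt
        new_tokens = outputs[0][inputs['input_ids'].shape[1]:]
        response = tokenizer.decode(new_tokens, skip_special_tokens=True)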
        results.append({
            "instruction": test['instruction'],
            "expected": test['output'],
            "generated": response
        })

    return results
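The function above only collects expected and generated text. One simple way to turn those pairs into numbers is exact match plus a token-overlap F1, sketched below. Both metrics are assumptions chosen for illustration, not the lesson's prescribed method; the helper names are hypothetical, and whitespace tokenization with lowercasing is a crude stand-in for proper normalization. Open-ended tasks generally need human or model-based judgment instead.

```python
from collections import Counter


def normalize(text):
    """Lowercase and split on whitespace (a deliberately simple normalization)."""
    return text.lower().split()


def token_f1(expected, generated):
    """Harmonic mean of token precision and recall between two strings."""
    exp, gen = normalize(expected), normalize(generated)
    if not exp or not gen:
        return float(exp == gen)
    overlap = sum((Counter(exp) & Counter(gen)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(exp)
    return 2 * precision * recall / (precision + recall)


def score_results(results):
    """Aggregate exact match and token F1 over a list of result dicts."""
    n = len(results)
    exact = sum(normalize(r["expected"]) == normalize(r["generated"]) for r in results)
    f1 = sum(token_f1(r["expected"], r["generated"]) for r in results)
    return {"exact_match": exact / n, "token_f1": f1 / n}


# Made-up results in the shape returned by evaluate_instruction_model
results = [
    {"instruction": "Translate the following text to Spanish.",
     "expected": "El clima es hermoso hoy.",
     "generated": "El clima es hermoso hoy."},
    {"instruction": "List four letters.",
     "expected": "a b c d",
     "generated": "a b x y"},
]
scores = score_results(results)
print(scores)
```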

Summary

Instruction tuning transforms base language models into helpful assistants that follow human directions. It's a critical step in creating usable AI systems.

Next: We'll explore RLHF alignment techniques.

⭐

Premium Content

Instruction Tuning

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Generative AI Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement