In-Context Learning
What is In-Context Learning?
In-context learning (ICL) is the ability of large language models to learn new tasks from examples provided in the prompt, without updating model weights. This emergent capability allows models to adapt to new tasks at inference time.
How ICL Works
The Mechanism
# In-context learning example
icl_prompt = """Translate English to French:
English: Hello -> French: Bonjour
English: Goodbye -> French: Au revoir
English: Thank you -> French:"""
# The model infers the translation task from examples
# and generates: "Merci"
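A prompt like the one above can also be assembled programmatically, which keeps the format identical across demonstrations. The following is a minimal sketch: `build_few_shot_prompt` and its parameters are hypothetical names chosen for illustration, not part of any library.

```python
def build_few_shot_prompt(instruction, demonstrations, query,
                          source="English", target="French"):
    """Format (source_text, target_text) pairs as a few-shot prompt.

    The layout mirrors the translation example: one demonstration per line,
    then the query with the target side left empty for the model to complete.
    """
    lines = [instruction]
    for src, tgt in demonstrations:
        lines.append(f"{source}: {src} -> {target}: {tgt}")
    lines.append(f"{source}: {query} -> {target}:")
    return "\n".join(lines)


demos = [("Hello", "Bonjour"), ("Goodbye", "Au revoir")]
prompt = build_few_shot_prompt("Translate English to French:", demos, "Thank you")
print(prompt)
```

Because every demonstration passes through the same template, the model sees a consistent pattern to continue.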
Demonstration Selection
Implementing ICL
import numpy as np
import torch
from sklearn.metrics.pairwise import cosine_similarity


class InContextLearner:
    """Selects similar demonstrations and prompts a causal language model."""

    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def select_demonstrations(self, query, example_pool, k=5):
        """Select the k examples whose inputs are most similar to the query."""
        query_embedding = self.encode(query)
        # Pool entries are dicts with "input" and "output" keys (see
        # build_prompt), so only the input text is embedded.
        example_embeddings = np.vstack(
            [self.encode(ex["input"]) for ex in example_pool]
        )
        similarities = cosine_similarity(query_embedding, example_embeddings)[0]
        # Indices of the k highest similarities, most similar first.
        top_k_indices = np.argsort(similarities)[-k:][::-1]
        return [example_pool[i] for i in top_k_indices]

    def encode(self, text):
        """Embed text as a (1, hidden_size) array from the last hidden layer."""
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            outputs = self.model(**inputs, output_hidden_states=True)
        # Mean-pool over tokens. In a causal model the first position sees only
        # the first token, so it cannot summarize the whole text.
        embedding = outputs.hidden_states[-1].mean(dim=1).numpy()
        return embedding

    def predict(self, query, example_pool, k=5):
        demonstrations = self.select_demonstrations(query, example_pool, k)
        prompt = self.build_prompt(demonstrations, query)
        inputs = self.tokenizer(prompt, return_tensors="pt")
        outputs = self.model.generate(**inputs, max_new_tokens=50)
        # For decoder-only models, generate() returns the prompt followed by
        # the continuation, so decode only the newly generated tokens.
        new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

    def build_prompt(self, demonstrations, query):
        prompt = ""
        for demo in demonstrations:
            prompt += f"Input: {demo['input']}\nOutput: {demo['output']}\n\n"
        prompt += f"Input: {query}\nOutput:"
        return prompt
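The selection step can be tried without loading a model. The sketch below swaps the hidden-state embedding for a plain bag-of-words vector, which is a much cruder similarity signal and is used here only so the example runs on the standard library. The function names and the small sentiment pool are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_demonstrations(query, example_pool, k=2):
    """Return the k pool entries whose "input" text is most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(example_pool,
                    key=lambda ex: cosine(q, bag_of_words(ex["input"])),
                    reverse=True)
    return ranked[:k]


# Hypothetical example pool for a sentiment task.
pool = [
    {"input": "The movie was fantastic", "output": "positive"},
    {"input": "The soup was cold and bland", "output": "negative"},
    {"input": "I loved the movie soundtrack", "output": "positive"},
    {"input": "My flight was delayed again", "output": "negative"},
]
selected = select_demonstrations("The movie was boring", pool, k=2)
print([ex["input"] for ex in selected])
# ['The movie was fantastic', 'I loved the movie soundtrack']
```

Both selected examples mention the movie, but both carry a positive label while the query is negative. Lexical similarity alone does not guarantee label diversity, which is one reason learned embeddings or label-balanced selection are often preferred.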
Factors Affecting ICL
| Factor | Impact | Best Practice |
|---|---|---|
| Number of examples | More = better, with diminishing returns | Start with 4-8 examples and tune per task |
| Example order | Affects performance significantly | Place the most similar examples closest to the query |
| Example quality | Noisy labels hurt performance | Use verified, correct examples |
| Label format | Consistent format helps | Use clear, consistent labels |
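The first two rows of the table can be combined into one small helper: cap the number of demonstrations, then order them so the most similar one sits directly before the query. This is a sketch under the assumption that similarity scores are already available; `order_demonstrations` and `max_examples` are hypothetical names.

```python
def order_demonstrations(scored_demos, max_examples=8):
    """Keep the highest-scoring demonstrations and place the best one last.

    scored_demos: list of (similarity, demo) pairs.
    Returns demos ordered from least to most similar, so the most similar
    demonstration ends up nearest the query when the prompt is built.
    """
    top = sorted(scored_demos, key=lambda pair: pair[0], reverse=True)[:max_examples]
    return [demo for _, demo in reversed(top)]


scored = [(0.2, "A"), (0.9, "B"), (0.5, "C")]
print(order_demonstrations(scored, max_examples=2))  # ['C', 'B']
```

Whether this ordering helps depends on the model and task, so it is worth comparing against other orderings on a held-out set.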
ICL vs Fine-tuning
| Aspect | ICL | Fine-tuning |
|---|---|---|
| Setup | Examples in prompt | Weight updates |
| Data needed | 2-10 examples | 100+ examples |
| Inference cost | Higher (longer prompts) | Lower |
| Adaptation speed | Instant | Minutes to hours |
| Performance ceiling | Good, limited by what fits in the prompt | Typically higher given enough task data |
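The inference-cost row comes down to simple arithmetic: ICL resends the demonstrations with every query, so the extra prompt tokens grow with the number of queries. The sketch below uses made-up workload numbers and ignores optimizations such as caching a shared prompt prefix, which some serving setups provide.

```python
def icl_prompt_overhead(num_demos, tokens_per_demo, num_queries):
    """Extra prompt tokens processed because demonstrations are resent per query."""
    return num_demos * tokens_per_demo * num_queries


# Hypothetical workload: 8 demonstrations of about 40 tokens each, 10,000 queries.
extra_tokens = icl_prompt_overhead(num_demos=8, tokens_per_demo=40, num_queries=10_000)
print(extra_tokens)  # 3200000
```

A fine-tuned model avoids this per-query overhead but pays a one-time training cost instead, so the break-even point depends on query volume.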
Advanced ICL Techniques
# Self-ask: the model generates and answers its own follow-up questions
self_ask_prompt = """Question: Is the Great Wall of China visible from space?
Follow-up question: How wide is the Great Wall of China?
Intermediate answer: It is only several meters wide along most of its length.
Follow-up question: Can the unaided eye pick out an object that narrow from low Earth orbit?
Intermediate answer: Generally no. From hundreds of kilometers up, a structure that narrow is very hard to distinguish without magnification.
So, the final answer is: Generally no, not with the unaided eye."""
# Least-to-most: Decompose complex problems
least_to_most_prompt = """Problem: Calculate the total floor area of a room that is 12 feet by 15 feet plus its adjoining 3 foot by 4 foot closet.
Step 1: What are the dimensions of the main room?
Answer: 12 feet by 15 feet
Step 2: What is the area of the main room?
Answer: 12 x 15 = 180 square feet
Step 3: What are the dimensions of the closet?
Answer: 3 feet by 4 feet
Step 4: What is the area of the closet?
Answer: 3 x 4 = 12 square feet
Step 5: What is the total area?
Answer:"""
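A least-to-most prompt of this shape can be generated from a list of already-solved sub-questions, leaving the final step open for the model. The sketch below assumes the closet's floor area is added to the main room's, which gives 192 square feet. `build_least_to_most_prompt` is a hypothetical helper name.

```python
def build_least_to_most_prompt(problem, solved_steps, next_question):
    """Chain solved sub-questions, then pose the next one with an empty answer."""
    lines = [f"Problem: {problem}"]
    for i, (question, answer) in enumerate(solved_steps, start=1):
        lines.append(f"Step {i}: {question}")
        lines.append(f"Answer: {answer}")
    lines.append(f"Step {len(solved_steps) + 1}: {next_question}")
    lines.append("Answer:")
    return "\n".join(lines)


room_area = 12 * 15    # 180 square feet
closet_area = 3 * 4    # 12 square feet
steps = [
    ("What is the area of the main room?", f"12 x 15 = {room_area} square feet"),
    ("What is the area of the closet?", f"3 x 4 = {closet_area} square feet"),
]
prompt = build_least_to_most_prompt(
    "Find the total floor area of a 12 ft by 15 ft room plus its 3 ft by 4 ft closet.",
    steps,
    "What is the total area?",
)
print(prompt)
print(room_area + closet_area)  # 192
```

Each solved step is appended to the context before the next question is asked, so the model can reuse earlier answers when completing the final one.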
Summary
In-context learning enables rapid task adaptation without retraining. Understanding demonstration selection and prompt design is crucial for effective ICL.
Next: We'll explore retrieval-augmented generation.