Fine-tuning

A practical guide to understanding what fine-tuning is and when it makes sense for AI product development.

What fine-tuning does to an AI model, when it is worth doing, and what product and design teams need to know before commissioning it.

22 May 2026 · 5 min read

What it is

Fine-tuning is the process of taking a pre-trained model and training it further on a smaller, specific dataset to improve its performance on a particular task or domain.

Rather than training a model from scratch, which requires enormous resources, fine-tuning adjusts the existing model's parameters to better suit a specific context.

A general-purpose model might be fine-tuned to write in a particular brand's tone, handle specific customer support queries, or generate content in a specialist domain like legal or medical writing.

The result is a model that performs better on the target task than a general model prompted in the usual way.

Fine-tuning is not always necessary. Many tasks can be handled effectively through well-designed prompts without touching the model itself.

When to use it

Understand when fine-tuning is worth the investment.

It is most relevant when:

A general model consistently underperforms on a specific task despite good prompting
You have a large volume of high-quality, labelled training examples
Consistent tone, style, or domain accuracy is critical
Latency or cost savings from shorter prompts would be valuable at scale
The task is well-defined and unlikely to change frequently

It is less relevant when:

Good prompt engineering can achieve the required results
Training data is limited, low quality, or difficult to label
The task or requirements change frequently
Budget and time for training and evaluation are constrained
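
To make the point about cost savings from shorter prompts concrete, here is a rough back-of-the-envelope sketch. Every figure in it is hypothetical and chosen only for illustration, and it assumes the fine-tuned model is billed at the same per-token price as the general one, which may not hold for a given provider.

```python
def monthly_prompt_cost(tokens_per_request: int,
                        requests_per_month: int,
                        price_per_million_tokens: float) -> float:
    """Input-token cost for one month, in the same currency as the price."""
    total_tokens = tokens_per_request * requests_per_month
    return total_tokens * price_per_million_tokens / 1_000_000


# Hypothetical figures, for illustration only.
LONG_PROMPT_TOKENS = 1_500    # general model: instructions and examples in every prompt
SHORT_PROMPT_TOKENS = 300     # fine-tuned model: mostly just the user's input
REQUESTS_PER_MONTH = 2_000_000
PRICE_PER_MILLION = 1.0       # assumed price per million input tokens

long_cost = monthly_prompt_cost(LONG_PROMPT_TOKENS, REQUESTS_PER_MONTH, PRICE_PER_MILLION)
short_cost = monthly_prompt_cost(SHORT_PROMPT_TOKENS, REQUESTS_PER_MONTH, PRICE_PER_MILLION)
saving = long_cost - short_cost

print(f"Long prompts:  {long_cost:,.2f}")
print(f"Short prompts: {short_cost:,.2f}")
print(f"Saving:        {saving:,.2f}")
```

A saving like this only matters if it outweighs the cost of preparing data, training, and evaluating the model, so it is worth running the numbers with your own volumes before deciding.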

Key takeaway

Fine-tuning is a serious investment. Before commissioning it, be confident that prompt engineering alone cannot solve the problem.

How it works

Understand the basic mechanism. Fine-tuning works by continuing the model's training on a curated dataset relevant to the target task.

The model is exposed to many examples of the input-output pairs you want it to learn: for example, customer queries paired with ideal responses, or product descriptions paired with desired rewrites.
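
As a rough illustration of what input-output pairs can look like in practice, the sketch below writes two made-up customer support examples as JSON Lines (one JSON object per line), a layout commonly used for training data. The `input` and `output` field names are an assumption here; each training service defines its own schema.

```python
import json

# Hypothetical examples. The "input"/"output" field names are an assumption;
# real training services each define their own schema.
examples = [
    {
        "input": "Where is my order?",
        "output": "I can help with that. Could you share your order number?",
    },
    {
        "input": "How do I reset my password?",
        "output": "Select 'Forgot password' on the sign-in page and follow the emailed link.",
    },
]


def to_jsonl(records: list[dict]) -> str:
    """Serialise records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


jsonl_text = to_jsonl(examples)
print(jsonl_text)
```

Seeing the data in this form makes it easier for a product team to review examples one by one and agree on what an ideal response looks like.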

Through this process, the model's parameters are adjusted to make it more likely to produce outputs that match the examples it has been shown.
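
The idea of adjusting existing parameters can be shown with a deliberately tiny toy. The sketch below is not how a language model is trained in practice; it assumes a "model" with a single parameter and a made-up target task, and simply continues gradient-descent training from a pre-existing value rather than starting from scratch.

```python
def mean_squared_error(weight: float, pairs: list[tuple[float, float]]) -> float:
    """Average squared gap between the model's output (weight * x) and the target."""
    return sum((weight * x - target) ** 2 for x, target in pairs) / len(pairs)


def fine_tune(weight: float, pairs: list[tuple[float, float]],
              learning_rate: float = 0.05, epochs: int = 50) -> float:
    """Continue training a one-parameter model y = weight * x on new pairs."""
    for _ in range(epochs):
        for x, target in pairs:
            error = weight * x - target
            # The gradient of the squared error with respect to weight is 2 * error * x.
            weight -= learning_rate * 2 * error * x
    return weight


pretrained_weight = 1.0                             # stands in for a model trained elsewhere
task_pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # toy target task: y = 2x

tuned_weight = fine_tune(pretrained_weight, task_pairs)

print("Error before:", mean_squared_error(pretrained_weight, task_pairs))
print("Error after: ", mean_squared_error(tuned_weight, task_pairs))
```

The starting value is nudged, example by example, towards whatever the training pairs reward, which is why the quality of those pairs matters so much.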

The fine-tuned model retains its general capabilities from the original training while becoming more reliable on the specific tasks it has been fine-tuned for.

What this means for designers and product teams. Fine-tuning decisions start with data. The quality, quantity, and representativeness of the training examples directly determine the quality of the fine-tuned model.

As a designer or product person, your most important contribution is defining what good output looks like and helping to curate or evaluate the training data.

Fine-tuned models also need to be evaluated properly. Just because a model has been fine-tuned does not mean it is reliable. Testing on unseen examples is essential.
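
One common way to test on unseen examples is to set a share of the data aside before training and score the model only on that held-out share. The sketch below assumes examples are (input, expected output) pairs and uses exact-match accuracy, which is a crude measure for generated text; the function names and the 20% split are illustrative choices, not a standard.

```python
import random


def split_examples(examples: list, held_out_fraction: float = 0.2, seed: int = 0):
    """Shuffle a copy of the examples and return (training, held_out)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    held_out_count = round(len(shuffled) * held_out_fraction)
    cut = len(shuffled) - held_out_count
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, examples) -> float:
    """Share of examples where predict(input) exactly matches the expected output."""
    if not examples:
        return 0.0
    correct = sum(1 for model_input, expected in examples if predict(model_input) == expected)
    return correct / len(examples)


# Usage with placeholder data: the held-out items must never be used for training.
training, held_out = split_examples(list(range(10)))
print(len(training), "training examples,", len(held_out), "held out")
```

For open-ended outputs such as tone or style, the scoring step is usually a human review against agreed criteria rather than an exact match.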

What to look for

Focus on:

Training data quality — whether examples are accurate, consistent, and representative
Task definition clarity — whether the target behaviour is clearly specified
Evaluation rigour — whether the model is tested against real-world inputs after fine-tuning
Overfitting risk — whether the model has become too narrowly optimised and fails on variation
Ongoing maintenance — whether the training data and model will be kept current
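
A simple signal for the overfitting risk listed above is the gap between how the model scores on its training examples and how it scores on examples it has never seen. The sketch below is a minimal check under that assumption; the 0.10 threshold is an arbitrary illustrative value, not an established rule.

```python
def overfitting_gap(training_score: float, held_out_score: float) -> float:
    """How much better the model scores on data it has already seen."""
    return training_score - held_out_score


def looks_overfitted(training_score: float, held_out_score: float,
                     max_gap: float = 0.10) -> bool:
    """Flag a model whose training score exceeds its held-out score by more than max_gap."""
    return overfitting_gap(training_score, held_out_score) > max_gap


# Hypothetical scores between 0 and 1, for illustration only.
print(looks_overfitted(training_score=0.98, held_out_score=0.70))
print(looks_overfitted(training_score=0.90, held_out_score=0.88))
```

A large gap suggests the model has memorised its examples and is worth testing on more varied, real-world inputs before release.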

Where it goes wrong

A fine-tuned model trained on poor data will consistently produce poor outputs, at scale.

Most issues come from:

Insufficient or low-quality training data
Poorly defined success criteria for what good output looks like
No evaluation process after fine-tuning is complete
Overfitting to the training examples and failing on real variation
Treating fine-tuning as a one-time task rather than an ongoing process
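
Some data problems can be caught with basic checks before any training starts. The sketch below assumes examples are (input, output) text pairs and reports three things: empty outputs, exact duplicates, and inputs that appear with more than one different output, which often points to inconsistent labelling. The function name and report keys are hypothetical.

```python
from collections import defaultdict


def audit_examples(examples: list[tuple[str, str]]) -> dict:
    """Report empty outputs, exact duplicates, and inputs with conflicting outputs."""
    outputs_by_input = defaultdict(set)
    seen = set()
    empty_outputs = 0
    duplicates = 0
    for model_input, output in examples:
        if not output.strip():
            empty_outputs += 1
        if (model_input, output) in seen:
            duplicates += 1
        seen.add((model_input, output))
        outputs_by_input[model_input].add(output)
    conflicting = sorted(
        model_input for model_input, outputs in outputs_by_input.items() if len(outputs) > 1
    )
    return {
        "empty_outputs": empty_outputs,
        "duplicates": duplicates,
        "conflicting_inputs": conflicting,
    }


# Made-up examples, for illustration only.
report = audit_examples([
    ("Where is my order?", "Could you share your order number?"),
    ("Where is my order?", "Could you share your order number?"),
    ("Where is my order?", "Please check your email."),
    ("Cancel my plan", ""),
])
print(report)
```

Checks like these do not replace human review of whether the examples are actually good, but they remove obvious noise before reviewers spend time on the data.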

What you get from it

Understanding fine-tuning gives you:

A clearer basis for deciding when fine-tuning is and is not the right approach
Better ability to contribute to the data curation and evaluation process
More realistic expectations about what fine-tuning can and cannot achieve
A stronger brief for working with engineers and ML teams

Key takeaway

Fine-tuning is powerful when the task is clear and the data is strong. It is expensive and slow when either of those conditions is not met.

FAQ

Common questions

A few practical answers to the questions that usually come up around this method.

What is fine-tuning in simple terms?

It is the process of taking an existing model and training it further on a specific set of examples to make it better at a particular task. Rather than building a new model from scratch, you are adjusting an existing one to suit your needs.

How is fine-tuning different from prompting?

Prompting guides an existing model through instructions at the point of use. Fine-tuning changes the model itself by training it on new examples. Prompting is faster and cheaper to iterate on; fine-tuning is more powerful but requires significant time, data, and cost.

How much data do you need to fine-tune a model?

It depends on the task and the model, but quality matters more than quantity. A few hundred high-quality, well-labelled examples can be enough for some tasks. For complex or varied tasks, you may need thousands. Poorly labelled data in large volumes will produce worse results than a smaller, well-curated set.

Is fine-tuning permanent?

A fine-tuned model reflects the data it was trained on at a point in time. If the task, content, or requirements change, the model will need to be re-evaluated and potentially retrained. It is not a one-time fix.

Do designers need to be involved in fine-tuning?

Yes, in the sense that defining what good output looks like and evaluating whether the fine-tuned model meets user needs are design and product decisions. The technical training is for engineers and ML teams, but the quality criteria, the use cases, and the evaluation of real-world outputs are squarely in the product and design domain.

Quick take

If a general AI model is not performing well enough for your specific use case, fine-tuning adapts it to your needs — but it is not always the right answer.
