
Why Do You Need to Fine-tune Your Conversational LLM with 100s (If Not 1,000s) of Examples?

For Consistent LLM Answers, Fine-tune with Examples. LOTS of Examples

LLMs tend to be very creative, introducing diversity and variation into their answers.

That’s good for certain types of questions like:

  • What can you tell me about La Cibeles?
  • What gothic buildings should I visit in Madrid?

Questions that do not have a single obvious answer, and that knowledgeable people might answer very differently, are a great fit for a search-based approach like RAG.

For some other questions, the right answer is consistent and precise. In these cases, creativity can be flat-out wrong. Some good examples of these types of questions are:

  • What time does the Metropolitan Museum open?
  • Do you need tickets to visit The Cathedral? Can I buy the tickets online?
  • Who is the architect of Reina Sofia Museum? Does it have paintings by Picasso?
  • Is there underground service from Atocha to Barajas airport?

For these questions, excessive creativity can cause significant problems; a creative answer is far less likely to be the correct one. In a real-life application, getting these questions wrong seriously undermines user confidence.

Does the Museum open at 9am or at 10am? Variability in this answer is risky.

A single answer that is consistent and precise is required.
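As a quick illustration of how this risk might be measured, the sketch below samples the same factual question repeatedly and counts the distinct answers. Note that `generate` is a hypothetical stand-in for whatever LLM client the application uses, not part of any specific library.

```python
from collections import Counter

def generate(prompt: str) -> str:
    """Hypothetical stand-in: replace with a call to your LLM client/endpoint."""
    raise NotImplementedError

# Ask the same factual question many times; a consistent model should
# return one dominant answer, while a spread of answers signals risk.
question = "What time does the Metropolitan Museum open?"
answers = Counter(generate(question) for _ in range(20))

for answer, count in answers.most_common():
    print(f"{count:2d}x  {answer}")
```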

To achieve this consistency in an LLM-based application, like a chatbot, a training dataset with hundreds of variations of these types of questions can help. The dataset should contain (one possible record format is sketched after the list):

  • Variations of the factual questions, like:
      ◦ What time does the Metropolitan Museum open?
      ◦ What’s the schedule for the Metropolitan Museum?
      ◦ Is the Metropolitan Museum open on Mondays?
  • Example answers to be fed to the LLM
  • Optionally, some tagging about the linguistic rationale behind each variant: colloquial vs. formal language, etc.
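To make that structure concrete, here is a minimal sketch of what such records could look like as JSONL, a common fine-tuning input format. The field names (`question`, `answer`, `tags`) and the opening time are illustrative assumptions, not the actual Bitext schema.

```python
import json

# Illustrative records: several phrasings of one factual question, a single
# canonical answer, and an optional tag describing each variant's style.
records = [
    {"question": "What time does the Metropolitan Museum open?",
     "answer": "The Metropolitan Museum opens at 10am.",
     "tags": ["neutral"]},
    {"question": "What's the schedule for the Metropolitan Museum?",
     "answer": "The Metropolitan Museum opens at 10am.",
     "tags": ["colloquial"]},
    {"question": "Is the Metropolitan Museum open on Mondays?",
     "answer": "Yes, the Metropolitan Museum opens at 10am on Mondays.",
     "tags": ["yes/no phrasing"]},
]

# One JSON object per line (JSONL).
with open("faq_finetune.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```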

How many variants of each question are required to safely fine-tune the LLM and be confident that the question will be properly understood and answered? Our experimental trial, which you can see here, suggests that the number is a little under 1,000.

Bitext provides an example of this type of dataset for Customer Support, with 3M tokens and 27,000 question-answer pairs, which you can find here.

The dataset is freely available, including for commercial use. It can be used in real-life applications to check how effective additional training data is at preventing both hallucinations and excessively creative answers to factual questions.
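For example, the dataset can be loaded and inspected with the Hugging Face `datasets` library. The dataset identifier below is an assumption based on Bitext's public releases; verify the exact name on the Hugging Face Hub.

```python
from datasets import load_dataset

# Dataset identifier is an assumption -- verify the exact name on the
# Hugging Face Hub before running.
ds = load_dataset(
    "bitext/Bitext-customer-support-llm-chatbot-training-dataset",
    split="train",
)

print(ds.num_rows)   # should be on the order of 27,000 question-answer pairs
print(ds[0])         # inspect a single record before fine-tuning
```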
