
Improving Rasa’s results with artificial training data. Part I

The well-known Rasa chatbot-building platform is gaining traction day after day. But on any platform, chatbots are only as good as their training material.

Rasa, like other chatbot platforms, still relies on manually written, selected, and tagged query datasets. This is a time-consuming and error-prone process that is hard to scale or adapt.

As anyone with bot-training experience knows, it can take months to gather enough content to successfully train a conversational bot.

Linguistics-based Natural Language Generation (NLG) is Bitext’s solution to that problem. Bitext’s NLG solution takes a seed query as input, such as “what’s your return policy?”, and automatically produces query variants like “information about your return policy”, “tell me about your return policy”, “I want to know about your return policy”, and so on.

This provides a rich and consistent training dataset that is easy to integrate and free of manual errors. It will dramatically improve the NLU performance of your bot.

What are the advantages of this process? The Bitext NLP framework takes your training set, extracts each sentence’s intents and slots, and generates hundreds of variants per sentence that keep the same meaning but are expressed differently.

All these sentences are returned correctly tagged with intents and slots, in the same format your bot requires (the Rasa format).
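To make that concrete, here is a minimal Python sketch of what a seed query plus a few generated variants look like once packaged as Rasa NLU training data. The intent name ask_return_policy, the variant list and the output file name are illustrative placeholders (this is not Bitext’s actual API), and the layout follows the Rasa 2.x/3.x YAML schema for NLU data.

```python
# Minimal sketch: turning a seed query and its generated variants into
# Rasa NLU training data (YAML, Rasa 2.x/3.x style). The intent name
# "ask_return_policy" and the variant list are illustrative placeholders;
# in practice the variants would come from Bitext's NLG service.
import yaml

seed = "what's your return policy?"
variants = [
    "information about your return policy",
    "tell me about your return policy",
    "I want to know about your return policy",
]

nlu_data = {
    "version": "3.1",
    "nlu": [
        {
            "intent": "ask_return_policy",
            # Rasa expects the examples as a multi-line string of "- text" items.
            "examples": "\n".join(f"- {text}" for text in [seed] + variants) + "\n",
        }
    ],
}

# Write the training data to a file that can be dropped into a Rasa project.
with open("nlu_generated.yml", "w", encoding="utf-8") as f:
    yaml.safe_dump(nlu_data, f, sort_keys=False, allow_unicode=True)
```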

If you build bots, you already trust process automation, so why not automate the AI training phase as well?

We have tested how Rasa can benefit from this approach, comparing a chatbot trained on a set of hand-tagged sentences with a second one trained on thousands of sentences generated, with no manual work, via Bitext’s NLG.

Our tests show an improvement of at least 30% in Rasa’s results when we add NLG variants to the bot’s training dataset.

Do you want to reproduce our test? You can ask us for both training sets and see how a Rasa training corpus can be vastly improved via Bitext’s NLG.
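If you want to run that comparison yourself, the data-augmentation step can be as simple as concatenating the two training files before training the second model. The sketch below assumes two hypothetical files, nlu_manual.yml with the hand-tagged examples and nlu_generated.yml with the NLG variants; the baseline and the augmented model can then be trained and evaluated with Rasa’s standard rasa train nlu and rasa test nlu commands.

```python
# Minimal sketch, assuming two Rasa NLU YAML files: "nlu_manual.yml" with the
# original hand-tagged examples and "nlu_generated.yml" with the Bitext-generated
# variants (both file names are illustrative). The script merges them into a
# single training file for the augmented model.
import yaml

def load_nlu(path):
    # Read a Rasa NLU training file and return its list of intent blocks.
    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f).get("nlu", [])

# Rasa combines examples for the same intent across blocks, so a plain
# concatenation is usually enough for this kind of test.
merged = {
    "version": "3.1",
    "nlu": load_nlu("nlu_manual.yml") + load_nlu("nlu_generated.yml"),
}

with open("nlu_augmented.yml", "w", encoding="utf-8") as f:
    yaml.safe_dump(merged, f, sort_keys=False, allow_unicode=True)
```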

 
