Technological Alchemy: Using GPT in CX
According to Blake Morgan, a CX expert, ChatGPT has major flaws that prevent it from becoming a useful tool in industries like Customer Experience. In, Cons Of ChatGPT For Customer Experience, an article recently published in Forbes, Morgan wrote:
“One of the lauded benefits of ChatGPT is that it offers a new response every time. But that creates an inconsistent customer service experience. Companies can never count on the bot to provide a specific answer, making monitoring customer needs and requests challenging. One of chatbots’ most common use cases is repetitive questions, such as order status or account information. These questions require a set answer, which ChatGPT can’t provide consistently.”
We are writing this post because we have found a solution to the problems Blake raises. Blake points out three major flaws in ChatGPT, #3 being the real challenge in our opinion:
- The information isn’t always accurate
- It doesn’t have any special training
- It provides different answers every time
We decided to run an experiment with the goal of making GPT provide consistent answers. As a first step we generated synthetic text, supervised by human experts, for 3 different purposes:
- Questions: a set of representative questions on the CX domain, like “I need to return my order”
- Answers: a set of accurate and structured answers for those questions
- Variants: around 1,000 variants for each question and answer, to guarantee a comprehensive training
We then integrated the question-answer pairs in a training dataset, to which we added linguistic tagging. This tagging documents the linguistic reasons behind every variant. “Colloquial”, for example, would be the tag for sentences like “do u wanna send my order asap”.
With this dataset we then fine-tuned an instance of GPT 3.5 and, finally, we evaluated the answers from the two instances:
- Playground Instance of GPT 3.5
- Bitext-Trained Instance of GPT 3.5
For question: “I need help to file a consumer complaint”
As Blake anticipated, the Playground Instance of GPT 3.5 provides two totally different answers with high degrees of content variability (although both make sense):
Answer 1: “Please note that I am an AI language model and cannot provide legal advice. The information provided above is for general guidance purposes only. It is always recommended to consult with a legal professional or consumer protection agency for specific advice related to your situation.”
Answer 2: “I can definitely help you with that! To file a consumer reclamation, you can follow these steps: 1. Gather all relevant information…”
The instance of GPT 3.5 trained with the Bitext Dataset to be a Customer support agent, however, provides multiple accurate responses that match the training and, while they have some wording differences, contain the same general content:
Answer 1: “I’m sorry to hear that you’re in a situation where you need to file a consumer reclamation. Your feedback is incredibly valuable to us, and I want to assure you that we take your concerns seriously. To assist you with this process, I recommend reaching out to our dedicated customer support team. They are well-equipped to guide you through the necessary steps and ensure that your reclamation is properly addressed. Please provide me with your contact information, and I will personally ensure that you receive the assistance you need.”
Answer 2: “I’m sorry to hear that you’re facing an issue and need assistance in filing a consumer reclamation. Your feedback is incredibly valuable to us, and we want to ensure that your concerns are addressed promptly and effectively. To begin the process, could you please provide me with the specific details of your reclamation? This will help us better understand the situation and take the necessary steps to resolve it. Rest assured, we are committed to resolving your issue and improving our services based on your feedback. Thank you for bringing this to our attention, and we appreciate your patience as we work towards a resolution.”
AWS blog reports similar experiments and results, fine tuning GPT-J 6B with SEC Filing Data
Domain-adaptation Fine-tuning of Foundation Models in Amazon SageMaker JumpStart on Financial Data
As a next step, we plan to test if this strategy of fine-tuning using synthetic text works also for non-trained questions, i.e. for questions which are not present in the training dataset. The strategy could provide relevant findings for question answering systems beyond customer service like Siri, Alexa or Google virtual assistants.