Demystifying the development of chatbots - Part 1

Tue, 02 Oct 2018 | Paulo Azevedo

Chatbots, those automated systems you can speak or text with interactively, are very much in vogue right now. The idea of having computers talk to us in our own language and serve us is not new: back in the 1960s there was a chatbot called ELIZA, which was good enough to fool many people into thinking they were conversing with another person.

They can be genuinely useful, far beyond gimmicks such as turning off the lights from bed without having to stand up. They allow drivers to ask their phones or cars for directions while keeping their hands on the wheel, and let amateur cooks set timers or ask for the boiling point of butter while their hands are too greasy to touch a smartphone screen. And, of course, they serve as another channel for consumers to reach companies.

This last use case is particularly valuable, as it allows a brand to project itself in an anthropomorphic way, with a personality, a tone of voice and even jokes. And those bots are always available, even when some channels, like shops, may not be. They can truly help customers in need, and that is why I believe chatbots will continue the growth trend observed over the last couple of years.

In this series of blog posts, I’ll write a bit about chatbot development, giving a glimpse of how the process went for us at FlixBus and showing that it is not as hard as many people think. This first post covers the less technical aspects, such as the history of chatbots at FlixBus and a few considerations about branding, the bot’s personality and usability. The following posts will be more technical, covering architecture, the tools used and the programming itself.

Chatbots at FlixBus

Our first foray into the chatbot world was the launch of our Alexa bot in September 2017. It launched in German, with a custom-made webhook that allowed customers to search for bus rides and receive a link to complete the booking of a ride they were interested in.

While customer usage of this first bot was modest, the milestone was very significant to us. FlixBus was very early to launch mobile apps, even without booking capabilities, because it really matters to us to be present wherever customers want to reach us. Accumulating experience early on these platforms is therefore necessary, as we believe chatbots are bound for big growth in the next couple of years. There is a risk that this growth never happens, but the chance of it happening in our absence would pose an even greater risk.

Next up, we launched a Google Assistant bot in April 2018. This time, it allowed customers to interact with it in German, French and English. It was also much more capable, allowing customers to book their bus rides directly by voice and pay with the credit card associated with their Google account. This meant not even having to bother with credit card numbers, saving users a lot of time. As a matter of fact, this trilingual (DE, EN, FR) bot was mentioned by Google on stage at Google I/O 2018 as a best-practices example of serving speakers of different languages equally well.

Google I/O 2018

FlixBus is a truly international company. Not only do we operate in dozens of countries, but our colleagues and our customers come from all corners of the world. It is genuinely important to us to serve people well across language barriers, as much as we can, and we see chatbots as a way to achieve this vision. Right now we have bold plans to move forward.

With its many ways of integrating with other platforms, Google’s Dialogflow is a great tool for building chatbots, be they voice, text or even multimodal. I will talk more about it in my next post, but for now it suffices to say that it will empower us to take even bolder steps. One of our next steps will be launching a voice bot on another platform, which will be especially useful for serving our customers well, allowing us to be there for them quickly, at all times. We will also bring the functionality of our Alexa bot up to par with that of our Google Assistant bot.

Considerations about branding, the bot’s personality and usability

Before getting down to coding your bot, there are at least four things you need to pay attention to. If you only test and correct them after you’ve coded your bot, they will be much harder to fix. The usability of your chatbot will be much improved from the start if you take the steps outlined below, where they apply to your customer base or product.

Test your interactions with a colleague before coding them

When it comes to usability, the first piece of advice I can give you is to test the interaction with a colleague. Start by writing a script of how your main use case goes when everything works as expected. For us, it is the customer looking for a ride between two cities between which we offer connections. Now test it with a colleague who doesn’t know the script, only the task they want to accomplish. You, however, must stick to the script, word for word. Note down any difficulties your colleague runs into and adapt your script. Test and adapt it again until it’s well-rounded. At this point, you may want to test with another person, just to make sure the first tester didn’t simply get used to it through repetition.

Now write scripts for common failures. In our case, that could be no ride being available between the two cities, or a pair of stations not being served at all while alternative rides exist between the same cities from other stations, and so on. Have your colleagues try it again, but this time decide beforehand which error state you want to test, and see whether the user can still be served satisfactorily. This may well take some iterating for each error you came up with.

Finally, plan how you are going to respond to errors that you haven’t anticipated or can’t help the user work around. That is, for instance, when your backend crashes due to a bug you didn’t yet know about, or when an API your backend depends on changes without warning. Write down the response your bot should give when such things happen and refine it after gathering feedback. A generic but well-crafted error message will always be better than having Google Assistant’s default voice say something like “FlixBus is not responding right now”.
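As an illustration, here is a minimal sketch of what such a catch-all response could look like in a Node.js webhook behind Dialogflow. The request and response fields follow Dialogflow’s standard webhook format, but the intent name and the searchRides/formatRides helpers are hypothetical stubs, not our actual implementation.

```typescript
// Minimal sketch of a Dialogflow fulfillment webhook with a catch-all error
// response. Intent name and the searchRides/formatRides helpers are
// hypothetical stubs, not FlixBus's actual implementation.
import express from "express";

const app = express();
app.use(express.json());

// Hypothetical backend call and formatter, stubbed for illustration.
async function searchRides(from: string, to: string): Promise<string[]> {
  return [`${from} to ${to}, departing at 7:00, $2.99`];
}

function formatRides(rides: string[]): string {
  return `I found ${rides.length} connection(s). ${rides[0]}. Would you like to book it?`;
}

app.post("/webhook", async (req, res) => {
  try {
    const { intent, parameters } = req.body.queryResult;
    if (intent.displayName === "search_ride") {
      const rides = await searchRides(parameters.from, parameters.to);
      return res.json({ fulfillmentText: formatRides(rides) });
    }
    return res.json({ fulfillmentText: "Sorry, I can't help with that yet." });
  } catch (err) {
    // A crafted message beats the assistant's default "X is not responding right now".
    console.error(err);
    return res.json({
      fulfillmentText:
        "Sorry, something went wrong on our side. Please try again in a moment.",
    });
  }
});

app.listen(3000);
```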

Care about your bot’s personality

As I mentioned in the introduction, a chatbot gives your brand a unique opportunity to show personality traits that cannot be conveyed through other automated contact channels. Falling short in this aspect may make a bot unappealing to use, preventing users from ever discovering its usefulness. By making your bot interesting to use, you are helping your customers get to the point where they can help themselves with it.

Some situations in the interaction may make your bot sound cynical or, worse, like it’s lying. It may take some work, but it pays off to think about how to make your bot an artful dodger. This is more easily explained through an example, so I’ll roughly rephrase one from Google’s Ryan Germick, which he used at Google I/O 2018.

If you ask Google Assistant what its favorite kind of ice cream is, it could lie and say “chocolate”, even though you know it doesn’t have a mouth and is therefore unable to taste any food. Conversely, it could be cynical and say “I’m a robot, I don’t have a mouth, and therefore no opinion on the matter”. That may be the truth, but few brands would like to be perceived like that. The way Google found to artfully dodge the question was “You can’t go wrong with Neapolitan. There’s something in it for everyone”. I find this an inspiring example of how to avoid the awkwardness that can arise from your bot sounding close to, but not quite, human.

A key question is also how the bot and the user stand in relation to each other. Is your brand professorial, standing above the user? Or does a readiness to serve better suit your users’ taste, not unlike a butler? Or perhaps a same-level, informal relationship is what works best for your brand? Put some thought into these questions, and then write your bot’s answers so as to stay consistent with your choice.

Easter eggs can also add a bit of spice to the conversation and keep it interesting. You may, for instance, tell our bot on Google Assistant that you’re bored; in fact, you can do that repeatedly and get different responses each time. Other devices, such as filler words and similar human idiosyncrasies, not only help keep the interaction natural and interesting, but may even bring the occasional giggle to your user, if that fits your branding principles.
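As a rough sketch of the “different responses on repetition” idea, the snippet below rotates through a small set of replies for a hypothetical “I’m bored” intent; the replies, intent handling and in-memory counter are made up for illustration.

```typescript
// Rotating easter-egg replies so a repeated "I'm bored" gets a different answer
// each time. The replies and the in-memory counter are illustrative only; a
// real bot would keep the counter in the conversation's session context.
const boredReplies = [
  "Bored? I could recite our timetables alphabetically. Or we could plan a trip instead.",
  "When I'm bored I reread bus schedules. You might prefer a weekend getaway.",
  "Let's fix that: name a city you've never been to.",
];

const boredCounts = new Map<string, number>();

function handleBoredIntent(sessionId: string): string {
  const count = boredCounts.get(sessionId) ?? 0;
  boredCounts.set(sessionId, count + 1);
  return boredReplies[count % boredReplies.length];
}
```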

Another aspect you may want to consider, especially if you intend to serve users in multiple languages or countries, is involving a multicultural team in the project. This helps come up with easter eggs and jokes that translate easily, but more importantly, it may show you where your bot needs to differ in each language, increasing its acceptance. One of the goals of this multicultural group should be to find the largest possible common ground, so as to simplify your bot’s development and maintenance.

Prepare for the unexpected

It is inevitable that communication problems will sometimes arise, especially with voice interactions, but also with text-based bots. It could be that your user is asking the bot to do something it doesn’t do yet, or to perform its main use case in a way you never thought of. To deal with those cases, you need good fallback support. A “fallback”, in chatbot lingo, is the bot’s reaction when the user says something unexpected that the bot could not map to any of its capabilities.

To have effective fallbacks, you need to think about the reasons why your bot may not have understood the user. If they’re talking to your bot amidst a lot of background noise, simply allowing them to repeat themselves may be enough. If, however, the user asked the bot to do something it is able to do, but with words it didn’t recognize for that task, they will have to rephrase, and your fallback messages may need to change to encourage exactly that.

It may also be that something else entirely happened: your user was talking to the bot and someone walked into the room, or called them. For most users the bot takes lower precedence than the people around them or even phone calls, so your bot may be hearing excerpts of your user’s conversation. If so, repeatedly asking them to repeat themselves will just be annoying. If your bot can’t make out what the user is saying after the second or third attempt, it should end the interaction gracefully, saying farewell, instead of obliging the user to explicitly pause their conversation to tell the bot to stop.
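To make this concrete, here is a minimal sketch of escalating fallback handling, assuming the attempt count is tracked per conversation (for example, in a Dialogflow context); the wording and the end-of-conversation flag are illustrative, not our production messages.

```typescript
// Escalating fallback handling: ask to repeat, then suggest a rephrase, then
// bow out gracefully. The attempt counter is assumed to be stored per
// conversation (e.g. in a Dialogflow output context); wording is illustrative.
interface FallbackResult {
  reply: string;
  endConversation: boolean;
}

function handleFallback(attempt: number): FallbackResult {
  if (attempt <= 1) {
    // First miss: possibly just background noise, ask the user to repeat.
    return { reply: "Sorry, I didn't catch that. Could you say it again?", endConversation: false };
  }
  if (attempt === 2) {
    // Second miss: encourage rephrasing and hint at what the bot can do.
    return {
      reply: "Sorry, I still didn't get it. You can ask me for a ride, for example: 'Find a bus from Munich to Berlin'.",
      endConversation: false,
    };
  }
  // Third miss: the user is probably busy talking to someone else. Say farewell.
  return { reply: "I seem to be catching you at a bad time. Goodbye for now!", endConversation: true };
}
```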

Another possible source of misunderstanding is slot-filling. For some use cases, you need more information than your user initially volunteered. In our case, a user may ask for a bus ride from Munich to Berlin. You could simply assume they mean the next available ride, or you could explicitly put the user in control by asking when they want to depart. For us, a date suffices, so we ask on what day the user wants to depart.

The user’s input for a mandatory parameter may also be invalid. For instance, a user may ask for a ride from Portugal to Spain, which is not precise enough. Your slot-filling prompts therefore need to be specific, such as “What is your departure city?”, so the user knows that a city is expected, not a country. By formulating all questions this way, you considerably help users achieve what they want as quickly as possible, with fewer fallbacks and fewer repeated questions, which may motivate them to engage more with your brand in the future.
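As a sketch of this kind of validation (with a hypothetical city lookup standing in for real station data), the check below returns a specific re-prompt whenever the departure slot is missing or not recognized as a city.

```typescript
// Validate a slot-filled departure parameter and re-prompt with a specific
// question when it's missing or not a known city. The city list is a
// hypothetical stand-in for a real station lookup.
const knownCities = new Set(["Munich", "Berlin", "Los Angeles", "Las Vegas"]);

// Returns a specific re-prompt, or null if the slot is valid.
function validateDepartureSlot(departure: string | undefined): string | null {
  if (!departure) {
    return "What is your departure city?";
  }
  if (!knownCities.has(departure)) {
    // "Portugal" would end up here: spell out the kind of answer expected.
    return `I couldn't find a station called ${departure}. Which city would you like to depart from?`;
  }
  return null;
}
```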

Get to the point, but not too fast

The final consideration is the length of the sentences your bot uses. One of the reasons chatbots are so trendy right now is that they sound almost human: they are not strictly to the point, which helps make interactions feel natural. However, listening to a robot talk for too long is boring, and it is generally harder to hold a lot of information in auditory memory than in visual memory. So you want to hit a sweet spot in response length. Shorter is usually better, but bear in mind that it is possible to be excessively short. Ask your colleague for feedback on response length when testing your script with them, and take it into account when refining the script.

For instance, say one of our users wants to go from Los Angeles to Las Vegas on September 10th. The response could include the details of every ride offered for that segment on that date as of this writing. That would be excessive, boring our users and driving them away. Conversely, we could pick an arbitrary ride according to some criteria and simply say “Departure at 7AM. Cost $2.99 plus fees”. That leaves the user little way of knowing whether the bot understood the input correctly, besides not being very friendly.

In our bot’s answer, we try to let the user know that the bot understood their needs and that they have other options on that date, while keeping to the point: “I looked through our schedule and found 6 connections. I recommend this ride that departs from LA Downtown to Las Vegas Downtown on Monday, 10th of September at 7:00, costs $2.99 and lasts 4 hours 45 minutes. Would you like to select this option?”
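A rough sketch of how such a response could be assembled is shown below; the Ride shape, the “earliest departure” recommendation heuristic and the exact wording are assumptions for illustration, not our actual logic.

```typescript
// Condense a list of search results into one concise, confirmable
// recommendation. The Ride shape and the "earliest departure" heuristic are
// illustrative assumptions; the wording loosely mirrors the example above.
interface Ride {
  fromStation: string;
  toStation: string;
  departure: Date;
  price: string;
  durationMinutes: number;
}

function pickRecommended(rides: Ride[]): Ride {
  // Illustrative heuristic: earliest departure. A real bot might also weigh
  // price, duration and seat availability.
  return [...rides].sort((a, b) => a.departure.getTime() - b.departure.getTime())[0];
}

function formatRecommendation(rides: Ride[]): string {
  const r = pickRecommended(rides);
  const hours = Math.floor(r.durationMinutes / 60);
  const minutes = r.durationMinutes % 60;
  const time = r.departure.toLocaleTimeString("en-US", { hour: "numeric", minute: "2-digit" });
  return (
    `I looked through our schedule and found ${rides.length} connections. ` +
    `I recommend this ride from ${r.fromStation} to ${r.toStation}, departing at ${time}, ` +
    `costing ${r.price} and lasting ${hours} hours ${minutes} minutes. ` +
    `Would you like to select this option?`
  );
}
```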

Final words

By planning these aspects of the bot early in its development, you will have a clearer idea of what you want to offer and how, as well as a good path to get there. In the upcoming blog posts I’ll talk a bit more about the implementation details and then about publishing chatbots.