We’ve talked previously about how businesses are creating digital personas to represent their brands, and while an identity is all well and good, these avatars will eventually have to build relationships with consumers through conversation.
Whilst we humans carry the most advanced processors known to man in our skulls, traditional computer processors struggle to identify the tones and thick accents that make each of our voices unique. Machine learning and artificial intelligence programs are being used to improve voice recognition in products such as Google Assistant, Alexa and Siri: they take recordings of their users’ voices to learn to better identify what is being said, and their accuracy keeps improving.
Now that they seem to have a grasp of how to listen, the next milestone is learning how to talk. Some are becoming advanced enough to recognise and replicate emotions through language. But comprehending the words a person says is only one step; social connotations and abstract meanings are much harder for computers to understand.
For example, asking a digital assistant to “put the kettle on” may be met with a question like, “Where would you like the kettle to be placed?” These services may not yet be perfect, but as the underlying algorithms improve, we will find that computers can recognise accents and idiosyncrasies from all over the world and respond accordingly.
Taking this one step further, Google recently announced an update to Google Assistant called Duplex. It lets you ask your phone to make, for example, a dentist appointment; Google’s complex combination of ones and zeros will then call the dentist, speak to the receptionist in an uncannily human voice and book the appointment you want. It goes further still by dealing with more complex situations in a human manner. If the clinic doesn’t have an opening until Wednesday, for instance, the assistant is designed to use human speech features such as umming or pausing to think to smooth the conversation.
Apple’s recent updates to Siri give it a natural-sounding voice. When this and Google’s technology are combined, or overtaken by another player (most likely Amazon), it is going to become increasingly hard to tell a real voice from a computer-generated one.