Chatbots, digital assistants, and virtual assistants are all based on a conversational user interface, but they are not all equally smart or conversational. Traditionally, chatbots have been associated with conversational systems that have a chat-like or messenger-type interface. Personal digital assistants on your device can also translate speech to text and use text-to-speech synthesis to read responses back to the user. They are more 'personal' in the sense that they can help you do things more quickly, for instance, sending a text or setting a reminder in your calendar. Virtual Digital Assistant, or Virtual Assistant, is just a generic term for both. Virtual assistants can differ in many of the steps in the following diagram, which depicts the data flow from the moment a user utters a question to the moment they get an answer. For instance, some virtual assistants are just Q&A chatbots with a very limited dialog manager component, others lack speech recognition, and others understand more than just the English language. In analyzing virtual assistants, these are some of the things they may or may not do:
I will be comparing some of the commercial virtual assistants in future posts and will use the above 10-point comparison list to highlight features or the lack thereof. Generally speaking, virtual assistants are only as smart as the knowledge they rely on and the effectiveness of their NLP algorithm at understanding the user's intent and finding the answer. When first released, they may not know much, either because the knowledge is lacking or because the NLP algorithm typically improves over time.
The core algorithm is natural language processing (NLP), the blue steps in the diagram, which processes written or spoken sentences to understand the user's intent. NLP comes in many forms; it is the main reason virtual assistants differ and why some sound smarter than others. The diagram also conveys other key distinctions among virtual assistants: 1) whether they are question-and-answer or conversational assistants, 2) whether they can answer general questions or only questions specific to a domain, and 3) whether they are action assistants, those that will book an appointment in your calendar or a hotel room for you, hence the name personal digital assistants.
In fact, Alexa originated as an action assistant, and its claim to success has been its ability to take actions, i.e., to do things for you. Siri can do things for you too, but Amazon rode this horse much faster and introduced APIs and a third-party program early on so that third-party developers could teach Alexa new skills. To take these actions, digital assistants must connect to other devices or apps. The magic here is to have clear connections to devices, e.g., to open the garage door or turn on the lights, or to other apps, e.g., to book an appointment on your calendar. The digital assistant still needs to understand your spoken words, i.e., natural language, but the chance of getting it wrong is lower because there are only so many ways to ask Alexa to play a song, although it may still play the wrong song or not understand your accent.
The digital assistant's API and the application's API make these connections possible, just as two applications communicate with each other through their respective APIs. Some of these connections are pre-built for you, e.g., playing a song using the music app on your device, and you may just have to turn them on. If a connection between your favorite calendar application and the digital assistant does not exist, then the assistant cannot create an appointment in your calendar.
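
To make the idea of connections concrete, here is a minimal Python sketch of how an assistant might route an already-understood intent to an app. Everything in it is hypothetical: the `Assistant` class, its `register` and `handle` methods, and the intent names are my own illustration, not the actual API of Alexa, Siri, or any other product.

```python
from typing import Callable, Dict

# A connector is any function that takes the details of a request and
# returns the spoken response. This type is an assumption for illustration.
Connector = Callable[[Dict[str, str]], str]


class Assistant:
    """Toy assistant that can only act through registered connections."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Connector] = {}

    def register(self, intent: str, connector: Connector) -> None:
        """Turn on a connection, e.g. when the user enables a skill."""
        self._connectors[intent] = connector

    def handle(self, intent: str, details: Dict[str, str]) -> str:
        """Route an understood intent to its connector, if one exists."""
        connector = self._connectors.get(intent)
        if connector is None:
            return f"Sorry, nothing I am connected to can handle '{intent}'."
        return connector(details)


def play_song(details: Dict[str, str]) -> str:
    # Stand-in for a call to a music app's API.
    return f"Playing {details['song']} in the music app."


assistant = Assistant()
assistant.register("play_song", play_song)

print(assistant.handle("play_song", {"song": "Yesterday"}))
print(assistant.handle("create_appointment", {"date": "Friday"}))
```

With only the music connection turned on, the first request succeeds and the calendar request falls through to the apology, which is the missing-connection situation described above.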
Alexa is the most flexible here, and it most likely has a connection to your favorite application so you can interact with the app through the speaker.
For Q&A or conversational virtual assistants, finding an answer to a question that has no specific structure or is missing some essential details is more technically challenging, and it is where you will most often see the virtual assistant fail. There are definitely many more ways to ask a question for which you want an answer than to ask Alexa to play a song. NLP, the blue steps in the diagram, makes the difference in understanding the user's intent and finding the right answer. Some assistants are rule-based, some are statistical, some rely on machine learning models, and some combine these approaches. The results will be more or less successful depending on how complete the knowledge base is, i.e., whether the answers are in there.
The dialog component is key if the question lacks essential details. The NLP component needs to recognize that and must either infer the missing details, ask the user for them, or just return search suggestions. True conversational chatbots that can engage users in back-and-forth questions and answers are not a commercial reality yet. The most effective ones in this specific respect are rule-based, domain-specific assistants, e.g., travel or doctor assistants, because they are typically coded to prompt the user for missing details, e.g., missing travel dates for a hotel booking.
Google, in my opinion, has the best NLP technology of any company. This should not come as a surprise, given that Google has been investing in NLP since its beginning and developed a complex machine-learning model that it has been refining since 2006, when it released Google Translate. Please see the related post on Google for more details. The only known commercial implementation that attempts to answer any question, i.e., a general-purpose digital assistant, is Siri.
Siri relies on Wolfram Alpha, a vast computational knowledge base of curated information and algorithms. Wolfram and his team took decades to build and release it. Most other assistants are domain specific. Alexa has only a predefined set of questions it can answer, so it does not fit the general-purpose type. UPDATE: The recent Google Assistant, with its Google Knowledge Graph, makes the grade as a general-purpose digital assistant.
A related topic is virtual assistants for businesses, which I will comment on in a separate post. While the user experience is similar to that of a personal digital assistant, there are other considerations, for instance, what skills do you have in-house to build your product or service assistants? I will comment on one in particular.
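
As a closing illustration of the dialog behavior discussed earlier, where a rule-based, domain-specific assistant prompts the user for missing details, here is a minimal Python sketch for a hotel booking. The detail names, prompts, and function names are assumptions of mine, and a real assistant would also need NLP to extract these details from the user's sentence in the first place.

```python
from typing import Dict, Optional

# Hypothetical details a hotel booking needs, each paired with the question
# to ask when it is missing. The order here is the order of the prompts.
REQUIRED_DETAILS = {
    "city": "Which city are you traveling to?",
    "check_in": "What is your check-in date?",
    "check_out": "What is your check-out date?",
}


def next_prompt(details: Dict[str, str]) -> Optional[str]:
    """Return the question for the first missing detail, or None if complete."""
    for name, prompt in REQUIRED_DETAILS.items():
        if not details.get(name):
            return prompt
    return None


def respond(details: Dict[str, str]) -> str:
    """Either ask for a missing detail or confirm the booking."""
    prompt = next_prompt(details)
    if prompt is not None:
        return prompt
    return (
        f"Booking a hotel in {details['city']} "
        f"from {details['check_in']} to {details['check_out']}."
    )


# The user only mentioned the city, so the assistant asks for the dates.
print(respond({"city": "Paris"}))
print(respond({"city": "Paris", "check_in": "May 3", "check_out": "May 6"}))
```

The rules are trivial, which is exactly why this kind of assistant works well within one domain and does not generalize to open-ended questions.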