By 'recent trend' I mean that, as usual with technology, chatbots have been around forever (Weizenbaum's ELIZA, a Rogerian psychotherapist, from the mid-60s; phone menus, or IVR, from the 70s). But as of 2017, there is an explosion of available chatbot technology and, orthogonally, of chatbot marketing.
The point here is to give a superficial systematization of the different things labeled 'chatbot', with examples.
There are two distinguishing characteristics of chatbots that are only leniently considered defining: sequential response and natural language input (either by text or speech). Combined, these two might more formally be called a Linguistic User Interface (LUI), in contrast with a graphical user interface (GUI). The natural language part underlying many of these is some kind of speech-to-text (S2T) mechanism to get words from speech, plus some NLP to match the words to the expected dialog. The leniency about 'sequential' is that the sequence may come down to a single step (the shortest of sequences, possibly not even considered a sequence at all); the leniency about 'language' is that almost anything with words counts (a label for a button is language, right?). With those caveats, on to the taxonomy.
- linguistic interfaces
- Siri/Alexa/OK Google - intent/entity/action/dialog. Stateless giving of commands to evoke an action. Development of the system involves specifying an 'intent' (something that you want to happen), the entities involved (contacts, apps, dates, messages), and actions (the code that really executes based on all that information). Oh, and the more obvious thing: a list of all the obvious varieties of sentences that a person could utter for this. The limitation is that there is no memory of context from one action request to the next.
- chatroom bots - listeners in a chatroom (mostly populated by people writing text).
- helper commands - This kind of chatbot simply listens to text and, if a particular string matches, executes an action. This doesn't need S2T, and usually no NLP. It relies on text pattern matching (usually regexes) to extract strings of interest. Usually it turns out the implementation is even simpler and just uses a special character to signal that a command for a CLI (command line interface) follows.
- conversational bot
- Like ELIZA, finds keywords or more complicated structures in a sentence and tries to respond to them in a human-like fashion (good grammar, makes sense). The latest ML and machine translation techniques (RNN, LSTM, NER) seem to apply best here.
- 'AGI' - artificial general intelligence - these exist only in TV/movies.
- menu trees - structured tree-like set of possibilities, 'Choose Your Own Adventure'. These are very much like (or exactly) finite state automata, where the internal state of the machine (presumably, but not necessarily, mirroring the mental state of the user) is changed by a simple action of the user. The user is following a path through the system.
- phone menus - Historically, these are menus, a set of choices, spoken to you, expecting a response of a touch-tone number (Dual-Tone Multi-Frequency, DTMF). A recording lists a number of options and the phone user is expected to press the number associated with one of them. Then another set of options is provided, and so on until an 'end' option is chosen or you're transferred to a human operator. Interactive Voice Response, or IVR, is this same interface allowing responses by voice also. A next level of feature augmentation is to allow the user to speak a sentence to go to the desired subtree quickly, skipping over some steps. This shows how the strict computery menu as implemented on a phone is slowly evolving towards a conversation.
- app workflow - some desktop/phone apps offer an interface that leads you through data entry sequentially. The user is provided with a set of labeled buttons, and the choice of button determines the next question. Instead of buttons, one might enter some short text, but again this can lead to different new questions from the interface. The text is not intended to be a full sentence, but simply a vocabulary item, allowing a more open-ended set of possibilities than a strict set of buttons without the necessity of parsing. This is the least chatty of chatbots, but like the phone menus it may be considered a sequential but non-linguistic UI that is a precursor to a more language-based one.
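To make the intent/entity/action idea from the Siri/Alexa item concrete, here is a minimal sketch in Python. It is not how any of those products are actually built; the patterns, the `send_message` and `set_timer` actions, and the `handle` function are all made up for illustration. Each regex stands in for the list of sentence varieties, its named groups are the entities, and the function it is paired with is the action.

```python
import re
from typing import Optional

# Hypothetical actions: the code that really executes once an intent matches.
def send_message(contact: str, text: str) -> str:
    return f"Sending '{text}' to {contact}"

def set_timer(minutes: str) -> str:
    return f"Timer set for {minutes} minutes"

# Hypothetical intents: several utterance patterns can map to the same action.
# Named groups capture the entities (contact, text, minutes).
INTENTS = [
    (re.compile(r"^(?:send|text) (?P<contact>\w+) (?:saying|that) (?P<text>.+)$", re.I),
     send_message),
    (re.compile(r"^tell (?P<contact>\w+) (?P<text>.+)$", re.I), send_message),
    (re.compile(r"^set (?:a )?timer for (?P<minutes>\d+) minutes?$", re.I), set_timer),
]

def handle(utterance: str) -> Optional[str]:
    """Match one utterance on its own; nothing is remembered between calls."""
    for pattern, action in INTENTS:
        match = pattern.match(utterance.strip())
        if match:
            # Pass the captured entities to the action as keyword arguments.
            return action(**match.groupdict())
    return None  # no intent recognized

print(handle("Text Alice saying running late"))
print(handle("Set a timer for 10 minutes"))
print(handle("make it 20 instead"))  # None: there is no context to refer back to
```

The last call shows the limitation mentioned above: because `handle` is stateless, a follow-up like "make it 20 instead" has nothing to attach to.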
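The menu-tree idea can likewise be sketched as a tiny finite state machine. The menu below is invented for illustration (the state names, prompts, and key assignments are assumptions, not any real phone system); the point is only that each key press moves the caller from one state to the next.

```python
from typing import List

# Hypothetical phone menu: each state has a prompt and maps a key press
# to the next state. States with no options are end points.
MENU = {
    "main": {"prompt": "Press 1 for billing, 2 for support.",
             "options": {"1": "billing", "2": "support"}},
    "billing": {"prompt": "Press 1 for your balance, 0 for an operator.",
                "options": {"1": "balance", "0": "operator"}},
    "support": {"prompt": "Press 1 for outages, 0 for an operator.",
                "options": {"1": "outages", "0": "operator"}},
    "balance": {"prompt": "Your balance is read out here. Goodbye.", "options": {}},
    "outages": {"prompt": "Known outages are read out here. Goodbye.", "options": {}},
    "operator": {"prompt": "Transferring you to an operator.", "options": {}},
}

def step(state: str, key: str) -> str:
    """Return the next state; an unrecognized key leaves the caller where they are."""
    return MENU[state]["options"].get(key, state)

def walk(keys: str, start: str = "main") -> List[str]:
    """Follow a sequence of key presses and return the path of states visited."""
    path = [start]
    for key in keys:
        path.append(step(path[-1], key))
    return path

print(walk("10"))  # main -> billing -> operator
print(walk("9"))   # unknown key: the caller stays at the main menu
```

Letting the caller speak a sentence to jump straight to a subtree, as described for IVR, would amount to adding transitions from `main` directly to the deeper states.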
It seems strange to call all of these bots. I find it natural to reserve the label 'chatbot' for the conversational bots only. It turns out that marketers have used the term 'chatbot' for all of these. They surely all share some aspects of a chatbot, but the name doesn't feel right until you're actually chatting.