Defining the dialog

Before you settle on a strategy (or set of strategies) for your dialog, consider factors such as:

  • How the user will navigate through the system (determined to a large extent in the previous topic)
  • How prompts (messages) will be worded to obtain information from the user
  • How information (feedback) will be presented
  • How failure scenarios will be handled

Comprehensive coverage of the entire design process is beyond the scope of this topic. Below is a list of major items to think about, to set you on the right track.

Analyze your application

You did this earlier in Five steps to planning your app. At this point you should have identified:

  • Conversations or dialogs (activities) the application will handle
  • Information to collect for each dialog
  • Sample phrases that capture the user’s intent to start a dialog
  • Words and phrases a user might say to express specific requests
  • Sample conversation flows for each dialog

Map the user interface

Once you’ve decided on general requirements and your approach (tasks to perform, information to collect, information to return to the user, dialog flows), it’s time to sketch or map out how users will interact with the application. In these early stages, you might use a diagramming application such as Microsoft Visio to draw a flow chart that represents each dialog state in the application, with arrows pointing to other dialog states based on what the user says. In Mix.dialog, you will recreate and streamline this graphical representation and make it available to all stakeholders.

In the early stages remember to consider any constraints on information collection. For example, the business requirements of a drink-ordering app may demand that you first ascertain the user’s location before permitting an order to be placed (to ensure that a specific location carries the products requested). Similarly, information dependencies may exist that you need to build into the application; for example, the option to add “shots” may exist for coffees that fit within the “espresso family” but not for regular or “drip coffee”.
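The constraints described above can be sketched in code before you draw the flow chart. The following is a minimal illustration only, not part of Mix.dialog: the drink names, the `order` dictionary, and the `next_item_to_collect` function are all hypothetical.

```python
from typing import Optional

# Hypothetical catalog: drinks that belong to the "espresso family".
ESPRESSO_FAMILY = {"espresso", "latte", "cappuccino"}


def next_item_to_collect(order: dict) -> Optional[str]:
    """Return the next piece of information to ask for, or None when done."""
    # Constraint: the location must be known before an order can be placed.
    if "location" not in order:
        return "location"
    if "drink" not in order:
        return "drink"
    # Dependency: shots are offered only for espresso-family drinks.
    if order["drink"] in ESPRESSO_FAMILY and "shots" not in order:
        return "shots"
    return None
```

Each branch in this function corresponds to a branch you would expect to see in the flow chart.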

A flow chart indicating the information dependencies in your application will give you a visual map of the different branches your application will have to cover, and provide some guidance as to the best strategy to use in designing the dialog.

Here is an example plan for a drink application (not all of the dialog flow is shown), which will offer drip coffee, espresso-style beverages, and tea.

Figure: Sample flow for coffee app

Such a sketch or map will help you build the conversational flow in Mix.dialog, with each specific task—such as asking a question, playing a message, and performing recognition—specified via a node. As you add nodes and connect them to one another, the dialog flow takes shape in the form of a graph, allowing you to visualize every piece of the conversational logic.

At this point you might want to think about the purpose of each node in your application. For example, the most important nodes you will use in Mix.dialog include:

  • Start: To start the conversation. Can also be used to set variables and to override global settings such as error and command handling. (For non-Main components, these values can be overridden using the Enter node.)
  • Message: To perform non-recognition actions, such as playing a prompt, assigning a variable, or defining the next node in the dialog flow.
  • Question & Answer: To listen for and recognize user responses.
  • Intent Mapper: To connect the dialog flow to other components to get the user’s intent and collect the information to fulfill that intent.
  • Data access: To exchange information with a backend system.

These nodes will, in turn, send the following actions back to your client application at specific points in the dialog:

  • Message actions to indicate that the client app should play a message to the user
  • Q&A actions to indicate that the app should play a message and return user input to the dialog (such as “What type of coffee would you like today?” and the user’s answer, “double espresso”)
  • End actions to indicate the end of the dialog
  • Data actions to indicate that the dialog expects data to continue the flow (for example, to retrieve the price of a double espresso)
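To make the four action types more concrete, here is a rough sketch of how a client application might dispatch them. It does not use the DLGaaS API: the action dictionaries, their keys (`type`, `text`, `request`), and the reply format are assumptions made purely for illustration.

```python
from typing import Callable


def handle_action(
    action: dict,
    get_user_input: Callable[[], str],
    fetch_data: Callable[[str], dict],
) -> dict:
    """Dispatch one (hypothetical) dialog action and build the client's reply."""
    kind = action["type"]
    if kind == "message":
        # Play the message; nothing needs to be returned to the dialog.
        print(action["text"])
        return {"continue": True}
    if kind == "qa":
        # Play the message, then return the user's input to the dialog.
        print(action["text"])
        return {"continue": True, "user_input": get_user_input()}
    if kind == "data":
        # The dialog is waiting for data from a backend system.
        return {"continue": True, "data": fetch_data(action["request"])}
    if kind == "end":
        return {"continue": False}
    raise ValueError(f"Unknown action type: {kind}")
```

For example, a Q&A action carrying the text “What type of coffee would you like today?” would produce a reply containing the user’s answer, while an End action tells the client to stop the exchange.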

For more information on Mix.dialog node types, see “Node types” in Mix.dialog. For more information on how to return user input to the dialog service, see “Actions” in the DLGaaS documentation.

Be clear, consistent, and efficient

Design your messages to encourage the simplest and most direct responses possible. Users want speed and efficiency: the fewer steps it takes to complete a task, the more efficient the system feels. For more information on directing user responses, see Prompting the user.

Support universal commands such as “main menu”, “escalate”, and “goodbye”

Providing the ability to invoke these commands at any point in the conversation gives users control over the dialog. Design the application to allow users to say “main menu” should they miss or forget instructions, “escalate” if they get stuck, and “goodbye” if they wish to leave.
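One way to picture universal commands is as a check that runs before any node-specific handling. The sketch below assumes hypothetical target names (`go_to_main_menu` and so on) and simple exact-match text comparison; a real application would rely on its recognition and command-handling settings instead.

```python
# Hypothetical mapping of universal commands to dialog targets.
UNIVERSAL_COMMANDS = {
    "main menu": "go_to_main_menu",
    "escalate": "transfer_to_agent",
    "goodbye": "end_conversation",
}


def route_utterance(utterance: str, current_node: str) -> str:
    """Return the universal command target if one was said, else stay put."""
    normalized = utterance.strip().lower()
    # Universal commands take priority regardless of the current node.
    if normalized in UNIVERSAL_COMMANDS:
        return UNIVERSAL_COMMANDS[normalized]
    return current_node
```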

Gracefully handle errors

Errors and misunderstandings are inevitable, just as they are in regular, everyday conversation. Try to anticipate problems and give users effective instructions and feedback to get them back on track smoothly. Common errors include recognition failures (the system cannot determine the meaning of what was said), such as nomatch and noinput conditions. Match the level of instruction to the failure condition so that the user can move along as naturally as possible. Suggestions are provided in Handling errors and ambiguity.
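A common pattern is to give progressively more detailed instructions each time the same error recurs, and then hand off once the prompts run out. The sketch below illustrates that idea only; the prompt wording, the two-attempt limit, and the `recovery_prompt` function are assumptions rather than Mix.dialog settings.

```python
from typing import Optional

# Hypothetical recovery prompts, ordered from brief to more detailed.
RECOVERY_PROMPTS = {
    "nomatch": [
        "Sorry, I didn't catch that. What type of coffee would you like?",
        "You can say, for example, 'double espresso' or 'drip coffee'.",
    ],
    "noinput": [
        "Are you still there? What type of coffee would you like?",
        "To order, say the name of a drink, such as 'double espresso'.",
    ],
}


def recovery_prompt(error: str, attempt: int) -> Optional[str]:
    """Return the prompt for the given 1-based attempt, or None to escalate."""
    if attempt < 1:
        raise ValueError("attempt must be 1 or greater")
    prompts = RECOVERY_PROMPTS[error]
    # Once the prompts are exhausted, signal that the dialog should escalate.
    if attempt > len(prompts):
        return None
    return prompts[attempt - 1]
```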

Confirm but don’t overdo it

Confirmation has its place: for example, for disambiguation, for error handling, and before committing a transaction. However, it’s not efficient to confirm items one at a time; unnecessary confirmation can double the length of the interaction and frustrate users. For this reason, it’s best to confirm when a block of information has been completed rather than after each individual piece of information. See Requesting confirmation.
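As a small illustration of confirming a completed block, the hypothetical helper below builds one confirmation prompt from everything collected so far, instead of asking the user to confirm each item separately. The function name, the `order` dictionary, and the prompt wording are assumptions.

```python
def build_confirmation(order: dict) -> str:
    """Build a single confirmation prompt for a completed block of information."""
    # Summarize every collected item in one sentence.
    details = ", ".join(f"{name}: {value}" for name, value in order.items())
    return f"You ordered {details}. Is that correct?"
```

With three collected items, this approach asks one confirmation question rather than three.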

Avoid cognitive overload

Reduce the short-term memory load on users by providing visual as well as auditory (multimodal) feedback, by limiting options whenever possible, and by splitting up complex tasks into a sequence of smaller interactions.

Maintain context

A well-designed application tracks what the user has said (or typed/tapped) and responds in context. Your strategy for maintaining conversational context should take into account factors such as the number of turns or user interactions to retain and when to release the context (for example, if the user is in the middle of a transaction and clicks the Back button or says “cancel”).

Another consideration is intent switching: Do you want to give your users the ability to switch between intents; for example, to move from the place-order dialog to the location dialog (to view a list of nearby coffee shops)? Are users able to switch back with no loss of contextual awareness, or would you prefer that they finish one task at a time? You’ll need to balance the benefits of usability against application complexity.
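If you decide to support intent switching without loss of context, one possible approach is to keep a stack of in-progress intents. The class below is a simplified sketch under that assumption; `ConversationContext` and its methods are hypothetical and do not correspond to anything in Mix.dialog.

```python
from typing import Optional


class ConversationContext:
    """Track in-progress intents so the user can switch away and come back."""

    def __init__(self) -> None:
        self._stack: list = []

    def start(self, intent: str) -> None:
        # Starting a new intent suspends, but does not discard, the current one.
        self._stack.append({"intent": intent, "slots": {}})

    def set_slot(self, name: str, value: object) -> None:
        self._stack[-1]["slots"][name] = value

    def finish(self) -> Optional[dict]:
        """Complete the current intent and return the one to resume, if any."""
        self._stack.pop()
        return self._stack[-1] if self._stack else None

    def cancel(self) -> None:
        # Release all context, for example when the user says "cancel".
        self._stack.clear()
```

For example, a user who starts the place-order dialog, chooses a latte, and then switches to the location dialog returns to an order that still remembers the latte once the location dialog finishes.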

Remember, you are guiding the user

The best applications focus on the users’ goals and on achieving them in the most efficient and intuitive way possible. The structure of your application will depend on the natural logic of the application (the dialogs or actions to perform and the corresponding information to collect or return). It will also depend on your users’ responses to your questions and on your messages: how you respond not only to successful results but also to errors, ambiguities, and incomplete information.