Freeform entities

An entity with the freeform collection method captures, as a single block, user input that you cannot:

  • Enumerate in a list
  • Specify with a regex pattern
  • Specify with a rule-based grammar
  • Express in terms of other entities using an isA or hasA relationship

Take the example of an intent for sending a text message to a specified user. A text message body could be any sequence of words of any length. In the query “send a message to Adam hey I’m going to be ten minutes late”, the phrase “hey I’m going to be ten minutes late” becomes associated with a freeform entity MESSAGE_BODY.

A key property of a freeform entity is that the application does not need to interpret the meaning of the entity's literal to fulfill the intent. In the example of sending a text message, the application does not need to understand the message; it only needs to send the literal text, as a string, to the intended recipient.

Having difficulty determining which type to use? See the examples below.

Example sports application – List type

Consider a sports application, where your samples would include many ways of referring to one sports team, for example, the Montreal Canadiens:

  • [SPORTS_TEAM]Montreal Canadiens[/]
  • [SPORTS_TEAM]Canadiens[/]
  • [SPORTS_TEAM]Habs[/]

Since you could enumerate each option, you would make this a list type and annotate it accordingly. Additionally, the NLU engine would learn about the entity from these different ways of referring to the Canadiens. You would not have to enumerate every possible sports team or every possible way to refer to the Canadiens.
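The many-to-one relationship behind a list type can be pictured as a lookup from literals to a single value. The following is only an illustrative sketch: the table name, the helper function, and the value MONTREAL_CANADIENS are assumptions made for this example, not part of the product.

```python
from typing import Optional

# Hypothetical literal-to-value table for a SPORTS_TEAM list entity.
# Several literals map to the same canonical value (many-to-one).
SPORTS_TEAM_LITERALS = {
    "montreal canadiens": "MONTREAL_CANADIENS",
    "canadiens": "MONTREAL_CANADIENS",
    "habs": "MONTREAL_CANADIENS",
}


def resolve_sports_team(literal: str) -> Optional[str]:
    """Return the canonical value for a literal, or None if it is not listed."""
    return SPORTS_TEAM_LITERALS.get(literal.strip().lower())
```

A freeform entity has no such table: there is no fixed set of literals and no canonical value to map them to.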

Example SMS app message recipient – regex or rule-based type

Consider an SMS messaging application, where samples include the destination phone number. There are billions of possible phone number combinations, so you could not enumerate all the possibilities, nor would it make sense to try. However, phone numbers are not freeform input, because they have a fixed, systematic structure that falls under a small set of pattern formats. These patterns can be recognized either with a regex pattern (for typed-in phone numbers) or with a grammar (for spoken numbers). Phone numbers are also a poor fit for a freeform entity because the application must understand the contents of the number, that is, its semantics, to direct the message and fulfill the intent.
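As a rough sketch of why a typed-in phone number suits a regex pattern, the following assumes a North American format (optional +1, then 3-3-4 digits with optional separators). The pattern and the helper name are assumptions for illustration; a real application would need patterns for every format it supports.

```python
import re
from typing import Optional

# Assumed format: optional "+1" or "1", then 3-3-4 digits, with optional
# parentheses around the area code and optional space, dot, or hyphen separators.
PHONE_PATTERN = re.compile(
    r"^(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})$"
)


def normalize_phone(text: str) -> Optional[str]:
    """Return the ten digits of a matching phone number, or None if no match."""
    match = PHONE_PATTERN.match(text.strip())
    if match is None:
        return None
    return "".join(match.groups())
```

Because the pattern yields a structured value (the digits), the application can use it to route the message, which is exactly what a freeform literal does not provide.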

Example SMS app message contents – Freeform type

When your sample entity includes text that does not have well-defined many-to-one relationships and that cannot be fully enumerated or described with rules or patterns, use the freeform entity type. Consider an SMS app, where it is impossible to list or specify every way that a user may say something to your app. The body of an SMS message could be literally anything. Here is an example of what those annotations might look like:

  • Send a message to adam [MESSAGE_BODY]are you coming soon we’re waiting for you[/]
  • Reply with [MESSAGE_BODY]I saw your message and will pick up milk on the way home[/]
  • Say [MESSAGE_BODY]what is up buddy[/]

MESSAGE_BODY would be a freeform entity because the contents of a message are unpredictable and cannot be fully enumerated. Moreover, understanding the contents is not necessary to send the message, verbatim, to its destination.

Notes on freeform entity annotation

Some important points to remember about annotating freeform entities:

  • Be aware that any words inside the freeform entity annotation do not improve your NLU model. The text marked as the freeform part of the sample (and only that part!) is like a black box that won’t be further analyzed in training. Additionally, the ASR engine won’t be able to improve the recognition of these words as it would be able to do for words in a list type. Use the freeform type with care.
  • You cannot annotate the entire contents of a sample sentence as a single freeform entity. Your samples must contain words leading into the freeform text or following it. This provides context that the NLU engine needs to detect that a chunk of text within a sentence should be recognized as a freeform entity.
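The second point can be checked mechanically before training. The sketch below assumes samples use the [ENTITY]text[/] annotation style shown in the examples above, with no nested annotations; the function name is hypothetical.

```python
import re

# Matches one non-nested annotation such as [MESSAGE_BODY]some text[/].
ANNOTATION = re.compile(r"\[([A-Z_]+)\](.*?)\[/\]")


def has_context_outside(sample: str, entity: str = "MESSAGE_BODY") -> bool:
    """Return True if the sample has text outside the given entity's annotation."""
    # Drop the text of the named entity; keep the text of any other entity,
    # since other entities still count as surrounding context.
    remainder = ANNOTATION.sub(
        lambda m: "" if m.group(1) == entity else m.group(2), sample
    )
    return bool(remainder.strip())
```

For example, "Say [MESSAGE_BODY]what is up buddy[/]" passes the check because "Say" leads into the freeform text, while a sample consisting only of the annotated block does not.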

Notes on freeform entity recognition

Some important points to remember about recognition of entities using the freeform collection method:

  • Mix does not support collecting completely freeform sentences, for example, by inviting the user to provide open-ended comments or feedback. If you want your application to support this sort of scenario, you must handle this outside the regular flow of the dialog or at least bypass NLU interpretation for that user input.
  • The NLU engine may fail to recognize a freeform text block as a freeform entity if the text contains content that fits a predefined entity such as a date, a number, or a distance.
  • When working with the NLU interpretation results related to a freeform entity, you should use the literal rather than the string value.
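The last point can be sketched as follows. The dictionary shape used here is a simplified, hypothetical stand-in for an interpretation result, not the actual response format; only the idea of reading the entity's literal is taken from the text above.

```python
from typing import Optional


def message_body_literal(interpretation: dict) -> Optional[str]:
    """Return the literal text of MESSAGE_BODY, or None if it was not recognized."""
    # "entities" and "literal" are assumed key names for this illustration.
    body = interpretation.get("entities", {}).get("MESSAGE_BODY")
    if body is None:
        return None
    return body["literal"]


# Hypothetical result for "send a message to Adam hey I'm going to be ten minutes late".
example = {
    "intent": "SEND_MESSAGE",
    "entities": {
        "CONTACT": {"literal": "Adam"},
        "MESSAGE_BODY": {"literal": "hey I'm going to be ten minutes late"},
    },
}
```

The application would pass the returned string, unchanged, to whatever component sends the message.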

Best practice

Be careful not to overuse freeform entities, especially when a large base grammar already exists for the information you want to collect, such as SONGS or CITIES. Avoid using a freeform entity to collect this type of information: the NLU engine has already been trained on a huge number of values, and you lose the benefit of that training if you use a freeform entity.