Content entities, also known as named entities, are pieces of information in visitor messages that our product gives specific meaning to. For example, it can recognize that a certain 14-character combination of letters and numbers is an order number, or that another combination of numbers, letters, and special characters is an email address. These pieces of information are then saved to the session of the conversation and are available for use at a later stage.
The whole process of identifying information is called Named Entity Recognition (NER).
In this article we will cover:
- Types of Content Entities
- When to use Content Entities
- How it works
- How to Recognize, Collect and Save Entities
- How to set it up?
- Structuring Content Entity List
Types of Content Entities
At Ultimate, we use three types of content entities:
- Regular Expression (also known as Regex)
- Expression List
- Synonym List
Regular Expression (Regex)
Used for all number- or pattern-based entities, such as credit card numbers, IBANs, email addresses, phone numbers, order numbers, dates, Social Security numbers, etc.
Every Ultimate bot comes with the presets below:
US Social Security Number
Personal identity code (Austrian, Chilean, Danish, Estonian, Finnish, French, Icelandic, Italian, Latvian, Norwegian, Polish, Romanian, Slovakian, Spanish, Swedish, Swiss)
List entities like countries, cities, brands, etc.
Product Type - shoes, shirts, pants, accessories, etc.
Product color - white, black, green, blue, etc.
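As a rough illustration of how a list-based entity could behave, the sketch below maps several surface forms to one canonical value. The `PRODUCT_COLOR` data, the synonyms in it, and the `find_list_entity` helper are hypothetical and only illustrate the idea; they are not the product's implementation.

```python
import re

# Hypothetical synonym list: canonical value -> forms a visitor might type.
PRODUCT_COLOR = {
    "white": ["white", "ivory", "cream"],
    "black": ["black", "jet black"],
    "green": ["green", "olive"],
}

def find_list_entity(message, synonyms):
    """Return the canonical value of the first form found in the message, or None."""
    text = message.lower()
    for canonical, forms in synonyms.items():
        for form in forms:
            # Word boundaries keep "green" from matching inside "evergreen".
            if re.search(r"\b" + re.escape(form) + r"\b", text):
                return canonical
    return None

print(find_list_entity("Do you have this shirt in olive?", PRODUCT_COLOR))
```

Mapping every form to a canonical value means later steps in a flow only need to handle one value per color.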
When to use content entities
- When there's Personally Identifiable Information (PII) that needs to be sanitized before data import
- When you want to collect and verify a certain piece of information, such as an order number, from chat visitor messages in order to create customized dialogue flows
- When you want to understand and identify product-specific descriptions in chat visitor messages
How it works
Once an entity is added in Settings > Content Entities, anything in chat visitor messages that matches the entity pattern can be identified.
For example, you can mask any email with the placeholder <EMAIL> by adding it as a content entity in our Dashboard. If you also want to sanitize the entity, check the box: the underlying data/string will be redacted and will not be stored in our database. Sanitization is applied in line with GDPR compliance so that PII data is removed.
- Regex - All number or pattern-based entities like phone numbers, order numbers, dates, Social Security numbers, etc.
- List entities like countries, cities, names, product types, brands, etc.
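The masking and sanitizing behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the email pattern is deliberately simplified and is not the product's preset, and `mask_entity` with its `sanitize` flag is a hypothetical stand-in for the checkbox in the Dashboard.

```python
import re

# Simplified email pattern for illustration only (assumption, not the preset).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def mask_entity(message, placeholder, pattern, sanitize=False):
    """Replace each match with the placeholder.

    Returns the masked message and the raw values to keep. With sanitize=True
    no raw values are kept, so only the placeholder survives.
    """
    values = pattern.findall(message)
    masked = pattern.sub(placeholder, message)
    kept_values = [] if sanitize else values
    return masked, kept_values

masked, kept = mask_entity(
    "Reach me at jane.doe@example.com please", "<EMAIL>", EMAIL_PATTERN, sanitize=True
)
print(masked)
print(kept)
```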
How to Recognize, Collect and Save Entities
Within the Dialogue Builder, there are two ways to identify content entities, each with its own application: recognizing an entity only, or collecting and saving it.
You can use content entities to recognize an entity in a visitor message, either within a form or a message, and use it only to move the conversation forward before escalating. A common use case is asking for verification or security information, which can be sanitized before escalation. Here the entity only checks that the input follows a format or is part of a list, so that the user must enter something; the value itself can be checked later by the human agent.
Collect and Save
This approach lets you refer to the data later in the flow. For example, you may want to collect the user's email address and save it to the conversation data so that, if it is needed later, you avoid asking for it again. This can be done with conditional blocks and visitor messages; the important step is to apply the action afterward that saves the entity to the conversation data.
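The collect-and-save idea can be sketched in a few lines. Everything here is an assumption made for illustration: the order-number format (14 uppercase letters or digits), the plain dictionary standing in for conversation data, and the `collect_and_save` and `next_step` helpers. In the product this is configured in the Dialogue Builder rather than written as code.

```python
import re

# Hypothetical order-number format: 14 uppercase letters or digits.
ORDER_NUMBER = re.compile(r"\b[A-Z0-9]{14}\b")

def collect_and_save(message, session, key, pattern):
    """Save the first match to the session; return True if something was saved."""
    match = pattern.search(message)
    if match is None:
        return False
    session[key] = match.group(0)
    return True

def next_step(session):
    """Stand-in for a conditional block: ask only if the value is not saved yet."""
    return "look_up_order" if "order_number" in session else "ask_for_order_number"

session = {}
collect_and_save("Hi, my order is AB12CD34EF56GH", session, "order_number", ORDER_NUMBER)
print(next_step(session))
```

Because the value is stored once, any later step can branch on it without prompting the visitor again.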
How to set it up?
To set up content entities:
- Go to Settings > Content Entities
- Click Add Preset or + New Entity
- Add Preset - The common ones we create for you: email, credit_card, iban, and personal_identity_code of different countries.
- + New Entity - Add your own set of rules
For more details, read how to create content entities.
Structuring Content Entity List
The Content Entities table should be organized similarly to the visitor message scenarios: entities with the highest priority that are also the least sensitive should be placed at the top.
You can reorder the content entity list by going to Settings > Content Entities and clicking the 3-dot menu at the top right of the table. Select the reorder scenarios option, then drag and drop the rows to reorder the table.
When using numbers to signal scenarios in channels where you can't have buttons, for example in WhatsApp, you can create multilingual content entity lists that include typos of each of the number scenarios. These should be ordered towards the top of the Content Entity list to have the greatest sensitivity.
Please note: if you are using entity recognition of numbers in your scenarios, you don't want these to be confused with other entities or intent predictions.
Therefore, check the most specific condition first. For example, if a visitor message scenario should also handle a predicted intent such as "wait a moment", look for that intent first and only then for the numbers (say, options 1, 2, and 3 recognized as entities). Otherwise, a visitor who says "one sec" would trigger scenario one instead of the intent.
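The effect of ordering can be shown with a small first-match-wins sketch. The rule names, the word lists, and the `route` function are hypothetical. In particular, a keyword pattern stands in for intent prediction here, which in the product is not a regular expression.

```python
import re

# Hypothetical ordered rules: the first match wins, so the most specific
# check (the "wait a moment" intent stand-in) comes before the number options.
RULES = [
    ("intent:wait_a_moment", re.compile(r"\b(one sec|wait a moment|hold on)\b", re.I)),
    ("option_1", re.compile(r"\b(1|one|uno|eins)\b", re.I)),
    ("option_2", re.compile(r"\b(2|two|dos|zwei)\b", re.I)),
    ("option_3", re.compile(r"\b(3|three|tres|drei)\b", re.I)),
]

def route(message, rules=RULES):
    """Return the name of the first rule whose pattern matches the message."""
    for name, pattern in rules:
        if pattern.search(message):
            return name
    return "fallback"

print(route("one sec"))                         # intent rule is checked first
print(route("one sec", list(reversed(RULES))))  # wrong order: a number option wins
```

With the rules reversed, "one sec" is caught by the option for one, which is the misrouting the ordering advice is meant to prevent.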