Anonymize user data

When talking to a bot, users are very often in a position where they give out personal information. Keeping track of what personal data the bot may capture is therefore a normal part of bot development. The Pre-logging script event can be used to anonymize personal data in a way that does not affect the functionality of the bot. With the Pre-logging script we can:

  • Remove data before it leaves Teneo Engine
  • Redact, remove, or encrypt sensitive data

On this page, we will show you how to anonymize personal data with the following steps:

  • Create a Global variable that stores the values to redact.
  • Add a Post-processing script to capture relevant data, so that it can later be redacted.
  • Add a Pre-logging script to redact the values and remove the variable.

Anonymize a variable

For this example, we will replace the value in Lib_sUserFirstName, which stores the user's first name. This variable is part of the Longberry Barista solution and comes with the Teneo Dialogue Resources.

  1. While inside your solution, click on the 'Solution' tab in the top left.
  2. Click on 'Globals'.
  3. Select the 'Scripts' tab at the top.
  4. Select 'Pre-logging' and click on the 'Edit' button in the top right. This will open a new window where you can write your script.
  5. Add the following line to the editing window: _.getDialogHistoryUtilities().replaceVariables(['Lib_sUserFirstName'], 'John Doe'). This automatically replaces the variable's value with 'John Doe' right before it is logged.

This does not affect the functionality of your solution, because the variable is replaced once the conversation is over and the task is done. The same is possible with Outputs, Metadata and much more. Please visit the reference documentation for more details.
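
As a rough illustration of the effect described above, the following plain Groovy sketch swaps the value of a named variable in a copy of the data and leaves everything else untouched. It runs outside Teneo: the map sessionVariables, the orderSize entry and the helper anonymizeVariables are hypothetical stand-ins, not part of the Teneo API, and the sketch makes no claim about how Teneo Engine performs the replacement internally.

```groovy
// Hypothetical stand-in for the variables of a finished session (not a Teneo API)
def sessionVariables = [Lib_sUserFirstName: 'Maria', orderSize: 'large']

// Hypothetical helper: returns a copy in which every listed variable
// carries the replacement value and all other variables keep their own
def anonymizeVariables = { Map variables, List names, String replacement ->
    variables.collectEntries { name, value ->
        [(name): (name in names ? replacement : value)]
    }
}

def logged = anonymizeVariables(sessionVariables, ['Lib_sUserFirstName'], 'John Doe')

// Only the listed variable changes in the copy that would be logged
assert logged == [Lib_sUserFirstName: 'John Doe', orderSize: 'large']
// The values the conversation worked with are left as they were
assert sessionVariables.Lib_sUserFirstName == 'Maria'
```

The point of the sketch is the ordering: the conversation works with the real value, and only what ends up in the log carries the placeholder.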

Anonymize personal data

Another scenario where Pre-logging scripts are powerful is when you want to anonymize personal data that users mention during a conversation. Teneo is good at recognizing different kinds of personal data thanks to its Named-entity recognizer and Part-of-speech annotation tags. In the following example we will redact the names mentioned while communicating with our bot, using the PERSON.NER annotation tag.

The Teneo documentation includes a list of the Part-of-speech (POS) and Named-entity recognition (NER) tags that come with Teneo.

Create a global variable

As a first step, we need to create a new Global variable that will store the values we want to redact:

  1. Open the 'SOLUTION' tab in the solution's window.
  2. Select 'Globals' in the purple bar on the left-hand side, and then select 'Variables'.
  3. Click 'Add'. A panel for specifying the new variable appears on the right-hand side.
  4. Name the variable toRedact, and set its initial value to an empty list: []. (Make sure to edit the "Value" field and not the "Description" field.)
  5. Hit 'Save'.

Add a Post-processing script

Next, add a Post-processing script that stores the relevant values in the global variable we created in the previous step.

  1. Select the 'Scripts' tab at the top.
  2. Select 'Post-processing'.
  3. Click on the 'Edit' button in the top right and add the following Groovy script to the editing window:
    // Find names that have been mentioned by the user
    _.inputAnnotations.getByName('PERSON.NER').each { person ->
        def sentence = _.sentences[person.sentenceIndex]
        def firstWord = sentence.words[person.wordIndices.first()]
        def lastWord = sentence.words[person.wordIndices.last()]
        def beginIndex = firstWord.beginIndex
        def endIndex = lastWord.endIndex
        // Select the items to redact
        def itemToRedact = [
            'historyIndex': _.dialogHistoryLength,
            'beginIndex': beginIndex,
            'endIndex': endIndex,
            'value': sentence.text.substring(beginIndex, endIndex)
        ]
        toRedact.add(itemToRedact)
    }
  4. Hit 'Save'.

You can capture a different kind of value by replacing PERSON.NER in line 2 of the script with any other Part-of-speech (POS) or Named-entity recognition (NER) tag.
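
To see how the script above turns an annotation into the text to redact, here is a small plain Groovy sketch that runs outside Teneo. The maps sentence and person are hypothetical stand-ins whose field names mirror the ones used in the script. Their values are made up for illustration, and the sketch assumes that beginIndex and endIndex are character offsets with an exclusive end, as the script's use of substring suggests.

```groovy
// Hypothetical stand-in for one entry of _.sentences (values are illustrative)
def sentence = [
    text : 'Hi, my name is John Doe',
    words: [
        [beginIndex: 0,  endIndex: 2],   // Hi
        [beginIndex: 4,  endIndex: 6],   // my
        [beginIndex: 7,  endIndex: 11],  // name
        [beginIndex: 12, endIndex: 14],  // is
        [beginIndex: 15, endIndex: 19],  // John
        [beginIndex: 20, endIndex: 23]   // Doe
    ]
]
def sentences = [sentence]

// Hypothetical stand-in for a PERSON.NER annotation spanning the words 'John' and 'Doe'
def person = [sentenceIndex: 0, wordIndices: [4, 5]]

// Same logic as the script: from the start of the first annotated word
// to the end of the last annotated word
def matched = sentences[person.sentenceIndex]
def beginIndex = matched.words[person.wordIndices.first()].beginIndex
def endIndex = matched.words[person.wordIndices.last()].endIndex
def value = matched.text.substring(beginIndex, endIndex)

assert value == 'John Doe'
```

Because the span runs from the first to the last annotated word, a multi-word name is stored as a single value rather than as separate words.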

Add a Pre-logging script

Finally, we will add a Pre-logging script to redact the values stored in the global variable.

  1. While inside the 'Scripts' tab, select 'Pre-logging' and click on the 'Edit' button in the top right. This will open a new window where you can insert your script.
  2. Add the following code into the editing window:
    // Define the values
    def values = [*toRedact.collect {it.value}, Lib_sUserFirstName]
    // Replace output
    _.dialogHistoryUtilities.replaceResponseText(values, '****')
    // Replace input 
    _.dialogHistoryUtilities.replaceUserInputText(values, '****')
  3. Hit 'Save'.

Make sure you return to Try Out and reload the engine before continuing to the next step.
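
The effect we expect from the Pre-logging script can be sketched in plain Groovy, outside Teneo. The helper redactText below is hypothetical and is not how Teneo implements replaceResponseText or replaceUserInputText. It only shows the intended outcome for a list of values built the same way as in the script, with firstName standing in for Lib_sUserFirstName.

```groovy
// Hypothetical helper, not part of the Teneo API: masks every occurrence
// of each value in the given text
def redactText = { String text, List values, String mask ->
    // Skip null or empty values: replacing '' would insert the mask between all characters
    def usable = values.findAll { it }
    // Longest values first, so 'John Doe' is masked before 'John'
    def ordered = usable.sort(false) { a, b -> b.length() <=> a.length() }
    ordered.inject(text) { result, value -> result.replace(value, mask) }
}

// Stand-ins for the global variable filled by the Post-processing script
// and for Lib_sUserFirstName
def toRedact = [[value: 'John Doe']]
def firstName = 'John'

// Same list construction as in the Pre-logging script
def values = [*toRedact.collect { it.value }, firstName]

assert redactText('Hi, my name is John Doe', values, '****') == 'Hi, my name is ****'
assert redactText('Nice to meet you, John!', values, '****') == 'Nice to meet you, ****!'
```

Filtering out empty values and masking the longest values first are design choices of this sketch for plain string replacement. Whether the Teneo methods need the same care is not covered here.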

Publish and test your bot

In order to see if our scripts work, you will need to publish your bot. Proceed as follows:

  1. Open the 'SOLUTION' tab in the solution's window.
  2. Select 'Publish' in the left sidebar.
  3. Click the 'Manage' button to open a drop-down with several options. Locate the 'Latest' section and choose 'Publish'.

You might see a warning saying "Publish to 'Default env' stopped with warnings."

This is nothing to worry about; the warning is shown when you publish your solution for the first time or when you have made certain global changes. To proceed, just check the checkbox 'Perform full application deployment on Try again' and click the 'Try again' button.

The publication may take a couple of minutes; the video below is sped up slightly. When it has finished, you'll receive a confirmation pop-up.

  1. Once published, click on the blue 'Open' icon. This will open the Teneo Web Chat in a new browser tab.

    Teneo Web Chat is also available in the Bots section in your team console. You need to publish the bot from Teneo Studio before using it.

  2. Click on the blue icon in the bottom right corner to open up the Teneo Web Chat window.
  3. Strike up a conversation with the bot, like:
    • Hi, my name is John Doe
    • Goodbye!
  4. Close the chat window to end the conversation.

Read the logs

Now return to Teneo Studio and open the Log Data Source to check that the name has been redacted.

  1. Open the 'SOLUTION' tab in the solution's window.
  2. Select 'Optimization' in the left sidebar.
  3. Navigate to 'Log Data' and open up your source by clicking on the 'Manage' button followed by 'Open'.

A new window should now open. This is the Log Data window. The next step is to open a new Session Viewer tab to retrieve the latest session.

  1. In the 'Session Viewer' section, click on 'New Session Viewer Tab'.
  2. Change the values to 'Start date' and 'Descending' to retrieve the most recent session.

You should see the following conversation. If not, please repeat the steps above.
[Image: the logged conversation with the personal data redacted]
