Dialog Tester

The Dialog Tester is a regression testing tool that enables automatic testing of user dialogs against a published virtual assistant.
This document describes how to use Dialog Tester to test entire simulated dialogs in the maintenance phase of a Teneo solution. While Teneo Studio's integrated Auto-test is a tool meant mainly for intent recognition testing on Triggers, Dialog Tester allows for the testing of the following items:

  • Dialogs: Dialog Testers are capable of testing entire conversations involving multiple user inputs in a row or any multi-step process conducted by the virtual assistant.
  • Variables: With Dialog Testers, an important use case is checking the extraction of variable values, either from the user input or from API calls, or testing the effect of parameter values.
  • JSON structure: Testing of the correct JSON structure sent from Teneo to a user interface or a middleware, given a particular user input. These can become fairly intricate data structures and are not currently testable within Studio.
  • Complex flow-to-subflow behavior: This includes the effects of inter-flow variable passing. These variables are tested indirectly via outputs in Dialog Testers but are not possible to test within Studio.
  • Recursive flow structures, e.g. within a bundle flow. In Studio, it is not possible to check multiple transitions at once to see whether users loop back to earlier nodes in the flow.
  • Regression: Dialog Testers should be used as part of a regression testing strategy to ensure that solution quality is maintained after any changes, and always before publication. Dialog Tester test results provide a performance baseline below which solution performance should not drop.
  • Intelligent features within dialogs: Dialog Testers can test other dialog features that cannot be tested by Teneo Studio Auto-test, including short- and long-term memory, understanding of context, disambiguation, and use of follow-up questions to gather further required information.

Prerequisites

You will need to know the URL of the published Teneo solution that you want to test. If you haven't published your solution yet or don't know the URL of your bot, see Publish your bot to learn more.

Preparing the Teneo solution for Dialog testing

Create a Global Variable

The Teneo solution must declare a global variable called zOutputIds. This can be done with the following steps:

  1. In your solution, click the 'Solution' tab and select 'Globals' in the navigation bar.
  2. Select 'Globals' in the left-hand sidebar, and then select 'Variables'.
  3. Click 'Add'. A panel for specifying the new variable appears on the right-hand side.
  4. Name the variable: zOutputIds, and set its initial value to the empty string: "". (Make sure to edit the "Value" field and not the "Description" field.)
  5. Hit 'Save' in the upper right corner.

Add a Post Processing Script

One of the functions of Dialog Tester is to compare the expected ID of an output node with the ID of the output node that was actually received.
In order for ID comparison to work, output IDs must be sent as output parameters from the Teneo solution. To achieve this, the solution needs to include the following lines in the Post-Processing global script:

// Collect the IDs of all output nodes that were visited (and not skipped) in this transaction
def list = _.processingPath.findAll{ it.type == "output" && !it.skipped }.vertexId
zOutputIds = list.toString()
// Strip the enclosing brackets from the list's string representation, e.g. "[id1, id2]" -> "id1, id2"
if (zOutputIds) zOutputIds = zOutputIds.substring(1, zOutputIds.length() - 1)
// Expose the IDs both as an output parameter and as a session log variable
_.putOutputParameter('OUTPUT_NODE_IDS', zOutputIds)
_.putLogVariable('OUTPUT_NODE_IDS', zOutputIds)

Preparing the input test file for Dialog Tester

Input file structure

The input file for the Dialog Tester is an Excel file with the following structure:

excel input picture

It is important to note that every line in the Excel file that has a value in the ‘Session’ column will be treated as the start of a new session. To construct a longer dialog, the ‘Session’ column should be left empty for all inputs except the first one in the dialog.

Each of the columns shown above is explained as follows:

| Column Header | Description | Allowed Values |
|---|---|---|
| Session | Any content here indicates the beginning of a new session; leaving this cell empty indicates a user input that continues the dialog session from the previous input. | Numbers, text, or empty |
| Question | The actual user input that is sent to the virtual assistant. | Text |
| Link | The link identifier when simulating a click action. Leave it empty if you're using Teneo Web Chat, as it is not in use there. | Link identifier or empty |
| VA answer | The expected answer from the virtual assistant for the user input shown in the Question column. | Text, empty, or syntax operators |
| Output id path | The expected output ID of the virtual assistant answer. Used when the test should compare output IDs rather than the actual answer text, i.e. when the same output node can generate multiple dynamic text values. | One or more output node IDs (separated by comma and space), or empty |
| Compare | Tells Dialog Tester how to evaluate the virtual assistant output. ids: compares the answer ID from the virtual assistant with a list of output IDs. outputs: compares the expected answer text with the text returned by the virtual assistant; if any character in the returned value differs, the comparison and the test case fail. mixed: a combination of the two; Dialog Tester first compares the output IDs and, if they match, then compares the received answer with the expected answer. | ids / outputs / mixed |
| Ignore Case | Tells Dialog Tester whether to ignore case when comparing texts. Leaving this column empty means case is not ignored. | yes / y / no / n |
| Hyperlink | The URL when simulating a hyperlink click. Leave it empty if this is not the case. | Hyperlink URL or empty |
| Parameters | Tells Dialog Tester to append one or more parameters to the call. | Parameters with values; multiple parameters separated by & |
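The session rule above can be illustrated with a small sketch (column values and rows are hypothetical): rows are grouped into dialogs, and a new dialog starts whenever the Session cell is non-empty.

```python
def group_into_sessions(rows):
    """Group input-file rows into dialog sessions.

    Each row is (session, question); a non-empty session cell starts
    a new dialog, an empty one continues the previous dialog.
    """
    sessions = []
    for session, question in rows:
        if session:  # any content in the Session column starts a new session
            sessions.append([])
        sessions[-1].append(question)
    return sessions

rows = [
    ("1", "Hello"),                 # new session
    ("",  "What are your hours?"),  # continues the first session
    ("2", "Do you serve tea?"),     # new session
]
print(group_into_sessions(rows))
# → [['Hello', 'What are your hours?'], ['Do you serve tea?']]
```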

Output ID comparison

Output IDs received from the published solution are compared with the expected IDs by literal string matching. For this reason, the comparison is sensitive to white spaces (at the start and end of strings, or between concatenated IDs). The expected IDs must match the received IDs exactly for the Dialog Tester to return a match.

Expected Ids: a26f3fc5-52cc-4f36-ae81-d8653ffab400, f068cebb-31da-42a7-8c2d-14b99d7bdcca
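Because matching is literal, a single stray character makes the comparison fail. A minimal sketch (IDs taken from the example above):

```python
# Expected and received ID strings, compared by literal string matching
expected = "a26f3fc5-52cc-4f36-ae81-d8653ffab400, f068cebb-31da-42a7-8c2d-14b99d7bdcca"
received = "a26f3fc5-52cc-4f36-ae81-d8653ffab400, f068cebb-31da-42a7-8c2d-14b99d7bdcca"

print(expected == received)                      # True: exact match
print(expected == received + " ")                # False: a trailing space breaks it
print(expected == received.replace(", ", ","))   # False: separator must be comma and space
```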

Text comparison syntax

Text comparison can be done very strictly. Consider the following example user input:

Example Input: How much would it cost me to receive a text from Spain?
Received Answer: Let's check out roaming costs. I need to know what kind of customer you are. Are you Bill pay or Pay as you go?
Expected Answer: Let's check out roaming costs. I need to know what kind of customer you are. Are you Bill pay or Pay as you go?

In this case, the received and expected answers are identical, which would give a correct result. Note that invisible characters, such as whitespace, must also match. Therefore, ensure there are no leading or trailing whitespaces present in the expected answer column.

Text comparison can also be done in a less strict manner, by using the following three operators:

  • Wildcard ( * ): The asterisk behaves as a wildcard and can replace any character or series of characters.
  • Logical OR ( | ): The pipe operator allows the user to specify two or more variants of the expected virtual assistant answer text to compare with the actual answer text received. Each variant must completely match the received answer, which is why the wildcard '*' is often used together with this operator.
  • Logical AND ( && ): This operator allows the user to specify two or more expected parts of the answer text. All of these parts must be present in the answer received from the virtual assistant, in any order, but they do not have to completely match the received answer.
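The rules above can be approximated in a short sketch (an illustration of the operator semantics as described, not Dialog Tester's actual implementation): '*' maps to a regex wildcard, '|' separates full-match alternatives, and '&&' separates parts that must all occur somewhere in the answer.

```python
import re

def matches(expected, received):
    """Approximate Dialog Tester's relaxed text comparison (illustrative only)."""
    if "&&" in expected:
        # AND: every part must occur somewhere in the answer, in any order
        return all(part.strip() in received for part in expected.split("&&"))
    # OR: any alternative may match; each must match the whole answer
    for alternative in expected.split("|"):
        # '*' stands for any series of characters
        pattern = ".*".join(re.escape(p) for p in alternative.strip().split("*"))
        if re.fullmatch(pattern, received, re.DOTALL):
            return True
    return False

answer = "Let's check out roaming costs. Are you Bill pay or Pay as you go?"
print(matches("Let's check out * costs.*", answer))   # True (wildcard)
print(matches("Hello!|Let's check out *", answer))    # True (OR, second alternative)
print(matches("roaming costs && Bill pay", answer))   # True (AND, any order)
```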

Execution Options

The Dialog Tester jar file accepts three mandatory and three optional command parameters. These are outlined below:

| Parameter | Flag | Mandatory/Optional | Value |
|---|---|---|---|
| Input relative path | -x | Mandatory | folder/DT_input.xlsx |
| Server name | -u | Mandatory | Engine URL of the published chatbot |
| Output relative path | -h | Mandatory | folder/DT_output.html |
| Output format | -o | Optional | html or text; html by default |
| Range of inputs to be tested | -g | Optional | from:to; by default all user inputs in the input file are run |
| Pause between sending each input | -p | Optional | Pause between each user input, in milliseconds; no pause by default |

Sample command to run the jar:

java -jar dialogue-tester-3.5.2-jar-with-dependencies.jar -x "DT_[project_name]s.xlsx" -u "https://longberry-xxxxx.bots.teneo.ai/longberry_baristas_5jz1h5hxjb3j0931gx7qxwbns2" -h "DT_[project_name].html" -o html -g 1:10 -p 1000

This command runs Dialog Tester (v3.5.2) against the chatbot with the Engine URL given after the -u flag, using the 1st to the 10th input in the input file DT_[project_name]s.xlsx in the same folder, and saves the result as DT_[project_name].html. There is a pause of 1 second between each input.

If you are interested in testing the whole input file, omit the -g 1:10 -p 1000 options and run the following command:

java -jar dialogue-tester-3.5.2-jar-with-dependencies.jar -x "DT_[project_name]s.xlsx" -u "https://longberry-xxxxx.bots.teneo.ai/longberry_baristas_5jz1h5hxjb3j0931gx7qxwbns2" -h "DT_[project_name].html" -o html

Engine URL of the chatbot

Teneo Developers users can easily find the Engine URL of the chatbot from the bots panel of the Team Console:

Free trial bots page

For users who did not access Teneo Studio via Teneo Developers, the target URL from the Publish panel of your solution is usually the Engine URL of the chatbot. If this URL does not work, please contact Artificial Solutions. We can help you to figure out the correct Engine URL.

This is what the published bot looks like:
Published page

Run Dialog Tester by batch files

As the command to run the Dialog Tester jar file is quite long, a better practice is to run it via batch files. This allows you to run several Dialog Tester test sets with one click, and to move old results to a dedicated folder so that they are not overwritten by new Dialog Tester outputs.

The file and folder structure

To maintain only one set of batch files and one set of Dialog Tester Excel input files, while also allowing these to be run against separate servers, we need a folder structure that lets us distinguish the Dialog Tester output files based on the test server (DEV or QA).

The following file and folder structure is assumed (indentations signal folder hierarchies, / means folder name):

  • /Dialogue tester
    • Dialogue Tester tool jar file
    • Individual Dialogue Tester test Excel files
    • Individual Dialogue Tester batch files
    • Main batch file
    • /dev
      • Dialogue Tester HTML output files (DEV server)
      • old Dialogue Tester HTML files (DEV server)
      • /z_old
        • even older Dialogue Tester HTML files (DEV server)
    • /qa
      • Dialogue Tester HTML output files (QA server)
      • old Dialogue Tester HTML files (QA server)
      • /z_old
        • even older Dialogue Tester HTML files (QA server)

With this folder structure in place, we can parametrize the batch files to point to either the DEV or QA server when running the Dialog Tester and to put the corresponding output files in either the /dev or /qa folders. The old Dialog Tester HTML files are meant to be kept for reference purposes, but can be skipped depending on project needs (as can the even older files in the /z_old subfolder).
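The folder structure above can be created with a few lines (a sketch; folder names are taken from the listing and the base path is run from the parent folder):

```python
from pathlib import Path

base = Path("Dialogue tester")
for env in ("dev", "qa"):
    # each environment folder gets a z_old subfolder for archived HTML output
    (base / env / "z_old").mkdir(parents=True, exist_ok=True)

# list the created archive folders
print(sorted(str(p) for p in base.rglob("z_old")))
```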

Individual batch files

The individual batch files can be located in the /Dialogue tester folder (the same folder as the Dialog Tester jar file and the Excel input files). Each individual batch file is responsible for running the Dialog Tester jar file with one specific Excel input file, and the names of the two should be identical except for the file extension. For instance, the batch file DT_Project.bat runs the Dialog Tester tool with DT_Project.xlsx.

The individual batch files are set up so that if they are simply double-clicked, they run against the DEV server by default. This makes it easy to run a single batch file for testing a specific area against the DEV server. It is achieved by using a variable for the environment part (dev or qa) of the solution URL:

  • https://solution-dev.server_name.com/bot_name/
  • https://solution-qa.server_name.com/bot_name/

The individual batch files can run with an optional input parameter (if the parameter is missing, the default is to run against the DEV server, as mentioned above). It is assumed that this parameter value corresponds to the server, as shown above.

The two examples below will run the same batch file, with the same Excel input file, against the DEV and the QA environments, respectively. The files are run from the Windows command line.

  • DT_Project.bat dev

  • DT_Project.bat qa

A simple individual batch file, in which the server type (DEV or QA) corresponds to the folder names, carries out the numbered steps below. (In batch files, lines beginning with :: are comments.)

  1. The first line sets the variable server to the parameter value passed to the file. This is empty if no parameter value was present.
  2. If the server variable is empty, default to DEV (note the use of % to refer to variables after they are created).
  3. Move any old HTML output file for this test set to the folder z_old.
  4. Rename any remaining HTML output file to _old.html.
  5. Run the Dialog Tester tool against the server from the server variable and put the output HTML files in the present folder.
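As a cross-platform illustration of these five steps (a sketch, not the project's actual batch file; the test-set name and URL pattern are taken from the examples above), the same logic could look like:

```python
import shutil
from pathlib import Path

def build_dialog_tester_run(name, server=""):
    """Mirror the five batch-file steps for one test set."""
    server = server or "dev"                      # steps 1-2: default to DEV
    out_dir = Path(server)
    html = out_dir / f"{name}.html"
    old = out_dir / f"{name}_old.html"
    if old.exists():                              # step 3: archive the old output
        (out_dir / "z_old").mkdir(parents=True, exist_ok=True)
        shutil.move(str(old), str(out_dir / "z_old" / old.name))
    if html.exists():                             # step 4: rename current output to _old
        html.rename(old)
    # step 5: the command that runs the Dialog Tester jar against the chosen server
    return ["java", "-jar", "dialogue-tester-3.5.2-jar-with-dependencies.jar",
            "-x", f"{name}.xlsx",
            "-u", f"https://solution-{server}.server_name.com/bot_name/",
            "-h", str(html)]

print(build_dialog_tester_run("DT_Project", "qa")[6])
# → https://solution-qa.server_name.com/bot_name/
```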

The main batch file

When all the individual batch files are run for testing, the normal procedure is to run them via a main batch file that calls all the individual batch files (or a subset of them, depending on circumstances). The main batch file serves three purposes:

  1. Prompt the user for which server to use (DEV/QA).
  2. Ensure that the server was correctly entered and save it for subsequent operations.
  3. Call each of the individual batch files included in the main batch file with the server as a parameter value.

Check-list before running the batch file

The main batch file contains the logic needed to carry out these steps. If the user enters any value other than DEV or QA, it prompts for the server again. Its final lines contain the calls to the individual batch files.

Each individual batch file that needs to be called by the main batch file should be listed at the end of the file in this format: call DT_name.bat %server%. The call command ensures that control returns to the main batch file after each individual batch file finishes, so the listed tests run one after the other. The %server% code ensures that the value the user entered is passed on to each individual batch file.
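The three steps of the main batch file can be sketched as follows (an illustration only; the test-set names are hypothetical, and a list of answers stands in for repeated user prompts):

```python
def choose_server(answers):
    """Keep prompting until the user enters dev or qa (case-insensitive)."""
    for answer in answers:  # stand-in for repeated input() prompts
        server = answer.strip().lower()
        if server in ("dev", "qa"):
            return server
    raise ValueError("no valid server entered")

def main_batch(answers, test_sets=("DT_Project", "DT_Smalltalk")):
    server = choose_server(answers)
    # each entry mirrors a line of the form: call DT_name.bat %server%
    return [f"call {name}.bat {server}" for name in test_sets]

print(main_batch(["prod", "QA"]))
# → ['call DT_Project.bat qa', 'call DT_Smalltalk.bat qa']
```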

The following elements should be checked as the Dialog Tester batch files are prepared:

  • the folder structure in the batch files corresponds to the actual folder structure
  • filenames referred to in the batch files correspond to the actual filenames
  • the Dialog Tester jar file version in the batch files corresponds to the one being used in the project
  • the URL in the batch file pointing to the published solution is correct

Dialog Tester Output

The picture below shows the resulting HTML file for a test scenario with four user inputs.

Dialogue tester output on excel

Dialog Tester reports its results in four tables. The first table contains the date and time when the test was run, the virtual assistant URL against which it was run, and the name of the input file.

The second table contains the actual results, one row per user input. The background color highlights the text columns, the output ID columns, or both, depending on whether the user chose to compare the virtual assistant answer texts, their IDs, or a mixture of the two.

  • Green highlighting indicates that an answer was correct (the virtual assistant answer texts and/or the output IDs met the expected texts and/or IDs).
  • Red is used for incorrect answers.
  • Orange and yellow are used where a test could not be evaluated as correct or incorrect; usually, this is because an expected answer was not supplied in the input file.

A third table reports the overall results for the test. In this example, all user inputs received the expected answer from the virtual assistant. The percentage of correct answers is calculated as the share of Correct (green) inputs in relation to the total number of inputs.

A fourth table reports any delays in the solution response time. In the example above, the virtual assistant responses were returned with no timeouts or delays.

Download

  • The Dialog Tester can be found and downloaded here.
