Data

By downloading the dataset, you agree to the terms and conditions outlined here.

Notice: Hotel names, flight information, and vacation packages are fictitious and any resemblance to actual hotel names or vacation packages is coincidental. Use at your own risk.

Dataset Format

We provide the Frames dialogues in JSON format. Each dialogue has five main fields: user_id, wizard_id, id, userSurveyRating and turns. More details on these fields can be found in the paper.

The tables below cover some of the important fields in the Frames dataset. We encourage you to read the paper for a fuller description of all the fields.
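As a minimal sketch of reading this layout, the snippet below parses a small hand-written sample that mimics the identifier fields and turns described above. The sample values and the helper name `summarize_dialogue` are illustrative assumptions and are not taken from the dataset. It also assumes that the file holds a list of dialogues. With the real file you would pass its contents to `json.loads` in place of `SAMPLE`.

```python
import json

# Hand-written sample; every value here is invented for illustration.
SAMPLE = """
[
  {
    "id": "dialogue-0",
    "user_id": "user-0",
    "wizard_id": "wizard-0",
    "turns": [
      {"author": "user", "text": "I would like to book a trip."},
      {"author": "wizard", "text": "Consider it done. Have a great trip!"}
    ]
  }
]
"""


def summarize_dialogue(dialogue):
    """Return the identifiers and turn count of one dialogue (hypothetical helper)."""
    return {
        "id": dialogue["id"],
        "user_id": dialogue["user_id"],
        "wizard_id": dialogue["wizard_id"],
        "num_turns": len(dialogue["turns"]),
    }


# Assumption: the top-level JSON value is a list of dialogue objects.
dialogues = json.loads(SAMPLE)
summaries = [summarize_dialogue(d) for d in dialogues]
print(summaries[0])
```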

Global Properties
Key Name Description
id Refers to a unique identifier for the dialogue.
user_id Refers to a unique identifier for the user taking part in the dialogue.
wizard_id Refers to a unique identifier for the wizard taking part in the dialogue.
Labels
Key Name Description
userSurveyRating A value that represents the user's satisfaction with the wizard's service, ranging from 1 (complete dissatisfaction) to 5 (complete satisfaction).
wizardSurveyTaskSuccessful A boolean which is true if the wizard thinks at the end of the dialogue that the user's goal was achieved.
Turns
Key Name Description
author The author of the message in a dialogue, either "user" or "wizard".
text The exact text that the author of the turn wrote. E.g. "text": "Consider it done. Have a great trip!".
labels A JSON object with three keys: active_frame, acts, and acts_without_refs. The active_frame is the id of the currently active frame. The acts are the dialogue acts for the current utterance. Each act has a name and arguments (args). The name is the name of the dialogue act, for instance offer or inform. The args contain the slot types (key) and slot values (val), for instance budget=$2000. Slot values are optional. An act contains a ref tag whenever the user or wizard refers to a past frame. The acts_without_refs are similar to the acts except that they do not have these ref tags. We define the frame tracking task as the task that takes the acts_without_refs as input and outputs the acts.
timestamp Unix timestamp denoting the time at which the current turn occurred.
frames List of frames up to the current turn. Each frame has the following keys: frame_id, frame_parent_id, requests, binary_questions, compare_requests, and info.
db Occurs only in a wizard's turn. It is a list of search queries made by the wizard, with the associated list of search results.

E.g. "db": {"search": [{"ORIGIN_CITY": "Montreal"}], "result": []}
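The acts structure described in the labels row can be walked with a few lines of code. The sketch below is only an illustration: it assumes that args is a list of objects carrying a key and an optional val, and the helper name `flatten_acts` and the sample turn are hypothetical.

```python
def flatten_acts(turn):
    """Yield (act_name, slot_type, slot_value) triples for one turn.

    Assumption: each act looks like {"name": ..., "args": [{"key": ..., "val": ...}]}.
    """
    for act in turn["labels"]["acts"]:
        args = act.get("args", [])
        if not args:
            # An act without arguments still yields one triple.
            yield (act["name"], None, None)
        for arg in args:
            # Slot values are optional, so "val" may be missing.
            yield (act["name"], arg["key"], arg.get("val"))


# Hand-written sample turn; the values are invented for illustration.
turn = {
    "author": "user",
    "text": "My budget is $2000.",
    "labels": {
        "active_frame": 1,
        "acts": [
            {"name": "inform", "args": [{"key": "budget", "val": "$2000"}]},
            {"name": "offer", "args": [{"key": "price"}]},
        ],
    },
}

triples = list(flatten_acts(turn))
print(triples)
```

Returning flat triples keeps the optional slot value explicit (as `None`) instead of dropping the slot, which makes missing values easy to spot downstream.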

Frames
Key Name Description
frame_id Id of the frame.
frame_parent_id Id of the parent frame.
requests, binary_questions, compare_requests Requests are questions related to one frame, for instance “what is the price of this package?”. Compare_requests concern several frames. For example, the user might ask to compare different packages: “What is the guest rating of these two hotels?”. Binary_questions are questions with both a slot type and a slot value. These are special cases of requests and compare_requests, for instance “are both hotels 3.5 stars?”.
info The info contains all the constraints set by the user or the wizard in the frame. These constraints are expressed as slot types which have a value. Note that each slot can have multiple values, which accumulate as long as the frame does not change. For example, the price can be both "1000 USD" and "cheapest". There are two additional fields to keep track of specific aspects of the dialogue:

REJECTED: a boolean value expressing whether the user negated or affirmed an offer made by the wizard.

MOREINFO: a boolean value expressing whether the user wants to know more about the frame in question.
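Because every frame carries a frame_id and a frame_parent_id, the frames list can be read as a tree. The sketch below follows the parent links from one frame back to its root. It is an illustration under stated assumptions: the helper name `frame_ancestry` and the sample frames are hypothetical, and a root frame is assumed to have a frame_parent_id of null (`None` in Python).

```python
def frame_ancestry(frames, frame_id):
    """Return the ids from frame_id up to its root, following frame_parent_id."""
    by_id = {frame["frame_id"]: frame for frame in frames}
    chain = []
    current = frame_id
    # Stop at a missing parent, an unknown id, or a repeated id (cycle guard).
    while current is not None and current in by_id and current not in chain:
        chain.append(current)
        current = by_id[current]["frame_parent_id"]
    return chain


# Hand-written sample; only the two id keys are shown, other keys are omitted.
frames = [
    {"frame_id": 1, "frame_parent_id": None},
    {"frame_id": 2, "frame_parent_id": 1},
    {"frame_id": 3, "frame_parent_id": 1},
]

ancestry = frame_ancestry(frames, 3)
print(ancestry)
```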