Conversational AI Model

ConvAIModel

The ConvAIModel class is used for Conversational AI.

To create a ConvAIModel, you must specify a model_type and a model_name.

  • model_type should be one of the model types from the supported models (e.g. gpt2, gpt)
  • model_name specifies the exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, or the path to a directory containing model files.

    Note: For a list of standard pre-trained models, see here.

    Note: For a list of community models, see here.

    You may use any of these models provided the model_type is supported.

Tip: A GPT model trained for conversation is available from Hugging Face here. You can use it by downloading the model and extracting it to gpt_personachat_cache.

from simpletransformers.conv_ai import ConvAIModel


model = ConvAIModel(
    "gpt", "gpt_personachat_cache"
)

Note: For more information on working with Simple Transformers models, please refer to the General Usage section.

Configuring a ConvAIModel

ConvAIModel has several task-specific configuration options.

| Argument | Type | Default | Description |
|---|---|---|---|
| num_candidates | int | 2 | Number of candidate responses used during training |
| personality_permutations | int | 1 | Number of permutations of the personality sentences |
| max_history | int | 2 | Number of previous exchanges to keep in the history |
| lm_coef | float | 2.0 | Language model loss coefficient |
| mc_coef | float | 1.0 | Multiple-choice loss coefficient |
| do_sample | bool | True | If set to False, greedy decoding is used. Otherwise, sampling is used. |
| max_length | int | 20 | The maximum length of the sequence to be generated |
| min_length | int | 1 | The minimum length of the sequence to be generated |
| temperature | float | 0.7 | Sampling softmax temperature |
| top_k | int | 0 | Filter top-k tokens before sampling (<=0: no filtering) |
| top_p | float | 0.9 | Nucleus filtering (top-p) before sampling (<=0.0: no filtering) |
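The sampling options (`do_sample`, `temperature`, `top_k`, `top_p`) control how the next token is chosen during generation. As an illustrative sketch (not the library's internal implementation), this is how top-k and nucleus (top-p) filtering restrict the candidate distribution before sampling:

```python
import numpy as np

def filter_logits(logits, top_k=0, top_p=0.0):
    """Sketch of top-k / nucleus filtering. Filtered positions are set
    to -inf so that softmax assigns them zero probability mass."""
    logits = np.array(logits, dtype=float)
    if top_k > 0:
        # Keep only the k largest logits
        kth = np.sort(logits)[-top_k]
        logits[logits < kth] = -np.inf
    if top_p > 0.0:
        # Sort descending and find the nucleus: the smallest set of
        # tokens whose cumulative probability exceeds top_p
        order = np.argsort(logits)[::-1]
        probs = np.exp(logits[order] - np.max(logits))
        probs /= probs.sum()
        cumulative = np.cumsum(probs)
        # Keep tokens up to and including the one that crosses top_p
        cutoff = np.concatenate(([False], cumulative[:-1] > top_p))
        logits[order[cutoff]] = -np.inf
    return logits

# With top_k=2, only the two largest logits survive
filtered = filter_logits([2.0, 1.0, 0.5, -1.0], top_k=2)
```

With `top_k=0` and `top_p=0.0` (the defaults for `top_k`), no filtering is applied and sampling draws from the full softmax distribution, scaled by `temperature`.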
from simpletransformers.conv_ai import ConvAIModel, ConvAIArgs


model_args = ConvAIArgs()
model_args.max_history = 5

model = ConvAIModel(
    "gpt",
    "gpt_personachat_cache",
    args=model_args
)

Note: For configuration options common to all Simple Transformers models, please refer to the Configuring a Simple Transformers Model section.

Class ConvAIModel

simpletransformers.conv_ai.ConvAIModel(self, model_type, model_name, args=None, use_cuda=True, cuda_device=-1, **kwargs)

Initializes a ConvAIModel.

Parameters

  • model_type (str) - The type of model to use (model types)

  • model_name (str) - The exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, or the path to a directory containing model files.

  • args (dict, optional) - Default args will be used if this parameter is not provided. If provided, it should be a dict containing the args that should be changed in the default args or a ConvAIArgs object.

  • use_cuda (bool, optional) - Use GPU if available. Setting to False will force model to use CPU only. (See here)

  • cuda_device (int, optional) - Specific GPU that should be used. Will use the first available GPU by default. (See here)

  • kwargs (optional) - For providing proxies, force_download, resume_download, cache_dir and other options specific to the ‘from_pretrained’ implementation where this will be supplied. (See here)

Returns

  • None

Note: For configuration options common to all Simple Transformers models, please refer to the Configuring a Simple Transformers Model section.

Training a ConvAIModel

The train_model() method is used to train the model.

model.train_model(train_file)

simpletransformers.conv_ai.ConvAIModel.train_model(self, train_file, output_dir=None, show_running_loss=True, args=None, eval_file=None, verbose=True, **kwargs)

Trains the model using ‘train_file’

Parameters

  • train_file - Path to a JSON file containing the training data. If not given, the train dataset from PERSONA-CHAT will be used. The model will be trained on this data. Refer to the Conversational AI Data Formats section for the correct formats.

  • output_dir (str, optional) - The directory where model files will be saved. If not given, self.args['output_dir'] will be used.

  • show_running_loss (bool, optional) - If True, the running loss (training loss at current step) will be logged to the console.

  • args (dict, optional) - A dict of configuration options for the ConvAIModel. Any changes made will persist for the model.

  • eval_file (optional) - Evaluation data (same format as train_file) against which evaluation will be performed when evaluate_during_training is enabled. If not given when evaluate_during_training is enabled, the evaluation data from PERSONA-CHAT will be used.

  • kwargs (optional) - Additional metrics that should be calculated. Pass in the metrics as keyword arguments (name of metric: function to calculate metric). Refer to the additional metrics section. E.g. f1=sklearn.metrics.f1_score. A metric function should take in two parameters. The first parameter will be the true labels, and the second parameter will be the predictions.

Returns

  • None

Note: For more details on training models with Simple Transformers, please refer to the Tips and Tricks section.
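As a minimal sketch of what a `train_file` might look like (the authoritative schema is in the Conversational AI Data Formats section), a PERSONA-CHAT-style file is a JSON list of dialogues, each with a personality and a list of utterances whose last candidate is the ground-truth reply:

```python
import json

# Hypothetical two-line dialogue in the PERSONA-CHAT style.
# Each utterance holds the conversation history so far and a list of
# candidate replies; the last candidate is the true response.
train_data = [
    {
        "personality": ["My name is Geralt.", "I hunt monsters."],
        "utterances": [
            {
                "history": ["Hello, what's your name?"],
                "candidates": [
                    "I like pancakes.",    # distractor
                    "My name is Geralt.",  # ground-truth reply (last)
                ],
            }
        ],
    }
]

with open("train.json", "w") as f:
    json.dump(train_data, f)

# model.train_model("train.json")  # requires a downloaded model, as above
```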

Evaluating a ConvAIModel

The eval_model() method is used to evaluate the model.

The following metrics will be calculated by default:

  • language_model_loss
  • f1_score
result = model.eval_model(eval_file)

simpletransformers.conv_ai.ConvAIModel.eval_model(self, eval_file, output_dir=None, verbose=True, silent=False, **kwargs)

Evaluates the model using ‘eval_file’

Parameters

  • eval_file - Path to JSON file containing evaluation data OR list of Python dicts in the correct format. The model will be evaluated on this data. Refer to the Conversational AI Data Formats section for the correct formats.

  • output_dir (str, optional) - The directory where model files will be saved. If not given, self.args['output_dir'] will be used.

  • verbose (bool, optional) - If verbose, results will be printed to the console on completion of evaluation.

  • silent (bool, optional) - If silent, tqdm progress bars will be hidden.

  • kwargs (optional) - Additional metrics that should be calculated. Pass in the metrics as keyword arguments (name of metric: function to calculate metric). Refer to the additional metrics section. E.g. f1=sklearn.metrics.f1_score. A metric function should take in two parameters. The first parameter will be the true labels, and the second parameter will be the predictions.

Returns

  • result (dict) - Dictionary containing evaluation results. (f1_score, language_model_loss)

Note: For more details on evaluating models with Simple Transformers, please refer to the Tips and Tricks section.
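As a sketch of passing an additional metric, the keyword argument's name becomes the key under which the score appears in `result`, and the function receives the true labels followed by the predictions. Here `exact_match` is a hypothetical metric defined for illustration:

```python
# A custom metric must accept (true_labels, predictions)
def exact_match(true_labels, predictions):
    """Fraction of positions where the prediction equals the label."""
    matches = sum(t == p for t, p in zip(true_labels, predictions))
    return matches / len(true_labels)

# result = model.eval_model(eval_file, exact_match=exact_match)
# result["exact_match"] would then hold the score (the model call is
# commented out here because it requires downloaded weights)
```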

Interacting with a ConvAIModel

Two methods are available for interacting with a ConvAIModel.

  1. interact() - Used to start an interactive terminal session with the model
  2. interact_single() - Used to communicate with the model through single messages, i.e. by providing the current message and the history of the conversation.
personality = [
    "My name is Geralt.",
    "I hunt monsters.",
    "I say hmm a lot.",
]

# Interactive session (looped)
model.interact(
    personality=personality
)

# Single interaction
history = [
    "Hello, what's your name?",
    "Geralt",
    "What do you do for a living?",
    "I hunt monsters",
]

response, history = model.interact_single(
    "Is it dangerous?",
    history,
    personality=personality
)

simpletransformers.conv_ai.ConvAIModel.interact(self, personality=None)

Interact with a model in the terminal.

Parameters

  • personality (list, optional): A list of sentences that the model will use to build a personality. If not given, a random personality from PERSONA-CHAT will be picked.

Returns

  • None

simpletransformers.conv_ai.ConvAIModel.interact_single(self, message, history, personality=None, encode_history=True)

Gets a response from the model based on the conversation history and the current message.

Parameters

  • message (str): A message to be sent to the model.

  • history (list): A list of sentences that represents the interaction history between the model and the user.

  • personality (list, optional): A list of sentences that the model will use to build a personality.

  • encode_history (bool, optional): If True, the history should be in text (string) form. The history will be tokenized and encoded.

Returns

  • out_text (str) - The response generated by the model based on the personality, history and message.

  • history (list) - The updated history of the conversation. If encode_history is True, this will be in text form. If not, it will be in encoded form.
