Language Generation Model

LanguageGenerationModel

The LanguageGenerationModel class is used for Language Generation.

To create a LanguageGenerationModel, you must specify a model_type and a model_name.

Note: Set model_name to None to train a language model from scratch.

  • model_type should be one of the model types from the supported models
  • model_name specifies the exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, or the path to a directory containing model files.

    Note: For a list of standard pre-trained models, see here.

    Note: For a list of community models, see here.

    You may use any of these models provided the model_type is supported.

from simpletransformers.language_generation import (
    LanguageGenerationModel,
)

model = LanguageGenerationModel(
    "gpt2", "gpt2"
)

Note: For more information on working with Simple Transformers models, please refer to the General Usage section.

Configuring a LanguageGenerationModel

LanguageGenerationModel has several task-specific configuration options.

| Argument | Type | Default | Description |
|---|---|---|---|
| do_sample | bool | False | If False, greedy decoding is used; otherwise sampling is used. Defaults to False as defined in configuration_utils.PretrainedConfig. |
| early_stopping | bool | True | If True, beam search stops once at least num_beams finished sentences exist per batch. |
| evaluate_generated_text | bool | False | Generate sequences for evaluation. |
| length_penalty | float | 2.0 | Exponential penalty applied to the sequence length. |
| max_length | int | 20 | The maximum length of the sequence to be generated. Between 0 and infinity. |
| max_steps | int | -1 | Maximum number of training steps. Overrides the effect of num_train_epochs. |
| num_beams | int | 1 | Number of beams for beam search. Must be at least 1; 1 means no beam search. |
| num_return_sequences | int | 1 | The number of sequences to generate. |
| repetition_penalty | float | 1.0 | The parameter for repetition penalty. Must be at least 1.0; 1.0 means no penalty. |
| top_k | float | None | Keep only the top-k tokens before sampling (<= 0: no filtering). |
| top_p | float | None | Nucleus (top-p) filtering before sampling (<= 0.0: no filtering). |
| prompt | str | "" | A prompt text for the model. |
| stop_token | str | None | Token at which text generation is stopped. |
| temperature | float | 1.0 | Sampling temperature. Defaults to 1.0; lowering this makes the sampling greedier. |
| padding_text | str | "" | Padding text for Transfo-XL and XLNet. |
| xlm_language | str | "" | Optional language when used with the XLM model. |
| config_name | str | None | Name of a pre-trained config or path to a directory containing a saved config. |
| tokenizer_name | str | None | Name of a pre-trained tokenizer or path to a directory containing a saved tokenizer. |

Note: For configuration options common to all Simple Transformers models, please refer to the Configuring a Simple Transformers Model section.
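As a concrete illustration, generation options can be collected in a plain dict and passed as the args parameter. This is a hypothetical configuration sketch: the keys match the table above, but the values shown are illustrative only.

```python
# Illustrative generation settings; keys correspond to the options table above.
generation_args = {
    "do_sample": True,          # sample instead of greedy decoding
    "max_length": 50,           # generate up to 50 tokens
    "top_k": 50,                # keep only the 50 most likely tokens
    "top_p": 0.95,              # nucleus (top-p) filtering
    "temperature": 0.7,         # below 1.0 makes sampling greedier
    "num_return_sequences": 3,  # return three samples per prompt
}

# The dict would then be supplied at construction time, e.g.:
# model = LanguageGenerationModel("gpt2", "gpt2", args=generation_args)
```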

Class LanguageGenerationModel

simpletransformers.language_generation.LanguageGenerationModel(self, model_type, model_name, args=None, use_cuda=True, cuda_device=-1, **kwargs)

Initializes a LanguageGenerationModel.

Parameters

  • model_type (str) - The type of model to use (model types)

  • model_name (str) - The exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, the path to a directory containing model files, or None to train a Language Model from scratch.

  • args (dict, optional) - Default args will be used if this parameter is not provided. If provided, it should be a dict containing the args that should be changed in the default args.

  • use_cuda (bool, optional) - Use GPU if available. Setting to False will force model to use CPU only. (See here)

  • cuda_device (int, optional) - Specific GPU that should be used. Will use the first available GPU by default. (See here)

  • kwargs (optional) - For providing proxies, force_download, resume_download, cache_dir and other options specific to the ‘from_pretrained’ implementation where this will be supplied. (See here)

Returns

  • None

Note: For configuration options common to all Simple Transformers models, please refer to the Configuring a Simple Transformers Model section.

Generating text with a LanguageGenerationModel

The generate() method is used to generate text.

model.generate()

simpletransformers.language_generation.LanguageGenerationModel.generate(self, prompt=None, args=None, verbose=True)

Generates text.

Parameters

  • prompt (str) - A prompt text for the model. If given, will override args.prompt

  • args (dict, optional) - A dict of configuration options for the LanguageGenerationModel. Any changes made will persist for the model.

  • verbose (bool, optional) - If verbose, generated text will be logged to the console. Default is True.

Returns

  • generated_sequences (list) - Sequences of text generated by the model.
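Putting the pieces together, a minimal end-to-end sketch might look like the following. This assumes simpletransformers is installed and that GPT-2 weights can be downloaded; the prompt text is illustrative, and use_cuda=False keeps the example CPU-only.

```python
from simpletransformers.language_generation import LanguageGenerationModel

# Load GPT-2 for generation; use_cuda=False forces CPU (illustrative only).
model = LanguageGenerationModel("gpt2", "gpt2", use_cuda=False)

# generate() returns a list of generated sequences (strings).
generated_sequences = model.generate(prompt="Despite the heavy rain,")
print(generated_sequences[0])
```

Because the prompt argument overrides args.prompt, the same model can be reused with different prompts without rebuilding the args dict.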
