T5 Model

`T5Model`

The T5Model class is used for any NLP task performed with a T5 model or an mT5 model.

To create a T5Model, you must specify the model_type and model_name.

model_type should be one of the supported model types (t5 or mt5).
model_name specifies the exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, or the path to a directory containing model files.

Note: For a list of standard pre-trained models, see here.

Note: For a list of community models, see here.

You may use any of these models, provided it is a T5 or mT5 model.

from simpletransformers.t5 import T5Model


model = T5Model(
    "t5",
    "t5-base"
)

Note: For more information on working with Simple Transformers models, please refer to the General Usage section.

Configuring a `T5Model`

T5Model has the following task-specific configuration options.

Argument	Type	Default	Description
dataset_class	Dataset	None	A custom dataset class to use. (Subclass of PyTorch Dataset)
do_sample	bool	False	If set to False, greedy decoding is used. Otherwise, sampling is used. Defaults to False as defined in configuration_utils.PretrainedConfig.
early_stopping	bool	True	If set to True, beam search is stopped when at least num_beams sentences are finished per batch.
evaluate_generated_text	bool	False	Generate sequences for evaluation.
length_penalty	float	2.0	Exponential penalty to the length. Defaults to 2.0.
max_length	int	20	The maximum length of the sequence to be generated. Between 0 and infinity. Defaults to 20.
max_steps	int	-1	Maximum number of training steps. Will override the effect of num_train_epochs.
num_beams	int	1	Number of beams for beam search. Must be between 1 and infinity. 1 means no beam search. Defaults to 1.
num_return_sequences	int	1	The number of samples to generate.
preprocess_inputs	bool	True	Automatically add : and </s> tokens to train_model() and eval_model() inputs. Automatically add </s> to each string in to_predict in predict().
repetition_penalty	float	1.0	The parameter for repetition penalty. Between 1.0 and infinity. 1.0 means no penalty. Defaults to 1.0.
special_tokens_list	list	[]	The list of special tokens to be added to the model tokenizer.
top_k	int	None	Filter top-k tokens before sampling (<=0: no filtering)
top_p	float	None	Nucleus filtering (top-p) before sampling (<=0.0: no filtering)
use_multiprocessed_decoding	bool	True	Use multiprocessing when decoding outputs. Significantly speeds up decoding (CPU intensive).

from simpletransformers.t5 import T5Model, T5Args


model_args = T5Args()
model_args.num_train_epochs = 3

# Both model_type and model_name are required.
model = T5Model(
    "t5",
    "t5-base",
    args=model_args,
)

Note: For configuration options common to all Simple Transformers models, please refer to the Configuring a Simple Transformers Model section.
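The options in the table can also be supplied as a plain dict, since args accepts either a dict or a T5Args object. The following is a minimal sketch: the option values are illustrative rather than recommended settings, and build_model is a hypothetical helper name, not part of the library.

```python
# Generation-related options taken from the configuration table.
# The values are illustrative assumptions, not tuned recommendations.
generation_args = {
    "num_beams": 4,             # beam search with 4 beams (1 disables beam search)
    "max_length": 50,           # maximum length of each generated sequence
    "length_penalty": 2.0,      # exponential penalty to the length
    "early_stopping": True,     # stop once num_beams sentences are finished
    "num_return_sequences": 1,  # number of sequences to generate per input
}


def build_model(args):
    """Hypothetical helper: create a T5Model with the given args dict."""
    # Imported lazily so the dict above can be inspected without the library installed.
    from simpletransformers.t5 import T5Model

    return T5Model("t5", "t5-base", args=args)
```

Calling build_model(generation_args) would download the t5-base weights on first use, so it is left out of the snippet itself.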

`Class T5Model`

simpletransformers.t5.T5Model(self, model_type, model_name, args=None, use_cuda=True, cuda_device=-1, **kwargs)

Initializes a T5Model model.

Parameters

model_type (str) - The type of model (t5, mt5).
model_name (str) - The exact architecture and trained weights to use. This may be a Hugging Face Transformers compatible pre-trained model, a community model, or the path to a directory containing model files.
args (dict, optional) - Default args will be used if this parameter is not provided. If provided, it should be a dict containing the args that should be changed in the default args or a T5Args object.
use_cuda (bool, optional) - Use GPU if available. Setting to False will force model to use CPU only. (See here)
cuda_device (int, optional) - Specific GPU that should be used. Will use the first available GPU by default. (See here)
kwargs (optional) - For providing proxies, force_download, resume_download, cache_dir and other options specific to the ‘from_pretrained’ implementation where this will be supplied. (See here)

Returns

None

Training a `T5Model`

The train_model() method is used to train the model.

model.train_model(train_data)

simpletransformers.t5.T5Model.train_model(self, train_data, output_dir=None, show_running_loss=True, args=None, eval_data=None, verbose=True, **kwargs)

Trains the model using ‘train_data’

Parameters

train_data - Pandas DataFrame containing the 3 columns - prefix, input_text, target_text.
- prefix: A string indicating the task to perform. (E.g. "question", "stsb")
- input_text: The input text sequence. prefix is automatically prepended to form the full input. (<prefix>: <input_text>)
- target_text: The target sequence
output_dir (str, optional) - The directory where model files will be saved. If not given, self.args['output_dir'] will be used.
show_running_loss (bool, optional) - If True, the running loss (training loss at current step) will be logged to the console.
args (dict, optional) - A dict of configuration options for the T5Model. Any changes made will persist for the model.
eval_data (optional) - Evaluation data (same format as train_data) against which evaluation will be performed when evaluate_during_training is enabled. Is required if evaluate_during_training is enabled.
kwargs (optional) - Additional metrics that should be calculated. Pass in the metrics as keyword arguments (name of metric: function to calculate metric). Refer to the additional metrics section. E.g. f1=sklearn.metrics.f1_score. A metric function should take in two parameters. The first parameter will be the true labels, and the second parameter will be the predictions.

Returns

None
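As a rough illustration of the three-column format described above, the sketch below builds a couple of training rows in plain Python and converts them to a DataFrame only on request. The row contents are made up for illustration, and COLUMNS, to_records, and to_dataframe are hypothetical names, not part of Simple Transformers.

```python
# Column names expected in train_data (and eval_data).
COLUMNS = ["prefix", "input_text", "target_text"]

# Each row is (prefix, input_text, target_text). The values are illustrative only.
rows = [
    ("binary classification", "Luke blew up the first Death Star", "1"),
    ("generate question", "George Lucas began developing his own space opera.",
     "What did George Lucas begin developing?"),
]


def to_records(rows):
    """Turn (prefix, input_text, target_text) tuples into dicts keyed by column name."""
    return [dict(zip(COLUMNS, row)) for row in rows]


def to_dataframe(records):
    """Build the Pandas DataFrame that train_model() expects."""
    import pandas as pd  # imported lazily; only needed when a DataFrame is required

    return pd.DataFrame(records, columns=COLUMNS)


records = to_records(rows)
```

With a model already created, training would then look like model.train_model(to_dataframe(records)). Note that the prefix stays in its own column here; it is joined to input_text automatically.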

Note: For more details on evaluating T5 models with custom metrics, please refer to the Evaluating Generated Sequences section.

Note: For more details on training models with Simple Transformers, please refer to the Tips and Tricks section.

Evaluating a `T5Model`

The eval_model() method is used to evaluate the model.

The following metrics will be calculated by default:

eval_loss - Model loss over the evaluation data

result = model.eval_model(eval_data)

simpletransformers.t5.T5Model.eval_model(self, eval_data, output_dir=None, verbose=True, silent=False, **kwargs)

Evaluates the model using ‘eval_data’

Parameters

eval_data - Pandas DataFrame containing the 3 columns - prefix, input_text, target_text.
- prefix: A string indicating the task to perform. (E.g. "question", "stsb")
- input_text: The input text sequence. prefix is automatically prepended to form the full input. (<prefix>: <input_text>)
- target_text: The target sequence
output_dir (str, optional) - The directory where model files will be saved. If not given, self.args['output_dir'] will be used.
verbose (bool, optional) - If verbose, results will be printed to the console on completion of evaluation.
silent (bool, optional) - If silent, tqdm progress bars will be hidden.
kwargs (optional) - Additional metrics that should be calculated. Pass in the metrics as keyword arguments (name of metric: function to calculate metric). Refer to the additional metrics section. E.g. f1=sklearn.metrics.f1_score. A metric function should take in two parameters. The first parameter will be the true labels, and the second parameter will be the predictions.

Returns

result (dict) - Dictionary containing evaluation results.

Note: For more details on evaluating T5 models with custom metrics, please refer to the Evaluating Generated Sequences section.

Note: For more details on evaluating models with Simple Transformers, please refer to the Tips and Tricks section.
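A custom metric passed through kwargs takes the true labels first and the predictions second. The sketch below shows one possible metric, a simple exact-match score over generated strings; exact_match is a hypothetical name chosen for this example, and it is not a metric shipped with the library.

```python
def exact_match(labels, preds):
    """Fraction of predictions equal to their target, ignoring surrounding whitespace.

    labels: the true target strings (first parameter, as the docs require)
    preds:  the generated strings (second parameter)
    """
    if not labels:
        return 0.0
    matches = sum(
        1 for label, pred in zip(labels, preds) if label.strip() == pred.strip()
    )
    return matches / len(labels)
```

It would be passed by keyword, for example model.eval_model(eval_data, exact_match=exact_match). Because the metric compares generated text, evaluate_generated_text most likely needs to be enabled for it to be computed; the Evaluating Generated Sequences section is the place to confirm this.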

Making Predictions With a `T5Model`

The predict() method is used to make predictions with the model.

to_predict = [
    "binary classification: Luke blew up the first Death Star",
    "generate question: In 1971, George Lucas wanted to film an adaptation of the Flash Gordon serial, but could not obtain the rights, so he began developing his own space opera.",
]

predictions = model.predict(to_predict)

Note: The input must be a List even if there is only one sentence.

simpletransformers.t5.T5Model.predict(self, to_predict)

Performs predictions on a list of text to_predict.

Parameters

to_predict - A python list of text (str) to be sent to the model for prediction.

Returns

preds (list) - A python list of the generated sequences.
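Unlike train_data, where the prefix is a separate column, each string given to predict() carries its prefix inline, as in the example above. A small sketch of assembling such strings follows; build_inputs is a hypothetical helper, and the "prefix: text" layout is taken from the example inputs on this page.

```python
def build_inputs(pairs):
    """Join (prefix, text) pairs into 'prefix: text' strings for predict()."""
    return [f"{prefix}: {text}" for prefix, text in pairs]


# A single sentence still has to be wrapped in a list.
to_predict = build_inputs(
    [("binary classification", "Luke blew up the first Death Star")]
)
```

The resulting list can be handed directly to model.predict(to_predict), which returns one generated sequence per input under the default settings.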
