Lab 1: Summarization

Topics

  • Prompt engineering
  • Zero-shot, one-shot, and few-shot inference

Libraries

  • torch
  • torchdata
  • transformers (Hugging Face)
  • datasets (Hugging Face)

Models

  • FLAN-T5 (pre-trained LLM from Hugging Face)

Code components

  • dialogsum: a dataset of dialogs, one of many available through the datasets library. It is used here to teach the summarization task. Each example in the test subset includes a summary of the dialog written by a human.
  • AutoModelForSeq2SeqLM: the class used to load FLAN-T5, since it is a sequence-to-sequence (encoder-decoder) model.
  • tokenizer: each model ships with its own tokenizer. The tokenizer for a particular model can be loaded with AutoTokenizer.from_pretrained.
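The pieces above can be sketched as follows. This is a minimal, hedged example: it assumes the public "google/flan-t5-small" checkpoint and an illustrative prompt, not the lab's exact model size or template.

```python
# Sketch: load FLAN-T5 and its tokenizer, then run the tokenize -> generate
# -> decode round trip. Model name and prompt text are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-small"  # assumption: any FLAN-T5 size works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = (
    "Summarize the following conversation.\n\n"
    "#Person1#: Hi, did the package arrive?\n"
    "#Person2#: Yes, it came this morning.\n\n"
    "Summary:"
)
inputs = tokenizer(prompt, return_tensors="pt")                       # text -> token ids
output_ids = model.generate(inputs["input_ids"], max_new_tokens=50)   # token ids in, token ids out
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)   # token ids -> text
print(summary)
```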

Notes

  • We pass the token IDs of the dialog as input to model.generate. This method itself outputs token IDs, which we then pass to tokenizer.decode to recover the summary text.
  • In my experiments, FLAN-T5 was not good at the summarization task: for some of the dialogs, the output of zero-shot inference was exactly the same as that of few-shot inference. Using the pre-built prompt templates helped, though.
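The difference between zero-shot and few-shot inference is only in how the prompt string is built. A small sketch, with an illustrative template and made-up example dialogs (not the lab's exact prompts):

```python
# Hedged sketch: building zero-, one-, and few-shot prompts for summarization.
# The template wording and the example dialogs are assumptions for illustration.
def make_prompt(examples, dialogue):
    """`examples` is a list of (dialogue, summary) pairs; empty list = zero-shot."""
    prompt = ""
    for ex_dialogue, ex_summary in examples:
        # Each solved example shows the model the input format and desired output.
        prompt += f"Dialogue:\n\n{ex_dialogue}\n\nWhat was going on?\n{ex_summary}\n\n\n"
    # The dialogue we actually want summarized, with the completion left blank.
    prompt += f"Dialogue:\n\n{dialogue}\n\nWhat was going on?\n"
    return prompt

dialogue = "#Person1#: The package arrived. #Person2#: Great, thanks!"
zero_shot = make_prompt([], dialogue)
one_shot = make_prompt(
    [("#Person1#: Lunch later? #Person2#: Sure, noon works.", "They agree to have lunch at noon.")],
    dialogue,
)
```

Either string is then tokenized and passed to model.generate exactly as in zero-shot inference; only the prompt changes.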

Additional material