Lab 1: Summarization

Topics

  • Prompt engineering
  • Zero-shot, one-shot, and few-shot inference

Libraries

  • torch
  • torchdata
  • transformers (Hugging Face)
  • datasets (Hugging Face)

Models

  • FLAN-T5 (pre-trained LLM from Hugging Face)

Code components

  • dialogsum: a dataset of dialogs, one of many available through the datasets library. It is used here to teach the summarization task. Each example in the test subset includes a summary of the dialog written by a human.
  • AutoModelForSeq2SeqLM: the class used to load FLAN-T5, since it is a sequence-to-sequence (encoder-decoder) model.
  • tokenizer: each model ships with its own tokenizer. The tokenizer for a particular model can be loaded with AutoTokenizer.from_pretrained.
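The pieces above can be sketched as follows. This is a minimal, hedged example: it assumes the public "google/flan-t5-small" checkpoint and an illustrative prompt, not the lab's exact model size or template.

```python
# Sketch: load FLAN-T5 and its tokenizer, then run the tokenize -> generate
# -> decode round trip. Model name and prompt text are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-small"  # assumption: any FLAN-T5 size works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = (
    "Summarize the following conversation.\n\n"
    "#Person1#: Hi, did the package arrive?\n"
    "#Person2#: Yes, it came this morning.\n\n"
    "Summary:"
)
inputs = tokenizer(prompt, return_tensors="pt")                       # text -> token ids
output_ids = model.generate(inputs["input_ids"], max_new_tokens=50)   # token ids in, token ids out
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)   # token ids -> text
print(summary)
```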

Notes

  • We pass the token IDs of the dialog as input to model.generate. This method itself outputs token IDs, which we then pass to tokenizer.decode to recover the summary text.
  • In my experiments, FLAN-T5 was not good at the summarization task: for some of the dialogs, the output of zero-shot inference was exactly the same as that of few-shot inference. Using the pre-built prompt templates helped, though.
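The difference between zero-shot and few-shot inference is only in how the prompt string is built. A small sketch, with an illustrative template and made-up example dialogs (not the lab's exact prompts):

```python
# Hedged sketch: building zero-, one-, and few-shot prompts for summarization.
# The template wording and the example dialogs are assumptions for illustration.
def make_prompt(examples, dialogue):
    """`examples` is a list of (dialogue, summary) pairs; empty list = zero-shot."""
    prompt = ""
    for ex_dialogue, ex_summary in examples:
        # Each solved example shows the model the input format and desired output.
        prompt += f"Dialogue:\n\n{ex_dialogue}\n\nWhat was going on?\n{ex_summary}\n\n\n"
    # The dialogue we actually want summarized, with the completion left blank.
    prompt += f"Dialogue:\n\n{dialogue}\n\nWhat was going on?\n"
    return prompt

dialogue = "#Person1#: The package arrived. #Person2#: Great, thanks!"
zero_shot = make_prompt([], dialogue)
one_shot = make_prompt(
    [("#Person1#: Lunch later? #Person2#: Sure, noon works.", "They agree to have lunch at noon.")],
    dialogue,
)
```

Either string is then tokenized and passed to model.generate exactly as in zero-shot inference; only the prompt changes.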

Additional material