
Lecture 1

Date: 28 March 2022

Video Lecture

NLU history

Logic-based models were popular in the 1980s; they fell out of favor in the 1990s with advances in machine learning (many predicted that logic-driven models were doomed to oblivion); in the 2000s logic models were revived, only to decline again in the 2010s (the appearance of deep learning; Watson won Jeopardy!).

Text generation with GPT-3

GPT-3 stands for Generative Pre-trained Transformer 3. It is an autoregressive language model that uses deep learning to produce human-like text.


Benchmarks

  • GLUE: The General Language Understanding Evaluation benchmark: https://gluebenchmark.com/
  • Switchboard: Speech-to-text transcription.
  • SQuAD 1.0/2.0: Stanford Question Answering Dataset.

A peek behind the curtain: It is not all about the success

It is not difficult to come up with examples demonstrating that the systems behind artificial intelligence algorithms do not deeply understand what is going on.

  • Siri in the Colbert Show (2014)
  • GPT-3 text generation: Liquify cats; and medical advice that advises you to kill yourself.
  • Image captioning: A traffic sign is captioned as a refrigerator full of foods and drinks.

Because the models make mistakes that no human would make, this is considered an indicator that these models do not have a clear understanding of what the task really is.

Does Anne Hathaway news drive Berkshire Hathaway stock? This was a headline in 2011. This is the type of correlation that has nothing to do with causation, but many machine learning models are correlation-hunting machines that may treat such a pattern as a very plausible signal for making decisions.

The bankruptcy story republishing. Due to the republishing of an old headline about a major company's bankruptcy, automated trading systems almost provoked a crash in the financial markets.

How to overcome that?

Adversarial testing. Come up with tests where humans would certainly not make a mistake and check what the model answers. Check variations to verify that the model is not relying on some spurious pattern-matching strategy.

Adversarial testing reveals overfitting.

Leisure books

  • Superintelligence: Nick Bostrom
  • How to survive a robot uprising

Difficult task: Ask for theater tickets

  • Domain knowledge: Movie theaters and shows.
  • Discourse knowledge: Linking pronouns to the nouns they refer to.
  • World knowledge: Knowing that "1 adult and 2 kids" refers to ticket categories.

Vocabulary

Reweighting

  • probabilities
  • length norm
  • TF-IDF
  • O/E
  • PMI
  • Positive PMI

Dimensionality Reduction

  • LSA
  • GloVe
  • word2vec
  • autoencoders
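Of the methods above, LSA is the simplest to sketch: a truncated SVD of a term-by-document count matrix, keeping only the top-k latent dimensions. A minimal illustration with NumPy, using a hypothetical 4x4 count matrix:

```python
import numpy as np

# Hypothetical (term x document) count matrix.
X = np.array([[2., 0., 1., 0.],
              [1., 3., 0., 0.],
              [0., 1., 4., 1.],
              [0., 0., 1., 3.]])

# LSA: factor X = U @ diag(s) @ Vt, then truncate to the top-k singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                    # number of latent dimensions to keep
term_vectors = U[:, :k] * s[:k]          # reduced term representations
doc_vectors = Vt[:k].T                   # reduced document representations
```

The truncated factors give the best rank-k approximation of X in the least-squares sense, which is the usual justification for using them as dense word/document vectors.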

Vector Comparison

  • Euclidean
  • Cosine
  • Dice
  • KL

  • BERT (contextual representations)
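The distance measures in the list above can be sketched directly with NumPy. This is an illustrative version (Dice generalized to non-negative count vectors; KL over normalized distributions), not a reference implementation:

```python
import numpy as np

def euclidean(u, v):
    """Straight-line distance; sensitive to vector length."""
    return np.sqrt(((u - v) ** 2).sum())

def cosine_distance(u, v):
    """1 - cosine similarity; insensitive to vector length."""
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def dice_distance(u, v):
    """Dice coefficient generalized to non-negative count vectors."""
    return 1.0 - 2.0 * np.minimum(u, v).sum() / (u.sum() + v.sum())

def kl_divergence(p, q):
    """KL divergence between two distributions (asymmetric; q must be > 0)."""
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

u = np.array([34.0, 44.0])
v = np.array([14.0, 21.0])
```

Note the contrast the course draws between these: cosine ignores scale (a vector and its double are at distance 0), whereas Euclidean does not.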

Assignments

  • Word similarity
  • Sentiment analysis
  • Context integration

Material

Jupyter Notebook Tutorial

  • Use --- (three or more hyphens) in Markdown to create a divider.
  • You can toggle the output of cells (O).
  • You can add a scroll bar to the output of cells (Shift+O).
  • You can write LaTeX in Markdown.
  • A kernel is a Jupyter notebook concept: it provides everything the Jupyter core needs to run a notebook. For example, there are kernels for the Julia language or for SageMath.

Paper reading