Negated and Misprimed Probes for Pretrained Language Models - Birds Can Talk, But Cannot Fly (2020)
huyi / December 2022
0. Abstract
two new probing tasks for PLMs:
- negation
- mispriming
1. Introduction
LAMA (LAnguage Model Analysis):
investigate whether PLMs can recall factual knowledge that is part of their training corpus.
Negation
negated LAMA dataset –> insert ‘not’ into the LAMA cloze statements
e.g. The theory of relativity was not developed by [MASK].
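A minimal sketch of how a negated probe can be built from a LAMA-style cloze statement (the helper name and the token-index interface are my own illustration, not from the paper or the LAMA codebase):

```python
# Sketch: negate a cloze probe by inserting "not" after a chosen
# auxiliary/verb token. The paper's actual negated LAMA set was
# created per template; this just illustrates the transformation.
def negate_probe(tokens, verb_index):
    """Insert 'not' right after the token at verb_index."""
    return tokens[:verb_index + 1] + ["not"] + tokens[verb_index + 1:]

probe = "The theory of relativity was developed by [MASK].".split()
negated = " ".join(negate_probe(probe, probe.index("was")))
print(negated)  # The theory of relativity was not developed by [MASK].
```

The negated probe is then fed to the masked LM exactly like the original, and the top predictions for [MASK] are compared.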
conclusions:
- Models are equally prone to generate facts (“Birds can fly”) and their incorrect negation (“Birds cannot fly”).
- In a second experiment, we show that BERT can in principle memorize both positive and negative facts correctly if they occur in training, but that it poorly generalizes to unseen sentences (positive and negative).
- However, after finetuning, BERT does learn to correctly classify unseen facts as true/false.
Mispriming
e.g. “Talk? Birds can [MASK]” –> the LM readily fills in ‘talk’ (the misleading prime) instead of ‘fly’
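A misprimed probe is just the original cloze statement with a misleading prime prepended. A minimal sketch (the helper name is my own, not from the paper):

```python
# Sketch: prepend a misleading prime word to a cloze probe, as in
# "Talk? Birds can [MASK]." The prime should be a plausible but
# wrong filler for the [MASK] slot.
def misprime(probe, prime):
    """Prepend a misleading prime (as its own sentence) to a probe."""
    return f"{prime}? {probe}"

print(misprime("Birds can [MASK].", "Talk"))  # Talk? Birds can [MASK].
```

If the LM's top prediction for [MASK] switches from the fact (‘fly’) to the prime (‘talk’), the model is relying on shallow surface cues rather than stored knowledge.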