Making Pre-trained Language Models Better Few-shot Learners (2021, Danqi Chen)
huyi / January 2023
[toc]
0. Abstract
GPT-3 achieves strong few-shot performance with only a natural-language prompt and a few task demonstrations. This paper instead works with much smaller language models.
LM-BFF
a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples:
- prompt-based fine-tuning together with a novel pipeline for automating prompt generation (a sketch of the prompt-based formulation follows this list);
- a refined strategy for dynamically and selectively incorporating demonstrations into each context.
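To make the prompt-based formulation concrete, here is a minimal sketch of how a classification example is recast as a cloze task: the input is wrapped in a template containing a [MASK] token, and class probabilities come from the MLM logits of the label words. The SST-2 template ("It was [MASK].") and label words (great/terrible) are taken from the paper; the roberta-large checkpoint and the inference-only setup are illustrative choices.

```python
# Minimal sketch of prompt-based classification with an MLM head.
# Template and label words follow the paper's SST-2 setup; roberta-large
# and the inference-only usage here are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

# Map each class to a single vocabulary token ("label word").
label_words = {"positive": " great", "negative": " terrible"}
label_ids = {y: tokenizer.convert_tokens_to_ids(tokenizer.tokenize(w))[0]
             for y, w in label_words.items()}

def class_logits(sentence: str) -> dict:
    # Template: "<x> It was [MASK]."
    prompt = f"{sentence} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos.item()]
    # Classification reduces to comparing the MLM logits of the label words.
    return {y: logits[i].item() for y, i in label_ids.items()}

print(class_logits("A gorgeous, witty, seductive movie."))
```

Fine-tuning then trains this same objective, cross-entropy over the label-word logits, so no new classification head has to be learned from scratch.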
performance
LM-BFF dramatically outperforms standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average, across all tasks.
1. Introduction
task setting
task: few-shot fine-tuning of medium-sized language models. This setting is appealing because:
- such models can be trained on typical research hardware;
- few-shot settings are realistic, as it is generally both easy to acquire a few annotations (e.g., 32 examples) and efficient to train on them;
- updating parameters typically leads to better performance.
prompt
- manually crafted prompts are imprecise and labor-intensive
- auto-prompt: the paper introduces automatic prompt generation, including a pruned brute-force search to identify the best working label words, and a novel decoding objective to automatically generate templates with the generative T5 model (see the label-word search sketch after this list)
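As a rough illustration of the label-word half of that search, the sketch below ranks vocabulary tokens by their aggregate MLM log-probability at the [MASK] position over one class's few-shot examples. This is only the initial pruning step; the actual pipeline further re-ranks candidate label-word combinations by dev-set performance (not shown). The template and model are the same illustrative ones as above, and `k` is arbitrary.

```python
# Hedged sketch of the pruned label-word search: rank vocabulary tokens by
# how well the MLM predicts them at the [MASK] of one class's examples,
# keep the top-k as candidates. Dev-set re-ranking is omitted.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

def top_label_words(examples_for_class, k=10):
    """Aggregate MLM log-probs at the mask over a class's few-shot examples."""
    total = torch.zeros(tokenizer.vocab_size)
    for sent in examples_for_class:
        inputs = tokenizer(f"{sent} It was {tokenizer.mask_token}.",
                           return_tensors="pt")
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos.item()]
        total += torch.log_softmax(logits, dim=-1)
    ids = total.topk(k).indices.tolist()
    return [tokenizer.convert_ids_to_tokens(i) for i in ids]

# e.g. candidate label words for the positive class, from two positive reviews:
print(top_label_words(["A gorgeous, witty, seductive movie.",
                       "An instant classic."]))
```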
demonstration
- randomly sample a single example at a time from each class to create multiple, minimal demonstration sets.
- the paper also devises a novel sampling strategy that pairs inputs with semantically similar examples, thereby providing the model with more discriminative comparisons (sketched below).
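A minimal sketch of that sampling strategy, assuming sentence-transformers supplies the SBERT embeddings the paper relies on: for each query, one demonstration per class is drawn only from the most similar half of that class's pool. The 50% cutoff follows the paper; the specific encoder checkpoint, the `sample_demonstrations` helper, and the toy pool are all illustrative.

```python
# Sketch of similarity-filtered demonstration sampling. The encoder
# checkpoint is a stand-in for the SBERT model used in the paper.
import random
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative SBERT choice

def sample_demonstrations(query: str, pool_by_class: dict, seed=0) -> list:
    random.seed(seed)
    q_emb = encoder.encode(query, convert_to_tensor=True)
    demos = []
    for label, sentences in pool_by_class.items():
        embs = encoder.encode(sentences, convert_to_tensor=True)
        sims = util.cos_sim(q_emb, embs)[0]
        # Keep only the top-50% most similar candidates, then sample one.
        ranked = [s for _, s in sorted(zip(sims.tolist(), sentences), reverse=True)]
        demos.append((label, random.choice(ranked[: max(1, len(ranked) // 2)])))
    return demos

pool = {"positive": ["An instant classic.", "Simply wonderful."],
        "negative": ["A dull, lifeless mess.", "Painfully slow."]}
print(sample_demonstrations("A gorgeous, witty, seductive movie.", pool))
```

The returned (label, sentence) pairs would then be concatenated with the query in the prompt, so the model compares the input against one nearby example per class rather than random, possibly unrelated ones.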