Making Pre-trained Language Models Better Few-shot Learners (2021, Danqi Chen)
huyi / January 2023
[toc]
0. Abstract
GPT-3 achieves strong few-shot performance with only a natural-language prompt and a few task demonstrations. This paper instead works with much smaller language models.
LM-BFF
a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples:
- prompt-based fine-tuning together with a novel pipeline for automating prompt generation (a sketch of the prompt-based formulation follows this list);
- a refined strategy for dynamically and selectively incorporating demonstrations into each context.
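To make the prompt-based formulation concrete, here is a minimal sketch of how a classification example is recast as a cloze task: the input is wrapped in a template containing a [MASK] token, and class probabilities come from the MLM logits of the label words. The SST-2 template ("It was [MASK].") and label words (great/terrible) are taken from the paper; the roberta-large checkpoint and the inference-only setup are illustrative choices.

```python
# Minimal sketch of prompt-based classification with an MLM head.
# Template and label words follow the paper's SST-2 setup; roberta-large
# and the inference-only usage here are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

# Map each class to a single vocabulary token ("label word").
label_words = {"positive": " great", "negative": " terrible"}
label_ids = {y: tokenizer.convert_tokens_to_ids(tokenizer.tokenize(w))[0]
             for y, w in label_words.items()}

def class_logits(sentence: str) -> dict:
    # Template: "<x> It was [MASK]."
    prompt = f"{sentence} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos.item()]
    # Classification reduces to comparing the MLM logits of the label words.
    return {y: logits[i].item() for y, i in label_ids.items()}

print(class_logits("A gorgeous, witty, seductive movie."))
```

Fine-tuning then trains this same objective, cross-entropy over the label-word logits, so no new classification head has to be learned from scratch.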
performance
LM-BFF dramatically outperforms standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average, across all tasks.
1. Introduction
task setting
task: few-shot fine-tuning of medium-sized language models. This setting is appealing because:
- such models can be trained on typical research hardware;
- few-shot settings are realistic, as it is generally both easy to acquire a few annotations (e.g., 32 examples) and efficient to train on them;
- updating parameters typically leads to better performance.
prompt
- manually crafted prompts are imprecise and labor-intensive
- auto-prompt: the paper introduces automatic prompt generation, including a pruned brute-force search to identify the best working label words, and a novel decoding objective to automatically generate templates with the generative T5 model (see the label-word search sketch after this list)
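As a rough illustration of the label-word half of that search, the sketch below ranks vocabulary tokens by their aggregate MLM log-probability at the [MASK] position over one class's few-shot examples. This is only the initial pruning step; the actual pipeline further re-ranks candidate label-word combinations by dev-set performance (not shown). The template and model are the same illustrative ones as above, and `k` is arbitrary.

```python
# Hedged sketch of the pruned label-word search: rank vocabulary tokens by
# how well the MLM predicts them at the [MASK] of one class's examples,
# keep the top-k as candidates. Dev-set re-ranking is omitted.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

def top_label_words(examples_for_class, k=10):
    """Aggregate MLM log-probs at the mask over a class's few-shot examples."""
    total = torch.zeros(tokenizer.vocab_size)
    for sent in examples_for_class:
        inputs = tokenizer(f"{sent} It was {tokenizer.mask_token}.",
                           return_tensors="pt")
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos.item()]
        total += torch.log_softmax(logits, dim=-1)
    ids = total.topk(k).indices.tolist()
    return [tokenizer.convert_ids_to_tokens(i) for i in ids]

# e.g. candidate label words for the positive class, from two positive reviews:
print(top_label_words(["A gorgeous, witty, seductive movie.",
                       "An instant classic."]))
```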
demonstration
- randomly sample a single example at a time from each class to create multiple, minimal demonstration sets.
- the paper also devises a novel sampling strategy that pairs inputs with semantically similar examples, thereby providing the model with more discriminative comparisons (sketched below).
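A minimal sketch of that sampling strategy, assuming sentence-transformers supplies the SBERT embeddings the paper relies on: for each query, one demonstration per class is drawn only from the most similar half of that class's pool. The 50% cutoff follows the paper; the specific encoder checkpoint, the `sample_demonstrations` helper, and the toy pool are all illustrative.

```python
# Sketch of similarity-filtered demonstration sampling. The encoder
# checkpoint is a stand-in for the SBERT model used in the paper.
import random
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative SBERT choice

def sample_demonstrations(query: str, pool_by_class: dict, seed=0) -> list:
    random.seed(seed)
    q_emb = encoder.encode(query, convert_to_tensor=True)
    demos = []
    for label, sentences in pool_by_class.items():
        embs = encoder.encode(sentences, convert_to_tensor=True)
        sims = util.cos_sim(q_emb, embs)[0]
        # Keep only the top-50% most similar candidates, then sample one.
        ranked = [s for _, s in sorted(zip(sims.tolist(), sentences), reverse=True)]
        demos.append((label, random.choice(ranked[: max(1, len(ranked) // 2)])))
    return demos

pool = {"positive": ["An instant classic.", "Simply wonderful."],
        "negative": ["A dull, lifeless mess.", "Painfully slow."]}
print(sample_demonstrations("A gorgeous, witty, seductive movie.", pool))
```

The returned (label, sentence) pairs would then be concatenated with the query in the prompt, so the model compares the input against one nearby example per class rather than random, possibly unrelated ones.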