
GreaseLM (2022 Jure Leskovec)



It remains an open question how to effectively fuse the KG and LM representations in a truly unified manner, where the two representations can interact in a non-shallow way to simulate structured, situational reasoning.


2. Related Work

Integrating KG information for QA

Earlier work falls into two classes:

  1. Use a two-tower model: no interaction between the LM and the GNN

    Improving natural language inference using external knowledge in the science questions domain (2019)

  2. Use one modality to ground the other

    • Use an encoded representation of a linked KG to augment the textual representation of a QA example

      Knowledgeable Reader (2018)



    • Use a representation of the text (e.g., the final layer of the LM) to augment a graph reasoning model over a KG extracted for the example


More recent work integrates the two models:

  1. Access implicit knowledge encoded in LMs -> cons: the static KG is discarded once the LM has been trained on its facts, losing important structure that can guide reasoning
  2. Join LM and GNN representations via message passing -> cons: a single pooled representation of the LM seeds the textual component of the joint structure, which limits the updates that can be made to the textual representation. In contrast, GreaseLM enables the representations of both modalities to reflect particularities of the other:
    • knowledge grounds language;
    • language nuances specify which knowledge is important.
  3. Integrate knowledge graphs with language models in the pretraining stage -> cons: the modality interaction is typically limited to knowledge feeding language

3. Proposed Approach: GreaseLM

3.1 Input Representation

  1. Concatenate the context paragraph c, question q, and candidate answer a with separator tokens to form the model input [c; q; a], then tokenize the combined sequence into $\{w_1, \cdots, w_T\}$.
  2. Use the input sequence to retrieve a subgraph of the KG ($\mathcal{G}_{sub}$), whose nodes are denoted $\{e_1, \cdots, e_J\}$.

    KG retrieval

    Given each QA context, retrieval follows the same procedure as QA-GNN.

    interaction bottlenecks

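    A special interaction token $w_{int}$ is added to the token sequence and a special interaction node $e_{int}$ is added to the retrieved subgraph; the two modalities exchange information through them.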
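
A minimal sketch of the input construction described in 3.1, under stated assumptions: the token strings `[INT]` and `[SEP]`, the whitespace split standing in for a real tokenizer, and the helper names are all hypothetical. Prepending $w_{int}$ and connecting $e_{int}$ to every linked node is one plausible reading of the paper and is not stated in these notes.

```python
# Hypothetical special-token strings; a real tokenizer defines its own.
INT_TOKEN = "[INT]"
SEP_TOKEN = "[SEP]"


def build_token_sequence(context, question, answer):
    """Build [c; q; a] with separator tokens, led by the interaction token."""
    tokens = [INT_TOKEN]  # assumption: w_int sits at the front of the sequence
    for part in (context, question, answer):
        tokens.extend(part.split())  # whitespace split stands in for a tokenizer
        tokens.append(SEP_TOKEN)
    return tokens


def add_interaction_node(edges, linked_nodes, int_node="e_int"):
    """Return a new edge list in which e_int is connected to each linked node."""
    # Assumption: e_int is wired to every node linked from the text.
    return list(edges) + [(int_node, node) for node in linked_nodes]


tokens = build_token_sequence("a bird sits", "where is it ?", "in a tree")
edges = add_interaction_node([("bird", "tree")], ["bird", "tree"])
```

Here `tokens` plays the role of the sequence fed to the LM and `edges` the role of the retrieved subgraph after the interaction node is attached.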

3.2 Language Pre-Encoding

In the unimodal encoding component ($l=0$), sum the token, segment, and positional embeddings for each token: $\{w_{int}, w_1, \cdots, w_T\} \to \{h_{int}^{0}, h_1^{0}, \cdots, h_T^{0}\}$. These are then passed through $N$ LM layers, which use the transformer architecture.
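
The pre-embedding step can be sketched in plain Python as follows. This is an illustration only: the random tables stand in for learned embedding parameters, the identity function stands in for a real transformer LM layer, and all names and sizes (`make_table`, `pre_embed`, `encode`, `dim = 4`, `N = 3`) are hypothetical.

```python
import random


def make_table(rows, dim, rng):
    """Random embedding table standing in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


def pre_embed(token_ids, segment_ids, tok_table, seg_table, pos_table):
    """h_t^0 = token embedding + segment embedding + positional embedding."""
    hidden = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        hidden.append([
            a + b + c
            for a, b, c in zip(tok_table[tok], seg_table[seg], pos_table[pos])
        ])
    return hidden


def encode(hidden, layers):
    """Apply the LM layers in order: h^l = layer_l(h^{l-1})."""
    for layer in layers:
        hidden = layer(hidden)
    return hidden


rng = random.Random(0)
dim = 4
tok_table = make_table(10, dim, rng)  # vocabulary of 10 token ids
seg_table = make_table(2, dim, rng)   # two segment ids
pos_table = make_table(8, dim, rng)   # up to 8 positions

token_ids = [0, 5, 3, 7]              # position 0 plays the role of w_int
segment_ids = [0, 0, 1, 1]
h0 = pre_embed(token_ids, segment_ids, tok_table, seg_table, pos_table)

N = 3
identity_layer = lambda h: h          # placeholder for a transformer layer
hN = encode(h0, [identity_layer] * N)
```

With a real model, each entry of `layers` would be a transformer block, and `hN` would hold the unimodal token representations passed on to the later layers.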

3.3 GreaseLM


