GraIL: Inductive Relation Prediction by Subgraph Reasoning

huyi / August 2022

0. Abstract

previous work

The dominant paradigm for relation prediction in KGs involves learning and operating on latent representations (i.e., embeddings) of entities and relations.
problems of previous work
- these embedding-based methods do not explicitly capture the compositional logical rules underlying the knowledge graph
- limited to the transductive setting, where the full set of entities must be known during training
work of this paper

GraIL: a GNN-based relation prediction framework that reasons over local subgraph structures and has a strong inductive bias to learn entity-independent relational semantics

–> outperforms existing rule-induction baselines in the inductive setting

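–> ensembling GraIL with embedding-based methods in the transductive setting gives further gains, which points to a complementary inductive bias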

task

link prediction

the relation prediction task can also be viewed as a logical induction problem
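
As an illustration of what "logical induction" means here, a rule of the following shape could be induced from observed triples. The relation names `spouse_of` and `lives_in` are hypothetical examples and are not taken from these notes.

$$\exists Y.\ \mathrm{spouse\_of}(X, Y) \wedge \mathrm{lives\_in}(Y, Z) \rightarrow \mathrm{lives\_in}(X, Z)$$

Because such a rule mentions only relations and variables, it applies to any entities that fit the pattern, including entities never seen during training.
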
embedding-based methods

TransE, RotatE, …
- pros and cons
  - Embedding-based methods have enjoyed great success by exploiting such local connectivity patterns and homophily.
  - However, it is not clear if they effectively capture the relational semantics of knowledge graphs—i.e., the logical rules that hold among the relations underlying the knowledge graph.
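
To make the transductive limitation concrete, here is a minimal sketch of a TransE-style scorer. The embeddings are random and untrained, and the entity and relation names are made up for illustration. The point is only that the score is computed from per-entity lookup tables, so an entity that was not in the table at training time cannot be scored.

```python
import math
import random

random.seed(0)
DIM = 4  # illustrative embedding size


def random_vector():
    return [random.uniform(-1.0, 1.0) for _ in range(DIM)]


# Hypothetical lookup tables; in a real model these would be learned.
entity_emb = {name: random_vector() for name in ["alice", "bob", "paris"]}
relation_emb = {name: random_vector() for name in ["lives_in", "spouse_of"]}


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between (h + r) and t."""
    h = entity_emb[head]
    r = relation_emb[relation]
    t = entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# A known entity can be scored.
print(transe_score("alice", "lives_in", "paris"))

# An unseen entity has no embedding, so the lookup fails.
try:
    transe_score("carol", "lives_in", "paris")
except KeyError as err:
    print("no embedding for unseen entity:", err)
```

A structure-only model avoids this lookup entirely, which is the motivation for the subgraph-based approach described below.
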
naturally inductive methods
- pros and cons
  - One of the key advantages of learning entity-independent relational semantics is the inductive ability to generalise to unseen entities
  Open question: how exactly does it generalise to unseen entities?
  - suffer from scalability issues and lack the expressive power of embedding-based approaches

key idea: predict the relation between two nodes from the subgraph structure around those two nodes

–> do not use any node attributes in order to test GraIL’s ability to learn and generalize solely from structure

–> only ever receives structural information (i.e., the subgraph structure and structural node features) as input

–> the only way GraIL can complete the relation prediction task is to learn the structural semantics that underlie the knowledge graph

The overall task is to score a triplet $(u, r_t, v)$, i.e., to predict the likelihood of a possible relation $r_t$ between a head node $u$ and tail node $v$ in a KG, where we refer to nodes $u$ and $v$ as target nodes and to $r_t$ as the target relation.
The overall task can be split into three sub-tasks:
- 1. extracting the enclosing subgraph around the target nodes
  Is this subgraph chosen arbitrarily? No: as explained further down, it is the intersection of the k-hop neighbourhoods of the two target nodes.
- 2. labeling the nodes in the extracted subgraph
- 3. scoring the labeled subgraph using a GNN

How does scoring a subgraph become scoring a triplet? The target link is scored within this subgraph.

we extract the enclosing subgraph around the target nodes.

enclosing subgraph between nodes u and v: the graph induced by all the nodes that occur on a path between u and v
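Concrete steps
- $\mathcal{N}_k(u)$ and $\mathcal{N}_k(v)$ are the k-hop undirected neighbourhoods of the two target nodes.
- enclosing subgraph: $\mathcal{N}_k(u)\cap\mathcal{N}_k(v)$
- The subgraph extracted this way contains every path between u and v of length up to $k+1$ (the whole path lies inside the subgraph).
Notes:
- Although edge directions are ignored during subgraph extraction, the GNN's message passing still operates on the directed graph.
Why does message passing use the directed graph? It seems that every relation in a KG has a corresponding inverse relation, i.e., u--r-->v corresponds to u<--r'--v?
- the target tuple/edge $(u, r_t, v)$ is added to the extracted subgraph to enable message passing between the two target nodes.
Why? Can't messages be passed along other paths?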
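
The extraction step above can be sketched in plain Python as follows. This is a minimal sketch, not the authors' implementation: the function names, the triple format `(head, relation, tail)`, and the choice to always keep `u` and `v` in the node set (they can fall outside the intersection when they are more than k hops apart) are assumptions made for illustration.

```python
from collections import deque


def k_hop_neighbourhood(adj, source, k):
    """Nodes within k undirected hops of `source` (including `source`)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def enclosing_subgraph(triples, u, v, r_t, k):
    """Return (nodes, directed edges) of the enclosing subgraph around u and v."""
    # Edge directions are ignored only while collecting neighbourhoods.
    adj = {}
    for h, _, t in triples:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)

    nodes = k_hop_neighbourhood(adj, u, k) & k_hop_neighbourhood(adj, v, k)
    nodes |= {u, v}  # assumption: the target nodes are always kept

    # Keep the original directed edges among the selected nodes.
    edges = {(h, r, t) for h, r, t in triples if h in nodes and t in nodes}
    edges.add((u, r_t, v))  # add the target edge for message passing
    return nodes, edges


# Toy graph with hypothetical entity and relation names.
triples = [
    ("a", "r1", "b"),
    ("b", "r2", "c"),
    ("c", "r3", "d"),
    ("a", "r1", "e"),
]
nodes, edges = enclosing_subgraph(triples, "a", "c", "r_t", k=1)
print(sorted(nodes))
print(sorted(edges))
```

In the toy graph, `e` and `d` are each within one hop of only one target node, so they are dropped, while `b` lies on the path between `a` and `c` and is kept.
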

Each node i in the subgraph is labeled with the tuple (d(i, u), d(i, v))
- where d(i, u) denotes the shortest distance between nodes i and u without counting any path through v (likewise for d(i, v)).
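
The labeling rule can be sketched with two breadth-first searches, each one forbidden from stepping onto the other target node. This is a sketch under stated assumptions: distances are computed on the undirected version of the subgraph, the labels (0, 1) for u and (1, 0) for v are a convention assumed here rather than something stated in these notes, and nodes that cannot be reached get `None`.

```python
from collections import deque


def bfs_distances(adj, source, blocked):
    """Undirected hop counts from `source`, never stepping onto `blocked`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr != blocked and nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def label_nodes(nodes, edges, u, v):
    """Map each node i to the tuple (d(i, u), d(i, v))."""
    adj = {n: set() for n in nodes}
    for h, _, t in edges:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)

    d_u = bfs_distances(adj, u, blocked=v)  # paths through v are not counted
    d_v = bfs_distances(adj, v, blocked=u)  # paths through u are not counted

    labels = {}
    for i in nodes:
        if i == u:
            labels[i] = (0, 1)  # assumed convention for the head node
        elif i == v:
            labels[i] = (1, 0)  # assumed convention for the tail node
        else:
            labels[i] = (d_u.get(i), d_v.get(i))  # None if unreachable
    return labels


# Toy subgraph with hypothetical names; ("a", "r_t", "c") is the target edge.
nodes = {"a", "b", "c", "d"}
edges = {
    ("a", "r1", "b"),
    ("b", "r2", "c"),
    ("a", "r3", "d"),
    ("d", "r4", "b"),
    ("a", "r_t", "c"),
}
labels = label_nodes(nodes, edges, u="a", v="c")
print(labels)
```

Node `d` gets the label (1, 2): it is adjacent to `a`, and its shortest route to `c` that avoids `a` goes through `b`. Since the labels depend only on positions relative to the target nodes, they carry no entity identity.
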

Scoring with a GNN