CBR-SUBG ———— Knowledge Base Question Answering by Case-based Reasoning over Subgraphs
huyi / August 2022
- Abstract
- 1.Introduction
- 2.Related work
- 3.Model
- 4. Experiments
Abstract
理念
we hypothesize in a large KB, reasoning patterns required to answer a query type reoccur for various entities in their respective subgraph neighborhoods.
同一子图里面的不同节点倾向于使用同样的reasoning pattern?
model
we introduce a semiparametric model (CBR-SUBG) with
- (i) 找到 KNN training queries
- a nonparametric component that for each query, dynamically retrieves other similar k-nearest neighbor (KNN) training queries along with query-specific subgraphs
- (ii) 辨别 KNN queries 的 reasoning patterns,并且把它们用到 target query 的子图上
- a parametric component that is trained to identify the (latent) reasoning patterns from the subgraphs of KNN queries and then apply them to the subgraph of the target query.
1.Introduction
weakly-supervised
- It is very laborious to annotate the reasoning patterns for each query at scale and hence it is important to develop weakly-supervised knowledge base question answering (KBQA) models that do not depend on the availability of the annotated reasoning patterns.
challenges
- many queries require a model to reason over a set of facts jointly
-
- the query in Figure 1 cannot be answered by considering individual paths
-
- a model has to learn a large number of reasoning patterns because of the diverse nature of possible questions
-
- a model may encounter very few examples of a particular pattern during training, making it challenging for the models to learn and encode the patterns entirely in its parameters
-
Case-based Reasoning (CBR)
- In a CBR framework, a new problem is solved by retrieving other similar problems and reusing their solution to derive a solution for the given problem. In other words, models, instead of memorizing patterns in its parameters, can instead reuse the reasoning patterns of other similar queries, retrieved dynamically during inference.
Contributions.
- To summarize, this paper introduces CBR-SUBG, a semiparametric model for weakly-supervised KBQA that retrieves similar queries and utilizes the similarities in graph structure of local subgraphs to answer a query.
- KNN We also propose a practical algorithm for gathering query-specific subgraphs that utilizes the retrieved KNN queries to produce compact query-specific subgraphs.
- We show that CBR-SUBG can model (latent) subgraph reasoning patterns, more effectively than parametric models; can reason with new entities and new evidence .
- Lastly, we perform competitively with state-of-the-art KBQA models on multiple benchmarks. For example, on the FreebaseQA dataset , we outperform most competitive baseline by 14.45 points.
2.Related work
略
3.Model
In CBR, a case is defined as an abstract representation of a problem along with its solution.
In our KBQA setting, a case is a natural language query along with its answer(entities).
Task Description
训练集:$\mathcal{D}={(q_1,\mathcal{A}_1),(q_2,\mathcal{A}_2),…,(q_N,\mathcal{A}_n)}$
Method Overrview
For input $q$ and $\mathcal{K}$,
-
- retrieves $k$ similar query-answer w.r.t $q$ from $\mathcal{D}$. 提取出来的这一套问题答案组叫 $kNN_q$
-
-
CBR-SUBG finds query-specific subgraphs $\mathcal{K}_{qi}$ for each query in ${q}\cup kNN_q$.
definition for $T(P_q)$:差不多的问题所涉及到的 reasoning patterns 是一样的。Similarly for our KBQA setting, we hypothesize that the reasoning pattern type, $T(P_q)$ repeats across the neighborhood of query subgraphs of $kNN_q$.
-
-
- Next, CBR-SUBG uses GNNs to encode local structure into node representations
-
- Now, if the CBR hypothesis holds true, then the representation of the answer nodes in each query subgraphs will be similar as the local structure around them share similarities. Hence, the answer node of the given query q can be identified by searching for the node that has the most similar representations to the answer nodes in the query subgraphs of $kNN_q$.
3.1. Retrieval of Similar Cases
model
- use a ROBERTA-base encoder to encode each question independently
- A single representation is obtained by mean pooling over the token-level representations
具体方法
- To obtain entity-agnostic representation, we replace the entity spans with a special
[MASK]
token in the query, i.e. the original query becomes ‘Who played[MASK]
in[MASK]
?’. - The similarity score between two queries is given by the inner product between their normalized vector representations.
- We pre-compute the representations of queries in the train set. the most similar query representations are obtained by doing a nearest neighbor search over the representations stored in the case-base.
3.2. Query-subgraph Selection
寻找子图时的目标
- 不能太小:所有需要的 reasoning patterns 和 answers 要在里面
- 不能太大:GPU存储不够
naive strategy
- consider all edges in 2-3 hops around the query entities
- 问题
- subgraphs which are independent of the query
- 对large KG来说,计算复杂度太高
nonparametric approach of query subgraph collection
看训练集里面的每一个case,怎么选这个子图呢?就是把能够从 query 中的 entity 连到 answer 的 path 全部都留下,形成一个子图。
3.3. Reasoning over Multiple Subgraphs
overview
-
This section describes how CBR-SUBG reasons across the subgraphs of the given query and the subgraphs of the retrieved cases
前面说 given query 的 subgraph 怎么取了吗 -
用 GNN 把 subgraph structure – encode –> node representation
- 使得不同子图中的 answer node 在训练中逐渐彼此靠近
- 所以最后找 answer node 就找和 KNN-subgraph 的 answer node 最相近的
Input node representations
用 node 伸出来的边的类型来 represent 这个 entity,即$x_v\in{0, 1}^{|\mathcal{R}|}$
Relative distance enbedding
- 把 query 中提到的 entity 作为 center entity,使所有其他图中的节点有一个 relative distance 被 enbed 进去。
-
具体方式: for each node, the representation$x_v$ is appended with an one-hot distance embedding$x_d\in{0, 1}^{ d }$ where the component corresponding to the shortest distance from the query entity is set to 1.
Message passing
R-GCN
Training
Inference
4. Experiments
- dataset
- reasoning pattern 是由 query2box 启发的
- 这个模型和q2b的区别
Note that in their work, the task was to execute the input structured query on an incomplete KB, i.e,
the shape of the input patterns are known to the model
. In contrast, in our setting, the model has to find the answer node (marked ), which is nestled in each of the structured patternwithout the knowledge of the pattern structure
.Data Generation Process
运用一个 type system,其中包括了 entity type $\mathcal{E}$ 和 relation type $\mathcal{R}$ ‘allowes relation types’只能在某些确定的 entity type 之间连接
- entity-independent
- To ensure that models only rely on the graph structure, each graph has a ‘unique’ set of entities and no two graphs share entities.
- This also effectively tests how much the nonparametric property of CBR-SUBG can reason with unseen entities.
Pattern Generation
grounding
grounding a pattern shape involves
- assigning each nodes with an entity present in a generated graph
如何确保我找的这些entity之间可以形成需要的pattern?每种entity之间都有一个特定relation type? - each edge of the pattern type is assigned a relation from the set of allowed relation types
inserting
把 grounded pattern 对应的 nodes 之间缺的边补上
- ‘pattern type’ that refers to a pattern whose edges have been assigned relation types but the nodes have not been assigned to specific entities
一些训练数据
- 生成1000个图
- 200 pattern types