Question Answering with Subgraph Embeddings
keywords: question answering, knowledge graph, embeddings
This post walks through an interesting 2014 paper from Facebook Research by Antoine Bordes, Sumit Chopra, and Jason Weston: https://research.fb.com/publications/question-answering-with-subgraph-embeddings/
This paper presents an approach in which low-dimensional embeddings are learned for the input question, the candidate answers, and the segment of the knowledge graph that contains the actual answer.
Key Idea
The key idea of this paper is that an embedding of the knowledge subgraph containing the answer captures much more context for answering the question than in typical question-answering systems, where the answer embedding is learned only from the textual context the answer appears in. A knowledge graph has many entities and relationships connected to the actual answer, and when these are represented as an embedding, they encode far more information than an embedding learned from just a span of text containing the answer. The paper makes two main contributions: an inference procedure that is efficient and can handle long paths (i.e. multi-hop queries in knowledge bases), and a semantically rich representation of answers that encodes the graph path from the question entity to the answer entity along with the surrounding entities in the subgraph.
Looking at the figure above: given an input question “q”, the corresponding entity in the question (call it the “question entity”) is identified in the Freebase knowledge subgraph. For training, a sparse binary encoding of the question is obtained and then linearly transformed to produce the question embedding. The correct answer entity is identified in the same subgraph. The answer entity and the entities connected to it are binary encoded first, and then a linear transformation is applied to obtain the subgraph embedding. A scoring function is applied to the question and answer embeddings to produce a score. During training, the objective is to maximize the score for correct question-answer pairs.
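As a rough illustration of this pipeline, here is a minimal sketch in Python; all names, ids, and dimensions below are illustrative assumptions, not values from the paper. A question becomes a sparse binary (bag-of-words) vector, the answer subgraph becomes a binary vector over entities and relations, both are projected by a shared embedding matrix W, and the score is their dot product.

```python
import numpy as np

# Illustrative sizes: N covers vocabulary words + KB entities + relation types,
# k is the embedding dimension. Both numbers are assumptions, not the paper's.
N, k = 10_000, 64
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(k, N))  # shared embedding matrix, learned during training

def binary_encode(indices, size=N):
    """Sparse binary encoding: 1 at each active feature index, 0 elsewhere."""
    v = np.zeros(size)
    v[list(indices)] = 1.0
    return v

def embed(indices):
    """Linear transformation of the binary encoding: W @ phi(x)."""
    return W @ binary_encode(indices)

def score(question_ids, answer_ids):
    """Dot-product score between question embedding and answer embedding."""
    return embed(question_ids) @ embed(answer_ids)

# Toy usage with made-up feature ids for a question and an answer subgraph.
print(score({12, 47, 301}, {5001, 7200, 5430}))
```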
Background
There are two techniques widely used for Question Answering: Information Retrieval based and Semantic Parsing based. Information retrieval based systems retrieve a set of candidate answers for a given question and then rank them to pick the best answer. Semantic parsing based systems focus on correctly interpreting the meaning of the question; after successful parsing, the structured representation of the question is usually converted into a database query that returns the correct answer. One major limitation of both approaches is that although they can handle large knowledge bases (KBs), they require domain expertise to create grammars and KB schemas. This demands human intervention when scaling such systems to new languages and new knowledge bases.
Task setup
The task is to build an embeddings-based QA model from a training set of question-answer pairs given along with the corresponding entities in a Knowledge Base (KB). A Knowledge Base (also known as a Knowledge Graph) is a graph with entities as nodes and relationships between entities as edges.
The dataset used in this paper consists of:
- WebQuestions (used for evaluation), filtered so that each question contains an entity present in Freebase.
- Freebase: subject-predicate-object triples are used to generate question-answer pairs.
- ClueWeb: used to expand the lexicon and vocabulary of the system.
In addition, the authors used paraphrase generation techniques to supplement the training data. The Freebase-generated question-answer pairs, the ClueWeb extraction triples, and the paraphrased questions are all used to train the model.
Embedding Questions and Answers
Let “q” denote a question and “a” denote a candidate answer. Embeddings for questions and answers are learned jointly by learning a scoring function S(q, a) that scores how well “a” answers “q”.
Here q and a are each represented as a combination of the embeddings of their constituent words, entities, and relations, so learning the scoring function amounts to learning these embeddings.
W, shown below, is the matrix containing the embeddings of words, entities, and relationships. The entity embedding corresponding to the entity “Obama” is highlighted in the figure below:
The function f that maps questions into the embedding space and the function g that maps answers into the same embedding space are defined as:
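```latex
f(q) = \mathbf{W}\,\phi(q), \qquad g(a) = \mathbf{W}\,\psi(a), \qquad S(q, a) = f(q)^{\top} g(a)
```

Here φ(q) and ψ(a) are the sparse binary encodings of the question and the answer, and W is the shared embedding matrix (equations as given in the paper).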
Representing Candidate Answers
The authors present three different representations of a single candidate answer.
Single Entity
The answer candidate is represented as a single entity from Freebase: a 1-of-N_S coded vector with a 1 at the position of the answer entity and 0 everywhere else.
Path Representation
The answer is represented as the path from the question entity to the answer entity. In their experiments, the authors considered only 1-hop and 2-hop paths, so the encoding of the answer is either a 3-of-N_S coded vector (question entity, relation type, answer entity) or a 4-of-N_S coded vector (question entity, relation type 1, relation type 2, answer entity).
Subgraph Representation
In this case, both the path representation and the subgraph of entities connected to the answer entity are encoded. For each entity connected to the answer entity, the connecting relation and the connected entity are included in the encoded representation. To differentiate an entity's representation when it appears in the path versus when it appears as an entity connected to the answer (the subgraph part), a dictionary of double the size is used, so the two roles get different embeddings.
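To make the three encodings concrete, here is a small sketch; the dictionary size, ids, and neighbor list are illustrative assumptions. Note how the subgraph encoding doubles the dictionary so that path items and subgraph items land in different slots:

```python
import numpy as np

NS = 1000                       # illustrative dictionary size (entities + relations)
qe, rel, ans = 3, 500, 42       # made-up ids: question entity, relation, answer entity
neighbors = [(501, 77), (502, 90)]  # (relation, entity) pairs connected to the answer

def k_hot(indices, size):
    """Binary vector with a 1 at every given index."""
    v = np.zeros(size)
    v[list(indices)] = 1.0
    return v

# 1) Single entity: 1-of-NS vector marking only the answer entity.
single_entity = k_hot([ans], NS)

# 2) Path: 3-of-NS vector for a 1-hop path (question entity, relation, answer entity).
path = k_hot([qe, rel, ans], NS)

# 3) Subgraph: doubled dictionary of size 2*NS. Path items use slots [0, NS);
#    relations/entities connected to the answer use slots [NS, 2*NS), so the
#    same entity gets a different embedding depending on its role.
subgraph_ids = [qe, rel, ans] + [NS + i for r, e in neighbors for i in (r, e)]
subgraph = k_hot(subgraph_ids, 2 * NS)
```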
Training and Loss function
The authors use a margin-based ranking loss function:
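```latex
\min_{\mathbf{W}} \; \sum_{i=1}^{|\mathcal{D}|} \; \sum_{\bar{a} \in \bar{\mathcal{A}}(a_i)} \max\!\left(0,\; m - S(q_i, a_i) + S(q_i, \bar{a})\right)
```

where D is the training set of question-answer pairs (q_i, a_i), Ā(a_i) is a set of sampled incorrect answers, and m is the margin.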
Minimizing this loss learns the embedding matrix W such that the score of a question with a correct answer is greater than its score with any incorrect answer by a margin of at least m. Incorrect answers are sampled from a pool of incorrect answers: an incorrect answer is obtained by sampling an entity connected to the question entity other than the actual answer entity.
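Continuing the earlier sketch, a minimal version of this objective with negative sampling could look like the following; the scorer is passed in, and the margin and number of negatives are illustrative choices, not the paper's hyperparameters.

```python
import random

def ranking_loss(score_fn, q_ids, correct_ids, negative_pool, m=0.1, n_neg=5):
    """Margin-based ranking loss: for each sampled incorrect answer a_bar,
    add the hinge penalty max(0, m - S(q, a) + S(q, a_bar))."""
    s_pos = score_fn(q_ids, correct_ids)
    return sum(
        max(0.0, m - s_pos + score_fn(q_ids, neg_ids))
        for neg_ids in random.sample(negative_pool, n_neg)
    )
```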
Inference
At inference time, given a question q, the model predicts the answer as:
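```latex
\hat{a} = \operatorname*{argmax}_{a' \in \mathcal{A}(q)} S(q, a')
```

where A(q) is the set of candidate answers for q.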
- One possibility for this candidate set is to take the complete knowledge base, but that has speed and precision issues, so the authors decided to generate an answer candidate set for each question.
- The authors tried two strategies. C1: consider only the Freebase triples that contain the question entity, so that all 1-hop answer entities are included in the candidate set.
- C2: in order to answer questions whose answers lie 2 hops away, the authors leverage beam search to expand the second-hop relation candidates. The model first ranks the relation types likely to be expressed in the question and only adds 2-hop candidates to the answer candidate set when those relations appear in the paths (see the sketch after this list).
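A rough sketch of this two-step candidate generation is below; the KB interface (kb.neighbors), the relation ranker, and the beam size are all assumptions made for illustration, not the paper's actual implementation.

```python
def candidates_c1(kb, question_entity):
    """C1: all 1-hop (relation, entity) pairs around the question entity."""
    return {(rel, ent) for rel, ent in kb.neighbors(question_entity)}

def candidates_c2(kb, question, question_entity, rank_relations, beam_size=10):
    """C2: C1 plus 2-hop expansion, but only along first-hop edges whose
    relation type is among the top-ranked relations for this question."""
    cands = candidates_c1(kb, question_entity)
    top_rels = set(rank_relations(question)[:beam_size])  # beam over relation types
    for rel1, mid in list(cands):
        if rel1 in top_rels:
            for rel2, ent in kb.neighbors(mid):
                cands.add((rel1, rel2, ent))
    return cands
```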
Experiments and Results
- The authors report numbers in a table indicating that their strategy C2 performs best in terms of F1 score and Precision@1. Replacing C2 with C1 shows a drop in performance, as the system is then unable to handle any questions other than 1-hop ones.
- Overall, the results presented in this paper support the hypothesis that a richer representation of answers using the local subgraph can store the pertinent information needed to correctly answer the question.