Knowledge Graph Embeddings

I am convinced that the crux of the problem of learning is recognizing relationships and being able to use them

Christopher Strachey in a letter to Alan Turing, 1954

Knowledge graphs represent information via entities and their relationships. This form of relational knowledge representation has a long history in logic and artificial intelligence. More recently, it has also formed the basis of the Semantic Web and its vision of a "web of data" that is readable by machines.

In this project, we explore embedding methods for learning from knowledge graphs. These methods combine multiple advantages:

  • State-of-the-art results on complex tasks such as link prediction and entity resolution
  • Scalability to knowledge graphs with millions of entities and billions of facts
  • Access to relational information for deep learning methods

Relational embedding methods are therefore not only interesting from a knowledge graph perspective, but can also be an important step towards relational reasoning in modern AI systems.

One of the first knowledge graph embedding methods is RESCAL, which computes a three-way factorization of an adjacency tensor that represents the knowledge graph. Alternatively, it can be interpreted as a compositional model in which pairs of entities are represented via the tensor product of their embeddings. RESCAL is a very powerful model that can capture complex relational patterns over multiple hops in a graph. Figure 1 shows an illustration of the factorization model; a small code sketch of the scoring function follows below.

Figure 1: Illustration of the RESCAL factorization.
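
To make this concrete, here is a minimal sketch of RESCAL's bilinear scoring function, written in NumPy; the embedding dimension and all variable names are illustrative assumptions, not taken from any particular implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # embedding dimension (illustrative choice)

# Illustrative parameters: one vector per entity, one d x d matrix per relation.
e_s = rng.normal(size=d)       # subject entity embedding
e_o = rng.normal(size=d)       # object entity embedding
W_r = rng.normal(size=(d, d))  # relation matrix (note: d^2 parameters)

# RESCAL scores a triple (s, r, o) with the bilinear form e_s^T W_r e_o.
score = e_s @ W_r @ e_o

# Compositional view: the pair (s, o) is represented by the tensor (outer)
# product of its embeddings, and the relation matrix acts as a linear
# scoring weight on that representation.
pair = np.outer(e_s, e_o)  # d x d pair representation
assert np.isclose(score, np.sum(W_r * pair))
```

The outer-product view also makes the scalability issue discussed next explicit: every relation carries a dense d x d parameter matrix, and composing a single pair costs O(d^2) time and memory.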

However, RESCAL can be hard to scale to very large knowledge graphs because it has quadratic runtime and memory complexity with respect to the embedding dimension. For this reason, we also explored compositional operators that are more efficient than the tensor product. One outcome of this research direction is Holographic Embeddings of Knowledge Graphs (HolE), which uses circular correlation as the compositional operator. Due to this, HolE can also be interpreted as a multi-relational associative memory, where the relation embeddings store which latent pairs (or pairs of prototypes) are true for a certain relation type. HolE retains the excellent performance of RESCAL while being far more scalable, as it depends only linearly on the embedding dimension.

Figure 2: Holographic associative memory. Figure adapted from Willshaw (1981).
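
For comparison with the RESCAL sketch above, here is a minimal sketch of HolE's scoring function, again assuming NumPy; it relies on the standard FFT identity for circular correlation, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # embedding dimension (illustrative choice)

def ccorr(a, b):
    """Circular correlation of two vectors in O(d log d), using the
    FFT identity: ccorr(a, b) = ifft(conj(fft(a)) * fft(b))."""
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

e_s = rng.normal(size=d)  # subject embedding
e_o = rng.normal(size=d)  # object embedding
r_p = rng.normal(size=d)  # relation embedding: d parameters instead of d^2

# HolE scores a triple by matching the relation embedding against the
# circular correlation of the entity pair: f(s, p, o) = r_p . ccorr(e_s, e_o).
score = r_p @ ccorr(e_s, e_o)

# Sanity check against the direct O(d^2) definition:
# [ccorr(a, b)]_k = sum_i a_i * b_{(k + i) mod d}
direct = np.array([sum(e_s[i] * e_o[(k + i) % d] for i in range(d))
                   for k in range(d)])
assert np.allclose(ccorr(e_s, e_o), direct)
```

Note that the composed pair representation stays d-dimensional, which is what gives HolE its linear dependency on the embedding dimension.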

An extensive review of statistical machine learning for knowledge graphs is available in the Proc. IEEE paper listed under Publications below. Moreover, a Python library that implements many state-of-the-art knowledge graph embedding methods (e.g., RESCAL, HolE, TransE, ER-MLP) is available on GitHub.
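
To illustrate how these methods differ mainly in their scoring functions, here is a hedged sketch of TransE's translational score, one of the methods named above; the setup is illustrative and not drawn from any specific library API:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # embedding dimension (illustrative choice)

# TransE represents entities and relations as vectors in the same space and
# models a relation as a translation: a true triple should satisfy
# e_s + r_p ~ e_o, so the score is the negative translation error.
e_s, r_p, e_o = rng.normal(size=(3, d))
score = -np.linalg.norm(e_s + r_p - e_o)
```

Across all three sketches, only the composition step changes (tensor product, circular correlation, or vector translation); training then typically ranks observed triples above corrupted ones using such scores.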

Publications

Fast Linear Model for Knowledge Graph Embeddings. AKBC, 2017.

Holographic Embeddings of Knowledge Graphs. AAAI, 2015.

A Review of Relational Machine Learning for Knowledge Graphs. Proc. IEEE, 2015.

Querying Factorized Probabilistic Triple Databases. ISWC, 2014.

Tensor Factorization for Multi-Relational Learning. ECML/PKDD Nectar Track, 2013.

Factorizing YAGO: Scalable Machine Learning for Linked Data. WWW, 2012.
