PyTorch-BigGraph: from entity embeddings to edge scores

From entity embeddings to edge scores

The goal of training is to embed each entity in ℝ^D so that the embeddings of two entities are a good proxy for predicting whether a relation of a certain type exists between them.

More precisely, the goal is to learn an embedding for each entity and a function for each relation type that takes two entity embeddings and assigns them a score, such that positive relations achieve higher scores than negative ones.

Key point: each entity maps to a D-dimensional vector; each relation type maps to a function.

All the edges provided in the training set are considered positive instances. Training also requires a set of negative edges. These are not provided by the user but are generated by the system during training (negative sampling), usually by fixing the left-hand side entity and the relation type and sampling a new right-hand side entity, or vice versa. This sampling scheme makes sense for large sparse graphs, where there is a low probability that edges generated this way are true positive edges in the graph.
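The corruption scheme described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the library's sampler: the function name corrupt_edges, the 50/50 choice of which side to replace, and the uniform sampling over entity ids are all assumptions made for the example.

```python
import random


def corrupt_edges(edges, num_entities, num_negs, rng):
    """Build negatives by replacing one side of each positive edge.

    Hypothetical helper: for every (lhs, rel, rhs) it keeps the relation
    type and one endpoint, and samples the other endpoint uniformly.
    """
    negatives = []
    for lhs, rel, rhs in edges:
        for _ in range(num_negs):
            if rng.random() < 0.5:
                # Fix lhs and relation type, sample a new right-hand side.
                negatives.append((lhs, rel, rng.randrange(num_entities)))
            else:
                # Fix rhs and relation type, sample a new left-hand side.
                negatives.append((rng.randrange(num_entities), rel, rhs))
    return negatives


positives = [(0, "follows", 1), (2, "follows", 3)]
rng = random.Random(0)
negatives = corrupt_edges(positives, num_entities=1000, num_negs=2, rng=rng)
```

Nothing here filters out a sampled edge that happens to exist in the graph. With 1000 entities and only a handful of true edges, such collisions are rare, which is the sparsity argument made above.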

A priori, entity embeddings could take any value in ℝ^D. In some cases, however (for example when restricting them to lie within a certain ball, or when comparing them using cosine similarity), their “angle” matters more than their norm.

Per-relation scoring functions, however, must be expressible in a specific form (the most common functions in the literature can be converted to such a representation). In the current implementation, they are only allowed to transform the embedding of one of the two sides, which is then compared to the untransformed embedding of the other side using a generic symmetric comparator function that is the same for all relations. Formally, for left- and right-hand side entities 𝑥 and 𝑦 respectively, and for a relation type 𝑟, the score is:

𝑓_𝑟(𝜃_𝑥, 𝜃_𝑦) = 𝑐(𝜃_𝑥, 𝑔_𝑟(𝜃_𝑦))

where 𝜃_𝑥 and 𝜃_𝑦 are the embeddings of 𝑥 and 𝑦 respectively, 𝑓_𝑟 is the scoring function for 𝑟, 𝑔_𝑟 is the operator for 𝑟 and 𝑐 is the comparator.

Under “normal” circumstances (the so-called “standard” relations mode) the operator is applied only to the right-hand side entities. This is not the case when using dynamic relations. Applying the operator to both sides would often be redundant. Moreover, favoring one side over the other breaks the symmetry and captures the direction of the edge.
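To see how transforming only the right-hand side captures direction, here is a small sketch of the score as a comparator applied to the left embedding and the operator-transformed right embedding. The helper names (dot, translation, score) and the numbers are made up for illustration; plain lists stand in for tensors.

```python
def dot(u, v):
    """Symmetric comparator: inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def translation(params):
    """Build an operator g_r that adds a relation-specific vector."""
    return lambda emb: [e + p for e, p in zip(emb, params)]


def score(theta_x, theta_y, operator, comparator):
    # Comparator of the untouched left-hand side and the transformed
    # right-hand side: only theta_y goes through the operator.
    return comparator(theta_x, operator(theta_y))


theta_x = [1.0, 0.0]
theta_y = [0.0, 1.0]
g_r = translation([1.0, 0.0])

forward = score(theta_x, theta_y, g_r, dot)   # edge x -> y
backward = score(theta_y, theta_x, g_r, dot)  # edge y -> x
```

Although dot is symmetric, forward and backward differ (1.0 versus 0.0 with these values) because the operator touches only one side.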


Embeddings live in a 𝐷-dimensional real space, where 𝐷 is determined by the dimension configuration parameter.

Normally, each entity has its own embedding, which is entirely independent of any other entity’s embedding. With featurized entities, however, an entity’s embedding is the average of the embeddings of its features.
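As a rough illustration of the featurized case, the sketch below averages per-feature vectors to obtain an entity embedding. The feature names, the dictionary layout and the helper featurized_embedding are hypothetical; the library's own storage format is not shown here.

```python
def featurized_embedding(feature_ids, feature_embeddings):
    """Average the embeddings of an entity's features, dimension by dimension."""
    vectors = [feature_embeddings[f] for f in feature_ids]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Hypothetical learned feature embeddings in a 2-dimensional space.
feature_embeddings = {
    "red": [1.0, 0.0],
    "round": [0.0, 1.0],
    "sweet": [2.0, 2.0],
}
apple = featurized_embedding(["red", "round", "sweet"], feature_embeddings)
```

Two entities that share most of their features end up with nearby embeddings, since only the feature vectors are learned.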

If the max_norm configuration parameter is set, embeddings will be projected onto the ball of radius max_norm after each parameter update.
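A projection onto a ball can be sketched as follows: vectors already inside the ball are left alone, and longer ones are rescaled to have norm exactly max_norm. This is an assumed, generic formulation in plain Python; the helper project_to_ball is not part of the library.

```python
import math


def project_to_ball(embedding, max_norm):
    """Rescale an embedding so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in embedding))
    if norm <= max_norm:
        return list(embedding)  # already inside the ball
    scale = max_norm / norm
    return [x * scale for x in embedding]


inside = project_to_ball([0.3, 0.4], max_norm=1.0)   # norm 0.5, unchanged
outside = project_to_ball([3.0, 4.0], max_norm=1.0)  # norm 5.0, shrunk
```

The direction of the vector is preserved, which is why the “angle” carries the information once a norm constraint is active.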

To add a new type of embedding, one needs to subclass the torchbiggraph.model.AbstractEmbedding class.

Global embeddings

When the global_emb configuration option is active, each entity’s embedding will be translated by a vector that is specific to each entity type (and that is learned at the same time as the embeddings).

The operators that are currently provided are:
• none, no-op, which leaves the embeddings unchanged;
• translation, which adds to the embedding a vector of the same dimension;
• diagonal, which multiplies each dimension by a different coefficient (equivalent to multiplying by a diagonal matrix);
• linear, which applies a linear map, i.e., multiplies by a full square matrix;
• affine, which applies an affine transformation, i.e., linear followed by translation;
• complex_diagonal, which interprets the 𝐷-dimensional real vector as a 𝐷/2-dimensional complex vector (𝐷 must be even; the first half of the vector holds the real parts, the second half the imaginary parts) and then multiplies each entry by a different complex parameter, just like diagonal.

All the operators’ parameters are learned during training.
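The arithmetic behind each operator is short enough to write out. The following sketch uses plain Python lists and passes the parameters explicitly; in the real system they are learned, and the function signatures here are illustrative assumptions rather than the library's API.

```python
def translation(emb, offset):
    return [e + o for e, o in zip(emb, offset)]


def diagonal(emb, coeffs):
    return [e * c for e, c in zip(emb, coeffs)]


def linear(emb, matrix):
    # Matrix-vector product with a full square matrix (list of rows).
    return [sum(m * e for m, e in zip(row, emb)) for row in matrix]


def affine(emb, matrix, offset):
    return translation(linear(emb, matrix), offset)


def complex_diagonal(emb, real, imag):
    # First half of emb: real parts; second half: imaginary parts.
    half = len(emb) // 2
    re, im = emb[:half], emb[half:]
    # (a + bi) * (c + di) = (ac - bd) + (ad + bc)i, entry by entry.
    out_re = [a * c - b * d for a, b, c, d in zip(re, im, real, imag)]
    out_im = [a * d + b * c for a, b, c, d in zip(re, im, real, imag)]
    return out_re + out_im


emb = [1.0, 2.0, 3.0, 4.0]
shifted = translation(emb, [1.0, 1.0, 1.0, 1.0])
scaled = diagonal(emb, [2.0, 0.0, 1.0, -1.0])
# Multiplying every complex entry by i (real part 0, imaginary part 1).
rotated = complex_diagonal(emb, real=[0.0, 0.0], imag=[1.0, 1.0])
```

For a 𝐷-dimensional embedding, translation and diagonal each need 𝐷 parameters, linear needs 𝐷², affine needs 𝐷² + 𝐷, and complex_diagonal needs 𝐷 (𝐷/2 complex numbers).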

To define an additional operator, one must subclass the torchbiggraph.model.AbstractOperator class (or the torchbiggraph.model.AbstractDynamicOperator one when using dynamic relations; their docstrings explain what must be implemented) and decorate it with the torchbiggraph.model.OPERATORS.register_as() decorator (respectively the torchbiggraph.model.DYNAMIC_OPERATORS.register_as() one), specifying a new name that can then be used in the config to select that operator. All of the above can be done inside the config file itself.

The available comparators are:
• dot, the dot product, which computes the scalar or inner product of the two embedding vectors;
• cos, the cosine similarity, which is the cosine of the angle between the two vectors or, equivalently, the dot product divided by the product of the vectors’ norms;
• l2, the negative L2 distance, a.k.a. the Euclidean distance (negative because the score acts as a similarity, so smaller distances should get higher scores);
• squared_l2, the negative squared L2 distance.
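The four comparators listed above reduce to a few lines of arithmetic. This is a plain-Python sketch for illustration only, with function names chosen to mirror the config names; it does not guard against zero-norm vectors in cos.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cos(u, v):
    # Dot product divided by the product of the norms.
    # Assumes neither vector is all zeros.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def squared_l2(u, v):
    # Negated so that closer vectors get higher scores.
    return -sum((a - b) ** 2 for a, b in zip(u, v))


def l2(u, v):
    return -math.sqrt(-squared_l2(u, v))


u = [3.0, 4.0]
v = [4.0, 3.0]
scores = {f.__name__: f(u, v) for f in (dot, cos, l2, squared_l2)}
```

All four are symmetric in their arguments, so any directionality in the final score has to come from the operator.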

A custom comparator must extend the torchbiggraph.model.AbstractComparator class (its docstring explains how) and be decorated with the torchbiggraph.model.COMPARATORS.register_as() decorator, specifying a new name that can then be used in the config to select that comparator. All of the above can be done inside the config file itself.


If the bias configuration key is in use, then the first coordinate of each embedding acts as a bias in the comparator computation. This means that the comparator is computed only on the last 𝐷−1 entries of the vectors, and the first entries of both vectors are then added to the result.
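In code, the bias rule amounts to slicing off the first coordinate before comparing. The sketch below uses a dot-product comparator and made-up vectors; score_with_bias is a hypothetical helper, and the operator step is left out to keep the example focused.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def score_with_bias(theta_x, theta_y, comparator):
    # Compare only the last D-1 coordinates, then add both first
    # coordinates as per-entity bias terms.
    return comparator(theta_x[1:], theta_y[1:]) + theta_x[0] + theta_y[0]


theta_x = [0.5, 1.0, 2.0]     # bias 0.5, vector part [1.0, 2.0]
theta_y = [-0.25, 3.0, 4.0]   # bias -0.25, vector part [3.0, 4.0]
biased = score_with_bias(theta_x, theta_y, dot)
```

Here the vector parts contribute 1·3 + 2·4 = 11 and the biases contribute 0.5 − 0.25, for a total of 11.25. A large bias raises an entity's score against every other entity, independently of direction.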

Coherent sets of configuration parameters

While the parameters described in this chapter are exposed as uncoupled knobs in the configuration file (to more closely match the implementation, and to allow for more flexible tuning), some combinations of them are more sensible than others.

Apart from the default one, the following configuration has been found to work well: init_scale = 0.1, comparator = dot, bias = true, loss_fn = logistic, lr = 0.1.
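Written as a config file, that combination might look like the fragment below. It assumes the usual convention of a Python config module exposing a get_torchbiggraph_config() function that returns a dict; only the five keys named above are shown, and a usable config would also need entities, relations, paths, the dimension and so on.

```python
def get_torchbiggraph_config():
    # Fragment only: the five settings named in the text. Everything else
    # a real config requires (entities, relations, paths, dimension, ...)
    # is deliberately omitted.
    config = dict(
        init_scale=0.1,
        comparator="dot",
        bias=True,
        loss_fn="logistic",
        lr=0.1,
    )
    return config
```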

Interpreting the scores

Depending on the loss function used during training, the scores take on different meanings and become more suitable for certain applications. Common uses include ranking which other entities may be related to a given entity, determining the probability that a certain relation exists between two given entities, etc.
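The two uses mentioned above can be illustrated with a toy example. Ranking only needs the scores to be comparable; reading a score as a probability is sketched here by passing it through a sigmoid, which is an assumption that is only reasonable when training used a logistic loss. The candidate names and score values are invented.

```python
import math


def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))


def rank_candidates(scores):
    """Return candidate names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores of three candidate right-hand side entities
# for one fixed left-hand side entity and relation type.
scores = {"paris": 2.0, "berlin": 0.0, "tokyo": -1.5}

ranking = rank_candidates(scores)
probabilities = {name: sigmoid(s) for name, s in scores.items()}
```

With a ranking-style loss the absolute values carry no such interpretation, and only the ordering in ranking is meaningful.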


