# TransE-Pytorch

**Repository Path**: jiwenfie/TransE-Pytorch

## Basic Information

- **Project Name**: TransE-Pytorch
- **Description**: An implementation of TransE in PyTorch.
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2019-12-09
- **Last Updated**: 2021-12-29

## README

An implementation of TransE\* in PyTorch.

## Overview

Test results on FB15K with the default parameters:

| Link prediction setting | Mean Rank | Filter@10 |
| --- | --- | --- |
| Raw | 353.5773 | 0.101488039816 |
| Filter | 276.4282 | 0.171166900848 |

`Filter@10` is the column label printed by the script; it appears to correspond to the Hit@10 metric controlled by `self.top`. Both Mean Rank values were computed on a CUDA device.

Better performance can be achieved by tuning the parameters.

\* Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data[C]//Advances in Neural Information Processing Systems. 2013: 2787-2795.
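As a reading aid for the results above, here is a minimal sketch of how Mean Rank and Hit@k are commonly computed in the raw and filtered link-prediction settings described by Bordes et al. It is not taken from this repository: the function names, the list-based score layout, and the toy numbers are all assumptions made for illustration, and the actual evaluation code may differ in detail.

```python
# Illustrative only: hypothetical helpers, not the repository's evaluation code.

def rank_of_true_tail(scores, true_tail, known_tails=()):
    """Return the 1-based rank of true_tail.

    scores[e] is the distance of candidate tail entity e (lower is better).
    known_tails lists other tails that also form correct triples; they are
    skipped in the filtered setting. Leave it empty for the raw setting.
    """
    skip = set(known_tails) - {true_tail}
    true_score = scores[true_tail]
    better = sum(
        1 for entity, score in enumerate(scores)
        if entity not in skip and score < true_score
    )
    return better + 1


def mean_rank(ranks):
    """Average rank over all test triples (lower is better)."""
    return sum(ranks) / len(ranks)


def hit_at_k(ranks, k=10):
    """Fraction of test triples whose true entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example: four candidate entities, entity 2 is the true tail.
toy_scores = [0.9, 0.2, 0.5, 0.1]
raw_rank = rank_of_true_tail(toy_scores, true_tail=2)
filtered_rank = rank_of_true_tail(toy_scores, true_tail=2, known_tails=[3])

toy_ranks = [3, 1, 12]
toy_mean_rank = mean_rank(toy_ranks)
toy_hit_at_10 = hit_at_k(toy_ranks, k=10)
```

In the filtered setting, candidates that already form a known correct triple are removed before ranking, which is why the filtered Mean Rank is lower and the filtered @10 score is higher than the raw ones.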
## Parameters

Configure the parameters in `Train.py`:

| Parameter | Default | Description |
| --- | --- | --- |
| `self.inAdd` | `"./data/FB15K"` | Input address |
| `self.outAdd` | `"./data/outputData"` | Output address |
| `self.preAdd` | `"./data/outputData"` | Address of the existing pre-trained embeddings |
| `self.preOrNot` | `False` | Whether to continue training from the existing embeddings in `self.preAdd` |
| `self.entityDimension` | `100` | Dimension of the entity embeddings |
| `self.relationDimension` | `100` | Dimension of the relation embeddings |
| `self.numOfEpochs` | `1000` | Number of epochs |
| `self.outputFreq` | `50` | Output the learning results every `self.outputFreq` epochs |
| `self.numOfBatches` | `100` | Number of batches |
| `self.learningRate` | `0.01` | Learning rate of the SGD optimizer |
| `self.weight_decay` | `0.001` | Weight decay of the SGD optimizer (the source comment also lists 0.005 and 0.02) |
| `self.margin` | `1.0` | Margin of the loss function |
| `self.norm` | `2` | Norm of the loss function |
| `self.top` | `10` | The test metric is Hit@`self.top` |
| `self.patience` | `10` | Change the learning rate and weight decay when the validation result has not improved for `self.patience` epochs |
| `self.earlyStopPatience` | `5` | Stop training and output the learning results after the learning rate and weight decay have been changed `self.earlyStopPatience` times |

## Data

`Training Data`

- train2id.txt: the first line is the number of triples; each following line has the format "head_id tail_id relation_id".
- entity2id.txt: the first line is the number of entities; each following line has the format "entity \t entity_id".
- relation2id.txt: the first line is the number of relations; each following line has the format "relation \t relation_id".
- valid2id.txt: the first line is the number of validation triples; each following line has the format "head_id tail_id relation_id".

Note: head_id and tail_id must match the entity_id values in entity2id.txt, and relation_id must match the relation_id values in relation2id.txt.
For example, suppose the training set contains the triple "head tail relation", entity2id.txt contains "head \t 0" and "tail \t 1", and relation2id.txt contains "relation \t 0". Then train2id.txt should contain the line "0 1 0".

`Test Data`

- test2id.txt: the first line is the number of test triples; each following line has the format "head_id tail_id relation_id".

`Output Data`

- entity2vec.pickle: the pickle file that stores the embedding vectors of the entities (see transE.entity_embeddings.weight.data in TransE.py).
- relation2vec.pickle: the pickle file that stores the embedding vectors of the relations (see transE.relation_embeddings.weight.data in TransE.py).

Note: the function preRead() implemented in Train.py can be used to read the pickle files.
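The text formats described under Data can be checked with a short parser. The sketch below is only an illustration under the stated layout (a count on the first line, then whitespace-separated IDs or tab-separated name/ID pairs); `read_triples` and `read_id_map` are hypothetical helpers and are not part of this repository. It uses in-memory strings in place of real files so that it runs on its own.

```python
import io

# Hypothetical helpers for illustration; they are not defined in Train.py or TransE.py.

def read_triples(handle):
    """Parse a *2id.txt triple file: a count line, then 'head_id tail_id relation_id' rows."""
    expected = int(handle.readline().strip())
    triples = []
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        head_id, tail_id, relation_id = (int(p) for p in parts[:3])
        triples.append((head_id, tail_id, relation_id))
    if len(triples) != expected:
        raise ValueError(f"expected {expected} triples, found {len(triples)}")
    return triples


def read_id_map(handle):
    """Parse entity2id.txt or relation2id.txt: a count line, then 'name<TAB>id' rows."""
    expected = int(handle.readline().strip())
    mapping = {}
    for line in handle:
        if not line.strip():
            continue  # ignore blank lines
        name, idx = line.rstrip("\n").rsplit("\t", 1)
        mapping[name.strip()] = int(idx)
    if len(mapping) != expected:
        raise ValueError(f"expected {expected} entries, found {len(mapping)}")
    return mapping


# The README example: "head" -> 0, "tail" -> 1, "relation" -> 0, triple "0 1 0".
entity2id = read_id_map(io.StringIO("2\nhead\t0\ntail\t1\n"))
relation2id = read_id_map(io.StringIO("1\nrelation\t0\n"))
train_triples = read_triples(io.StringIO("1\n0 1 0\n"))

# Every ID used in a triple should exist in the corresponding mapping.
entity_ids = set(entity2id.values())
relation_ids = set(relation2id.values())
ids_consistent = all(
    h in entity_ids and t in entity_ids and r in relation_ids
    for h, t, r in train_triples
)
```

With real data, the same functions can be called on files opened with `open(path, encoding="utf-8")`, for example on the files under the `self.inAdd` directory.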