# THUMT: An Open Source Toolkit for Neural Machine Translation

## Contents

* [Introduction](#introduction)
* [Online Demo](#online-demo)
* [Implementations](#implementations)
* [Notable Features](#notable-features)
* [Documentation](#documentation)
* [License](#license)
* [Citation](#citation)
* [Development Team](#development-team)
* [Contact](#contact)
* [Derivative Repositories](#derivative-repositories)

## Introduction

Machine translation is a natural language processing task that aims to automatically translate between natural languages using computers. Recent years have witnessed the rapid development of end-to-end neural machine translation, which has become the new mainstream method in practical MT systems.

THUMT is an open-source toolkit for neural machine translation developed by [the Natural Language Processing Group at Tsinghua University](http://nlp.csai.tsinghua.edu.cn/site2/index.php?lang=en). The website of THUMT is: [http://thumt.thunlp.org/](http://thumt.thunlp.org/).

## Online Demo

The online demo of THUMT is available at [http://translate.thumt.cn/](http://101.6.5.207:3892/). The languages involved include Ancient Chinese, Arabic, Chinese, English, French, German, Indonesian, Japanese, Portuguese, Russian, and Spanish.

## Implementations

THUMT currently has three main implementations:

* [THUMT-PyTorch](https://github.com/thumt/THUMT): a new implementation developed with [PyTorch](https://github.com/pytorch/pytorch). It implements the Transformer model (**Transformer**) ([Vaswani et al., 2017](https://arxiv.org/abs/1706.03762)).
* [THUMT-TensorFlow](https://github.com/thumt/THUMT/tree/tensorflow): an implementation developed with [TensorFlow](https://github.com/tensorflow/tensorflow). It implements the sequence-to-sequence model (**Seq2Seq**) ([Sutskever et al., 2014](https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)), the standard attention-based model (**RNNsearch**) ([Bahdanau et al., 2014](https://arxiv.org/pdf/1409.0473.pdf)), and the Transformer model (**Transformer**) ([Vaswani et al., 2017](https://arxiv.org/abs/1706.03762)).
* [THUMT-Theano](https://github.com/thumt/THUMT/tree/theano): the original project developed with [Theano](https://github.com/Theano/Theano), which is no longer updated because MILA ceased development of [Theano](https://github.com/Theano/Theano). It implements the standard attention-based model (**RNNsearch**) ([Bahdanau et al., 2014](https://arxiv.org/pdf/1409.0473.pdf)), minimum risk training (**MRT**) ([Shen et al., 2016](http://nlp.csai.tsinghua.edu.cn/~ly/papers/acl2016_mrt.pdf)) for optimizing model parameters with respect to evaluation metrics (a sketch of the MRT objective follows this list), semi-supervised training (**SST**) ([Cheng et al., 2016](http://nlp.csai.tsinghua.edu.cn/~ly/papers/acl2016_semi.pdf)) for exploiting monolingual corpora to learn bi-directional translation models, and layer-wise relevance propagation (**LRP**) ([Ding et al., 2017](http://nlp.csai.tsinghua.edu.cn/~ly/papers/acl2017_dyz.pdf)) for visualizing and analyzing RNNsearch.
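To make the MRT criterion above concrete, here is a minimal PyTorch sketch of the minimum risk training objective of Shen et al. (2016). This is not THUMT's actual implementation: the function name `mrt_loss` and its inputs, per-candidate log-probabilities and per-candidate risks (e.g., one minus sentence-level BLEU), are illustrative assumptions.

```python
import torch

def mrt_loss(log_probs: torch.Tensor, risks: torch.Tensor,
             alpha: float = 5e-3) -> torch.Tensor:
    """Expected risk over a sampled subset of candidate translations (a sketch).

    log_probs: (K,) log P(y | x; theta) for K sampled candidate translations
    risks:     (K,) risk Delta(y, y*) of each candidate, e.g.
               1 - sentence-level BLEU against the reference y*
    alpha:     sharpness of the renormalized distribution Q
    """
    # Q(y | x) is proportional to P(y | x)^alpha, renormalized over the
    # sampled subset; in log space this is a softmax of alpha * log_probs.
    q = torch.softmax(alpha * log_probs, dim=0)
    # Minimizing the expected risk under Q moves probability mass toward
    # candidates that score well under the evaluation metric.
    return (q * risks).sum()
```

The hyperparameter `alpha` controls how sharply Q concentrates on high-probability candidates; Shen et al. (2016) use a small value (5 × 10⁻³), which makes Q much smoother than the model distribution itself.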
The following table summarizes the features of the three implementations:

| Implementation | Model | Criterion | Optimizer | LRP |
| :------------: | :---: | :-------: | :-------: | :-: |
| Theano | RNNsearch | MLE, MRT, SST | SGD, Adadelta, Adam | RNNsearch |
| TensorFlow | Seq2Seq, RNNsearch, Transformer | MLE | Adam | RNNsearch, Transformer |
| PyTorch | Transformer | MLE | SGD, Adadelta, Adam | N.A. |

We recommend using [THUMT-PyTorch](https://github.com/thumt/THUMT) or [THUMT-TensorFlow](https://github.com/thumt/THUMT/tree/tensorflow), which deliver better translation performance than [THUMT-Theano](https://github.com/thumt/THUMT/tree/theano). We will keep adding new features to [THUMT-PyTorch](https://github.com/thumt/THUMT) and [THUMT-TensorFlow](https://github.com/thumt/THUMT/tree/tensorflow).

## Notable Features

* Transformer ([Vaswani et al., 2017](https://arxiv.org/abs/1706.03762))
* Multi-GPU training & decoding
* Multi-worker distributed training
* Mixed precision training & decoding
* Model ensemble & averaging (see the sketch after this list)
* Gradient aggregation
* TensorBoard for visualization
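To illustrate the checkpoint-averaging feature listed above: THUMT ships its own utilities for this, so the following is only a generic PyTorch sketch of the idea, assuming each checkpoint file holds a plain parameter `state_dict`; the function name `average_checkpoints` is hypothetical.

```python
import torch

def average_checkpoints(paths: list[str]) -> dict[str, torch.Tensor]:
    """Average the parameters of several saved checkpoints (a sketch).

    paths: checkpoint files, each saved with torch.save(model.state_dict(), ...)
    """
    avg: dict[str, torch.Tensor] = {}
    for path in paths:
        state = torch.load(path, map_location="cpu")
        for name, param in state.items():
            # Accumulate in float32 to avoid precision loss with fp16 weights.
            if name in avg:
                avg[name] += param.float()
            else:
                avg[name] = param.float().clone()
    return {name: total / len(paths) for name, total in avg.items()}

# Usage with hypothetical paths; the averaged weights can then be loaded
# back into a model with model.load_state_dict(...):
# averaged = average_checkpoints(["ckpt-1.pt", "ckpt-2.pt", "ckpt-3.pt"])
```

Averaging the last few checkpoints is a standard trick from the Transformer literature: Vaswani et al. (2017) average the final checkpoints of a run and report small but consistent BLEU gains over the last single checkpoint.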
## Documentation

The documentation of the PyTorch implementation is available [here](docs/index.md).

## License

The source code is dual licensed. Open source licensing is under the [BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause), which allows free use for research purposes. For commercial licensing, please email [thumt17@gmail.com](mailto:thumt17@gmail.com).

## Citation

Please cite the following papers:

> Zhixing Tan, Jiacheng Zhang, Xuancheng Huang, Gang Chen, Shuo Wang, Maosong Sun, Huanbo Luan, Yang Liu. 2020. [THUMT: An Open Source Toolkit for Neural Machine Translation](https://www.aclweb.org/anthology/2020.amta-research.11/). AMTA 2020.

> Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng, Maosong Sun, Huanbo Luan, Yang Liu. 2017. [THUMT: An Open Source Toolkit for Neural Machine Translation](https://arxiv.org/abs/1706.06415). arXiv:1706.06415.

## Development Team

Project leaders: [Maosong Sun](http://www.thunlp.org/site2/index.php/zh/people?id=16), [Yang Liu](http://nlp.csai.tsinghua.edu.cn/~ly/), Huanbo Luan

Project members:

* Theano: Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng
* TensorFlow: Zhixing Tan, Jiacheng Zhang, Xuancheng Huang, Gang Chen, Shuo Wang, Zonghan Yang
* PyTorch: Zhixing Tan, Gang Chen

## Contact

If you have questions, suggestions, or bug reports, please email [thumt17@gmail.com](mailto:thumt17@gmail.com).

## Derivative Repositories

* [UCE4BT](https://github.com/THUNLP-MT/UCE4BT) (Improving Back-Translation with Uncertainty-based Confidence Estimation)
* [L2Copy4APE](https://github.com/THUNLP-MT/L2Copy4APE) (Learning to Copy for Automatic Post-Editing)
* [Document-Transformer](https://github.com/THUNLP-MT/Document-Transformer) (Improving the Transformer Translation Model with Document-Level Context)
* [PR4NMT](https://github.com/THUNLP-MT/PR4NMT) (Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization)