# pytorch-bert-ner

**Repository Path**: bluew11/pytorch-bert-ner

## Basic Information

- **Project Name**: pytorch-bert-ner
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-09-19
- **Last Updated**: 2024-10-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# pytorch-bert-ner

BERT-based named entity recognition, implemented in PyTorch, with support for Chinese and English.

# Requirements

- `python3`
- `pip3 install -r requirements.txt`

# Run Example

**--bert_model** is the path to the pretrained PyTorch BERT model. The directory must contain **pytorch_model.bin**, **vocab.txt**, and **bert_config.json**.

If you have a TensorFlow BERT model (downloaded from https://github.com/google-research/bert), convert it to a PyTorch BERT model with the following command:

`python3 convert_tf_checkpoint_to_pytorch.py --tf_checkpoint_path ../bert_model.ckpt --bert_config_file ../bert_config.json --pytorch_dump_path ../pytorch_model.bin`

Downloading and using [the converted model](#data_model) is recommended.

### English NER

`python3 run_ner.py --data_dir=data/ --bert_model=base-cased --task_name=ner --output_dir=output --max_seq_length=64 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.4`

### Chinese NER

The training corpus in the data folder is a small part of the People's Daily news corpus, included for a quick start; [downloading](#data_model) the data is recommended.

`python3 run_ner.py --data_dir=data/ --bert_model=chinese-base-uncased --task_name=chinese_ner --output_dir=output --max_seq_length=64 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.4`

## Pretrained PyTorch model and data download

Link: https://pan.baidu.com/s/1TNcsx6zGCKjN_KY2It7hyA

Extraction code: mlmd

# Result

### Validation Data

```
             precision    recall  f1-score   support

       MISC     0.9407    0.9304    0.9355       273
        LOC     0.9650    0.9881    0.9764       419
        PER     0.9844    0.9783    0.9813       322
        ORG     0.9794    0.9852    0.9822       337

avg / total     0.9683    0.9734    0.9708      1351
```

### Test Data

```
             precision    recall  f1-score   support

        ORG     0.9152    0.9073    0.9113       464
        PER     0.9767    0.9692    0.9730       260
        LOC     0.9397    0.9263    0.9330       353
       MISC     0.8276    0.9014    0.8629       213

avg / total     0.9198    0.9240    0.9217      1290
```

# Inference

`python3 predict.py`

```
{'2': {'tag': 'B_T', 'confidence': 0.9999847412109375},
 '0': {'tag': 'I_T', 'confidence': 0.9989903569221497},
 '1': {'tag': 'I_T', 'confidence': 0.9995298385620117},
 '4': {'tag': 'I_T', 'confidence': 0.9996459484100342},
 '年': {'tag': 'I_T', 'confidence': 0.9996104836463928},
 '新': {'tag': 'O', 'confidence': 0.9995424747467041},
 '的': {'tag': 'O', 'confidence': 0.9997028708457947},
 '开': {'tag': 'O', 'confidence': 0.9999663829803467},
 '始': {'tag': 'O', 'confidence': 0.9999591112136841},
 ',王': {'tag': 'O', 'confidence': 0.9999748468399048},
 '兴': {'tag': 'I_PER', 'confidence': 0.9997753500938416},
 '很': {'tag': 'O', 'confidence': 0.9993890523910522},
 '高': {'tag': 'O', 'confidence': 0.9992743134498596},
 '兴': {'tag': 'O', 'confidence': 0.9999097585678101}}
```

Or call the model from Python:

```
from bert import Ner

# Load the fine-tuned model from the training output directory
model = Ner("output/")

# Tag each token of the sentence
output = model.predict("Steve went to Paris")

print(output)
# {
#     "Steve": {
#         "tag": "B-PER",
#         "confidence": 0.999879002571106
#     },
#     "went": {
#         "tag": "O",
#         "confidence": 0.9968552589416504
#     },
#     "to": {
#         "tag": "O",
#         "confidence": 0.9996656179428101
#     },
#     "Paris": {
#         "tag": "B-LOC",
#         "confidence": 0.999504804611206
#     }
# }
```
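
The per-token tags shown above can be merged into entity spans. The sketch below is one possible way to do that. `group_entities` is a hypothetical helper, not part of this repository, and it assumes the prediction has first been turned into a list of `(token, tag)` pairs in sentence order, with tags shaped like `B-PER`/`I-PER` or `B_T`/`I_T` as in the outputs above.

```python
from typing import Iterable, List, Tuple


def group_entities(pairs: Iterable[Tuple[str, str]], joiner: str = " ") -> List[Tuple[str, str]]:
    """Merge consecutive B/I-tagged tokens into (text, label) entities.

    Hypothetical helper for illustration; not part of this repository.
    """
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = ""

    for token, tag in pairs:
        # An entity tag looks like "B-PER", "I-PER", "B_T" or "I_T".
        inside = len(tag) > 2 and tag[0] in "BI" and tag[1] in "-_"
        # Close the open entity on a non-entity tag, a B tag, or a label change.
        if tokens and (not inside or tag[0] == "B" or tag[2:] != label):
            entities.append((joiner.join(tokens), label))
            tokens = []
        if inside:
            label = tag[2:]
            tokens.append(token)

    # Flush an entity that runs to the end of the sentence.
    if tokens:
        entities.append((joiner.join(tokens), label))
    return entities


# Tags copied from the English example above.
english = [("Steve", "B-PER"), ("went", "O"), ("to", "O"), ("Paris", "B-LOC")]
print(group_entities(english))  # [('Steve', 'PER'), ('Paris', 'LOC')]

# First tokens of the Chinese example above; characters are joined without spaces.
chinese = [("2", "B_T"), ("0", "I_T"), ("1", "I_T"), ("4", "I_T"), ("年", "I_T"), ("新", "O")]
print(group_entities(chinese, joiner=""))  # [('2014年', 'T')]
```

For the English example, the pairs can be built with `[(tok, info["tag"]) for tok, info in output.items()]`. A sentence with repeated tokens would need the tags in list form instead, since dictionary keys are unique. An `I` tag with no preceding `B` tag is treated leniently here and starts a new entity.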